Four modes of creole formation

A ‘pidgin’ is a language formed by contact between speakers of different languages. A ‘creole’ is what happens when a pidgin becomes a birth language for children raised where a pidgin is spoken. Pidgins are simple languages, stripped to the running gears. Often creoles re-complexify in later generations, retaining grammar mostly from one parent language and vocabulary mostly from the other.


My interest in the historical linguistics of pidgins and creoles began a very long time ago when I noticed that pidgins, wherever they arise, are usually morphologically a lot like English – analytic (positional) grammar with few inflections, and SVO order more often than can be accounted for by the fact that English is often one of the parent languages. Why should this be?


Nicholas Ostler’s excellent Empires of the Word deepened the question by proposing that analytic SVO grammar is the common factor in languages like English, Chinese and Malay that have been very successful at spreading from their original homelands. In his account, that is because this class of language has the lowest complexity barrier to acquisition for adult speakers.


That would explain pidgins all right – they look the way they do because they’re invented by adults as the simplest possible way to establish communication. And English, with similar traits, is a non-pidgin that has spread like crazy because it combines the prestige of the Anglosphere with being exceptionally easy for native speakers of other languages to learn.


Er, but why is English like that in the first place?



A few years back I tripped over – and was instantly fascinated by – the notion that English is best understood historically as an old creole. Most educated people know the ha-ha-only-serious line about English being the result of attempts by Norman men-at-arms to pick up Saxon barmaids; this refers to pidgin formation between late Anglo-Saxon and Old French after 1066, with later Middle English viewed as the succeeding creole.


But there may have been an even more important creolization 150 years previously when Vikings conquered the region of central England known as the Danelaw and their West Norse collided with midlands Anglo-Saxon of the time. Midlands Anglo-Saxon could be considered to have been replaced by an Anglo-Norse creole with a simplified version of old Anglo-Saxon grammar and a lot of Norse vocabulary (including items as basic as most of the pronouns that English has used since).


So, double creolization with the grammar getting simplified at each stage. But wait! There’s more! We don’t know for sure because the Anglo-Saxons who invaded Britain after the Roman collapse in 410 weren’t literate, but it is quite possible that there was an even earlier creolization following on their domination of the Celtic natives.


So, running this forward in time, the oldest Anglo-Saxon collides with Celtic languages; the resulting creole becomes middle Anglo-Saxon. Which then collides with Old Norse; the creole from this becomes late Anglo-Saxon. That is what collides with Old French and, badda boom badda bing, Middle English.


At each stage the language’s grammar gets stripped further down to its running gears – more analytic, more SVO, more pidgin-like – only to partly re-complexify (as creoles do) and pick up loads of vocabulary items along the way.


Unfortunately we have only a badly foreshortened view of all this because the Anglo-Saxon manuscripts are mostly from the late period. The only creolization process we can more or less track is the latest of the three; the previous two have to be inferred from, for example, the fact that late Anglo-Saxon had already adopted the pronouns of West Norse.


The other problem with this theory is actually a problem with linguists. Ever since philologists started reconstructing the history of the Indo-European language group in the 1800s, linguists have loved nice tidy evolutionary tree structures. Crosslinks that mess up that picture, like sprachbunds or creole formations, are not loved. There is still – at least it seems to me, as an amateur but careful observer – a tendency in academia to want to banish pidgin and creole formation to the periphery, denying that these can be central features of the history of “great” languages.


Part of this comes from historical connotation; pidgin and creole formation were first noticed in the record of the Age of Exploration, when Europeans were having a high old time rambling all over the globe trading with, warring on, enslaving, and having sex with various kinds of dusky-skinned natives. Pidgin/creole formation has still got an associative whiff about it of dandies keeping octoroon mistresses that’s a bit unseemly.


OK, so what if we try looking past that? What happens if, instead of admitting to language hybridization and applying the label “creole” only when forced to and only when the contact event was recent, we think of creolization as a language-formation process that is historically normal under certain circumstances, and go looking for examples and recurring patterns?


Now, much of the rest of this is me being speculative. And I’m not a trained historical linguist with a PhD union card, so it’s unlikely any of the pros will care much what I think. But I see some fascinating vistas opening up.


For one thing, there are recurring patterns. I can see four: trade creoles, conquest creoles, camp creoles, and court creoles. The lines among these are not perfectly sharp, and sometimes a creole will be repurposed after it forms, but they illustrate four basic modes of formation.


A trade creole is a language evolved from a trade pidgin. There are numerous examples in the South Pacific, of which Tok Pisin in Papua New Guinea is among the best known; these formed within the last 200 years and linguists do apply the label “creole” to them.


On the other hand, Swahili seems to be an old trade creole, resulting from contact between coastal Bantu languages in East Africa and Arab traders and beginning to form not long after 600 CE. Linguists normally don’t label it as a creole, but it has kept the pattern of simplified and regularized grammar relative to its root languages. One marker of this is that unlike most of the area’s Bantu languages, Swahili has no tonal system – and tone is a feature contact pidgins invariably drop.


The Indonesian archipelago is rife with dozens of trade creoles formed by contact among different Austronesian languages and others (including Chinese and Sanskrit). National Indonesian and Malay are themselves well understood to be old trade creoles, though in the normal way of such things the “creole” label is seldom applied.


A conquest creole is a case like Middle English where an incoming military elite forms a contact pidgin with the natives that displaces the native “pure” language. I have previously noted that this seems to fit what happened to middle Anglo-Saxon in the Danelaw beginning 150 years earlier. If there was a still earlier creolization event mixing Anglo-Saxon and Celtic, middle Anglo-Saxon too would have been a conquest creole.


A thing to look for in conquest creoles is the formation of a creole continuum in which the language of the invaders is the acrolect, a relatively less modified version of the indigenous language is the basilect, and individuals routinely code-switch from lower to higher forms and vice versa without recognizing that this involves not just a change in vocabulary but substantial morphological shifts as well. I don’t think you get this in trade creoles, where social relationships along the contact front are more horizontal.


Another example, helpfully demonstrating multiple creolizations in a language’s back story in case we are tempted to think of English as unique, is Maltese – grammar from Arabic conquerors, vocabulary from Romance-speaking subjects. Recently (last 150 years) “old” Maltese has been largely replaced by an Anglo-Maltese creole.


Next, the camp creole. This is a creole formed from a pidgin invented as a military command language in a multilingual empire.


The best known of these is Hindi/Urdu, the latter name for which literally means “camp language” (it’s related to Mongol “ordu” and the derived English word “horde”). It originated as a military pidgin formed by contact among a largish group of related North Indian languages around the 7th century CE (so, about as old as Swahili).


I tripped over a minority theory of the origin of modern German a while back. The usual story about this is that it’s the language of Luther’s Bible, but apparently some experts think it is more properly viewed as having originated more recently as a military pidgin in Frederick the Great’s Prussia. At the time there were different so-called “dialects” of German that were quite mutually unintelligible, so this theory makes functional sense.


My source even proposed that this is why modern German tends to verb-final order in sentences – as a way of making command verbs more prominent.


The label “creole” is, as you are probably expecting by now, not generally applied to either Hindi or Modern German. In the case of Hindi, though, it fits the generally accepted interpretation of the language’s history. And I’m betting the military-pidgin account of modern German hasn’t gotten quite the attention it deserves.


Our fourth variety is the court creole. This is like a camp creole, but instead of arising from a military pidgin it develops among the ruling elite in the capital of a multilingual empire. Because once your polity gets past a certain size you need to recruit administrators, servants, concubines and whatnot from places where the language isn’t yours, and they then have the same contact problem as a military camp.


There are a couple of solutions to this problem that don’t involve spinning up a new language. You might be able to actually impose your language on the natives; the Romans were pretty effective at this. You might be able to adapt a camp creole that your armies already speak; this is how Hindustani became a court language with a literary tradition. But if all else fails, your capital is going to grow its own creole because it has to.


The type example I had in mind for “court creole” when I began writing was Mandarin Chinese, but there was the problem that it’s tonal, a feature normally lost during pidginization. On the other hand, Wikipedia says straight up that Mandarin arose during the Ming dynasty “as a practical measure, to circumvent the mutual unintelligibility of the varieties of Chinese”, which certainly sounds like my court-creole notion in action.


The standard account describes Mandarin as a “koine”, which is elsewhere defined as a contact language arising among mutually intelligible varieties – and thus without undergoing drastic complexity reduction through a pidginization phase. That would remove the mystery about the retention of tonality.


But something is off here. We actually know that mutual unintelligibility was a problem at the time; one Emperor put a complaint on record that he couldn’t understand the speech of certain provincial officials, and founded language academies to attack the problem. This doesn’t really sound like the conventional koine-formation story.


We might be running up against an edge case for the “koine” and “creole” categories where it’s difficult to know which applies. Or linguists might be exhibiting their usual flinch about using the label “creole” outside of a dandies-and-octoroons situation. It’s difficult to know and I am certainly too ignorant of historical Chinese linguistics to justify a strong opinion.


What I think we can propose is that the specialists ought to take a fresh look at the period sources and see if they can detect any traces of what looks like pidginization (with its characteristic loss of grammatical complexity) in the period when Mandarin was forming.


What we can say without dispute is that, like English and Malay (but, to be fair, unlike Swahili), modern Mandarin has retained a lot of pidgin-like traits commonly found in creoles – SVO analytic grammar, simple phonology, a spoken form easily acquired by adults from other language groups (even as its written form is infamously difficult).


Which brings us to the end of my speculation. I wish I had an unambiguous example of a court creole to lay down, but just this brief survey should make clear that creolization events have been both more common and far more important than you’d think from the linguistics textbooks.


Even the oldest attested human language may have been a creole. There are structural and lexical indications in Sumerian that it may have fused from a couple of rather dissimilar languages spoken still earlier in the Fertile Crescent!


So, hey, academic linguists, stop being such prudes about language hybridization, eh? It’s limiting your vision.

Published on April 02, 2017 07:56


Eric S. Raymond's Blog

Eric S. Raymond