Monday, April 30, 2018

Stratification: how linguists traditionally identify borrowings

In my previous blog post, I illustrated how important it is to take the systemic aspects of sound change into account when comparing languages. What surfaces as a surprisingly regular process is in fact a process during which the sound system of a language changes. Since the words in a given language are derived from the sound system, a change in the system will necessarily change all words in which the respective sound occurs.

On one hand, this makes it much more difficult for linguists to identify homologous words across languages. On the other hand, however, it enables us to identify borrowings, by searching for exceptions to regular sound correspondences. I will be discussing the latter here.

Sound changes and borrowing

To illustrate how this can be done in practice, consider the ten cognate pairs between German and English in the following table:

No.  German    English
1    Dach      thatch
2    Daumen    thumb
3    Degen     thane
4    Ding      thing
5    drei      three
6    Durst     thirst
7    denken    think
8    Dieb      thief
9    dreschen  thresh
10   Drossel   throat

Even a quick comparison of these words shows that wherever German has d as the initial sound, English has th. This sound correspondence, as we call it in historical linguistics, reflects a systematic similarity between English and German: it holds for related words that go back to Proto-Germanic initial θ-, which English retained (spelled th) and which regularly became d in German. This sound change is well accounted for in Indo-European linguistics.
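The tallying of initial sounds described above can be sketched in a few lines of Python. This is only a rough illustration: it assumes that spelling can stand in for pronunciation, and the names `initial_sound`, `initial_correspondences`, and the small `DIGRAPHS` list are hypothetical helpers, not part of any established tool. Real analyses work on phonetic transcriptions and full alignments.

```python
from collections import Counter

# Word pairs 1-10 from the table above (German, English).
PAIRS = [
    ("Dach", "thatch"), ("Daumen", "thumb"), ("Degen", "thane"),
    ("Ding", "thing"), ("drei", "three"), ("Durst", "thirst"),
    ("denken", "think"), ("Dieb", "thief"), ("dreschen", "thresh"),
    ("Drossel", "throat"),
]

# Assumption: only these multi-letter spellings count as one initial sound.
DIGRAPHS = ("th", "sch", "ch")


def initial_sound(word):
    """Return the first sound of a word, treating known digraphs as one unit."""
    word = word.lower()
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def initial_correspondences(pairs):
    """Count how often each (German, English) pair of initial sounds occurs."""
    return Counter((initial_sound(g), initial_sound(e)) for g, e in pairs)


counts = initial_correspondences(PAIRS)
print(counts)  # Counter({('d', 'th'): 10})
```

All ten pairs fall into the single pattern d : th, which is what makes the correspondence look regular.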

Not all homologous words in English and German, however, show this correspondence, as we can easily see from the five examples in the next table:

No.  German  English
11   Dill    dill
12   dumm    dumb
13   Damm    dam
14   Dunst   dunst
15   Dollar  dollar

It is easy to see that these words don't fit our expected pattern (d matching th as the first consonant). Their overall similarity also makes it rather unlikely that they trace back to different words and are thus not cognate at all. One of the simplest explanations for the deviation from the expected pattern (initial d in German corresponding to θ in English), which here surfaces as d = d, is borrowing, be it from German to English, from English to German, or from some third language.
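The search for exceptions can be sketched as follows. The sketch assumes, purely for illustration, that the most frequent pattern among the 15 pairs is the regular one and that everything else is a borrowing candidate. The function `split_by_dominant_pattern` is a hypothetical helper, and a simple majority vote like this is far cruder than what linguists do with larger datasets.

```python
from collections import Counter

# All 15 word pairs from the two tables above (German, English).
PAIRS = [
    ("Dach", "thatch"), ("Daumen", "thumb"), ("Degen", "thane"),
    ("Ding", "thing"), ("drei", "three"), ("Durst", "thirst"),
    ("denken", "think"), ("Dieb", "thief"), ("dreschen", "thresh"),
    ("Drossel", "throat"), ("Dill", "dill"), ("dumm", "dumb"),
    ("Damm", "dam"), ("Dunst", "dunst"), ("Dollar", "dollar"),
]


def initial_sound(word):
    """First sound by spelling; 'th' is treated as one unit (an assumption)."""
    word = word.lower()
    return "th" if word.startswith("th") else word[0]


def split_by_dominant_pattern(pairs):
    """Split pairs into those matching the most frequent initial pattern
    and those deviating from it (the borrowing candidates)."""
    patterns = {pair: (initial_sound(pair[0]), initial_sound(pair[1]))
                for pair in pairs}
    dominant, _ = Counter(patterns.values()).most_common(1)[0]
    regular = [p for p in pairs if patterns[p] == dominant]
    exceptions = [p for p in pairs if patterns[p] != dominant]
    return dominant, regular, exceptions


dominant, regular, exceptions = split_by_dominant_pattern(PAIRS)
print(dominant)    # ('d', 'th')
print(exceptions)  # the five d : d pairs, Dill/dill through Dollar/dollar
```

Note that the sketch only flags candidates. Whether a flagged pair really is a borrowing, and in which direction, still has to be argued from further evidence.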

Among the five examples, the final one, Dollar, is the easiest to explain, as we are dealing with a recent borrowing of the name of the U.S. currency. English dollar itself has another cognate in German, namely Taler, the name of a historical currency (see Pfeifer 1993 for the full etymology).

The other four terms in the table may seem less straightforward to explain as borrowings, as they are by no means of recent origin. We can nevertheless confirm their exceptional status by contrasting them with older Middle High German readings (11th-14th century), which are listed in the following table for our examples (there is no such reading for the recent borrowing Dollar):

No.  German    English  Middle High German
1    Dach      thatch   dah
2    Daumen    thumb    dūm
3    Degen     thane    degan
4    Ding      thing    ding
5    drei      three    drī
6    Durst     thirst   durst
7    denken    think    denken
8    Dieb      thief    diob
9    dreschen  thresh   dreskan
10   Drossel   throat   drozze
11   Dill      dill     tilli
12   dumm      dumb     tumb
13   Damm      dam      tam
14   Dunst     dunst    tunst
15   Dollar    dollar   -

As can easily be seen from this table, examples 11-14 all have t as the initial consonant in Middle High German, and not d, as in the other cases. The change from original Proto-Germanic d to t in German is a well-attested sound change, for which we have many examples in the form of sound correspondences (cf. day vs. Tag, do vs. tun, etc.). We can therefore conclude that Middle High German readings like tilli (vs. English dill) are the readings we would expect if all words had changed according to the rules. Since no regular change from t in Middle High German to d in Standard High German is attested, it is furthermore reasonable to assume that the modern words have been modified under the influence of contact with other Germanic language varieties.
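The same reasoning can be written down as a small check against the older readings. The sketch assumes, following the argument above, that initial d and initial t should both be retained unchanged between Middle High German and the modern language. The rule table `EXPECTED_INITIAL` and the function `irregular_initials` are hypothetical names introduced for illustration only.

```python
# (modern German, Middle High German) for examples 1-14 from the table above.
# Dollar is left out because the table gives no older reading for it.
FORMS = [
    ("Dach", "dah"), ("Daumen", "dūm"), ("Degen", "degan"),
    ("Ding", "ding"), ("drei", "drī"), ("Durst", "durst"),
    ("denken", "denken"), ("Dieb", "diob"), ("dreschen", "dreskan"),
    ("Drossel", "drozze"), ("Dill", "tilli"), ("dumm", "tumb"),
    ("Damm", "tam"), ("Dunst", "tunst"),
]

# Assumption: these initials are expected to stay the same in the modern
# language, since no regular change t > d is attested.
EXPECTED_INITIAL = {"d": "d", "t": "t"}


def irregular_initials(forms):
    """Return the pairs whose modern initial differs from the expected one."""
    flagged = []
    for modern, older in forms:
        expected = EXPECTED_INITIAL.get(older[0].lower())
        if expected is not None and modern[0].lower() != expected:
            flagged.append((modern, older))
    return flagged


print(irregular_initials(FORMS))
# [('Dill', 'tilli'), ('dumm', 'tumb'), ('Damm', 'tam'), ('Dunst', 'tunst')]
```

Exactly the four words with older t but modern d are flagged, which matches examples 11-14.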

Here, English is not the most obvious candidate for contact. The influence is more likely due to contact with neighboring language varieties in the North-West of Germany, such as Frisian or Dutch. Like English, they have retained the original d (cf. Dutch dille vs. English dill). If speakers of High German varieties borrowed these terms from speakers of Low German varieties, they would have re-introduced the original d into their language, which is what we see in our examples 11-14.

Why some of these borrowings took place and others did not is hard to say. That people in the North-West, living on the coast, knew more about the building of dams is probably a good explanation for why High German borrowed this term: presumably, High German speakers did not use the word tam all that frequently, but often heard the word dam in conversations with speakers of neighboring varieties closer to the coast. For the other words, however, it is difficult to tell why the alternative forms were successful.


Despite its important role for historical language comparison, the kind of analysis described here, by which linguists infer exceptional patterns in order to identify borrowings, is not well documented, either in handbooks of historical linguistics or in the journal literature. Following Lee and Sagart (2008), it is probably best called stratification analysis, since linguists try to identify the layers of contact and inheritance which surface in the form of sound correspondences. If these layers are correctly identified, linguists can often not only determine the direction in which a borrowing occurred, but also the relative time window in which this borrowing must have happened. This is the reason why linguists can often give very detailed word histories, which show where a word was first borrowed and how it then traveled through linguistic landscapes.

As with so many methods in historical language comparison, it is difficult to identify a straightforward counterpart of this technique in biology. What probably comes closest is the use of GC content as a proxy for the inference of directed networks of lateral gene transfer (as described in, for example, Popa et al. 2011). In contrast to lateral gene transfer in biology, however, our linguistic word histories are often much more detailed, especially in those cases where we have well-documented languages.

For the future, I hope that increased efforts to formalize the process of cognate identification, cognate annotation, and phonetic alignment in computer-assisted frameworks for historical language comparison may help to improve the way we infer borrowings in linguistics. There are so many open questions about lateral word transfer in historical linguistics that we cannot answer by sifting manually through datasets. We will need all the support we can get from automatic and semi-automatic approaches if we want to shed some light on the many mysterious non-vertical aspects of language evolution.


Lee, Y.-J. and L. Sagart (2008) No limits to borrowing: The case of Bai and Chinese. Diachronica 25.3: 357-385.

Pfeifer, W. (1993) Etymologisches Wörterbuch des Deutschen. Akademie: Berlin.

Popa, O., E. Hazkani-Covo, G. Landan, W. Martin, and T. Dagan (2011) Directed networks reveal genomic barriers and DNA repair bypasses to lateral gene transfer among prokaryotes. Genome Research 21.4: 599-609.
