Monday, October 28, 2019

Typology of sound change (Open problems in computational diversity linguistics 9)

We are getting closer to the end of my list of open problems in computational diversity linguistics. After this post, there is only one left, for November, followed by an outlook and a wrap-up in December.

In last month's post, devoted to the Typology of semantic change, I discussed the general aspects of a typology in linguistics, or, to be more precise, how I think that linguists use the term. One of the necessary conditions for a typology to be meaningful is that the phenomenon under question shows enough similarities across the languages of the world, so that patterns or tendencies can be identified regardless of the historical relations between human languages.

Sound change in this context refers to a very peculiar phenomenon observed in the change of spoken languages, by which certain sounds in the inventory of a given language change their pronunciation over time. This often occurs across all of the words in which these sounds recur, or across only those sounds which appear to occur in specific phonetic contexts.

As I have discussed this phenomenon in quite a few past blog posts, I will not discuss it any further here, but will simply refer to the specific task that this problem entails:
Assuming (if needed) a given time frame, in which the change occurs, establish a general typology that informs about the universal tendencies by which sounds occurring in specific phonetic environments are subject to change.
Note that my view of "phonetic environment" in this context includes an environment that would capture all possible contexts. When confronted with a sound change that seems to affect a sound in the same way in all phonetic contexts in which it occurs, linguists often speak of "unconditioned sound change", as they do not find any apparent condition for this change to happen. For a formal treatment, however, this is unsatisfying, since the lack of a phonetic environment is also a specific condition of sound change.

Why it is hard to establish a typology of sound change

As is also true for semantic change, discussed as Problem 8 last month, there are three major reasons why it is hard to establish a typology of sound change. As a first problem, we find, again, the issue of acquiring the data needed to establish the typology. As a second problem, it is also not clear how to handle the data appropriately in order to allow us to study sound change across different language families and different times. As a third problem, it is also very difficult to interpret sound change data when trying to identify cross-linguistic tendencies.

Problem 1

The problem of acquiring data about sound change processes in sufficient size is very similar to the problem of semantic change: most of what we know about sound change has been inferred by comparing languages, and we do not know how confident we can be with respect to those inferences. While semantic change is considered to be notoriously difficult to handle (Fox 1995: 111), scholars generally have more confidence in sound change and the power of linguistic reconstruction. The question remains, however, as to how confident we can really be, which divides the field into the so-called "realists" and the so-called "abstractionalists" (see Lass 2017 for a recent discussion of the debate).

As a typical representative of abstractionalism in linguistic reconstruction, consider the famous linguist Ferdinand de Saussure, who emphasized that the real sound values which scholars reconstructed for proposed ancient words in unattested languages like, for example, Indo-European, could as well be simply replaced by numbers or other characters, serving as identifiers (Saussure 1916: 303). The fundamental idea here, when reconstructing a word for a given proto-language, is that a reconstruction does not need to inform us about the likely pronunciation of a word, but rather about the structure of the word in contrast to other words.

This aspect of historical linguistics is often difficult to discuss with colleagues from other disciplines, since it seems to be very peculiar, but it is very important in order to understand the basic methodology. The general idea of structure versus substance is that, once we accept that the words in a language are built by drawing letters from an alphabet, the letters themselves do not have a substantial value, but have only a value in contrast to other letters. This means that a sequence such as "ABBA" can be seen as being structurally identical with "CDDC" or "OTTO". The similarity should be obvious: we have the same letter in the beginning and the end of each word, and the same letter being repeated in the middle of each word (see List 2014: 58f for a closer discussion of this type of similarity).
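To make the idea of purely structural identity concrete, here is a minimal sketch in Python. The function name `structure` is my own invention for illustration and is not taken from List (2014). It replaces every symbol by the position at which it first appeared in the word, so that only the pattern of repetition remains.

```python
def structure(word):
    """Return the repetition pattern of a sequence, ignoring symbol identity."""
    first_seen = {}
    # Each new symbol receives the next free integer; repeated symbols reuse theirs.
    return tuple(first_seen.setdefault(symbol, len(first_seen)) for symbol in word)


print(structure("ABBA") == structure("CDDC") == structure("OTTO"))
print(structure("ABBA") == structure("ABCA"))
```

Under this view, "ABBA", "CDDC", and "OTTO" all reduce to the same pattern, while "ABCA" does not.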

Since sequence similarity is usually not discussed in pure structural terms, the abstract view of correspondences, as it is maintained by many historical linguists, is often difficult to discuss across disciplines. The reason why linguists tend to maintain it is that languages tend to change not only their words by mutating individual sounds, but that whole sound systems change, and new sounds can be gained during language evolution, or lost (see my blogpost from March 2018 for a closer elaboration of the problem of sound change).

It is important to emphasize, however, that despite prominent abstractionalists such as Ferdinand de Saussure (1857-1913), and in part also Antoine Meillet (1866-1936), the majority of linguists think more realistically about their reconstructions. The reason is that the composition of words based on sounds in the spoken languages of the world usually follows specific rules, so-called phonotactic rules. These may vary to quite some degree among languages, but are also restricted by some natural laws of pronunciability. Thus, although languages may show impressively long chains of one consonant following another, there is a certain limit to the number of consonants that can follow each other without a vowel. Sound change is thus believed to originate roughly in either production (speakers want to pronounce things in a simpler, more convenient way) or perception (listeners misunderstand words and store erroneous variants, see Ohala 1989 for details). Therefore, a reconstruction of a given sound system based on the comparison of multiple languages gains power from a realistic interpretation of sound values.

The problem with the abstractionalist-realist debate, however, is that linguists usually practice some kind of mixture between the two extremes. That means that they may reconstruct very concrete sound values for certain words, where they have very good evidence, but at the same time, they may come up with abstract values that serve as placeholders for lack of better evidence. The most famous example is that of the Indo-European "laryngeals", whose existence is beyond doubt for most historical linguists, but whose sound values cannot be reconstructed with high reliability. As a result, linguists tend to spell them with subscript numbers as *h₁, *h₂, and *h₃. Any attempt to assemble data about sound change processes in the languages of the world needs to find a way to cope with the different degrees of evidence we find in linguistic analyses.

Problem 2

This leads us directly to our second problem in handling sound change data appropriately in order to study sound change processes. Given that many linguists propose changes in the typical A > B / C (A becomes B in context C) notation, a possible way of thinking about establishing a first database of sound changes would consist of typing these changes from the literature and making a catalog out of it. Apart from the interpretation of the data in abstractionalist-realist terms, however, such a way of collecting the data would have a couple of serious shortcomings.
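As a rough illustration of what a machine-readable version of the A > B / C notation could look like, consider the following Python sketch. The class `Rule`, the functions `parse_rule` and `apply_rule`, the use of `_` for the position of the changing sound, and `#` for the word boundary are all assumptions made for this example. Real sound change rules in the literature are far less uniform, and the rule shown is a toy rule, not an attested change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str
    target: str
    left: str = ""   # "" means "any context", "#" means word boundary
    right: str = ""


def parse_rule(text):
    """Parse 'A > B' or 'A > B / L_R' into a Rule (hypothetical format)."""
    change, _, context = text.partition("/")
    source, target = (part.strip() for part in change.split(">"))
    left, right = "", ""
    if context.strip():
        left, right = (part.strip() for part in context.split("_"))
    return Rule(source, target, left, right)


def apply_rule(rule, segments):
    """Apply the rule to all matching segments at once."""
    padded = ["#"] + list(segments) + ["#"]
    result = []
    for i in range(1, len(padded) - 1):
        segment = padded[i]
        if (segment == rule.source
                and rule.left in ("", padded[i - 1])
                and rule.right in ("", padded[i + 1])):
            segment = rule.target
        result.append(segment)
    return result


word = ["p", "a", "p", "a"]
print(apply_rule(parse_rule("p > f / #_"), word))  # word-initial only
print(apply_rule(parse_rule("p > f"), word))       # no condition stated
```

Note that a rule without a stated context is treated here as a rule whose context matches everything, in line with the remark above that the lack of a phonetic environment is itself a specific condition.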

First, it would mean that the analysis of the linguist who proposed the sound change is taken as final, although we often find many debates about the specific triggers of sound change, and it is not clear whether there would be alternative sound change rules that could apply just as well (see Problem 3 on the task of automatic sound law induction for details). Second, as linguists tend to report only what changes, while disregarding what does not change, we would face the same problem as in the traditional study of semantic change: the database would suffer from a sampling bias, as we could not learn anything about the stability of sounds. Third, since sound change depends not only on production and perception, but also on the system of the language in which sounds are produced, listing sounds deprived of examples in real words would most likely make it impossible to take these systemic aspects of sound change into account.

Problem 3

This last point now leads us to the third general difficulty, the question of how to interpret sound change data, assuming that one has had the chance to acquire enough of it from a reasonably large sample of spoken languages. If we look at the general patterns of sound change observed for the languages of the world, we can distinguish two basic conditions of sound change, phonetic conditions and systemic conditions. Phonetic conditions can be further subdivided into articulatory (= production) and acoustic (= perception) conditions. When trying to explain why certain sound changes can be observed more frequently across different languages of the world, many linguists tend to invoke phonetic factors. If the sound p, for example, turns into an f, this is not necessarily surprising given the strong similarity of the sounds.

But similarity can be measured in two ways: one can compare the similarity with respect to the production of a sound by a speaker, and with respect to the perception of the sound by a listener. While production of sounds is traditionally seen as the more important factor contributing to sound change (Hock 1991: 11), there are clear examples of sound change due to misperception and re-interpretation by the listeners (Ohala 1989: 182). Some authors go as far as to claim that production-driven changes reflect regular internal language change (which happens gradually during acquisition or, depending on the theory, also in later stages; Bybee 2002), while perception-based changes rather reflect change happening in second language acquisition and language contact (Mowrey and Pagliuca 1995: 48).

While the interaction of production and perception has been discussed in some detail in the linguistic literature, the influence of systemic factors has so far been only rarely regarded. What I mean by this factor is the idea that certain changes in language evolution may be explained exclusively as resulting from systemic constellations. As a straightforward example, consider the difference in design space for the production of consonants, vowels, and tones. In order to maintain pronunciability and comprehensibility, it is useful for the sound system of a given language to fill in those spots in the design space that are maximally different from each other. The larger the design space and the smaller the inventory, the easier it is to guarantee its functionality. Since design spaces for vowels and tones are much smaller than for consonants, however, these sub-systems are more easily disturbed, which could be used to explain the presence of chain shifts of vowels, or flip-flop in tone systems (Wang 1967: 102). Systemic considerations play an increasingly important role in evolutionary theory and, as shown in List et al. (2016), can also be used as explanations for phenomena as strange as Sapir's drift (Sapir 1921).

However, the crucial question, when trying to establish a typology of sound change, is how these different effects could be measured. I think it is obvious that collections of individual sound changes proposed in the literature are not enough. But what data would be sufficient or needed to address the problem is not entirely clear to me either.

Traditional approaches

As the first traditional approach to the typology of sound change, one should mention the intuition inside the heads of the numerous historical linguists who study particular language families. Scholars trained in historical linguistics usually start to develop some kind of intuition about likely and unlikely tendencies in sound change, and in most parts they also agree on this. The problem with this intuition, however, is that it is not explicit, and it seems even that it was never the intention of the majority of historical linguists to make their knowledge explicit. The reasons for this reluctance with respect to formalization and transparency are two-fold. First, given that every individual has invested quite some time in order to grow their intuition, it is possible that the idea of having a resource that distributes this intuition in a rigorously data-driven and explicit manner yields the typical feeling of envy in quite a few people who may then think: «I had to invest so much time in order to learn all this by heart. Why should young scholars now get all this knowledge for free?» Second, given the problems outlined in the previous section, many scholars also strongly believe that it is impossible to formalize the problem of sound change tendencies.

By far the largest traditional study of the typology of sound change is Kümmel's (2008) book Konsonantenwandel (Consonant Change), in which the author surveys sound change processes discussed in the literature on Indo-European and Semitic languages. As the title of the book suggests, it concentrates on the change of consonants, which are (probably due to the larger design space) also the class of sounds that shows stronger cross-linguistic tendencies. The book is based on a thorough inspection of the literature on consonant change in Indo-European and Semitic linguistics. The procedure by which this collection was carried out can be seen as the gold standard that any future attempt at enlarging the collection should follow.

What is specifically important, and also very difficult to achieve, is the harmonization of the evidence, which is nicely reflected in Kümmel's introduction, where he mentions that one of the main problems was to determine what the scholars actually meant with respect to phonetics and phonology, when describing certain sound changes (Kümmel 2008: 35). The major drawback of the collection is that it is not (yet) available in digital form. Given the systematicity with which the data was collected, it should be generally possible to turn the collection into a database; and it is beyond doubt that this collection could offer interesting insights into certain tendencies of sound change.

Another collection of sound changes drawn from the literature is the mysterious Index Diachronica, which assembles sound changes from various language families and was compiled by a person who wishes to remain anonymous. By now, this collection even has a Searchable Index that allows scholars to click on a given sound and to see in which languages this sound is involved in some kind of sound change. What is a pity about the resource is that it is difficult to use, given that one does not really know where it actually comes from, and how the information was extracted from the sources. If the anonymous author would only decide to put it (albeit anonymously, or under a pseudonym) on a public preprint server, such as, for example, Humanities Commons, this would be excellent, as it would give those who are interested in pursuing the idea of collecting sound changes from the literature a good starting point to check the sources, and to further digitize the resource.

Right now, this resource seems to be mostly used by conlangers, i.e., people who create artificial languages as a hobby (or profession). Conlangers are often refreshingly pragmatic, and may come up with very interesting and creative ideas about how to address certain data problems in linguistics, which "normal" linguists would refuse to do. There is a certain tendency in our field to ignore certain questions, either because scholars think it would be too tedious to collect the data to address that problem, or because they consider it impossible to do "correctly" from the start.

As a last and fascinating example, I have to mention the study by Yang and Xu (2019), in which the authors review studies of concrete examples of tone change in South-East Asian languages, trying to identify cross-linguistic tendencies. Before I read this study, I was not aware that tone change had at all been studied concretely, since most linguists consider the evidence for any kind of tendency far too shaky, and reconstruct tone exclusively as an abstract entity. The survey by Yang and Xu, however, shows clearly that there seem to be at least some tendencies, and that they can be identified by invoking a careful degree of abstraction when comparing tone change across different languages.

For the detailed reasons outlined in the previous paragraph, I do not think that a collection of sound change examples from the literature addresses the problem of establishing a typology of sound change. Specifically, the fact that sound change collections usually do not provide any tangible examples or frequencies of a given sound change within the language where it occurred, but also the fact that they do not offer any tendencies of sounds to resist change, is a major drawback, and a major loss of evidence during data collection. However, I consider these efforts as valuable and important contributions to our field. Given that they allow us to learn a lot about some very general and well-confirmed tendencies of sound change, they are also an invaluable source of inspiration when it comes to working on alternative approaches.

Computational approaches

To my knowledge, there are no real computational approaches to the study of sound change so far. What one should mention, however, are initial attempts to measure certain aspects of sound change automatically. Thus, Brown et al. (2013) measure sound correspondences across the world's languages, based on a collection of 40-item wordlists for a very large sample of languages. The limitations of this study can be found in the restricted alphabet being used: all languages are represented by a reduced transcription system of some 40 letters, called the ASJP code. While the code originally allowed representing more than just 40 sounds, since the graphemes can be combined, the collection was carried out inconsistently for different languages, which has now led to the situation that the majority of computational approaches treat each letter as a single sound, or consider only the first element of complex grapheme combinations.

While sound change is a directional process, sound correspondences reflect the correspondence of sounds in different languages as a result of sound change, and it is not trivial to extract directional information from sound correspondence data alone. Thus, while the study of Brown et al. is a very interesting contribution, also providing a very straightforward methodology, it does not address the actual problem of sound change.

The study also has other limitations. First, the approach only measures those cases where sounds differ in two languages, and thus we have the same problem that we cannot tell how likely it is that two identical sounds correspond. Second, the study ignores phonetic environment (or context), which is an important factor in sound change tendencies (some sound changes, for example, tend to occur only in word endings, etc.). Third, the study considers only sound correspondences across language pairs, while it is clear that one can often find stronger evidence for sound correspondences when looking at multiple languages (List 2019).
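The first of these limitations can be illustrated with a small Python sketch. It counts sound correspondences from word pairs that are assumed to be already aligned, and it keeps identical matches, so that one can at least ask how often a sound corresponds to itself. The function names and the word pairs are invented for this illustration. They are not taken from Brown et al. (2013) and do not represent real language data.

```python
from collections import Counter


def correspondences(aligned_pairs):
    """Count pairs of corresponding segments, identical pairs included."""
    counts = Counter()
    for word_a, word_b in aligned_pairs:
        if len(word_a) != len(word_b):
            raise ValueError("words must be aligned to equal length")
        counts.update(zip(word_a, word_b))
    return counts


def identity_share(counts, sound):
    """Share of cases in which `sound` in language A matches the same sound in B."""
    total = sum(n for (a, _), n in counts.items() if a == sound)
    return counts[(sound, sound)] / total if total else None


# Invented toy alignments between two hypothetical languages A and B.
pairs = [
    (["t", "a", "n"], ["ts", "a", "n"]),
    (["t", "o"], ["t", "o"]),
    (["n", "a", "t"], ["n", "a", "t"]),
]
counts = correspondences(pairs)
print(counts[("t", "ts")], identity_share(counts, "t"))
```

Even with identical matches included, the counts remain undirected: they say that "t" in one language corresponds to "ts" in the other, but not which of the two is older, and they still ignore context and are restricted to language pairs.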

Initial ideas for improvement

What we need in order to address the problem of establishing a true typology of sound change processes, are, in my opinion:
  1. a standardized transcription system for the representation of sounds across linguistic resources,
  2. increased amounts of readily coded data that adhere to the standard transcription system and list cognate sets of ancestral and descendant languages,
  3. good, dated phylogenies that allow us to measure how often sound changes appear in a certain time frame,
  4. methods to infer the sound change rules (Problem 3), and
  5. improved methods for ancestral state reconstruction that would allow us to identify sound change processes not only for the root and the descendant nodes, but also for intermediate stages.
It is possible that even these five points are not enough yet, as I am still trying to think about how one should best address the problem. But what I can say for sure is that one needs to address the problem step by step, starting with the issue of standardization — and that the only way to account for the problems mentioned above is to collect the pure empirical evidence on sound change, not the summarized results discussed in the literature. Thus, instead of saying that some source quotes that in German, the t became a ts at some point, I want to see a dataset that provides this in the form of concrete examples that are large enough to show the regularity of the findings and ideally also list the exceptions.
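What such a dataset of concrete examples would allow can be sketched in a few lines of Python. The sketch assumes that ancestral and descendant forms are given as aligned lists of segments. It tabulates the outcomes of one ancestral sound, grouped by the preceding segment, so that regular outcomes and exceptions become visible. The function name and the forms are invented for illustration. They are toy data loosely inspired by the German example, not real reconstructions.

```python
from collections import Counter, defaultdict


def outcomes_by_context(examples, source):
    """Tabulate what `source` turns into, grouped by the preceding segment."""
    table = defaultdict(Counter)
    for ancestor, descendant in examples:
        # padded[i] is the segment preceding ancestor[i]; "#" marks the word start.
        padded = ["#"] + list(ancestor)
        for i, (old, new) in enumerate(zip(ancestor, descendant)):
            if old == source:
                table[padded[i]][new] += 1
    return table


# Invented toy data: (ancestral form, descendant form), aligned segment by segment.
examples = [
    (["t", "a", "n"], ["ts", "a", "n"]),
    (["t", "u", "k"], ["ts", "u", "k"]),
    (["s", "t", "a"], ["s", "t", "a"]),
    (["a", "t", "a"], ["a", "s", "a"]),
]
for context, outcomes in outcomes_by_context(examples, "t").items():
    print(context, dict(outcomes))
```

A table of this kind records not only that a change happened, but also how often, in which context, and where the sound resisted change, which is exactly the information that is lost when only summarized rules are collected.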

The advantage of this procedure is that the collection is independent of the typical errors that usually occur when data are collected from the literature (usually also by employing armies of students who do the "dirty" work for the scientists). It would also be independent of individual scholars' interpretations. Furthermore, it would be exhaustive: one could measure not only the frequency of a given change, but also the regularity, the conditioning context, or the systemic properties.

The disadvantage is, of course, the need to acquire standardized data in a large-enough size for a critical number of languages and language families. But, then again, if there were no challenges involved in this endeavor, I would not present it as an open problem of computational diversity linguistics.


With the newly published database of Cross-Linguistic Transcription Systems (CLTS, Anderson et al. 2018), the first step towards a rigorous standardization of transcription systems has already been made. With our efforts towards a standardization of wordlists that can also be applied in the form of a retro-standardization to existing data (Forkel et al. 2018), we have proposed a further step of how lexical data can be collected efficiently for a large sample of the world's spoken languages (see also List et al. 2018). Work on automated cognate detection and workflows for computer-assisted language comparison has also drastically increased the efficiency of historical language comparison.

So, we are advancing towards a larger collection of high-quality and historically compared datasets; and it is quite possible that we will, in a couple of years from now, arrive at a point where the typology of sound change is no longer a dream by me and many colleagues, but something that may actually be feasible to extract from cross-linguistic data that has been historically annotated. But until then, many issues still remain unsolved; and in order to address these, it would be useful to work towards pilot studies, in order to see how well the ideas for improvement, outlined above, can actually be implemented.


Anderson, Cormac and Tresoldi, Tiago and Chacon, Thiago Costa and Fehn, Anne-Maria and Walworth, Mary and Forkel, Robert and List, Johann-Mattis (2018) A Cross-Linguistic Database of Phonetic Transcription Systems. Yearbook of the Poznań Linguistic Meeting 4.1: 21-53.

Brown, Cecil H. and Holman, Eric W. and Wichmann, Søren (2013) Sound correspondences in the world's languages. Language 89.1: 4-29.

Bybee, Joan L. (2002) Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change 14: 261-290.

Forkel, Robert and List, Johann-Mattis and Greenhill, Simon J. and Rzymski, Christoph and Bank, Sebastian and Cysouw, Michael and Hammarström, Harald and Haspelmath, Martin and Kaiping, Gereon A. and Gray, Russell D. (2018) Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics. Scientific Data 5.180205: 1-10.

Fox, Anthony (1995) Linguistic Reconstruction. An Introduction to Theory and Method. Oxford: Oxford University Press.

Hock, Hans Henrich (1991) Principles of Historical Linguistics. Berlin: Mouton de Gruyter.

Kümmel, Martin Joachim (2008) Konsonantenwandel [Consonant change]. Wiesbaden: Reichert.

Lass, Roger (2017) Reality in a soft science: the metaphonology of historical reconstruction. Papers in Historical Phonology 2.1: 152-163.

List, Johann-Mattis (2014) Sequence Comparison in Historical Linguistics. Düsseldorf: Düsseldorf University Press.

List, Johann-Mattis and Pathmanathan, Jananan Sylvestre and Lopez, Philippe and Bapteste, Eric (2016) Unity and disunity in evolutionary sciences: process-based analogies open common research avenues for biology and linguistics. Biology Direct 11.39: 1-17.

List, Johann-Mattis and Greenhill, Simon J. and Anderson, Cormac and Mayer, Thomas and Tresoldi, Tiago and Forkel, Robert (2018) CLICS². An improved database of cross-linguistic colexifications assembling lexical data with help of cross-linguistic data formats. Linguistic Typology 22.2: 277-306.

List, Johann-Mattis (2019) Automatic inference of sound correspondence patterns across multiple languages. Computational Linguistics 45.1: 137-161.

Mowrey, Richard and Pagliuca, William (1995) The reductive character of articulatory evolution. Rivista di Linguistica 7: 37-124.

Ohala, J. J. (1989) Sound change is drawn from a pool of synchronic variation. In: Breivik, L. E. and Jahr, E. H. (eds.) Language Change: Contributions to the Study of its Causes. Berlin: Mouton de Gruyter, pp. 173-198.

Sapir, Edward (1921[1953]) Language. An Introduction to the Study of Speech.

de Saussure, Ferdinand (1916) Cours de linguistique générale. Lausanne: Payot.

Wang, William S-Y. (1967) Phonological features of tone. International Journal of American Linguistics 33.2: 93-105.

Yang, Cathryn and Xu, Yi (2019) A review of tone change studies in East and Southeast Asia. Diachronica 36.3: 417-459.


  1. The most famous example are the Indo-European "laryngeals", whose existence is beyond doubt for most historical linguistics, but whose sound values cannot be reconstructed with high reliability. As a result, linguists tend to spell them with subscript numbers as *h₁, *h₂, and *h₃.

    This claim took on a life of its own decades ago. It continues to be repeated at almost every occasion as the textbook example of phonemes whose pronunciation can't be reconstructed, even though it's really quite clear that *h₂ must have been [χ] (with hints of an earlier [q]) and that *h₁ must have been [h] or possibly [ʔ] (or both at different times).

    The only difficult one is the rare *h₃. There's evidence it was voiced (*pi-ph₃- > *pib-, *h₂ap-h₃on- > *abon-), but that would make it the only voiced fricative ([ʁ]?) in the whole system (by all appearances, [z] existed, but only as an allophone of *s before word-internal voiced plosives); and the question of how it changed adjacent *e to *o remains unresolved because the original pronunciation of *o is actually a much more difficult problem than that of *h₂.

    1. Note the wording "cannot be reconstructed with high reliability". Would you not agree that the sound value is less reliable for h₂ than for, say, a? It's a matter of degree, not of absolute terms, and you find this tradition of using abstract letters in many reconstruction systems (Austronesian, Old Chinese, etc.). And as long as linguists do not start to agree (and they don't agree) and re-write all laryngeals with real sound symbols, I do not see what should be wrong about this claim. If everything were as simple as you say, people would have abandoned abstract notation a long time ago.

    2. Part of the persistence of the subscript notation is probably due to the laryngeals being treated as a bundle. Even if we think *h₁ *h₂ can be probably rewritten as *h *χ, but don't have as clear of an opinion on *h₃, it's not like anyone is going to leave it as the sole number-subscripted phoneme in the transcription.

      *a is a much debated PIE segment too; another, more stable consonant like *m or *n would make a better example probably.