Tuesday, January 12, 2016

Directional processes in language change

Given that we are still at the beginning of the new year, it seems in order to talk about directions: not directions in general, but specifically directions in language change. This is important insofar as many processes in language evolution are directional. This means that a change from a state X to a state Y is frequently attested across a large number of the world's languages, while the opposite change, from state Y to state X, is extremely rare or even unattested.

In language evolution there are many well-known and well-investigated processes with a strong directional tendency. In sound change, for example, a [p] can easily become an [f], whether in the Indo-European, the Austronesian, or the Sino-Tibetan languages. Yet the opposite process, an [f] becoming a [p], is extremely rare. Similar tendencies hold for a [k] becoming a [tʃ], as in Italian [ˈtʃɛnto] cento "hundred", going back to Latin [kɛntum] centum "hundred", or a [g] becoming a [ɦ], as in Czech [ɦora] hora "mountain", going back to Proto-Slavic *gora "mountain" (Derksen 2008).
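As a rough illustration of how such a tendency can be put to work, the following sketch encodes the [p] > [f] asymmetry as step costs and picks the proto-sound that explains a set of reflexes most cheaply. The cost values, the function names, and the flat (star-shaped) family structure are assumptions made for this example only; they are not taken from any of the studies cited here.

```python
# Hypothetical step costs, COST[(source, target)]. The numbers are
# illustrative assumptions, not estimates from any data set.
COST = {
    ("p", "p"): 0,
    ("f", "f"): 0,
    ("p", "f"): 1,   # frequent direction: cheap
    ("f", "p"): 10,  # rare direction: expensive
}


def reconstruction_cost(proto, reflexes):
    """Total cost of deriving every attested reflex from one proto-sound."""
    return sum(COST[(proto, reflex)] for reflex in reflexes)


def best_proto_sound(reflexes, candidates=("p", "f")):
    """Return the candidate proto-sound with the lowest total cost.

    Assumes a star tree: every language descends directly from the
    ancestor, with no subgrouping.
    """
    return min(candidates, key=lambda c: reconstruction_cost(c, reflexes))


# Two languages show [f], one shows [p].
reflexes = ["f", "f", "p"]
print(best_proto_sound(reflexes))  # p
```

Although [f] is the majority reflex here, the asymmetric costs favour [p] as the ancestral sound, which is the kind of reasoning a symmetric cost scheme cannot capture.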

In semantic change, unidirectional tendencies can also be observed, although it is often more difficult to identify them, let alone generalise them. Nevertheless, I think it is a rather safe bet to claim that words which originally mean "head" have a certain tendency to shift their meaning to denote "(the) first, the boss" or "the upper part, the top", while the opposite shift (that words which mean "boss" or "top" will be used to denote "head") is very unlikely to happen. Finally, in grammatical change, or, to be more precise, in grammaticalization (the process by which languages acquire new grammatical categories), directionality is one of the most important constraints (Haspelmath 2004).

Linguists usually know these tendencies very well, and they use them in their daily work, be it when trying to reconstruct the original pronunciation of words in unattested ancestral languages, when deciphering historical documents, or when tracing the semantic development of words through history. Directional changes are also important in evolutionary biology. Ratchet-like (that is, unidirectional) processes serve as a major explanans for constructive neutral evolution (Gray et al. 2010), direction is at the core of lateral gene transfer, and, as David mentioned in an earlier post, the usage of directional (non-reversible) models in phylogenetic reconstruction even provides an elegant way to root a tree (see also Huelsenbeck et al. 2002).

Given the active transfer of ideas from the biological to the linguistic domain in the last two decades, and the important role that directional processes play in both domains, it is surprising to me that methodological transfer has so far been almost exclusively limited to time-reversible models. The only approach known to me that explicitly makes use of linguistic knowledge of directions is that of Baxter (2006). In this paper, Baxter analysed phonological mergers in Chinese dialects within a framework of Camin-Sokal parsimony (Camin and Sokal 1965).
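To make the idea behind Camin-Sokal parsimony a little more concrete, here is a minimal sketch for a single binary character, where 0 stands for "distinction preserved" and 1 for "merger has happened". It assumes that the ancestral state is 0 and that only changes from 0 to 1 are allowed. The tree, the dialect labels A to E, and the function name are hypothetical; this is not Baxter's implementation or data.

```python
def camin_sokal_gains(tree, states):
    """Minimum number of 0 -> 1 changes when 1 -> 0 is forbidden.

    tree   : nested tuples with leaf names (strings) at the tips
    states : dict mapping each leaf name to 0 or 1
    Assumes the state at the root of the tree is 0.
    """
    def visit(node):
        # Returns (all leaves below are 1, gains needed in this subtree).
        if isinstance(node, str):
            return states[node] == 1, states[node]
        results = [visit(child) for child in node]
        if all(all_one for all_one, _ in results):
            # A single gain on the branch leading to this node
            # explains every leaf below it.
            return True, 1
        # Some leaf below is still 0, so this node must be 0 as well,
        # and every child subtree has to account for its own gains.
        return False, sum(gains for _, gains in results)

    return visit(tree)[1]


# Hypothetical dialects A-E; 1 = the merger is attested.
tree = (("A", "B"), ("C", ("D", "E")))
merged = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 0}
print(camin_sokal_gains(tree, merged))  # 2
```

In this toy case the merger has to arise twice: once on the branch leading to A and B, and once on the branch leading to D. Because reversals are excluded, a tree that groups A, B, and D together would need only one change and would therefore be preferred.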

Phonological merger is a specific systemic process in language evolution. When sounds change (and they always change in some way), it may happen that two formerly distinct sounds come to be pronounced in the same way. As a result, words that formerly sounded different may suddenly sound alike, such as English write and right, which now differ only in their spelling, not in their pronunciation. Mergers are a prototypical irreversible process. Once a merger has happened, speakers cannot go back, unless they recorded the original distinction and artificially tuned their language. But even this may be less easy than it seems, since it is always easier to reduce distinctions than to restore them. For example, most English speakers wouldn't have many difficulties in artificially pronouncing all instances of s as sh during a conversation. But being asked to pronounce only a randomly chosen set of words with s as sh will turn out to be much more difficult. For this reason, mergers are an ideal data type for directional models of language change. Their drawback is, however, that they are difficult to determine, which may also be the reason why Baxter's approach has not been tested on other language families since.
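The irreversibility of a merger can be sketched as a many-to-one mapping: applying it is trivial, but undoing it is ambiguous. The segment inventory, the example words (sip and ship), and the function names below are assumptions chosen purely for illustration.

```python
from itertools import product

# Hypothetical merger: s and ʃ fall together as ʃ.
MERGER = {"s": "ʃ"}


def apply_merger(word, merger=MERGER):
    """Replace every merged sound; all other segments stay as they are."""
    return tuple(merger.get(seg, seg) for seg in word)


def possible_sources(word, merger=MERGER):
    """List all pre-merger forms that would yield `word` after the merger."""
    options = []
    for seg in word:
        # A segment may continue itself or any sound that merged into it.
        sources = {seg} | {old for old, new in merger.items() if new == seg}
        options.append(sorted(sources))
    return [tuple(form) for form in product(*options)]


sip = ("s", "ɪ", "p")
ship = ("ʃ", "ɪ", "p")

# Both words end up identical once the merger has applied.
print(apply_merger(sip) == apply_merger(ship))  # True

# Going backwards, the merged form no longer tells us which word it was.
print(len(possible_sources(apply_merger(sip))))  # 2
```

With every additional merged segment in a word, the number of candidate source forms doubles, which is one way to see why a merger, once completed, cannot be undone from the spoken forms alone.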

It may be justified to use time-reversible models for analyses that use lexical data, especially cognate sets, as in the approaches following Gray and Atkinson (2003), since it is difficult to determine the impact of directional processes on lexical replacement. Furthermore, due to the specific way the data are sampled, it is extremely difficult to determine directions. Yet in many other approaches that use different types of data, especially in those that model sound change processes (Hruschka et al. 2015, Wheeler and Whiteley 2014) or grammatical change (Longobardi et al. 2013), it might have a substantial impact on the results if directionality were explicitly modeled.

What does this mean for the directions for the New Year? I keep being surprised by the similarities between evolutionary biology and historical linguistics, be it the organization of information in genomes and languages, the processes that drive evolution, the philosophical questions underlying our investigations, or the quarrels among scholars in their fields. Unfortunately, much of the transfer from the biological to the linguistic domain is still very simplistic, often ignoring the specific differences between the two domains. On the other hand, many fruitful analogies are still out there but have not yet been properly investigated. So, as a direction for those who work in interdisciplinary domains in this New Year, I think we should try to avoid reinventing the wheel, and we should also pay attention to not putting wheels on sledges.

  • Baxter, W. (2006): Mandarin dialect phylogeny. Cah. Linguistique -- Asie Orientale 35.1. 71-114.
  • Camin, J. and R. Sokal (1965): A method for deducing branching sequences in phylogeny. Evolution 19.3. 311-327.
  • Derksen, R. (2008): Etymological dictionary of the Slavic inherited lexicon. Brill: Leiden and Boston.
  • Gray, R. and Q. Atkinson (2003): Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426.6965. 435-439.
  • Gray, M., J. Lukes, J. Archibald, P. Keeling, and W. Doolittle (2010): Cell biology. Irremediable complexity? Science 330.6006. 920-921.
  • Haspelmath, M. (2004): On directionality in language change with particular reference to grammaticalization. In: Fischer, O., M. Norde, and H. Perridon (eds.): Up and down the cline -- The nature of grammaticalization. John Benjamins Publishing Company: 17-44.
  • Hruschka, D., S. Branford, E. Smith, J. Wilkins, A. Meade, M. Pagel, and T. Bhattacharya (2015): Detecting regular sound changes in linguistics as events of concerted evolution. Curr. Biol. 25.1. 1-9.
  • Huelsenbeck, J., J. Bollback, and A. Levine (2002): Inferring the root of a phylogenetic tree. Systematic Biology 51.1. 32-43.
  • Longobardi, G., C. Guardiano, G. Silvestri, A. Boattini, and A. Ceolin (2013): Toward a syntactic phylogeny of modern Indo-European languages. J. Hist. Linguist. 3.1. 122-152.
  • Wheeler, W. and P. Whiteley (2014): Historical linguistics as a sequence optimization problem: the evolution and biogeography of Uto-Aztecan languages. Cladistics 30.1. 1-13.
