The diasystematic structure of languages and its impact on language evolution

What is a language?

It is not easy to define exactly what a language is. One reason for this is the everyday use of the word “language” in non-linguistic contexts. What we call a language does not depend on purely linguistic criteria; the criteria we normally use are social and cultural.

If we were to define languages with the help of linguistic criteria, we would use the degree to which speakers understand each other; and in most cases, we could draw some line around areas of what linguists would call “mutual intelligibility” (similar to the criterion of “interbreedability” in biology). But mutual intelligibility does not usually serve as the criterion by which we define languages in everyday situations. For example, we tend to say that the people from Shanghai, Beijing, and Meixian (all cities in China) all speak “Chinese”. On the other hand, we think that people from Scandinavia speak “Norwegian”, “Swedish”, and “Danish”, although these three are no more different from each other than the former three.

The table above (taken from List 2014: 11f, with adaptations) gives phonetic transcriptions of translations of the sentence “The North Wind and the Sun were disputing which was the stronger” in three Chinese “dialects” (Beijing Chinese, which is also called Mandarin or Standard Chinese, spoken in Beijing and all over the country as a second language; Shanghainese, spoken in Shanghai; and Hakka Chinese, spoken in Meixian), and three Scandinavian “languages” (Norwegian, Swedish, and Danish). In the table, I have put all words that have the same meaning in one column (i.e., I have aligned them semantically). Furthermore, I have highlighted with a gray background the words that share a common etymological origin (call them “homologs” or “cognates”). In red, I have added a more or less literal translation of the respective column.

As the phonetic transcriptions of the sentences show, the Chinese varieties differ from each other to a similar, if not greater, degree than the Scandinavian ones. We find this variation both in the choice of words used to express the meaning of the sentence, and in the degree of etymological similarity between the words. Note, further, that none of the three Chinese dialects is mutually intelligible with any of the others, while we know from famous TV series like Broen/Bron that Danish and Swedish people can often understand each other quite well (with some effort); and Norwegians and Swedes understand each other most of the time. Nevertheless, we address the latter three speech traditions as the three languages “Norwegian”, “Swedish”, and “Danish”, while we say that the varieties spoken by the people in Shanghai, Beijing, and Meixian are merely specific variants of one and the same “Chinese” language.

Languages as Diasystems

One could say that the problem we are facing here is merely a cultural one, not a linguistic one. We could then say that there are two different ways of distinguishing languages from dialects. One would be the linguistic one, which uses mutual intelligibility as the sole criterion to tell languages from dialects. The other one would be the cultural definition of languages as, say, “dialects with an army” (a definition usually attributed to Max Weinreich, the father of the dialectologist Uriel Weinreich).

But this is, unfortunately, only part of the real story, since the cultural definition of the boundaries of a language has a direct impact on the way languages evolve. In societies such as China, for example, a very large proportion of all speakers is bilingual. Apart from their home dialect, speakers are also able to speak Standard Chinese (also called Mandarin Chinese), and they use it to talk to people from different regions, or to read and to write. So, from a purely linguistic viewpoint, it is not necessarily useful to break up the Chinese dialects into distinct languages, since these dialects are located within a larger speech society that is united by a common language for written and interdialectal communication.

In order to describe this complex structure of our modern languages, linguists have proposed the model of the “diasystem”, which is very common in the discipline of sociolinguistics. This model goes back to the aforementioned dialectologist Uriel Weinreich (1926–1967) who originally thought of some linguistic construct which would make it possible to describe different dialects in a uniform way (Weinreich 1954). According to the modern form of the model, a language is a complex aggregate of different linguistic systems, “which coexist and mutually influence each other” (Coseriu 1973: 40, my translation from the German).

An important aspect for determining a linguistic diasystem is the presence of a “Dachsprache” (“roof language”). This is a linguistic variety that serves as a standard for interdialectal communication (Goossens 1973: 11). The different linguistic varieties (dialects, but also sociolects) that are connected by such a standard constitute the “variety space” of a language (Oesterreicher 2001). I have tried to illustrate this in the following figure (taken from List 2014: 13).

As you can see from the figure, there are different “dimensions” according to which the varieties of a language can differ. The figure shows three of them. First, there are “diatopic varieties” which point to the division of a language into different dialects (varying regarding the place where they are spoken).

Second, there are “diastratic varieties”, pointing to different social layers in which the varieties are used. Compare, for example, the language of a football player with that of a politician, which are similar in their tendency to say nothing in many words (especially after hard defeats or before announcing unpopular decisions to the public), but which differ a lot regarding their choice of words. Third, there are “diaphasic varieties”, which are varieties depending on the situation in which people speak. Compare, for example, the way our politician speaks when giving a speech to the public with the way the same politician speaks when discussing big politics behind closed doors.

But these three dimensions of language variation are not all that a diasystem of a language has to offer! We can further identify different speech habits when looking at the medium that is used to produce language: writing and reading differ from speaking and listening in many significant respects. This dimension is commonly called “diamesic” (varying depending on the “medium”).

Last, but not least, we should also note that we do not necessarily speak and understand the language from only one time. Think of modern German kids in school who are forced to read Goethe's Faust, bitterly lamenting the old-fashioned style of the language, but think also about different generations of speakers living in the same speech society. This last dimension of language variety is usually called the “diachronic dimension”. The following image tries to summarize the different dimensions in which the diasystem of a language can vary.
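The five dimensions can also be written down as a small data structure. The following is only a minimal sketch in Python: the class names, the field names, and the example labels (“same town”, “today”, and so on) are my own hypothetical choices, not part of the diasystem model itself. Each variety gets one coordinate per dimension (place, social layer, situation, medium, time), and a helper lists the dimensions along which two varieties differ. The situational dimension appears here under its common name, “diaphasic”.

```python
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    """The five dimensions of variation; each value names a Variety field."""
    DIATOPIC = "place"
    DIASTRATIC = "social_layer"
    DIAPHASIC = "situation"
    DIAMESIC = "medium"
    DIACHRONIC = "period"


@dataclass(frozen=True)
class Variety:
    """One point in the variety space (all labels are illustrative only)."""
    place: str
    social_layer: str
    situation: str
    medium: str
    period: str


def differing_dimensions(a: Variety, b: Variety) -> list:
    """Return the dimensions along which two varieties differ."""
    return [d for d in Dimension if getattr(a, d.value) != getattr(b, d.value)]


# Hypothetical example varieties, loosely following the politician
# and football player examples in the text.
politician_public = Variety("same town", "politician", "public statement", "spoken", "today")
politician_private = Variety("same town", "politician", "closed doors", "spoken", "today")
footballer_public = Variety("same town", "football player", "public statement", "spoken", "today")

print(differing_dimensions(politician_public, politician_private))
print(differing_dimensions(politician_public, footballer_public))
```

The first comparison differs only in the situation, the second only in the social layer. Real varieties will rarely be this tidy, of course; the sketch merely shows that the dimensions are independent axes.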

Diasystematic aspects of language change

Given all of these fancy terms starting with “dia” and ending in “ic”, one may think that they are a mere thought game developed by a bunch of linguist geeks who are interested in sociology. Why can't we just forget about all these different kinds of “variation” and keep on modeling our languages as bags of words? Applying computational methods from biology will be much easier, and as long as we use networks once in a while, we are not completely giving in to the dark side of the Force, which knows only trees. Unfortunately, this is not possible, since the diasystematic structure actually has an impact on the way in which languages change!

As an example from practice, let me tell you how I tried to buy cigarettes when I was in China for the first time. At the time, I had just started to learn Mandarin Chinese, and was really suffering from the difficulty of the language. But I had searched my dictionary several times, and looked up all the important words I needed to tell the man at the kiosk which cigarettes I wanted to have. My choice was “Marlboro”, since it was the only brand I recognized.

Although I had only a complete beginner's knowledge of Chinese, I knew, as a linguist, that the language is peculiar in one specific respect: it has a very, very restricted structure of possible syllables. So one can't say “Saint Petersburg” in Chinese, since syllables in Chinese are not allowed to end in a “t” (as in “Saint”), an “s” (as in the syllable “ters”), or a “g” (as in the syllable “burg”). Instead, Chinese speakers will say Shèngbǐdébǎo. I also knew that there is no sound for “r”, and that this sound is often rendered by using an “l” instead.
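The syllable restriction can be illustrated with a few lines of Python. This is a rough sketch under loudly stated assumptions: it works on plain spelling rather than on sounds, it treats only the letters a, e, i, o, u as vowels, and it assumes that a Mandarin syllable may end only in a vowel, “n”, or “ng” (ignoring, for instance, the r-suffix found in some northern varieties). The function names are hypothetical.

```python
# Assumption: simplified, spelling-based model of Mandarin syllable endings.
VOWELS = set("aeiou")
ALLOWED_FINAL_CONSONANTS = ("n", "ng")


def coda(syllable: str) -> str:
    """Return the consonant letters that follow the last vowel letter."""
    s = syllable.lower()
    i = len(s)
    while i > 0 and s[i - 1] not in VOWELS:
        i -= 1
    return s[i:]


def fits_mandarin(syllable: str) -> bool:
    """True if the syllable ends in a vowel, 'n', or 'ng'."""
    c = coda(syllable)
    return c == "" or c in ALLOWED_FINAL_CONSONANTS


# English syllables from the text, and the Chinese rendering without tone marks.
english = ["saint", "pe", "ters", "burg"]
chinese = ["sheng", "bi", "de", "bao"]

for syl in english + chinese:
    print(syl, "ends in", repr(coda(syl)), "allowed:", fits_mandarin(syl))
```

Under these assumptions, “saint”, “ters”, and “burg” are rejected, while all four syllables of the Chinese rendering pass.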

So, based on this background knowledge, I “translated” the pronunciation of the word “Marlboro” into what I then thought was perfectly understandable Mandarin, and told the man at the shop that I wanted to have a packet of mābóluō cigarettes. Unfortunately, he didn't understand at all what I wanted, and only when I pointed with my finger to the packets of Marlboro cigarettes did he finally understand, and say, “Ah, wànbǎolù!”.

So, I learned that “Marlboro” in Mandarin Chinese is called wànbǎolù, not mābóluō, written 万宝路, literally meaning 10 000-treasure-road, which can be translated as “road of 10 000 treasures”. (Good brand name, actually, especially for cigarettes.) It was only some months later that I understood why my prediction for the Mandarin Chinese pronunciation of “Marlboro” failed so dramatically, when I heard people from Hong Kong pronouncing the word wànbǎolù 万宝路 in Cantonese, the Chinese dialect they speak in Hong Kong. There, wànbǎolù 万宝路 becomes something like [maːn²²-pow³⁵-low³²] (the numbers are tone marks), which sounds very, very similar to the mābóluō I had falsely predicted for Mandarin Chinese.

In the image above, I have tried to depict the process by which “Marlboro” becomes the “road of 10 000 treasures”. What we are dealing with here is a complex pattern of change: both phrases, Mandarin Chinese wànbǎolù and Cantonese [maːn²²-pow³⁵-low³²], are homologous. This applies to their three parts (10 000 + treasure + road), since the phrase itself was presumably not present in earlier stages of Chinese. In the ancestor language of Cantonese and Mandarin Chinese, a variety we usually call “Middle Chinese” (spoken around 600 AD), the phrase “road of 10 000 treasures” would have sounded approximately like [mjon³-paw²-lu³]. In Mandarin Chinese, the pronunciation changed greatly, while it changed only slightly in Cantonese.

When Marlboro entered China, it was probably only sold in Hong Kong in the beginning. So, in order to trigger the interest of Hong Kong consumers, the marketing strategists did a good job in choosing a translation that sounded very similar to the original product name while at the same time having a nice and promising meaning. They would use Chinese characters to write down the product name. When Marlboro, or the “road of 10 000 treasures”, then entered the rest of China, people would read the phrase, but pronounce it in their own way: reading the Chinese characters in Mandarin Chinese just yields wànbǎolù, and not mābóluō.

The transfer of the word from one dialect to another was thus made via the diamesic dimension, via the writing system, not via the spoken language. And this is the way that many, many words (also very basic terms) are exchanged between the Chinese dialect varieties — via their “roof language”, which is the common writing system. And since this change doesn't involve the direct borrowing of a spoken word, it is barely perceivable, since it leaves no direct traces in the pronunciation of the words. While normal borrowings in other languages usually sound outlandish, borrowings in Chinese dialects which make their way from one variety to another via the writing system just sound like any other possible word in the recipient dialect.
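The mechanism can be sketched in a few lines of Python. The readings in the table are the ones quoted in this post (Middle Chinese, Mandarin, Cantonese); the dictionary layout and the function name read_aloud are my own hypothetical choices, and the sketch assumes, for simplicity, that every character has exactly one reading per variety.

```python
# Readings of the three characters, as quoted in the text.
READINGS = {
    "万": {"Middle Chinese": "mjon³", "Mandarin": "wàn", "Cantonese": "maːn²²"},
    "宝": {"Middle Chinese": "paw²", "Mandarin": "bǎo", "Cantonese": "pow³⁵"},
    "路": {"Middle Chinese": "lu³", "Mandarin": "lù", "Cantonese": "low³²"},
}


def read_aloud(written_form: str, variety: str) -> str:
    """Pronounce a written word character by character in the given variety."""
    return "-".join(READINGS[character][variety] for character in written_form)


# The brand name is coined for Cantonese ears, but travels as written characters.
brand = "万宝路"

for variety in ("Cantonese", "Mandarin"):
    print(variety, read_aloud(brand, variety))
```

Nothing spoken is passed from one variety to the other here: only the string of characters is shared, and each variety applies its own inherited readings. That is why the Mandarin result shows no trace of the Cantonese pronunciation that motivated the choice of characters.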


In the same way in which languages may change via the interaction between their written and spoken varieties, the interaction between varieties from the other dimensions may also trigger change. Words originating in one social layer may be transferred to other layers; dialect words of one dialect may become popular and henceforth be used in all dialects; and even those varieties of our languages which are only accessible via stories or books may be revived, at least in part, and find a new steady place in our regular speech, up to the moment where we again cease to use them. The diasystematic structure of languages plays a crucial role in their development. Due to the diasystematic character of languages, language change involves complex network-like structures within one and the same (dia)system. If we really aim to depict language evolution in all its complexity, then it is definitely not a good thing to ignore the diasystematic aspect of languages.


