Monday, March 23, 2020

Evolution unchained: The development of person names and the limits of sequences

What do person names like Jack and Hans have in common, and what unites Joe and Pepe? Each pair goes back to a common ancestor. For Jack and Hans, this would be John (ultimately going back to Iōánnēs in Greek), and for Joe and Pepe, this would be Josef (originally from Hebrew). Given the striking dissimilarity of the names in their current form, the pathways of change by which they have evolved into their current shape are quite complicated.

While the German name Hans can be easily shown to be a short form of the German variant Johannes, the evolution of Jack is more complicated. First (at least this is what people on Wikipedia suppose), Iōánnēs becomes John in English, similar to the process that transformed German Johannes into Hans. Then, in an ancient form of English, a diminutive was built for John, which yielded the form Jenkin, with the diminutive suffix -kin that has a homologous counterpart in German -chen (which can be attached to Hans as well, yielding Hänschen). Jack is then supposed to go back to this diminutive. Etymologically, Jack is thus little Johnny.

While Joe in English is a shortening of Josef, the development of Pepe is again a bit more complex. First, we find the form Giuseppe as an Italian counterpart of Josef. How this form then yielded Pepe as a diminutive is not completely clear to me; but since we find the pe in the Italian form, we can think of a process by which Giuseppe becomes Giuseppepe, leaving Pepe after the deletion of the initial two syllables.

The complexity of person-name evolution

Even from these two examples alone, we can already see that the evolution of person names can easily become quite complex. If all words in all spoken languages in the world evolved in the same way in which our person names evolve, we would have a big problem in historical linguistics, since the amount of speculation in our etymologies would drastically increase.

When comparing etymologically related words from different languages, we generally assume that they show regular correspondences among their sound segments. This presupposes that there is still enough sound material that reflects these correspondences, allowing us to detect and assess them. But since the evolution of person names rarely consists of the regular modification of sounds, but rather results in the deletion, reduplication, and rearrangement of whole word parts, there is rarely enough left in the end that could be used as the basis for a classical sequence comparison.

With the name Tina in German being the short form of Bettina, Christina, and at times even Katharina, and with Bettina itself going back to Elisabeth, and with Tina becoming Tinchen, Tinka, or Tine, we face an almost insurmountable challenge when trying to model the complexity of the various patterns by which names can change.

Modeling word derivation with directed networks

That words do not evolve solely by the alteration of sounds, but also by different forms of derivation, is nothing new for historical linguistics. We face the problem, for example, when looking for etymologically related words in the basic lexicon of phylogenetically related languages. However, these phenomena can be easily investigated by enhanced means of annotation. The evolution of person names, on the other hand, presents us with larger challenges.

While working as a research fellow in France in 2015-2016, I had the time to develop a small tool that represents derivational relations between related words with the help of a directed network, and thus allows us to model these relations in a rough way. The words are the nodes of the network, and the edges are drawn from the assumed ancestor word forms to their descendants. This tool, which I then called DeriViz, is still available online, and makes it possible to visualize network relations between words.

I have now conducted a small experiment with this tool, by taking name variants of Elisabeth, as they are listed in Wikipedia, and trying to model them in a directed network, along with intermediate stages. You can easily do this yourself, by copying the network that I have constructed in text form below, and pasting it into the field for data entry on the DeriViz homepage. The network will be visualized when you press the OK button, and you can play with it by dragging it around.

Elisabeth → BETT
BETT → Betty
BETT → Bettina
BETT → Bettine
BETT → Betsi
Elisabeth → ELISABETH
Elisabeth → ILSA
ILSA → Ilsa
ILSA → Ilse
Elisabeth → Isabella
Elisabeth → LISA
LISA → Lieschen
LISA → Liese
LISA → Liesel
LISA → Lis
LISA → Lisa
LISA → Lisbeth
LISA → Lisette
LISA → Lise
LISA → Liesl
Elisabeth → LILA
LILA → Lila
LILA → Liliane
LILA → Lilian
LILA → Lilli
Elisabeth → Sisi

I intentionally reduced the amount of data here, in order to make sure that the graphic can still be inspected. But it is clear that even this simple model, which assumes unique ancestor-descendant relations among all of the derived person names, is stretched to its limits when applied to names as productive as Elisabeth, at least as far as the visualization is concerned.

Derivation network of names derived from Elisabeth
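
For readers who would rather process the edge list than paste it into the tool, here is a minimal Python sketch. It is not how DeriViz works internally; it only assumes the "ancestor → descendant" line format used above. The function names are made up for this illustration, and the sample data is a subset of the Elisabeth network.

```python
from collections import defaultdict

# A subset of the edge list from the text, one "ancestor → descendant" pair per line.
EDGE_LIST = """
Elisabeth → BETT
BETT → Betty
BETT → Bettina
Elisabeth → LISA
LISA → Lisa
LISA → Liesel
Elisabeth → Sisi
"""


def parse_edges(text, arrow="→"):
    """Turn lines of the form 'ancestor → descendant' into (ancestor, descendant) pairs."""
    edges = []
    for line in text.splitlines():
        if arrow not in line:
            continue  # skip blank lines and anything without an arrow
        source, target = (part.strip() for part in line.split(arrow, 1))
        edges.append((source, target))
    return edges


def build_graph(edges):
    """Return two lookup tables: children per node and parents per node."""
    children = defaultdict(list)
    parents = defaultdict(list)
    for source, target in edges:
        children[source].append(target)
        parents[target].append(source)
    return children, parents


def has_unique_ancestors(parents):
    """True if no name has more than one direct ancestor (the simple model in the text)."""
    return all(len(sources) <= 1 for sources in parents.values())


def lineage(name, parents):
    """Follow the first parent link upwards until a name without ancestor is reached."""
    path = [name]
    seen = {name}
    while parents.get(path[-1]):
        parent = parents[path[-1]][0]
        if parent in seen:  # guard against accidental cycles in the data
            break
        path.append(parent)
        seen.add(parent)
    return path


edges = parse_edges(EDGE_LIST)
children, parents = build_graph(edges)
print(has_unique_ancestors(parents))
print(" ← ".join(lineage("Bettina", parents)))
```

Adding the two lines "Bettina → Tina" and "Christina → Tina" to the data makes has_unique_ancestors return False, which is exactly the situation described for Tina earlier: one short form with several possible sources.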

If you now imagine that there are various processes that turn an ancestral name into a descendant name, and that one would ideally want to model the differences between these processes as well, one can see easily that it is indeed not a trivial problem to model the evolution of person names (and we are not even speaking of inferring any of these relations).

How names evolve

Names evolve in various ways along different dimensions. One dimension is their primary function, or use: among other things, we tend to use nicknames. Formally, nicknames are often a short form of an original name, but depending on the community of speakers, there may also be a formal procedure by which a nickname can be derived from a base name. Thus, every speaker of Russian should know that Jekaterina can be turned into Katerina, which can be turned into Katja, which can be turned into Katjuscha, or, in the case of a vocative, into Katj. Once the primary function of a name changes, its form usually changes as well, as we can see in many examples.

But the form can also change when a name crosses language borders. If you go with your name into another country, and the speakers have problems pronouncing certain sounds that occur in your name, it is very likely that they will adjust your name's pronunciation to the phonetic needs of their own language, and modify it. Names cross language borders very quickly, since we tend not to leave them at home when visiting or migrating to foreign countries. As a result, a great deal of the diversity of person names observed today is due to the migration of names across the world's larger linguistic communities.

How we change names when building short forms or nicknames, or when trying to adapt a name to a given target language, depends on the structure of the language. The most important part is the phonology of the language in which the change happens. For example, when a name is transferred from one language to another, and the new language lacks some of the sounds in the original name, speakers will replace them with those sounds which they perceive to be closest to the missing ones.

But the modification is not restricted to the replacement of sounds. My own given name, Mattis, for example, usually has the stress on the first syllable, but in France, most people tend to call me Matisse, with the accent on the second syllable, reflecting the general tendency to stress the last syllable of a word in French. In Russian, on the other hand, Mattis could be pronounced perfectly well, but since people do not know the name, they often confuse it with its variant Matthias, which then sounds like Matjes when pronounced in Russian (which is the name for soused herring in Germany). There are more extreme cases; and both English and German speakers are also good at drastically adjusting foreign names to the needs of their mother tongues.

It would be nice if it were possible to investigate the huge diversity in the evolution of person names more systematically. In principle, this should be possible. I think that starting from directed networks is definitely a good idea, but the model would probably have to be extended by distinguishing different types of graph edges. Even if a given selection of edge types may not handle all of the processes known to us, it might help to collect some primary data in the first place.
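
To make the idea of typed edges a bit more concrete, here is a rough Python sketch. The name pairs come from the examples in this post, but the class name, the function names, and above all the process labels ("shortening", "diminutive", "nickname") are assumptions made for this illustration, not an established classification.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    """One directed edge: an ancestor name, a descendant name, and a process label."""
    source: str
    target: str
    process: str  # hypothetical label, chosen for this sketch only


# Name pairs taken from the text; the process labels are illustrative guesses.
DERIVATIONS = [
    Derivation("Johannes", "Hans", "shortening"),
    Derivation("Hans", "Hänschen", "diminutive"),
    Derivation("Jekaterina", "Katerina", "shortening"),
    Derivation("Katerina", "Katja", "nickname"),
    Derivation("Katja", "Katjuscha", "nickname"),
]


def descendants_via(name, process, derivations):
    """Names derived directly from `name` by the given process."""
    return [d.target for d in derivations
            if d.source == name and d.process == process]


def process_counts(derivations):
    """How often each process label occurs in the data."""
    return Counter(d.process for d in derivations)


print(descendants_via("Hans", "diminutive", DERIVATIONS))
print(process_counts(DERIVATIONS).most_common())
```

Keeping the label as a plain string makes it easy to start annotating with a small, provisional set of processes and to refine the set later, once more data has been collected.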

With a large enough set of well-annotated data, on the other hand, one might start to look into the development of algorithms that could infer derivation relationships between person names; or one could analyze the data and search for the most frequent processes of person name evolution. Many more analyses might be possible. One could see to which degree the processes differ across languages, or how names migrate from one language to another across times, usage types, and maybe even across fashions.
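
How hard the inference part would be can be seen from a deliberately naive heuristic, sketched below in Python. It proposes as possible base names all longer names that contain the short form as a contiguous string. This is a toy baseline under that single assumption, not a serious algorithm, and the function name is made up for the example.

```python
def candidate_bases(short_form, names):
    """Return all names that are longer than short_form and contain it as a substring.

    The comparison ignores case; it knows nothing about sounds, spelling variants,
    or any derivation process other than cutting material off a name.
    """
    needle = short_form.lower()
    return [name for name in names
            if len(name) > len(needle) and needle in name.lower()]


# Names mentioned in the text.
NAMES = ["Elisabeth", "Bettina", "Christina", "Katharina", "Johannes"]

for short in ["Tina", "Lisa", "Hans"]:
    print(short, candidate_bases(short, NAMES))
```

On this small list, the heuristic links Tina to Bettina and Christina, and Lisa to Elisabeth, but it misses Katharina as a source of Tina and finds no base at all for Hans, since neither short form survives as an intact substring. Anything beyond plain truncation would already need knowledge about the processes involved.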


I assume that the result of such a collection would be interesting not only for couples who are about to replicate themselves, but also for historical research and research in the field of cultural evolution. Whether such a collection will ever exist, however, seems doubtful. The problem is that there are not enough scholars in the world who would be interested in this topic, as one can see from the very small number of studies that have been devoted to the problem up to now (as one of the few exceptions known to me, compare the nice overview of person name classification by Handschuh 2019). I myself would not be able to help in this endeavour, given that I lack the scholarly competence to investigate name evolution. But I would certainly like to inspect the results, if they ever become available.


Handschuh, Corinna (2019) The classification of names. A crosslinguistic study of sex-specific forms, classifiers, and gender marking on personal names. STUF — Language Typology and Universals 72.4: 539-572.
