Tuesday, May 30, 2017

Killer arguments and the nature of proof in historical sciences

A long time ago, somebody told me this joke, which I just found again on the internet in an English version (following jokes.cc.com, with modifications based on my memory):
Teacher: "Four crows are on the fence. The farmer shoots one. How many are left?"
Little Johnny: "None."
Teacher: "Listen carefully: Four crows are on the fence. The farmer shoots one. How many are left?"
Little Johnny: "None."
Teacher: "Can you explain that answer?"
Little Johnny: "One is shot, the others fly away. There are none left."
Teacher: "Well, that isn't the correct answer, but I like the way you think."
Little Johnny: "Teacher, can I ask a question?"
Teacher: "Sure."
Little Johnny: "There are three women in the park. The first one reads a love novel, the second one reads the newspaper, and the third one updates her Facebook profile. Which one of them is married?"
Teacher: "The one reading the newspaper?"
Little Johnny: "No. The one with the wedding ring on, but I like the way you think."
Given the title of this post, you may wonder why I tell you that joke. The reason is that, for me, the joke captures the situation we often face in the historical sciences when we talk about "proof", be it of the closer relationship of different species, or the ultimate relationship of languages. From the evidence we are given, we can draw an awful lot of conclusions in order to arrive at a convincing story, but if we see the wedding ring on somebody's hand, we know the true story no matter what other evidence we are given. The wedding ring in the joke serves as a killer argument: no matter what other evidence we consider, it is much more likely that the person who is married is the one with the ring than anybody else.

We often face similar situations in the historical sciences, where we seek some kind of true story behind a handful of facts and are then given external evidence that points directly to the right answer, or (let's be careful) the most probable answer, regardless of where the other evidence might point. We can think of similar situations in crime investigations, where we may think that a large body of evidence incriminates some person as a murderer until we see some video proof that reveals the real offender.

That crime investigations have a lot in common with research in the historical sciences has been noted before by many people, notably the famous Umberto Eco (1932-2016), who edited a whole anthology on the role of circumstantial evidence in linguistics, semiotics, and philosophy (Eco and Sebeok 1983) where scholars compared the work of Sherlock Holmes with the work of people in the historical sciences. What Sherlock Holmes and historical linguists (and also evolutionary biologists) have in common is the use of abduction as their fundamental mode of reasoning. The term itself goes back to Charles Sanders Peirce (1839-1914), who distinguished it from deduction and induction:
Accepting the conclusion that an explanation is needed when facts contrary to what we should expect emerge, it follows that the explanation must be such a proposition as would lead to the prediction of the observed facts, either as necessary consequences or at least as very probable under the circumstances. A hypothesis then, has to be adopted, which is likely in itself, and renders the facts likely. This step of adopting a hypothesis as being suggested by the facts, is what I call abduction. I reckon it as a form of inference, however problematical the hypothesis may be held. (Peirce 1931/1958: 7.202)
Our problem in the historical sciences is that we are searching for an original situation: what was the case a long time ago, based on general knowledge about (evolutionary or historical) processes and the results of this situation. When Sherlock Holmes looks at a crime scene, he sees the results of an action and uses his knowledge of human behaviour to find the one who was responsible for the crime. When doctors listen to the heartbeat of patients who are short of breath, they try to find out what causes their disease by making use of their knowledge about symptoms and the diseases that could have caused them. When linguists look at words from different languages, they make use of their knowledge of processes of language change and language contact in order to work out why those languages are so similar.

As do medical practitioners or crime investigators, we have our general schema, our protocol, which we use to carry out our investigations. Biologists search for similar DNA sequences, linguists look for similar sound sequences. In most cases, this works fine, although we are usually left with uncertainties and things that do not really seem to add up. As long as we can quietly follow the protocol, we are fine; and even if the results of our research do not necessarily last for a long time, being superseded by more recent research, we usually have the impression that we did the best we could, given the complex circumstances with their complex circumstantial evidence. But once in a while, we uncover evidence similar to video proofs in crime investigation, or wedding rings as in the Little Johnny joke: evidence that is so striking that we have to put our protocol to one side and just accept that there is only one solution, no matter what the rest of our evidence or our protocol might point to.

In 1879, Ferdinand de Saussure (1857-1913) predicted two consonantal sounds in Proto-Indo-European based on circumstantial evidence (Saussure 1879). In 1927, Jerzy Kuryłowicz (1895-1978) was able to show that one of the sounds was still pronounced in Hittite, an Indo-European language that was not known during Saussure's time (Lehmann 1992: 33), and had just been deciphered. While Saussure followed protocol in his investigation, Kuryłowicz provided the video proof, and only since then has Saussure's hypothesis become communis opinio in historical linguistics.

I assume that nobody will doubt the existence of different kinds of proof, different qualities of proof, in historical disciplines. If we are left with nothing else but our protocol, we can derive certain conclusions, but we can easily abandon our protocol once we have been presented with those killer arguments, that specific kind of proof that is so striking that we do not need to bother to have a look at any alternative facts again. I do not know of any similar examples in biology, but in linguistics (and in crime investigation, at least judging from the crime novels I have read), it is obvious that our evidence can not only be ranked, but that there is also a steep gradient between the standard evidence we use to make most of our arguments and those killer arguments that are so striking that no doubt is left.

In the short story The Adventure of the Beryl Coronet, Sherlock Holmes says:
[When] you have excluded the impossible, whatever remains, however improbable, must be the truth.
But this is only partially true, as in Sherlock Holmes' cases the truth is usually (but not always!) presented in such a form that it does not leave any place for doubt. Sherlock Holmes is a genius at finding the wedding rings on the fingers of his witnesses. As historical scientists, we are often much less lucky, but probably also less talented than Mr. Holmes. We are thus left with the fundamental problem of not knowing how to find the killer evidence, or how to quantify the doubt in those cases where we just follow the general protocol of our discipline.

  • Eco, U. and T. Sebeok (1983) The Sign of Three. Dupin, Holmes, Peirce. Indiana University Press: Bloomington.
  • Lehmann, W. (1992) Historical linguistics. An Introduction. Routledge: London.
  • Peirce, C. (1931/1958) Collected Papers of Charles Sanders Peirce. Harvard University Press: Cambridge, Mass.
  • Saussure, F. (1879) Mémoire sur le système primitif des voyelles dans les langues indo-européennes. Teubner: Leipzig.

Tuesday, May 23, 2017

A test case for phylogenetic methods and stemmatics: the Divine Comedy

In a previous post I gave an outline of stemmatics, and briefly touched on the adoption and advantages of phylogenetic methods for textual criticism (On stemmatics and phylogenetic methods). Here I present the results of an empirical investigation I have been conducting, in which such methods are used to study some philological dilemmas of a cornerstone work in textual criticism, Dante Alighieri's Divine Comedy. I am reproducing parts of the text and the results of a paper still under review; the NEXUS file for this research is available on GitHub.

Before describing the analysis, I discuss the work and its tradition, as well as some of the open questions concerning its textual criticism. This should not only allow the main audience of this blog to understand (and perhaps question) my work, but it is also a way to familiarize you with the kind of research conducted in stemmatics. After all, the first step is the recensio, a deep review of all information that can be gathered about a work.

The Divine Comedy

The Divine Comedy is an Italian medieval poem, and one of the most successful and influential medieval works. It is written in a rigid structure that, when compared to other works, guaranteed it a certain resistance to copy errors, as most changes would be immediately evident. Composed of three canticas (Inferno, Purgatory, and Paradise), the first of its 100 cantos were written in 1306-07, with the work completed not long before the death of the author in 1321. Written mostly during Dante's exile from his home city, Florence (Tuscany), like many works of the time it was published as the author wrote it, and not only upon completion. In fact, it is even possible, while not proven, that the author changed some cantos and published revisions, thus being himself the source of unresolvable differences.

No original manuscript has survived, but scholarship has traced the development of the tradition from copies and historical research. The poem is one of the most copied works of the Middle Ages, with more than 600 known complete copies, besides another 200 partial and fragmentary witnesses. For comparison, there are around 80 copies of Chaucer's Canterbury Tales, which is itself a successful work by medieval standards.

Commercial enterprises soon developed to attend to the market demand of its success. In terms of geographical diffusion, quantitative data suggests that, before the Black Death that ravaged the city of Florence in 1348, scribal activity was more intense in Tuscany than in Northern Italy, where the author had died. Among the hypotheses for its textual evolution, the results of my investigation support the widespread hypothesis that Dante published his work with Florentine orthography in Northern Italy. That is, the first copies adopted Northern orthographic standards, which would then revert to Tuscan customs, with occasional misinterpretations, when the work found its way back to Florence. These essentials of the transmission must be considered when curating a critical edition, as the less numerous Northern manuscripts, albeit with an adapted orthography, can in general be assumed to be closer to the archetype (if there ever was one to speak of) than the Florentine ones.

The tradition is characterized by intentional contamination, as the work soon became a focus of politics and grammar prescriptivism. Errors and contamination have been demonstrated even in the earliest securely dated manuscript, the Landiano of 1336 (cf. Shaw, 2011), and can also be identified in the first commentaries dating from the 1320s (such as in the one by Jacopo Alighieri, the author's son).

Critical studies

Here are some details about previous studies. I have included considerable stemmatic detail, along with a biological analogy to help non-experts make sense of it.

The first critical editions date from the 19th century, but a stemmatic approach was advanced only at the end of that century, by Michele Barbi. Facing the problem of applying Lachmann's method to a long text with a massive tradition, in 1891 Barbi proposed his list of around 400 loci (samples of the text), inviting scholars to contribute the readings in the manuscripts they had access to. His project, which was intended to establish a complete genealogy without the need for a full collatio, had disappointing results, with only a handful of responses. Mario Casella would later (1921) conduct the first formal stemmatic study of the poem, grouping some older manuscripts into two families, α and β, with an unequal number of witnesses but equal value for the emendatio. His two families are not rooted at a higher level, but he observed that they share errors, supporting the hypothesis of a common ancestor, likely copied by a Northern scribe.

Casella's stemma, reproduced from Shaw (2011).

Forty years later, Giorgio Petrocchi proposed to overcome the large stemma by employing only witnesses dating from before the editorial activity of Giovanni Boccaccio, as his alterations and influence were considered to be too pervasive. Petrocchi defended a cut-off date of 1355 as being necessary for a stemmatic approach, which would otherwise have been impossible, given the level of contamination of later copies. The restriction in the number of witnesses was contrasted with his expansion of the collatio to the entire text, criticizing Barbi's loci as subjective selections for which there was no proof of sufficiency.

Making use of analogies with biology, we may say that Barbi proposed to establish a tree from a reduced number of "proteins" for all possible "taxa". Casella considered this to be impracticable and, selecting a few representative "fossils", built a tree from a large number of phenotypic characteristics. Finally, Petrocchi produced a network while considering the entire "genome" for all "fossils" dated from before an event that, while well-supported in theory (we could compare its effects to a profound climate change), was nonetheless arbitrary.

Petrocchi's stemma, reproduced from Shaw (2011).

Questions about Petrocchi's methodology and assumptions were soon raised, particularly regarding the proclaimed influence of Boccaccio, without quantitative proofs either that his editions were as influential as asserted or that all later witnesses were superfluous for stemmatics. Later research focused on questioning his stemma: for example, the absence of consensus about the relationship between the Ash and Ham manuscripts, the supposedly weak demonstration of the polytomy of Mad, Rb, and Urb (the "Northern manuscripts"), and the dating of Gv (likely copied fifty to a hundred years after the date Petrocchi assumed). Evidence was presented that Co, a key manuscript in his stemma, could not be an ancestor of Lau (its copyist was still active in the 15th century), and that Ga contained disjunctive errors not found in its supposed descendants. Abusing once more the biological analogy, the dating of his "fossils" was in some cases plainly wrong.

Federico Sanguineti presented an alternative stemma in 2001, arguing that a rigorous application of stemmatics would evidence errors made by Petrocchi. To that end, he decided to resurrect Barbi's loci and trace the first complete genealogy, without arbitrary and a priori decisions about the usefulness of the textual witnesses. Sanguineti defended the suggestion that, after this proper recensio, a small number of manuscripts (which he eventually set to seven) would be sufficient for emendation. His stemma, described as "optimistic in its elegance and minimalism" (Shaw 2011), resulted in a critical edition that heavily relied on a single manuscript, Urb, the only witness of his β family (as Rb was displaced from the proximity it had in Petrocchi's stemma, and Mad was excluded from the analysis). Keeping with the biological analogy, Sanguineti proposed building a tree from an extremely reduced number of "proteins", but for all "taxa". In the end, however, the reduced number of "proteins" was considered only for seven "taxa", selected mostly due to their age.

Sanguineti's stemma, reproduced from Shaw (2011).

The edition of Sanguineti was attacked by critics, who objected to the limited number of manuscripts used in the emendatio, the position of Rb, the high value attributed to LauSC, and the unparalleled importance of Urb, all resulting in an unexpected Northern coloring to the language of a Florentine writer. Regarding his methodology, reviewers pointed out that stemmatic principles had not been followed strictly, as the elimination was not restricted to descripti, but extended to branches that were considered to be too contaminated.

The digital edition of Prue Shaw (2011) was developed as a project for phylogenetic testing of Sanguineti's assumptions. Her edition includes complete manuscript transcriptions, and the transcriptions include all of the layers of revision of each manuscript (original readings and corrections by later hands), and are complemented by high-quality reproductions of the manuscripts. After testing the validity of Sanguineti's method and stemma, Shaw concluded that his claims do not "stand up to close scrutiny", and that the entire edition is compromised, because Rb "is shown unequivocally to be a collaterale of Urb, and not a member of α as [Sanguineti] maintains".

Applying phylogenetic methods

With the goal of following and, to a large part, replicating Shaw (2011), I have analyzed signals of phylogenetic proximity for validating stemmatic hypotheses, produced both a computer-generated and a computer-assisted phylogeny (equivalent to a stemma), and evaluated the performance of such phylogenies with methods of ancestral state reconstruction.

I wanted to investigate the phylogenetic proximity of witnesses and the statistical support for the published stemmas. After experiments with rooted graphs, I decided to use NeighborNets, in which splits are indicative of observed divergences and edge lengths are proportional to the observed differences. These unrooted split networks were preferable because they facilitated visual investigation, and also provided results for the subsequent steps. These involved exploring the topology and evaluating potential contaminations, guiding the elimination of taxa whose data would be redundant for establishing prior hypotheses of genealogical relationships. Analyses were conducted using all manuscript layers and critical editions, both with and without bootstrapping, thus obtaining results supported in terms of inferred trees as well as character data.
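To make the idea of "observed differences" concrete, here is a minimal sketch of the kind of pairwise distance matrix that a NeighborNet is computed from. It assumes each witness is encoded as a list of readings at the same loci. The sigla come from the text, but the readings are invented toy data and the function names are hypothetical. The normalized Hamming distance is only one simple choice, not necessarily the one used in the study, and NeighborNet itself (as run in SplitsTree) is not reimplemented here.

```python
from itertools import combinations

# Invented readings at five loci for four witnesses (toy data only).
# "?" marks a locus that is missing or illegible in a witness.
readings = {
    "Ash": ["a", "b", "a", "c", "a"],
    "Ham": ["a", "b", "a", "c", "b"],
    "Rb":  ["b", "a", "a", "?", "b"],
    "Urb": ["b", "a", "b", "a", "b"],
}


def hamming_distance(x, y, missing="?"):
    """Proportion of loci at which two witnesses disagree, skipping gaps."""
    compared = [(a, b) for a, b in zip(x, y) if missing not in (a, b)]
    if not compared:
        return 0.0
    return sum(a != b for a, b in compared) / len(compared)


def distance_matrix(data):
    """Symmetric matrix of pairwise distances, stored as a dictionary."""
    names = sorted(data)
    dist = {(n, n): 0.0 for n in names}
    for p, q in combinations(names, 2):
        dist[(p, q)] = dist[(q, p)] = hamming_distance(data[p], data[q])
    return names, dist


names, dist = distance_matrix(readings)
for p in names:
    print(p.ljust(4), " ".join(f"{dist[(p, q)]:.2f}" for q in names))
```

In this toy matrix Ash and Ham are close to each other, as are Rb and Urb, which is the kind of signal a split network would display as two well-separated groups.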

NeighborNet of the manuscripts and revisions from my data, generated with SplitsTree (Huson & Bryant 2006)

The analysis confirmed most of the conclusions of Shaw (2011): there are no doubts about the proximity and distinctiveness of Ash and Ham, with Sanguineti's hypothesis (in which they are collaterals) better supported than Petrocchi's hypothesis (in which the first is an ancestor of the second). The proximity of Mart and Triv was confirmed; but the position of the ancestors postulated by Petrocchi and Sanguineti should be questioned in the face of the signals they share with LauSC, perhaps because of contamination. The most important finding, in line with Shaw and in contrast with the fundamental assumption of Sanguineti, is the clear demonstration of the relationship between Rb and Urb.

The relationship analyses allowed the generation of trees for further evaluation. Although a full Bayesian tree inference was the original goal, I discarded this option because, without a careful and demanding selection of priors, it would yield flawed results. I therefore decided to build trees using both stochastic inference and manual design. This postponed more complex topology analyses for future research, but generated the structures needed by the subsequent investigation steps; both trees are included in the datafile.

The second tree (shown below), which allows polytomies and which I constructed manually, tries to combine the findings of Petrocchi and Sanguineti by resolving their differences with the support of the relationship analyses. Using Petrocchi's edition as a gold standard, and considering only single hypothesis reconstructions, parsimonious ancestral state reconstructions agree with 9,016 characters (79.9%). When considering multiple hypotheses, instead, reconstructions agree with 10,226 characters (90.7%). Cases of disagreement were manually analyzed and, as expected, most resulted from readings supported by the tradition but refuted by Petrocchi on exegetic grounds.
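The evaluation just described can be sketched with the bottom-up pass of Fitch parsimony. This is only an illustration under several assumptions: the tree is strictly binary (the tree discussed here allows polytomies, which need a generalized rule), the topology and readings are invented, and all function names are hypothetical. I also assume that a "single hypothesis" reconstruction is one where parsimony leaves exactly one reading at the root, and that a "multiple hypotheses" reconstruction counts as agreeing whenever the gold-standard reading is among the equally parsimonious ones.

```python
def fitch_root_set(tree, states):
    """Bottom-up Fitch pass for one character on a binary tree.

    `tree` is a nested 2-tuple of witness names and `states` maps each
    witness to its reading. Returns the set of equally parsimonious
    readings at the root.
    """
    if isinstance(tree, str):
        return {states[tree]}
    left, right = tree
    left_set = fitch_root_set(left, states)
    right_set = fitch_root_set(right, states)
    common = left_set & right_set
    return common if common else left_set | right_set


def agreement(tree, characters, gold):
    """Count characters whose reconstruction agrees with the gold standard."""
    single = multiple = 0
    for states, reference in zip(characters, gold):
        root = fitch_root_set(tree, states)
        if root == {reference}:   # one unambiguous reading, same as gold
            single += 1
        if reference in root:     # gold is among the candidate readings
            multiple += 1
    return single, multiple


# Invented topology and readings (toy data, not the tree from this post).
tree = ((("Ash", "Ham"), "Mart"), ("Rb", "Urb"))
characters = [
    {"Ash": "a", "Ham": "a", "Mart": "a", "Rb": "a", "Urb": "b"},
    {"Ash": "a", "Ham": "a", "Mart": "a", "Rb": "b", "Urb": "b"},
    {"Ash": "c", "Ham": "c", "Mart": "c", "Rb": "c", "Urb": "c"},
    {"Ash": "a", "Ham": "b", "Mart": "b", "Rb": "b", "Urb": "b"},
]
gold = ["a", "b", "d", "b"]  # the third reading stands for an editorial emendation

single, multiple = agreement(tree, characters, gold)
print(f"single: {single}/{len(gold)}, multiple: {multiple}/{len(gold)}")
```

The third toy character mirrors the disagreements mentioned above: every witness carries the same reading, so no reconstruction from the tradition can recover a reading that the editor chose on other grounds.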

My proposed tree for the manuscripts selected by Sanguineti, generated with PhyD3 (Kreft et al., 2017).

This tree suggests that, in general, Petrocchi's network is better supported than the tree by Sanguineti, as phylogenetic principles lead us to expect: the first was built considering statistical properties and using all of the available data, while the second relied on many intuitions and assumptions that were never really tested. In particular, it supports the findings of Shaw and, as such, allows us to indicate the critical edition of Petrocchi as the best one. Even more important, however, it is further evidence of the usefulness of phylogenetic methods, when appropriately used, in stemmatics.


Alagherii, Dantis (2001) Comedìa. Edited by Federico Sanguineti. Firenze: Edizioni del Galluzzo.

Alighieri, Dante (1994) La Commedia Secondo L’antica Vulgata: Introduzione. Edited by Giorgio Petrocchi. Opere di Dante Alighieri v. 1. Firenze: Le Lettere.

Huson, Daniel H.; Bryant, David (2006) Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23: 254–267.

Inglese, Giorgio (2007) Inferno, Revisione del testo e commento. Roma: Carocci.

Kreft, Lukasz; Botzki, Alexander; Coppens, Frederik; Vandepoele, Klaas; Van Bel, Michiel (2017) PhyD3: a Phylogenetic Tree Viewer with Extended PhyloXML Support for Functional Genomics Data Visualization. BioRxiv. Doi: 10.1101/107276.

Leonardi, Anna M.C. (1991) Introduzione. In: La Divina Commedia, by Dante Alighieri. Milano: Arnoldo Mondadori Editore.

Shaw, Prue (2011) Commedia: a Digital Edition. Birmingham: Scholarly Digital Editions.

Trovato, Paolo (2016) Metodologia editoriale per la Commedia di Dante Alighieri. Ferrara. https://www.youtube.com/watch?v=BfKUOAR9PXA. Date of access: March 19, 2017.

Tuesday, May 16, 2017

Connecting tree and network edges

I have struggled over the years to try to understand the relationship between trees and networks. In one sense, networks are generalizations of trees, and in another sense a tree is just a simplified network. But it is not always that simple.

For example, not all networks can be created by adding edges to a tree (see Networks vs augmented trees); so the connection between trees and networks is not always obvious. Moreover, it is not always easy to determine which tree edges are present in any given network, or which network edges are present in a given tree.

Nevertheless, this should be basic information in phylogenetics — otherwise, how can we know when a tree is adequate for our purposes, or when a network is needed?

It turns out that I have not been alone in struggling to connect trees and networks. Fortunately, some of these other people decided to actually do something about it, rather than simply struggling on. As a result, a computerized way to relate much of the important information connecting trees with networks now exists.
Klaus Schliep, Alastair J. Potts, David A. Morrison and Guido W. Grimm
Intertwining phylogenetic trees and networks.
Methods in Ecology and Evolution (Early View)
To quote the authors:
Here we provide a framework, implemented in the PHANGORN library in R, to transfer information between trees and networks. This includes: (i) identifying and labelling equivalent tree branches and network edges, (ii) transferring tree branch-support to network edges, and (iii) mapping bipartition support from a sample of trees (e.g. from bootstrapping or Bayesian inference) onto network edges.
These three functions are illustrated in this figure, taken from the paper. It should be self-explanatory to anyone who has tried to relate the edges of trees and networks; but if it is not, then you can read an explanation in the paper.

The R library referred to, including the source code, along with some examples and vignettes, can be accessed on the PHANGORN CRAN page.

Note that PHANGORN (originally created by Klaus Schliep) also contains other functions related to estimating phylogenetic trees and networks, using maximum likelihood, maximum parsimony, distance methods and Hadamard conjugation. Specifically, it allows you to estimate phylogenies, compare trees and models, explore tree space, and visualize phylogenetic trees and split graphs.
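The idea behind relating tree branches to network edges can be sketched independently of the library. Every internal tree branch and every set of parallel network edges corresponds to a bipartition (split) of the taxa, so the two can be matched by comparing bipartitions. The following is a conceptual sketch in Python and not the PHANGORN API, which is in R. All function names are hypothetical, the trees and splits are invented, and trees are assumed to be given as nested tuples of taxon names.

```python
def normalize(side, taxa):
    """Represent a split by the side that does not contain the first taxon."""
    side = frozenset(side)
    return frozenset(taxa) - side if min(taxa) in side else side


def tree_splits(tree, taxa):
    """Non-trivial bipartitions induced by the internal branches of a tree."""
    splits = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        if 1 < len(below) < len(taxa) - 1:  # skip trivial splits and the root
            splits.add(normalize(below, taxa))
        return below

    leaves(tree)
    return splits


def shared_edges(network_splits, tree, taxa):
    """For each network split, report whether the tree has the same branch."""
    in_tree = tree_splits(tree, taxa)
    return [normalize(split, taxa) in in_tree for split in network_splits]


def split_support(network_splits, trees, taxa):
    """Proportion of sample trees (e.g. bootstrap trees) containing each split."""
    per_tree = [tree_splits(tree, taxa) for tree in trees]
    return [
        sum(normalize(split, taxa) in splits for splits in per_tree) / len(trees)
        for split in network_splits
    ]


taxa = {"A", "B", "C", "D", "E"}
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
t3 = (("A", ("B", "C")), ("D", "E"))

# Invented network splits; {B, C} conflicts with {A, B}, as only a network can show.
network_splits = [{"A", "B"}, {"D", "E"}, {"B", "C"}]

print(shared_edges(network_splits, t1, taxa))
print(split_support(network_splits, [t1, t1, t2, t3], taxa))
```

Normalizing each split to the side without the first taxon lets a clade and its complement compare as the same bipartition, which is why the two branches attached to the root of a rooted tree are counted only once.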

Tuesday, May 9, 2017

Dante and the tree model

I was preparing a blog post on phylogenetic methods for the study of the Divine Comedy, by Dante Alighieri (1265-1321), and it occurred to me that a note on Dante's contribution to the tree model might also be worthwhile. This medieval poet cannot, of course, be described as the father of the Stammbaum, but he should probably be listed among the many sources for the development of the model, and of the linguistic theories that it supported in the 19th century.

The study of Dante's works became almost an international mania with the rise of Romanticism in the 18-19th centuries, and scholars are not strangers to his more obscure works. One of these works is an abandoned linguistic essay entitled De vulgari eloquentia ("On eloquence in the vernacular language", circa 1305). In short, it is an unfinished manual on composition, a "poetics", with an introductory chapter discussing the appropriate language for poetry. The essay is written in Latin, but from the first paragraphs the author declares that the same language is not suitable for literature, as it is not a living language. Latin is then reserved for scientific and philosophical matters, and the author "ventures in a quest" for a good literary vernacular.

The first paragraphs are full of medieval opinions on language, such as the confusion arising from the Tower of Babel, and a discussion about how a linguistic ability would be superfluous for demons and angels alike. Towards the end the author starts to favor an artificial vernacular language, concluding that no living language (among the 14 dialects of the Italian peninsula) would be good enough. This latter idea was not followed when he wrote the Divine Comedy (which was written in Tuscan, Dante's native dialect), and this probably explains why the essay was abandoned just when the composition of that poem was begun.

However, between the biblical linguistics and the poetic formalism, Dante explores linguistic matters with an almost modern (and sometimes surprising) mindset. For example, he discusses how birds don't talk but simply repeat air movements; he discusses how grammar (i.e. Latin and Greek) is a codification; he provides a detailed, while subjective, map of the Italian vernaculars of the 12th century; and, what matters for us here, he explains that not all linguistic differences are due to the "vengeful confusion" arising from the Tower of Babel. Being human constructions, he says, languages are unstable and, as such, will change, as proved by many similarities that can't be random and don't really add much confusion (i.e. their differences are too feeble to be a consequence of the punishment of an almighty god). Our problem, he continues, is that changes are gradual and subtle, and as such we don't perceive them; but they do exist, as someone who returns to a city after many years can confirm, or as can be recorded when moving from city to city.

The (genealogical) tree model is implicit but undeniable in the eighth chapter of the first book, when the author uses words such as "root", "planted", and "branches". Here, I also report the original words in Latin, along with a translation adapted from Botterill (2006):
The confusion of languages [after the Tower of Babel] leads me [...] to the opinion that it was then that human beings were first scattered throughout the whole world, into every temperate zone and habitable region, right to its furthest corners. And since the principal root [radix] from which the human race has grown was planted [plantata] in the East, and from there our growth has spread, through many branches [palmites] and in all directions, finally reaching the furthest limits of the West [...]. [...] these people brought with them a tripartite language. Of those who brought it, some found their way to southern Europe and some to northern; and a third group, whom we now call Greeks, settled partly in Europe and partly in Asia. Later, from this tripartite language (which had been received in that vengeful confusion), different vernaculars developed, as I shall show later. For in that whole area that extends from the mouth of the Danube (or the Meotide marshes) to the westernmost shores of England, and which is defined by the boundaries of the Italians and the French, and by the [Atlantic] ocean, only one language prevailed, although later it was split up into many vernaculars by the Slavs, the Hungarians, the Teutons, the Saxons, the English, and several other nations. Only one sign of their common origin remains in almost all of them, namely that nearly all the nations listed above, when they answer in the affirmative, say iò [see the map above, from Elisabeth Burr]. Starting from the furthest point reached by this vernacular (that is, from the boundary of the Hungarians towards the east), another occupied all the rest of what, from there onwards, is called Europe; and it stretches even beyond that.
All the rest of Europe that was not dominated by these two vernaculars was held by a third, although nowadays this itself seems to be divided in three: for some now say oc, some oïl, and some sì, when they answer in the affirmative; and these are the Hispanic, the French, and the Italians. Yet the sign that the vernaculars of these three peoples derive from one and the same language is plainly apparent: for they can be seen to use the same words to signify many things, such as 'God', 'heaven', 'love', 'sea,' 'earth', 'is', 'lives', 'dies', 'loves', and almost all others. Of these peoples, those who say oc live in the western part of southern Europe, beginning from the boundaries of the Genoese. Those who say sì, however, live to the east of those boundaries, all the way to that outcrop of Italy from which the gulf of the Adriatic begins, and in Sicily. But those who say oïl live somewhat to the north of these others, for to the east they have the Germans, on the west and north they are hemmed in by the English sea and by the mountains of Aragon, and to the south they are enclosed by the people of Provence and the slopes of the Apennines.
The De vulgari eloquentia has routinely been printed alongside the Divine Comedy, and was studied, to give some examples, by Thomas Warton in his History of the English Poetry (London, 1775), by Johann Gottfried Eichhorn in his Allgemeine Geschichte der Cultur und Litteratur des neueren Europa (Göttingen, 1796), and by August Pott (a student of Franz Bopp) in his Indogermanischer Sprachstamm (1840). The essay was copied in Germany even before the introduction of the printing press; and a German translation, Über die Volkssprache (K. L. Kannegießer, 1845), was published in Leipzig when August Schleicher was already active in linguistic studies.

By this time, it seems, the work was almost a commonplace topic of discussion — when defending his model for the Italian language, and complaining about people who proposed a 12th century language for a 19th century nation state, around 1830, Alessandro Manzoni jokingly reminded us that it was "one of those books which nobody actually read, but everybody discusses".

This is one more little note to our narrative on the evolution of the tree model.


Alighieri, D. De Vulgari Eloquentia. Edited and translated by Steven Botterill. Cambridge University Press, 2006.

Elisabeth Burr. Klassifizierung der romanischen Sprachen.

Wednesday, May 3, 2017

On stemmatics and phylogenetic methods

No se publica un libro sin alguna divergencia entre cada uno de los ejemplares. Los escribas prestan juramento secreto de omitir, de interpolar, de variar. [No book is published without some divergence between each of the copies. Scribes take a secret oath to omit, to interpolate, to change.] (Jorge Luis Borges, La lotería en Babilonia, in Ficciones, 1962)
This is the first in a series of posts on stemmatics, a field just as much in love with trees and networks as are phylogenetics and historical linguistics. Since this is an introduction, I explain what the field does, present the most important jargon, and offer a list of references that, while suitable for the audience of this blog, is denser than one might expect for a blog post.

Thank you to Mattis and David for inviting me to write!

Textual criticism

Textual criticism (or, less precisely, "philology") is a discipline concerned with investigating the history of literary, legal, and religious texts in order to explain how differences among the copies of a text (its "witnesses") arose, and with producing "critical editions": scholarly curated versions of a text that either aim to reconstruct the lost original or correct an existing copy.

The problem of divergence between copies of text, with the accumulation of involuntary and deliberate errors, as well as the need for a systematic study of such differences, is as old as writing itself. For example, our current editions for the epic poems of Homer descend from Ancient philological attempts to restore an uncontaminated original (see the first two figures). These include the edition of Pisistratus (VI century BCE, which determined what was to be sung at the Panathenaic Games), and the so-called VMK (Viermännerkommentar, "commentary of the four men") of the Alexandrian School (I-II century BCE), which is generally assumed to be the root of the witnesses that we have.

Van der Valk's reconstruction of the sources for Venetus A, one of the most
important manuscripts of Homer's Iliad (source: Wikipedia).

Erbse's reconstruction of the sources for Venetus A, one of the most important
manuscripts of the Iliad (source: Wikipedia).

Before stemmatics, an edition could be based on a "good copy" (a version considered to be less contaminated or more faithful than others), on a "majority reading" (in which the most attested variant would be chosen), or on a principle of "eclecticism" (with the best reading at each point selected by the editor's judgment). Each new version, as expected, contributed even more to the confusion, particularly when changes were deliberate.

Among the texts with long and complex traditions, objects of countless and sometimes bloody disputations on the "correct" readings, are the Bible and codes of laws, for which it was not uncommon to have a different version in each city, with predictable consequences. For example, the first published textual tree, as already covered in this blog (The first Darwinian evolutionary tree), was authored by Carl Johan Schlyter in 1827 in a study precisely on the multiple and conflicting copies of Swedish law.

As such, it is no surprise that objective approaches were soon developed (Homer's VMK edition being one of the first examples), culminating with the development of stemmatics, with its study of the genealogical relationship between witnesses, and its representation of such relationships by means of trees.


As a scientific approach to textual criticism, stemmatics established itself from the beginning of the 19th century as an alternative to emendations based on the opinions and wishes of editors, possibly inspiring both Charles Darwin and August Schleicher (for a general discussion of the development and significance of this method, see Timpanaro 2005). However, more than a "source", we should consider it a branch equally stemming from the "cultural framework" (Macé and Baret 2006: 91) that also gave us Darwinism and historical linguistics.

As was true for these latter disciplines, stemmatics was at first opposed, because of the revolution it brought to its field, along with its genealogical trees. However, just as in these sister disciplines, the results of the new mindset introduced by the explanation of evolution with trees could not be ignored, and this approach is so central to textual criticism that the latter can be divided into periods before and after the work of Karl Lachmann, the "father" of stemmatics, in particular the publication of his edition of Lucretius' De rerum natura (1850). In his commentaries, besides establishing the number of lines per page in the lost manuscript at the root of the tradition, Lachmann was even able to determine the kind of script used to write it (Lachmann 1850).

The work he chose, given the importance of Lucretius in the development of the scientific mindset (and, as we should remember when dealing with cultural evolution, of Darwin's theories), is unlikely to be a coincidence, but this is a matter for a different blog post.


Genealogical trees are so central to the stemmatic method that the field itself is actually named after them. The main goal of an editor is to produce a stemma codicum ("family tree of manuscripts"), or simply stemma, a tree-like structure that supports the textual emendation and represents the "tradition" (the witnesses' genealogy), in analogy with the family trees of Roman families that figured in many texts reviewed by 19th century philologists. Stemma, in fact, is a Greek word meaning garland or wreath, which was incorporated into Imperial Latin to designate a family tree (and, figuratively, nobility itself), as family trees were drawn with a stemma at their top.

In short, stemmatics begins with a recensio, which is an investigation of all total and partial copies of a work. This review is followed by a collatio, a systematic scrutiny of the manuscripts' contents, when readings are aligned and compared. The results of this alignment are used to produce the stemma, following the principle that "community of errors implies community of origin". By analyzing the stemma and the errors, editors finally proceed to the emendatio, which is a reconstruction that explains the known variants, and is intended to represent the "archetype" (a lost witness at the root of the ramification, assumed to be closer to the original than any other copy).
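As a rough illustration of the collatio and of the principle that "community of errors implies community of origin", the sketch below aligns a few witnesses at a handful of loci and counts, for each pair, how many errors they share. The sigla, the readings, and the reference readings are all invented for this example. Deciding which variants count as errors, and which of them are significant, is the hard philological part, and the sketch simply takes it as given.

```python
from itertools import combinations

# Hypothetical collation table: one reading per witness at each aligned locus.
# Sigla and readings are invented for illustration only.
collation = {
    "A": ["amor", "mare", "terra", "celo", "vive"],
    "B": ["amor", "mare", "tera", "celo", "vive"],
    "C": ["amor", "mare", "tera", "cielo", "vive"],
    "D": ["amore", "mar", "terra", "celo", "vive"],
    "E": ["amore", "mar", "terra", "celo", "vivi"],
}

# Readings that the editor assumes to be correct at each locus (an assumption).
reference = ["amor", "mare", "terra", "celo", "vive"]


def errors(readings, reference):
    """Return the (locus, variant) pairs where a witness departs from the reference."""
    return {(i, r) for i, (r, ref) in enumerate(zip(readings, reference)) if r != ref}


def shared_errors(collation, reference):
    """Count the errors that each pair of witnesses has in common."""
    errs = {w: errors(r, reference) for w, r in collation.items()}
    return {(a, b): len(errs[a] & errs[b]) for a, b in combinations(sorted(errs), 2)}


shared = shared_errors(collation, reference)
# List the pairs that share at least one error, strongest link first.
for pair, n in sorted(shared.items(), key=lambda kv: -kv[1]):
    if n:
        print(pair, n)
```

With these toy data, D and E share two errors and B and C share one, which would suggest two families. A plain count like this cannot tell monogenetic errors from polygenetic ones, so it is at best a first step towards a stemma.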

A stemma is conventionally drawn top-to-bottom, with vertical placements roughly indicating the date of the manuscript (the higher, the older). Solid edges ("arrows") indicate descent, while dashed ones imply contamination (scribes using more than one source). Witnesses are usually labeled with abbreviated names or Latin letters, when the manuscript is available, or with Greek letters, when it is missing (with α usually reserved for the archetype and ω for the original). Below is a reproduction of Petrocchi's partial stemma for the tradition of Dante Alighieri's Divine Comedy, which I will cover in a future post. Note that the genealogy is actually a reticulating network rather than a simple tree.

Petrocchi's partial stemma for the Divine Comedy, presented in the
introduction to his critical edition (1965).

The example stemma offered by Maas (1958), adapted below, is still useful to demonstrate the principles of stemmatics. In this example, for a textual emendation manuscript H should be eliminated (as it descends from F), as well as I and J (copies of G). Manuscript C shows a contamination from its collateral D, something which should be considered when weighing errors. Sub-archetypes β and γ are to be inferred from the available witnesses of their branches, and their readings will have the same weight as K, the only member of the third family branching from the archetype (even though it is a recent manuscript), in establishing the "lesson" (reading) of α. Errors might be presumed in α itself, or even in the original ω, and in both cases a corrected "lesson" might be offered by the editor on the basis of internal and external evidence.

Exemplary stemma adapted from Maas (1958).
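The reasoning about this stemma can be sketched in a few lines of code. Only the relations stated in the text are taken from it (H descends from F; I and J are copies of G; C is contaminated by D; β, γ, and K branch from α, which descends from ω). The placement of the other witnesses under β and γ, and all the readings, are assumptions made for illustration and are not Maas's own data.

```python
from collections import Counter

# Each node lists its sources. Greek letters are lost witnesses, Latin letters extant ones.
# Assumption: the distribution of A-G between β and γ is invented for this sketch.
parents = {
    "α": ["ω"],
    "β": ["α"], "γ": ["α"], "K": ["α"],
    "A": ["β"], "B": ["β"],
    "C": ["β", "D"],  # descent from β, contamination from D
    "D": ["γ"], "E": ["γ"], "F": ["γ"], "G": ["γ"],
    "H": ["F"],
    "I": ["G"], "J": ["G"],
}


def is_extant(node):
    """Extant witnesses carry Latin sigla; lost ones carry Greek letters."""
    return node.isascii()


def descripti(parents):
    """Extant witnesses whose only source is another extant witness."""
    return sorted(w for w, ps in parents.items()
                  if is_extant(w) and len(ps) == 1 and is_extant(ps[0]))


def archetype_reading(branch_readings):
    """Reading backed by most branches of α, or None when the branches tie."""
    counts = Counter(branch_readings.values()).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no majority among branches: left to the editor's judgment
    return counts[0][0]


# Invented readings at a single locus. β and γ are assumed to have been
# reconstructed already from the witnesses of their branches.
branches = {"β": "mare", "γ": "mar", "K": "mare"}
witnesses = {"A": "mare", "B": "mare", "C": "mar", "D": "mar", "E": "mar", "F": "mar",
             "G": "mar", "H": "mar", "I": "mar", "J": "mar", "K": "mare"}

print(descripti(parents))                                # ['H', 'I', 'J']
print(archetype_reading(branches))                       # mare
print(Counter(witnesses.values()).most_common(1)[0][0])  # mar
```

The last two lines show why the stemma matters. A plain majority of copies favours "mar", because the γ family happens to be larger. Counting each branch of α once favours "mare", with the lone manuscript K weighing as much as a whole family.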

Adoption and practice

Stemmatics has been criticized and challenged since Lachmann's time. It requires very specialized knowledge, for example in distinguishing between monogenetic and polygenetic errors, i.e. those that arose once and those that emerged independently more than once (and that, as such, are not disjunctive). A number of its assumptions are routinely called into question, such as the idea that each copy always derives from a single source (accepting contamination, at most), that each copy has at least as many errors as its source, and, fundamentally, that traditions have one and only one archetype.

Many measures tend to be adopted to reduce the editorial effort. These include eliminating manuscripts considered to be descripti (i.e. proved to descend from a preserved witness, in theory sharing all the errors of their sources), and performing the collatio only on a set of critical passages (loci critici). While a complete stemma and a full collatio are desirable, such compromises might be unavoidable for long texts with ample traditions. For example, in the case of Dante Alighieri's Divine Comedy, after considering the time spent by scholars such as Petrocchi, Sanguineti, and Shaw on their editions, Trovato (2016) estimated the effort required for a full stemmatic approach at 400 man-years.

An alternative to stemmatic methods and assumptions, which also reduces the editorial effort, is offered by scholars who follow the work of Joseph Bédier, who successfully challenged the limits of stemmatics by adopting a renewed version of the method of the "good copy" for his editions of medieval texts. The Bédierian method does not reject a scientific approach or methods such as the recensio, the collatio, or even the production of a stemma, but these are used to support the editor's judgment in selecting and curating a bon manuscrit, a good copy of the text to be corrected only where errors can be proved beyond reasonable doubt. In short, trees (and networks) have remained central to textual criticism even where stemmatics itself, as a method, is challenged.

Considering the editorial effort and the analogies with linguistics and biology, it is no surprise that digital workflows have been proposed, along with the development of computer resources and phylogenetic methods. Ideas for new approaches were explored by Froger (1969), and formal phylogenetic methods were attempted by Platnick and Cameron (1977). Recently, the number of editions supported by formal phylogenetic methods and software has increased (see, for example, Barbrook et al. 1998; Stolz 2003; and Lantin, Baret and Macé 2004), also in the face of scientific evaluations of performance (Roos and Heikkilä 2009).

Besides advances in speed and replicability, the new technologies are allowing us to expand the goals of the discipline, moving from electronic editing to computational philology. In fact, while the field has for centuries been defined by the production of critical editions, digital approaches have been shown to support a reduction in the importance of "authorial intention", allowing researchers to focus on the reception of texts by the public, in line with developments in literary theory (Jauss 1982), and with the goals established by the "New Philology" (Cerquiglini 1989). Manuscripts with readings that differ from a supposed original, traditionally described as "corrupted", are changing from copies that were meant to be discarded into data points that contribute to an investigation of human history that is assisted by quantitative data and methods.


Barbrook A.C., Howe C.J., Blake N., Robinson P. (1998) The phylogeny of the Canterbury Tales. Nature 394 (6696): 839.

Cerquiglini B. (1989) Éloge de la variante: histoire critique de la philologie. Aux Travaux. Paris: Éditions du Seuil.

Froger J. (1969) La critique des textes et son automatisation. Bulletin de l’Association Guillaume Budé 1(1): 125–129.

Jauss H.-R. (1982) Toward an Aesthetic of Reception. Minneapolis: University of Minnesota Press.

Lachmann C. (1850) De Rerum Natura. Commentarius. Berolini: Impensis Georgii Reimeri.

Lantin A.-C., Baret P.V., Macé C. (2004) Phylogenetic analysis of Gregory of Nazianzus’ Homily 27. 7èmes Journées Internationales d’Analyse statistique des Données Textuelles, pp. 700-707.

Maas P. (1958) Textual Criticism. Translated by Barbara Flower. Oxford: Oxford University Press.

Macé C., Baret P.V. (2006) Why phylogenetic methods work: the theory of evolution and textual criticism. Linguistica Computazionale. The Evolution of Texts: Confronting Stemmatological and Genetical Methods 24: 89–108.

Platnick N.I., Cameron H.D. (1977) Cladistic methods in textual, linguistic, and phylogenetic analysis. Systematic Zoology 26: 380–385.

Roos T., Heikkilä T. (2009) Evaluating methods for computer-assisted stemmatology using artificial benchmark data sets. Literary and Linguistic Computing fqp002.

Stolz M. (2003) New philology and new phylogeny: aspects of a critical electronic edition of Wolfram’s Parzival. Literary and Linguistic Computing 18(2): 139–150.

Timpanaro S. (2005) The Genesis of Lachmann's Method. Translated and edited by G. W. Most. Chicago: University of Chicago Press.

Trovato P. (2016) Metodologia editoriale per la Commedia di Dante Alighieri. Ferrara. See Youtube; date of access: March 19, 2017.