Monday, June 24, 2013

The first Darwinian evolutionary tree

Tassy (2011) has pointed out that a Darwinian evolutionary tree has certain key characteristics that (in combination) distinguish it from other models of evolution, such as those devised by Darwin's predecessors:
  • it includes ancestral and descendant forms
  • ancestral taxa are species not higher taxa
  • extant taxa are only at the leaves not the internal nodes or edges
  • there is a gradual transition between forms
  • there is splitting of lineages.

Almost all of the early trees and networks fail at least one of these criteria. For example, the earliest networks, those of Buffon in 1755 and Duchesne in 1766, illustrated within-species relationships in dog breeds and strawberry cultivars, respectively, so that contemporary taxa appeared at internal nodes (i.e. some breeds or cultivars were seen as ancestors of others).

Lamarck's famous tree of 1809 showed relationships between higher taxonomic groups rather than species, and had several such groups transforming into other groups, so that the interior nodes represented contemporary taxonomic groups. His view of evolution was thus fundamentally different to that of Darwin.

Most of the subsequent pre-1859 trees of biological relationships showed non-genealogical affinity, for example those of Agassiz, Augier, Bronn, and Hitchcock. These were not intended to be evolutionary diagrams, because their authors did not believe in evolution (Ragan 2009; Tassy 2011). Other authors, such as Barbançois, Strickland, and Wallace, followed the lead of Lamarck and drew similar trees.

Atkinson and Gray (2005) point out that "Darwinian ideas of descent with modification were less revolutionary in linguistics than they were in biology", and so Darwinian trees appeared earlier in linguistics. For example, Schlegel (1808) is usually credited with introducing a "stammbaum" (family tree) approach to comparative grammar, along with Bopp (1816). The earlier language tree of Gallet, from c.1800, showed a combination of geographical and chronological relationships, rather than being strictly genealogical, and some contemporary languages were shown at internal nodes. Čelakovský and Schleicher independently drew the first truly genealogical diagrams in linguistics in 1853. These had contemporary languages at the leaves but language groups on the internal edges, rather than ancestral languages.

This leaves open the question of who first drew a tree that could be considered to be completely Darwinian.

The first tree

A family-tree approach has also been developed for textual analysis, where genealogical diagrams are called "stemmata" (singular "stemma"), and this is where we actually find the first diagram that matches all of the Darwinian criteria listed above.

In 1827 Hans Samuel Collin and Carl Johan Schlyter published the first volume of the Corpus Iuris Sueo-Gotorum Antiqui, which was a compilation of all of the Medieval laws of Sweden, presented in both Latin and Swedish. Collin was involved as editor of volumes 1 and 2, with Schlyter acting as sole editor of volumes 3-13 (the latter published in 1877, so that Schlyter spent 55 years on the project).

In order to compile the definitive version of the laws, the editors consulted all of the known documents (some 800 or so), which consist of hand-written manuscript copies, each one being a copy of some earlier copy. The editors performed a detailed comparative analysis of the texts in order to establish an authentic version of the original laws (their subsequent commentary in the books is longer than the laws themselves). This is, literally, a study of "descent with modification", and not merely an analogy with Darwin's famous expression.

The first volume, which covers the county laws of Västergötland, is unique among the 13 volumes in that the editors make the following comment in their Preface (page XXXVII):
Quo evidentius appareat mutua illorum codicum nunc descriptorum ratio, qui continent textum Iuris VG. antiquioris vel recentioris, vel partem aliquam illius textus, hanc rationem, prout ex iis, in quibus inter se conveniunt aut differunt codices, iudicare potuimus, schemate quodam cognationis, Tab. III, exprimere tentavimus.
För att göra förhållandet emellan de nu beskrifna codices, som innehålla WGL:s text eller någon del deraf, enligt dess äldre eller yngre redaktion, så mycket mer åskådligt, hafva vi sökt att genom ett slags stamtafla, Tab. III, framställa deras slägtskap så som vi af deras inbördes öfverensstämmelser eller olikheter tryckt oss kunna sluta dertill. 
Holm (1972) translates the Swedish as:
To make the relationship all the clearer between the codexes now described, containing in whole or in part the text of the Västergötland Law in its older or younger redaction, we have attempted to present their affinities, as far as we could determine them from mutual agreements and differences, in a kind of family-tree in Table III.
There are two online copies of the book in Google Books, but neither of these displays Table III correctly, as it is apparently a foldout. So, I have included it here.

The 1827 stemma from Collin & Schlyter.
The manuscript texts are lettered. The vertical axis represents time,
with the dashed lines indicating 25-year intervals from 1300 to 1500.

O'Hara (1996) points out that the idea of establishing the most authentic version of a text by reconstructing its ancestry may have been part of an earlier monastic tradition, designed to elucidate the nature of the original scriptures. However, Collin and Schlyter appear to be the first to have done this in such a thorough manner, as most people did not bother to locate all of the extant texts (see Holm 1972). Moreover, their use of a genealogical diagram to illustrate their conclusions seems to be totally original (Timpanaro 2005; Robins 2007). The stemma matches all of the Darwinian criteria, and so it lays claim to being the first Darwinian tree. Its most obvious difference to Darwin's ideas is that it refers to individual objects rather than to groups such as species.

Holm (1972) attributes the figure (and the idea for it) to Schlyter alone, although there is nothing in the original text to support this assumption — all of the editorial comments are written in the plural. However, Frederiksen (2009) has revisited the background to the stemma, and she concludes that "there is every possibility" that Schlyter should be given the sole credit. She also points out that if the stemma is "regarded as a schema that draws up the principal lines [of descent] and disregards the contamination of the tradition it would seem to be almost accurate." In other words, the editors seem to have got it right the first time.

The issue of contamination is an important one, referring to the fact that many textual copies are actually compiled from multiple sources. Under these circumstances the stemma should be a reticulating network, not a tree. Indeed, Holm (1972) attributes the absence of stemmata in any of the other 12 volumes to concern by Schlyter about contamination, and therefore the actual usefulness of a tree. Nevertheless, several other people produced stemmata of varying degrees of sophistication soon after 1827, including Carl Zumpt in 1831, Friedrich Ritschl in 1832, and Johan Madvig in 1833 (Holm 1972; O'Hara 1996; Timpanaro 2005). For this blog, it is worth pointing out that the stemma by Ritschl (1832) explicitly shows contamination, and is thus a reticulate network, the first of its kind in stemmatology.
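
The distinction between a tree-like stemma and a contaminated one can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function names and the manuscript labels are hypothetical (they are not the sigla used by Collin & Schlyter or by Ritschl), and the graph is assumed to be acyclic with a single original. Each manuscript is mapped to the exemplar(s) it was copied from, and a copy with more than one exemplar is treated as contaminated.

```python
from typing import Dict, List

# A stemma as a mapping: manuscript -> exemplar(s) it was copied from.
# All labels below are invented for illustration only.
Stemma = Dict[str, List[str]]


def contaminated_copies(stemma: Stemma) -> List[str]:
    """Return the manuscripts compiled from more than one exemplar."""
    return sorted(ms for ms, sources in stemma.items() if len(set(sources)) > 1)


def is_tree(stemma: Stemma) -> bool:
    """Tree-like here means that no manuscript has more than one exemplar.

    Assumes the graph is acyclic and has a single original text.
    """
    return not contaminated_copies(stemma)


# Hypothetical tree-like stemma: every copy has exactly one exemplar.
tree_like: Stemma = {
    "original": [],
    "x": ["original"],
    "A": ["x"],
    "B": ["x"],
}

# Hypothetical contaminated stemma: B draws on two exemplars.
reticulate: Stemma = {
    "original": [],
    "x": ["original"],
    "y": ["original"],
    "A": ["x"],
    "B": ["x", "y"],
}

print(is_tree(tree_like))
print(is_tree(reticulate), contaminated_copies(reticulate))
```

In this toy example a single contaminated copy is enough to turn the diagram from a tree into a reticulate network, which is the concern that Holm attributes to Schlyter.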

Hilgendorf's 1866 phylogeny of fossil snails.
The fossils are aligned horizontally with respect to
their geological layers.

One very interesting feature of the Collin & Schlyter figure is its clear resemblance to the fossil diagrams produced independently by Franz Hilgendorf and Albert Gaudry in 1866. Both of these people studied fossils in situ, so that they could see their distribution in the geological layers, and the fossil record was complete enough for them to construct evolutionary scenarios that connected the fossils together. They thus both produced evolutionary trees with the vertical axis explicitly representing time. The only real difference from the stemma is that in their diagrams time proceeds from bottom to top (as do the fossil layers in situ).

Finally, it is worth noting one very modern feature of the stemma. In only one case is a manuscript indicated as being a direct descendant of another. In all other cases the internal nodes are unlabeled, so that the known texts show sister-group relationships rather than direct ancestor-descendant relationships. If we do not have independent evidence that an observed text (or fossil) is a direct ancestor of another text (or fossil), then we should not indicate it as such in the evolutionary history.


Atkinson QD, Gray RD (2005) Curious parallels and curious connections — phylogenetic thinking in biology and historical linguistics. Systematic Biology 54: 513-526.

Bopp F (1816) Über das Conjugationssystem der Sanskritsprache, in Vergleichung mit jenem der griechischen, lateinischen, persischen und germanischen Sprache. Andreäischen, Frankfurt-am-Main.

Collin HS, Schlyter CJ (eds) (1827) Corpus Iuris Sueo-Gotorum Antiqui. Volumen 1. Westgötalagen. Haeggström, Stockholm.

Frederiksen BO (2009) Stemmaet fra 1827 over Västgötalagen: en videnskabshistorisk bedrift og dens mulige forudsætninger. Arkiv för Nordisk Filologi 124: 129-150.

Holm G (1972) Carl Johan Schlyter and textual scholarship. Saga och Sed (Kungl. Gustav Adolfs Akademiens Årsbok 1972): 48-80.

Lamarck J-B (1809) Philosophie Zoologique. Dentu et l'Auteur, Paris.

O'Hara R (1996) Trees of history in systematics and philology. Memorie della Società Italiana di Scienze Naturali e del Museo Civico di Storia Naturale di Milano 27: 81-88.

Ragan MA (2009) Trees and networks before and after Darwin. Biology Direct 4: 43.

Ritschl F (1832) Thomae Magistri sive Theoduli Monachi Ecloga vocum Atticarum. Orphanotrophei, Halle.

Robins W (2007) Editing and evolution. Literature Compass 4: 89-120.

Schlegel F (1808) Über die Sprache und Weisheit der Indier: ein Beitrag zur Begründung der Alterthumskunde. Mohr und Zimmer, Heidelberg.

Tassy P (2011) Trees before and after Darwin. Journal of Zoological Systematics and Evolutionary Research 49: 89-101.

Timpanaro S (2005) The Genesis of Lachmann's Method [translation]. University of Chicago Press, Chicago.


  1. 'tis a bit anachronistic to call it Darwinian when, actually, Darwin hadn't yet published his Origin of species. How about calling Darwin's tree stemmatological or something the like?

    1. English is rarely a literal language, and one consequence of this is that a label can apply to something that exists prior to its namesake. In this sense, Darwinian concepts can pre-date Darwin himself. Indeed, many of his ideas were not original, but he put them together in an original way, and pursued their consequences in detail. We could easily refer to many Darwinian concepts as Wallacean, for example.