Monday, February 18, 2013

Trees and networks of written manuscripts

Anthropologists often suggest that their fields, including archaeology and linguistics, are very likely to involve horizontal as well as vertical flows of phylogenetic information (see the earlier posts "False analogies between anthropology and biology" and "Time inconsistency in evolutionary networks"). For example, in linguistics the horizontal flow is referred to as "diffusion", while in stemmatology it is called "contamination".

The simplest way to illustrate this is to take a dataset and analyze it using both a tree-building method and a network method. Only if the network method produces a tree-like diagram can we then safely conclude that vertical descent has had a larger influence on the transmission of the cultural information than has horizontal transfer.

A few weeks ago I reported on a case involving the historical development of the musical instrument called a cornet, where the author first used a tree to analyze the historical data and later settled on a network, which turned out to be rather non-treelike (see "Cornets: from a tree to a network"). Here, I point out another example, this time involving written text.

Stemmatology is the discipline that attempts to reconstruct the transmission history of a written text on the basis of relationships between the various extant versions (e.g. manuscripts or printings). In this case, the analysis concerns the Greek manuscripts of the New Testament, in particular the Letter of James.

The stemmatological study used a database listing the variants of 761 characters in 165 Greek manuscripts of the Letter of James. Of these, 60 characters are constant, 266 are variable but parsimony-uninformative, and 435 are variable and parsimony-informative. The objective of the study was to trace the history of copying from one manuscript to another.
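For readers unfamiliar with these three categories, here is a minimal sketch of how characters are usually sorted into them. It is not the procedure or software used by Spencer et al.; the function names, the "?" missing-data symbol and the toy matrix are all made up for illustration. A character is taken to be constant if only one state is observed, and parsimony-informative if at least two states each occur in at least two manuscripts.

```python
from collections import Counter


def classify_character(states, missing="?"):
    """Classify one character (one column of the data matrix).

    Hypothetical helper for illustration only; "?" as the missing-data
    symbol is an assumption, not something taken from the study.
    """
    counts = Counter(s for s in states if s != missing)
    if len(counts) <= 1:
        return "constant"
    # Informative: at least two states are each shared by two or more manuscripts.
    shared = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared >= 2 else "uninformative"


def summarize(matrix):
    """Count the characters of each kind in a {manuscript: states} matrix."""
    columns = zip(*matrix.values())
    return Counter(classify_character(col) for col in columns)


# Toy data: five invented manuscripts scored for three binary characters.
toy = {
    "ms_A": "000",
    "ms_B": "000",
    "ms_C": "001",
    "ms_D": "011",
    "ms_E": "001",
}

print(summarize(toy))
```

In the toy matrix the first character is constant, the second is variable but uninformative (its variant occurs in a single manuscript), and the third is informative. Only characters of the last kind can favour one tree over another in a parsimony analysis.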

To construct a phylogenetic tree from the dataset, Spencer et al. (2002) performed a parsimony analysis, and then summarized this with an Adams-2 consensus tree of the resulting 10,000 maximum-parsimony trees. This tree is shown in the first figure.

However, this approach does not explicitly display the inferred contamination among the manuscripts, which would require a phylogenetic network rather than a tree. So Spencer et al. (2004) instead produced a reduced median network, based on 82 selected manuscripts and 301 binary characters. This is shown in the second figure.

Clearly, parts of the manuscript history are not very tree-like, notably the part at the inferred root of the network. Spencer et al. note that this network topology:

"is consistent with the ideas that most variants arose early in the history of the Greek New Testament, that early manuscripts were often influenced by both oral and written traditions, and that later copies introduced fewer variants."

Under these circumstances, a tree cannot be an appropriate representation of the anthropological data, because horizontal transfer of information has had a large effect during at least part of the phylogenetic history.
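One way to see why conflicting signal cannot be forced onto a single tree is the pairwise compatibility check for binary characters: two such characters fit the same tree without any repeated change only if at most three of the four state combinations (00, 01, 10, 11) are observed. The sketch below is not the reduced median algorithm used by Spencer et al.; it only illustrates the kind of conflict that a network displays as a box. The function names and toy data are hypothetical, and missing data are not handled.

```python
from itertools import combinations


def compatible(col_a, col_b):
    """Return True if two binary characters can share a tree without homoplasy.

    They conflict only when all four state pairs (00, 01, 10, 11) occur.
    """
    return len(set(zip(col_a, col_b))) < 4


def incompatible_pairs(matrix):
    """List the index pairs of characters that conflict with each other."""
    columns = list(zip(*matrix.values()))
    return [
        (i, j)
        for i, j in combinations(range(len(columns)), 2)
        if not compatible(columns[i], columns[j])
    ]


# Toy data: five invented manuscripts scored for three binary characters.
toy = {
    "ms_A": "000",
    "ms_B": "110",
    "ms_C": "100",
    "ms_D": "011",
    "ms_E": "001",
}

print(incompatible_pairs(toy))
```

Here the second character conflicts with both of the others, while the first and third agree with each other. The more such conflicting pairs a dataset contains, the less tree-like any network built from it is likely to be, whether the cause is contamination or independent origin of the same variant.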


Spencer M, Wachtel K, Howe CJ (2002) The Greek vorlage of the Syra Harclensis: a comparative study on method in exploring textual genealogy. TC: a Journal of Biblical Textual Criticism 7: 3.

Spencer M, Wachtel K, Howe CJ (2004) Representing multiple pathways of textual flow in the Greek manuscripts of the Letter of James using reduced median networks. Computers and the Humanities 38: 1–14.
