The interpretation of an evolutionary network is complicated by the fact that descendants of reticulation nodes have complex ancestry. The concept of a Most Recent Common Ancestor (MRCA) is therefore not as straightforward as it is for a tree, because there may be multiple paths from any one descendant back to its ancestors. This creates several possible interpretations of what we might mean by an MRCA.
Figure 1 illustrates the calculation of the MRCA in a tree of five taxa (A-E), showing the MRCA of taxa C and D. We simply trace each of the descendant taxa backward along the branches towards the root, and the ancestral node where all of these traces first intersect is the MRCA of those taxa.
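The tracing procedure can be sketched in a few lines of Python. This is only an illustration: the tree below is a hypothetical five-taxon topology, ((A,B),((C,D),E)), not a transcription of Figure 1, and the internal node names and the `tree_mrca` function are invented for the example.

```python
def tree_mrca(parent, taxa):
    """MRCA of `taxa` in a rooted tree stored as a child -> parent mapping."""
    def trace_to_root(node):
        # Follow parent pointers from the node back to the root.
        trace = [node]
        while node in parent:
            node = parent[node]
            trace.append(node)
        return trace

    traces = [trace_to_root(t) for t in taxa]
    shared = set(traces[0]).intersection(*traces[1:])
    # Walking up from the first taxon, the first node that lies on every
    # trace is where all of the traces first intersect.
    return next(node for node in traces[0] if node in shared)


# Hypothetical tree ((A,B),((C,D),E)); internal node names are made up.
parent = {
    "A": "ab", "B": "ab",
    "C": "cd", "D": "cd",
    "cd": "cde", "E": "cde",
    "ab": "root", "cde": "root",
}

print(tree_mrca(parent, ["C", "D"]))  # cd
print(tree_mrca(parent, ["A", "D"]))  # root
```

Because every node in a tree has exactly one parent, each taxon has exactly one trace, which is why the answer is always unique.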
Figure 2 illustrates a more complex history, involving two hybridization events. The incoming branches to the reticulation nodes have arrows, for emphasis. The figure also recognizes several possible interpretations of the MRCA of taxa C and D (see Huson and Rupp 2008; Fischer and Huson 2010).
A conservative definition of the MRCA (or a stable MRCA) is the intersection of all paths from the descendants to the root, so that any reticulation pushes the MRCA back towards the root. In this example it pushes the MRCA all the way to the root. Alternatively, we could define the Lowest Common Ancestor (LCA, or minimal common ancestor) as a shared ancestor that is furthest from the root. That is, an LCA is a common ancestor that is not itself an ancestor of any other common ancestor of the taxa concerned.
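The two definitions can be compared on a small example. The sketch below uses a hypothetical network with two reticulation nodes (`h1` and `h2`); it is not a transcription of Figure 2, and the node names and function names are assumptions made for illustration. The network is stored as a mapping from each node to its list of parents, and paths are enumerated by brute force, which is adequate only for small networks.

```python
def paths_to_root(parents, node):
    """Every path from `node` back to the root, as lists of nodes."""
    ps = parents.get(node, [])
    if not ps:
        return [[node]]
    return [[node] + rest for p in ps for rest in paths_to_root(parents, p)]


def ancestors(parents, node):
    """The node itself plus everything reachable along parent edges."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, []))
    return seen


def conservative_mrca(parents, taxa):
    """Lowest node lying on every path from every taxon to the root."""
    paths = [p for t in taxa for p in paths_to_root(parents, t)]
    on_all_paths = set(paths[0]).intersection(*paths[1:])
    return next(n for n in paths[0] if n in on_all_paths)


def lowest_common_ancestors(parents, taxa):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = set.intersection(*(ancestors(parents, t) for t in taxa))
    return {c for c in common
            if not any(c in ancestors(parents, d) for d in common if d != c)}


# Hypothetical network: h1 and h2 are reticulation nodes (two parents each).
parents = {
    "A": ["a1"], "B": ["b1"], "C": ["h1"], "D": ["m"], "E": ["l"],
    "h1": ["m", "h2"], "h2": ["a1", "b1"],
    "m": ["l"], "l": ["f"], "b1": ["f"],
    "f": ["root"], "a1": ["root"],
}

print(conservative_mrca(parents, ["C", "D"]))        # root
print(lowest_common_ancestors(parents, ["C", "D"]))  # {'m'}
```

In this made-up network one of the three paths from C runs through `a1` straight to the root, so the only node shared by every path is the root, whereas the lowest common ancestor stays close to the two taxa.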
In the mathematical terminology of lattices, which can have an algebraic or order-theoretic definition, the Conservative MRCA is called the Least Lower Bound (LLB) and the LCA is called the Greatest Lower Bound (GLB).
We could also have a biological compromise between these two mathematical concepts and recognize a Fuzzy MRCA, in which only a specified proportion of the paths (representing some proportion of the genomes) needs to be accommodated by the MRCA, thus keeping the MRCA close to the main collection of descendants (Fischer and Huson 2010). In this example, the Fuzzy MRCA represents 75% of the genome of taxon C and 100% of the genome of taxon D. (The Conservative MRCA represents 100% for both taxa, by definition; and in this example the LCA represents 50% of the genome of taxon C and 100% of the genome of taxon D.)
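A rough way to put numbers on these proportions is sketched below. It rests on an assumption that is not in the cited papers: every parent of a node is taken to contribute an equal share of its genome, so a reticulation node with two parents receives 50% from each. The network is hypothetical (not a transcription of Figure 2), and `share` and `fuzzy_mrcas` are names invented here, not the algorithm of Fischer and Huson (2010).

```python
from fractions import Fraction


def share(parents, node, target):
    """Fraction of `node`'s ancestry that passes through `target`.

    Assumption: each parent of a node contributes an equal share.
    """
    if node == target:
        return Fraction(1)
    ps = parents.get(node, [])
    if not ps:
        return Fraction(0)
    return sum(share(parents, p, target) for p in ps) / len(ps)


def fuzzy_mrcas(parents, taxa, threshold):
    """Lowest nodes covering at least `threshold` of every taxon's ancestry."""
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    candidates = {v for v in nodes
                  if all(share(parents, t, v) >= threshold for t in taxa)}
    # Drop any candidate that is an ancestor of another candidate.
    return {v for v in candidates
            if not any(share(parents, w, v) > 0 for w in candidates if w != v)}


# Hypothetical network: h1 and h2 are reticulation nodes (two parents each).
parents = {
    "A": ["a1"], "B": ["b1"], "C": ["h1"], "D": ["m"], "E": ["l"],
    "h1": ["m", "h2"], "h2": ["a1", "b1"],
    "m": ["l"], "l": ["f"], "b1": ["f"],
    "f": ["root"], "a1": ["root"],
}

print(share(parents, "C", "m"))                         # 1/2
print(share(parents, "C", "f"))                         # 3/4
print(fuzzy_mrcas(parents, ["C", "D"], Fraction(3, 4)))  # {'f'}
print(fuzzy_mrcas(parents, ["C", "D"], Fraction(1)))     # {'root'}
```

Under this weighting, a threshold of 100% recovers the Conservative MRCA, and lowering the threshold moves the answer towards the LCA. The function returns a set because, as with the LCA, more than one node may qualify.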
However, neither the Fuzzy MRCA nor the LCA is necessarily unique, although the Conservative MRCA will always be unique. Figure 3 shows an example where there are two independent LCAs of taxa C and D. Neither of these LCAs is an ancestor of the other, as required by the definition, and so they are both equal candidates as LCA. Each one represents 50% of the genome for both taxa C and D.
In terms of a lattice, Figure 2 is called a lower semi-lattice (or meet semi-lattice), because every pair of nodes has only one GLB, whereas Figure 3 is not a semi-lattice, because at least one node pair has more than one GLB.
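Both the non-uniqueness and the semi-lattice condition can be checked mechanically. The sketch below uses a hypothetical network in which two lineages hybridize twice, so that C and D each have two parents' worth of ancestry; it is not a transcription of Figure 3, and the node names and function names are assumptions. The semi-lattice test simply asks whether every pair of nodes has exactly one lowest common ancestor.

```python
from itertools import combinations


def ancestors(parents, node):
    """The node itself plus everything reachable along parent edges."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, []))
    return seen


def lowest_common_ancestors(parents, taxa):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = set.intersection(*(ancestors(parents, t) for t in taxa))
    return {c for c in common
            if not any(c in ancestors(parents, d) for d in common if d != c)}


def every_pair_has_unique_lca(parents):
    """True if each pair of nodes has exactly one lowest common ancestor."""
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    return all(len(lowest_common_ancestors(parents, pair)) == 1
               for pair in combinations(sorted(nodes), 2))


# Hypothetical network: lineages u and v hybridize twice (h1 and h2).
network = {
    "A": ["x"], "B": ["y"], "C": ["h1"], "D": ["h2"],
    "h1": ["u", "v"], "h2": ["u", "v"],
    "u": ["x"], "v": ["y"],
    "x": ["root"], "y": ["root"],
}
# The same nodes with one parent removed from each hybrid, giving a tree.
tree = dict(network, h1=["u"], h2=["v"])

print(sorted(lowest_common_ancestors(network, ["C", "D"])))  # ['u', 'v']
print(every_pair_has_unique_lca(network))                    # False
print(every_pair_has_unique_lca(tree))                       # True
```

Neither `u` nor `v` is an ancestor of the other, so both are returned, and the pair (C, D) is enough to make the network fail the check.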
This leads to the biological question of how best to interpret the MRCA in situations such as that represented by Figure 3. This question does not yet seem to have been addressed by biologists. Figure 3 does not represent an impossible evolutionary history, although it may be an unusual one, because one lineage hybridizes with another lineage twice, presumably at different times.
The lack of a unique LCA is clearly problematic, as it almost defeats the purpose of the concept of an MRCA. It would certainly make life easier if we could restrict evolutionary networks to the class of lower semi-lattices.
An alternative is to restrict the MRCA concept to the Conservative MRCA. However, it is easy to imagine situations where this pushes the MRCA so far towards the root of the network as to be uninformative, especially in cases involving horizontal gene transfer, which can occur between widely separated evolutionary groups. If we insist that a eukaryote MRCA represent 100% of the genome, and we include non-nuclear genomes in the calculation, then the Conservative MRCA creates an extreme theoretical problem.
A Fuzzy MRCA may be the best compromise between these two extremes, although there are obvious practical issues for obtaining agreement on how much of the genome history is to be discounted from the MRCA.
Fischer J., Huson D.H. (2010) New common ancestor problems in trees and directed acyclic graphs. Information Processing Letters 110: 331–335.
Huson D.H., Rupp R. (2008) Summarizing multiple gene trees using cluster networks. Lecture Notes in Bioinformatics 5251: 296–305.