Wednesday, August 21, 2013
Conflicting placental roots: network or tree?
In this blog we champion networks as a fundamental model for phylogenetics. Networks are more general than trees, in the sense that a tree is simply a network without reticulations, and some networks are more tree-like than others. However, I have noted before that the current trend in phylogenetics seems to be to use more and more complex trees as the phylogenetic model, rather than embracing networks as a more flexible model (Resistance to network thinking).
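To make the "more general than trees" point concrete, here is a minimal sketch in Python. It treats a rooted phylogeny as a list of (parent, child) edges and counts the extra parent edges, which is one common way of saying how far a network departs from a tree. The function name `reticulation_number`, the taxa A, B and C, and the internal node labels are all made up for illustration; they do not come from any of the papers discussed here.

```python
from collections import defaultdict


def reticulation_number(edges):
    """Count the extra parent edges in a rooted directed network.

    A node with k > 1 parents contributes k - 1 to the total,
    so a tree (every node has at most one parent) scores 0.
    """
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return sum(len(p) - 1 for p in parents.values() if len(p) > 1)


# A strictly bifurcating tree on three hypothetical taxa A, B, C.
tree_edges = [("root", "x"), ("root", "C"), ("x", "A"), ("x", "B")]

# The same taxa, but B now descends from two ancestors (x and y),
# as it might after a hybridization event.
network_edges = [
    ("root", "x"), ("root", "y"),
    ("x", "A"), ("x", "B"),
    ("y", "B"), ("y", "C"),
]

print(reticulation_number(tree_edges))     # 0
print(reticulation_number(network_edges))  # 1
```

The tree is just the special case where the count is zero, so any method that can handle the second edge list can also handle the first.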
An interesting example of this trend is in the current issue of Molecular Biology & Evolution. There are two articles that investigate the root of the placental clade, by Morgan et al. and Romiguier et al., along with an editorial commentary by Teeling & Hedges.
The "placental root" problem has been difficult to resolve as a bifurcating process because different genetic datasets support different trees. As noted by Teeling & Hedges: "Untangling the root of the evolutionary tree of placental mammals has been nearly an impossible task. The good news is that only three possibilities are seriously considered ... Now, two groups of researchers have scrutinized the largest available genomic data sets bearing on the question and have come to opposite conclusions". The three alternative tree histories for the clade root are shown in the figure.
Both of the new empirical studies are based on the protein-coding sequences for most of the 40 currently available mammalian genomes. Morgan et al. use heterogeneous substitution models to account for tree and dataset heterogeneity, and get strong support for option (c). Romiguier et al. divide their dataset into GC-rich and AT-rich genes, conclude that the GC-rich genes are most likely to suffer from long-branch attraction, and get strong support from the AT-rich genes for option (a).
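For readers unfamiliar with the idea of partitioning genes by base composition, the following is a rough Python sketch of the general approach. It is not the procedure of Romiguier et al., whose exact criterion is not described in this post; the 0.5 threshold, the function names and the toy sequences are all assumptions chosen purely for illustration.

```python
def gc_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("no unambiguous bases in sequence")
    return (seq.count("G") + seq.count("C")) / counted


def split_by_gc(genes, threshold=0.5):
    """Split a {name: sequence} dict into GC-rich and AT-rich subsets.

    The threshold is a hypothetical cut-off, not a published value.
    """
    gc_rich, at_rich = {}, {}
    for name, seq in genes.items():
        target = gc_rich if gc_fraction(seq) > threshold else at_rich
        target[name] = seq
    return gc_rich, at_rich


# Toy sequences, invented for this example only.
genes = {"gene1": "GCGCGCAT", "gene2": "ATATATGC", "gene3": "GGCCAATT"}
gc_rich, at_rich = split_by_gc(genes)

print(sorted(gc_rich))  # ['gene1']
print(sorted(at_rich))  # ['gene2', 'gene3']
```

Each subset would then be analysed separately, which is where the two groups of genes can end up supporting different trees.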
Teeling & Hedges continue: "Needless to say, more research is needed." No! Previous genome-scale analyses of more than one million amino acid sites from orthologous protein-coding genes have not rejected any of the three alternatives, despite the statistical estimate that 20,000 amino acid sites should be sufficient to resolve the question at this level of divergence given the tree structure, branch lengths, and number of substitutions (Hallström & Janke 2010). Doesn't this mean that we have enough evidence already?
Clearly, the conflicting results should lead the reader to at least consider the idea that something might be wrong with the underlying tree model itself. Both of these new analyses are still based on tree models, no matter how sophisticated those models might be (see also the several other papers cited by Teeling & Hedges), and no matter how much data are involved.
An alternative perspective is provided by Hallström & Janke (2010): "Mammalian evolution may not be strictly bifurcating". Their network analysis of retroposon insertion data supports an alternative hypothesis for the history of placentals: the early divergences involved incomplete lineage sorting and hybridization. Neither of these two evolutionary processes is accounted for in the tree models of Morgan et al. and Romiguier et al., but both can be integral parts of a network model.
I think that we can see the suggested move from trees to networks as a form of Kuhnian paradigm shift. In Kuhn's historical model, during the period of "normal science" the failure of results to conform to the current paradigm is not seen as refuting the paradigm, but instead is seen as resulting from errors by researchers (e.g. use of inadequate models, acquisition of unreliable data). However, in the Kuhn model, as anomalous results accumulate, a new paradigm emerges that subsumes the old results along with the anomalous ones, forming a single new framework.
Non-tree-like phylogenetic results are currently not seen by most phylogeneticists as refuting the paradigm of a phylogenetic tree, but instead are seen as the result of inadequate phylogenetic tree-models and/or insufficient data (as exemplified by Salichos & Rokas 2013). Nevertheless, these results can also be seen as refuting that paradigm. In that case, a shift to network thinking would embrace all of the tree results as well as the non-tree ones, and would thus form a viable new paradigm.
We should not really call this a Kuhnian "revolution", of course, since tree-thinking and network-thinking are not incompatible, but rather the one is an extension of the other.
Note: There is a follow-up post — Why are there conflicting placental roots?
Hallström BM, Janke A (2010) Mammalian evolution may not be strictly bifurcating. Molecular Biology & Evolution 27: 2804-2816.
Morgan CC, Foster PG, Webb AE, Pisani D, McInerney JO, O’Connell MJ (2013) Heterogeneous models place the root of the placental mammal phylogeny. Molecular Biology & Evolution 30: 2145-2156.
Romiguier J, Ranwez V, Delsuc F, Galtier N, Douzery EJP (2013) Less is more in mammalian phylogenomics: AT-rich genes minimize tree conflicts and unravel the root of placental mammals. Molecular Biology & Evolution 30: 2134-2144.
Salichos L, Rokas A (2013) Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 497: 327-331.
Teeling EC, Hedges SB (2013) Making the impossible possible: rooting the tree of placental mammals. Molecular Biology & Evolution 30: 1999-2000.