Wednesday, February 18, 2015

Representing macro- and micro-evolution in a network

In biology we often distinguish microevolutionary events, which occur at the population level, from macroevolutionary events, which involve species. We have traditionally treated phylogenetics as a study of macroevolution. However, more recently there has been a trend to include population-level events, such as incomplete lineage sorting and introgression.

This is of particular importance for the resulting display diagrams. A phylogenetic tree was originally conceived to represent macroevolution. For example, speciation and extinction occur as single events at particular times, and these events apply to discrete groups of organisms. The taxa can be represented as distinct lineages in a tree graph, and the events by having these lineages stop or branch in the graph.

This idea is easily extended to phylogenetic networks, where the gene-flow events are also treated as singular, so that hybridization or horizontal gene transfer can be represented as single reticulations among the lineages.
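
As a minimal sketch of what "single reticulations among the lineages" means in graph terms, a network can be stored as a list of directed parent-to-child edges, and a reticulation then shows up as a node with more than one parent. The node names and the helper functions below are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Directed edges (parent -> child). All node names are hypothetical.
# Node "h" has two parents, so it is a single "pulse" reticulation
# (for example one hybridization event) between lineages x and y.
edges = [
    ("root", "x"), ("root", "y"),
    ("x", "A"), ("x", "h"),
    ("y", "h"), ("y", "C"),
    ("h", "B"),
]


def reticulation_nodes(edges):
    """Return the nodes that have more than one incoming edge."""
    indegree = Counter(child for _, child in edges)
    return sorted(node for node, count in indegree.items() if count > 1)


def is_tree(edges):
    """A rooted network without reticulation nodes is a tree."""
    return not reticulation_nodes(edges)


print(reticulation_nodes(edges))  # ['h']
print(is_tree(edges))             # False
```

Removing either of the two edges into "h" turns the same graph back into an ordinary tree, which is why pulse-type gene flow fits so easily into the existing diagrams.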

Such singular events are sometimes called "pulse" events. However, there are also "press" events, which are ongoing. That is, a lot of genetic variation is generated where populations repeatedly mix, so that every gene-flow instance is part of a continuous process of mixing. This often occurs, for example, in the context of isolation by distance, such as ring species or clinal variation. Under these circumstances, processes like introgression and horizontal gene transfer (HGT) can involve ongoing events.

For instance, in an earlier life I once studied three species of plant in the Sydney region (Morrison DA, McDonald M, Bankoff P, Quirico P, Mackay D. 1994. Reproductive isolation mechanisms among four closely-related species of Conospermum (Proteaceae). Botanical Journal of the Linnean Society 116: 13-31). One of the species was ecologically isolated from the other two (it occurred in dry rather than damp habitats), and the other two were geographically isolated from each other (they occurred on separate sandstone uplands with a large valley in between). These species look very different from each other, as shown in the picture above, but looks are deceiving. Where the ecological isolation was incomplete, introgression occurred and admixed populations could be found.

These dynamics are more difficult to represent in a phylogenetic tree or network. We do not have discrete groups that can be represented by lines on a graph, but instead have fuzzy groups with indistinct boundaries. Furthermore, we do not have discrete events, but instead have ongoing (repeated) processes.

Nevertheless, there is clearly a desire in modern biology to integrate macroevolutionary and microevolutionary dynamics in a single network diagram. That is, some parts of the diagram will represent pulse events involving discrete groups, and other parts will represent press events among fuzzy groups. At present, practitioners seem to address this by first creating a tree to represent the pulse events (and possibly their times), and then adding imprecisely located dashed lines to represent ongoing gene flow (see the example in "Producing trees from datasets with gene flow"). This particular mixture of precision and imprecision seems rather unsatisfactory.
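
To make the mismatch concrete, here is a rough sketch of the tree-plus-dashed-lines practice as a data structure. It is an illustration under stated assumptions, not an established format: the class names, the lineage names and all of the times are hypothetical. Pulse events carry a single time, whereas press events can only be given a pair of lineages and a loosely bounded interval.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PulseEvent:
    """A discrete event at one point in time (e.g. a speciation)."""
    kind: str
    time: float          # arbitrary units, counted forward from the root
    parents: tuple
    children: tuple


@dataclass(frozen=True)
class PressEvent:
    """Ongoing gene flow between two lineages over an interval."""
    lineage_a: str
    lineage_b: str
    start: float
    end: float


@dataclass
class MixedNetwork:
    pulses: list = field(default_factory=list)
    presses: list = field(default_factory=list)

    def lineages(self):
        """All lineage names mentioned by any event."""
        names = set()
        for p in self.pulses:
            names.update(p.parents)
            names.update(p.children)
        for g in self.presses:
            names.update((g.lineage_a, g.lineage_b))
        return names

    def active_gene_flow(self, time):
        """Press events whose interval contains the given time."""
        return [g for g in self.presses if g.start <= time <= g.end]


# Hypothetical example: two speciations, then ongoing mixing of A and B.
net = MixedNetwork(
    pulses=[
        PulseEvent("speciation", 0.0, ("root",), ("A", "bc")),
        PulseEvent("speciation", 1.0, ("bc",), ("B", "C")),
    ],
    presses=[PressEvent("A", "B", start=2.0, end=3.0)],
)

print(sorted(net.lineages()))
print(net.active_gene_flow(2.5))
```

Even in this toy form the two kinds of record do not match: the pulse events pin down a node in the graph, while the press event floats somewhere along two branches, which is exactly what the dashed lines in published diagrams convey.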

Perhaps someone might like to have a think about this aspect of phylogenetic networks, to see if there is some way we can do better.
