Monday, June 10, 2019

Why don't people draw evolutionary networks sensibly?

In phylogenetics there are two types of network:
  • those where the network edges have a time direction, whether explicit or implied; and
  • those where the edges are undirected.
The latter networks are among the most valuable tools ever devised for the exploration of multivariate data patterns, and this blog is replete with examples drawn from all fields that produce quantitative data (see the Analyses blog page). The first type of network, however, is the only one that can display hypothesized evolutionary histories; that is, only networks of this type can truly be called evolutionary networks.

Evolutionary networks have a set of characteristics that are essential in order to successfully display biological histories, such as:
  • no directed cycles, because otherwise one of the descendants would be its own ancestor;
  • time consistency, meaning that reticulations in the network only occur between contemporaries.
The latter requirement is not needed for the history of human artifacts, because the ideas on which those artifacts are based can be recorded, and then not used until much later — ideas can "leap forward" in time. There are a number of examples of this in this blog, as discussed in last week's post (A phylogenetic network outside science).

However, time consistency is pretty much universal in biology (see the post on Time inconsistency in evolutionary networks). Natural hybridization and introgression require two living organisms in order to occur, as does horizontal gene transfer. This is basic biology, at least outside the laboratory.
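The two requirements listed above are simple enough to check mechanically. Here is a minimal Python sketch, assuming that a network is stored as a list of (ancestor, descendant) pairs and that each lineage is stored as an (origin, end) pair of ages in millions of years ago. The function names and this representation are purely illustrative, and are not taken from any phylogenetics package.

```python
from typing import Dict, List, Tuple


def has_directed_cycle(edges: List[Tuple[str, str]]) -> bool:
    """Return True if the (ancestor, descendant) pairs contain a directed cycle."""
    children: Dict[str, List[str]] = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        children.setdefault(child, [])

    # Depth-first search: UNSEEN = not visited, ACTIVE = on the current path,
    # DONE = fully explored. Reaching an ACTIVE node again means a cycle.
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {node: UNSEEN for node in children}

    def visit(node: str) -> bool:
        state[node] = ACTIVE
        for nxt in children[node]:
            if state[nxt] == ACTIVE:
                return True
            if state[nxt] == UNSEEN and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[n] == UNSEEN and visit(n) for n in children)


def lineages_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """Return True if two lineages share at least one moment in time.

    Each lineage is (origin, end) in millions of years ago, with origin >= end
    (assumed convention: larger numbers are older).
    """
    a_origin, a_end = a
    b_origin, b_end = b
    # The shared span runs from the younger origin down to the older end.
    return min(a_origin, b_origin) >= max(a_end, b_end)
```

A reticulation drawn between two lineages is time consistent only if `lineages_overlap` returns True for them. The recursive search is fine for networks of the size usually drawn in papers; a very large network would need an iterative version.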

So, the question posed in this post's title refers to the fact that so many people draw their evolutionary networks in a manner that appears to violate time consistency.

Consider this example (from: Interspecies hybrids play a vital role in evolution. Quanta Magazine):

Note that the reticulation edges (the dashed lines) represent gene transfers by introgression or hybridization, and yet none of them are drawn vertically, as they would need to be in order to be time consistent (since time runs from left to right).

It might be argued that most of these are not all that important in practice, but the one to the left quite definitely matters very much. It shows gene transfer between: (i) an organism that speciated 3.65 million years ago and (ii) an organism that is the descendant of one that speciated 3.47 million years ago. The 180,000 years between those two events are not irrelevant; and they make the claimed gene transfer impossible.
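The arithmetic behind that figure can be written out as a short Python sketch. The function name is hypothetical, and the dates are stored as whole years (an assumption made here to avoid floating-point rounding); the two dates themselves are the ones quoted above.

```python
def gap_between_lineages(donor_end_years_ago: int, recipient_origin_years_ago: int) -> int:
    """Years separating the end of the donor lineage from the start of the recipient.

    A positive result means the donor had already ceased to exist (by speciating)
    before the recipient lineage arose, so the two were never contemporaries.
    Zero or a negative result means there is no gap.
    """
    return donor_end_years_ago - recipient_origin_years_ago


# Donor: the lineage that speciated 3.65 million years ago.
# Recipient: a lineage that arose from the speciation 3.47 million years ago.
gap = gap_between_lineages(3_650_000, 3_470_000)
time_consistent = gap <= 0
```

With these dates, `gap` is 180,000 years and `time_consistent` is False, which is the problem described above.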

One might think that this is simply the general media misunderstanding the network requirements, but this is not so. The diagram is actually a quite accurate representation of the one from the original scientific publication (from: Genome-wide signatures of complex introgression and adaptive evolution in the big cats. Science Advances 3: e1700299; 2017.):

The network shows the same series of hybridizations / introgressions. However, this time three sets of gene transfers are shown to be time consistent, represented by the horizontal arrows (since time flows from top to bottom). Two of the three diagonal arrows (light blue and orange) could be made time consistent (i.e. drawn horizontally), although the authors have chosen not to do so, apparently for artistic reasons. However, the first reticulation cannot be made time consistent, for the reason outlined above.

So, people, please think about what you are drawing, and don't show things that are biologically impossible.


  1. So far this sounds like it might mean introgression into the tiger line from an extinct stem group member of the LLJ line. This should essentially add up to, yes, some genes being transferred 180k years into the future…?

    1. Another possibility is an extinct descendant that was genetically identical to the ancestor. Either way, to draw the network properly, a ghost lineage needs to be added, and the gene transfer needs to be drawn from that ghost lineage, not from the ancestor.