Wednesday, April 10, 2013

Highlighting splits in a splits graph

A splits graph is interpreted in terms of splits, or bipartitions, which divide the graph into two non-overlapping parts. If one wishes to refer to particular splits in a graph then one needs a way of highlighting those splits.
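To make the idea of a split concrete, here is a minimal Python sketch that treats a split as an unordered pair of complementary taxon sets. The names `make_split` and `is_trivial`, and the five example taxa, are illustrative assumptions and are not taken from any particular phylogenetics package.

```python
def make_split(taxa, side_a):
    """Return the split of `taxa` induced by `side_a`, as an unordered pair of sets."""
    taxa = frozenset(taxa)
    side_a = frozenset(side_a)
    # Both sides must be non-empty, so side_a has to be a non-empty proper subset.
    if not side_a or not side_a < taxa:
        raise ValueError("side_a must be a non-empty proper subset of taxa")
    # A frozenset of the two sides makes A|B and B|A compare equal.
    return frozenset({side_a, taxa - side_a})


def is_trivial(split):
    """A trivial split separates a single taxon from all the others."""
    return any(len(side) == 1 for side in split)


# Hypothetical taxon labels, for illustration only.
TAXA = ["A", "B", "C", "D", "E"]
ab_split = make_split(TAXA, ["A", "B"])
```

Storing the split as an unordered pair reflects the fact that neither side is privileged: naming `{A, B}` or naming `{C, D, E}` describes the same split.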

This can be done in a number of ways, some of them derived from conventions developed for presenting rooted phylogenetic trees. One option is to highlight the taxa in one of the two partitions, which is analogous to highlighting a clade in a rooted phylogenetic tree. Alternatively, we could colour the edges associated with each of the two partitions, as shown in this previous blog post (How to interpret splits graphs); however, this works for only a single split at a time.

It is also possible to label the edges of the splits themselves, as shown in this previous blog post (Representing evolutionary scenarios using splits graphs). Dabert et al. (Dabert M, Witalinski W, Kazmierski A, Olszanowski Z, Dabert J (2010) Molecular phylogeny of acariform mites (Acari, Arachnida): strong conflict between phylogenetic signal and long-branch attraction artifacts. Molecular Phylogenetics and Evolution 56: 222-241) present another possibility, which is to colour only the edges that separate the two partitions of each split, as shown in the figure.
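The idea of colouring only the separating edges can be sketched in code. In a splits graph, each split is drawn as a set of parallel edges, and deleting those edges leaves the two partitions in separate pieces. The following Python sketch assumes a hypothetical edge list in which every edge carries the identifier of its split; the tiny four-taxon graph, the split identifiers and the function names are all invented for illustration and do not reproduce the figure of Dabert et al.

```python
from collections import defaultdict

# A toy splits graph: one box (u1..u4) with a taxon attached to each corner.
# Each edge is (node, node, split_id); parallel box edges share a split_id.
EDGES = [
    ("u1", "u2", "s1"), ("u3", "u4", "s1"),   # separates {A, C} from {B, D}
    ("u1", "u3", "s2"), ("u2", "u4", "s2"),   # separates {A, B} from {C, D}
    ("A", "u1", "sA"), ("B", "u2", "sB"),
    ("C", "u3", "sC"), ("D", "u4", "sD"),     # trivial splits (pendant edges)
]
TAXA = {"A", "B", "C", "D"}


def edges_to_colour(edges, split_id):
    """Edges that would be coloured to highlight one split."""
    return [(u, v) for u, v, s in edges if s == split_id]


def partitions_of(edges, split_id, taxa):
    """Delete the edges of one split and report which taxa end up on each side."""
    adjacency = defaultdict(set)
    for u, v, s in edges:
        if s != split_id:            # keep every edge except those of this split
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = min(taxa)                # any taxon will do as a starting point
    seen, stack = {start}, [start]
    while stack:                     # depth-first search over the remaining edges
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    side = seen & taxa
    return side, taxa - side


coloured = edges_to_colour(EDGES, "s1")
left, right = partitions_of(EDGES, "s1", TAXA)
```

Because each split has its own set of edges, several splits can be coloured on the same graph without interfering with one another, which is the advantage over colouring everything on one side of a single split.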

This works very well visually. However, there is still the matter of actually labelling the coloured edges. Unfortunately, Dabert et al. chose to do this using terminology that is more appropriate for a rooted phylogenetic tree than an unrooted data-display network. That is, they refer to "clades", which can be recognized only in a rooted graph. Their diagram is clearly labelled with a root taxon, even though the graph itself is unrooted. The implication here is that interpreting the unrooted graph as a rooted network is straightforward, but it is not. It would be better to use the standard terminology, which refers to "splits" or "partitions", rather than to "clades".


  1. Agreed (mostly).

    If we're to use "clade" to mean "monophyletic clade" or even "my best guess at a monophyletic clade bearing in mind how difficult it is given my data", then it's a term that requires a root.

    Partition is probably better here than split though, because it seems to be a better known term for "division of set into things" and avoids the whole "lumpers vs splitters" taxonomist conflict.

    But if the graph has a root, then it's rooted: might not make biological sense to have the root there, but as a mathematical concept it is perfectly valid to say, certainly in a tree and probably in a split network, that "here is the root" and therefore "all these edges are directed away from the root". The slight caveat for these networks is that they do have to be split networks, else the implied direction is not clear. Here, it appears that there are no cycles created if we insist that all edges in a split (partition) have the same orientation, and for hybridisation networks this would also be the case, so perhaps it's not such a problem to supply a root to such a network...

    1. I agree that "partition" is better than "split". This is a splits graph, so the edges all have a unique direction, once a root location is specified. What the graph lacks is any biological interpretation of the nodes as ancestors. Nodes do have this interpretation in a rooted phylogenetic tree, and in an evolutionary network such as a hybridization network, so that these graphs depict ancestor-descendant relationships. In the example above, the so-called root does not provide this interpretation, and thus clades cannot be recognized.