Wednesday, April 18, 2012

An explanation of graph types

Biologists are sometimes unclear about how the distinction between directed and undirected graphs relates to whether those graphs are cyclic or acyclic. So, to help clarify matters, I have included a figure here that places examples of the various graphs into their respective categories.

There are four combinations of characteristics, shown in the figure as a 2x2 table.


In all of the graphs there are four (unlabelled) leaves, but the number of internal nodes and edges varies depending on whether there are cycles (4 nodes, 4 edges) or not (2 nodes, 1 edge; or 4 nodes, 4 edges).

The important point for biologists to note is that any evolutionary diagram must involve a directed acyclic graph (DAG). An undirected graph cannot represent history, because the direction of that history is not shown (and history is defined in terms of a past relative to the present). A directed cyclic graph cannot represent a realistic history, because at one of the nodes in the cycle an inferred ancestor is also its own descendant (or one of the inferred descendants is also its own ancestor).
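For readers who like to see this checked mechanically, here is a minimal Python sketch of a test for directed cycles, using Kahn's algorithm (repeatedly removing nodes that have no remaining ancestors). The node names and both edge lists are hypothetical illustrations of the internal part of a graph, not a transcription of the figure.

```python
from collections import deque

def is_acyclic(nodes, edges):
    """Return True if the directed graph has no directed cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Start from nodes with no parents (candidate roots).
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Nodes on a directed cycle can never reach in-degree 0, so they are never removed.
    return removed == len(nodes)

# Hypothetical internal nodes: a root "r", two descendants "a" and "b", and "h".
internal_nodes = ["r", "a", "b", "h"]

# Directed acyclic: both "a" and "b" contribute to "h".
dag_edges = [("r", "a"), ("r", "b"), ("a", "h"), ("b", "h")]

# Directed cyclic: "r" ends up as its own descendant.
cyclic_edges = [("r", "a"), ("a", "h"), ("h", "b"), ("b", "r")]

print(is_acyclic(internal_nodes, dag_edges))
print(is_acyclic(internal_nodes, cyclic_edges))
```

In the second edge list every node has an ancestor, so there is nowhere for the history to start, which is the computational version of "an ancestor is also its own descendant".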

Note that an undirected cyclic graph can be turned into either a directed cyclic graph or a directed acyclic graph. In a phylogenetic analysis, the goal is to produce a directed acyclic graph.
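As a rough illustration of this point, the following Python sketch takes one undirected cycle on four nodes and orients it in two ways. The node names, the `rank` ordering, and the helper functions are assumptions made for the example; the cycle test relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

def has_directed_cycle(edges):
    """Return True if the directed edges (parent, child) contain a directed cycle."""
    predecessors = {}
    for parent, child in edges:
        predecessors.setdefault(parent, set())
        predecessors.setdefault(child, set()).add(parent)
    try:
        # static_order() raises CycleError if no topological order exists.
        list(TopologicalSorter(predecessors).static_order())
        return False
    except CycleError:
        return True

def orient_by_rank(undirected_edges, rank):
    """Direct every edge from the lower-ranked node to the higher-ranked node."""
    return [(u, v) if rank[u] < rank[v] else (v, u) for u, v in undirected_edges]

# A hypothetical undirected cycle: r - a - h - b - r.
undirected_cycle = [("r", "a"), ("a", "h"), ("h", "b"), ("b", "r")]

# Orientation 1: follow the cycle around, giving a directed cyclic graph.
around = list(undirected_cycle)

# Orientation 2: direct edges away from "r" using an assumed time ordering.
rank = {"r": 0, "a": 1, "b": 2, "h": 3}
away_from_root = orient_by_rank(undirected_cycle, rank)

print(has_directed_cycle(around))
print(has_directed_cycle(away_from_root))
```

Directing every edge from an "earlier" node to a "later" node can never produce a directed cycle, which is one way of seeing why a consistent time direction yields a DAG.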

The main practical distinction between a "data-display network" and an "evolutionary network" is that the former is usually undirected and the latter always directed. The usual conceptual difference between a phylogenetic tree and an equivalent phylogenetic network (= evolutionary network) is that the latter has a reticulation node while the former does not.
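To make the tree/network difference concrete, here is a small Python sketch that finds reticulation nodes, taken here to mean nodes with more than one incoming edge. Both edge lists are hypothetical four-leaf examples invented for illustration, with leaves named "w", "x", "y" and "z".

```python
from collections import Counter

def reticulation_nodes(edges):
    """Return the nodes that have two or more parents in a list of (parent, child) edges."""
    indegree = Counter(child for _, child in edges)
    return sorted(node for node, degree in indegree.items() if degree >= 2)

# A rooted tree: every node other than the root has exactly one parent.
tree_edges = [
    ("r", "a"), ("r", "b"),
    ("a", "w"), ("a", "x"),
    ("b", "y"), ("b", "z"),
]

# A rooted network: node "h" has two parents, "a" and "b".
network_edges = [
    ("r", "a"), ("r", "b"),
    ("a", "w"), ("a", "h"),
    ("b", "h"), ("b", "z"),
    ("h", "x"), ("h", "y"),
]

print(reticulation_nodes(tree_edges))
print(reticulation_nodes(network_edges))
```

The tree yields no such nodes, while the network yields "h"; counting parents only makes sense once the edges have a direction.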

There seems to be no consistency in the literature about what to call a cycle in the various graphs. I have made two suggestions here (loop and circuit). But, what should one call the reticulated part of a DAG?

1 comment:

  1. The phylogeny in flowering plants depends on masses of crosses as the season of flowering for many c3 is temperate. When flowering occurs selfing is sometimes just as likely in monoecious plants as crossing is. Directed cyclic histograms must include generation upon generation of clones as well.