Wednesday, January 7, 2015

Complex hybridizations in wheat

The structural complexity of phylogenetic networks is a recurring topic of discussion. At one extreme, species phylogenies are seen as trees with occasional reticulations; at the other, there is a whole cobweb of reticulations with no visible tree. In this context, comments are sometimes made about the plausibility of those outputs from network programs that show extensive gene flow. If a biologist does not believe that the history of "their" organisms involves extensive reticulation, then the algorithmic outputs might be dismissed as unrealistic.

Here I present one well-known example of extensive hybridization, in which the computer programs seem to agree on the same complex solution — the history of common bread wheat.

The data and analyses are from:
Marcussen T, Sandve SR, Heier L, Spannagl M, Pfeifer M, International Wheat Genome Sequencing Consortium, Jakobsen KS, Wulff BB, Steuernagel B, Mayer KF, Olsen OA (2014) Ancient hybridizations among the ancestral genomes of bread wheat. Science 345: 1250092.

The hybridization network shown above is a montage of two different phylogenies from the original paper. It shows four splits, one homoploid hybridization, and two polyploid hybridizations. Divergence times are shown in the circles, in millions of years (note that the scale is not linear).

The first split (6.5 million years ago) is between the genera Triticum (wheat) and Aegilops (goatgrasses), which are morphologically highly distinct, with Aegilops having rounded glumes rather than keeled glumes. There are currently c.20 recognized species in both Aegilops and Triticum, so only a small part of the diversity is shown in the network.

Domesticated bread wheat (T. aestivum) is a hexaploid species, whose three constituent diploid genomes are known as A, B and D. Their lineages are labeled and colored in the network diagram. The genome D lineage, which has been taxonomically treated as part of Aegilops, is the result of a homoploid hybridization. Bread wheat is then the recent result of two successive allopolyploid hybridizations, with a tetraploid lineage as the intermediate.

Of the other species shown in the network, all of the goatgrasses are wild diploid species, as is T. urartu. T. monococcum is also diploid, with domesticated Einkorn wheat being derived from the wild ancestor. T. turgidum is a tetraploid species, with domesticated Emmer wheat being derived from the wild ancestor; it has recently diversified into many modern wheat species.

This is one of the most complex phylogenetic networks known, although that complexity is at least partly the result of leaving out most of the other diploid species in the Triticum and Aegilops clades. Program outputs that are more complex than this are unlikely to be realistic.

1 comment:

  1. David,

    Thanks very much for bringing our attention to this. My impression is that in terms of explicit/evolutionary/rooted phylogenetic networks this is probably the most high-profile publication so far. Great that networks are reaching a wider audience.

    It will be interesting to see to what extent their experimental methodology is wheat-specific and to what extent it can be re-used in other contexts. One thing immediately caught my eye: on page 1 there is a figure showing an explicit phylogenetic network on 3 taxa. We know mathematically that such network topologies cannot be distinguished purely by considering the topology of input gene trees (there are two other network topologies that behave "the same" through this lens). So it will be good to study how far they have used additional model parameters and assumptions (such as ILS) to strengthen their hypotheses of network topology.