Wednesday, March 12, 2014

The PhyloCode and reticulation

Biologists have this idea that when any of us uses a formal name then we should all be talking about the same thing. To this end, various codes of nomenclature have been proposed and agreed to over the centuries, notably those based on hierarchical ranking (eg. International Code of Nomenclature for Algae, Fungi, and Plants; International Code of Zoological Nomenclature; International Code of Nomenclature of Bacteria; International Code of Nomenclature for Cultivated Plants). Others have not been universally agreed to, and are not yet being used (eg. BioCode; PhyloCode).

Of the latter group, the PhyloCode is not dead, but it is certainly hibernating. As explained by Mike Keesey (The PhyloCode Has a Deadline):
The PhyloCode (more verbosely, the International Code of Phylogenetic Nomenclature) is a proposed nomenclatural code, intended as an alternative to the rank-based codes. It was first drafted in April 2000, and at that time the starting date was given as "1 January 200n". On this date the code would be enacted and published along with a companion volume, which would provide the first definitions under the code, establishing best practices and defining the most commonly used clade names across all fields of biology.
Well, the '00s came and went without the code being enacted. The hold-up was not the code itself, which has been at least close to its final form since 2007. (The last revision, in January 2010, was minor.) And it hasn't been the software for the registration database, which has been completed. The hold-up was the companion volume, which turned out to be a much more daunting project than expected.
There is a new progress report for Phylonyms, the companion volume to the PhyloCode. There will be at most 268 entries. Currently 186 of those (over two thirds) have already been accepted. The rest are at various stages of review. The contract with University of California Press calls for the manuscript to be submitted by September 1, 2014.

Of interest to us here at this blog is how the PhyloCode treats reticulate evolution. In the rank-based nomenclatural codes (eg. ICN, ICZN), reticulate evolution is ignored. The named groups at any given rank are mutually exclusive, so that each taxon can belong to only one of them. This naming scheme can be used to represent hierarchical relationships but not reticulate ones.

With regard to the PhyloCode, Philip Cantino explicitly addressed this issue at the Botany 2008 conference (The taxonomic treatment of hybrid derivatives under the ICBN and the PhyloCode):
By convention, ranked taxa must be either nested or mutually exclusive, but clades that include species of hybrid origin may be partially overlapping. Consequently, reticulate evolution presents a challenge for phylogenetic systematists using traditional rank-based taxonomy and nomenclature, where a species can belong to only one taxon at a given rank. Assignment of a species derived from an intersectional (or intersubgeneric or intergeneric) hybrid to only one of its parental sections (or subgenera or genera) renders the other parental taxon at the same rank paraphyletic. When classifying such hybrids using a ranked hierarchy, one must reject either the convention that an organism can only belong to one taxon at a given rank or the convention that paraphyletic groups should not be formally recognized. Phylogenetic nomenclature accurately reflects the complex patterns of descent that result from hybridization, in that a species of hybrid origin belongs to all of the named clades that contain each of its parents. Thus, the expectation that named supraspecific taxa be monophyletic is maintained in spite of hybridization.
Putting aside the obvious suggestion that we could allow named groups to be paraphyletic (which they can be under the rank-based codes but not the PhyloCode), the suggestion that organisms can belong to more than one named group (which they can under the PhyloCode but not the other codes) is an interesting departure from tradition. It explicitly recognizes the existence of fuzzy groups, which can overlap.
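To make the idea of partially overlapping clades concrete, here is a minimal Python sketch. It assumes a made-up toy phylogeny stored as a directed graph (the node names `sectA`, `sectB`, `a1`, `H` and so on are hypothetical, not taken from Cantino's talk), and it treats a clade simply as an ancestor plus everything descended from it. The species of hybrid origin `H` has one parent in each section, so it ends up in both clades.

```python
from collections import defaultdict

# Hypothetical toy phylogeny: each edge points from parent to child.
# "H" is a species of hybrid origin with one parent in each section.
EDGES = [
    ("root", "sectA"), ("root", "sectB"),
    ("sectA", "a1"), ("sectA", "a2"),
    ("sectB", "b1"), ("sectB", "b2"),
    ("a2", "H"), ("b1", "H"),
]


def build_children(edges):
    """Map each parent to the list of its children."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    return children


def clade(ancestor, children):
    """Return the ancestor plus everything descended from it."""
    members, stack = set(), [ancestor]
    while stack:
        node = stack.pop()
        if node not in members:
            members.add(node)
            stack.extend(children.get(node, []))
    return members


def partially_overlapping(x, y):
    """True when two sets share members but neither contains the other."""
    return bool(x & y) and not (x <= y or y <= x)


children = build_children(EDGES)
clade_a = clade("sectA", children)
clade_b = clade("sectB", children)

print(sorted(clade_a & clade_b))                # ['H']
print(partially_overlapping(clade_a, clade_b))  # True
```

In a strictly diverging tree the last line could never print `True`: any two clades would be either nested or disjoint. A single hybridization edge is enough to break that property.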


The PhyloCode has little to say explicitly about reticulation, but what it does say is clear:
Note 2.1.3. Clades are often either nested or mutually exclusive; however, phenomena such as speciation via hybridization, species fusion, and symbiogenesis can result in clades that are partially overlapping.
Note 2.2.1. Here and elsewhere in this code, "phylogenetic tree" is used loosely to include any directed graph, specifically those with additional connections representing phenomena such as hybridization (see Note 2.1.3).
Note 9.3.2. The application of a phylogenetic definition, and thus also of a phylogenetically defined clade name, requires a hypothesized phylogeny. To accommodate phenomena such as speciation via hybridization, species fusion, and symbiogenesis (see Note 2.1.3), the hypothesized phylogeny that serves as the context for the application of a phylogenetically defined name need not be strictly diverging.
Chapter VI. Provisions for Hybrids
Article 16.
16.1. Hybrid origin of a clade may be indicated by placing the multiplication sign (×) in front of the name. The names of clades of hybrid origin otherwise follow the same rules as for other clades.
16.2. An organism that is a hybrid between named clades may be indicated by placing the multiplication sign between the names of the clades; the whole expression is then called a hybrid formula.
Recommendation 16.2A. In cases in which it is not clear whether a set of hybrid organisms represents a clade (as opposed to independently produced hybrid individuals that do not form a clade), authors should consider whether a name is really needed, bearing in mind that formulae, though more cumbersome, are more informative.
In many ways, the sentiments expressed here about phylogenetics are the same as those embodied in the recent announcement of the NSF Genealogy of Life program (GoLife) (see NSF and reticulating phylogenies): a genealogy does not have to be tree-like.


In one sense, we should applaud the creators of the PhyloCode for explicitly addressing an issue that has traditionally been ignored by the creators of the other codes (who have ignored phylogeny), as well as by tree-based phylogeneticists (who seem to think that phylogenies consist only of nested monophyletic groups).

Previous suggestions for dealing with hybrids look a bit like an attempt to sweep all of the problems together into separate piles, and then simply label them "problem piles" (see How should we treat hybrids in a taxonomic scheme?). This is very much what is done, for example, under the International Code of Nomenclature for Algae, Fungi, and Plants. Here, hybrids are treated as separate taxa, and are named as such using a "hybrid formula" that applies to distinct "nothotaxa".

Furthermore, species separately derived from the same ancestral gene pool are considered to be distinct species, and are named appropriately. However, hybrids derived independently from crosses of the same two species appear to be treated in botany as being the same taxon, and thus share the same name. For example, the ICN states: "Elymus ×laxus is the correct name applicable to all hybrids between E. farctus and E. repens" and "the correct nothospecific designation for all hybrids between Euphorbia amygdaloides and E. characias is E. ×martini". Multiple origins are not considered.

Unfortunately, while the PhyloCode does better than this, the potential consequences of the PhyloCode rules may be somewhat messy. For example, introgression is an extensive phenomenon in zoology and especially botany, and if we were to take the PhyloCode literally then a huge number of populations would have multiple species names. Moreover, horizontal gene transfer creates relationships between distant taxa, so that species would have names in two unrelated groups (eg. an animal name and a viral name). Finally, symbiogenesis means that all of the eukaryotes would have both a eukaryote name and a proteobacterium name (since that is where their mitochondrion probably originated), and all of the plants would also have a cyanobacterium name (since that is where their chloroplast probably originated).
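The symbiogenesis case can be sketched in a few lines of Python. This is a deliberately crude illustration under stated assumptions: the graph below is hypothetical and hugely simplified (each named clade is collapsed to a single ancestral node, and `host_lineage` and `LUCA` are placeholder labels), and "belongs to a named clade" is taken to mean "has that node somewhere in its ancestry". It is not a rendering of the PhyloCode's actual definitions.

```python
# Hypothetical toy graph: each node lists its parents. More than one parent
# marks a merger of lineages, here by symbiogenesis.
PARENTS = {
    "Proteobacteria": ["LUCA"],
    "Cyanobacteria": ["LUCA"],
    "host_lineage": ["LUCA"],
    "Eukaryota": ["host_lineage", "Proteobacteria"],  # mitochondrion
    "animal": ["Eukaryota"],
    "plant": ["Eukaryota", "Cyanobacteria"],          # chloroplast
}

# Nodes that we pretend carry a formal clade name.
NAMED_CLADES = {"Proteobacteria", "Cyanobacteria", "Eukaryota"}


def named_clades_containing(taxon):
    """List the named clades found anywhere in the taxon's ancestry."""
    seen, stack = set(), [taxon]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(PARENTS.get(node, []))
    return sorted(seen & NAMED_CLADES)


print(named_clades_containing("animal"))
# ['Eukaryota', 'Proteobacteria']
print(named_clades_containing("plant"))
# ['Cyanobacteria', 'Eukaryota', 'Proteobacteria']
```

Even in this tiny graph every eukaryote picks up a bacterial clade, and plants pick up two. Adding edges for introgression or horizontal gene transfer would only lengthen the lists.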

On one hand, fuzzy groups are a reality in phylogenetics, as a result of reticulate evolutionary histories. On the other hand, there is a good practical reason why the traditional codes of nomenclature are based on mutually exclusive groups. The only complete and accurate representation of group relationships is the phylogeny itself, and trying to name groups that represent only parts of the phylogeny is a poor substitute for that diagram. This is the dilemma faced by the PhyloCode: in practice it is trying to substitute names for relationships.
