Wednesday, July 24, 2013

A rant about the term "evolutionary network"


Mostly, I just rant to myself, and so I have generally avoided doing so in this blog. But this time I intend to make an exception.

The expression "evolutionary network" has become completely meaningless in science, and this is a pity. This has happened because it has been applied to so many unrelated concepts that we can no longer tell what anyone means by it without reading the rest of their text for context.

Networks are, of course, ubiquitous in areas as diverse as the social sciences, biology, computer science, physics and economics, and consequently there is an extensive literature on the subject. This means that the term "evolutionary network" has different meanings in assorted areas of intellectual activity, such as neural networks, systems biology and quality measurement, as well as in phylogenetics. What is annoying me, however, is that biologists themselves use the term in oodles of different ways, as well.

Partly, this issue arises because of the use by computer scientists of known biological processes as models for developing computer algorithms, which are then named after the process that provided the inspiration (e.g. so-called genetic algorithms). Partly, the problem comes from claiming that a particular process (or something analogous to it) does actually occur in some particular field of study, and therefore using the relevant name (e.g. so-called evolutionary computing). But the problem in biology is that everyone claims that they are studying evolution, and therefore whatever they do can be called "evolutionary".

The essential point in biology is, naturally, that most patterns are the product of one or more evolutionary processes, to one degree or another. That does not, however, justify calling all patterns and processes "evolutionary". For example, observed similarity (of genes, genomes, organisms, species, etc.) may or may not have a large evolutionary component: similarity may be the result of either proximate processes (which may be ecological, rather than strongly evolutionary) or ultimate processes (which are very likely to be evolutionary).

This was one of the strongest arguments for the distinction that has been made between phenetics (based on overall similarity) and phylogenetics (based on genealogy). A phenogram (expressing observed similarity) and a phylogram (expressing inferred genealogy) may be two very different things for any given group of objects. There seems to be no real justification for merging these two ideas, and yet this seems to be occurring increasingly.

The latest salvo that blurs the distinction between similarity and genealogy has been fired by Halary et al. (2013. EGN: a wizard for construction of gene and genome similarity networks. BMC Evolutionary Biology 13: 146), who have this to say:
"Here, we introduce a simple but powerful software program, EGN (for Evolutionary Gene and genome Network), for the reconstruction of similarity networks from large molecular datasets."
To explain this, in an earlier paper Alvarez-Ponce and colleagues (2013. Gene similarity networks provide tools for understanding eukaryote origins and evolution. Proceedings of the National Academy of Sciences of the USA 110: E1594–E1603) developed the idea of a gene similarity network, the name of which tells you exactly what it is. It is a non-phylogenetic network in which the edges directly connect observed genes based on their similarity; that is, it extends the classical concept of gene families. The authors present various reasons to justify their claim that "gene similarity networks have the potential to explore deeper relationships than phylogenetic trees".
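
To make the idea of a similarity network concrete, here is a minimal sketch in Python. It is not the EGN algorithm or the method of either paper; the gene names, similarity scores and threshold are all invented for illustration. It simply links two genes whenever their pairwise similarity reaches a cutoff, and then reads off the connected groups.

```python
def similarity_network(scores, threshold):
    """Link two genes whenever their pairwise similarity meets the threshold.

    `scores` maps a pair of gene names to a similarity value; both the data
    layout and the simple cutoff rule are assumptions made for this sketch.
    """
    graph = {}
    for (gene_a, gene_b), score in scores.items():
        # Register both genes as nodes, even if they end up unconnected.
        graph.setdefault(gene_a, set())
        graph.setdefault(gene_b, set())
        if score >= threshold:
            graph[gene_a].add(gene_b)
            graph[gene_b].add(gene_a)
    return graph


def connected_components(graph):
    """Group genes that are joined by any chain of similarity edges."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical similarity scores between five made-up genes.
scores = {
    ("geneA", "geneB"): 0.90,
    ("geneB", "geneC"): 0.80,
    ("geneA", "geneC"): 0.20,
    ("geneC", "geneD"): 0.10,
    ("geneD", "geneE"): 0.95,
}

network = similarity_network(scores, threshold=0.5)
components = connected_components(network)
```

In this toy example geneA and geneC are not directly similar, but they end up in the same group because both resemble geneB. Nothing in the calculation refers to ancestry or descent: the edges record observed similarity and nothing more.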

The follow-up paper by Halary et al. (the one under discussion here) describes a computer program that automates the production of these gene similarity networks. But why have they called the program "Evolutionary Gene Network" rather than some version of "Gene Similarity Network"? This name is not only blatantly misleading but downright confusing. The network produced can be used to explore evolutionary history, sure, but it does not represent anything directly evolutionary. The evolutionary interpretation is in the mind of the beholder, not in the network algorithm.

I encourage everyone to be careful when naming their programs. A program name can mislead naive users if the name is disconnected from the program's purpose. Even the program SplitsTree mostly produces networks these days, and very rarely trees!

The term "evolutionary network" in biology, at least, could be usefully restricted to those networks representing evolutionary history directly (e.g. Thiergart et al. 2012. An evolutionary network of genes present in the eukaryote common ancestor polls genomes on eukaryotic and mitochondrial origin. Genome Biology & Evolution 4: 466–485).
