Tuesday, June 12, 2012

New version of phylogenetic networks software

There's a new release of Dendroscope with several new methods for constructing phylogenetic networks. This post explains the new functionality added to the Cass algorithm.

Cass can be used to construct a rooted phylogenetic network from any number of rooted trees. These trees can be multifurcating (nonbinary). The output network will display all clusters of the input trees. In certain situations the algorithm has been shown to minimize the reticulation number of the network (see van Iersel et al. 2010 and Kelk et al. 2012).
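To make "clusters of the input trees" concrete, here is a minimal Python sketch. It is not code from Cass or Dendroscope. The nested-tuple tree representation and the function names `clusters` and `nontrivial_clusters` are my own assumptions for illustration. A cluster is simply the set of taxa below some node of a rooted tree.

```python
def clusters(tree):
    """Return all clusters of a rooted tree as frozensets of leaf labels.

    Assumed representation: a tree is either a leaf label (str) or a tuple
    of subtrees. A tuple may have more than two entries, so multifurcating
    (nonbinary) trees are allowed.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            leaf_set = frozenset([node])
        else:
            leaf_set = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    leaves_below(tree)
    return found


def nontrivial_clusters(trees):
    """Union of the clusters of several trees, without singletons and the full taxon set."""
    all_clusters = set().union(*(clusters(t) for t in trees))
    taxa = frozenset().union(*all_clusters)
    return {c for c in all_clusters if len(c) > 1 and c != taxa}


# Two hypothetical input trees on the taxa a, b, c, d (both multifurcating at the root).
t1 = (("a", "b"), "c", "d")
t2 = ("a", ("b", "c"), "d")
print(nontrivial_clusters([t1, t2]))  # the two clusters {a, b} and {b, c}
```

The two clusters {a, b} and {b, c} overlap without one containing the other, so no single tree contains both. A network is needed to represent them together.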

The new release of Cass can also produce networks that display the input trees themselves (rather than only the clusters from the input trees). In other words, Cass can be used as a heuristic for the problem Hybridization Number on any number of multifurcating trees. This is a notoriously hard problem, and at the moment Cass is the only implemented algorithm that handles more than two multifurcating trees.
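For readers who want to see what it means for a network to display a cluster, here is a rough brute-force sketch in Python. It assumes the usual "softwired" definition: a cluster is displayed if it appears below some node after keeping exactly one incoming edge for each reticulation. The edge-list representation and the function name `softwired_clusters` are assumptions for illustration, and this is not how Cass works internally.

```python
from itertools import product


def softwired_clusters(edges):
    """Clusters displayed by a rooted network given as (parent, child) edges.

    For every way of keeping exactly one incoming edge per reticulation
    node, collect the leaf set below each node of the resulting tree.
    Exponential in the number of reticulations, so only for small examples.
    """
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, []).append(parent)
    nodes = {n for edge in edges for n in edge}
    leaves = nodes - {parent for parent, _child in edges}
    root = next(n for n in nodes if n not in parents)
    retics = [n for n, ps in parents.items() if len(ps) > 1]

    displayed = set()
    for choice in product(*(parents[r] for r in retics)):
        kept = dict(zip(retics, choice))
        # Keep every edge into a tree node, and only the chosen edge into a reticulation.
        children = {}
        for parent, child in edges:
            if kept.get(child, parent) == parent:
                children.setdefault(parent, []).append(child)

        def below(node):
            if node in leaves:
                result = frozenset([node])
            else:
                result = frozenset().union(*(below(c) for c in children.get(node, [])))
            if result:  # nodes that lost all their children contribute nothing
                displayed.add(result)
            return result

        below(root)
    return displayed


# Hypothetical network with one reticulation h (parents u and v) above leaf b.
network = [("r", "u"), ("r", "v"), ("u", "a"), ("u", "h"),
           ("v", "h"), ("v", "c"), ("h", "b")]
shown = softwired_clusters(network)
print(frozenset("ab") in shown, frozenset("bc") in shown)  # True True
```

Displaying a tree is a stronger requirement: all clusters of that tree have to appear together under a single choice of incoming edges, not under different choices for different clusters.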

For inputs consisting of two multifurcating trees, Cass solves Hybridization Number optimally (Kelk et al. 2012). For such instances, Dendroscope also contains a faster optimal algorithm by Huson and Linz (submitted); go to Algorithms - Hybridization Networks. For bifurcating trees, there is one other heuristic available that can be used for any number of trees: the program PIRN by Yufeng Wu.

But, as I said, for more than two multifurcating trees, Cass is the only method currently available.

Let's have a look at the new functionality of Cass for these input trees:

The Cass algorithm can be found here (Algorithms - Level-k Network Consensus):

You get several new options:

If we don't check the box "construct only networks that display the trees", we get the following network with one reticulation, which displays all clusters of all input trees:

If we do check the box "construct only networks that display the trees", we get the following three networks with two reticulations each. Each of the networks displays all three input trees.

You see that in this case one needs more reticulations to display the trees than to display the clusters from the trees. You also see that Cass can now produce several solutions rather than just one. Note, however, that Cass is not guaranteed to find all optimal solutions: because of a collapsing step in the algorithm, it might miss networks (see Kelk et al. 2012). For the same reason, Cass is not guaranteed to find an optimal solution at all, unless the input consists of only two trees or the output network is at most level-2.
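If you want to count reticulations yourself, the reticulation number of a network is usually defined as the sum, over all nodes with more than one incoming edge, of the indegree minus one. Here is a small Python sketch of that count. The edge-list representation and the function name `reticulation_number` are assumptions for illustration, not part of Dendroscope.

```python
from collections import Counter


def reticulation_number(edges):
    """Sum of (indegree - 1) over all nodes with indegree at least 2.

    `edges` is a list of (parent, child) pairs describing a rooted network.
    """
    indegree = Counter(child for _parent, child in edges)
    return sum(d - 1 for d in indegree.values() if d >= 2)


# Hypothetical network in which node h has two parents, u and v.
network = [("r", "u"), ("r", "v"), ("u", "a"), ("u", "h"),
           ("v", "h"), ("v", "c"), ("h", "b")]
print(reticulation_number(network))  # 1
```

For a tree the count is zero, and in a network where every reticulation has exactly two parents it equals the number of reticulation nodes.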

Finally, Cass can also be used to construct a network from a multi-labelled tree. Go to Algorithms - Multi-Labelled-Tree to Network - MUL to Network, level-k-based.
