Wednesday, December 3, 2014

Visual complexity and phylogenetic networks

Network diagrams have become rather commonplace in the modern world. Most of them are constructed along the same lines: observed entities (objects or concepts, or groups of them) are connected by lines showing observed relationships. Such visualizations are easy to create using computers, and so they represent a relatively new form of visual data analysis. The complexity of the diagrams can be both seen and quantitatively analyzed, thus forming part of what is now grandiosely called "data mining and knowledge discovery".

The Visual Complexity project has been compiling an interesting set of online network visualizations. While the author (Manuel Lima) intends this to be "a unified resource space for anyone interested in the visualization of complex networks", at the moment it is simply a magpie collection of references to web pages. There are currently nearly 800 visualizations referenced, grouped into:
  • Art
  • Music
  • Biology
  • Food Webs
  • Transportation Networks
  • Business Networks
  • Social Networks
  • Political Networks
  • Computer Systems
  • Internet
  • World Wide Web
  • Pattern Recognition
  • Semantic Networks
  • Knowledge Networks
  • Multi-Domain Representation
  • Others
Our interest is in the Biology group, of course, where we have long known about networks, including food webs, which you will notice are grouped separately. There are currently 52 networks (plus 8 in the Food Webs group), covering a wide range of topics, such as:
  • Gene interaction networks
  • Protein-protein interaction networks
  • Protein "homology" networks
  • Neuron networks
  • Haplotype blocks
  • Metabolic pathways
  • Genome maps
  • Physiology maps
  • Disease maps
  • Visualizing the aging process

This is all very well. However, we are specifically interested in phylogenetic networks, which are as old-fashioned as food webs. They differ significantly from these other biological networks. Phylogenies connect observed entities (objects, or groups of them) only indirectly, via unobserved nodes, with the lines representing inferred affinity or genealogical relationships. Only at the population level is it likely that all internal nodes, representing individuals, will be observed, and that their relationships might also be observed.
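
The structural difference described above can be made concrete with a small sketch. Everything in it is invented for illustration: the node names, the edge lists, and the two helper functions are hypothetical, and the code is not taken from Visual Complexity or from any phylogenetics package. It simply contrasts a network in which observed entities are linked directly with one in which the observed entities sit at the tips and are linked only through inferred internal nodes.

```python
from collections import defaultdict

# Hypothetical interaction network: every node is an observed entity,
# and every edge is an observed relationship.
interaction_edges = [("geneA", "geneB"), ("geneB", "geneC"), ("geneA", "geneC")]
interaction_observed = {"geneA", "geneB", "geneC"}

# Hypothetical phylogenetic network, written as (parent, child) pairs.
# Only the three taxa are observed; "root", "h1", "h2" and "h3" are
# inferred (unobserved) ancestors.
phylo_edges = [
    ("root", "h1"), ("root", "h2"),
    ("h1", "taxonA"), ("h1", "h3"),
    ("h2", "h3"), ("h2", "taxonC"),
    ("h3", "taxonB"),
]
phylo_observed = {"taxonA", "taxonB", "taxonC"}


def direct_observed_links(edges, observed):
    """Return the edges whose two endpoints are both observed entities."""
    return [(u, v) for u, v in edges if u in observed and v in observed]


def multi_parent_nodes(edges):
    """Return the nodes that have more than one parent.

    In a rooted phylogeny, such nodes are what separate a network from a tree.
    """
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return sorted(node for node, ps in parents.items() if len(ps) > 1)


if __name__ == "__main__":
    print(direct_observed_links(interaction_edges, interaction_observed))
    print(direct_observed_links(phylo_edges, phylo_observed))
    print(multi_parent_nodes(phylo_edges))
```

In this toy example every edge of the interaction network joins two observed entities, whereas no edge of the phylogeny does; the taxa are connected only by paths through the inferred nodes. The single node with two parents is what makes the second example a network rather than a tree.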

There are currently three phylogenies referenced by Visual Complexity:
Only the last of these is a network, the other two being trees. Sadly, the first one also contains a dead link, which is a common problem for multi-year internet projects.

Unfortunately, the uniqueness of phylogenies among networks is not acknowledged by the Visual Complexity site. This is not unusual amongst network researchers, most of whom have never even heard of phylogenies. Moreover, many of the people who do seem to have heard of them fail to understand them and their interpretation, and so do not notice the fundamental difference. Nevertheless, phylogenetic networks are among the oldest types of recorded network, and there are certainly complex versions of them dating back to the 1700s (see those by Herman and by Batsch in Affinity networks updated).

Finally, the Visual Complexity site does not yet have much from anthropology (as distinct from the social sciences in general) or anything from linguistics (other than programming languages!). These are promising areas for studies of visual complexity.
