The Genealogical World of Phylogenetic Networks: January 2016

Monday, January 25, 2016

What is in Traditional Chinese Medicines — a network analysis

I used to work for the New South Wales Institute of Technology, which in the late 1980s mutated into the University of Technology Sydney (UTS). During this process it acquired an organization called the College of Traditional Chinese Medicine. This group was placed in the Faculty of Science, for lack of anywhere else to put it.

These people had little contact with the rest of the faculty, and I don't recall ever meeting any of them. Indeed, their work was not really based on Western science. These days, the UTS College of Traditional Chinese Medicine offers a Bachelor of Health Science in Traditional Chinese Medicine, although its most visible presence is the UTS Chinese Herbal Medicine Clinic, which is also nominally still part of the Faculty of Science.

The presence of Traditional Chinese Medicine (TCM) in an Australian university setting is relevant to today's blog post, because Australia seems to be one of the few places to have shown any interest in connecting TCM and Western science. Indeed, there is also a Uniclinic of Traditional Chinese Medicine within the School of Science and Health at Western Sydney University. Most of the interest in studying TCMs has otherwise been confined to Asia (see Dennis Normile. 2003. The new face of Traditional Chinese Medicine. Science 299: 188-190).

Recently, a group of Australian researchers decided to have a look at the content of some of the TCMs available in their country:

Megan L. Coghlan, Garth Maker, Elly Crighton, James Haile, Dáithí C. Murray, Nicole E. White, Roger W. Byard, Matthew I. Bellgard, Ian Mullaney, Robert Trengove, Richard J.N. Allcock, Christine Nash, Claire Hoban, Kevin Jarrett, Ross Edwards, Ian F. Musgrave & Michael Bunce (2015) Combined DNA, toxicological and heavy metal analyses provides an auditing toolkit to improve pharmacovigilance of traditional Chinese medicine (TCM). Scientific Reports 5: 17475.

Some of these TCMs (12 out of 26) are registered for use with the Therapeutic Goods Administration, which regulates their use within Australia, while the other TCMs are not (which technically means that they should not have been commercially available). However, there is little in the way of pharmacovigilance of herbal medicines anywhere in the world.

All of the products were comprehensively audited for their biological (via next-generation DNA sequencing), toxicological (LC-MS analysis) and heavy metal (arsenic, cadmium and lead, via SF-ICP-MS analysis) contents. For the latter two analyses, the amount of material detected was also quantified.

As usual, we can use a phylogenetic network to visualize these data, which I have done using a neighbor-net network on the presence-absence data. The result is shown in the figure. TCMs that are closely connected in the network are similar to each other based on their detected contents, and those that are further apart are progressively more different from each other. The registered products are highlighted in red.
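A neighbor-net is computed from a matrix of pairwise distances, so the presence-absence data first have to be turned into distances. The following is only a minimal sketch of that step. The product names and detected contents are hypothetical (they are not the data from the paper), and the Jaccard distance is an assumption, since the post does not say which distance measure was used.

```python
from itertools import combinations

# Hypothetical presence-absence data (NOT taken from the paper):
# each product maps to the set of contents detected in it.
detected = {
    "product_A": {"Ephedra", "Glycyrrhiza", "arsenic"},
    "product_B": {"Ephedra", "Glycyrrhiza"},
    "product_C": {"Panax", "paracetamol", "lead"},
}


def jaccard_distance(a, b):
    """One minus (shared items / items found in either set)."""
    union = a | b
    if not union:
        # Two products with nothing detected are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(data):
    """Return the sorted names and a dict of symmetric pairwise distances."""
    names = sorted(data)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(data[x], data[y])
        dist[(x, y)] = dist[(y, x)] = d
    return names, dist


names, dist = distance_matrix(detected)
for x in names:
    print(x, " ".join(f"{dist[(x, y)]:.2f}" for y in names))
```

A matrix like this could then be passed to a program that implements neighbor-net. Products with identical detected contents get a distance of 0, and products with nothing in common get a distance of 1.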

There is wide variation among the products. The seven most divergent TCMs in the network are all unregistered, while the remaining seven unregistered products are more similar to the registered TCMs. Only two TCMs (TCM10 and TCM17) show no discrepancies between the detected contents and what was declared (either to the regulatory agency, or to the consumer in the form of an ingredients list).

The authors summarize this situation:

Genetic analysis revealed that 50% of samples contained DNA of undeclared plant or animal taxa, including an endangered species of Panthera (snow leopard). In 50% of the TCMs, an undeclared pharmaceutical agent was detected including warfarin, dexamethasone, diclofenac, cyproheptadine and paracetamol. Mass spectrometry revealed heavy metals including arsenic, lead and cadmium, one with a level of arsenic >10 times the acceptable limit.

This study presents genetic, toxicological, and heavy metal data that should be of serious concern to regulatory agencies, medical professionals and the public who choose to adopt TCM as a treatment option. Of the 26 TCMs investigated, all but two can be classified as non-compliant on the grounds of DNA, toxicology and heavy metals, or a combination thereof. In total, 92% were deemed non-compliant with some medicines posing a serious health risk.

Such findings are not only of concern to the consumer, but also flag the need for detailed auditing of herbal preparations prior to evaluation in clinical trials.

Tuesday, January 12, 2016

Directional processes in language change

Given that the new year has only just begun, it seems in order to talk about directions: not in general, but specifically about directions in language change. This matters because many processes in language evolution are directional. That is, a change from a state X to a state Y is frequently attested across a large number of the world's languages, while the opposite change, from state Y to state X, is extremely rare or even unattested.

In language evolution there are many well-known and well-investigated processes with a strong directional tendency. In sound change, for example, a [p] can easily become an [f], whether in the Indo-European, the Austronesian, or the Sino-Tibetan languages. Yet the opposite process, in which an [f] becomes a [p], is extremely rare. Similar tendencies hold for a [k] becoming a [tʃ], as in Italian [ˈtʃɛnto] cento "hundred", going back to Latin [kɛntum] centum "hundred", or a [g] becoming an [ɦ], as in Czech [ɦora] hora "mountain", going back to Proto-Slavic *gora "mountain" (Derksen 2008).

In semantic change, unidirectional tendencies can also be observed, although it is often more difficult to identify them, let alone to generalise from them. Nevertheless, I think it is a rather safe bet to claim that words which originally mean "head" have a certain tendency to shift their meaning to denote "(the) first, the boss" or "the upper part, the top", while the opposite shift (that words which mean "boss" or "top" will be used to denote "head") is very unlikely to happen. Finally, in grammatical change, or, to be more precise, in grammaticalization (the process by which languages acquire new grammatical categories), directionality is one of the most important constraints (Haspelmath 2004).

Linguists usually know these tendencies very well, and they use them in their daily work, be it when trying to reconstruct the original pronunciation of words in unattested ancestral languages, when deciphering historical documents, or when tracing the semantic development of words through history. Directional changes are also important in evolutionary biology. Ratchet-like (that is, unidirectional) processes serve as a major explanans for constructive neutral evolution (Gray et al. 2010), direction is at the core of lateral gene transfer, and, as David mentioned in an earlier post, the usage of directional (non-reversible) models in phylogenetic reconstruction even provides an elegant way to root a tree (see also Huelsenbeck et al. 2002).

Given the active transfer of ideas from the biological to the linguistic domain in the last two decades, and the important role that directional processes play in both domains, it is surprising to me that methodological transfer has so far been almost exclusively limited to time-reversible models. The only approach known to me that explicitly makes use of linguistic knowledge of directions is that of Baxter (2006). In this paper, Baxter analysed phonological mergers in Chinese dialects within a framework of Camin-Sokal parsimony (Camin and Sokal 1965).

Phonological merger is a specific systemic process in language evolution. When sounds change (and they always change in some way), it may happen that two formerly distinct sounds come to be pronounced in the same way. As a result, words that formerly sounded different may end up sounding alike, such as English write and right, which now differ only in their spelling, not in their pronunciation. Mergers are a prototypical irreversible process. Once a merger has happened, speakers cannot go back, unless they have recorded the original distinction and artificially retune their language. But even this may be less easy than it seems, since it is always easier to reduce distinctions than to restore them. For example, most English speakers would have little difficulty artificially pronouncing all instances of s as sh during a conversation. But pronouncing s as sh only in a randomly chosen set of words will turn out to be much more difficult. For this reason, mergers are an ideal data type for directional models of language change. Their drawback, however, is that they are difficult to determine, which may also be the reason why Baxter's approach has not been tested on other language families since then.
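To make the idea of an irreversible character concrete, here is a minimal sketch of how a single merger could be scored on a given rooted tree under Camin-Sokal parsimony, where only changes from 0 (distinction kept) to 1 (merged) are allowed and the ancestor is assumed to have state 0. The dialect names, the two trees, and the function names are hypothetical illustrations, and this is not Baxter's actual implementation.

```python
def irreversible_changes(tree, states):
    """Return (node_state, changes) for one binary character on a rooted tree.

    tree   : a leaf name (str) or a tuple of subtrees
    states : dict mapping leaf name -> 0 (distinction kept) or 1 (merged)
    """
    if isinstance(tree, str):
        return states[tree], 0
    results = [irreversible_changes(child, states) for child in tree]
    # Without reversals, a node can be in state 1 only if all of its
    # descendants are in state 1.
    node_state = min(state for state, _ in results)
    changes = sum(count for _, count in results)
    if node_state == 0:
        # Each child in state 1 needs its own 0 -> 1 change on its branch.
        changes += sum(state for state, _ in results)
    return node_state, changes


def camin_sokal_score(tree, states):
    """Minimum number of 0 -> 1 changes, assuming the ancestor had state 0."""
    root_state, changes = irreversible_changes(tree, states)
    # If the whole tree is in state 1, one change on the root branch suffices.
    return changes + root_state


# Hypothetical data: dialects A and B share a merger, C and D do not.
merger = {"A": 1, "B": 1, "C": 0, "D": 0}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(camin_sokal_score(tree_1, merger))  # 1
print(camin_sokal_score(tree_2, merger))  # 2
```

The tree that groups A and B needs a single merger event, while the alternative tree needs two independent ones, because the merger cannot be undone on the way to C or D. Summing such scores over many mergers gives a simple criterion for comparing candidate trees.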

It may be justified to use time-reversible models for analyses that use lexical data, especially cognate sets, as in the approaches following Gray and Atkinson (2003), since it is difficult to determine the impact of directional processes on lexical replacement. Furthermore, due to the specific way the data is sampled, it is extremely difficult to determine directions. Yet in many other approaches that use different types of data, especially in those cases that model sound change processes (Hruschka et al. 2015, Wheeler and Whiteley 2014) or grammatical change (Longobardi et al. 2013), it might have a substantial impact on the results if directionality were explicitly modeled.

What does this mean for the directions for the New Year? I keep being surprised by the similarities between evolutionary biology and historical linguistics, be it the organization of information in genomes and languages, the processes that drive evolution, the philosophical questions underlying our investigations, or the quarrels among scholars in their fields. Unfortunately, much of the transfer from the biological to the linguistic domain is still very simplistic, often ignoring the specific differences between the two domains. On the other hand, many fruitful analogies are still out there but have not yet been properly investigated. So, as a direction for those who work in interdisciplinary domains in this New Year, I think we should try to avoid reinventing the wheel, and we should also take care not to put wheels on sledges.

References

Baxter, W. (2006): Mandarin dialect phylogeny. Cah. Linguistique - Asie Orientale 35.1. 71-114.
Camin, J. and R. Sokal (1965): A method for deducing branching sequences in phylogeny. Evolution 19.3. 311-327.
Derksen, R. (2008): Etymological dictionary of the Slavic inherited lexicon. Brill: Leiden and Boston.
Gray, R. and Q. Atkinson (2003): Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426.6965. 435-439.
Gray, M., J. Lukes, J. Archibald, P. Keeling, and W. Doolittle (2010): Cell biology. Irremediable complexity? Science 330.6006. 920-921.
Haspelmath, M. (2004): On directionality in language change with particular reference to grammaticalization. In: Fischer, O., M. Norde, and H. Perridon (eds.): Up and down the cline - The nature of grammaticalization. John Benjamins Publishing Company: 17-44.
Hruschka, D., S. Branford, E. Smith, J. Wilkins, A. Meade, M. Pagel, and T. Bhattacharya (2015): Detecting regular sound changes in linguistics as events of concerted evolution. Curr. Biol. 25.1. 1-9.
Huelsenbeck, J., J. Bollback, and A. Levine (2002): Inferring the root of a phylogenetic tree. Systematic Biology 51.1. 32-43.
Longobardi, G., C. Guardiano, G. Silvestri, A. Boattini, and A. Ceolin (2013): Toward a syntactic phylogeny of modern Indo-European languages. J. Hist. Linguist. 3.1. 122-152.
Wheeler, W. and P. Whiteley (2014): Historical linguistics as a sequence optimization problem: the evolution and biogeography of Uto-Aztecan languages. Cladistics 30.1. 1-13.