Monday, July 22, 2019

Two problems concerning the use of Ancient DNA


Last week I wrote a piece for The Wine Gourd blog, called The role of Wine Influencers — more of the same. I discussed the modern concern in the wine industry with social media Influencers, who use Facebook, Instagram, Twitter, YouTube, etc. to promote wine. When LeBron James drinks a wine it will sell a whole lot better (presumably on the principle that “You may not be able to play like LeBron, but you can drink like him”).

My conclusion was that the wine industry has always had what are now called Micro- or Nano-Influencers, involving endorsements from people and organizations who possess an expert level of knowledge as well as social influence. For example, professional wine critics have always fitted this bill, notably Robert M. Parker Jr.

So, the existence of social media Wine Influencers is nothing new — it is simply the modern equivalent of something old.


Well, my blog post here is about the same idea in Ancient-DNA phylogenetics — the idea that, in spite of the claim that modern techniques provide new advantages, we may in fact simply be repeating ourselves. Modern issues are simply modern versions of the same old issues.

First problem

The first issue that I would like to raise is that of molecular data. This is seen as the crucial element of modern studies of ancient remains. Even the recent re-creation of the vineyard of Leonardo da Vinci (La Vigna di Leonardo, in Milan) involved the finding of sufficient DNA from the vineyard land, which was bombed during World War II, to identify the grape cultivar that was grown by da Vinci (Inside Leonardo da Vinci's vineyard).

The issue is that DNA studies, based on direct studies of genotype, are subject to all of the same data-analysis issues as are studies of phenotype (such as morphology, anatomy and ultrastructure).

One classic example is the supposed discovery in the 1980s of the phenomenon of Long-branch Attraction (LBA) in molecular studies. Here, if many shared nucleotide changes occur on distantly related branches of a phylogenetic tree, these branches may actually be reconstructed as sister lineages during the phylogenetic analysis. However, this is simply an example of parallelism, a phenomenon that had previously been known for decades in phylogenetic analyses of phenotype.
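
A minimal simulation sketch can make this concrete. It is only an illustration under assumed settings: four taxa, a true tree that groups A with B and C with D, a simple equal-rates substitution model, and made-up change probabilities for the long branches (A and C) and the short ones. The function names are hypothetical. It counts the parsimony-informative site patterns that support each of the three possible groupings.

```python
import random

NUCLEOTIDES = "ACGT"


def evolve(state, p_change, rng):
    """Return the state at the far end of a branch.

    With probability p_change the nucleotide is replaced by one of the
    other three, chosen uniformly (an assumed, deliberately simple model).
    """
    if rng.random() < p_change:
        return rng.choice([n for n in NUCLEOTIDES if n != state])
    return state


def simulate_site(p_long, p_short, rng):
    """Simulate one site on the true tree (A,B)|(C,D).

    A and C sit on long branches; B, D and the internal branch are short.
    """
    x = rng.choice(NUCLEOTIDES)      # internal node joining A and B
    y = evolve(x, p_short, rng)      # internal node joining C and D
    a = evolve(x, p_long, rng)
    b = evolve(x, p_short, rng)
    c = evolve(y, p_long, rng)
    d = evolve(y, p_short, rng)
    return a, b, c, d


def parsimony_support(n_sites, p_long, p_short, seed=1):
    """Count informative site patterns supporting each four-taxon grouping."""
    rng = random.Random(seed)
    support = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for _ in range(n_sites):
        a, b, c, d = simulate_site(p_long, p_short, rng)
        if a == b and c == d and a != c:
            support["AB|CD"] += 1    # pattern agrees with the true tree
        elif a == c and b == d and a != b:
            support["AC|BD"] += 1    # long branches grouped together
        elif a == d and b == c and a != b:
            support["AD|BC"] += 1
    return support


if __name__ == "__main__":
    print("unequal branches:", parsimony_support(2000, 0.5, 0.05))
    print("equal branches:  ", parsimony_support(2000, 0.05, 0.05))
```

With the assumed long-branch rates, the misleading AC|BD pattern should typically outnumber the true AB|CD pattern, because parallel changes on the two long branches are more common than a single change on the short internal branch. When all branches are short, the true grouping should win. Nothing in this is specific to nucleotides; the same logic applies to any character that can change in parallel.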

Many currently recognized practical problems in genotype studies, such as LBA and compositional biases, are merely specific examples of how analogy appears in molecular biology. Analogy will create convergences and parallelisms, and these will confound the attempt to detect homology.

So, reconstructing evolutionary history using molecular biology is a priori neither better nor worse than using any other source of data, because the same limitations apply. It is simply another type of data.

Second problem

The second issue that I would like to raise is that genome data are a type of Big Data, and there is a widespread idea that Big Data will solve all ills with data analyses. The idea seems to be that, if you can collect enough data, then you must be led to "the truth".

This is nonsense. Data are just numbers, and numbers can mislead, no matter how many there are. Data need to be interpreted by a human mind, if they are to tell that mind anything useful. The only thing that changes with the use of Big Data is the order in which the steps of the data analysis and interpretation occur.
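
One way to see why sheer quantity does not help is a toy sketch like the following. Everything in it is assumed for illustration: a true value of 10, a systematic bias of 0.5 in every measurement, Gaussian noise, and the hypothetical function name `biased_estimate`. The point is only that collecting more data shrinks the standard error without touching the bias.

```python
import random
import statistics


def biased_estimate(n, true_value=10.0, bias=0.5, noise_sd=2.0, seed=0):
    """Estimate a mean from n measurements that all share the same bias.

    Returns the sample mean and its standard error. All parameter values
    are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    sample = [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / n ** 0.5
    return mean, std_error


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        mean, std_error = biased_estimate(n)
        print(f"n={n:>7}  mean={mean:.3f}  standard error={std_error:.4f}")
```

As n grows, the estimate settles ever more tightly around the biased value rather than the true one, so the analysis becomes more confident without becoming more correct.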

In the Old Days (i.e. when I was a student), what we did was:
  1. develop an experimental question
  2. think about potential problems
  3. collect targeted data
  4. analyze the data
  5. interpret the data, to answer the question.
These days, with Big Data, what people do is:
  1. collect a very large amount of data
  2. analyze the data, and try to interpret it
  3. think of a question that the data might answer
  4. discover the potential problems later.
All that is really different is the order, along with which steps are confounded with which other steps.

I don't see that this is necessarily any better; it is just different. So, don't pin your hopes on Ancient DNA genome-scale data to solve problems with your work.

Other issues

Anyone working with Ancient DNA knows that there are oodles of other problems. Some of them are discussed for the general public by Gideon Lewis-Kraus, writing on January 17, 2019 for The New York Times Magazine: Is Ancient DNA Research revealing new truths — or falling into old traps? The answer is, of course, "both".
