
Thursday, February 27, 2014

Roots and the phylogenetics of mythology


A few weeks ago I discussed the phylogenetic analysis of the tale of Little Red Riding Hood (The phylogenetics of Little Red Riding Hood). In that case, I pointed out that historical reconstructions require a rooted tree, and I discussed various possible methods for rooting the unrooted trees produced by the data analyses.

This is not the only time that phylogenetics has been applied to myths or tales. For example, d'Huy (2013a) has studied the prehistoric Polyphemus tale belonging to the European and North Amerindian areas, and d'Huy (2013b) has studied the mythological motif of the Cosmic Hunt linked to the Big Dipper constellation (typical for northern and central Eurasia and for the Americas but unknown on other continents). In the first case a binary matrix of 98 characteristics for 44 versions of the tale was used, and in the latter 93 characteristics for 47 versions. Both of these studies have rooted trees.

In the latter case, a novel method of rooting the tree was used. The unrooted tree was successively rooted with each of the likely versions of the tale as outgroup. In each case the ancestral tale (the protomyth) was reconstructed and the ancestral states of the tale's characteristics (called mythemes) were determined. The author then "selected the version that holds the majority of the wide shared mythemes (>50%) as the better root."

Unfortunately, this produced an unexpected root, as shown in the tree below. The colors in the tree refer to various geographical groupings of the tale versions.


So, I re-analyzed the data using the rooting methods that I previously applied to the Red Riding Hood analysis:
  • For the bayesian analysis, I used MrBayes (2 runs, 4 chains, 1,000,000 generations, sampling frequency 1000, 25% burnin) with a relaxed clock (with independent gamma rates model for the variation of the clock rate across lineages).
  • For the neighbor-joining tree I used the BioNJ algorithm in PAUP*, and found the midpoint root.
  • For the parsimony analysis, I used a 200-replicate parsimony-ratchet search via PAUP*, calculated the branch lengths of the majority-rule consensus tree with ACCTRAN optimization, and found the midpoint root.
These three alternative roots are also shown on the tree. They seem more likely than the published root.
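As an illustration of what "midpoint root" means, here is a minimal sketch in Python. It is not the PAUP* implementation used above; the tree, its tip names and its branch lengths are invented for the example. The idea is to find the two tips that are furthest apart on the unrooted tree, and to place the root half-way along the path between them.

```python
def farthest(tree, start):
    """Return (node, distance, parent map) for the node furthest from start."""
    dist = {start: 0.0}
    parent = {start: None}
    stack = [start]
    while stack:
        node = stack.pop()
        for nbr, length in tree[node].items():
            if nbr not in dist:
                dist[nbr] = dist[node] + length
                parent[nbr] = node
                stack.append(nbr)
    far = max(dist, key=dist.get)
    return far, dist[far], parent


def midpoint_root(tree):
    """Locate the midpoint of the longest tip-to-tip path.

    Returns (a, b, offset): the root lies on the branch a-b,
    at a distance of `offset` from node a.
    """
    any_node = next(iter(tree))
    tip1, _, _ = farthest(tree, any_node)          # one end of the longest path
    tip2, longest, parent = farthest(tree, tip1)   # the other end
    half = longest / 2.0
    # Walk from tip2 back towards tip1 until half of the path is used up.
    walked = 0.0
    node = tip2
    while parent[node] is not None:
        step = tree[node][parent[node]]
        if walked + step >= half:
            return node, parent[node], half - walked
        walked += step
        node = parent[node]
    raise ValueError("tree has no branches")


# Hypothetical unrooted tree: tips A-D, internal nodes X and Y.
example_tree = {
    "A": {"X": 1.0},
    "B": {"X": 2.0},
    "C": {"Y": 2.0},
    "D": {"Y": 5.0},
    "X": {"A": 1.0, "B": 2.0, "Y": 1.0},
    "Y": {"C": 2.0, "D": 5.0, "X": 1.0},
}

print(midpoint_root(example_tree))
```

In this toy tree the longest path runs between tips B and D (length 8), so the root is placed on the branch leading to D, 4 units from each of those two tips. Midpoint rooting assumes roughly equal rates of change along the lineages, which is why the result depends on how the branch lengths were estimated.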

Geographically, the root chosen by the author's method is within the red group (tales from Asia), based on the idea that "arguments in favour of localization of protypical Cosmic Hunt in Asia seem persuasive (Berezkin 2005)." Unfortunately, this a priori argument seems to have excluded any testing of the possibility that more than one version is the sister to the remaining tales — that is, only single outgroups were considered.

On the other hand, all three of the alternative roots group the tales into two major clades. For the bayesian-clock root the two clades have distinct animal motifs, a herbivore and a carnivore, respectively. These clades do not correspond to any of the three variants recognized by Berezkin (2005).

The bayesian-clock root puts the red-colored (Asia) versions of the tale into one of the two major clades, as it also does with the orange group (Africa). This makes the root more consistent with the geographical groupings: all of the geographical groups are in only one of the two major clades, except for the purple group (American coast-plateau / British Columbia). The parsimony and NJ roots do the same thing, but in addition to the purple group they also split the pink group (northeastern America) between the two major clades, which reduces their geographical consistency compared to the bayesian-clock root.
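The comparison of geographical consistency can be expressed as a simple count: for each rooting, how many geographical groups have members in both of the two major clades? The following Python sketch shows the bookkeeping only. The tip names, group labels and clade assignments are all invented, and do not come from the Cosmic Hunt dataset.

```python
def split_groups(clade_of, group_of):
    """Return the sorted list of groups that have members in more than one clade."""
    clades_seen = {}
    for tip, group in group_of.items():
        clades_seen.setdefault(group, set()).add(clade_of[tip])
    return sorted(g for g, clades in clades_seen.items() if len(clades) > 1)


# Hypothetical data: six tale versions in three geographical groups.
group_of = {"t1": "g1", "t2": "g1", "t3": "g2", "t4": "g2", "t5": "g3", "t6": "g3"}

# Two hypothetical rootings, each dividing the versions into clades 1 and 2.
rooting_a = {"t1": 1, "t2": 1, "t3": 1, "t4": 2, "t5": 2, "t6": 2}
rooting_b = {"t1": 1, "t2": 1, "t3": 1, "t4": 2, "t5": 1, "t6": 2}

print(split_groups(rooting_a, group_of))  # one group is split
print(split_groups(rooting_b, group_of))  # two groups are split
```

A rooting with fewer split groups is more consistent with geography in the sense used above, although this says nothing about whether geography is the right criterion for choosing a root.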

The bayesian-clock root does not support the suggestion that the Cosmic Hunt myth originated in Asia. Indeed, the bayesian tree does not support any particular geographical location. Furthermore, the polyphyly of the purple group presents an intriguing aspect of the tale's history.

References

Yuri Berezkin (2005) The cosmic hunt: variants of a Siberian—North-American myth. Folklore 31: 79-100.

Julien d'Huy (2013a) Polyphemus (Aa. Th. 1137): a phylogenetic reconstruction of a prehistoric tale. Nouvelle Mythologie Comparée 1: 1-21.

Julien d'Huy (2013b) A cosmic hunt in the Berber sky: a phylogenetic reconstruction of a Palaeolithic mythology. Les Cahiers de l’AARS 16: 93-106.

Tuesday, February 25, 2014

Two years of network blogging


Today is the second anniversary of starting this blog, and this is post number 222. Thanks to all of our visitors over the past two years — we hope that the next year will be as productive as this past one has been.

I have summarized here some of the accumulated data, in order to document at least some of the productivity.

As of this morning, there have been 104,211 pageviews, with a median of 129 per day. The blog has continued to grow in popularity, with a median of 70 pageviews per day in the first year and 189 per day in the second year. The range of pageviews was 69-812 per day during this past year, and 3-667 the previous year. The daily pattern for the two years is shown in the first graph.

Line graph of the number of pageviews through time, up to today.
The largest values are off the graph. The green line is the half-way mark.
The inset shows the mean (blue) and standard deviation of the daily number of pageviews.

The erratic nature of the daily variation is apparently all too typical of blogs, and there appears to be no good explanation for it. So, we might take this as a good example of the stochastic nature of the web.

There are a few general patterns in the data, the most obvious one being the day of the week, as shown in the inset of the above graph. The posts have usually been on Mondays and Wednesdays, and these two days have had the greatest mean number of pageviews.

Some of the more obvious dips include times such as Christmas - New Year; and the biggest peaks are associated with mentions of particular blog posts on popular sites. There also continue to be a few instances of "rogue" visits. These tend to be visits from sites such as Referer and Vampirestat.

The posts themselves have varied greatly in popularity, as shown in the next graph. It is actually a bit tricky to assign pageviews to particular posts, because visits to the blog's homepage are not attributed by the counter to any specific post. Since the current two posts are the ones that appear on the homepage, these posts are under-counted until they move off the homepage (after which they can be accessed only by a direct visit to their own pages, and thus always get counted). On average, 30% of the blog's pageviews are to the homepage, rather than to a specific post page, and so there is considerable under-counting.

Scatterplot of post pageviews through time, up to last week; the line is the median.
Note the log scale, and that the values are under-counted (see the text).

It is good to note that the most popular posts were scattered throughout the two years. Keeping in mind the initial under-counting, the top collection of posts (with counted pageviews) have been:
Post  Title                                             Pageviews
129   The Music Genome Project is no such thing         4,552
42    Charles Darwin's unpublished tree sketches        3,100
73    Carnival of Evolution, Number 52                  1,964
172   The acoustics of the Sydney Opera House           1,891
10    Why do we still use trees for the dog genealogy?  1,641
98    Faux phylogenies                                  1,451
58    Who published the first phylogenetic tree?        1,359
49    Evolutionary trees: old wine in new bottles?      1,352
29    Network analysis of scotch whiskies               1,298
19    Tattoo Monday IV                                  1,247
67    Metaphors for evolutionary relationships          1,178
188   Phylogenetics with SpongeBob                      1,088
8     Tattoo Monday                                     1,051
This is quite a different list to the same time last year. Posts 129, 42 and 172 continue to receive visitors almost every day.

The audience for the blog continues to be firmly in the USA. Based on the number of pageviews, the visitor data are:
United States    41.1%
United Kingdom    5.6%
Germany           4.9%
France            3.8%
Russia            3.3%
Canada            2.7%
Australia         2.1%
China             1.4%
Brazil            1.0%
Poland            0.8%
You will note that this list is dominated by English-speaking countries. The blog does have a link to Google Translate to help other people, but it is clear that the audience is made up almost entirely of those people who are comfortable with English (or Australian, at any rate).

Finally, if anyone wants to contribute, then we welcome guest bloggers. This is a good forum to try out all of your half-baked ideas, in order to get some feedback, as well as to raise issues that have not yet received any discussion in the literature. If nothing else, it is a good place to be dogmatic without interference from a referee!

Wednesday, February 19, 2014

Multivariate data displays are not always necessary


Over the past two years I have published a number of posts in which I have used a data-display network as a multivariate data summary, comparable to an ordination (eg. PCA) or a cluster analysis (eg. UPGMA). This is a form of exploratory data analysis.

Here, I wish to point out that a multivariate data summary is not always necessary, even when the data are multivariate in form.

As an example, I will use the official census data on retail book sales in the USA. The monthly data are provided by the United States Census Bureau for the years 1992-2013 at:
 http://www.census.gov/retail/mrts/www/data/excel/mrtssales92-present.xls.
The data include census code 4512, which covers "Book Stores, General", "Specialty Book Stores" and "College Book Stores". The data notes say: "Estimates are shown in millions of dollars, and are based on data from the Monthly Retail Trade Survey, Annual Retail Trade Survey, and administrative records." I downloaded the data on 17 February 2014.

These data are multivariate. For example, if each year is taken as a sample object, then there are data for 12 variables for each sample (one for each month). Any multivariate data analysis can therefore be applied to this dataset.

In the usual manner, I have used the manhattan distance and a neighbor-net network. Years that are closely connected in the network are similar to each other based on the 12 monthly sales figures, and those that are further apart are progressively more different from each other.
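For readers who have not met it, the manhattan distance between two years is simply the sum of the absolute differences between their 12 monthly figures. Here is a minimal Python sketch of that calculation. The sales numbers and the year labels are made up for illustration (they are not the Census Bureau figures), and the sketch stops at the distance matrix; it does not build the neighbor-net network.

```python
def manhattan(a, b):
    """Sum of absolute differences between two equal-length profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of variables")
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(samples):
    """Pairwise manhattan distances, keyed by (name, name)."""
    names = sorted(samples)
    return {(p, q): manhattan(samples[p], samples[q]) for p in names for q in names}


# Invented monthly sales (12 values per year), purely for illustration.
sales = {
    "year_1": [10] * 12,
    "year_2": [11] * 12,
    "year_3": [10] * 11 + [16],
}

d = distance_matrix(sales)
print(d[("year_1", "year_2")], d[("year_1", "year_3")], d[("year_2", "year_3")])
```

A matrix like this is the input to the network analysis: years with small distances end up close together, and years with large distances end up far apart.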


However, all that the network shows is a gradient running clockwise from the top: sales rose from 1992, reached a peak in 2007, and then declined again. In other words, the data form a simple time series, and all that is actually needed is to plot them that way.

So, this same pattern could be displayed more simply by graphing the yearly averages, as shown in the next graph. A network is complete over-kill in this case. I presume that the recent decrease in retail book sales has something to do with the rise of e-book sales.
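The yearly averages themselves take only a few lines to compute. This Python sketch uses invented monthly values and placeholder year labels, not the census data, and assumes the data have already been read into a dictionary of 12 monthly figures per year.

```python
from statistics import mean


def yearly_averages(monthly):
    """Average the monthly figures for each year, in year order."""
    return {year: mean(values) for year, values in sorted(monthly.items())}


def peak_year(averages):
    """Return the year with the largest average."""
    return max(averages, key=averages.get)


# Invented monthly figures (12 per year), purely for illustration.
monthly = {
    "y1": [100] * 12,
    "y2": [120] * 6 + [140] * 6,
    "y3": [110] * 12,
}

averages = yearly_averages(monthly)
print(averages)
print(peak_year(averages))
```

Plotting these averages against the year gives the simple rise-and-fall curve directly, with no need for a distance matrix or a network.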


We could also plot the monthly sales, while we are at it. The peaks in late summer and at Christmas are very distinct. Presumably people are buying books to read in summer, and to give away at Christmas.


Finally, note that not all time series can be plotted in a simple manner. If the time patterns are complex, then a multivariate analysis, such as a network, will probably be of some use as a data display.

Monday, February 17, 2014

Network poster images


Note: Updated 18 April 2014

In a previous blog post I noted that there are many images of phylogenetic trees on the internet but there are very few for phylogenetic networks, and so I provided a Network road sign. Here, I provide three original images and one from another web site, plus a tree (which is actually a commercial t-shirt design).




[From the Epic of Evolution web site]


Wednesday, February 12, 2014

The updated Primer of Phylogenetic Networks


In an earlier post I reported on the creation of an Online Primer of Phylogenetic Networks, which is intended as a simple introduction to networks for those people who already know something about phylogenetic trees. This includes three animations (animated GIF files).

This primer has now been updated with explanations of the construction of Reduced Median Networks, Median-Joining Networks and Minimum-Spanning Trees.

So, the primer now includes:

  • Median Networks
  • Reduced Median Networks
  • Median-Joining Networks
  • Recombination Networks
  • Hybridization Networks
  • Parsimony Trees
  • Minimum-Spanning Trees

These are all related to each other, and can be easily explained by looking at the distribution of the characters in a parsimony context. With the exception of NeighborNet, this covers the networks that most commonly appear in the literature.

Any constructive feedback will be gratefully received.

Monday, February 10, 2014

HGT networks


Introgression is the transfer of genetic material from one species to another via sexual reproduction, and this process has been recognized for a long time. If sex is not involved (such as between distantly related organisms) then we usually refer to it as horizontal gene transfer (HGT), and this has only relatively recently come to the general attention of biologists.

During the 1990s, HGT among prokaryotes began to be taken seriously in phylogenetics (Smith et al. 1992; Syvanen 1994), and more than a decade later also in eukaryotes (see Bock 2010; Boto 2010; Renner & Bellot 2012). However, the question still remains as to when it was first considered within phylogenetics, as opposed to other areas of biology.

It seems that the first report of what was probably HGT in prokaryotes is due to Flu (1927), who of course did not recognize it as such. Indeed, Lederberg & Tatum (1946) also apparently observed HGT, but mistakenly attributed it to sexual recombination (in prokaryotes). This emphasizes just how difficult it can be to identify processes from looking at data patterns.

Further observations were reported by Freeman (1951) and Lederberg et al. (1951). Shortly afterwards, experimental work was published concerning mechanisms for the transfer of genetic material between micro-organisms via what we now call transduction (Zinder & Lederberg 1952; Stocker et al. 1953). The effect of this on phylogenetics was soon considered (Stocker 1955), although no diagrams representing reticulation were presented at this time. The focus was still on elucidating the processes rather than illustrating the phylogenies.

It seems that the first people to actually illustrate HGT among species were Jones & Sneath (1970). In their review of HGT, they not only considered the accumulating evidence for the processes, they explicitly illustrated all of the known cases. These were presented as a series of 18 unrooted phenetic diagrams with known HGT connections linking the bacterial taxa. A single example is shown here.


For eukaryotes, the possibility was early on considered that the asexual transfer of genetic units may be of more general occurrence (Ravin 1955). Indeed, Went (1971) presented a strong case for HGT among plants, based on morphological and anatomical data (ie. phenotypic rather than genotypic evidence). Benveniste & Todaro (1974) then suggested the possibility of exogenously acquired viral genes in mammals. However, it was not until molecular sequencing became available in the 1980s that biologists really started presenting evidence for gene transfer among eukaryotes (Shilo & Weinberg 1981; Singh et al. 1981; Busslinger et al. 1982; Hyldig-Nielson et al. 1982; Engels 1983).

Most of these suggestions turned out to be spurious, once more evidence accumulated (Smith et al. 1992; Syvanen 1994). However, this did not stop Syvanen (1987) from explicitly considering the effect of HGT on the assessment of evolutionary relationships, apparently being the first to do so. Interestingly, he concluded that "horizontal gene flow would not necessarily preclude a linear molecular clock or change the rate of molecular evolution (assuming the neutral allele theory)."

References

Benveniste RE, Todaro GJ (1974) Evolution of C-type viral genes: inheritance of exogenously acquired viral genes. Nature 252: 456-459.

Bock R (2010) The give-and-take of DNA: horizontal gene transfer in plants. Trends in Plant Science 15: 11-22.

Boto L (2010) Horizontal gene transfer in evolution: facts and challenges. Proceedings of the Royal Society of London B: Biological Sciences 277: 819-827.

Busslinger M, Rusconi S, Birnstiel ML (1982) An unusual evolutionary behaviour of a sea urchin histone gene cluster. EMBO Journal 1: 27-33.

Engels WR (1983) The P family of transposable elements in Drosophila. Annual Review of Genetics 17: 315-344.

Flu P-C (1927) Sur la nature du bactériophage. Comptes Rendus Hebdomadaires des Séances et Mémoires de la Société de Biologie 96(1): 1148-1149.

Freeman VJ (1951) Studies on the virulence of bacteriophage-infected strains of Corynebacterium diphtheriae. Journal of Bacteriology 61: 675-688.

Hyldig-Nielson JJ, Jensen EØ, Paludan K, Wiburg O, Garrett R, Jørgensen P, Marcker KA (1982) The primary structures of two leghemoglobin genes from soybean. Nucleic Acids Research 10: 689-701.

Jones D, Sneath PH (1970) Genetic transfer and bacterial taxonomy. Bacteriological Reviews 34: 40-81.

Lederberg J, Lederberg EM, Zinder ND, Lively ER (1951) Recombination analysis of bacterial heredity. Cold Spring Harbor Symposia on Quantitative Biology 16: 413-443.

Lederberg J, Tatum EL (1946) Gene recombination in Escherichia coli. Nature 158: 558.

Ravin AW (1955) Infection by viruses and genes. American Scientist 43: 468-478.

Renner SS, Bellot S (2012) Horizontal gene transfer in eukaryotes: fungi-to-plant and plant-to-plant transfers of organellar DNA. Advances in Photosynthesis and Respiration 35: 223-235.

Shilo BZ, Weinberg RA (1981) DNA sequences homologous to vertebrate oncogenes are conserved in Drosophila melanogaster. Proceedings of the National Academy of Sciences of the USA 78: 6789-6792.

Singh L, Purdom IF, Jones KW (1981) Conserved sex chromosome-associated nucleotide sequences in eukaryotes. Cold Spring Harbor Symposia on Quantitative Biology 45: 805-813.

Smith MW, Feng D-F, Doolittle RF (1992) Evolution by acquisition: the case for horizontal gene transfers. Trends in Biochemical Sciences 17: 489-493.

Stocker BAD (1955) Bacteriophage and bacterial classification. Journal of General Microbiology 12: 375-379.

Stocker BAD, Zinder ND, Lederberg J (1953) Transduction of flagellar characters in Salmonella. Journal of General Microbiology 9: 410-433.

Syvanen M (1987) Molecular clocks and evolutionary relationships: possible distortions due to horizontal gene flow. Journal of Molecular Evolution 26: 16-23.

Syvanen M (1994) Horizontal gene transfer: evidence and possible consequences. Annual Review of Genetics 28: 237-261.

Went FW (1971) Parallel evolution. Taxon 20: 197-226.

Zinder ND, Lederberg J (1952) Genetic exchange in Salmonella. Journal of Bacteriology 64: 679-699.

Wednesday, February 5, 2014

NSF and reticulating phylogenies


In mid January, the US National Science Foundation released a new Program Solicitation, NSF 14-527. This is called the Genealogy of Life (GoLife).

The text notes:
"This solicitation represents the successor program to the Assembling the Tree of Life program. The name has been changed to include projects covering the complexity of phylogenetic patterns across all of life."
So, it replaces the previous documents NSF 10-513 (the Assembling the Tree of Life program, AToL) and NSF 11-534 (the Assembling, Visualizing and Analyzing the Tree of Life program, AVAToL). The latter program has its own web page, which gives you an idea of what it has been about.

For those of us interested in phylogenetic networks, the following parts of the Introduction to the new program are of particular interest:
"Understanding the tree of life has been a goal of evolutionary biologists since the time of Darwin. During the past decade, unprecedented gains in gathering and analyzing phylogenetic data have demonstrated increasingly complex genealogical patterns."
"The GoLife program builds upon the AToL program by accommodating the complexity of diversification patterns across all of life's history. Our current knowledge of processes such as hybridization, endosymbiosis and lateral gene transfer makes clear that the evolutionary history of life on Earth cannot accurately be depicted as a single, typological, bifurcating tree."
This is very good news, and a major step forward. The two Tree of Life programs were implicitly based on a rather unrealistic assumption about the shape of phylogenetic history. This new move has been a long time coming, both during prior stakeholder discussions and within the various committees. There are a number of people who will be personally pleased that this has come to fruition.

The focus of the new program is solely on biology, however:
"Proposals should focus on poorly sampled clades or data layers within the Genealogy of Life where new data will have a profound impact on our understanding of the pattern of life's evolution ... Additional examples of projects that will not be considered by this program include: ... 5) projects that are solely focused on the development of new computational methods or technologies."
This is a pity, because there needs to be method development before reticulating phylogenies can be constructed in a manner similar to what is currently used for tree building. There is likely to be a major need for practical large-scale methods for producing evolutionary networks, which unfortunately will be supported only outside of this particular program.

Furthermore:
"The project should include a plan for integration and standardization of data consistent with three AVAToL projects: Open Tree of Life, ARBOR, and Next Generation Phenomics."
This is an important requirement, as scientific data tend to disappear into a black hole unless they are prised out of the scientists' hands. The problem is that the Open Tree of Life appears to currently have no mechanism for dealing with non-tree phylogenies!

Monday, February 3, 2014

Single-malt scotch whiskies — a network


In a previous post I presented a Network analysis of scotch whiskies. This analyzed 109 single-malt scotch whiskies based on the tasting notes of a single author, measuring 68 characteristics (nose, color, body, palate, finish). The analysis demonstrated that the conventional way of classifying Scotch malt whiskies by region does not relate to their taste.


An alternative approach to classifying the whiskies is, then, to actually try to group them by taste. This topic was tackled in the book by David Wishart (2002) Whisky Classified: Choosing Single Malts by Flavour, Pavilion Books, London. His objective was: "if you like a particular malt whisky, then we can tell you what other brands taste similar." This book was revised in 2006, and there was a 10th anniversary edition in 2012. There is also an associated web page (Whisky Classified).

Wishart proceeded as follows:
  • Tasting notes in 10 recently published books on malt whisky were analyzed for 86 readily available single malt whiskies: Arthur (1997), Broom (1998), Jackson (1995), Lerner (1997), MacLean (1997), Milroy (1995), Murray (1997), Nown (1997), Shaw (1997) and Tucek and Lamond (1997). Tasting notes published by the distilleries were also reviewed, where available.
  • A vocabulary of 500 aromatic and taste descriptors was compiled from the tasting notes in the 10 books. These were grouped into 12 broad aromatic features: Body (Light-Heavy), Sweetness (Dry-Sweet), Smoky (Peaty), Medicinal (Salty), Feinty (Sulphury), Honey (Vanilla), Spicy (Woody), Winey (Sherry), Nutty (Oaky-Creamy), Malty (Cerealy), Fruity (Estery) and Floral (Herbal). The 12 flavour categories are scored on a scale of 0-4 according to the intensity with which each feature is present in a whisky.
  • The 86 single malts were classified using ClustanGraphics. The cluster analysis groups malts into the same cluster when they have broadly the same taste characteristics across all 12 sensory variables. Technically, the method minimizes the variance within clusters and maximizes the variance between clusters. The result was ten clusters of single malt whiskies.
  • The order of the 10 clusters A-J maximizes the row-wise rank correlation of the underlying proximity matrix. Readers who are familiar with malt whiskies may recognise the two extremes of strongly sherried malts (cluster A) and the heavily peated, mainly Islay malts (cluster J). Adjacent to these polar benchmarks are the lightly sherried (clusters B and C) and lightly peated (clusters H and I) malts, with the light-bodied, floral and malty clusters, including four largely unpeated groups (clusters D-G) falling in the middle.
The classification is shown at the bottom of this post.
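The phrase "minimizes the variance within clusters" can be made concrete with a small calculation. The Python sketch below does not reproduce ClustanGraphics or Wishart's actual procedure. It only computes the within-cluster sum of squares for a given grouping, which is the kind of quantity that such a method tries to keep small. The four flavour profiles are invented, and use three features rather than twelve to keep the example short.

```python
def centroid(rows):
    """Mean profile of a cluster (one mean per feature)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def within_cluster_ss(clusters):
    """Total squared deviation of every profile from its own cluster centroid."""
    total = 0.0
    for rows in clusters:
        c = centroid(rows)
        total += sum((x - m) ** 2 for row in rows for x, m in zip(row, c))
    return total


# Invented profiles, each scored 0-4 on three hypothetical features.
a = [4, 0, 0]
b = [4, 1, 0]
c = [0, 4, 3]
d = [0, 4, 4]

tight = within_cluster_ss([[a, b], [c, d]])   # similar profiles grouped together
loose = within_cluster_ss([[a, c], [b, d]])   # dissimilar profiles grouped together
print(tight, loose)
```

The first grouping has a much smaller within-cluster sum of squares than the second, so a variance-minimizing method would prefer it. Note that such a method will always return a grouping, whether or not the data actually contain distinct groups.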

I have re-analyzed these data using the manhattan distance and a neighbor-net network. A copy of my data spreadsheet is available online. (Note that other online copies of the data [eg. here] contain errors.)


Whiskies that are closely connected in the network are similar to each other based on the 12 characteristics, and those that are further apart are progressively more different from each other. I have added colours to the network representing the ten alleged groups:
Cluster A   blue
Cluster B   light blue
Cluster C   light green
Cluster D   green
Cluster E   black
Cluster F   brown
Cluster G   orange
Cluster H   pink
Cluster I   crimson
Cluster J   red
This shows that the book's order of the groups proceeds roughly from the middle right of the network (blue) clockwise around to the top right (red).

This analysis provides only a vague justification for the book's classification. The red group does form a distinct cluster in the network, as does the blue group, which are the two extremes of the classification order. However, some of the middle groups, eg. brown and light green, do not form network clusters at all. The other groups more or less form neighbourhoods in the network, but they do not form clusters.

Therefore, trying to use this classification as the author intended, to identify whiskies that taste similar to each other, will be difficult. For example, the network shows that Ardmore is similar to Old Fettercairn, and Glen Deveron is similar to Tullibardine, but these pairs are nothing like each other — and yet all four of them are classified together (in Cluster F).

So, single-malt Scotch whiskies do not really form groups, except for the peaty flavoured ones (mostly from Islay), and to some extent the sherry flavoured ones. The rest form a continuous gradient between these two extremes. They all taste different, to one extent or another.

Finally, Wishart is not the only person to have tried clustering these data — Luba Gloukhov also tried, using k-means clustering, to no greater effect. Clustering techniques only work if there are groups in the data, and in this case the data show continuous variation between the two extremes.

Wishart's Classification of Single-Malt Whiskies

Cluster A
(Full-Bodied, Medium-Sweet, Pronounced Sherry with Fruity, Spicy, Malty Notes and Nutty, Smoky Hints): Balmenach, Dailuaine, Dalmore, Glendronach, Macallan, Mortlach, Royal Lochnagar;
Cluster B
(Medium-Bodied, Medium-Sweet, with Nutty, Malty, Floral, Honey and Fruity Notes): Aberfeldy, Aberlour, Ben Nevis, Benrinnes, Benromach, Blair Athol, Cragganmore, Edradour, Glenfarclas, Glenturret, Knockando, Longmorn, Scapa, Strathisla;
Cluster C
(Medium-Bodied, Medium-Sweet, with Fruity, Floral, Honey, Malty Notes and Spicy Hints): Balvenie, Benriach, Dalwhinnie, Glendullan, Glen Elgin, Glenlivet, Glen Ord, Linkwood, Royal Brackla;
Cluster D
(Light, Medium-Sweet, Low or No Peat, with Fruity, Floral, Malty Notes and Nutty Hints): An Cnoc, Auchentoshan, Aultmore, Cardhu, Glengoyne, Glen Grant, Mannochmore, Speyside, Tamdhu, Tobermory;
Cluster E
(Light, Medium-Sweet, Low Peat, with Floral, Malty Notes and Fruity, Spicy, Honey Hints): Bladnoch, Bunnahabhain, Glenallachie, Glenkinchie, Glenlossie, Glen Moray, Inchgower, Inchmurrin, Tomintoul;
Cluster F
(Medium-Bodied, Medium-Sweet, Low Peat, Malty Notes and Sherry, Honey, Spicy Hints): Ardmore, Auchroisk, Bushmills, Deanston, Glen Deveron, Glen Keith, Glenrothes, Old Fettercairn, Tomatin, Tormore, Tullibardine;
Cluster G
(Medium-Bodied, Sweet, Low Peat and Floral Notes): Arran, Dufftown, Glenfiddich, Glen Spey, Miltonduff, Speyburn;
Cluster H
(Medium-Bodied, Medium-Sweet, with Smoky, Fruity, Spicy Notes and Floral, Nutty Hints): Balblair, Craigellachie, Glen Garioch, Glenmorangie, Oban, Old Pulteney, Strathmill, Tamnavulin, Teaninch;
Cluster I
(Medium-Light, Dry, with Smoky, Spicy, Honey Notes and Nutty, Floral Hints): Bowmore, Bruichladdich, Glen Scotia, Highland Park, Isle of Jura, Springbank;
Cluster J
(Full-Bodied, Dry, Pungent, Peaty and Medicinal, with Spicy, Feinty Notes): Ardbeg, Caol Ila, Clynelish, Lagavulin, Laphroaig, Talisker.