Most data analyses involve processing the data using some model. For example, standard parametric statistical tests assume a normal distribution for the "error" term, as well as equal variances and linear relationships between the variables. If these model assumptions do not hold, then any inferences from the tests may be incorrect.
It is possible to look at any dataset in a model-free manner, although this does not necessarily lead to any strong inferences. Looking at data is usually called exploratory data analysis. This is often done using graphs of various types.
Exactly the same principle applies to phylogenetics. A phylogenetic tree is an inference from the data via a given model. The inference is a reconstructed genealogical history assuming a divergent tree. In this context, different models will often (usually?) give different inferences.
Therefore, most phylogeneticists never actually see their data. What they see, instead, is the data as processed through some model. That is, they see inferences from the model, not the original data. Models are important but, for a scientist, the data should be even more important.
It is thus interesting that so many phylogeneticists skip the step of looking at their data, and proceed immediately to the model-based inference. So many of the disagreements throughout the literature end up being about the models and not the data. There are very strong opinions about which models should be used, with less attention being paid to whether the data contain sufficient information to answer the original scientific question in the first place.
A specific example of this was discussed in some earlier blog posts:
It is worth mentioning here that a haplotype network is not a genealogy. Instead, it is a summary of a population dataset, which may contain some phylogenetic patterns or it may not. So, a haplotype network is closer to exploratory data analysis than it is to model-based inference. This point is clearly made by Jessica W. Leigh and David Bryant (2015. PopART: full-feature software for haplotype network construction. Methods in Ecology and Evolution 6: 1110-1116):
"The haplotype networks do provide, however, a concise and accessible representation of the data themselves, one aspect which is often lost in methods heavily dependent on model-based inference."
Looking at the data before you start processing it can be a very good idea. After all, you may be able to avoid unlikely inferences.
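As a rough illustration of what a model-free look at sequence data might involve, the sketch below collapses an alignment into its distinct haplotypes and counts the raw site differences between them. No substitution model or tree is assumed. The toy alignment and the function names are invented for this example, and this is not how PopART or any other network program works internally.

```python
from collections import Counter
from itertools import combinations


def haplotype_counts(sequences):
    """Collapse identical sequences into haplotypes and count each one."""
    return Counter(sequences)


def hamming(a, b):
    """Number of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_differences(haplotypes):
    """Raw site differences for every pair of distinct haplotypes."""
    return {(a, b): hamming(a, b) for a, b in combinations(sorted(haplotypes), 2)}


# Hypothetical toy alignment, invented for illustration only.
alignment = ["ACGTAC", "ACGTAC", "ACGTTC", "ACGTTC", "ACCTTC", "ACGTAC"]

counts = haplotype_counts(alignment)
differences = pairwise_differences(counts)

# How many copies of each haplotype are in the sample?
for haplotype, n in counts.most_common():
    print(haplotype, n)

# How many sites separate each pair of haplotypes?
for (a, b), d in differences.items():
    print(a, b, d)
```

A summary like this makes no inference at all. It simply shows how much variation the sample contains, which is worth knowing before asking whether the data can answer the question at hand.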