Systematics and Parsimony, by Andrew Brower

Phylogenetic analysis is necessarily based on parsimony, both because it is precisely that criterion that leads to grouping according to putative synapomorphy and because empirical investigation is impossible without avoiding ad hoc hypotheses.

Once ideas have been reduced to formulae, it is easy to forget where the formulae came from, and to devise new methods with no logical basis simply by modifying formulae directly.

Farris, 1983

Andrew Brower was The Systematics Association’s President Lecture 2019 guest speaker. Here we reproduce a version of the talk he delivered on 27th November 2019 at The Linnean Society of London. Some of the ideas presented here were distilled from the author’s forthcoming book (Brower and Schuh, 2021).

In January 2016, the editors (of whom I am one) of the Willi Hennig Society’s journal Cladistics published an editorial claiming that “the epistemological paradigm of this journal is parsimony”, and that “unless there is a pertinent reason to include multiple trees from alternative methods, a tree based on parsimony is sufficient as an intelligible, informative and repeatable hypothesis of relationships …” (Editors, 2016).  As indicated by the epigraphs from Steve Farris above, these are far from novel prescriptions for cladists, and essentially represent a restatement of longstanding normative principles of the Hennig Society.  The primary aim of the editorial was to try to stem a growing deluge of manuscript submissions from authors, apparently covetous of the journal’s relatively high Impact Factor, whose subject matter and approach we editors felt had little to do with what we viewed as the philosophical and methodological focus of the journal.  There are plenty of systematics journals, but this one is the Hennig Society’s, and Cladistics stands out, in our opinion, because of its emphasis on, well, cladistics.  It seemed reasonable and within our editorial purview to circumscribe the scope of the journal to include only those works that meet the philosophical and methodological standards of the Hennig Society *.

To our surprise, the editorial was met with a torrent of invective on Twitter and other social media platforms, in a string called “#parsimonygate” that became notorious enough to be chronicled by Wired magazine (Simon, 2016).  To give a sense of the unpleasantness of this outpouring, here are a few quotes:

“It seems like cladistics is slipping from science to dogmatics”

“This is unscientific craziness – parsimony as a religion.”

“The editorial board of Cladistics drops all pretense and embraces the journal’s identity as a religious rag!”

“Science libraries should cancel subscriptions to Cladistics Journal b/c it is not science.”

What is it about the philosophical principle of parsimony and its application in the cladistic approach to phylogenetic analysis that these people feel is unscientific and tainted by religion or dogma?  Is there any merit to their claims?  What is the relationship between science and philosophy, anyway?  Does science discover the truth?  Can we tell whether science discovers the truth?  More specifically in the context of systematic inquiries, what constitutes evidence?  Which (and how many) assumptions should we make when we interpret our evidence?  How do we determine whether our interpretations are reliable, or “true”?  And what if the data are not amenable to the assumptions we have made?  The object of this essay is to examine the principle of parsimony in the context of questions like these.  When cladists talk about “philosophy”, these are the kinds of epistemological problems we have in mind.  We think that failure to consider such questions in the design and execution of phylogenetic analyses could result in naïve and perhaps misleading conclusions.  Let’s start by defining our terms.

What is parsimony?

According to our correspondents on Twitter, parsimony is variously: “religion”, “dogma”, “old, crude methods”, “ancient and flawed methodology”, “wrong”, “anachronistic crap” and “unscientific craziness”.  I will concede that it is ancient:  although often associated with a quotation attributed to the 14th-century English monk William of Ockham, “Entia non sunt multiplicanda praeter necessitatem” (“Occam’s razor”), the idea that the economy of nature is efficient can be traced back to Aristotle (Thorburn, 1918).  As we shall see, the claims that nature is parsimonious and that nature should be interpreted parsimoniously are fundamentally distinct.

There are several contrasts to draw and explain at this point.  First of all, it may not be clear from the criticisms above what is being disparaged:   parsimony as a general principle, or parsimony as the optimality criterion of the cladistic method.  As a general principle, Ockham’s statement above translates roughly to, “entities must not be multiplied beyond what is necessary”.  This was expressed by Sober (2016) as, “simpler theories are better than theories that are more complex.”  This seems to many people to represent a common sense proposition, as does its corollary, the principle of uniformitarianism – the idea that the future resembles the past, and phenomena are therefore predictable based on prior experience.  To see the connection, imagine some data points that appear to fall on a line.  The parsimonious explanation of this pattern is a linear relationship between the independent and dependent variables.  Therefore, given the pattern, a parsimonious prediction would be that future observations also fall on that line.   Science would be impotent without such assumptions, and everyone who relies upon engineered products and machinery implicitly depends on the notion that the laws of physics that underlie safety standards do not change capriciously at unpredictable intervals.  Parsimony is not merely a human invention:  scrub jays who cache acorns are able to employ this foraging strategy successfully because the nuts stay where they put them and do not wander off or vanish (except when pilfered by other scrub jays).

As a principle applied to phylogenetic inference, a manifestation of parsimony – perhaps not the only one – is embodied in the criterion that the preferred hypothesis of relationships (cladogram) is the one with the fewest ad hoc hypotheses of homoplasy, which is also the one with the smallest number of steps for a given equally weighted data set.  Opposition to this application of parsimony generally stems from the notion that there are more “realistic” (and less parsimonious) evolutionary models that explain patterns of character transformation more accurately.  The concepts of realism and accuracy lead us to our next distinction about the principle of parsimony.
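The step-counting criterion just described can be made concrete with a small sketch.  The code below uses Fitch optimization, one standard way to count the minimum number of changes for unordered, equally weighted characters; the four taxa, the data matrix and the function names are hypothetical and chosen only for illustration.  Rooting the trees is a bookkeeping convenience that does not affect the count.

```python
from typing import Dict, Tuple, Union

# A tree is a leaf name or a pair of subtrees (binary; rooted only for bookkeeping).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_steps(tree: Tree, states: Dict[str, str]) -> int:
    """Minimum number of changes for one unordered, equally weighted character."""
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        shared = left_set & right_set
        if shared:
            # The two descendants can agree: no extra step is needed here.
            return shared, left_steps + right_steps
        # The descendants disagree: one additional step is required.
        return left_set | right_set, left_steps + right_steps + 1

    return visit(tree)[1]


def tree_length(tree: Tree, matrix: Dict[str, str]) -> int:
    """Total steps over all characters (columns) of the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_chars)
    )


# Hypothetical matrix: characters 1-2 group A+B, character 3 groups A+C,
# character 4 is unique to B.
matrix = {"A": "1110", "B": "1101", "C": "0010", "D": "0000"}

topologies = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

for name, tree in topologies.items():
    print(name, tree_length(tree, matrix))
```

Each of the four characters needs at least one step on any tree, so every step beyond four is an ad hoc hypothesis of homoplasy.  In this toy matrix the tree grouping A with B requires the fewest such hypotheses and is therefore preferred, whatever the amount of homoplasy that remains.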

In describing the nature of parsimony, Immanuel Kant (1787) parsed the distinction between epistemology (the parsimonious perception of phenomena) and ontology (the parsimonious behavior of noumena, or “things in themselves”):

 … reason presupposes the systematic unity of the various powers, on the ground that special natural laws fall under more general laws, and that parsimony in principles is not only an economical requirement of reason, but is one of nature’s own laws.

Following Kant, we may divide the philosophy of science into two realms:  “what is?” and “how do we know?”  The former is the realm of ontology.  “If a tree falls in the forest and no one is there to hear it, does it make a sound?” is an ontological question.  In many people’s minds, such questions immediately raise other questions:  why do you think the tree does or does not make a sound?  Is the sound a phenomenon of perception, or an external event that is perceived?  These are theoretical problems about the nature of knowledge, which is the realm of epistemology.

One of the persistent misunderstandings of (or accusations against) Occam’s razor is that it represents an ontological claim that nature behaves parsimoniously.  That is a testable hypothesis that might or might not be true, or might be true some of the time but not all of the time.  Many would argue that the hypothesis has been falsified.  The basis for many criticisms of cladistic parsimony is, essentially, this ontological reading:  nature is not parsimonious, and the existence of homoplasy therefore falsifies the parsimony criterion (e.g., Felsenstein, 1978; Cartmill, 1981; Beatty, 1982).  Farris (1983) refuted those arguments by reemphasizing the distinction between ontology and epistemology and clarifying that the application of parsimony in cladistics is epistemological:  cladistic parsimony does not claim that homoplasy is minimal, it claims that the preferred hypothesis of relationships is the one upon which homoplasy is minimized, however much of it may occur in a given data set.

Another means to draw the distinction between ontological and epistemological parsimony is to consider the dichotomy drawn by Elliott Sober (2015) between the “razor of denial” and the “razor of silence”.  The former represents an ontological claim, such as “x does not exist”, while the latter, which Sober considered to be a less forceful claim, is more epistemological:  “there is no evidence to suggest that x exists”.  One might by analogy consider the distinction between atheism and agnosticism:  an atheist asserts that there is no god, while an agnostic withholds judgment pending miraculous evidence (Brower, 2019).

Most applications of parsimony in cladistics are razors of silence.  For example, the most parsimonious tree is preferred because it requires the fewest ad hoc hypotheses of homoplasy, not because cladists believe it is more likely to be true.  Similarly, the nodes on cladograms do not connote ancestral taxa, not because cladists don’t believe that there were ancestors, but because ancestors and their characters represent inferences, not observations.  Another razor of silence relates to weighting schemes.  Modelers frequently claim that the equal weights often applied as a default in cladistic analyses are less realistic than their more elaborate differential weighting schemes (e. g., Huelsenbeck et al. 2011).  Cladists question the premise that one can know what is “realistic” in the empirical sciences.  Cladists use equal weights not because they view them to be more or less realistic than some other possible weighting scheme, but because, once again, such a scheme requires fewer ad hoc hypotheses to explain expected unruly behavior by different classes of characters or character state transformations.  Hennig (1966:121) provided a fourth example:

“… ‘phylogenetic systematics would lose all the ground on which it stands’ if the presence of apomorphous characters in different species were considered first of all as convergences (or parallelisms), with proof to the contrary required in each case.  Rather the burden of proof must be placed on the contention that ‘in individual cases the possession of common apomorphous characters may be based only on convergence (or parallelism).’”

In this famous passage, Hennig argued that homology is the parsimonious common cause explanation of shared apomorphy, and that the explanation of similar features by independent derivation represents a less parsimonious ad hoc hypothesis.

Statistical inconsistency: parsimony’s Achilles’ heel?

Statistical consistency is the mathematical property that an inferential procedure converges on the true answer with increasing certainty as the amount of evidence increases; statistical inconsistency is convergence upon the wrong answer instead.  Since Felsenstein (1978) first raised the issue that “parsimony can be statistically inconsistent”, statistically-minded phylogeneticists have banged the drum about the undesirable qualities of statistically inconsistent estimators, and therefore the inadequacy of cladistic parsimony for inferring phylogenetic hypotheses.  Mathematically speaking, statistical consistency is a desirable property of an inferential tool, and it is certainly true that one can contrive a model under which parsimony will behave in a statistically inconsistent manner.  The thing is, as Farris (1983, 1999) and I (Brower, 2018; Brower and Schuh, 2021) have repeatedly pointed out, any method can be inconsistent if the data do not fit the a priori model of evolution.  For phylogenetic inference among actual taxa, there is neither a way to know whether the answer you get is true or not, nor a way to know whether your method is converging on the right or the wrong answer.  Thus, statistical consistency has no bearing on empirical problems of this kind.
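To see what such a contrived model looks like, here is a rough sketch in the spirit of the four-taxon case, under assumptions that are mine rather than taken from any cited paper:  a symmetric two-state model, a “true” tree ((A,B),(C,D)), and hypothetical per-branch probabilities of change.  With four taxa, parsimony prefers the grouping supported by the most frequent informative character pattern, so the expected pattern frequencies show which tree it would settle on as characters accumulate without limit.

```python
from itertools import product


def pattern_probability(leaves, change):
    """Probability of one leaf assignment (A, B, C, D) on the tree ((A,B),(C,D)).

    `change` maps each branch to its probability of a state change under a
    symmetric two-state model (an assumption made for this illustration).
    """
    a, b, c, d = leaves

    def edge(parent, child, p):
        return p if parent != child else 1.0 - p

    total = 0.0
    # Sum over the unobserved states x, y at the two internal nodes.
    for x, y in product((0, 1), repeat=2):
        total += (
            0.5
            * edge(x, y, change["internal"])
            * edge(x, a, change["A"])
            * edge(x, b, change["B"])
            * edge(y, c, change["C"])
            * edge(y, d, change["D"])
        )
    return total


def informative_frequencies(change):
    """Expected frequency of each parsimony-informative pattern."""
    groups = {"AB|CD": (0, 0, 1, 1), "AC|BD": (0, 1, 0, 1), "AD|BC": (0, 1, 1, 0)}
    # A pattern and its 0/1 complement are equally probable, hence the factor 2.
    return {
        name: 2 * pattern_probability(leaves, change)
        for name, leaves in groups.items()
    }


# Hypothetical branch values: all short, versus long branches on non-sisters A and C.
short_branches = {"A": 0.05, "B": 0.05, "C": 0.05, "D": 0.05, "internal": 0.05}
long_branches = {"A": 0.40, "B": 0.05, "C": 0.40, "D": 0.05, "internal": 0.05}

for label, change in (("short", short_branches), ("long A and C", long_branches)):
    freqs = informative_frequencies(change)
    print(label, max(freqs, key=freqs.get), freqs)
```

With uniformly short branches the pattern uniting A and B dominates; with long branches leading to A and C, the pattern uniting those two non-sister taxa becomes the most frequent.  Note that the demonstration works only because the generating tree and model are stipulated in advance, which is exactly what is unavailable for actual taxa.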

In addition, although Felsenstein observed that under particular circumstances parsimony can be inconsistent, that certainly does not mean that it always is inconsistent.  As shown in early simulation experiments by Huelsenbeck and Hillis (1993), except under extremely contrived circumstances of differing branch length, parsimony usually obtains the “correct” tree.  One might therefore wonder if an experiment could be designed to test how often parsimony produces a wrong answer from actual data.  Of course, you never know what the right answer is, so deciding what is right or wrong is a challenge.  A decade ago, Rindal and Brower (2011) addressed this question by comparing results from empirical papers in which parsimony and a model-based method were both used to analyze empirical data sets, to test the substance of assertions that likelihood or Bayesian models solve parsimony’s inconsistency problem.  If the results are frequently different, then perhaps parsimony is not doing a very good job, but if they are usually the same, then the extra ad hoc assumptions of the models are not adding anything to the efficacy of the inference.  We found that in more than 99% of the cases, the results did not differ to the extent that the authors of the articles found worth mentioning.  This clearly shows that if parsimony is regularly producing erroneous results, then so are the models.  Thus, nothing is gained by increasing the complexity of the analysis. It is evidence such as this that led the editors of Cladistics (2016) to state the following:

“We do not consider the hypothetical problem of statistical inconsistency to constitute a philosophical argument for the rejection of parsimony.  All phylogenetic methods, including parsimony, may produce inconsistent or otherwise inaccurate results for a given data set.  The absence of certain truth represents a philosophical limit of empirical science.”

At the time, this seemed to us to be stating the obvious, but apparently we struck a nerve in defanging the modelers’ central argument against cladistics. As their tweets reveal, when reason fails, one can always turn to bigotry.

Parsimonious understanding of an unparsimonious world

In 1748, David Hume raised the problem of induction, a skeptical empiricist argument that there is no necessary reason why the future should resemble the past.  This puzzle has preoccupied philosophers ever since, because it seems to challenge fundamental premises of science and human understanding.  Hume recognized this himself.  He offered the following circumspect and surprisingly modern recipe for the advancement of knowledge (1748:159):

“To begin with clear and self-evident principles, to advance by timorous and sure steps, to review frequently our conclusions and examine accurately all their consequences — though by all means we shall make both a slow and a short progress in our systems — are the only methods by which we can ever hope to reach truth and attain a proper stability and certainty in our determinations.”

Hennig’s (1966) “checking and rechecking” approach to character coding reflects this cautious and tentative scientific attitude.

Hume’s points about inductive reasoning have never been satisfactorily resolved, and remain troublesome for the philosophy of science.  Let us consider a few examples of parsimonious scientific background assumptions challenged by the problem of induction.  First, the principle of cause and effect underlies our understanding of everything from the predictable behavior of colliding billiard balls to the evolution of warning coloration in poisonous insects (which would not occur if the aposematism and the poison were not reliably associated).  Yet this basic premise of explanation is not justifiable as an inductive generalization.  Nor is the principle of common cause, which allowed John Snow to infer the source of the Broad Street cholera outbreak, and, as noted above, provides the basis for hypotheses of homology.  Charles Lyell’s principle of uniformitarianism, invoked to explain the slow and steady processes of geological history, is a massive ontological claim that the past resembles the present.  Coin flipping, gambling and other statistical applications depend upon the future resembling the past in a probabilistically reliable manner (think “the law of large numbers”).  Indeed, the very rejection of miracles in scientific explanation would seem to be an inductive postulate based on the razor of silence.  As Hans Reichenbach said (1930:67),  “… the principle of induction is unreservedly recognized by the whole of science, and … there is no one who seriously doubts this principle, even for daily life.”

Of course, Karl Popper (1959), and many cladists, have vociferously rejected inductivism, and instead opted for hypothetico-deductive falsificationism (or in some cases, abductivism, e.g., Fitzhugh, 2006).  Yet we still draw general conclusions from the weight of the evidence, so the difference between these philosophical stances may boil down more to semantic emphasis than to contrasting procedures and outcomes in actual practice (cf. Rindal and Brower, 2011).  Nevertheless, skeptical distaste for inductive reasoning is why cladists assert that a most parsimonious cladogram is to be preferred over other hypotheses of relationships, because it is the least falsified by ad hoc hypotheses of homoplasy.  In this sense, we may view the cladistic endeavor in its totality to be based on a razor of silence regarding true or accurate representation of real historical patterns of evolutionary diversification.

For those, such as our twittering detractors, who doubt the relevance of parsimony to contemporary phylogenetics, recall that any time one minimizes something, such as a pairwise distance or path length, one is implicitly employing a parsimony-based optimality criterion.  While “maximum likelihood” sounds like the opposite of minimization, likelihoods are incredibly tiny quantities that are usually reported as log likelihoods, which are negative numbers; the optimal “maximum” value is therefore the one closest to zero, that is, the minimum of the negative log likelihood.  And at the stage of choosing models, those with fewer parameters are preferred over those with more, given similar likelihood scores.  Indeed, the very premise of an evolutionary model makes rigid, uniformitarian assumptions about the probabilistic behavior of various kinds of characters and character state transformations.  Bootstrap values, molecular clocks, coalescence theory:  I could go on.  The fact is, parsimony is woven through the fabric of all these methods and concepts, as an inescapable epistemological prerequisite of science.  Thus, it is not the principle of parsimony, but rather the suggestion that parsimony is “dogma” or “anachronistic crap” that actually constitutes “unscientific craziness”.
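A toy illustration of the two points about likelihood made above, using hypothetical coin-toss data rather than a phylogenetic model.  The Akaike information criterion (AIC) used here is one common way of penalizing extra parameters; it is my choice for the sketch, not a method named in the works cited in this essay.

```python
import math


def log_likelihood(p, heads, flips):
    """Binomial log likelihood of the data (constant term omitted)."""
    return heads * math.log(p) + (flips - heads) * math.log(1.0 - p)


def aic(log_lik, n_params):
    """Akaike information criterion: lower is better; each parameter costs 2."""
    return 2 * n_params - 2 * log_lik


heads, flips = 7, 10  # hypothetical observations

# Search a grid of candidate values for the probability of heads.
grid = [i / 100 for i in range(1, 100)]
scores = {p: log_likelihood(p, heads, flips) for p in grid}

best_by_maximizing = max(grid, key=lambda p: scores[p])
best_by_minimizing = min(grid, key=lambda p: -scores[p])

# Every log likelihood is negative, so the "maximum" is the value nearest zero.
print(best_by_maximizing, best_by_minimizing, scores[best_by_maximizing])

# Model choice: a fair coin (no free parameters) versus a fitted coin (one).
aic_fair = aic(log_likelihood(0.5, heads, flips), n_params=0)
aic_fitted = aic(scores[best_by_maximizing], n_params=1)
print(aic_fair, aic_fitted)
```

Maximizing the log likelihood and minimizing its negative pick out the same value.  For these data the fitted coin has the slightly better likelihood score, but once its extra parameter is penalized the simpler fair-coin model is preferred:  a parsimony judgment embedded in ordinary model selection.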

I began this essay with a story about the 2016 Cladistics editorial, and I close with this challenge to opponents of parsimony:  the statistical inconsistency claim against parsimony is dead in the water; that cladistic parsimony and models yield similar results from empirical data sets has been shown repeatedly.  Everyone ought to know by now, despite quixotic rhetoric, that realism is an untenable desideratum for empirical scientific problems.  Slandering advocates of parsimony as religious zealots and such is neither a substantive nor a decorous mode of debate, and merely highlights the apparent lack of a reasoned foundation for the opposing stance.  I have posed the question before, and I really am curious about the answer:  what rational basis do you believe remains for preferring manifestly unparsimonious methods?

* I take for granted that members of The Systematics Association know what “cladistics” is – perhaps a vainglorious presumption as we enter the third decade of the 21st Century.  If you don’t, you can read two forthcoming, but probably very different, books on the topic:  Williams and Ebach (2020), and Brower and Schuh (2021).


Acknowledgements

I thank my friend David M. Williams, President of the Systematics Association in 2019, for the invitation to travel to London to present a talk, and for the further request to adapt it into the current form for publication in the Systematist.  As I have acknowledged before, David and I do not agree about everything, but we agree about many more things than we disagree about – including the etiquette of disagreement:  there are many paths to cladistic enlightenment.  I am grateful to the Systematics Association for sponsoring my travel.  The views expressed in this essay do not necessarily reflect the opinions or policies of the United States Department of Agriculture or of the U. S. government.


References

Beatty, J. 1982. Classes and cladists. Systematic Zoology 31, 25-34.

Brower, A. V. Z. 2018. Statistical consistency and phylogenetic inference:  a brief review. Cladistics 34, 562-567.

Brower, A. V. Z. 2019. Background knowledge:  the assumptions of pattern cladistics. Cladistics 35, 717-731.

Brower, A. V. Z., and R. T. Schuh.  2021.  Biological Systematics:  Principles and Applications (3rd edn.).  Cornell University Press, Ithaca, NY.

Cartmill, M. 1981. Hypothesis testing and phylogenetic reconstruction. Z. zool. Syst. Evolut.-forsch. 19, 73-96.

Editors.  2016. Editorial. Cladistics 32, 1.

Farris, J. S. 1983. The logical basis of phylogenetic analysis. In: Platnick, N. I., Funk, V. A. (Eds.), Advances in Cladistics, Vol. 2. Columbia University Press, New York, pp. 7-36.

Farris, J. S. 1999. Likelihood and inconsistency. Cladistics 15, 199-204.

Felsenstein, J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27, 401-410.

Fitzhugh, K. 2006. The abduction of phylogenetic hypotheses. Zootaxa 1145, 1-110.

Hennig, W. 1965. Phylogenetic systematics. Annu. Rev. Entomol. 10, 97-116.

Huelsenbeck, J. P., Alfaro, M. E., Suchard, M. A. 2011. Biologically inspired phylogenetic models strongly outperform the no common mechanism model. Systematic Biology 60, 225-233.

Huelsenbeck, J. P., Hillis, D. M. 1993. Success of phylogenetic methods in the four-taxon case. Syst. Biol. 42, 247-264.

Hume, D. 1748. An inquiry concerning human understanding. (1955 edition). Bobbs-Merrill, Indianapolis.

Popper, K. R. 1959. The Logic of Scientific Discovery. Basic Books, New York.

Reichenbach, H., 1930. Die philosophische Bedeutung der modernen Physik. Erkenntnis 1, 49-71.

Rindal, E., Brower, A. V. Z. 2011. Do model-based phylogenetic analyses perform better than parsimony?  A test with empirical data. Cladistics 27, 331-334.

Simon, M. 2016. Twitter nerd-fight reveals a long, bizarre scientific feud. Wired.

Thorburn, W. M. 1918. The myth of Occam’s Razor. Mind 27, 345-353.

Williams, D. M., Ebach, M. C. 2020. Cladistics: A Guide to Biological Classification. Cambridge University Press.

The Author

Andrew Brower is Supervisory Agriculturalist at USDA APHIS Plant Protection and Quarantine, National Identification Service; Research Associate at the U. S. National Museum of Natural History, Department of Entomology; and Research Associate at the American Museum of Natural History, Division of Invertebrates.