Cladistic methodology

A cladistic analysis is applied to a defined set of information. To organize this information, a distinction is made between characters and character states. Consider the color of feathers: it may be blue in one species but red in another. Thus, "red feathers" and "blue feathers" are two character states of the character "feather color."
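The distinction between characters and character states can be sketched as a small character matrix. This is only an illustration: the taxon names, the second character ("crest") and the helper functions are hypothetical and do not come from any real data set.

```python
# Hypothetical character matrix: taxon -> {character: character state}.
# All names and values are invented for illustration.
character_matrix = {
    "species_a": {"feather_color": "blue", "crest": "present"},
    "species_b": {"feather_color": "red", "crest": "present"},
    "species_c": {"feather_color": "red", "crest": "absent"},
}


def character_states(matrix, character):
    """Return the set of distinct states observed for one character."""
    return {states[character] for states in matrix.values()}


def taxa_with_state(matrix, character, state):
    """Return the sorted names of the taxa that show the given state."""
    return sorted(
        taxon for taxon, states in matrix.items() if states[character] == state
    )


observed = character_states(character_matrix, "feather_color")
red_taxa = taxa_with_state(character_matrix, "feather_color", "red")
```

Here "feather_color" is the character, while "blue" and "red" are its states; each row records which state a taxon shows for each character.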

Traditionally, the researcher would decide which character states were already present before the last common ancestor of the species group (plesiomorphies) and which first appeared in the last common ancestor (synapomorphies). Usually this was done by considering one or more outgroups (organisms that are considered not to be part of the group in question, but that are related to it). Only synapomorphies are of any use in characterising cladistic divisions.
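Outgroup comparison can be sketched in a few lines. The sketch below makes a simplifying assumption that the text does not state: the state seen in the outgroup is treated as the ancestral one, and any other state shared by at least two ingroup taxa is reported as a candidate synapomorphy. The function name and the data are hypothetical.

```python
def candidate_synapomorphies(ingroup_states, outgroup_state):
    """Group ingroup taxa by derived state for a single character.

    ingroup_states maps taxon name -> observed state. The outgroup state is
    assumed (a simplification) to be the ancestral, plesiomorphic state.
    """
    derived = {}
    for taxon, state in ingroup_states.items():
        if state != outgroup_state:
            derived.setdefault(state, []).append(taxon)
    # Only a derived state shared by two or more taxa can unite a group.
    return {
        state: sorted(taxa) for state, taxa in derived.items() if len(taxa) >= 2
    }


# Invented example: the outgroup has blue feathers.
feather_color = {"species_a": "red", "species_b": "red", "species_c": "blue"}
result = candidate_synapomorphies(feather_color, outgroup_state="blue")
```

In this invented example, red feathers are shared by species_a and species_b and differ from the outgroup, so they are a candidate synapomorphy for those two species; the blue feathers of species_c are treated as retained from the ancestor.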

Next, different possible cladograms were drawn up and evaluated. Clades should have as many synapomorphies as possible. The hope is that the number of true synapomorphies will be large enough to overwhelm any unintended homoplasies caused by convergent evolution (i.e. characters that resemble each other because of environmental conditions or function, not because of common ancestry). A well-known example of homoplasy due to convergent evolution is the character "presence of wings". Though the wings of birds and insects may superficially resemble one another and serve the same function, each evolved independently. If a bird and an insect are both scored "POSITIVE" for the character "presence of wings", a homoplasy is introduced into the dataset, which may cause erroneous results.

When competing possibilities turn up, one is usually chosen based on the principle of parsimony: the arrangement that requires the fewest character state changes is likely the best hypothesis of relationship (an application of Occam's razor). Another approach, particularly useful in molecular evolution, is maximum likelihood, which selects the cladogram that has the highest likelihood under a specific probability model of changes.
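The parsimony criterion can be illustrated by counting the minimum number of state changes each candidate tree requires. The sketch below uses Fitch's algorithm for this count; the text does not name a specific algorithm, so this is one common choice, not necessarily the one any particular program uses. The taxa A to D, the characters c1 to c4 and the three candidate trees are invented for illustration.

```python
def fitch_score(tree, states):
    """Return (possible_root_states, changes) for one character.

    tree is a leaf name (str) or a 2-tuple of subtrees (rooted, binary);
    states maps leaf name -> observed state for that character.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch_score(tree[0], states)
    right_set, right_changes = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # The subtrees can agree on a state: no extra change is needed here.
        return common, left_changes + right_changes
    # The subtrees disagree: at least one change is required at this node.
    return left_set | right_set, left_changes + right_changes + 1


def tree_length(tree, matrix):
    """Sum the minimum change counts over all characters in the matrix."""
    characters = next(iter(matrix.values())).keys()
    return sum(
        fitch_score(tree, {taxon: row[c] for taxon, row in matrix.items()})[1]
        for c in characters
    )


# Hypothetical data: 0 = ancestral state, 1 = derived state; D is the outgroup.
matrix = {
    "A": {"c1": 1, "c2": 1, "c3": 0, "c4": 1},
    "B": {"c1": 1, "c2": 1, "c3": 1, "c4": 1},
    "C": {"c1": 0, "c2": 1, "c3": 1, "c4": 0},
    "D": {"c1": 0, "c2": 0, "c3": 0, "c4": 0},
}
candidates = {
    "((A,B),C),D": ((("A", "B"), "C"), "D"),
    "((A,C),B),D": ((("A", "C"), "B"), "D"),
    "((B,C),A),D": ((("B", "C"), "A"), "D"),
}
lengths = {name: tree_length(tree, matrix) for name, tree in candidates.items()}
best = min(lengths, key=lengths.get)
```

With these invented data the tree ((A,B),C),D needs 5 changes, against 7 and 6 for the alternatives, so it is the most parsimonious of the three. Character c3 conflicts with that tree and has to change twice on it, which is how a homoplasy shows up in such a count.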

Of course, it is no longer done this way, because manual evaluation leaves room for researcher bias. These days much of the analysis is done by software: besides the programs that calculate the trees themselves, there is sophisticated statistical software to provide a more objective basis.

Cladistics has taken a while to settle in, and there is still wide debate over how to apply Hennig's ideas in the real world. There is concern that the use of widely different data sets (for instance, structural versus genetic characteristics) may produce widely different trees. On the whole, however, cladistics has proven useful in resolving phylogenies and has gained widespread support.

As DNA sequencing has become easier, phylogenies are increasingly constructed with the aid of molecular data. Computational systematics allows the use of these large data sets to construct objective phylogenies. These can distinguish some true synapomorphies from parallel evolution more accurately. A powerful method of reconstructing phylogenies is the use of genomic retrotransposon markers, which are virtually ambiguity-free according to current knowledge (though this is an assumption based on statistics and may, although it is unlikely, not hold in a specific case). Ideally, morphological, molecular and possibly other (behavioral, etc.) phylogenies should be combined: none of the methods is "superior", but all have different intrinsic sources of error. For example, true character convergence is much more common in morphology than in molecular sequences, but true character reversions usually occur only in the latter (see Long branch attraction). Dating based on molecular information is usually more precise than dating of fossils, but more fraught with error (see Molecular clock). By combining and comparing the different phylogenies, many errors can be eliminated.

Credits
Credits: This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Cladistics"