Discriminating between rate heterogeneity and interspecific recombination in DNA sequence alignments with phylogenetic factorial HMMs:
Errata

Dirk Husmeier
Biomathematics and Statistics Scotland (BioSS)
JCMB, The King's Building, Edinburgh EH9 3JZ, United Kingdom

I am grateful to Wolfgang Lehrach for discovering the bugs reported below.

Transition-transversion ratio

If you downloaded the software before 11 October 2005, please download it again, as the old version contained a serious bug. For that reason I would also advise you to let me know when you download my software; just send an email to dirk@bioss.ac.uk. This will allow me to contact you if I find another bug in the code, and to notify you about future changes and improvements.

In my code, the transition-transversion ratio is defined as alpha/beta. This is not the standard definition, though. In the Kimura model, the standard definition is alpha/(2*beta). The factor 1/2 results from the fact that there are twice as many transversions as transitions: each nucleotide can undergo one transition but two different transversions. In the HKY and F84 models, the transition/transversion ratio depends on functions of the nucleotide frequencies; see, for instance, equation 24 in Felsenstein and Churchill (1996), Mol. Biol. Evol. 13:93-104. You need to keep that difference in mind when comparing your results with a program that uses the standard definition, like Seq-Gen. Here are further details.
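As a small illustration of the factor 1/2 in the Kimura model, the following Python sketch converts between the two definitions. The function names are hypothetical and are not part of the distributed software; the conversion applies to the Kimura model only, since for HKY and F84 the relation also involves the nucleotide frequencies.

```python
def rate_ratio_to_standard_k80(alpha_over_beta):
    """Convert alpha/beta into the standard Kimura ratio alpha/(2*beta).

    Hypothetical helper for illustration; valid for the Kimura model only.
    """
    return alpha_over_beta / 2.0


def standard_to_rate_ratio_k80(ts_tv_ratio):
    """Convert the standard Kimura ratio alpha/(2*beta) back into alpha/beta."""
    return 2.0 * ts_tv_ratio


if __name__ == "__main__":
    # A value of alpha/beta = 4 corresponds to a standard ratio of 2.
    print(rate_ratio_to_standard_k80(4.0))
```

So a value of alpha/beta = 4 in this software corresponds to a standard Kimura transition/transversion ratio of 2.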

In phylogenetic analysis, there is an identifiability problem between the branch lengths and the parameters of the rate matrix unless some further conditions are specified. The standard condition that is required to hold is:
sum_i Q_ii pi_i = -1
where Q_ii are the elements of the rate matrix, and pi_i is the equilibrium distribution over nucleotides. This condition allows the branch lengths to be interpreted as the expected number of nucleotide substitutions per site.
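The standard condition can be illustrated with a minimal Python sketch. It assumes an HKY85-style parameterisation in which the off-diagonal rate from nucleotide i to nucleotide j is alpha*pi_j for transitions and beta*pi_j for transversions, with nucleotides ordered A, C, G, T. All function names are hypothetical, and this is not the revised function by Wolfgang Lehrach.

```python
NUCLEOTIDES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def hky85_rate_matrix(alpha, beta, pi):
    """Build an unnormalised HKY85-style rate matrix (assumed parameterisation)."""
    n = len(NUCLEOTIDES)
    q = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(NUCLEOTIDES):
        for j, y in enumerate(NUCLEOTIDES):
            if i != j:
                rate = alpha if (x, y) in TRANSITIONS else beta
                q[i][j] = rate * pi[j]
        # The diagonal is still 0.0 here, so this is minus the off-diagonal row sum.
        q[i][i] = -sum(q[i])
    return q


def expected_rate(q, pi):
    """Return -sum_i pi_i Q_ii, the expected number of substitutions per unit time."""
    return -sum(pi[i] * q[i][i] for i in range(len(pi)))


def normalise(q, pi):
    """Rescale Q so that sum_i Q_ii pi_i = -1."""
    scale = expected_rate(q, pi)
    return [[entry / scale for entry in row] for row in q]


if __name__ == "__main__":
    pi = [0.1, 0.2, 0.3, 0.4]
    q = normalise(hky85_rate_matrix(2.0, 1.0, pi), pi)
    print(expected_rate(q, pi))  # close to 1 after normalisation
```

After this rescaling, a branch of length t corresponds to an expected t substitutions per site, which is the interpretation used by standard phylogeny packages.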

Note that my code uses a different equation of constraint that does not lead to this convenient interpretation of the branch lengths. This deviation is irrelevant for the detection of mosaic structures in DNA sequence alignments itself. However, it can cause confusion when comparing your results with those of a standard phylogeny package, like Seq-Gen. To avoid this confusion, replace function Barge_NucSubstMatrix.m by the following revised function, written by Wolfgang Lehrach.

Nucleotide substitution model

Although the nucleotide equilibrium frequencies are read in and passed on to various functions as a parameter vector, they are never used to actually overwrite the default setting of a uniform distribution. This implies that my code implements the Kimura model of nucleotide substitutions, and not the HKY85 model, as intended.

When correcting this bug, one has to transpose the transition probability matrix used in my code. For the Kimura model, the transposition makes no difference, as the transition probability matrix is symmetric. But that does not hold for the HKY85 model, where the nucleotide equilibrium frequencies need not be uniform.
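The difference between the two models can be checked numerically. The Python sketch below assumes the same HKY85-style parameterisation as is common in the literature (rate from i to j proportional to pi_j, nucleotides ordered A, C, G, T) and computes P(t) = exp(Qt) with a truncated Taylor series, which is adequate for the small t used here. All names are hypothetical, and the sketch says nothing about which orientation the distributed code uses.

```python
NUCLEOTIDES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def rate_matrix(alpha, beta, pi):
    """HKY85-style rate matrix; uniform pi gives the Kimura model."""
    n = len(NUCLEOTIDES)
    q = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(NUCLEOTIDES):
        for j, y in enumerate(NUCLEOTIDES):
            if i != j:
                q[i][j] = (alpha if (x, y) in TRANSITIONS else beta) * pi[j]
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 when summed
    return q


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t, terms=30):
    """Approximate exp(Q*t) by a truncated Taylor series (small t only)."""
    n = len(q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        step = [[q[i][j] * t / k for j in range(n)] for i in range(n)]
        term = mat_mul(term, step)  # term is now (Q*t)^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def is_symmetric(p, tol=1e-12):
    n = len(p)
    return all(abs(p[i][j] - p[j][i]) < tol
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    kimura = transition_probabilities(rate_matrix(2.0, 1.0, [0.25] * 4), 0.1)
    hky85 = transition_probabilities(
        rate_matrix(2.0, 1.0, [0.1, 0.2, 0.3, 0.4]), 0.1)
    print(is_symmetric(kimura), is_symmetric(hky85))
```

With uniform frequencies the matrix equals its transpose, so using the wrong orientation goes unnoticed; with non-uniform frequencies the entries P_ij and P_ji differ, and the orientation matters.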

I have not yet made these corrections for two reasons. Firstly, I no longer have the time to support this software. Secondly, I find that my software has been superseded by a more recent program package developed by Wolfgang Lehrach, whose software will be made available here shortly.

