RUN
The program RUN reads in
the parameter structure PAR.mat, created
with START, and performs
a training simulation.
After the simulation, you can display several results.
-
Evolution of the log likelihood during training
The type of curves you get depends on the chosen
training scheme.
When training with EM, the program plots the
log likelihood of the whole HMM
against twice the number of EM cycles.
After each E-step, the log likelihood is plotted after
each of two partial M-steps: after optimizing
the recombination probability, and after optimizing
the branch lengths.
When training with CML, the program plots the
constrained log likelihoods, for
each of the possible trees in turn, against the number
of gradient ascent steps.
In both cases, the plot
provides an easy way to check whether
the training process has converged, that is, whether the
training parameters
have been chosen appropriately.
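As an illustration of this convergence check, the following Python
sketch tests whether the last few increments of a log likelihood
curve have fallen below a tolerance. It is not part of the program:
the function name has_converged and the parameters tol and window
are hypothetical, and suitable values depend on your data.

```python
def has_converged(log_likelihoods, tol=1e-4, window=3):
    """Return True if the last `window` increments are all smaller than `tol`.

    log_likelihoods: sequence of log likelihood values, one per plotted point
                     (hypothetical input, e.g. read off the training curve).
    tol, window:     illustrative thresholds, not parameters of the program.
    """
    if len(log_likelihoods) < window + 1:
        # Too few points to judge convergence.
        return False
    recent = log_likelihoods[-(window + 1):]
    # Compare each value with its successor over the final stretch of the curve.
    return all(abs(b - a) < tol for a, b in zip(recent, recent[1:]))


# A curve that is still rising, and one that has flattened out.
rising = [-120.0, -110.0, -101.0, -95.0]
flat = [-120.0, -90.0, -85.0, -85.0, -85.0, -85.0]
print(has_converged(rising), has_converged(flat))
```

A curve that is still climbing at the end of the run suggests
increasing the number of training steps before interpreting the results.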
-
Plotting the phylogenetic trees
The program draws the phylogenetic trees for the three
possible topologies.
-
Plot of the posterior probabilities and classification scores
The program plots a figure composed of six sub-figures.
The sub-figures in the top row plot the posterior probabilities
P(S_t|D), conditional on the data D,
for the three states S_t in {1,2,3},
against the position t in the multiple alignment.
The histograms in the bottom row show you the
classification scores for three different regions.
These regions are called
No Rec (no recombination),
Rec 1 (1st recombination),
and Rec 2 (2nd recombination).
This naming is motivated by the
synthetic data
and the Neisseria sequence.
If you use your own
data and want to change this, you need to modify the code
in PlotRecombi.m.
For the synthetic data, the correct topology
in Rec 1 is State-2, and the correct
topology in Rec 2 is State-3.
For the Neisseria data, the correct
topology in Rec 1 is believed to be
State-3; the topology in Rec 2 is unknown.
In both cases, the dominant topology in region
No Rec is State-1.
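To illustrate how per-region scores of this kind could be computed,
the following Python sketch counts, for each region, the fraction of
sites assigned to each state. This is an assumption about what the
histograms show, not the code in PlotRecombi.m; the function name
region_scores and the region boundaries in the example are hypothetical.

```python
from collections import Counter


def region_scores(states, regions, n_states=3):
    """Fraction of sites per state within each named region.

    states:  predicted state (1..n_states) for every alignment position.
    regions: dict mapping a region name to a list of (start, end) ranges,
             0-based and half-open, so a region may be non-contiguous.
    """
    scores = {}
    for name, ranges in regions.items():
        # Collect the predictions of all sites belonging to this region.
        segment = [s for start, end in ranges for s in states[start:end]]
        counts = Counter(segment)
        total = len(segment)
        scores[name] = [
            counts.get(state, 0) / total if total else 0.0
            for state in range(1, n_states + 1)
        ]
    return scores


# Hypothetical toy prediction over 12 sites and hypothetical boundaries.
predicted = [1, 1, 1, 2, 2, 2, 1, 1, 3, 3, 3, 1]
regions = {
    "No Rec": [(0, 3), (6, 8), (11, 12)],
    "Rec 1": [(3, 6)],
    "Rec 2": [(8, 11)],
}
for name, fractions in region_scores(predicted, regions).items():
    print(name, fractions)
```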
You can get two different predictions from
the program. The first classification is obtained
from the Viterbi path, that is, the joint
mode of P(S_1,...,S_N|D). You are asked whether you
are interested in the single-site mode as well.
If you give an affirmative response, the program
plots a second histogram based on the mode of P(S_t|D).
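The difference between the two predictions can be sketched in Python
for a generic HMM. The Viterbi path maximizes the joint probability
P(S_1,...,S_N|D) over whole state sequences, while the single-site
mode picks, at each position t, the state that maximizes P(S_t|D).
This is a minimal illustration, not the program's code: the function
names, the 0-based state indices, and the numbers in the example are
all hypothetical.

```python
import math


def _log(x):
    return math.log(x) if x > 0 else float("-inf")


def viterbi(init, trans, emis):
    """Most probable state path; emis[t][s] = P(data at t | state s)."""
    n, k = len(emis), len(init)
    delta = [_log(init[s]) + _log(emis[0][s]) for s in range(k)]
    back = []
    for t in range(1, n):
        prev, delta, ptr = delta, [], []
        for s in range(k):
            # Best predecessor of state s at position t.
            best = max(range(k), key=lambda r: prev[r] + _log(trans[r][s]))
            ptr.append(best)
            delta.append(prev[best] + _log(trans[best][s]) + _log(emis[t][s]))
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(range(k), key=lambda s: delta[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def posteriors(init, trans, emis):
    """P(S_t | D) for every t, via the scaled forward-backward recursions."""
    n, k = len(emis), len(init)

    def norm(v):
        z = sum(v)
        return [x / z for x in v]

    alpha = [norm([init[s] * emis[0][s] for s in range(k)])]
    for t in range(1, n):
        alpha.append(norm([
            emis[t][s] * sum(alpha[-1][r] * trans[r][s] for r in range(k))
            for s in range(k)
        ]))
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = norm([
            sum(trans[s][r] * emis[t + 1][r] * beta[t + 1][r] for r in range(k))
            for s in range(k)
        ])
    return [norm([alpha[t][s] * beta[t][s] for s in range(k)]) for t in range(n)]


# Hypothetical 3-state example: "sticky" transitions, informative emissions.
init = [1 / 3] * 3
trans = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
emis = [[0.9, 0.05, 0.05]] * 3 + [[0.05, 0.9, 0.05]] * 3

joint_mode = viterbi(init, trans, emis)
site_mode = [max(range(3), key=lambda s: p[s]) for p in posteriors(init, trans, emis)]
print(joint_mode, site_mode)
```

With clearly informative data the two predictions usually coincide.
They can differ where the posterior is spread over several states,
which is why the second histogram can be a useful cross-check.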
Last modified: Mon May 22 13:52:28 BST