Milestone 8:
Combining phylogenetic trees with hidden Markov models
for detecting recombination - Bayesian approach (MCMC)
Theory
The theory is explained in the following article:
Practice
-
Implement the algorithm in Matlab.
Make use of your knowledge of MCMC, which you obtained
in your previous MSc project and in our journal club meetings.
Also, have a look at my book chapter on learning
in Bayesian networks.
Note that your implementation is simpler than the one in
my MBE paper, because you use a simpler nucleotide
substitution model (the Kimura model).
First, fix the transition-transversion ratio at 2.
All you then have to do is resample the weights
(five parameters for each tree topology) and the
hidden state sequences.
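To make the role of the fixed transition-transversion ratio concrete, here is a minimal sketch of the Kimura two-parameter substitution probabilities. It is written in Python for illustration only and would need porting to Matlab. The function names are hypothetical, and two conventions are assumptions rather than something stated above: the ratio is taken as kappa = alpha/beta (transition rate over transversion rate), and d is a branch length measured in expected substitutions per site. If your paper defines the ratio differently, the exponents change accordingly.

```python
import math

NUCS = "ACGT"
PURINES = {"A", "G"}  # A<->G and C<->T are transitions


def k80_probs(d, kappa=2.0):
    """Return (p_same, p_transition, p_transversion) for branch length d.

    Assumptions: kappa = alpha / beta, and d is the expected number of
    substitutions per site. p_transversion is the probability of each of
    the two possible transversions, not their sum.
    """
    e1 = math.exp(-4.0 * d / (kappa + 2.0))
    e2 = math.exp(-2.0 * d * (kappa + 1.0) / (kappa + 2.0))
    p_same = 0.25 + 0.25 * e1 + 0.5 * e2
    p_ts = 0.25 + 0.25 * e1 - 0.5 * e2
    p_tv = 0.25 - 0.25 * e1
    return p_same, p_ts, p_tv


def k80_matrix(d, kappa=2.0):
    """Return the 4x4 substitution matrix as a dict keyed by (from, to)."""
    p_same, p_ts, p_tv = k80_probs(d, kappa)
    P = {}
    for x in NUCS:
        for y in NUCS:
            if x == y:
                P[(x, y)] = p_same
            elif (x in PURINES) == (y in PURINES):
                P[(x, y)] = p_ts  # both purines or both pyrimidines
            else:
                P[(x, y)] = p_tv
    return P


if __name__ == "__main__":
    P = k80_matrix(0.1)  # kappa stays at its fixed value of 2
    print(P[("A", "A")], P[("A", "G")], P[("A", "C")])
```

With kappa fixed, the matrix depends on the branch length alone, which is why only the weights and the hidden states need to be resampled at this stage.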
Later, treat the transition-transversion ratio as a
free parameter too,
that is, update it in the MCMC simulation.
Make sure you optimize the proposal probabilities
in the burn-in phase, as described in
my MBE paper.
For the weights, this means that you have to adapt
the perturbation step size so as to optimize the acceptance
ratio.
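The step-size adaptation described above could look roughly like the following sketch. It is a Python illustration to be ported to Matlab, not the scheme from the MBE paper. The batch-wise update rule, the target acceptance rate of 0.3, and the toy log posterior are all assumptions made for the example; replace log_target with the log posterior of a weight in your model.

```python
import math
import random


def log_target(w):
    """Toy stand-in for the log posterior of one weight (Gamma(2, 1) shape)."""
    if w <= 0.0:
        return -math.inf  # weights are assumed to be positive
    return math.log(w) - w


def tune_step_size(log_post, w0, step=1.0, n_batches=50, batch_size=100,
                   target_rate=0.3, seed=0):
    """Random-walk Metropolis with step-size adaptation during burn-in.

    After each batch the step size is enlarged if too many proposals were
    accepted and shrunk if too few were. Returns (step, final_state).
    """
    rng = random.Random(seed)
    w, lp = w0, log_post(w0)
    for _ in range(n_batches):
        accepted = 0
        for _ in range(batch_size):
            prop = w + rng.gauss(0.0, step)  # symmetric perturbation
            lp_prop = log_post(prop)
            # Metropolis acceptance for a symmetric proposal
            if rng.random() < math.exp(min(0.0, lp_prop - lp)):
                w, lp = prop, lp_prop
                accepted += 1
        rate = accepted / batch_size
        step *= math.exp(rate - target_rate)
    return step, w


if __name__ == "__main__":
    step, w = tune_step_size(log_target, w0=1.0)
    print("tuned step size:", step)
```

Once burn-in is over, freeze the step size: continuing to adapt it during the sampling phase would change the proposal distribution while you are collecting samples.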
Note that you can improve one aspect of the algorithm
described in my paper:
rather than sampling the hidden state sequences
with a Gibbs-within-Gibbs scheme, which is slow,
sample the whole hidden state sequence in one go.
This requires you to slightly modify the forward-backward
algorithm, as explained in the
book by Durbin et al. and the
following paper,
that is,
you have to change the code in
Kevin Murphy's
HMM toolbox accordingly.
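Sampling the whole hidden state sequence in one go can be sketched as a forward pass followed by a stochastic backward pass. The Python code below only illustrates the idea and is not the code of the HMM toolbox. The function name and argument layout are hypothetical. In your setting, state k would presumably index a tree topology, and emis[t][k] would be the likelihood of alignment column t under topology k.

```python
import random


def sample_state_path(init, trans, emis, rng):
    """Draw one hidden state sequence from its joint posterior.

    init[k]     : prior probability of state k at the first position
    trans[j][k] : probability of moving from state j to state k
    emis[t][k]  : likelihood of the observation at position t under state k
    """
    T, K = len(emis), len(init)

    # Forward pass: normalized filtering distributions alpha[t][k].
    alpha = []
    prev = None
    for t in range(T):
        if t == 0:
            a = [init[k] * emis[0][k] for k in range(K)]
        else:
            a = [emis[t][k] * sum(prev[j] * trans[j][k] for j in range(K))
                 for k in range(K)]
        z = sum(a)
        prev = [x / z for x in a]  # rescale to avoid numerical underflow
        alpha.append(prev)

    # Backward pass: sample the last state, then each earlier state
    # conditional on the state already drawn to its right.
    path = [0] * T
    path[T - 1] = rng.choices(range(K), weights=alpha[T - 1])[0]
    for t in range(T - 2, -1, -1):
        w = [alpha[t][j] * trans[j][path[t + 1]] for j in range(K)]
        path[t] = rng.choices(range(K), weights=w)[0]
    return path


if __name__ == "__main__":
    init = [1 / 3, 1 / 3, 1 / 3]
    trans = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
    emis = [[0.9, 0.05, 0.05], [0.6, 0.3, 0.1], [0.1, 0.8, 0.1]]
    print(sample_state_path(init, trans, emis, random.Random(1)))
```

The only change relative to the standard forward-backward algorithm is in the backward pass: instead of computing smoothed marginals, it draws a state at each position, so every call returns one complete sequence.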
-
Test your implementation
in the way described in the
previous milestone section.
Note: A comparison of the Gibbs-within-Gibbs scheme with the
modified forward-backward algorithm would constitute some
new work, to be included in your thesis.