Midproject review


Objective

Recall that the objective of your project is to combine a factorial hidden Markov model with a phylogenetic tree for 4-species alignments to distinguish between recombination and rate variation. This project builds on the following earlier work:

And this is, in essence, what your thesis is about:

Objective: Detecting rate variation and recombination.
Model: Phylogenetic tree plus two parallel (factorial) hidden Markov models.
Learning method: Either maximum likelihood (EM algorithm) or MCMC.
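
To make the model summary above more concrete, here is a minimal Python sketch of how two parallel hidden chains can be merged into one HMM on the product state space, and how the likelihood of an alignment is then obtained with the forward algorithm. Several things are assumptions for illustration only: one chain is taken to track the tree topology (a 4-species alignment has three unrooted topologies) and the other a rate category, the two chains are taken to be independent a priori, and the emission table B is filled with random placeholder numbers. In your model, B would presumably come from the phylogenetic likelihood of each alignment column. The names joint_transition, forward_loglik, A_topo and A_rate are hypothetical.

```python
import numpy as np

def joint_transition(A_topo, A_rate):
    """Transition matrix on the product state space (topology, rate).

    Assumes the two hidden chains evolve independently a priori, so the
    joint transition matrix is the Kronecker product of the two.
    """
    return np.kron(A_topo, A_rate)

def forward_loglik(pi, A, B):
    """Log-likelihood of a sequence under an HMM (scaled forward pass).

    pi : (K,)   initial state distribution
    A  : (K, K) transition matrix, rows sum to 1
    B  : (T, K) B[t, k] = P(alignment column t | hidden state k)
    """
    alpha = pi * B[0]
    loglik = 0.0
    for t in range(B.shape[0]):
        if t > 0:
            alpha = (alpha @ A) * B[t]
        scale = alpha.sum()          # rescale to avoid numerical underflow
        loglik += np.log(scale)
        alpha /= scale
    return loglik

# Toy setup (all numbers are placeholders, not estimates).
A_topo = np.full((3, 3), 0.01)       # 3 topologies, rare topology changes
np.fill_diagonal(A_topo, 0.98)
A_rate = np.array([[0.9, 0.1],       # 2 rate categories
                   [0.1, 0.9]])
A = joint_transition(A_topo, A_rate) # 6 joint states
pi = np.full(6, 1.0 / 6.0)

rng = np.random.default_rng(0)
B = rng.uniform(0.01, 1.0, size=(50, 6))  # placeholder emission probabilities
print(forward_loglik(pi, A, B))
```

Working on the product state space is only practical because the number of joint states is small here (3 x 2 = 6); for larger factorial models an approximate or structured inference scheme would be needed.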


Method

You can follow two routes:

Route A ("classical"): Optimizing the parameters with maximum likelihood
Route B ("Bayesian"): Sampling the parameters from the posterior with MCMC

Route A might be faster because it can draw more directly on existing functions implemented in Kevin Murphy's Bayes Net Toolbox and Hidden Markov Model (HMM) Toolbox. It is your task to familiarize yourself with these software packages and to decide which functions are useful for your purposes, and how to use them.

Route B is, in principle, the more powerful method; and it has the advantage of leading to some new intermediate results that you can include in your thesis.
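
To give a feeling for what route B involves, here is a minimal Python sketch of a random-walk Metropolis sampler for a single parameter. It is not the sampler you will need for the full model, only an illustration of the accept/reject step. The target is an assumed toy posterior for the probability nu of staying in the same hidden state, with a uniform prior and made-up counts of 95 "stay" and 5 "switch" events; in the real model the hidden states are not observed, so such counts are not directly available. The names metropolis, log_post_stay and nu are hypothetical.

```python
import numpy as np

def metropolis(log_post, x0, n_steps, step, rng):
    """Random-walk Metropolis sampler for a scalar parameter.

    The Gaussian proposal is symmetric, so the acceptance probability
    reduces to the ratio of posterior densities.
    """
    x, lp = x0, log_post(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.normal()
        lp_prop = log_post(prop)
        # Accept with probability min(1, exp(lp_prop - lp)).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[i] = x
    return samples

def log_post_stay(nu, n_stay=95, n_switch=5):
    """Toy log posterior (up to a constant) under a uniform prior on (0, 1)."""
    if not 0.0 < nu < 1.0:
        return -np.inf               # proposals outside (0, 1) are rejected
    return n_stay * np.log(nu) + n_switch * np.log(1.0 - nu)

rng = np.random.default_rng(1)
samples = metropolis(log_post_stay, x0=0.5, n_steps=20000, step=0.05, rng=rng)
print(samples[2000:].mean())         # posterior mean after discarding burn-in
```

For this toy target the exact posterior is a Beta(96, 6) distribution, so the sample mean can be checked against 96/102. In the full model, the same accept/reject logic would be applied with the log posterior computed from the alignment likelihood.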

In drafting the next milestone sections, I will assume that you follow route B. Feel free to explore route A yourself.


Current situation

You should by now have acquired a sufficiently profound background in probabilistic modelling and learning from data, obtained by

This should, in principle, put you in the position to

Make sure that you keep focused on these objectives; you have not quite got there yet.

Additional help

You may use the program packages SERAD and BARCE for comparison with your own software implementation. To download SERAD, click here.