Milestone 5: Learning parameters from incomplete data
with the Baum-Welch algorithm
First half of December 2003
Theory
Let us now turn to the more realistic situation
in which we don't know the hidden state sequence.
The objective is to learn the parameters from incomplete
observation. Here, incomplete means that
we have only access to one or several
sequences of emissions from the model,
while the states that generated these emissions are unknown.
The method to apply in this case is the
Baum-Welch algorithm, which is a particular
application of the Expectation Maximization (EM)
algorithm.
Note that the Baum-Welch algorithm
makes use of the forward-backward
algorithm, which you have implemented and tested for the
previous milestone.
Also, refer to the
previous milestone
for suggested reading material.
Practice
Implement the Baum-Welch algorithm
in software
and add this module to your
program.
How can you check easily if your implementation
has no bugs?
Think about it before you look up
the answer.
Test your implementation on the
following examples, for which you
know the true parameter values.
How do the results depend on the initialization?
Download the HMM software package
written by
Tapas Kanungo.
Use this software package to
-
generate data from an HMM with known parameters;
-
learn the parameters from the data with your own program;
-
compare the results with those you get with Kanungo's program.
Try and implement the Baum-Welch algorithm in software
on your own; this is a good exercise for
-
understanding the algorithm properly,
-
learning to put an algorithm into code, and
-
testing its implementation.
Once you've got the algorithm to work, I'd
encourage you to compare your software implementation
with the code written by
Tapas Kanungo.
Do you see any substantial differences in the code?
What, do you think, could be improved in your own code?
Back to the previous page.