Pruning for PDM

Dirk Husmeier, Biomathematics and Statistics Scotland (BioSS)

Preparation

Add the following directories to your command path:

Add the directory in which you keep the MATLAB functions JambePrune*.m to your MATLABPATH. To add a directory to your MATLAB path, proceed as follows:

New_Path='YOUR_PATH';
addpath(New_Path);

where YOUR_PATH is the full name of that directory.

You need to keep the following files from the MCMC simulation carried out by JAMBE: resultsAllTopos.out and resultsStringToIntegerTranslator.out.

The functions to be described below will overwrite these files; hence they need to be copied to the respective bak files:

cp resultsAllTopos.out resultsAllTopos.bak
cp resultsStringToIntegerTranslator.out resultsStringToIntegerTranslator.bak

In fact, the functions described below assume that this has happened, and they read in from the *.bak rather than the original *.out files.
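The back-up step can be made safe to repeat. The following is a minimal sketch, assuming a POSIX sh shell; backup_results is a hypothetical helper and not part of Jambe. It copies each *.out file to its *.bak counterpart only if no back-up exists yet, so that running it a second time cannot replace the original MCMC output with an already pruned file.

```shell
#!/bin/sh
# backup_results is a hypothetical helper, not part of Jambe.
# It copies each listed *.out file to *.bak unless the back-up already exists.
backup_results() {
    for f in "$@"; do
        bak="${f%.out}.bak"          # strip the .out suffix, append .bak
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        if [ -f "$bak" ]; then
            echo "keeping existing $bak" >&2
        else
            cp "$f" "$bak"
        fi
    done
}

# Usage: backup_results resultsAllTopos.out resultsStringToIntegerTranslator.out
```

Refusing to overwrite an existing back-up is a design choice of this sketch; delete the *.bak files by hand before backing up the output of a new simulation.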


Step 1

The topology trajectory you get from the PDM method implemented in Jambe is saved in resultsAllTopos.out. Move it to file resultsAllTopos.bak and then convert the topology strings to integers with the command java PruneInputConvertStringToInteger. Redirect the output of this command to some temporary file, say resultsAllToposInt.out. If you use UNIX, this looks as follows:

mv resultsAllTopos.out resultsAllTopos.bak
java PruneInputConvertStringToInteger > resultsAllToposInt.out


Step 2

Extract the topologies from file resultsStringToIntegerTranslator.bak and add a semicolon at the end. Name the resulting file intree. On Unix, you do this with the following command:

awk '{print $3 ";"}' resultsStringToIntegerTranslator.bak > intree

You should get a file that looks as follows:

(1,(2,((3,4),((5,6),(7,8)))));
(1,(2,(((3,4),(7,8)),(5,6))));
(1,((2,((5,6),(7,8))),(3,4)));
(1,((2,(3,4)),((5,6),(7,8))));
(1,(2,(((3,4),(5,6)),(7,8))));
...
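
Before using intree in the next step, it can be worth checking that every line is a complete tree string. The following sketch assumes POSIX sh and awk; check_intree is a hypothetical helper, not part of Jambe or PHYLIP. It only checks that each line ends with a semicolon and contains equal numbers of opening and closing parentheses, not that the topology is otherwise valid.

```shell
#!/bin/sh
# check_intree is a hypothetical helper: it reports lines of a tree file that
# do not end in ';' or whose counts of '(' and ')' differ.
check_intree() {
    awk '
    BEGIN { bad = 0 }
    {
        line = $0
        nopen  = gsub(/\(/, "(", line)   # gsub returns the number of matches
        nclose = gsub(/\)/, ")", line)
        if (line !~ /;$/ || nopen != nclose) {
            printf "line %d looks malformed: %s\n", NR, $0
            bad = 1
        }
    }
    END { exit bad }                     # non-zero exit if any line failed
    ' "$1"
}

# Usage: check_intree intree
```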


Step 3

Make sure

Run batch file job_runTreeDist.bat, which contains the following command lines:
treedist << END
2
P
S
Y
END


All steps together (on UNIX)

The following code assumes that you have moved or copied the output files from a PDM simulation with Jambe, resultsAllTopos.out and resultsStringToIntegerTranslator.out, to the respective back-up files:

mv resultsAllTopos.out resultsAllTopos.bak
mv resultsStringToIntegerTranslator.out resultsStringToIntegerTranslator.bak

It then proceeds to create a new pruned file resultsAllTopos.out. This new file

java PruneInputConvertStringToInteger >! resultsAllToposInt.out
awk '{print $3 ";"}' resultsStringToIntegerTranslator.bak >! intree
rm outfile
job_runTreeDist.bat
IN MATLAB: JambePrune
rm intree
rm resultsAllToposInt.out
rm outfile

You can now compute the probabilistic divergence measure in the standard way:

java JambeAnalyseTopos

Note that this command overwrites the file resultsStringToIntegerTranslator.out. For this reason, the code fragment above first saves this file as resultsStringToIntegerTranslator.bak.

You can do the whole analysis within MATLAB using the command jobJambePrune. This program calls all the required MATLAB functions as well as the external UNIX and JAVA commands.

Note that it only works if you have set the command paths properly.
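
One way to check the command path before starting is sketched below, assuming a POSIX sh shell. missing_commands is a hypothetical helper, not part of Jambe; the command names in the usage line are the ones used on this page. It checks only that the shell can find each command, not the Java class path or the MATLAB path.

```shell
#!/bin/sh
# missing_commands is a hypothetical helper: it prints every argument that the
# shell cannot find as a command and returns non-zero if any is missing.
missing_commands() {
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            status=1
        fi
    done
    return "$status"
}

# Usage: missing_commands java awk treedist job_runTreeDist.bat
```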

