Synthetic Data

Data.dat contains 16 synthetic DNA sequence alignments, each 1000 base pairs long and containing two recombinant zones. The dominant topology is in State=1, the first recombinant region is in State=2, and the second recombination event switches to State=3. You can find a definition of the three states by typing

MATLAB> help Tree4

on the MATLAB command line. This also shows you how the branch-length vectors are defined. The data sets vary with respect to the branch lengths of the phylogenetic tree, the lengths of the recombination zones, their locations, and the timing of the recombination events (expressed as a percentage of the external branch lengths along which the sequences had evolved at the time of the event).

Location of the recombinant regions

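In the following 5-tuples, the second and fourth numbers give the lengths of the recombinant zones, while the first, third, and fifth numbers give the lengths of the non-recombinant regions. All lengths are in base pairs and each tuple sums to 1000. The number range before each tuple lists the data sets it applies to.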

Data sets   Segment lengths (bp)
1-4         200-200-200-200-200
5-8         200-100-300-300-100
9-12        200-100-500-100-100
13-16       200-50-400-250-100
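
As a rough illustration of how the tuples above translate into per-site states, here is a minimal Python sketch. It does not read Data.dat. The function names are hypothetical, and the state order 1-2-1-3-1 is an assumption based on the description above: State=1 for non-recombinant regions, State=2 for the first recombinant zone, State=3 for the second.

```python
from itertools import accumulate

# Segment lengths (bp) copied from the table above, per range of data set numbers.
SEGMENTS = [
    (range(1, 5), (200, 200, 200, 200, 200)),
    (range(5, 9), (200, 100, 300, 300, 100)),
    (range(9, 13), (200, 100, 500, 100, 100)),
    (range(13, 17), (200, 50, 400, 250, 100)),
]

# Assumed state of each of the five segments, left to right.
STATE_ORDER = (1, 2, 1, 3, 1)


def segment_lengths(dataset):
    """Return the five segment lengths for data set number 1..16."""
    for numbers, lengths in SEGMENTS:
        if dataset in numbers:
            return lengths
    raise ValueError("data set number must be between 1 and 16")


def breakpoints(dataset):
    """Return the number of sites to the left of each of the four breakpoints."""
    return list(accumulate(segment_lengths(dataset)))[:-1]


def site_states(dataset):
    """Return a list of 1000 states, one per alignment site."""
    states = []
    for state, length in zip(STATE_ORDER, segment_lengths(dataset)):
        states.extend([state] * length)
    return states
```

For example, `breakpoints(5)` gives the cumulative positions at which the state changes in data sets 5-8, and `site_states(13)` gives the state of every site in data sets 13-16.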

Branch lengths

You can find the definition of the branch-length vectors by typing

MATLAB> help Tree4

at the MATLAB prompt.

Data sets                  Branch-length vector
1-2, 5-6, 9-10, 13-14      [0.1 0.1 0.1 0.1 0.2]
3-4, 7-8, 11-12, 15-16     [0.05 0.1 0.2 0.1 0.1]

Time of the recombination event

For even-numbered data sets, the recombination event happened after the sequences had evolved along 70% of the external branches; for odd-numbered data sets, it happened after 80% of the external branches.
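
The branch-length assignment and the even/odd timing rule can be summarised as a lookup on the data set number. The following Python sketch is only an illustration with hypothetical function names. It returns the vectors exactly as listed above and does not interpret what the individual entries mean, which is documented in the help text of Tree4.

```python
# Branch-length vectors copied from the table above.
BRANCHES_A = [0.1, 0.1, 0.1, 0.1, 0.2]   # data sets 1-2, 5-6, 9-10, 13-14
BRANCHES_B = [0.05, 0.1, 0.2, 0.1, 0.1]  # data sets 3-4, 7-8, 11-12, 15-16


def _check(dataset):
    """Reject data set numbers outside 1..16."""
    if not 1 <= dataset <= 16:
        raise ValueError("data set number must be between 1 and 16")


def branch_lengths(dataset):
    """Return the branch-length vector for data set number 1..16."""
    _check(dataset)
    # Data sets come in consecutive pairs (1-2, 3-4, ...); the pairs
    # alternate between the two vectors, starting with BRANCHES_A.
    pair_index = (dataset - 1) // 2
    return BRANCHES_A if pair_index % 2 == 0 else BRANCHES_B


def event_fraction(dataset):
    """Return the fraction of the external branches evolved before the event."""
    _check(dataset)
    # Even-numbered data sets: 70%; odd-numbered data sets: 80%.
    return 0.7 if dataset % 2 == 0 else 0.8
```

For example, `branch_lengths(7)` returns the second vector and `event_fraction(7)` returns 0.8, since data set 7 is odd and belongs to the pair 7-8.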


Last modified: Mon May 22 13:51:27 BST