Data sets

Synthetic Data

8 taxa, 5000 nucleotides, 5 different regions, breakpoints: 1000, 2000, 3000, 4000
Null  No segmentation                             1-5000
True  True segmentation                           1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000
1     Only left segment correct                   1-1000, 1001-2000, 2001-5000
2     Only right segment correct                  1-3000, 3001-4000, 4001-5000
3     Subdividing recombinant regions             1-1000, 1001-1500, 1501-2000, 2001-3000, 3001-3500, 3501-4000, 4001-5000
4     Subdividing non-recombinant regions         1-500, 501-1000, 1001-2000, 2001-2500, 2501-3000, 3001-4000, 4001-4500, 4501-5000
5     Complete subdivision                        1-500, 501-1000, 1001-1500, 1501-2000, 2001-2500, 2501-3000, 3001-3500, 3501-4000, 4001-4500, 4501-5000
6     True segmentation with slight misplacement  1-1010, 1011-2010, 2011-3010, 3011-4010, 4011-5000

Segmentations of synthetic DNA sequence alignment
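
The relation between a breakpoint list and its segmentation can be sketched in a few lines of Python. This is only an illustration, assuming each listed breakpoint is the last site of a segment and that positions are 1-based and inclusive (a convention read off the true segmentations, not stated on this page). The function names are hypothetical.

```python
def segments_from_breakpoints(length, breakpoints):
    """Return 1-based inclusive (start, end) segments.

    Assumption: each breakpoint is the last site of its segment.
    """
    segments = []
    start = 1
    for bp in sorted(breakpoints):
        segments.append((start, bp))
        start = bp + 1
    segments.append((start, length))
    return segments


def is_valid_segmentation(length, segments):
    """Check that the segments tile 1..length without gaps or overlaps."""
    expected_start = 1
    for start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


# True segmentation of the synthetic alignment (5000 nucleotides).
synthetic_true = segments_from_breakpoints(5000, [1000, 2000, 3000, 4000])
```

The same check applies to any row of the table, for example the slightly misplaced segmentation with segment ends at 1010, 2010, 3010 and 4010.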

Dengue Virus

7 taxa, 2295 nucleotides, 2 different regions, breakpoint: 1146
Null  No segmentation                                1-2295
True  True segmentation                              1-1146, 1147-2295
1     Subdivision of left region                     1-573, 574-1146, 1147-2295
2     Subdivision of right region                    1-1146, 1147-1721, 1722-2295
3     Subdivision of both regions                    1-573, 574-1146, 1147-1721, 1722-2295
4     Correct segmentation with slight misplacement  1-1156, 1157-2295

Neisseria

8 taxa, 787 nucleotides, 4 different regions, breakpoints: 201, 507, 537.
Null  No segmentation                           1-787
True  True segmentation                         1-201, 202-506, 507-537, 538-787
1     Only left segment resolved                1-201, 202-506, 507-787
2     Only right segment resolved               1-506, 507-537, 538-787
3     Subdividing left non-recombinant region   1-201, 202-350, 351-506, 507-537, 538-787
4     Subdividing right non-recombinant region  1-201, 202-506, 507-537, 538-662, 663-787
5     Subdividing both non-recombinant regions  1-201, 202-350, 351-506, 507-537, 538-662, 663-787

Hepatitis B

10 taxa, 3049 nucleotides, 5 different regions, breakpoints: 603, 1882, 2071, 2238.
Null  No segmentation                     1-3049
True  True segmentation                   1-603, 604-1882, 1883-2071, 2072-2238, 2239-3049
1     Merging of the first three regions  1-2071, 2072-2238, 2239-3049
2     Merging of the first two regions    1-1882, 1883-2071, 2072-2238, 2239-3049
3     Merging of the last three regions   1-603, 604-1882, 1883-3049
4     Merging of the last two regions     1-603, 604-1882, 1883-2071, 2072-3049
5     Subdividing region 604-1882         1-603, 604-1243, 1244-1882, 1883-2071, 2072-2238, 2239-3049
6     Subdividing region 2238-end         1-603, 604-1882, 1883-2071, 2072-2238, 2239-2644, 2645-3049
7     Subdividing both regions            1-603, 604-1243, 1244-1882, 1883-2071, 2072-2238, 2239-2644, 2645-3049

Segmentations of real-world DNA sequence alignments
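
The merged and subdivided segmentations can be derived mechanically from a true segmentation. The following Python sketch shows one way to do so, using the Hepatitis B rows as an example. The helper names and the list-of-tuples representation are assumptions made for illustration, and the split position is passed in explicitly rather than computed.

```python
def split_segment(segments, index, at):
    """Split segments[index] into (start, at) and (at + 1, end)."""
    start, end = segments[index]
    if not (start <= at < end):
        raise ValueError("split position must lie inside the segment")
    return segments[:index] + [(start, at), (at + 1, end)] + segments[index + 1:]


def merge_segments(segments, first, last):
    """Merge segments[first] through segments[last] (inclusive) into one."""
    if not (0 <= first <= last < len(segments)):
        raise ValueError("invalid segment index range")
    merged = (segments[first][0], segments[last][1])
    return segments[:first] + [merged] + segments[last + 1:]


# True segmentation of the Hepatitis B alignment (3049 nucleotides).
hepb_true = [(1, 603), (604, 1882), (1883, 2071), (2072, 2238), (2239, 3049)]

hepb_1 = merge_segments(hepb_true, 0, 2)    # merging of the first three regions
hepb_3 = merge_segments(hepb_true, 2, 4)    # merging of the last three regions
hepb_5 = split_segment(hepb_true, 1, 1243)  # subdividing region 604-1882
```

Both helpers return new lists and leave the input untouched, so several alternative segmentations can be built from the same true segmentation.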


Last modified: March 2001