COMPARISON OF EXCEPTIONAL WORDS IN TWO SEQUENCES
Comparing the exceptional words of two sequences is one way to compare the sequences themselves. In particular, it lets us compare the signatures of different organisms in terms of word composition.
For instance, we have compared the exceptionality of 3-words under M1, using the Gaussian approximation, in two different phages, Lambda and T7. Their whole genomes were studied: 48,502 bases and 39,936 bases respectively.
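As a rough illustration of how such scores can be obtained, here is a minimal Python sketch. It assumes that M1 denotes a first-order Markov model and uses a common plug-in estimate for the expected count and variance of a 3-word; the source does not say which exact formulas were used, and the function names `count_words` and `m1_scores` are hypothetical.

```python
from collections import Counter
from math import sqrt


def count_words(seq, k):
    """Count overlapping occurrences of every k-letter word in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def m1_scores(seq):
    """Approximate Gaussian score of each observed 3-word under M1.

    Assumption: for a word abc, the expected count is estimated as
    N(ab) * N(bc) / N(b), with a plug-in variance for this case.
    """
    n1, n2, n3 = (count_words(seq, k) for k in (1, 2, 3))
    scores = {}
    for word, observed in n3.items():
        left = n2[word[:2]]   # N(ab)
        right = n2[word[1:]]  # N(bc)
        mid = n1[word[1]]     # N(b)
        expected = left * right / mid
        variance = expected * (mid - left) * (mid - right) / mid ** 2
        if variance > 0:      # skip degenerate words with no variability
            scores[word] = (observed - expected) / sqrt(variance)
    return scores
```

A large positive score suggests over-representation and a large negative score under-representation; the sketch skips words whose estimated variance is zero and only scores words that occur at least once.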
We can note from the above picture that:
Some words are exceptional in both phages but in opposite directions: CAG, CGG and AAA are over-represented in Lambda and under-represented in T7, whereas CAA is under-represented in Lambda and over-represented in T7.
Some words appear to be specific to a single phage: CTT, TTG and CTA are exceptional only in Lambda, whereas AAG, TGG and GGT are exceptional only in T7.
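The two kinds of observation above can be sketched as a simple classification over per-word scores. This is only an illustration: the function name `compare_exceptional`, the group labels, and the threshold of 3.0 are assumptions, not values taken from the study.

```python
def compare_exceptional(scores_a, scores_b, threshold=3.0):
    """Group words by how their exceptionality differs between two sequences.

    scores_a and scores_b map each word to its score in sequence A and B.
    The threshold is an assumed cut-off for calling a word exceptional.
    """
    def status(z):
        if z >= threshold:
            return "over"
        if z <= -threshold:
            return "under"
        return "normal"

    groups = {"opposite": [], "only_a": [], "only_b": [], "same": []}
    # Only words scored in both sequences can be compared.
    for word in sorted(set(scores_a) & set(scores_b)):
        a, b = status(scores_a[word]), status(scores_b[word])
        if a == "normal" and b == "normal":
            continue                         # not exceptional anywhere
        if a == b:
            groups["same"].append(word)      # same direction in both
        elif a == "normal":
            groups["only_b"].append(word)    # exceptional only in B
        elif b == "normal":
            groups["only_a"].append(word)    # exceptional only in A
        else:
            groups["opposite"].append(word)  # over in one, under in the other
    return groups
```

With Lambda as sequence A and T7 as sequence B, words such as CAG would be expected in the "opposite" group and words such as CTT in "only_a", provided the chosen threshold matches the one behind the picture.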
Finding words with unexpected frequencies in DNA sequences. 11.9.98 Page: 16 of 21