PERIODIC STRUCTURE OF THE WORDS

When the last letters of a word W are identical to its first letters, two occurrences of W may overlap in the sequence. For instance, ACTTGAC starts and ends with AC, so two occurrences of ACTTGAC may overlap, as in ACTTGACTTGAC. Such a word is said to be periodic.
The lag between two overlapping occurrences of W is called a period of W. In the above example, 5 is a period of ACTTGAC.
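The definition above can be checked mechanically. The following is a minimal sketch in Python, assuming a word is a plain string; the function name `periods` is illustrative and does not come from the original text.

```python
def periods(word: str) -> list[int]:
    """Return every lag p (0 < p < len(word)) at which word can overlap itself."""
    n = len(word)
    # p is a period when the last n - p letters equal the first n - p letters.
    return [p for p in range(1, n) if word[p:] == word[:n - p]]


print(periods("ACTTGAC"))         # [5]
# Two occurrences shifted by the period 5 give the overlap shown in the text.
print("ACTTGAC"[:5] + "ACTTGAC")  # ACTTGACTTGAC
```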
When W has a small period p (compared to the word length), the periodic structure of the word is clearly visible:
W = W' W' ... W' W''
W is composed of a "root" W' (its first p letters) repeated several times; the last copy of W', denoted W'', is often truncated.
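The decomposition into a root and a truncated last copy can be sketched as follows. This is an illustration only: the helper name `decompose` and its return format are assumptions, not part of the original text.

```python
def decompose(word: str, p: int) -> tuple[str, int, str]:
    """Split word into (root W', number of full copies of W', truncated copy W'')."""
    n = len(word)
    # Reject lags that are not periods of the word.
    if not (0 < p < n and word[p:] == word[:n - p]):
        raise ValueError(f"{p} is not a period of {word!r}")
    root = word[:p]
    copies, rest = divmod(n, p)
    tail = root[:rest]  # empty when p divides the word length
    # Sanity check: the pieces must rebuild the word exactly.
    assert root * copies + tail == word
    return root, copies, tail


print(decompose("AACAACAACAA", 3))  # ('AAC', 3, 'AA')
print(decompose("ACTTGAC", 5))      # ('ACTTG', 1, 'AC')
```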
This structure makes it clear that multiples of the smallest period are periods themselves, as long as they are smaller than the word length. For instance, the minimal period of AACAACAACAA is 3; 6 and 9 are trivial periods. Note that 10 is also a period of this word, because the word starts and ends with A.
A period is said to be principal if it is not a strict multiple of the minimal period; the minimal period itself is therefore principal.
The only principal periods of AACAACAACAA are 3 and 10.
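As a check on this example, the principal periods can be computed directly. The sketch below assumes that "principal" means the minimal period together with every period that is not a multiple of it, which matches the example; the function name `principal_periods` is hypothetical.

```python
def principal_periods(word: str) -> list[int]:
    """Return the periods of word that are not strict multiples of the minimal one."""
    n = len(word)
    # All periods: lags p at which the word overlaps itself.
    all_periods = [p for p in range(1, n) if word[p:] == word[:n - p]]
    if not all_periods:
        return []  # the word is not periodic
    p0 = all_periods[0]  # minimal period
    # Keep the minimal period and every period it does not divide.
    return [p for p in all_periods if p == p0 or p % p0 != 0]


print(principal_periods("AACAACAACAA"))  # [3, 10]
```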
Finding words with unexpected frequencies in DNA sequences. 11.9.98 Page: 7 of 21