Cowan, R. (1991)
Expected frequencies of DNA patterns using Whittle's formula.
J. Appl. Prob. 28 886-892.
Prum B., Rodolphe F. and Turckheim É de.
(1995).
Finding Words with Unexpected Frequencies in Deoxyribonucleic Acid Sequences,
J. R. Statist. Soc. B, 57 205-220.
Schbath, S. (1995),
Compound Poisson approximation of word counts in DNA sequences,
ESAIM: Prob. Stat., 1 1-16.
Schbath, S. (1995).
Étude asymptotique du nombre d'occurrences d'un
mot dans une chaîne de Markov et application à la
recherche de mots de fréquence exceptionnelle dans
les séquences d'ADN,
Thèse de l'Université René Descartes, Paris V.
Schbath, S.
(1997).
An efficient statistic to detect over- and under-represented words
in DNA sequences.
J. Comp. Biol., 4 189-192.
Chen-Stein method
Arratia, R., Goldstein, L. and Gordon, L.
(1989).
Two moments suffice for Poisson approximations: the Chen-Stein
method.
Ann. Prob. 17 9-25.
Arratia, R., Goldstein, L. and Gordon, L.
(1990).
Poisson approximation and the Chen-Stein method.
Statistical Science.
5 403-434.
Combinatorics on words
Guibas, L. J. and Odlyzko, A. M.
(1981).
Periods in strings.
J. Combinatorial Theory A.
30 19-42.
Lothaire, M.
(1983).
Combinatorics on words.
Addison-Wesley.
Application
Leung, M. Y., Marsh, G. M. and Speed, T. P.
(1996).
Over and underrepresentation of short DNA words in herpesvirus
genomes.
J. Comp. Biol. 3 345-360.
Schbath, S., Prum, B. and Turckheim É de.
(1995).
Exceptional motifs in different Markov chain models for a statistical
analysis of DNA sequences.
J. Comp. Biol., 2 417-437.
Finding words with unexpected frequencies in DNA sequences. 11.9.98 Page: References