Summary
There are several algorithms designed for searches for homologous sequences (Fitch 1966; Needleman and Wunsch 1970; Chva'tal and Sankoff 1975; Griggs 1977; Sannkoff 1972; Smith and Waterman 1981; Smith et al. 1981, Wagner and Fisher 1974; Waterman et al. 1976). This paper presents some very simple and useful high speed, “text editing” algorithms that search for exact nucleotide sequence repetition and genome duplication. The last algorithm suggested here is specifically adapted for the 4-letter alphabet of nucleotide sequences. Owing to the rapid accumulation of nucleotide sequences and the frequent need to search for sequence repetition or where a given set of nucleotides occurs in long sequences, efficient algorithms of this type are a necessity.
Similar content being viewed by others
References
Boyer RS, Moore JS (1977) A fast string searching algorithm. Communications of Association for Computer Machinery (ACM) 20:762–772
Chva'tal V Sankoff D (1975) Longest common subsequences of two random sequences. J Appli Probability 12:306–315
Fitch WM (1966) An improved method of testing for evolutionary homology. J Mol Biol 16:9–16
Griggs J (1977) Symmetric chain orders, Sperner theorems and loop-matchings. Ph.D. Thesis, M.I.T.
Knuth DE, Morris JH Jr, Pratt VA (1974) fast pattern matching in strings, Computer Science Dept. Report TR CS-74-440. Stanford University
Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443–453
Nussinov R (1977) Secondary structure analysis of nucleic acids. Ph.D. thesis, Rutgers University
Nussinov R, Pieczenik G, Griggs J, Kleitman D (1978) Algorithms for loop matchings. Soc Ind Appl Math (C) J Appl Math 35:68–82
Sankoff D (1972) Matching sequences under deletion/insertion constraints Proc Natl Acad Sci USA 69:4–6
Smith TF, Waterman MS (1981) Identification of common molecular subsequences J Mol Biol 147:195–197
Smith TF, Waterman MS, Fitch WH (1981) Comparative biosequence metrics. J Mol Evol 18:38–46
Wagner RA, Fischer, MJ (1974) String to string correlation problem. J. Association for Computer Machinery (ACM) 21:168–173
Waterman MS, Smith TF, Beyer WA (1976) Some biological sequence metrics. Adv Math 20:367–387
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Nussinov, R. Efficient algorithms for searching for exact repetition of nucleotide sequences. J Mol Evol 19, 283–285 (1983). https://doi.org/10.1007/BF02099975
Received:
Revised:
Issue Date:
DOI: https://doi.org/10.1007/BF02099975