Skip to main content
Log in

Efficient algorithms for searching for exact repetition of nucleotide sequences

  • Original Articles
  • Published:
Journal of Molecular Evolution Aims and scope Submit manuscript

Summary

There are several algorithms designed for searches for homologous sequences (Fitch 1966; Needleman and Wunsch 1970; Chva'tal and Sankoff 1975; Griggs 1977; Sannkoff 1972; Smith and Waterman 1981; Smith et al. 1981, Wagner and Fisher 1974; Waterman et al. 1976). This paper presents some very simple and useful high speed, “text editing” algorithms that search for exact nucleotide sequence repetition and genome duplication. The last algorithm suggested here is specifically adapted for the 4-letter alphabet of nucleotide sequences. Owing to the rapid accumulation of nucleotide sequences and the frequent need to search for sequence repetition or where a given set of nucleotides occurs in long sequences, efficient algorithms of this type are a necessity.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Boyer RS, Moore JS (1977) A fast string searching algorithm. Communications of Association for Computer Machinery (ACM) 20:762–772

    Google Scholar 

  • Chva'tal V Sankoff D (1975) Longest common subsequences of two random sequences. J Appli Probability 12:306–315

    Google Scholar 

  • Fitch WM (1966) An improved method of testing for evolutionary homology. J Mol Biol 16:9–16

    PubMed  Google Scholar 

  • Griggs J (1977) Symmetric chain orders, Sperner theorems and loop-matchings. Ph.D. Thesis, M.I.T.

  • Knuth DE, Morris JH Jr, Pratt VA (1974) fast pattern matching in strings, Computer Science Dept. Report TR CS-74-440. Stanford University

  • Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443–453

    Article  PubMed  Google Scholar 

  • Nussinov R (1977) Secondary structure analysis of nucleic acids. Ph.D. thesis, Rutgers University

  • Nussinov R, Pieczenik G, Griggs J, Kleitman D (1978) Algorithms for loop matchings. Soc Ind Appl Math (C) J Appl Math 35:68–82

    Article  Google Scholar 

  • Sankoff D (1972) Matching sequences under deletion/insertion constraints Proc Natl Acad Sci USA 69:4–6

    PubMed  Google Scholar 

  • Smith TF, Waterman MS (1981) Identification of common molecular subsequences J Mol Biol 147:195–197

    Article  PubMed  Google Scholar 

  • Smith TF, Waterman MS, Fitch WH (1981) Comparative biosequence metrics. J Mol Evol 18:38–46

    Article  PubMed  Google Scholar 

  • Wagner RA, Fischer, MJ (1974) String to string correlation problem. J. Association for Computer Machinery (ACM) 21:168–173

    Google Scholar 

  • Waterman MS, Smith TF, Beyer WA (1976) Some biological sequence metrics. Adv Math 20:367–387

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Nussinov, R. Efficient algorithms for searching for exact repetition of nucleotide sequences. J Mol Evol 19, 283–285 (1983). https://doi.org/10.1007/BF02099975

Download citation

  • Received:

  • Revised:

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF02099975

Key words

Navigation