A method for determining the position and size of optimal sequence regions for phylogenetic analysis

Journal of Molecular Evolution

Abstract

The availability of fast and accurate sequencing procedures, together with the use of PCR, has led to a proliferation of studies of molecular variability in populations. Nevertheless, it is often impractical to examine long genomic stretches and a large number of individuals at the same time. To optimize this kind of study, we suggest a heuristic procedure for detecting the shortest region whose informative content can be considered sufficient for significant phylogenetic reconstruction. The method uses a simple index to compare the pairwise genetic distances obtained from a set of reference sequences with those obtained for windows of variable size and position. We also present an approach for testing whether the informative content of the stretches selected in this way differs significantly from that of the larger genomic regions used as reference. Application of this test to the VP1 protein gene of foot-and-mouth disease virus type C allowed us to define optimal stretches whose informative content is not significantly different from that of the complete VP1 sequence. The predictions made for type C sequences also held for type O sequences, indicating that the results of the procedure are consistent.
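The window comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say which index or distance measure the authors use, so this example makes two assumptions: uncorrected p-distances as the genetic distance, and the Pearson correlation between window distances and full-length distances as a stand-in index. The function names, the `threshold` value, and the toy alignment are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(seqs, start=0, end=None):
    """p-distances for every pair of sequences, restricted to columns [start, end)."""
    return [p_distance(s[start:end], t[start:end]) for s, t in combinations(seqs, 2)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either list has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def shortest_informative_window(seqs, threshold=0.95, min_size=2):
    """Scan windows from shortest to longest and return (start, size, score)
    for the first size whose best-scoring window reaches the threshold.
    The threshold is an assumed cut-off, not a value from the paper."""
    length = len(seqs[0])
    reference = pairwise_distances(seqs)  # distances over the full alignment
    for size in range(min_size, length + 1):
        best_score, best_start = -1.0, None
        for start in range(length - size + 1):
            window = pairwise_distances(seqs, start, start + size)
            score = pearson(reference, window)
            if score > best_score:
                best_score, best_start = score, start
        if best_score >= threshold:
            return best_start, size, best_score
    return None


# Toy alignment: all variation sits in columns 2-4.
toy = ["AAAAAAAA", "AACAAAAA", "AACCAAAA", "AACCCAAA"]
result = shortest_informative_window(toy)
print(result[:2])  # start and size of the selected window
```

On the toy alignment the scan settles on the three-column window starting at column 2, because that window reproduces the full-length distance pattern exactly. Real data would call for a corrected distance measure and for the significance test the abstract mentions, neither of which this sketch attempts.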



Author information

Correspondence to: J. Dopazo

About this article

Cite this article

Martin, M.J., González-Candelas, F., Sobrino, F. et al. A method for determining the position and size of optimal sequence regions for phylogenetic analysis. J Mol Evol 41, 1128–1138 (1995). https://doi.org/10.1007/BF00173194

  • DOI: https://doi.org/10.1007/BF00173194
