Abstract
The availability of fast and accurate sequencing procedures, together with the use of PCR, has led to a proliferation of studies of molecular variability in populations. Nevertheless, it is often impractical to examine long genomic stretches and a large number of individuals at the same time. To optimize this kind of study, we suggest a heuristic procedure for detecting the shortest region whose informative content can be considered sufficient for a reliable phylogenetic reconstruction. The method uses a simple index to compare the pairwise genetic distances obtained from a set of reference sequences with those obtained from windows of variable size and position. We also present an approach for testing whether the informative content of the stretches selected in this way differs significantly from that of the larger genomic regions used as reference. Applying this test to the VP1 protein gene of foot-and-mouth disease virus type C allowed us to define optimal stretches whose informative content is not significantly different from that of the complete VP1 sequence. The predictions made for type C sequences also held for type O sequences, indicating that the results of the procedure are consistent.
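The window scan described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which distance measure or index was used, so the sketch assumes uncorrected p-distances and a Pearson correlation between the reference and window distance vectors. The function names and the 0.9 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (assumed measure)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(seqs):
    """Distances for all sequence pairs, in a fixed pair order."""
    return [p_distance(a, b) for a, b in combinations(seqs, 2)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def scan_windows(seqs, sizes, step=1):
    """Return (size, start, index) for every window of the given sizes.

    The index (an assumption) is the correlation between the distances from
    the full reference alignment and the distances from the window alone.
    """
    ref = pairwise_distances(seqs)
    length = len(seqs[0])
    results = []
    for size in sizes:
        for start in range(0, length - size + 1, step):
            window = [s[start:start + size] for s in seqs]
            results.append((size, start, pearson(ref, pairwise_distances(window))))
    return results


def shortest_window(seqs, sizes, threshold=0.9, step=1):
    """Best-scoring window of the smallest size whose index reaches the threshold."""
    for size in sorted(sizes):
        hits = [r for r in scan_windows(seqs, [size], step) if r[2] >= threshold]
        if hits:
            return max(hits, key=lambda r: r[2])
    return None


# Toy alignment for illustration only; these are not real VP1 sequences.
toy = ["AAAAAAAA", "AAAATAAA", "AAAATTAA", "CAAATTGA"]
best = shortest_window(toy, sizes=[2, 4, 8])
```

Here `best` holds the size, start position, and index of the selected window. Testing whether such a window differs significantly from the reference region, as the abstract goes on to describe, is a separate step that this sketch does not cover.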
Additional information
Correspondence to: J. Dopazo
Cite this article
Martin, M.J., González-Candelas, F., Sobrino, F. et al. A method for determining the position and size of optimal sequence regions for phylogenetic analysis. J Mol Evol 41, 1128–1138 (1995). https://doi.org/10.1007/BF00173194
DOI: https://doi.org/10.1007/BF00173194