ISSN:
1432-1432
Keywords:
Short sequence distribution; Sequence constraints; Averaged sequence; Sequence structure; Asymmetric nucleotide sequences; GC content; Evolution; Evolutionary constraints
Source:
Springer Online Journal Archives 1860-2000
Topics:
Biology
Notes:
Summary: The data from a genomic library can be sorted into the frequencies of every possible tetranucleotide in the sequence. This tabulation, a short sequence distribution, contains the frequency of occurrence of the 256 tetranucleotides and thus seems to serve as a vehicle for averaging sequence information. Two such distributions can be readily compared by correlation. Reported here are correlations (Spearman r_s) of the distributions from all of the genomic libraries in GenBank 44.0 with sizes equal to or larger than that of Salmonella typhimurium, except for the data for mouse and humans. All of the organisms examined showed highly significant correlations between the two DNA strands (not the complementarity expected from base pairing). Of 155 comparisons between libraries, 132 showed significant correlations at the 99% confidence level. Application of the correlation coefficients as a similarity matrix clustered most organisms in a phenogram in a pattern consistent with other hypotheses. This suggests a highly conserved pattern underlying all other genetic information in cellular DNA and affecting both DNA strands, perhaps caused by interaction with conserved factors necessary for DNA packaging.
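Illustration (not part of the original record): the tabulation and comparison described in the summary can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' procedure. It assumes overlapping 4-base windows, skips windows containing characters other than A, C, G, T, represents the second strand by the reverse complement read 5' to 3', and computes the Spearman coefficient as the Pearson correlation of tie-averaged ranks. All function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"
# All 4**4 = 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product(BASES, repeat=4)]


def tetranucleotide_counts(seq):
    """Count every tetranucleotide in seq (assumption: overlapping windows)."""
    counts = dict.fromkeys(TETRAMERS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:  # skip windows with ambiguous bases such as N
            counts[word] += 1
    return [counts[t] for t in TETRAMERS]


def reverse_complement(seq):
    """Return the opposite strand read 5' to 3' (assumed strand comparison)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant distribution")
    return cov / (vx * vy) ** 0.5
```

Under these assumptions, a strand-to-strand comparison for a sequence `s` would be `spearman(tetranucleotide_counts(s), tetranucleotide_counts(reverse_complement(s)))`, and a library-to-library comparison would pass the count vectors of two different sequences.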
Type of Medium:
Electronic Resource
URL:
http://dx.doi.org/10.1007/BF02099925