ISSN:
0887-3585
Keywords:
DNA binding proteins; maximum likelihood; CRP; finite mixtures; transcription regulation; Chemistry; Biochemistry and Biotechnology
Source:
Wiley InterScience Backfile Collection 1832-2000
Topics:
Medicine
Notes:
Statistical methodology for the identification and characterization of protein binding sites in a set of unaligned DNA fragments is presented. Each sequence must contain at least one common site. No alignment of the sites is required. Instead, the uncertainty in the location of the sites is handled by employing the missing information principle to develop an “expectation maximization” (EM) algorithm. This approach allows for the simultaneous identification of the sites and characterization of the binding motifs. The reliability of the algorithm increases with the number of fragments, but the computations increase only linearly. The method is illustrated with an example, using known cyclic adenosine monophosphate receptor protein (CRP) binding sites. The final motif is utilized in a search for undiscovered CRP binding sites.
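The idea of treating the unknown site positions as missing data can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes exactly one site per sequence, a fixed motif width, uppercase A/C/G/T input with every sequence at least as long as the motif, a background model estimated once from all sequences, and a random starting motif. The names `em_motif`, `width`, `n_iter`, `seed`, and `pseudo` are hypothetical and chosen only for this illustration.

```python
import math
import random

ALPHABET = "ACGT"

def em_motif(seqs, width, n_iter=50, seed=0, pseudo=0.5):
    """EM sketch: one site of fixed width per sequence (an assumption)."""
    rng = random.Random(seed)

    # Background base frequencies, estimated once and held fixed (assumption).
    counts = {b: pseudo for b in ALPHABET}
    for s in seqs:
        for ch in s:
            counts[ch] += 1
    total = sum(counts.values())
    bg = {b: counts[b] / total for b in ALPHABET}

    # Random starting motif: one base distribution per motif column.
    motif = []
    for _ in range(width):
        w = [rng.random() + 0.1 for _ in ALPHABET]
        t = sum(w)
        motif.append({b: x / t for b, x in zip(ALPHABET, w)})

    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of each start position in each
        # sequence, from the motif-versus-background likelihood ratio.
        post = []
        for s in seqs:
            n_pos = len(s) - width + 1
            logw = [
                sum(math.log(motif[j][s[i + j]] / bg[s[i + j]])
                    for j in range(width))
                for i in range(n_pos)
            ]
            m = max(logw)  # subtract the maximum for numerical stability
            w = [math.exp(x - m) for x in logw]
            t = sum(w)
            post.append([x / t for x in w])

        # M-step: re-estimate motif columns from posterior-weighted counts.
        new = [{b: pseudo for b in ALPHABET} for _ in range(width)]
        for s, p in zip(seqs, post):
            for i, p_i in enumerate(p):
                for j in range(width):
                    new[j][s[i + j]] += p_i
        motif = [
            {b: col[b] / sum(col.values()) for b in ALPHABET}
            for col in new
        ]

    return motif, post
```

Each iteration visits every candidate start position in every sequence once, so the work in this sketch grows linearly with the number of fragments, in line with the scaling described above. Like any EM procedure it can converge to a local optimum, so results may depend on the starting motif.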
Additional Material:
3 Ill.
Type of Medium:
Electronic Resource
URL:
http://dx.doi.org/10.1002/prot.340070105