Abstract
This paper presents a case study of a machine-aided knowledge discovery process within the general area of drug design. Within drug design, the particular problem of pharmacophore discovery is isolated, and the Inductive Logic Programming (ILP) system PROGOL is applied to the problem of identifying potential pharmacophores for angiotensin-converting enzyme (ACE) inhibition. The case study supports four general lessons for machine learning and knowledge discovery, as well as more specific lessons for pharmacophore discovery, for Inductive Logic Programming, and for ACE inhibition. The general lessons for machine learning and knowledge discovery are as follows.
1. An initial rediscovery step is a useful tool when approaching a new application domain.
2. General machine learning heuristics may fail to match the details of an application domain, but it may be possible to successfully apply a heuristic-based algorithm in spite of the mismatch.
3. A complete search for all plausible hypotheses can provide useful information to a user, although experimentation may be required to choose between competing hypotheses.
4. A declarative knowledge representation facilitates the development and debugging of background knowledge in collaboration with a domain expert, as well as the communication of final results.
Cite this article
Finn, P., Muggleton, S., Page, D. et al. Pharmacophore Discovery Using the Inductive Logic Programming System PROGOL. Machine Learning 30, 241–270 (1998). https://doi.org/10.1023/A:1007460424845