Abstract
Recent work by Mingers and by Buntine and Niblett on the performance of various attribute selection measures has addressed the topic of random selection of attributes in the construction of decision trees. This article is concerned with the mechanisms underlying the relative performance of conventional and random attribute selection measures. The three experiments reported here employed synthetic data sets, constructed so as to have the precise properties required to test specific hypotheses. The principal underlying idea was that the performance decrement typical of random attribute selection is due to two factors. First, there is a greater chance that informative attributes will be omitted from the subset selected for the final tree. Second, there is a greater risk of overfitting, caused when attributes of little or no value in discriminating between classes are "locked in" to the tree structure near the root. The first experiment showed that the performance decrement increased with the number of available pure-noise attributes. The second experiment indicated that there was little decrement when all the attributes were of equal importance in discriminating between classes. The third experiment showed that a rather greater performance decrement (than in the second experiment) could be expected if the attributes were all informative, but to different degrees.
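To make the mechanism concrete, the following is a minimal sketch, not the authors' experimental setup: the data generator, tree-growing routine, and parameters (make_data, grow, depth=6, the 2-informative/8-noise attribute mix) are invented here for illustration. It compares information-gain and random attribute selection on synthetic binary data containing pure-noise attributes; under these assumptions, random selection tends to push informative attributes out of the tree and lock noise attributes in near the root, typically producing the lower test accuracy described above.

```python
import random
from collections import Counter
from math import log2

def make_data(n, n_inf=2, n_noise=8, flip=0.1, rng=random):
    """Binary attributes; the class depends only on the first n_inf attributes."""
    rows = []
    for _ in range(n):
        x = tuple(rng.randint(0, 1) for _ in range(n_inf + n_noise))
        y = int(sum(x[:n_inf]) >= 1)      # informative attributes drive the class
        if rng.random() < flip:           # label noise
            y = 1 - y
        rows.append((x, y))
    return rows

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, a):
    """Reduction in class entropy from splitting on attribute a."""
    before = entropy([y for _, y in rows])
    after = 0.0
    for v in (0, 1):
        sub = [y for x, y in rows if x[a] == v]
        if sub:
            after += len(sub) / len(rows) * entropy(sub)
    return before - after

def grow(rows, attrs, depth, select):
    """Grow a tree to a fixed depth; `select` is the attribute selection measure."""
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if not attrs or depth == 0 or len(set(labels)) == 1:
        return majority                   # leaf: majority class
    a = select(rows, attrs)
    children = {}
    for v in (0, 1):
        sub = [(x, y) for x, y in rows if x[a] == v]
        children[v] = grow(sub, [b for b in attrs if b != a],
                           depth - 1, select) if sub else majority
    return (a, children)

def classify(tree, x):
    while isinstance(tree, tuple):
        a, children = tree
        tree = children[x[a]]
    return tree

def accuracy(tree, rows):
    return sum(classify(tree, x) == y for x, y in rows) / len(rows)

rng = random.Random(0)
train, test = make_data(200, rng=rng), make_data(2000, rng=rng)
attrs = list(range(10))

gain_select = lambda rows, attrs: max(attrs, key=lambda a: info_gain(rows, a))
rand_select = lambda rows, attrs: rng.choice(attrs)

for name, sel in [("information gain", gain_select), ("random", rand_select)]:
    acc = accuracy(grow(train, attrs, depth=6, select=sel), test)
    print(f"{name:16s} test accuracy: {acc:.3f}")
```

The fixed depth cap stands in, loosely, for the pruning regimes examined by Mingers: with random selection, the two informative attributes are more likely to fall outside the first six levels altogether, which is the omission effect the abstract identifies as the first factor.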
References
Breiman, L., Friedman, J.H., Olshen, R.A., & Stone, C.J. (1984). Classification and regression trees. Monterey, CA: Wadsworth.
Buntine, W., & Niblett, T. (1992). A further comparison of splitting rules for decision-tree induction. Machine Learning, 8, 75–86.
Keppel, G. (1973). Design and analysis: A researcher's handbook. Englewood Cliffs, NJ: Prentice-Hall.
Kononenko, I., Bratko, I., & Roskar, E. (1984). Experiments in automatic learning of medical diagnostic rules (Technical Report). Ljubljana, Yugoslavia: Jozef Stefan Institute.
Liu, W.Z., & White, A.P. (1991). A review of inductive learning. In I.M. Graham & R.W. Milne (Eds.), Research and development in expert systems VIII. Cambridge: Cambridge University Press.
Mingers, J. (1989a). An empirical comparison of selection measures for decision-tree induction. Machine Learning, 3, 319–342.
Mingers, J. (1989b). An empirical comparison of pruning methods for decision-tree induction. Machine Learning, 4, 227–243.
Niblett, T., & Bratko, I. (1987). Learning decision rules in noisy domains. In M.A. Bramer (Ed.), Research and development in expert systems III. Cambridge: Cambridge University Press.
Quinlan, J.R. (1986). Induction of decision trees. Machine Learning, 1, 81–106.
Quinlan, J.R. (1988). Decision trees and multi-valued attributes. Machine Intelligence, 11, 305–318.
Stevens, S.S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
White, A.P. (1985). PREDICTOR: An alternative approach to uncertain inference in expert systems. In Proceedings of the Ninth International Joint Conference on Artificial Intelligence (pp. 328–330). Los Altos, CA: Morgan Kaufmann.
White, A.P. (1987). Probabilistic induction by dynamic path generation in virtual trees. In M.A. Bramer (Ed.), Research and development in expert systems III. Cambridge: Cambridge University Press.
White, A.P., & Liu, W.Z. (1990). Probabilistic induction by dynamic path generation for continuous variables. In T.R. Addis & R.M. Muir (Eds.), Research and development in expert systems VII. Cambridge: Cambridge University Press.
Cite this article
Liu, W., White, A. The Importance of Attribute Selection Measures in Decision Tree Induction. Machine Learning 15, 25–41 (1994). https://doi.org/10.1023/A:1022609119415