This talk will be about some recent work that makes progress on this more elusive goal. I will describe results that give performance guarantees for two of the most widely used and empirically successful heuristics in machine learning: the decision tree learning algorithms C4.5 (Quinlan 1993) and CART (Breiman et al. 1984). The main result is a proof that these algorithms are in fact *boosting* algorithms. By this we mean that if the functions used to label the internal nodes of the decision tree are weakly correlated with the unknown target function, then the top-down algorithms we study will amplify this weak correlation to build a tree achieving any desired level of accuracy. The bounds we obtain for this amplification show an interesting dependence on the *splitting criterion* used by the algorithm.
One nice aspect of these results is that they directly lead to some interesting improvements and experiments. In particular, the theoretical analysis suggests that a new splitting criterion may lead to smaller trees, and we confirm this experimentally using data sets from the UC Irvine repository. Furthermore, by giving performance bounds for the top-down algorithms in the well-studied boosting model, we permit direct comparison with algorithms developed especially for that model. We investigate this line of thought by experimentally comparing C4.5 and the AdaBoost algorithm (Freund and Schapire 1995), and find that their differences of behavior are nicely explained by the so-called "advantage sequence" of the formal analysis.
This talk will assume no prior knowledge of the theory or practice of machine learning.
HOSTS: Professor E. Grimson and Professor R. Rivest
|
Modified: Jun 24, 1997
|
Current events
|
Your comments
and inquiries are welcome.