Discretization Methods in Supervised Learning
D.A. ZIGHED, S. RABASEDA, R. RAKOTOMALALA, F. FESCHET
E.R.I.C_Lyon, Université Lumière Lyon 2
5, avenue P. Mendès-France 69676 Bron Cedex
Tel: 33-4-78-77-24-14 Fax: 33-4-78-77-23-75
In supervised learning, we often need a data mining method capable of handling both numeric and symbolic data. Since most methods deal only with symbolic data, numeric attributes must be transformed into symbolic ones. This process is called discretization. It consists of cutting the domain of the numeric attribute into a finite number of intervals and coding each interval with a distinct value. The numeric attribute thus becomes a symbolic attribute. Among the available methods, we focus here on those that use the classes of the examples to discretize an attribute. In this context, there are two families of methods: on the one hand, hill-climbing methods, which follow either a Top-Down or a Bottom-Up strategy, and on the other hand, methods based on Fischer's optimal algorithm. Top-Down methods recursively find a discretization point and build a binary partition by splitting the considered population at this point. At each step a finer partition is obtained. These methods rely on criteria such as the Chi2 statistic, Information Gain, an uncertainty measure, the Minimum Description Length Principle Cut (MDLPC), and CONTRAST. Bottom-Up methods such as Chi-Merge or FUSINTER start with the finest possible partition and try to merge intervals. The paper reviews these techniques and proposes an extension of Fischer's algorithm. To evaluate the discretization methods considered, we use Breiman's waves dataset and compare the prediction rates obtained. We also discuss the complexity of the different heuristics relative to that of Fischer's algorithm, and the gain in prediction reliability obtained with the latter strategy.
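As an illustration of the Top-Down strategy described above, the following is a minimal Python sketch, not the implementation of any method reviewed in the paper. It uses Information Gain as the splitting criterion and stops when the best gain falls below a threshold. The function names (`entropy`, `best_cut`, `top_down_cuts`, `encode`) and the `min_gain` stopping threshold are assumptions introduced for this example; methods such as MDLPC use their own stopping rules.

```python
import bisect
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut_point, information_gain) for the best binary split, or None."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    ys = [y for _, y in pairs]
    base = entropy(ys)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary can be placed between equal values
        gain = base - (i / n) * entropy(ys[:i]) - ((n - i) / n) * entropy(ys[i:])
        if best is None or gain > best[1]:
            # Candidate cut: midpoint between two consecutive distinct values.
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, gain)
    return best


def top_down_cuts(values, labels, min_gain=0.1):
    """Recursively split while the best cut gains at least min_gain bits.

    min_gain is a hypothetical stopping threshold chosen for this sketch.
    """
    found = best_cut(values, labels)
    if found is None or found[1] < min_gain:
        return []
    cut, _ = found
    left = [(v, y) for v, y in zip(values, labels) if v < cut]
    right = [(v, y) for v, y in zip(values, labels) if v >= cut]
    return (top_down_cuts(*zip(*left), min_gain)
            + [cut]
            + top_down_cuts(*zip(*right), min_gain))


def encode(values, cuts):
    """Code each numeric value by the index of the interval it falls into."""
    return [bisect.bisect_right(cuts, v) for v in values]


values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 21.0, 22.0]
labels = ["a", "a", "a", "b", "b", "b", "a", "a", "a"]
cuts = top_down_cuts(values, labels)
codes = encode(values, cuts)
```

On this toy data the sketch finds two discretization points, so the numeric attribute is recoded into three symbolic values. A Bottom-Up method would instead start from one interval per distinct value and merge adjacent intervals.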