Publications of Jérôme Darmont
F. Bentayeb, J. Darmont, C. Favre, C. Udréa, "Efficient On-Line Mining of Large Databases", International Journal of Business Information Systems, Vol. 2, No. 3, 2007, 328-350.
Considerable effort has gone into applying data mining algorithms to large databases. However, long processing times remain a practical issue. This paper presents a framework that offers database users on-line operators for mining large databases without size limit, in acceptable processing times. First, we integrate decision tree algorithms directly into database management systems. We are thus limited only by disc capacity and not by main memory. However, disc accesses still induce long response times. Hence, in a second step, we propose two optimisations: reducing the size of the learning database by building its corresponding contingency table, and reducing the number of database accesses by exploiting bitmap indices. Thus, the various decision tree-based methods we implemented within Oracle operate on contingency tables or bitmap indices rather than on the whole training set. Our experiments show the efficiency of our integrated methods.
Databases, On-line data mining, Decision trees, Performance, Relational views, Contingency tables, Bitmap indices
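The contingency-table idea in the abstract can be illustrated with a minimal sketch. It is not the paper's Oracle implementation: SQLite stands in for the DBMS, the `training` table, its columns, and the toy rows are hypothetical, and information gain is assumed as the split criterion (the abstract does not name one). The point is only that the split measure can be computed from aggregated counts in a relational view rather than from the raw training rows.

```python
import math
import sqlite3
from collections import defaultdict

# Hypothetical toy training set: (outlook, windy, play) rows.
ROWS = [
    ("sunny", "no", "no"),
    ("sunny", "yes", "no"),
    ("overcast", "no", "yes"),
    ("rain", "no", "yes"),
    ("rain", "yes", "no"),
    ("overcast", "yes", "yes"),
    ("sunny", "no", "no"),
    ("rain", "no", "yes"),
]


def build_contingency(conn):
    """Load the training set and expose its contingency table as a view."""
    conn.execute("CREATE TABLE training (outlook TEXT, windy TEXT, play TEXT)")
    conn.executemany("INSERT INTO training VALUES (?, ?, ?)", ROWS)
    # One row per distinct combination of values, with its frequency n.
    conn.execute(
        "CREATE VIEW contingency AS "
        "SELECT outlook, windy, play, COUNT(*) AS n "
        "FROM training GROUP BY outlook, windy, play"
    )


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def information_gain(conn, attribute):
    """Information gain of `attribute` w.r.t. the class `play`,
    computed only from the contingency view.

    `attribute` is interpolated into the SQL text, so it must be a
    trusted column name (assumption of this sketch).
    """
    class_counts = [
        n for (n,) in conn.execute("SELECT SUM(n) FROM contingency GROUP BY play")
    ]
    total = sum(class_counts)
    per_value = defaultdict(list)
    query = (
        f"SELECT {attribute}, SUM(n) FROM contingency "
        f"GROUP BY {attribute}, play"
    )
    for value, n in conn.execute(query):
        per_value[value].append(n)
    # Weighted entropy of the partitions induced by the attribute.
    remainder = sum(sum(c) / total * entropy(c) for c in per_value.values())
    return entropy(class_counts) - remainder
```

In this sketch a tree builder would call `information_gain` for each candidate attribute and split on the highest value; every such call reads the aggregated view, whose size depends on the number of distinct value combinations rather than on the number of training rows.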