Comparison of data mining tools

The main idea is to show how several free software packages are used, and how they behave, across various data processing scenarios. We observe that these packages have much in common.

We mainly use the following free packages: ORANGE, SIPINA, TANAGRA and WEKA.

Tutorials

Each entry below gives the problem, the components used, the tutorial and the dataset.
Teaching data mining with free tools

Can we use free software in teaching, and can we be satisfied with it?
Several criteria for evaluating data mining tools.

The functionalities of three well-known data mining tools: ORANGE, TANAGRA and WEKA (ERIC Lab seminar, Dec. 12, 2005).

Tutorial: slides (in French)
Dataset: -
ROC curve with ORANGE, TANAGRA and WEKA.

In this tutorial, we show the similarities and differences between these packages on a particular task: computing a ROC curve from a logistic regression.

Components used: Dataset, Sampling, Define Status, Logistic regression, ROC Curve
Dataset: ds1_10
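The three tools compute the ROC curve through their graphical interfaces. As a rough, tool-independent illustration of what that computation involves, here is a minimal Python sketch. The scores and labels are made up; they stand in for the probabilities a fitted logistic regression would produce, and the function names `roc_curve_points` and `auc` are our own, not components of ORANGE, TANAGRA or WEKA.

```python
def roc_curve_points(scores, labels):
    """Return (false positive rate, true positive rate) points
    obtained by lowering the decision threshold step by step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Rank the cases from the highest to the lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last case of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up scores P(positive) and true labels (1 = positive), NOT taken from ds1_10.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_curve_points(scores, labels)
print(points)
print(auc(points))  # 0.8125 for these made-up values
```

The area under the curve equals the proportion of (positive, negative) pairs in which the positive case receives the higher score: 13 pairs out of 16 in this toy example.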
Building a decision tree with ORANGE, TANAGRA and WEKA.

In this tutorial, we show: (1) how to build a decision tree from a dataset; (2) how to estimate its error rate using a cross-validation process.

Components used: Dataset, Define Status, C-RT Classification tree, Cross-validation
Dataset: heart
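To make the cross-validation step concrete, here is a minimal Python sketch of k-fold error estimation. It is not the C-RT component: as a stand-in we use a one-split tree (a decision stump), and the data are made up rather than taken from the heart dataset. The function names are hypothetical.

```python
import random


def fit_stump(rows):
    """Fit a one-split tree: pick the (feature, threshold) pair with the
    fewest training errors. Assumes at least one feature takes two
    distinct values in `rows`."""
    best = None
    for j in range(len(rows[0][0])):
        values = sorted({x[j] for x, _ in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for x, y in rows if x[j] <= threshold]
            right = [y for x, y in rows if x[j] > threshold]
            # Each leaf predicts its majority class.
            left_label = max(set(left), key=left.count)
            right_label = max(set(right), key=right.count)
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, j, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, x):
    j, threshold, left_label, right_label = stump
    return left_label if x[j] <= threshold else right_label


def cross_validation_error(rows, k=5, seed=0):
    """Estimate the error rate: each fold is used once as the test part,
    and the model is refitted on the remaining k - 1 folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    errors = 0
    for i, test_fold in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        stump = fit_stump(train)
        errors += sum(predict_stump(stump, x) != y for x, y in test_fold)
    return errors / len(rows)


# Made-up data: the first feature separates the classes by a wide gap,
# the second feature is uninformative.
rows = ([([float(i), float(i % 3)], 0) for i in range(10)]
        + [([float(i), float(i % 3)], 1) for i in range(20, 30)])

cv_error = cross_validation_error(rows, k=5)
print(cv_error)  # 0.0 here, because the two made-up classes are far apart
```

The point of the procedure is that every case is predicted by a model that never saw it during fitting, so the estimate is not biased the way the resubstitution error is.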
Comparison of supervised methods using a predefined test set (with ORANGE, TANAGRA and WEKA).

In this tutorial, we show: (1) how to build several prediction models with the same training set; (2) how to estimate their error rate using a predefined test set.

Components used: Dataset, Define Status, C-RT Classification tree, Logistic regression, SVM (Linear)
Dataset: breast tow
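The protocol itself (fit every model on the same training set, then measure every model on the same test set) can be sketched in a few lines of Python. The two learners below, a majority-class baseline and a 1-nearest-neighbour rule, are deliberately simple stand-ins, not the C-RT, logistic regression or SVM components, and the data are made up.

```python
from collections import Counter


def fit_majority(train):
    """Baseline: always predict the most frequent training class."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def fit_nearest_neighbour(train):
    """1-nearest-neighbour with squared Euclidean distance."""
    def predict(x):
        nearest = min(train,
                      key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
        return nearest[1]
    return predict


def error_rate(predict, rows):
    """Proportion of misclassified cases in `rows`."""
    return sum(predict(x) != y for x, y in rows) / len(rows)


# Made-up training and test sets, fixed in advance.
train = [([1.0, 1.0], 0), ([1.5, 2.0], 0), ([2.0, 1.0], 0),
         ([6.0, 5.0], 1), ([7.0, 7.0], 1)]
test = [([1.2, 1.4], 0), ([6.5, 6.0], 1), ([5.5, 6.5], 1), ([2.2, 0.8], 0)]

learners = {"majority class": fit_majority, "1-NN": fit_nearest_neighbour}
# Every learner sees the same training set and is scored on the same test set.
results = {name: error_rate(fit(train), test) for name, fit in learners.items()}
print(results)  # {'majority class': 0.5, '1-NN': 0.0}
```

Because the test set is fixed, the error rates are directly comparable across models, which is what makes this kind of comparison between tools meaningful.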
Interactive Tree Builder (ORANGE and SIPINA)

In this tutorial, we show how to build a decision tree interactively with ORANGE and SIPINA.

Dataset: iris
Computing association rules (ORANGE, TANAGRA and WEKA).

In this tutorial, we show how to compute association rules using various implementations of Agrawal's A PRIORI algorithm.

Components used: Dataset, Define Status, A PRIORI
Dataset: vote
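For readers who want to see the principle behind the A PRIORI components, here is a minimal Python sketch of the level-wise search for frequent itemsets, followed by rule generation. It is a textbook-style illustration under our own naming, not the code of any of the three tools, and the transactions are made up rather than taken from the vote dataset.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise A PRIORI search; returns {itemset: support}."""
    n = len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    size = 1
    while candidates:
        level = {}
        for c in candidates:
            support = sum(c <= t for t in transactions) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        size += 1
        # Join frequent itemsets, then prune any candidate that has an
        # infrequent subset (the A PRIORI property).
        joined = {a | b for a in level for b in level if len(a | b) == size}
        candidates = {c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, size - 1))}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent),
                                  support, confidence))
    return rules


# Made-up transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

frequent = frequent_itemsets(transactions, min_support=0.6)
rules = association_rules(frequent, min_confidence=0.7)
for antecedent, consequent, support, confidence in rules:
    print(antecedent, "->", consequent, round(support, 2), round(confidence, 2))
```

With these made-up transactions, the three single items and the three pairs are frequent, while the triple (support 0.4) is discarded. Implementations differ mainly in how they count supports and in which rule-quality measures they report.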
Building a neural network (SIPINA, TANAGRA and WEKA).

In this tutorial, we show how to parametrize, train and evaluate a multilayer perceptron.

Components used: Dataset, Define Status, Multilayer perceptron
Dataset: ionosphere
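The three steps (set the parameters, train, evaluate) can be illustrated with a minimal multilayer perceptron written in plain Python. This is only a sketch under our own assumptions: one hidden layer, sigmoid units, squared error, stochastic gradient descent, and a made-up four-case dataset instead of ionosphere. The class `TinyMLP` and its parameter names are hypothetical and do not mirror the options of SIPINA, TANAGRA or WEKA.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer of sigmoid units and one sigmoid output unit."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one unit; the last entry is its bias.
        self.hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                       for _ in range(n_hidden)]
        self.output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(unit, x)) + unit[-1])
             for unit in self.hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.output, h)) + self.output[-1])
        return h, out

    def train(self, rows, epochs=2000, rate=0.5):
        for _ in range(epochs):
            for x, y in rows:
                h, out = self.forward(x)
                # Gradient of 0.5 * (out - y)^2 w.r.t. the output unit's input.
                delta_out = (out - y) * out * (1.0 - out)
                # Back-propagate before the output weights are updated.
                delta_hidden = [delta_out * self.output[j] * h[j] * (1.0 - h[j])
                                for j in range(len(h))]
                for j in range(len(h)):
                    self.output[j] -= rate * delta_out * h[j]
                self.output[-1] -= rate * delta_out
                for j, unit in enumerate(self.hidden):
                    for i in range(len(x)):
                        unit[i] -= rate * delta_hidden[j] * x[i]
                    unit[-1] -= rate * delta_hidden[j]

    def predict(self, x):
        return int(self.forward(x)[1] >= 0.5)


def mean_squared_error(model, rows):
    return sum((model.forward(x)[1] - y) ** 2 for x, y in rows) / len(rows)


# Made-up data: the logical OR of two binary inputs.
rows = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

# Parameters: number of hidden units, number of epochs, learning rate.
model = TinyMLP(n_inputs=2, n_hidden=3, seed=0)
loss_before = mean_squared_error(model, rows)
model.train(rows, epochs=2000, rate=0.5)
loss_after = mean_squared_error(model, rows)
train_error = sum(model.predict(x) != y for x, y in rows) / len(rows)
print(loss_before, loss_after, train_error)
```

The error rate printed here is measured on the training cases only. For an honest estimate, the network should be evaluated on cases set aside before training, as in the cross-validation and test-set tutorials above.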


Ricco Rakotomalala.