Tutorials - Supervised Learning & Scoring

Subject Components Tutorial Dataset
Decision Tree - ID3
Predict breast cancer from cell characteristics.
Dataset
Define Status
Spv Learning (Meta Spv)
ID3
breast
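ID3 grows a tree by repeatedly splitting on the discrete attribute with the largest information gain. The sketch below illustrates that textbook criterion only; the toy rows, attribute names and function names are hypothetical and are not taken from the breast dataset or from the ID3 component itself.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one discrete attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: two discrete attributes, one class label per row.
rows = [
    {"size": "small", "shape": "regular"},
    {"size": "small", "shape": "irregular"},
    {"size": "large", "shape": "regular"},
    {"size": "large", "shape": "irregular"},
]
labels = ["benign", "benign", "malignant", "malignant"]

gains = {name: information_gain(rows, labels, name) for name in ("size", "shape")}
best = max(gains, key=gains.get)  # attribute chosen for the root split
print(best, gains)
```

In this toy example "size" separates the classes perfectly while "shape" carries no information, so the root split falls on "size".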
Discretization & Naive Bayes Classifier
Supervised discretization.
Dataset
Define Status
MDLPC
Naive Bayes
breast
Feature Selection
Correlation-based feature selection for supervised learning.
Dataset
Define Status
MDLPC
MIFS
iris
Classification on a new dataset
Apply a classifier to a new dataset.
Dataset
Select examples
Define Status
C-RT
View dataset
Export dataset
datasets
LIFT Curve
Targeting potential customers [SCORING].
(CoIL Challenge -- 2000).
Scoring
Lift
Spv Learning
tic data
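A lift curve for targeting ranks individuals by decreasing score and tracks the share of positives captured as a larger share of the population is contacted. The sketch below shows this in its cumulative form; the scores and targets are made-up values, not the tic data, and the function name is an assumption for illustration.

```python
def gains_curve(scores, targets):
    """Return (fraction of population targeted, fraction of positives captured)."""
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(targets)
    points = [(0.0, 0.0)]
    captured = 0
    for rank, (_, target) in enumerate(ranked, start=1):
        captured += target
        points.append((rank / len(ranked), captured / total_positives))
    return points


# Hypothetical scores (estimated probability of being a customer) and true targets.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
targets = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

curve = gains_curve(scores, targets)
for fraction, captured in curve:
    print(f"{fraction:.0%} targeted -> {captured:.0%} of positives")
```

Here contacting the top 40% of the ranked list already reaches every positive, which is the kind of reading a lift curve is meant to support.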
ROC Curve
Computing ROC Graphs for classifier comparison.
Scoring
Roc
Spv Learning
heart
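Comparing classifiers with ROC graphs comes down to computing, for each classifier, the true and false positive rates at every score threshold and then comparing the curves or their areas. A minimal sketch follows, assuming binary 0/1 targets and no tied scores; the two score vectors are hypothetical and are not results on the heart dataset.

```python
def roc_points(scores, targets):
    """ROC points (false positive rate, true positive rate), one per threshold."""
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(targets)
    n_neg = len(targets) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, target in ranked:
        if target == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores from two classifiers on the same six examples.
targets = [1, 1, 0, 1, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
scores_b = [0.9, 0.3, 0.7, 0.6, 0.5, 0.4]

auc_a = auc(roc_points(scores_a, targets))
auc_b = auc(roc_points(scores_b, targets))
print(auc_a, auc_b)
```

The classifier with the larger area ranks positives ahead of negatives more often; in this toy case that is the first one.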
Use a predefined test set
Compare several supervised learning algorithms on a user predefined test set.
Spv Learning
Test
sonar data
Resampling Error Rate Estimate
Compare supervised learning algorithms using resubstitution and cross-validation error rate estimates.
Spv Learning
ID3 and K-NN
Cross-Validation
heart
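Resubstitution measures the error on the same examples used for training, whereas cross-validation holds out each fold in turn and tests on it. The sketch below contrasts the two with a 1-nearest-neighbour rule standing in for the K-NN component; the one-dimensional data and the helper names are hypothetical.

```python
def nearest_label(train, x):
    """1-NN prediction: label of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def error_rate(train, test):
    """Fraction of test examples misclassified by 1-NN trained on `train`."""
    wrong = sum(1 for x, y in test if nearest_label(train, x) != y)
    return wrong / len(test)


def cross_validation_error(data, k):
    """k-fold cross-validation error rate, folds taken by striding."""
    folds = [data[i::k] for i in range(k)]
    wrong = 0
    for i, fold in enumerate(folds):
        train = [p for j, other in enumerate(folds) if j != i for p in other]
        wrong += sum(1 for x, y in fold if nearest_label(train, x) != y)
    return wrong / len(data)


# Hypothetical data: (feature value, class label), with overlapping classes.
data = [(1.0, "a"), (2.1, "a"), (3.3, "b"), (4.6, "a"), (5.0, "b"), (6.5, "b")]

resubstitution = error_rate(data, data)
cross_validated = cross_validation_error(data, k=3)
print(resubstitution, cross_validated)
```

With 1-NN every training point is its own nearest neighbour, so the resubstitution error is zero by construction, while the cross-validation estimate is clearly higher. This is why resubstitution is regarded as optimistic.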
ID3 and big dataset
Supervised learning on a big dataset: COVTYPE (581102 examples); all attributes are discrete (discretized).
ID3
Supervised Learning
covtype
Feature Construction -- SVD
NIPALS, a fast SVD or PCA algorithm, useful for high-dimensional datasets. Application to a protein classification problem.
NIPALS
Spv Learning
K-NN
Bootstrap
dataset
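NIPALS extracts components one at a time by alternating between a loadings update and a scores update until the scores stop changing. The sketch below computes the first component only, in plain Python, assuming the columns of X are already centered; the small matrix and the function name are hypothetical, and this is not the NIPALS component's own code.

```python
from math import sqrt


def nipals_first_component(X, tol=1e-12, max_iter=500):
    """First principal component of a column-centered matrix X (list of rows).

    Returns (scores t, loadings p) with p scaled to unit length.
    """
    n_rows, n_cols = len(X), len(X[0])
    # Start from the column with the largest sum of squares.
    start = max(range(n_cols), key=lambda j: sum(row[j] ** 2 for row in X))
    t = [row[start] for row in X]
    p = [0.0] * n_cols
    for _ in range(max_iter):
        tt = sum(v * v for v in t)
        # Loadings: p = X' t / (t' t), then normalize to unit length.
        p = [sum(X[i][j] * t[i] for i in range(n_rows)) / tt for j in range(n_cols)]
        norm = sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        # Scores: t = X p.
        t_new = [sum(x * w for x, w in zip(row, p)) for row in X]
        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change < tol:
            break
    return t, p


# Hypothetical centered data: most of the variance lies along the (1, 1) direction.
X = [[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]]
scores, loadings = nipals_first_component(X)
print(loadings, sum(v * v for v in scores))
```

Further components are usually obtained by deflating X (subtracting the outer product of the scores and loadings) and repeating the loop. Only the leading components are ever computed, which is why the method suits high-dimensional data.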
SVM
SVM -- Support Vector Machine. A supervised learning algorithm that is well suited to high-dimensional problems.

Implements John C. Platt's sequential minimal optimization algorithm for training a support vector classifier using polynomial or RBF kernels.
References
J. Platt (1998). Fast Training of Support Vector Machines using Sequential Minimal Optimization. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, and A. Smola, eds., MIT Press.
S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy (2001). Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation, 13(3), pp. 637-649.
Note: This is a port of the WEKA implementation (SMO.JAVA, ver. 3-4).
SVM
Spv Learning
Bootstrap
sonar
Classification Trees and Decision Lists
Compare decision list and decision tree algorithms on the HEART dataset. The two methods give similar results.
MDLPC
Decision List
Spv Learning
Bootstrap
heart
SVM for classification task
C-SVC, a very efficient implementation of a multi-class SVM from the LIBSVM library.
SVM
Spv Learning
Bootstrap
protein classification
Random Forest
Supervised learning with Breiman's Random Forest.
BAGGING
Random Tree
heart
STEPDISC
Stepwise Discriminant Analysis. Feature selection for Linear Discriminant Analysis.
Stepdisc
Linear Discriminant Analysis
sonar
FORWARD/BACKWARD LOGIT
Variable selection for binary logistic regression.
Forward-logit
Backward-logit
Scoring
Lift curve
Binary logistic regression
bank
MULTINOMIAL LOGISTIC REGRESSION
Multinomial logistic regression (or polytomous logistic regression, for a nominal dependent variable).
Multinomial Logistic Regression
brand
Partial Least Squares Discriminant Analysis
Using the PLS regression principle for classification tasks.
C-PLS
PLS-DA
PLS-LDA
breast


Ricco Rakotomalala.