Tutorials - Data Manipulation

Subject | Components | Tutorial | Dataset |

An EXCEL add-in that automatically prepares and transfers a dataset to TANAGRA.
This approach is an alternative to software that embeds the specification menu and the reports in the spreadsheet, such as XLMINER or XLSTAT. Our add-in is compatible with all EXCEL versions from 97 to 2007. |
dataset | ||

An add-on that transfers a dataset from OOoCalc (Open Office Calc) to TANAGRA. |
View Dataset, C-RT, Cross-validation |
breast dataset | |

Import tab separated text file
Build a dataset in an EXCEL spreadsheet, save it in text file format (tab separator), and import it into TANAGRA. |
Data importation wizard, Dataset |
weather | |
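The tab-separated layout mentioned above can be sketched in a few lines of Python. This is only an illustration of the file format, not part of TANAGRA: the helper names `to_tab_separated` and `from_tab_separated` are hypothetical, the sample values are made up (they are not the actual weather dataset), and the sketch assumes that the first line carries the variable names.

```python
import csv
import io


def to_tab_separated(header, rows):
    """Serialize a header line and data rows as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)   # assumed convention: first line = variable names
    writer.writerows(rows)    # one line per example
    return buffer.getvalue()


def from_tab_separated(text):
    """Parse tab-separated text back into (header, rows); all cells are strings."""
    table = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return table[0], table[1:]


# Made-up values, for illustration only.
header = ["outlook", "temperature", "play"]
rows = [["sunny", 85, "no"], ["overcast", 83, "yes"]]

text = to_tab_separated(header, rows)
parsed_header, parsed_rows = from_tab_separated(text)
```

Reading the text back returns every cell as a string, so numeric columns have to be converted by whatever tool imports the file.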

Descriptive Statistics
Some descriptive statistics on a dataset. |
Data importation wizard, Dataset, Define Status, Univariate continuous stat, Univariate discrete stat, Group characterization |
breast | |

Feature Construction -- Discretization
Supervised discretization. |
Dataset, Define Status, MDLPC, Naive Bayes |
breast | |

Feature Construction -- Dummy coding of categorical attributes
Transform a categorical attribute into a set of binary variables. |
Dataset, Define Status, 0_1_BINARIZE, Logistic Regression, Linear Discriminant analysis |
c.heart | |
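As a rough illustration of dummy coding, the sketch below turns one categorical attribute into 0/1 indicator columns. It is not the implementation of the 0_1_BINARIZE component: the function name `dummy_code`, the `drop_first` option, the column naming scheme and the sample values are all assumptions made for this example.

```python
def dummy_code(name, values, drop_first=False):
    """Return a dict mapping '<name>_<level>' to a list of 0/1 indicators.

    With drop_first=True the first level (in sorted order) is omitted, so
    k levels yield k-1 columns, which avoids perfectly collinear columns
    when the model also contains an intercept.
    """
    levels = sorted(set(values))
    if drop_first:
        levels = levels[1:]
    return {
        f"{name}_{level}": [1 if value == level else 0 for value in values]
        for level in levels
    }


# Hypothetical attribute, for illustration only.
colors = ["red", "green", "red", "blue"]

full = dummy_code("color", colors)
reduced = dummy_code("color", colors, drop_first=True)
```

Here `full` holds one column per level (`color_blue`, `color_green`, `color_red`), while `reduced` omits `color_blue`, which then acts as the reference level.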

EXCEL File Format
Handle a spreadsheet dataset (EXCEL 97 & 2000). |
Dataset, Define Status, Group characterization |
adult | |

WEKA File Format
Handle the WEKA (.ARFF) file format. Use DATANAMORF if you want a wider range of options for handling missing data. |
Dataset |
sick | |

Big Dataset
Supervised learning with a big dataset -- COVTYPE (581102 examples); all attributes are discrete (discretized). |
ID3, Supervised Learning |
covtype | |

Save/Load a sub-diagram
Since version 1.4.8, we can save and load a part of the diagram. We can thus apply the same analysis to various datasets. |
Supervised Learning |
vote & zoo | |

Univariate detection of outliers
Tanagra 1.4.24 implements a new component that detects outliers on individual variables. Our reference is the NIST website. |
More Univariate Cont Stat, Univariate Continuous Stat, Univariate Outlier Detection |
body mass index |
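One common univariate rule, the box-plot inner fences (values beyond 1.5 times the interquartile range from the quartiles), gives an idea of what such a detection looks like. The sketch below is only an example of that rule: whether the Tanagra component applies exactly this rule, and with which quartile definition, is an assumption. The function name `find_outliers` and the sample values are hypothetical and are not taken from the body mass index dataset.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return the values lying outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles (Python 3.8+) returns the three quartile cut points;
    # its default "exclusive" method is one of several quartile definitions.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up measurements, for illustration only.
sample = [18.5, 21.0, 22.4, 23.1, 24.8, 25.3, 26.0, 27.2, 45.0]
flagged = find_outliers(sample)
```

With this sample only the value 45.0 falls outside the fences. A different quartile definition can shift the fences slightly, so borderline values may be flagged differently by other tools.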

Ricco Rakotomalala.