PKDD 2000 Conference:
Papers and Presentations



Contents


  1. Long Papers of the Conference
  2. Short Papers / Posters of the Conference
  3. Workshops
  4. Tutorials
  5. Invited Talks
  6. Discovery Challenge
  7. Business Meeting




Long Papers


Zighed, D.A., Université Lyon 2, Bron, France
Komorowski, J., Norwegian University of Science and Technology, Trondheim, Norway
Zytkow, J., University of North Carolina, Charlotte, NC, USA
(Eds.)
Principles of Data Mining and Knowledge Discovery
4th European Conference, PKDD, 2000, Lyon, France, September 13-16, 2000 Proceedings

2000. XV, 701 pp.
3-540-41066-X

Recommended price: 146 DM - FF 550 - £ 50,50
Springer-Verlag Publisher

 

 

S1A: Towards broader foundations

Arno J. KNOBBE, Arno SIEBES, Hendrik BLOCKEEL, and Daniël van der WALLEN
Multi-Relational Data Mining, Using UML for ILP

Akihiro INOKUCHI, Takashi WASHIO, and Hiroshi MOTODA
An Apriori-based Algorithm for Mining Frequent Substructures from Graph Data

Maurice BERNADET
Basis of a Fuzzy Knowledge Discovery System

 

S1B: Rules and trees

Dragan GAMBERGER, and Nada LAVRAC
Confirmation rule sets

Marc SEBBAN, and Richard NOCK
Contribution of Dataset Reduction Techniques to Tree-Simplification and Knowledge Discovery

Ljupco TODOROVSKI, and Saso DZEROSKI
Combining Multiple Models with Meta Decision Trees



S2A: Databases and reward-based learning

Tadeusz MORZY, Marek WOJCIECHOWSKI, and Maciej ZAKRZEWICZ
Materialized Data Mining Views

Jean-François BOULICAUT, Arthur BYKOWSKI, and Christophe RIGOTTI
Approximation of frequency queries by means of free-sets

Paul MUNTEANU, and Denis CAU
Efficient Score-based Learning of Equivalence Classes of Bayesian Networks

Christophe DRUET, Damien ERNST, and Louis WEHENKEL
Application of Reinforcement Learning to Electrical Power System Closed-Loop Emergency Control



S2B: Classification

Melanie HILARIO, and Alexandros KALOUSIS
Quantifying the Resilience of Inductive Classification Algorithms

Alexey TSYMBAL, and Seppo PUURONEN
Bagging and Boosting with Dynamic Integration of Classifiers

Carlos SOARES, and Pavel B. BRAZDIL
Zoomed Ranking: Selection of Classification Algorithms Based on Relevant Performance Information

Pierre GEURTS
Some enhancements of decision tree bagging



S3A: Association rules and exceptions

Marcus-Christopher LUDL, and Gerhard WIDMER
Relative Unsupervised Discretization for Association Rule Mining

Jochen HIPP, Ulrich GÜNTZER, and Gholamreza NAKHAEIZADEH
Mining Association Rules: Deriving a Superior Algorithm by Analysing Today's Approaches

Einoshin SUZUKI, and Jan ZYTKOW
Unified Algorithm for Undirected Discovery of Exception Rules

Jean-Hugues CHAUCHAT, Ricco RAKOTOMALALA, and Didier ROBERT
Sampling Strategies for targeting rare groups from a bank customer database

 

S3B: Instance-based discovery

Jinyan LI, Guozhu DONG, and Kotagiri RAMAMOHANARAO
Instance-Based Classification by Emerging Patterns

Gautam DAS, and Heikki MANNILA
Context-Based Similarity Measures for Categorical Databases

Ngoc Binh NGUYEN, and Tu Bao HO
A Mixed Similarity Measure in Near-Linear Computational Complexity for Distance-based Methods

Stéphane LALLICH, and Ricco RAKOTOMALALA
Fast feature selection using partial correlation for multi-valued attributes

 

S4A: Clustering and classification

Markus M. BREUNIG, Hans-Peter KRIEGEL, and Jörg SANDER
Fast Hierarchical Clustering Based on Compressed Data and OPTICS

Ljupco TODOROVSKI, Peter FLACH, and Nada LAVRAC
Predictive Performance of Weighted Relative Accuracy

Bin ZHANG, Meichun HSU, and George FORMAN
Accurate Recasting of Parameter Estimation Algorithms using Sufficient Statistics for Efficient Parallel Speed-up Demonstrated for Center-Based Data Clustering Algorithms

Maria HALKIDI, M. VAZIRGIANNIS, and Y. BATISTAKIS
Quality scheme assessment in the clustering process

 

S5A: Time Series

Iztok SAVNIK, Georg LAUSEN, Hans-Peter KAHLE, Heinrich SPIECKER, and Sebastian HEIN
Algorithm for Matching Sets of Time Series

Georg LAUSEN, Iztok SAVNIK, and Aldar DOUGARJAPOV
MSTS: A System for Mining Sets of Time Series

Juan J. RODRIGUEZ, Carlos J. ALONSO, and Henrik BOSTRÖM
Learning First Order Logic Time Series Classifiers: Rules and Boosting

 

Short Papers / Posters



 

 

Sylvia ACID, and Luis M. DE CAMPOS
Learning Right Sized Belief Networks by Means of a Hybrid Methodology

Brock BARBER, and Howard J. HAMILTON
Algorithms for Mining Share Frequent Itemsets Containing Infrequent Subsets

Hilan BENSUSAN, and Christophe GIRAUD-CARRIER
Discovering task neighbourhoods through landmark learning performances

Leon BOBROWSKI, and Marek KRETOWSKI
Induction of Multivariate Decision Trees by using Dipolar Criteria

Sam BREWER, and Tom KHABAZA
Inductive Logic Programming in Clementine

Deborah R. CARVALHO, and Alex A. FREITAS
A genetic algorithm-based solution for the problem of small disjuncts

Antonio CIAMPI, and Yves LECHEVALLIER
Clustering Large, Multi-Level Data Sets: An Approach Based On Kohonen Self Organizing Maps

Antonio CIAMPI, Djamel A. ZIGHED, and Jérémy CLECH
Trees and induction graphs for multivariate response

Richard COLE, Peter EKLUND, and Gerd STUMME
CEM - Visualisation and Discovery in Email

Chabane DJERABA
Image access and data mining: an approach

Nikos DROSSOS, Athanasios PAPAGELIS, and Dimitris KALLES
Decision Tree Toolkit: A Component-based Library of Decision Tree Algorithms

Laurent DURY, Laurence LEHERTE, and Daniel P. VERCAUTEREN
Determination of Screening Descriptors for Chemical Reaction Databases

A. J. FEELDERS
Prior knowledge in economic applications of data mining

Pierre GEURTS, and Louis WEHENKEL
Temporal machine learning for switching control

David GROSSER, Jean DIATTA, and Noël CONRUYT
Improving Dissimilarity Functions with Domain Knowledge, applications with IKBS system

Attila GYENESEI
Mining Weighted Association Rules for Fuzzy Quantitative Items

Eui-Hong (Sam) HAN, and George KARYPIS
Centroid-Based Document Classification: Analysis & Experimental Results

Robert J. HILDERMAN, and Howard J. HAMILTON
Applying Objective Interestingness Measures in Data Mining Systems

Martin HOLENA
Observational Logic Integrates Data Mining Based on Statistics and Neural Networks

Dimitar HRISTOVSKI, Saso DZEROSKI, Borut PETERLIN, and Anamarija ROZIC-HRISTOVSKI
Supporting Discovery in Medicine by Association Rule Mining of Bibliographic Databases

Hillol KARGUPTA, Weiyun HUANG, Krishnamoorthy SIVAKUMAR, Byung-Hoon PARK, and Shuren WANG
Collective Principal Component Analysis from Distributed, Heterogeneous Data

Saori KAWASAKI, Ngoc Binh NGUYEN, and Tu Bao HO
Hierarchical Document Clustering Based on Tolerance Rough Set Model

Jörg KELLER, Valerij BAUER, and Wojciech KWEDLO
Application of Data-Mining and Knowledge Discovery in Automotive Data Engineering

Jan KOMOROWSKI, Torgeir R. HVIDSTEN, Tor-Kristian JENSSEN, Dyre TJELDVOLL, Eivind HOVIG, Arne K. SANVIK, and Astrid LAEGREID
Towards Knowledge Discovery from cDNA Microarray Gene Expression Data

Marzena KRYSZKIEWICZ
Mining with Cover and Extension Operators

Pascale KUNTZ, Fabrice GUILLET, Rémi LEHN, and Henri BRIAND
A User-Driven Process for Mining Association Rules

Carsten LANQUILLON
Learning from Labeled and Unlabeled Documents: A comparative study on semi-supervised text classification

P.A. LAUR, F. MASSEGLIA, and P. PONCELET
Schema Mining: Finding Structural Regularity among Semistructured Data

Bing LIU, Yiming MA, and Ching Kian WONG
Improving an Association Rule Based Classifier

Chung-Leung LUI, and Fu-Lai CHUNG
Discovery of generalized association rules with multiple minimum supports

Adele MARSHALL, Sally McCLEAN, Mary SHAPCOTT, and Peter MILLARD
Learning Dynamic Bayesian Belief Network using Conditional Phase-Type Distributions

José F. MARTINEZ-TRINIDAD, Miriam VELASCO-SANCHEZ, and Edgar E. CONTRERAS-ARAVELO
Discovering differences in patients with uveitis through typical testors by class

F. MASSEGLIA, P. PONCELET, and M. TEISSEIRE
Web Usage Mining: How to Efficiently Manage New Transactions and New Clients

Frédéric MOAL, Teddy TURMEAUX, and Christel VRAIN
Mining Relational Databases

Maybin K. MUYEBA, and John A. KEANE
Interestingness In Attribute-Oriented Induction (AOI): Multiple-level Rule Generation

Takashi OKADA, and Mayumi OYAMA
Discovery of Characteristic Subgraph Patterns using Relative Indexing and the Cascade Model

Gerhard PAASS, and Jorg KINDERMANN
Transparency and Predictive Power. Explaining Complex Classification Models

Srinivasan PARTHASARATHY, and Mitsunori OGIHARA
Clustering Homogeneous Datasets for Distributed Mining

Petra PERNER, and Chid APTE
Empirical Evaluation of Feature Subset Selection based on a Real-World Data Set

Gerard RAMSTEIN, Pascal BUNELLE, and Yannick JACQUES
Discovery of ambiguous patterns in sequences. Application to bioinformatics

Zbigniew W. RAS, and Alicja WIECZORKOWSKA
Action-Rules: how to increase profit of a company

Gilbert RITSCHARD, and Nicolas NICOLOYANNIS
Aggregation and Association in Cross Tables

Céline ROBARDET, Fabien FESCHET, and Nicolas NICOLOYANNIS
An experimental study of partition quality indices in clustering

Fabrice ROSSI, and Frédérick VAUTRAIN
Expert Constrained Clustering: a Symbolic Approach

Ansaf SALLEB, and Christel VRAIN
An Application of Association Rules Discovery to Geographic Information Systems

Dan A. SIMOVICI, Dana CRISTOFOR, and Laurentiu CRISTOFOR
Generalized Entropy and Projection Clustering of Categorical Data

Vojtech SVATEK, and Martin KAVALEC
Supporting Case Acquisition and Labelling in the Context of Web Mining

Pang-Ning TAN, Vipin KUMAR, and Jaideep SRIVASTAVA
Indirect Association: Mining Higher Order Dependencies in Data

Tudor TEUSAN, Gilles NACHOUKI, Henri BRIAND, and Jacques PHILIPPE
Discovering Association Rules in Large, Dense Databases

Peter TSELIOS, Agapios PLATIS, and George VOUROS
Providing advice to Website Designers towards effective websites re-organization

Shusaku TSUMOTO
Clinical Knowledge Discovery in Hospital Information Systems: Two Case Studies

Stijn VIAENE, B. BAESENS, T. VAN GESTEL, J.A.K. SUYKENS, D. VAN DEN POEL, J. VANTHIENEN, D. DE MOOR, and G. DEDENE
Knowledge discovery using Least Squares Support Vector Machine Classifiers: a Direct Marketing Case

Sholom M. WEISS, Brian F. WHITE, and Chidanand V. APTE
Lightweight Document Clustering

Hsin-Chang YANG, and Chung-Hong LEE
Automatic Category Structure Generation and Categorization of Chinese Text Documents

Show-Jane YEN
Mining Generalized Multiple-Level Association Rules

Show-Jane YEN, and Chung-Wen CHO
An Efficient Approach to Discovering Sequential Patterns from Large Databases

Ning ZHONG, Juzhen DONG, and Setsuo OHSUGA
Using Background Knowledge as a Bias to Control the Rule Discovery Process





Workshops


Workshop 1

Data Mining, Decision Support, Meta-learning and ILP:
Forum for Practical Problem
Pavel Brazdil and Alipio Jorge (University of Porto, Portugal)



Meta-Learning

Evaluation of Machine-Learning Algorithm Ranking Advisors
Helmut Berrer, Iain Paterson, Jörg Keller

Meta-Analysis: From Data Characterisation for Meta-Learning to Meta-Regression
Christian Köpf, Charles Taylor, Jörg Keller

Report on the Experiments with Feature Selection in Meta-Level Learning
Ljupco Todorovski, Pavel Brazdil, Carlos Soares

 

Inductive Logic Programming (ILP) for KDD

Towards Practical Inductive Logic Programming
Luc de Raedt

A First-order Representation for Knowledge Discovery and Bayesian Classification on Relational Data
Nicolas Lachiche, Peter A. Flach

Cumulativity as Inductive Bias
Hendrik Blockeel, Luc Dehaspe

Using Inductive Logic Programming to Assist in the Retrieval of Relevant Information from an Electronic Library System
William T. H. Loggie

 

Demo of Data Mining Software

Demo of D-Miner
Dietrich Wettschereck

 

Pre-Processing for KDD

How to Preprocess Large Databases
Regina Zücker, Jörg-Uwe Kietz

The Principal Component Method as a Preprocessing Stage for Decision Tree Learning
Lubos Popelinsky, Pavel Brazdil

 

KDD & Data Mining for Decision Support

KDD in Telecommunications
Filip Zelezny, Petr Miksovsky, Olga Stepankova, Jiri Zidek

Enhancing e-Business Through Web Data Mining
Amy Shi, Allen Long, David Newcomb

 

Advances in Data Mining

A Study on Rule Extraction from Neural Networks Applied to Medical Databases
Guido Bologna

A Data Analysis Approach based on a Neural Networks Data Sets Decomposition and its Hardware Implementation
Kurosh Madani, Abdennasser Chebira

User-Centric Mining of Association Rules
S K Gupta, Vasudha Bhatnagar, S K Wasan

 

Workshop 2
Temporal, Spatial and Spatio-Temporal Data Mining (TSDM2000)
John F. Roddick (Flinders U., South Australia)
and Kathleen Hornsby (U. Maine, USA)

Warning: The papers of this workshop are not available on the PKDD2000 website because the TSDM2000 workshop proceedings will be published by Springer-Verlag.

Discovering Temporal Patterns in Multiple Granularities
Y Li, X.S. Wang, S. Jajodia

Refined Time Stamps for Concept Drift Detection During Mining for Classification Rules
R. Hickey, and M.M. Black

K-Harmonic Means: A Spatial Clustering Algorithm With Boosting
B. Zhang, M. Hsu, and U. Dayal

Identifying Temporal Patterns for Characterization and Prediction of Financial Time Series Events
R. Povinelli

Value Range Queries on Earth Science Data via Histogram Clustering
R. Yang, K.-S. Yang, M. Kafatos, and X.S. Wang

Fast Randomized Algorithms for Robust Estimation of Location
V. Estivill-Castro, and M. Houle

Rough Sets in Spatio-Temporal Data Mining
T. Bittner

Join Indices as a Tool for Spatial Data Mining
K. Zeitouni, L. Yeh, and M-A. Aufaure

Data Mining with Calendar Attributes
H.J. Hamilton, D.J. Randall, and R.J. Hilderman

AUTOCLUST+: Automatic Clustering of Point-Data Sets in the Presence of Obstacles
V. Estivill-Castro and I. Lee

 

Workshop 3
Knowledge Discovery in Biology
Jan Komorowski (Norwegian University of Science and Technology, Trondheim, Norway)


Opportunities for Data Mining and Knowledge Discovery in Genomic Data
Steffen Schulze-Kremer

Data Mining and Knowledge Discovery in the Pharmaceutical Industry: A grand challenge
Magnus L. Andersson

Data Mining in Gene Expression Data Sets
Yudong He

A Methodology for Knowledge Discovery from Gene Expressions
Torgeir Hvidsten

Protein-Protein Interaction Prediction for C. elegans
Nicolas Thierry-Mieg

Knowledge Discovery from a literature network of human genes for gene-expression analysis
Lisa Öberg

Data Enrichment of Molecular Structure: Data Mining within Conformational Analysis
Tobias Galliat

 

Workshop 4
Machine Learning and Textual Information Access
Hugo Zaragoza (LIP6, University Paris 6, France),
Patrick Gallinari (LIP6, University Paris 6, France) and
Martin Rajman (EPFL, Switzerland)

Welcome and Introductions

Learning to Filter Spam E-Mail: A Comparison of a Naive Bayesian and a Memory-Based Approach
I. Androutsopoulos, G. Paliouras, V. Karkaletsis, G. Sakkis, C. D. Spyropoulos and P. Stamatopoulos

Short Document Categorization - Itemsets Method
J. Hynek, K. Jezek and O. J Rohlik

A bootstrapping approach for thematic analysis
O. Ferret and B. Grau

A trainable algorithm for summarizing news stories
J. L. Neto, A. D. Santos, C. A. A. Kaestner, A. A. Freitas, J. C. Nievola

Interactive Learning for Text Summarization
M. R. Amini

Relations between Terms Discovered by Association Rules
M. H. Haddad, J. P. Chevallet, M. F. Bruandet

Multi-class Text Categorization with Error Correcting Codes
J. Kindermann, E. Leopold and G. Paass

Text Classification using String Kernels
H. Lodhi, J. Shawe-Taylor, N. Cristianini, C. Watkins

Information Visualization and Analysis for Knowledge Discovery: Using a Multi Self-Organizing Mapping
C. François, X. Polanco, J.C. Lamirel

Document Classification and Visualization to Support the Investigation of Suspected Fraud
J. Hagman, D. Perrotta, R. Steinberger and A. Varfis

 

Workshop 5
Knowledge Management: Theory and Applications
Jean-Louis Ermine (CEA Paris, France)

Introduction

Contents

Challenges and Approaches for Knowledge Management in Companies
Jean-Louis Ermine

Knowledge Management in PSA Peugeot Citroen for Designing Complex products
Patrick Coustillière

A Knowledge Maturity Model: a First Approach to Knowledge Management
Cécile Decamps, and Jean-Luc Plet

Samanta: A Knowledge Server
Pascal Vandekerckhove, and Fabrice Guillet

Designing knowledge spaces that work for work - the relationship between physical, virtual and psychological spaces for knowledge management
Victoria Ward, and Clive Holtham

Interenterprise cooperative information systems for knowledge management
Imed Boughzala, Manuel Zacklad, and Nada Matta

A multi-agent system using semantic metadata for the cooperation among multiple information sources
Gilles Dubois, and Danielle Boulanger

Scattered Naive Theories: Why the human mind is isomorphic to the internet web
David Leiser

Knowledge Management and Scientific Observation
Thierno Tounkara, and Philippe Benhamou

Relative Measure for Mining Interesting Rules
Farhad Hussain, Huan Liu, and Hongjun Lu

Process Knowledge Management for Activity Support
Pierre Maret

An XML Framework proposal for knowledge discovery in databases
Petr Kotasek, and Jaroslav Zendulka

University knowledge management - issues and prospects
Jaroslava Mikulecká, and Peter Mikulecký

An Approach to Text Mining using Information extraction
H. Karanikas, C. Tjortjis, and B. Theodoulidis

 

Workshop 6
Symbolic Data Analysis:
Theory, Software and Applications for Knowledge Mining
Edwin Diday (University of Paris Dauphine, France)

Preface of PKDD'2000 Workshop on Symbolic Data Analysis
E. Diday and O. Rodriguez

Knowledge discovery from symbolic data and the SODAS software
E. Diday

Symbolic Analysis of Financial Data
F. Goupil, M. Touati, E. Diday and H. Van Der Veen

Generalization of the Principal Components Analysis to Histogram Data
O. Rodriguez, E. Diday and S. Winsberg

Symbolic Representation of Long Time-Series
G. Hebrail and B. Hugueney

Pyramidal Clustering Algorithms in ISO-3D Project
O. Rodriguez and E. Diday

Clustering Large Datasets and Visualizations of Large Hierarchies and Pyramids: Symbolic Data Analysis Approach
V. Batagelj, E. Pavletic, M. Zaveršnik and S. Korenjak-Cerne

Temporal Symbolic Descriptions Graphics in ISO-3D
M. Noirhomme, A. Nahimana and C. Mazel

Marking and Generalization by Symbolic Descriptions in the Symbolic Official Data Analysis Software
M. Gettler-Summa





Tutorials


 

Tutorial 1
An Introduction to Distributed Data Mining
H. Kargupta (Washington State University, USA)

 

Tutorial 2
Clustering Techniques for large Data Sets: from the past to the future
A. Hinneburg and D. A. Keim (University of Halle, Germany)

 

Tutorial 3
Data Analysis for Web Marketing and Merchandizing Applications
Myra Spiliopoulou (Humboldt-Universität zu Berlin, Germany)

 

Tutorial 4
Database Support for Business Intelligence Applications
Wolfgang Lehner (University of Erlangen-Nuremberg, Germany)

 

Tutorial 5
Text Mining
Yves Kodratoff (U. Paris-Sud, France),
Djamel Zighed (U. Lyon 2, France) and
Serge Di Palma (U. Lyon 2, France)

"Computational-linguistics-based" TM
Yves Kodratoff

Application of Sipina to TM
Serge Di Palma

 




Invited talks

 

Invited talk 1: Willi Kloesgen (GMD, Germany)
Multi-relational, statistical, and visualization approaches for spatial knowledge discovery

Invited talk 2: Luc De Raedt (U. Freiburg, Germany)
Data mining in multi-relational databases

Invited talk 3: Arno Siebes (Utrecht U., Netherlands)
Developing KDD Systems




Discovery Challenge

 



Chairs:
Arno Siebes (Utrecht U., Netherlands) and
Petr Berka (University of Economics, Prague, Czech Republic)

Table of Contents

Preface

I. Financial Data

Petr BERKA
Guide to the Financial Data

David COUFAL
Financial data sets analysis - hierarchical testing with GUHA method

Andreas HOTHO, and Alexander MAEDCHE
Efficient Discovery of Client Profiles from a Financial Database

Einoshin SUZUKI
Mining Financial Data with Scheduled Discovery of Exception Rules

 

II. Medical Data

Jan M. ZYTKOW, Shusaku TSUMOTO, and Katsuhiko TAKABAYASHI
Medical (Thrombosis) Data Description

Abraham MEIDAN, Alex CHESKIS, Ohad GEFEN, Boris LEVIN, and Ilya VOROBYOV
The WizWhy analysis of the PKDD 2000 Discovery Challenge Medical Domain

Ahmed Y. TAWFIK, and Krista STRICKLAND
Mining Medical Data for Causal and Temporal Patterns





Business Meeting



Minutes of the PKDD-2000 Business Meeting