EDA 2007: 3èmes journées francophones sur les Entrepôts de Données et l'Analyse en ligne (3rd French-Speaking Conference on Data Warehouses and Online Analysis)

Invited talks

Mukesh Mohania - IBM India Research Lab

Email: mkmukesh@in.ibm.com

• Abstract:

Enterprises have gathered operational business information from multiple structured data sources and stored it in a central repository, called a data warehouse, for decision support and data analysis. They are now realizing the need to integrate all of their information sources, including "unstructured" content, for deeper and richer analysis. Several applications require integrating both structured and unstructured information based on events and business policies: processing warranty claims, finding promotional materials in real time based on a user's transaction value, and detecting health insurance claim fraud in (near) real time by integrating information from various data sources (some of which may belong to competitors). Thus, it is vital for data warehousing to enable the integration of data and content sources in order to provide real-time read and write access, to transform data for business analysis and data interchange, and to manage data placement for performance, currency and availability.

In this talk, we will first review the existing technologies in data warehousing and information integration, and then discuss how enterprise applications are moving from data warehousing to (Active) Information Integration systems. We will also discuss the architecture of a new policy-based approach to integrating information that requires neither defining a global schema (virtualization approach) nor materializing pre-computed results (warehouse approach). Finally, we will discuss several applications that require this kind of integration, and show that current approaches cannot satisfy them.

• Biography:

Mukesh Mohania received his Ph.D. in Computer Science & Engineering from the Indian Institute of Technology, Bombay, India in 1995. He was a faculty member at the University of South Australia and Western Michigan University from 1995 to 2001. He was also associated with Kyoto University and Purdue University as a Senior Research Fellow from 1996 to 2001. Currently, he is a manager at IBM India Research Lab, where he leads the database and autonomic computing research groups. He has worked extensively in the areas of rule processing in distributed databases, data warehousing, semi/unstructured databases, XML data integration, data mining and autonomic computing. He was awarded the Technical Achievement Award in the area of Web Database Management and Data Warehousing by the Association of Database and Expert Systems Applications in Greenwich, U.K., in 2000. His work on XML data mining and context-oriented information integration received the best paper awards at CIKM 2004 and CIKM 2005, respectively.

Litwin Witold - CERIA, Université Paris-Dauphine

Email: Witold.LITWIN@dauphine.fr

LH*P2Prs: A Scalable and Distributed Data Structure for the P2P Environment

• Abstract:

LH*P2Prs is a Scalable and Distributed Data Structure (SDDS) designed specifically for the P2P environment. Its characteristic property is that any query addressed to an incorrect peer is forwarded to the correct peer in a single message. This property, currently unique, is probably impossible to improve upon within the usual axiomatics of SDDSs and P2P systems. It results from a particular system of notifications among the peers when new peers are incorporated. The method also offers k-availability, protecting the stored data against the simultaneous failure of k >= 1 peers and against the "churn" typical of P2P systems. The value of k is scalable. The protection results from maintaining parity, computed by a particularly efficient erasure-correcting code of the Reed-Solomon type.
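
The parity idea behind k-availability can be illustrated with a minimal sketch for the simplest case, k = 1. It deliberately uses plain XOR parity rather than the Reed-Solomon-type code mentioned in the abstract, and it is not the LH*P2Prs implementation. The function names (`xor_bytes`, `parity`, `recover`) and the sample records are hypothetical, and all records are assumed to have the same length.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("records must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def parity(records: list[bytes]) -> bytes:
    """Parity record for a group of data records (XOR of all of them)."""
    return reduce(xor_bytes, records)


def recover(survivors: list[bytes], parity_record: bytes) -> bytes:
    """Rebuild the single missing record from the survivors and the parity."""
    return reduce(xor_bytes, survivors, parity_record)


# Hypothetical group of three records, each held by a different peer.
group = [b"peer-0-data", b"peer-1-data", b"peer-2-data"]
parity_record = parity(group)

# Simulate the failure of the peer holding group[1], then rebuild its record.
rebuilt = recover([group[0], group[2]], parity_record)
assert rebuilt == group[1]
```

XOR parity tolerates only one lost record per group. Tolerating k > 1 simultaneous failures requires k parity records computed with a stronger erasure code, which is the role the abstract assigns to the Reed-Solomon-type code.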
