Publications of Jérôme Darmont

Reference (inproceedings)

P.N. Sawadogo, E. Scholly, C. Favre, E. Ferey, S. Loudcher, J. Darmont, "Metadata Systems for Data Lakes: Models and Features", 1st International Workshop on BI and Big Data Applications (BBIGAP@ADBIS 2019), Bled, Slovenia, September 2019; Communications in Computer and Information Science, Vol. 1064, Springer, Heidelberg, Germany, 440-451.

Abstract

Over the past decade, the data lake concept has emerged as an alternative to data warehouses for storing and analyzing big data. A data lake allows storing data without any predefined schema. Therefore, data querying and analysis depend on a metadata system that must be efficient and comprehensive. However, metadata management in data lakes remains an open issue, and criteria for evaluating its effectiveness are largely lacking.

In this paper, we introduce MEDAL, a generic, graph-based model for metadata management in data lakes. We also propose evaluation criteria for data lake metadata systems in the form of a list of expected features. Finally, we show that our approach is more comprehensive than existing metadata systems.

Keywords

Data lakes, Metadata modeling, Metadata management

 
