
Publications of Jérôme Darmont

Reference (inproceedings)

I. Nogueira, M. Romdhane, J. Darmont, "Modeling Data Lake Metadata with a Data Vault", 22nd International Database Engineering and Applications Symposium (IDEAS 2018), Villa San Giovanni, Italy, June 2018, 253-261; ACM, New York.

Abstract

With the rise of big data, business intelligence had to find solutions for managing even greater data volumes and variety than in data warehouses, which proved ill-adapted. Data lakes answer these needs from a storage point of view, but require adequate metadata management to guarantee efficient access to data. Starting from a multidimensional metadata model, designed for an industrial heritage data lake, that lacks schema evolvability, we propose in this paper to use ensemble modeling, and more precisely a data vault, to address this issue. To illustrate the feasibility of this approach, we instantiate our conceptual metadata model into relational and document-oriented logical and physical models. We also compare the physical models in terms of metadata storage and query response time.

Keywords

Big data, Data lake, Metadata management, Ensemble modeling, Data vault

 
