New accepted publication at #DaWaK: Discovering Relationships

Recherche-Research
A. Diouan, S. Loudcher, J. Darmont, E. Ferey, « Discovering Relationships in Data Lakes Using Large Language Models: An Industrial Case ». 28th International Conference on Big Data Analytics and Knowledge Discovery (DaWaK 2026), Graz, Austria. LNCS.

Abstract: Data lakes rely on metadata to remain usable, yet this metadata is often limited or weakly informative for column relationship discovery, especially in ERP-derived datasets with coded or abbreviated schema labels. We propose ColRel, a two-stage method that builds column embeddings from metadata and data available at ingestion time. In difficult cases, such as coded schemata, business dictionaries help better interpret column names and support the generation of short natural-language descriptions used in the second stage. Experiments on public benchmarks and an industrial ERP dataset show that ColRel is particularly effective in semantically related, weak-signal settings.

New publication #HumanitésNumériques

Recherche-Research

R. El-Idrissi, J. Simon-Reig, L. Romero, J. Agoun, J.P. Girard, G. de-Prado, J. Darmont, S. Loudcher, « Structuration, exploration et valorisation d’archives archéologiques par l’intelligence artificielle au sein d’un lac de données », 7e Colloque de l’association francophone des Humanités numériques (Humanistica), Paris, May 2026.