
Publications of Jérôme Darmont

Reference (inproceedings)

R. El-Idrissi, J. Agoun, J. Darmont, S. Loudcher, "Content Learning for Metadata Extraction and Enrichment of Historical Hand-Drawn Illustrations", New Trends in Databases and Information Systems - ADBIS 2026 Short Papers, September-October 2026; Communications in Computer and Information Science, Springer, Heidelberg, Germany.

Abstract

The digitisation of cultural heritage collections requires automated metadata generation to improve resource discovery and reuse. This paper presents Content Learning for Metadata Extraction and Enrichment (CLEAD), a multimodal approach for generating metadata from illustrations in digitised archaeological diaries. CLEAD combines illustration detection and segmentation with vision-language models and large language models to generate illustration descriptions and enrich them using page-level textual context. Experiments on the DataLAC and IlluHisDoc datasets show that integrating visual and textual information improves the semantic quality and relevance of the generated metadata. These results indicate that CLEAD effectively supports the indexing and retrieval of visual heritage resources in digital archives and data lakes.

Keywords

Data Lakes, Metadata, Artificial Intelligence, Vision-Language Models, Historical Documents, Document Analysis

 
