Subject.
Large Language Models (LLMs) such as LLaMA 3, Mistral, or Flan-T5 achieve impressive results on modern NLP benchmarks but are mostly trained on English-dominant, contemporary web corpora. When applied to historical French texts, specialized professional language, or social-media varieties, they face a distribution shift and often produce anachronistic or inconsistent interpretations. This raises a key research question: how can we adapt general-purpose LLMs to new linguistic and temporal contexts without losing their general reasoning ability?
Recent methods address this through different forms of domain adaptation. Domain-Adaptive Pretraining (DAPT) fine-tunes models on domain corpora [Gururangan et al., 2020], while parameter-efficient tuning techniques such as Adapters [Houlsby et al., 2019] and Low-Rank Adaptation (LoRA) [Hu et al., 2022] update only small portions of the model. Retrieval-Augmented Generation (RAG) [Lewis et al., 2020] and in-context learning adapt the model’s behavior dynamically without retraining. Yet their effectiveness on low-resource, noisy, or historical French data remains an open question.
This project investigates how ideas from statistical learning, transfer learning, and meta-learning can enhance these adaptation techniques. It aims to design and test hybrid strategies that balance adaptation fidelity, robustness, and computational efficiency.
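To make the parameter-efficient route concrete, the sketch below shows how a LoRA adapter could be attached to a pretrained causal language model. It is a minimal sketch, assuming the Hugging Face transformers and peft libraries; the base-model name and hyperparameters are illustrative placeholders, not choices made by this project.

```python
# Minimal LoRA fine-tuning setup (illustrative; model name and hyperparameters
# are placeholders, not project decisions).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "mistralai/Mistral-7B-v0.1"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA injects trainable low-rank matrices into selected projections while the
# original weights stay frozen [Hu et al., 2022].
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

Because only the low-rank matrices are updated during domain-adaptive fine-tuning on, say, historical French corpora, the frozen base weights retain the model's general abilities while the small adapter absorbs the domain-specific shift.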
Research directions include:
● Quantifying and modeling the linguistic shift between modern and historical or informal French [Ben-David et al., 2010, Johansson et al., 2019].
● Exploring meta-learning or transfer learning frameworks for domain generalization [Finn et al., 2017, Dou et al., 2019].
● Integrating covariate-shift correction [Sugiyama and Kawanabe, 2012] with parameter-efficient fine-tuning (a minimal sketch of this combination follows the list).
● Combining transfer or meta-learning with retrieval and prompting [Zhuang et al., 2020] to allow dynamic adaptation.
● Investigating cross-lingual domain adaptation through language-shift benchmarks (e.g., XTREME, FLORES-200, MGSM), measuring how new methods transfer between typologically diverse languages and historical dialects.
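For the covariate-shift direction above, the following minimal sketch (assuming a PyTorch setting and an auxiliary domain classifier, both illustrative assumptions rather than parts of a fixed design) shows how importance weights in the spirit of [Sugiyama and Kawanabe, 2012] could reweight the fine-tuning loss of a parameter-efficiently adapted model.

```python
# Sketch of covariate-shift correction via importance weighting
# [Sugiyama and Kawanabe, 2012]. A binary domain classifier (trained
# separately to distinguish modern-web source texts from historical/informal
# target texts) provides logits for P(target | x); the odds ratio approximates
# w(x) = p_target(x) / p_source(x), and this weight rescales the per-example
# fine-tuning loss. All names here are illustrative.
import torch


def importance_weights(domain_logits: torch.Tensor) -> torch.Tensor:
    """Estimate w(x) ~ p_target(x) / p_source(x) from domain-classifier logits."""
    p_target = torch.sigmoid(domain_logits)              # shape: (batch,)
    return p_target / (1.0 - p_target).clamp(min=1e-6)   # avoid division by zero


def weighted_lm_loss(per_example_loss: torch.Tensor,
                     domain_logits: torch.Tensor) -> torch.Tensor:
    """per_example_loss: (batch,) cross-entropy of the (e.g. LoRA-adapted) LM."""
    w = importance_weights(domain_logits).detach()  # weights are not trained here
    w = w / w.mean()                                # normalize for a stable loss scale
    return (w * per_example_loss).mean()
```

The same weighting can be dropped into the LoRA setup sketched earlier, so that gradient updates emphasize source examples that most resemble the target (historical or informal) domain.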
Required profile.
The ideal candidate is technically strong, with a solid foundation in machine learning theory, optimization, or applied mathematics, and excellent programming skills in Python (e.g., PyTorch, JAX, or TensorFlow). Experience with software engineering and experiment-management tools such as Docker, Git, or Ray is desirable. Familiarity with Large Language Models (LLMs), Vision-Language Models (VLMs), or other foundation models is a plus. A background in NLP is not required; what matters most is technical depth, theoretical understanding, and a rigorous, analytical mindset.
Contacts.
The intern will join the Data Mining & Decision team of the ERIC lab (Campus Porte des Alpes, Bron).
Duration: 6 months, starting in January or later
Supervision: Pegah Alizadeh (pegah.alizadeh@univ-lyon2.fr)
Julien Jacques (julien.jaques@univ-lyon2.fr)
Some references.
[Ben-David et al., 2010] Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., and Vaughan, J. W. (2010). A theory of learning from different domains. Machine Learning, 79(1):151–175.
[Dou et al., 2019] Dou, Q., Coelho de Castro, D., Kamnitsas, K., and Glocker, B. (2019). Domain generalization via model-agnostic learning of semantic features. Advances in Neural Information Processing Systems, 32.
[Finn et al., 2017] Finn, C., Abbeel, P., and Levine, S. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning, pages 1126–1135. PMLR.
[Gururangan et al., 2020] Gururangan, S., Marasović, A., Swayamdipta, S., Lo, K., Beltagy, I., Downey, D., and Smith, N. A. (2020). Don’t stop pretraining: Adapt language models to domains and tasks. ACL.
[Houlsby et al., 2019] Houlsby, N., Giurgiu, A., Jastrzebski, S., Morrone, B., De Laroussilhe, Q., Gesmundo, A., Attariyan, M., and Gelly, S. (2019). Parameter-efficient transfer learning for NLP. In International Conference on Machine Learning, pages 2790–2799. PMLR.
[Hu et al., 2022] Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., Chen, W., et al. (2022). LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations (ICLR).
[Johansson et al., 2019] Johansson, F. D., Sontag, D., and Ranganath, R. (2019). Support and invertibility in domain-invariant representations. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 527–536. PMLR.
[Lewis et al., 2020] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.-t., Rocktäschel, T., et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33:9459–9474.
[Sugiyama and Kawanabe, 2012] Sugiyama, M. and Kawanabe, M. (2012). Machine Learning in Non-Stationary Environments: Introduction to Covariate Shift Adaptation. MIT Press.
[Zhuang et al., 2020] Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., and He, Q. (2020). A comprehensive survey on transfer learning. Proceedings of the IEEE, 109(1):43–76.
