Author Archives: Julien Jacques

10/12/25 – Internship offer (#stage): Designing Task-Specific Reward and Loss Functions for Large Language Models #llm

Subject. Recent alignment techniques such as Reinforcement Learning from Human Feedback (RLHF) [Christiano et al., 2017] and Reinforcement Learning from AI Feedback (RLAIF) [Bai et al., 2022] have improved the… Read more »

10/12/25 – Internship offer (#stage): Domain Adaptation of Large Language Models to New Data Distributions #llm

Subject. Large Language Models (LLMs) such as LLaMA 3, Mistral, or Flan-T5 achieve impressive results on modern NLP benchmarks but are mostly trained on English-dominant, contemporary web corpora. When applied… Read more »