09/02/06 – #stage Master Internship Offer – Spring 2026 Survival Analysis for Medical Application


Level: Master 1 or Master 2

Information

  • Advisor(s): Guillaume Metzler and Julien Jacques
  • Mail: guillaume.metzler@univ-lyon2.fr, julien.jacques@univ-lyon2.fr
  • Location: Laboratoire ERIC (Lyon)
  • Duration: 4 to 6 months, between March and September 2026
  • Compensation: approximately 600 euros/month
  • Keywords: Survival Analysis, Python programming, Statistical Modelling, Cox model

General Overview: The main idea is to build upon two contributions developed by a PhD student as part of her doctoral research on Survival Analysis and to work on improving them in order to strengthen the overall approach. This will mainly involve revisiting the existing code to enhance and correct it, as well as conducting additional experimental studies.

Expected profile: Master's or engineering degree in Computer Science or Applied Mathematics, with a focus on machine/statistical learning. The candidate must show an interest chiefly in programming (Python) and statistics.

How to apply? Send a CV and your Master's grades to guillaume.metzler@univ-lyon2.fr and julien.jacques@univ-lyon2.fr.

Summary
Survival analysis plays a key role in the medical field, particularly in patient care and in the design of treatments tailored to individual patient characteristics. It has many applications, notably in oncology and in organ transplantation. The proposed internship falls within this latter context and aims at understanding the factors that explain the success or failure of an organ transplant. To this end, the project relies on real, accessible medical data involving a large number of individuals and a very high-dimensional set of covariates describing both donors and recipients.
However, these data present several challenges. In particular, some patients may leave the follow-up study at one hospital and continue their care at another institution that does not collect the same clinical indicators, leading to missing values in some covariates. Another key difficulty lies in the heterogeneous nature of the data: patients differ substantially in their characteristics, for instance in terms of medical history. Moreover, the factors explaining transplant success may differ across subpopulations, for example between child and adult patients, since organ transplantation inherently depends on patient-specific constraints.
From a statistical perspective, survival analysis is commonly based on Cox proportional hazards models, which express the hazard function at time t, that is, the instantaneous risk of the event of interest (here, patient death), as a function of individual characteristics. The objectives of this internship are twofold: (i) first, to develop survival models that account for population heterogeneity using mixture Cox models [5, 4], while identifying the key covariates explaining transplant success or failure within each subgroup through penalization methods [6, 1]; (ii) second, to investigate efficient strategies for imputing missing values [3] in order to improve the predictive performance of survival models [2].
In practice, the candidate will work on these two aspects separately by building upon existing research contributions in both areas. The goal will be to strengthen these works by improving the experimental protocol, correcting and enriching the existing codebase, and performing more systematic comparisons with state-of-the-art methods. A baseline implementation will be provided to the student as a starting point. Ultimately, the objective is to submit the resulting contributions to peer-reviewed journals in statistics.
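To make the penalized Cox objective mentioned above more concrete, here is a minimal sketch, not the implementation used in the existing contributions. It assumes the standard Cox form h(t | x) = h0(t) exp(x · beta) and evaluates the negative log partial likelihood with an L1 (lasso) penalty, treating ties in the Breslow fashion. The function name, argument names, and toy data are hypothetical, and the code is written in pure Python for readability rather than efficiency.

```python
import math


def cox_neg_log_partial_likelihood(times, events, covariates, beta, l1_penalty=0.0):
    """L1-penalized negative log partial likelihood of a Cox model (illustrative sketch)."""
    # Linear predictor x_i . beta for every patient.
    scores = [sum(x_j * b_j for x_j, b_j in zip(x, beta)) for x in covariates]
    nll = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored patients only contribute through the risk sets
        # Risk set: patients still under follow-up at time t_i.
        risk = [math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i]
        nll -= scores[i] - math.log(sum(risk))
    # Lasso penalty encouraging sparse coefficients (variable selection).
    return nll + l1_penalty * sum(abs(b) for b in beta)


# Hypothetical toy data: follow-up times, event indicators (1 = event observed,
# 0 = censored), and two covariates per patient.
times = [5.0, 8.0, 12.0, 3.0]
events = [1, 0, 1, 1]
covariates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

print(cox_neg_log_partial_likelihood(times, events, covariates, [0.5, -0.5], l1_penalty=0.1))
```

In practice this objective would be minimized over beta with a dedicated solver; the sketch only shows how censoring, risk sets, and the penalty enter the criterion.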

Main objectives of the proposed Internship
During the internship, the candidate will work on survival analysis, with a particular focus on Cox proportional hazards models. The main objective will be to build upon two existing research contributions and to improve them in order to make them suitable for submission. This will mainly involve revisiting and enhancing the existing codebase, correcting potential issues, and conducting additional experimental analyses. The work can be structured as follows:
(i) thoroughly understand the existing methodological and computational framework related to Cox models (mixtures of Cox models, missing data);
(ii) improve and extend the current implementations to enhance their robustness, efficiency, and reproducibility;
(iii) carry out additional experiments to strengthen the empirical results and assess the performance of the proposed approaches.
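For orientation, a generic formulation of a penalized mixture of Cox models can be written as follows. This is a textbook-style sketch, not necessarily the exact model of the existing contributions: the number of components K, the mixing weights \pi_k (taken constant here, although they may depend on x), the log-likelihood \ell, and the penalty level \lambda are notational assumptions.

```latex
% Component-specific Cox hazard and survival function for subgroup k
h_k(t \mid x) = h_{0k}(t)\,\exp\!\left(x^\top \beta_k\right),
\qquad
S_k(t \mid x) = \exp\!\left(-\exp\!\left(x^\top \beta_k\right)\int_0^t h_{0k}(u)\,\mathrm{d}u\right)

% Mixture over K latent subpopulations
S(t \mid x) = \sum_{k=1}^{K} \pi_k\, S_k(t \mid x),
\qquad
\pi_k \ge 0,\quad \sum_{k=1}^{K} \pi_k = 1

% L1-penalized estimation, selecting covariates within each subgroup
\hat{\theta} = \arg\min_{\theta}\; -\ell(\theta) + \lambda \sum_{k=1}^{K} \lVert \beta_k \rVert_1,
\qquad
\theta = \left(\pi_k, \beta_k, h_{0k}\right)_{k=1}^{K}
```

Each component has its own coefficient vector \beta_k, so the L1 penalty can retain different covariates in different subgroups, which is the kind of subgroup-specific variable selection described above.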

Expected results
• Literature review: survival analysis and the Cox model (penalized versions and missing-data completion).
• Theoretical: learn how to penalize mixtures of Cox models and how to deal with missing data.
• Practical: improve and correct the existing code in order to strengthen the experimental part of the papers.
For more background on the topics of this internship, the candidate can have a quick look at the references given in the References section.

References
[1] J. J. Goeman. L1 penalized estimation in the Cox proportional hazards model. Biometrical Journal, 52(1):70–84, 2010.
[2] Wei Jiang, Julie Josse, and Marc Lavielle. Stochastic approximation EM for logistic regression with missing values. arXiv preprint arXiv:1901.08752, 2019.
[3] Julie Josse and François Husson. missMDA: a package for handling missing values in multivariate data analysis. Journal of Statistical Software, 70:1–31, 2016.
[4] Chirag Nagpal, Steve Yadlowsky, Negar Rostamzadeh, and Katherine Heller. Deep Cox mixtures for survival regression. In Machine Learning for Healthcare Conference, pages 674–708. PMLR, 2021.
[5] S. K. Ng, L. Xiang, and K. K. W. Yau. Mixture modelling for medical and health sciences. Chapman and Hall/CRC, 2019.
[6] R. Tibshirani. The lasso method for variable selection in the Cox model. Statistics in Medicine, 16(4):385–395, 1997.
