[MA 2025 05] Sequential knowledge distillation for MIMIC-IV by copying the behaviour of prediction models
Department of Medical Informatics (KIK)
Proposed by: Iacer Calixto, assistant professor of artificial intelligence [i.coimbra@amsterdamumc.nl]
Introduction
Recently, researchers have developed an efficient method to “copy” the behaviour of a prediction model ‘A’ into another prediction model ‘B’. For instance, ‘A’ can be a neural network trained to predict mortality given a patient’s medical history, and ‘B’ can be any kind of machine learning model (e.g., a decision tree, support vector machine, random forest, logistic regression, or another neural network) [1, 2]. We refer to model A as the teacher model and to model B as the student model. The method proposed in [1, 2] allows one to distil the knowledge of the teacher into the student so that the two models perform similarly. In the original formulation in [1, 2], the authors assume no access to the original training data used to train the teacher model; this is not the case in this SRP project. There are many reasons to distil knowledge from the teacher into the student, including interpretability (e.g., the student model belongs to a family of machine learning models that is more interpretable than the teacher’s) and computational efficiency (e.g., the student model requires less specialised hardware to run and is therefore cheaper to deploy in terms of energy consumption). A minimal sketch of this teacher–student copying idea is shown below.
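To make the idea concrete, the following is a minimal, hedged sketch of teacher–student copying using scikit-learn on a synthetic toy dataset. The models, hyperparameters, and data here are illustrative assumptions only and are not the methodology of [1, 2].

```python
# Minimal sketch of teacher-student copying (illustrative only: toy data,
# toy models). The actual project uses MIMIC-IV and the methods of [1, 2].
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real (tabular) training data.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Teacher: a neural network trained on the original labels.
teacher = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
teacher.fit(X_train, y_train)

# Student: a decision tree trained to imitate the teacher, i.e. fit on the
# teacher's predictions instead of the ground-truth labels.
student = DecisionTreeClassifier(max_depth=8, random_state=0)
student.fit(X_train, teacher.predict(X_train))

# Copy quality: how often the student agrees with the teacher on unseen data.
agreement = np.mean(student.predict(X_test) == teacher.predict(X_test))
print(f"teacher accuracy:          {teacher.score(X_test, y_test):.3f}")
print(f"student accuracy:          {student.score(X_test, y_test):.3f}")
print(f"teacher-student agreement: {agreement:.3f}")
```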
Description of the SRP Project/Problem
In this SRP, you will work on predicting the risk of death for an intensive care unit (ICU) patient (a task that can be modelled as a classification problem), using the MIMIC-IV [3] dataset. MIMIC-IV includes data for over 40,000 patients admitted to the intensive care units of the Beth Israel Deaconess Medical Center (BIDMC) in the United States, including demographics, laboratory results, medications, and more. You will build neural network-based teacher models using all relevant data available for a patient (structured data and, optionally, free-text clinical notes). You will then adapt the methodology introduced in [1, 2] to distil the knowledge of the teacher into different student models, investigating in which scenarios the knowledge distillation works well and in which scenarios it fails. Finally, you will compare the one-shot, online, and sequential knowledge distillation approaches introduced in [1, 2] and discuss how they behave on the MIMIC-IV dataset.
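As a rough, hedged illustration of the difference between one-shot and sequential copying, the sketch below generates synthetic points in the feature space, labels them with the teacher, and fits a student either once on a single large sample or incrementally over several rounds. The function names, the uniform box-sampling scheme, and the naive refitting strategy are simplifying assumptions made for illustration; the project itself follows the procedures described in [1, 2].

```python
# Hedged sketch contrasting one-shot vs. sequential (iterative) copying.
# Sampling and refitting schemes here are simplifying assumptions, not
# the exact procedures of [1, 2].
import numpy as np
from sklearn.linear_model import LogisticRegression


def sample_box(n, low, high, rng):
    """Draw n synthetic points uniformly from a box covering the feature space."""
    return rng.uniform(low, high, size=(n, len(low)))


def one_shot_copy(teacher, low, high, n_samples=20_000, seed=0):
    """Label one large synthetic sample with the teacher and fit the student once."""
    rng = np.random.default_rng(seed)
    X_synth = sample_box(n_samples, low, high, rng)
    student = LogisticRegression(max_iter=1000)
    student.fit(X_synth, teacher.predict(X_synth))
    return student


def sequential_copy(teacher, low, high, n_rounds=20, batch_size=1_000, seed=0):
    """Grow the synthetic sample round by round, refitting the student each round."""
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [], []
    student = LogisticRegression(max_iter=1000)
    for _ in range(n_rounds):
        X_batch = sample_box(batch_size, low, high, rng)
        X_parts.append(X_batch)
        y_parts.append(teacher.predict(X_batch))
        student.fit(np.vstack(X_parts), np.concatenate(y_parts))
    return student
```

In this toy version the sequential copy keeps every teacher-labelled batch and refits from scratch each round; this is the simplest possible scheme and is not meant to reproduce the more efficient iterative procedure studied in [1].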
Research questions
RQ 1) How do the one-shot, online, and sequential knowledge distillation approaches proposed in [1, 2] apply to transferring the knowledge of a teacher model trained on a classification task, e.g., predicting mortality on MIMIC-IV?
RQ 2) To what extent do the feature types used affect the overall quality of the copy (e.g., numerical features only versus numerical and categorical features combined)? A sketch of such a feature-type ablation is given below.
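As a concrete (but hypothetical) illustration of the ablation behind RQ 2, the sketch below builds two student pipelines over different feature subsets and measures copy quality as teacher–student agreement. The column names, the preprocessing choices, and the assumption that a fitted teacher and pandas DataFrames `df_train`/`df_test` already exist are all placeholders, not part of the project specification.

```python
# Hedged sketch of the feature-type ablation in RQ 2: numerical features only
# vs. numerical + categorical features. Column names are hypothetical
# placeholders; df_train, df_test (pandas DataFrames) and a fitted `teacher`
# accepting the same inputs are assumed to exist.
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

NUMERICAL = ["age", "heart_rate_mean", "lactate_max"]   # placeholder column names
CATEGORICAL = ["gender", "admission_type"]              # placeholder column names


def make_student(numerical, categorical=None):
    """A student pipeline restricted to the chosen feature subset."""
    transformers = [("num", StandardScaler(), numerical)]
    if categorical:
        transformers.append(("cat", OneHotEncoder(handle_unknown="ignore"), categorical))
    return make_pipeline(ColumnTransformer(transformers),
                         DecisionTreeClassifier(max_depth=8, random_state=0))


def copy_agreement(student, teacher, df_train, df_test):
    """Fit the student on teacher labels and report agreement on held-out data."""
    student.fit(df_train, teacher.predict(df_train))
    return np.mean(student.predict(df_test) == teacher.predict(df_test))

# Example comparison (assuming df_train, df_test and a fitted teacher exist):
# num_only = copy_agreement(make_student(NUMERICAL), teacher, df_train, df_test)
# num_cat  = copy_agreement(make_student(NUMERICAL, CATEGORICAL), teacher, df_train, df_test)
```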
Expected results
The main outcome of this SRP project is a scientific paper. We will publish the results of your work at a top-tier machine learning workshop, and you will use this paper as the thesis for your SRP defence.
You will also deliver a publicly available code base in which all experiments conducted during your SRP are shared with the research community.
Time period (please tick at least one)
☐ November – June
☐ May – November
Contact
Iacer Calixto, assistant professor of artificial intelligence, KIK, i.coimbra@amsterdamumc.nl
References
[1] N. Statuto, I. Unceta, J. Nin, and O. Pujol. A scalable and efficient iterative method for copying machine learning classifiers. Journal of Machine Learning Research, 24(390):1–34, 2023. URL http://jmlr.org/papers/v24/23-0135.html.
[2] I. Unceta, J. Nin, and O. Pujol. Copying machine learning classifiers. IEEE Access, 8:160268–160284, 2020. URL https://api.semanticscholar.org/CorpusID:67877026.
[3] A. E. W. Johnson, L. Bulgarelli, L. Shen, A. Gayles, A. Shammout, S. Horng, T. J. Pollard, S. Hao, B. Moody, B. Gow, et al. MIMIC-IV, a freely accessible electronic health record dataset. Scientific Data, 10(1):1, 2023.