[MA 2025 02] Interpretable knowledge distillation by copying the behaviour of prediction models
Department of Medical Informatics (KIK)
Proposed by: Iacer Calixto, assistant professor of artificial intelligence [i.coimbra@amsterdamumc.nl]
Introduction
Recently, researchers have developed an efficient method to “copy” the behaviour of a prediction model ‘A’ into another prediction model ‘B’ [1, 2]. For instance, ‘A’ can be a neural network trained to predict mortality given a patient’s medical history, and ‘B’ can be any kind of machine learning model (e.g., a decision tree, support vector machine, random forest, logistic regression, or another neural network). We refer to model A as the teacher model and model B as the student model. The method proposed in [1, 2] allows one to distil the knowledge from the teacher model into the student model so that the two models achieve similar performance. In the original formulation in [1, 2], the researchers assume no access to the original training data used to train the teacher model; in this SRP project, by contrast, the training data is available. There are many reasons to distil knowledge from a teacher into a student, including interpretability (e.g., the student model comes from a family of machine learning models that are more interpretable than the teacher’s) and compute efficiency (e.g., the student model requires less specialised hardware and is therefore cheaper to deploy in terms of energy consumption).
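As a rough illustration of the copying idea only (not the exact iterative procedure of [1, 2]), the sketch below trains a neural-network teacher, labels synthetic inputs with the teacher’s own predictions, and fits an interpretable student on those teacher-labelled points. All datasets, model choices, and the Gaussian sampling prior are illustrative assumptions:

```python
# Minimal sketch of classifier "copying": the student never sees the original
# training labels; it is fit on synthetic points labelled by the teacher.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Teacher: a neural network trained on some (here synthetic) dataset.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
teacher = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=0).fit(X, y)

# Copying: sample synthetic inputs from a simple prior over the input space
# and label them with the teacher's own predictions.
X_synth = rng.normal(loc=X.mean(axis=0), scale=X.std(axis=0),
                     size=(5000, X.shape[1]))
y_synth = teacher.predict(X_synth)

# Student: an interpretable model fit only on the teacher-labelled samples.
student = LogisticRegression(max_iter=1000).fit(X_synth, y_synth)

# Fidelity: how often the student agrees with the teacher on fresh points.
X_test = rng.normal(loc=X.mean(axis=0), scale=X.std(axis=0),
                    size=(1000, X.shape[1]))
fidelity = accuracy_score(teacher.predict(X_test), student.predict(X_test))
print(f"teacher-student agreement: {fidelity:.2f}")
```

In this regime the student is evaluated on fidelity (agreement with the teacher) rather than accuracy against the ground truth, which is what makes the method usable without the original training data.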
Description of the SRP Project/Problem
In this SRP, you will work on the problem of predicting the risk of death for an intensive care unit (ICU) patient (a task that can be modelled as a classification problem), using the MIMIC-IV dataset [3]. MIMIC-IV includes data for over 40,000 patients admitted to the intensive care units at the Beth Israel Deaconess Medical Center (BIDMC) in the United States, including demographics, laboratory results, medications, and more. You will build neural network-based teacher models using all relevant data available for a patient (including structured data and free-text clinical notes). You will then adapt the methodology introduced in [1, 2] to distil the knowledge of the teacher into different student models, investigating in which scenarios the knowledge distillation works well (and in which it fails).
In your SRP, the task is predicting mortality, and you will investigate the case where the student model is interpretable, e.g., a logistic regression model. You will first use only structured variables in the student model for model distillation. Later, you will also add variables derived from the free-text clinical notes to the student’s variable set, to see whether you can obtain performance similar to the teacher’s while using less, and more interpretable, data.
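A minimal sketch of this setup, assuming access to the training data as described above (toy data standing in for MIMIC-IV; the split into “structured” versus “note-derived” columns is hypothetical):

```python
# Illustrative sketch only: the student is fit on the teacher's predicted
# labels while using only a subset of the features (the "structured" ones).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Pretend the first 5 columns are structured variables and the remaining 10
# are derived from free-text notes (shuffle=False keeps the informative
# columns first in this toy setup).
X, y = make_classification(n_samples=3000, n_features=15, n_informative=8,
                           shuffle=False, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Teacher: a neural network trained on all features.
teacher = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                        random_state=1).fit(X_tr, y_tr)

# Student: interpretable logistic regression on the structured columns only,
# trained to imitate the teacher's labels rather than the ground truth.
structured = slice(0, 5)
student = LogisticRegression(max_iter=1000).fit(X_tr[:, structured],
                                                teacher.predict(X_tr))

teacher_acc = accuracy_score(y_te, teacher.predict(X_te))
student_acc = accuracy_score(y_te, student.predict(X_te[:, structured]))
print(f"teacher accuracy: {teacher_acc:.2f}, "
      f"student accuracy: {student_acc:.2f}")
```

The gap between the two accuracies is one way to quantify how much predictive signal is lost when the student is restricted to the structured variables alone.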
Research questions
RQ 1) To what extent does knowledge distillation as proposed in [1,2] apply to transferring the knowledge of a teacher model trained on a classification task, e.g., predicting mortality using MIMIC-IV?
RQ 2) To what extent can we distil knowledge from a teacher model trained using free-text variables into a student model that only uses structured variables (and possibly other variables derived from free-text)?
Expected results
The main outcome of this SRP project is a scientific paper. We aim to publish the results of your work in a top-tier machine learning workshop, and you will use this paper as the thesis for your SRP defence.
You will also deliver a publicly available code base in which all experiments conducted in your SRP are shared with the research community.
Time period, please tick at least 1 time period
November – June ☐
May – November ☐
Contact
Iacer Calixto, assistant professor of artificial intelligence, KIK, i.coimbra@amsterdamumc.nl
References
[1] N. Statuto, I. Unceta, J. Nin, and O. Pujol. A scalable and efficient iterative method for copying machine learning classifiers. Journal of Machine Learning Research, 24(390):1–34, 2023. URL http://jmlr.org/papers/v24/23-0135.html.
[2] I. Unceta, J. Nin, and O. Pujol. Copying machine learning classifiers. IEEE Access, 8:160268–160284, 2020. URL https://api.semanticscholar.org/CorpusID:67877026.
[3] A. E. W. Johnson, L. Bulgarelli, L. Shen, A. Gayles, A. Shammout, S. Horng, T. J. Pollard, S. Hao, B. Moody, B. Gow, et al. MIMIC-IV, a freely accessible electronic health record dataset. Scientific Data, 10(1):1, 2023.