[MA 2021 06] Interpretability of Machine Learning models for Prediction of Acute Kidney Injury

Amsterdam UMC, location AMC, department of Medical Informatics
Proposed by: Iacopo Vagliano [i.vagliano@amsterdamumc.nl]

Interpretability of Machine Learning models for Prediction of Acute Kidney Injury

Scientific Research Project Number: MA 2021 06

Place: Amsterdam UMC, location AMC, department of Medical Informatics

Introduction

Acute kidney injury (AKI) is a common and potentially life-threatening condition that affects approximately one in five inpatient admissions in the US. Clinically, AKI is detected through an increase in serum creatinine, which serves as a marker of acute decline in renal function. Because this increase lags behind the renal injury, treatment is often delayed; preventative alerts could therefore empower clinicians to act before a major clinical decline has occurred.
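As a rough illustration of the creatinine-based detection described above, the sketch below flags AKI onset from a patient's serum creatinine series. The 0.3 mg/dL rise within 48 hours and the 1.5-times-baseline rule follow the common KDIGO convention; the proposal does not prescribe a specific criterion, so these thresholds and the creatinine_mg_dl column name are assumptions for illustration only.

import pandas as pd

def flag_aki(measurements: pd.DataFrame) -> bool:
    """Return True if a creatinine series meets a KDIGO-style AKI criterion.

    `measurements` is assumed to be sorted by time, with a datetime index
    and a 'creatinine_mg_dl' column (illustrative assumption).
    """
    creat = measurements["creatinine_mg_dl"]
    baseline = creat.iloc[0]  # simplistic baseline: first available value

    for t, value in creat.items():
        # All measurements in the preceding 48 hours, including the current one
        window = creat[(creat.index >= t - pd.Timedelta(hours=48)) & (creat.index <= t)]
        # Criterion 1: absolute rise of at least 0.3 mg/dL within 48 hours
        if value - window.min() >= 0.3:
            return True
        # Criterion 2: rise to at least 1.5 times the baseline value
        if value >= 1.5 * baseline:
            return True
    return False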

Description of the SRP Project/Problem


For a model to be accepted in the clinical environment, it should not only provide good predictions but also be interpretable. Interpretability of the predictions of machine learning models is therefore a crucial requirement in their design. However, complex models, such as neural networks, raise the question of the trade-off between the accuracy and the interpretability of a model's output. With this in mind, the goal is to create an interpretable model capable of clinically acceptable predictions. To achieve this goal, the following steps can be followed:


1. Develop a range of models, from classical machine learning, such as logistic regression, boosted trees and random forests, to neural networks (CNN, RNN, etc.), to predict AKI in ICU patients (relying on the MIMIC and/or eICU dataset). Select up to three models for the next steps based on their prediction performance, measured as the area under the receiver operating characteristic curve (AUROC); see the first sketch after this list.

2. Enhance the interpretability of the selected models using one or more of the following approaches: feature importance, feature selection with the INVASE method, LIME, symbolic metamodeling for model explanation, and attention, e.g. attentive state-space modelling (see the second sketch after this list).

3. Evaluate the models taking into account both predictive performance (discrimination as well as calibration) and interpretability.
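For step 1, a minimal sketch of the model-comparison loop is given below, using scikit-learn classifiers and AUROC for model selection. The feature matrix X and binary AKI labels y are assumed to have already been extracted from MIMIC/eICU; the neural-network variants (CNN, RNN) would be added analogously, e.g. in PyTorch or Keras.

from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# X, y: feature matrix and AKI labels extracted from MIMIC/eICU (assumed available)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=500, random_state=42),
    "boosted_trees": GradientBoostingClassifier(random_state=42),
}

aurocs = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    y_prob = model.predict_proba(X_val)[:, 1]   # predicted probability of AKI
    aurocs[name] = roc_auc_score(y_val, y_prob)

# Keep up to three models with the highest AUROC for the interpretability steps
selected = sorted(aurocs, key=aurocs.get, reverse=True)[:3]
print(aurocs, selected)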
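For step 2, the second sketch illustrates two of the listed approaches on one of the selected models: global permutation feature importance (scikit-learn) and a local LIME explanation for a single ICU stay. INVASE, symbolic metamodeling and attentive state-space models need dedicated implementations and are not shown. The names model, X_train, X_val, y_val and feature_names carry over from the previous sketch and are assumptions.

from sklearn.inspection import permutation_importance
from lime.lime_tabular import LimeTabularExplainer

# Global view: how much does shuffling each feature degrade validation AUROC?
result = permutation_importance(model, X_val, y_val,
                                scoring="roc_auc", n_repeats=10, random_state=42)
for name, importance in sorted(zip(feature_names, result.importances_mean),
                               key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {importance:.4f}")

# Local view: LIME explanation of the prediction for a single ICU stay
explainer = LimeTabularExplainer(X_train, feature_names=feature_names,
                                 class_names=["no AKI", "AKI"], mode="classification")
explanation = explainer.explain_instance(X_val[0], model.predict_proba, num_features=10)
print(explanation.as_list())   # (feature condition, weight) pairs for this prediction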


It is recommended to have followed the course Special Topics in Data Science in Medicine (MAM11).


Research question

1. How accurately can we predict AKI with different machine learning models on the MIMIC and/or eICU datasets?

2. How do we define and assess a model's interpretability? How can we increase the interpretability of prediction models?

3. What is the optimal trade-off between interpretability and accuracy of the designed models?


Expected results

Algorithms and trained prediction models for predicting AKI.

A validation approach and performance results of the models in terms of discrimination (using receiver operating characteristic curves and precision-recall curves), calibration, and interpretability.
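As an indication of how the discrimination and calibration results could be computed, a minimal scikit-learn sketch is shown below; y_val and y_prob are the held-out labels and predicted AKI probabilities from the earlier sketches (assumptions).

from sklearn.metrics import roc_auc_score, average_precision_score, brier_score_loss
from sklearn.calibration import calibration_curve

# Discrimination: area under the ROC and precision-recall curves
auroc = roc_auc_score(y_val, y_prob)
auprc = average_precision_score(y_val, y_prob)

# Calibration: Brier score plus a reliability curve (observed vs. predicted risk)
brier = brier_score_loss(y_val, y_prob)
frac_positives, mean_predicted = calibration_curve(y_val, y_prob, n_bins=10)

print(f"AUROC={auroc:.3f}  AUPRC={auprc:.3f}  Brier={brier:.3f}")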

A master's thesis written in the form of a scientific article.


Time period:

7 months


Contact:

Mentor: Iacopo Vagliano, Amsterdam UMC, location AMC, department of Medical Informatics, i.vagliano@amsterdamumc.nl


References:

Molnar, C. Interpretable Machine Learning. Lulu.com, 2020. Available at: https://christophm.github.io/interpretable-ml-book/

Lundberg, S.M. and Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pp. 4765–4774, 2017.

Johnson, A.E.W., Pollard, T.J., Shen, L., Lehman, L., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L.A., and Mark, R.G. MIMIC-III, a freely accessible critical care database. Scientific Data (2016). DOI: 10.1038/sdata.2016.35. Available at: http://www.nature.com/articles/sdata201635

Pollard, T.J., Johnson, A.E.W., Raffa, J.D., Celi, L.A., Mark, R.G., and Badawi, O. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Scientific Data (2018). DOI: 10.1038/sdata.2018.178