Scientific Research Project

[MA 2023 18] Prediction of medications and diagnostic codes from electronic health records using autoencoder-based recommender systems

Amsterdam UMC, location AMC, Department of Medical Informatics

Proposed by: Iacopo Vagliano [i.vagliano@amsterdamumc.nl]

Introduction

Availability of clinical codes in Electronic Health Records (EHRs) is crucial for patient care as well as for reimbursement. However, entering them in the EHR is tedious, and some clinical codes may be overlooked. Furthermore, patients are often prescribed many medications, especially when they have multiple morbidities and/or are treated in critical settings such as the ICU. In both scenarios, automatic suggestions can help: the staff annotating the EHR with codes, and clinicians prescribing medications. Recommender systems are software tools that suggest items to users [1,2]; here the items are codes and medications. These tools are therefore well suited to the automatic suggestion of codes and/or medications.

Description of the SRP Project/Problem

Given an incomplete list of clinical codes or medications, the goal is to investigate how well ML methods predict the complete list, and to assess the added predictive value of including other clinical patient data in this task. Building on our previous study [3], we use the MIMIC dataset [4] and frame the task of completing the clinical codes and/or medications as a recommendation problem. To achieve this goal, various autoencoder approaches (a family of deep learning methods) can be considered together with other simple yet effective methods, such as item co-occurrence and Singular Value Decomposition (SVD).
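To make the recommendation framing concrete, the following is a minimal sketch of an item co-occurrence baseline in Python. It is an illustration only, not the method of the previous study [3]: the function names (fit_cooccurrence, recommend), the toy code labels, and the scoring rule (summed pair counts) are all assumptions made for this example, and no MIMIC data is used.

```python
from collections import Counter, defaultdict
from itertools import permutations


def fit_cooccurrence(records):
    """Count how often each ordered pair of codes appears in the same record."""
    cooc = defaultdict(Counter)
    for codes in records:
        for a, b in permutations(set(codes), 2):
            cooc[a][b] += 1
    return cooc


def recommend(cooc, known_codes, k=3):
    """Rank unseen codes by their summed co-occurrence with the known codes."""
    known = set(known_codes)
    scores = Counter()
    for code in known:
        for other, count in cooc.get(code, {}).items():
            if other not in known:
                scores[other] += count
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [code for code, _ in ranked[:k]]


# Toy records with made-up labels (hypothetical, not real clinical codes).
train = [
    ["sepsis", "aki", "vasopressor"],
    ["sepsis", "vasopressor"],
    ["aki", "dialysis"],
    ["sepsis", "aki"],
]
model = fit_cooccurrence(train)
suggestions = recommend(model, ["sepsis"], k=2)
print(suggestions)
```

Here the incomplete list is ["sepsis"] and the model suggests the codes that most often accompany it in the training records. An autoencoder or SVD model would fill the same role, taking the known codes as input and returning a ranked list of candidates.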

Different model inputs can be considered, such as: 1) the known clinical codes or medications only, and 2) the codes/medications plus patient variables (and/or clinical notes). Predictive performance will be evaluated for each input configuration. Optionally, specific subgroups of patients, codes and medications may be considered in a second stage (e.g. cardiology patients/codes).
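As a rough illustration of the two input configurations, the sketch below encodes the known codes as a multi-hot vector and optionally appends patient variables. The helper names (multi_hot, build_input), the toy vocabulary, and the example variables (a scaled age and a binary indicator) are hypothetical; how side information is actually fed to each model is a design choice of the project.

```python
def multi_hot(known_codes, vocabulary):
    """Encode a set of codes as a 0/1 vector over a fixed vocabulary."""
    present = set(known_codes)
    return [1.0 if code in present else 0.0 for code in vocabulary]


def build_input(known_codes, vocabulary, patient_vars=None):
    """Return the model input: codes only, or codes plus patient variables."""
    vector = multi_hot(known_codes, vocabulary)
    if patient_vars is not None:
        # Assumption: patient variables are already numeric and scaled.
        vector = vector + list(patient_vars)
    return vector


# Hypothetical vocabulary and patient variables, for illustration only.
vocabulary = ["aki", "dialysis", "sepsis", "vasopressor"]
codes_only = build_input(["sepsis", "aki"], vocabulary)
with_vars = build_input(["sepsis", "aki"], vocabulary, patient_vars=[0.67, 1.0])
print(codes_only)
print(with_vars)
```

Comparing a model trained on codes_only-style inputs with one trained on with_vars-style inputs is one simple way to measure the added value of the clinical variables.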

Research questions

1. How accurately can we predict codes and/or medications on the MIMIC dataset with different autoencoders and other machine learning models?

2. Does the use of clinical variables, in addition to the incomplete list of codes/medications, improve the predictive performance of the models?

Expected results

Trained prediction models for predicting codes/medications.

The performance of the models in terms of recommendation accuracy, measured with recommendation metrics such as mean average precision, F1 score, normalized discounted cumulative gain, and mean reciprocal rank.
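The metrics above can be sketched for a single patient as follows; averaging the per-patient values over all test patients gives mean average precision and mean reciprocal rank. This is a minimal sketch under stated assumptions: relevance is binary, average precision is normalized by the number of relevant items (one common convention among several), and the function names and toy lists are made up for the example.

```python
import math


def average_precision(ranked, relevant):
    """Average of precision@rank over the ranks where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg(ranked, relevant):
    """Normalized discounted cumulative gain with binary relevance."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked, start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), len(ranked))
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / ideal if ideal > 0 else 0.0


def f1_score(ranked, relevant):
    """Harmonic mean of precision and recall of the recommended set."""
    hits = len(set(ranked) & set(relevant))
    if hits == 0:
        return 0.0
    precision = hits / len(ranked)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Toy example: three recommended codes, two of which were actually held out.
ranked = ["aki", "dialysis", "vasopressor"]
relevant = {"aki", "vasopressor"}
ap = average_precision(ranked, relevant)
rr = reciprocal_rank(ranked, relevant)
gain = ndcg(ranked, relevant)
f1 = f1_score(ranked, relevant)
print(ap, rr, gain, f1)
```

In this toy case the relevant items sit at ranks 1 and 3, so the rank-sensitive metrics (average precision, nDCG) penalize the irrelevant item at rank 2, while F1 only looks at the set of recommended codes.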

A master's thesis written in the form of a scientific article.

Time period (usually 7 months)

7 months

References

1. Wiesner M, Pfeifer D. Health Recommender Systems: Concepts, Requirements, Technical Basics and Challenges. Int J Environ Res Public Health. 2014;11(3):2580-2607. doi:10.3390/ijerph110302580

2. Ricci F, Rokach L, Shapira B. Recommender Systems: Introduction and Challenges. In: Recommender Systems Handbook. Springer; 2015. doi:10.1007/978-1-4899-7637-6_1

3. Yordanov TR, Abu-Hanna A, Ravelli AC, Vagliano I. Autoencoder-Based Prediction of ICU Clinical Codes. In: Artificial Intelligence in Medicine. AIME 2023. Lecture Notes in Computer Science, vol 13897. Springer, Cham; 2023. doi:10.1007/978-3-031-34344-5_8

4. Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, Mark RG. MIMIC-III, a freely accessible critical care database. Scientific Data. 2016. doi:10.1038/sdata.2016.35