[MA 2023 27] Explainable NLP for healthcare with bag of n-gram embedding models

Department of Medical Informatics, Amsterdam UMC, University of Amsterdam
Proposed by: Iacer Calixto [i.coimbra@amsterdamumc.nl]

Introduction


· Highly capable Large Language Models (LLMs) with billions of parameters, trained on trillions of tokens, are the driving force behind rapid progress in NLP and its applications. However, their black-box nature is a cause for concern, since explainability is paramount in many applications. We need models and methods that are easy to interpret by design, with minimal loss in performance.

· A generalised additive model (GAM) is a generalised linear model whose linear predictor involves a sum of smooth functions of covariates. GAMs allow us to model non-linear data while maintaining explainability. They can be represented by the following equation.

g(E[y]) = β0 + f1(x1) + f2(x2) + · · · + fp(xp)

• (x1, x2, . . ., xp) are the input variables / covariates
• g(·) is the link function
• each fi is a univariate shape function

· Aug-GAMs [1] (aka Emb-GAMs) propose to use LLMs to augment interpretable models (like GAMs) while keeping performance comparable to black-box models. The overall idea is to fit an additive model on fixed-size embeddings of the decoupled n-grams in a given sequence (which can be extracted offline), sum these embeddings, and use the sum to train a linear model.
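As an illustration of the additive structure above, the following toy sketch evaluates a GAM with a logit link. The shape functions f1, f2 and the intercept are made up for this example; a real GAM would learn them from data.

```python
import numpy as np

def sigmoid(z):
    """Inverse of the logit link g: maps the linear predictor to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy univariate shape functions (assumed purely for illustration).
def f1(x):
    return np.sin(x)        # non-linear effect of covariate x1

def f2(x):
    return 0.5 * x ** 2     # non-linear effect of covariate x2

beta0 = -0.3                # intercept

def gam_predict(x1, x2):
    """E[y] = g^{-1}(beta0 + f1(x1) + f2(x2)), here with a logit link."""
    eta = beta0 + f1(x1) + f2(x2)   # linear predictor: sum of shape functions
    return sigmoid(eta)             # probability in (0, 1)

p = gam_predict(np.array([0.0]), np.array([1.0]))
print(p)  # eta = -0.3 + 0 + 0.5 = 0.2, so p = sigmoid(0.2) ≈ 0.5498
```

Because each covariate enters only through its own fi, the effect of any single input can be plotted and inspected in isolation, which is the source of the model's interpretability.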


Description of the SRP Project/Problem


Aug-GAMs can achieve near state-of-the-art performance across multiple text classification datasets while remaining white-box / explainable. However, there is much room for improvement on both the performance and explainability fronts, as well as potential ablation studies exploring Aug-GAMs' robustness and generalizability. These are the ideas behind this master's project. We note that we have promising preliminary results from some initial experiments, but further experiments are needed.


Some concrete directions are listed below.

For a quick illustration of the Aug-GAM approach, see Figure 1 in [1] (https://arxiv.org/pdf/2209.11799.pdf).
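The Aug-GAM recipe (decouple a sequence into n-grams, embed each n-gram, sum the embeddings, and fit a linear model on the sums) can be sketched end to end as follows. This is a toy sketch only: the "embeddings" are deterministic random vectors standing in for real LLM features, and the four-example dataset is invented.

```python
import hashlib
import numpy as np

def ngrams(text, n_max=2):
    """Decouple a sentence into all 1..n_max-grams over whitespace tokens."""
    toks = text.lower().split()
    return [" ".join(toks[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(toks) - n + 1)]

def embed(ngram, dim=64):
    """Stand-in for an LLM n-gram embedding: a fixed random vector per
    n-gram, seeded by a hash so the mapping is deterministic."""
    seed = int.from_bytes(hashlib.md5(ngram.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(dim)

def sequence_embedding(text, dim=64):
    """Aug-GAM-style representation: the sum of the n-gram embeddings."""
    return np.sum([embed(g, dim) for g in ngrams(text)], axis=0)

def fit_linear(X, y, lr=0.1, steps=500):
    """Plain logistic regression (gradient descent) on the summed embeddings."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

texts = ["good movie", "great film", "bad movie", "awful film"]
y = np.array([1.0, 1.0, 0.0, 0.0])          # invented toy labels
X = np.stack([sequence_embedding(t) for t in texts])
w, b = fit_linear(X, y)
preds = (X @ w + b > 0).astype(float)
print(preds)
```

Because the sequence representation is a sum, the model's score decomposes additively over n-grams: the contribution of, say, "good" is just `embed("good") @ w`, which is what makes the fitted model inspectable per n-gram.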


Research questions


Main Questions

1. How does the performance of an interpretable bag-of-n-grams model with an explainable boosting machine (EBM) classifier compare to that of uninterpretable black-box models (e.g., BERT fine-tuned on the dataset) for mortality prediction with MIMIC-III and/or MIMIC-IV?

2. Can we improve Aug-GAM model performance using methodological expansions to the baseline Aug-GAM model?

a. Can we improve by adding contextual information from all sentences in the test corpora?

b. Can we improve by using a separate retrieval corpus to compute n-gram representations?
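One possible reading of question 2b can be sketched as follows: instead of embedding a test n-gram directly, represent it by the mean embedding of its nearest neighbours in a separate retrieval corpus. Everything here is a hypothetical instantiation; the `embed` function is again a deterministic random stand-in for a real LLM feature extractor, and the corpus terms are made up.

```python
import hashlib
import numpy as np

def embed(ngram, dim=32):
    """Deterministic stand-in for an LLM n-gram embedding."""
    seed = int.from_bytes(hashlib.md5(ngram.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(dim)

def retrieve_representation(query, corpus_ngrams, k=3):
    """Represent a query n-gram by the mean embedding of its k most
    cosine-similar n-grams in the retrieval corpus."""
    q = embed(query)
    corpus = {g: embed(g) for g in corpus_ngrams}
    sims = {g: v @ q / (np.linalg.norm(v) * np.linalg.norm(q))
            for g, v in corpus.items()}
    topk = sorted(sims, key=sims.get, reverse=True)[:k]
    return np.mean([corpus[g] for g in topk], axis=0), topk

vec, neighbours = retrieve_representation(
    "chest pain", ["chest pain", "fever", "cough", "dyspnea"], k=2)
print(neighbours)
```

A retrieval corpus of clinical text could, in principle, supply domain-appropriate neighbours for rare n-grams that the base extractor represents poorly; whether this actually helps is exactly what the research question asks.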


Additional Questions

1. Can we improve Aug-GAM by using feature extractors other than BERT?

2. Can we improve Aug-GAM by building ensembles of multiple feature extractors?
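For the ensemble question, one simple design is to concatenate each extractor's per-n-gram embedding into a single feature vector, so the downstream additive model stays linear and interpretable. The sketch below uses differently seeded random projections as hypothetical stand-ins for real extractors such as BERT, RoBERTa, or BioBERT.

```python
import hashlib
import numpy as np

def make_extractor(name, dim):
    """Stand-in for a named feature extractor: each 'model' is just a
    differently seeded deterministic random embedding here."""
    def extract(ngram):
        key = f"{name}:{ngram}".encode()
        seed = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
        return np.random.default_rng(seed).standard_normal(dim)
    return extract

def ensemble_embedding(ngram, extractors):
    """Concatenate per-extractor embeddings into one feature vector."""
    return np.concatenate([ex(ngram) for ex in extractors])

bert_like = make_extractor("bert", 16)       # hypothetical extractor names
biobert_like = make_extractor("biobert", 16)
v = ensemble_embedding("heart failure", [bert_like, biobert_like])
print(v.shape)  # (32,)
```

With concatenation, the linear model learns a separate weight slice per extractor, so one can also inspect which extractor's features each prediction relies on.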


Expected results

A detailed study of the method and, based on our experimental results, a research paper that can be submitted to leading NLP / Medical Informatics venues or workshops.


Time period

- November – June

- May – November


Contact

Iacer Calixto, i.coimbra@amsterdamumc.nl

Nishant Mishra, n.mishra@amsterdamumc.nl


References

[1] Singh, Chandan, Armin Askari, Rich Caruana, and Jianfeng Gao. "Augmenting interpretable models with LLMs during training." arXiv preprint arXiv:2209.11799 (2022).

[2] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, CoRR abs/1907.11692 (2019).

[3] Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang, BioBERT: a pre-trained biomedical language representation model for biomedical text mining, Bioinformatics 36 (2020), no. 4, 1234–1240.

[4] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

[5] Yin Lou, Rich Caruana, Johannes Gehrke, and Giles Hooker, Accurate intelligible models with pairwise interactions, Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (New York, NY, USA), KDD ’13, Association for Computing Machinery, 2013, p. 623–631.