[MA 2024 03] Understand the behaviour of AI models via semantic match

Amsterdam UMC, location AMC
Proposed by: Giovanni Cinà [g.cina@amsterdamumc.nl]

Introduction

One of the biggest problems with Machine Learning (ML) nowadays is the difficulty of understanding what a model has learned and why it produces certain results. Currently, many Explainable AI (XAI) techniques report the ‘importance’ or ‘contribution’ of features for the prediction of an ML model on a specific input, for instance by generating a heatmap indicating which parts of an image are most important or salient for the model. Such techniques are commonly referred to as ‘feature attribution’ or ‘feature importance’ methods, and they are now widely used in both research and industry. Unfortunately, there is no systematic way to collect all these local explanations into a global understanding of what the model has learned. Even worse, people often fall prey to so-called ‘confirmation bias’, meaning that they generalize from a few explanations simply because those explanations fit their prior beliefs.
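To make the setting concrete, below is a minimal sketch of one feature attribution method, a gradient-based saliency map for an image classifier. The model, data and shapes are purely illustrative stand-ins, not the models, datasets or attribution techniques studied in this project.

import torch
import torch.nn as nn

# Toy stand-in for a trained image classifier (hypothetical architecture).
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)
model.eval()

# One grayscale 64x64 image; gradients w.r.t. the pixels are requested.
image = torch.randn(1, 1, 64, 64, requires_grad=True)
logits = model(image)
target_class = logits.argmax(dim=1).item()

# Gradient of the predicted class score w.r.t. the input pixels:
# large absolute gradients mark the pixels the prediction is most sensitive to.
logits[0, target_class].backward()
saliency = image.grad.abs().squeeze()  # 64x64 heatmap of pixel 'importance'
print(saliency.shape, saliency.max())

Such a heatmap is a local explanation: it concerns a single input, and by itself says nothing about what the model has learned in general.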

For example, suppose a doctor is evaluating a model that predicts lung cancer from lung scans. With an explainability technique, the clinician sees that on some images the model is focusing on lung nodules. Is this enough to conclude that the model has learned that nodules are dangerous and a predictor of lung cancer? Is the model reliable? Because the doctor knows that nodules are important, he/she might be led to believe the model has learned this (confirmation bias), but we simply do not know.

Description of the SRP Project/Problem

The goal of this thesis is to contribute to solving this problem by testing and refining a novel method that aggregates local explanations into a general understanding of the model while avoiding confirmation bias. In particular, the thesis will build on the work of the main supervisor [1,2], broadening the set of experiments to new use cases and feature attribution methods. This line of research connects with ongoing research in XAI and can learn from other experiences in the field (e.g. [3] in NLP); a rough illustration of the underlying idea is sketched below.
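As a rough illustration of how local explanations can be aggregated against an explicit human hypothesis rather than judged one image at a time, consider the following sketch. The agreement metric used here (rank correlation between the attribution mass on hypothesized features and the presence of the concept) is an assumption made for illustration only, not the exact procedure of [1,2]; all names and data are hypothetical.

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_samples, n_features = 200, 10

# attributions[i, j]: importance of feature j for the model's prediction on sample i
attributions = rng.normal(size=(n_samples, n_features))
# concept[i]: human annotation of whether the hypothesized concept
# (e.g. 'a nodule is present') holds for sample i
concept = rng.integers(0, 2, size=n_samples)
# The features a domain expert expects to carry the concept (hypothetical choice).
concept_features = [0, 1, 2]

# Aggregate: how much attribution mass lands on the hypothesized features.
mass_on_concept = np.abs(attributions[:, concept_features]).sum(axis=1)

# A dataset-level score instead of a per-example impression:
# does attribution on the concept track the concept's actual presence?
rho, pval = spearmanr(mass_on_concept, concept)
print(f"agreement rho={rho:.2f}, p={pval:.3f}")

The point of such an aggregation is that the hypothesis is stated before looking at individual explanations and is tested across the whole dataset, which is what guards against confirmation bias.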


The student is expected to be independent, proactive, and organized. A proper mathematical background will be required to understand the theoretical underpinnings of the problem. To run the experiments, the student will need i) some experience in developing and applying machine learning algorithms and ii) advanced programming skills in Python.


Research questions

1) Are the results obtained by the supervisor replicable with other feature attribution techniques? This entails the replication of three sets of experiments on:

· tabular data

· image data (classification of images in a synthetic and real-world use case)

· text data (question-answering dataset)

2) Can we extend the experiments to other use cases?

3) Can we find and test hypotheses for a medical use case?


Expected results

Besides the completion of the thesis, the student is expected to

· show an in-depth understanding of the semantic match framework he/she will be testing

· deliver the results of the aforementioned experiments

· curate a versioned and well-documented codebase to allow for the replication of the experiments by other scientists


Time period

· November – June

· May – November


Contact

The student will work under the supervision of Giovanni Cinà (g.cina@amsterdamumc.nl), assistant professor with a joint appointment at the Faculty of Science and the Faculty of Medicine, and will join his group of 4 PhD students and a few MSc students. Supervision meetings will take place weekly, and the student will additionally be encouraged to take part in relevant seminars, reading groups with peers, and social gatherings.


References

[1] Cinà, Giovanni, et al. "Semantic match: Debugging feature attribution methods in XAI for healthcare." Conference on Health, Inference, and Learning. PMLR, 2023.

[2] Cinà, Giovanni, et al. "Fixing confirmation bias in feature attribution methods via semantic match." arXiv preprint arXiv:2307.00897 (2023).

[3] Zhou, Yilun, Marco Tulio Ribeiro, and Julie Shah. "Exsum: From local explanations to model understanding." arXiv preprint arXiv:2205.00130 (2022).