Scientific Research Project

[MA 2022 14] SMALL RNAs DEPTH NORMALIZATION METHOD AND BIOMARKERS MODEL DEVELOPMENT

Amsterdam UMC

Proposed by: Martijn C. Schut [m.c.schut@amsterdamumc.nl]

Place of the SRP Project

Liquid Biopsy Center, Cancer Center Amsterdam, Department of Pathology, Amsterdam UMC

Translational AI Laboratory, Department of Clinical Chemistry, Amsterdam UMC

Introduction – Small non-coding microRNAs (miRNAs) are a family of about 2,000 highly conserved oligonucleotides, each about 21 nucleotides long, that regulate protein translation. Aberrations in miRNA expression and function drive disease, including cancer. Notably, miRNAs are secreted by cells through small vesicles called exosomes, which play a role in cell-cell communication and organ crosstalk. Exosomes enter the circulation and, once they are isolated, their miRNA content can be analyzed and exploited as a source of disease biomarkers.

[Project I; 7 months] Problem description – Most if not all miRNAs are prone to post-transcriptional modifications that slightly alter their nucleotide sequence and function. Thousands of these small miRNA molecules can be measured simultaneously by Next Generation Sequencing, which generates high-dimensional data. However, deep sequencing techniques are prone to systematic non-biological artifacts that arise from variations in experimental handling, the origin of the samples, etc. Consequently, a critical first step is to normalize the high-dimensional data from different samples and sequencing runs so that they are as comparable as possible. Normalized data can then be used to train machine learning models that generate ‘miRNA signatures’ for detecting cancer. However, factors such as sequencing depth or sample quality can affect the number of miRNA molecules recovered from a sample, and therefore the measured level of each miRNA and/or the number of miRNAs detected. This, together with the fact that different normalization methods lead to different results and that there is no consensus in the field, complicates both the extraction of truly biological differences and validation efforts.
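
As a minimal illustration of what depth normalization does, the sketch below applies counts-per-million (CPM) scaling to two toy samples. CPM is one common baseline and is not necessarily the method this project would select. The function name `counts_per_million`, the miRNA labels and the read counts are hypothetical and chosen only for illustration.

```python
def counts_per_million(counts):
    """Scale one sample's raw miRNA read counts to counts per million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {mirna: 1_000_000 * n / total for mirna, n in counts.items()}


# Hypothetical toy samples: same relative composition, tenfold different depth.
shallow = {"miR-A": 50, "miR-B": 30, "miR-C": 20}
deep = {"miR-A": 500, "miR-B": 300, "miR-C": 200}

cpm_shallow = counts_per_million(shallow)
cpm_deep = counts_per_million(deep)

for mirna in shallow:
    print(mirna, cpm_shallow[mirna], cpm_deep[mirna])
```

In this toy case both samples yield the same CPM values despite the tenfold difference in depth. Simple scaling of this kind cannot recover miRNAs that fall below detection in a shallow sample, which is one reason low-input data are harder to normalize.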

Research objective – To identify the best normalization method for low-input sequencing data, which is of crucial importance for application in the clinic.

Research question – How does normalization of low-input sequencing data affect the quality of ML-generated miRNA signatures?

[Project II; 7 months] Problem description – Another key step towards the application of miRNA profiles in the clinic is the discovery of miRNA diagnostic panels. Unbiased selection of secreted miRNA combinations with AI strategies from very large blood sample cohorts is a powerful method to detect cancer at an early stage. It is therefore important to evaluate and/or develop different AI strategies for discovering new miRNA biomarker panels, both for diagnosis and for monitoring response to therapy in cancer. This is especially crucial for cancer types that need minimally invasive response monitoring tools. These methods should be optimized for relatively small, clinical trial-sized cohorts of patient samples. We have strong preliminary data showing that LASSO regression models using sequencing data of secreted miRNAs as input features can improve response prediction in small clinical cohorts of patients with haematological malignancies.
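
To illustrate why LASSO lends itself to panel discovery, the sketch below fits an L1-penalized linear regression by coordinate descent on a toy data set. It is a simplified, dependency-free sketch and not the pipeline behind the preliminary data: the feature names, values and penalty `alpha` are hypothetical, there is no intercept or cross-validation, and a real analysis would more likely use an established library and a classification variant of the model.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Hypothetical normalized, centered expression values for three miRNAs
# (columns) in four samples (rows); the response tracks the first miRNA.
names = ["miR-A", "miR-B", "miR-C"]
X = [[1.0, 0.5, -1.0], [-1.0, 0.5, 1.0], [1.0, -0.5, 1.0], [-1.0, -0.5, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
panel = [name for name, weight in zip(names, weights) if weight != 0.0]
print(panel)
```

The L1 penalty sets the weights of weakly associated features to exactly zero, so the miRNAs with non-zero weights form the candidate panel. Here only the first miRNA is retained.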

Research objective – To outperform existing panel discovery methodologies and find better discriminating panels of miRNAs.

Research question – Which AI algorithm (adapted or newly developed) can best discover miRNA biomarker panels?

Contact – Liquid Biopsy Center, Amsterdam UMC: Dr. Michiel Pegtel (d.pegtel@amsterdamumc.nl), Dr. Cristina Gómez Martín (c.a.gomezmartin@amsterdamumc.nl); Translational AI Laboratory: Prof. dr. Martijn Schut (m.c.schut@amsterdamumc.nl).