Predicting Structural Motifs of Glycosaminoglycans using Cryogenic Infrared Spectroscopy and Random Forest

Jerome Riedel, Maike Lettow, Márkó Grabarics, Michael Götze, Rebecca L. Miller, Geert Jan Boons, Gerard Meijer, Gert von Helden, Gergo Peter Szekeres*, Kevin Pagel*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

In recent years, glycosaminoglycans (GAGs) have come into the focus of biochemical and biomedical research due to their importance in a variety of physiological processes. These molecules show great structural diversity, which makes their analysis highly challenging. A promising tool for identifying the structural motifs and conformation of shorter GAG chains is cryogenic gas-phase infrared (IR) spectroscopy. In this work, the cryogenic gas-phase IR spectra of mass-selected heparan sulfate (HS) di-, tetra-, and hexasaccharide ions were recorded to extract vibrational features that are characteristic of structural motifs. The data were augmented with chondroitin sulfate (CS) disaccharide spectra to assemble a training library for random forest (RF) classifiers. These classifiers were used to discriminate between GAG classes (CS or HS) and between different sulfate positions (2-O-, 4-O-, 6-O-, and N-sulfation). With optimized data preprocessing and RF modeling, a prediction accuracy of >97% was achieved for HS tetra- and hexasaccharides based on a training set of only 21 spectra. These results exemplify the importance of combining gas-phase cryogenic IR ion spectroscopy with machine learning to improve future analytical workflows for GAG sequencing and for other biomolecules, such as metabolites.
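The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the spectra are synthetic Gaussian bands, the band positions, wavenumber grid, noise level, and max-normalization are all assumptions chosen for illustration, and only the two-class CS/HS problem is shown. It assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical wavenumber grid in cm^-1 (not taken from the paper).
wavenumbers = np.linspace(1000.0, 1800.0, 200)

# Hypothetical band centers, used only to make the two classes separable.
centers = {"CS": 1250.0, "HS": 1400.0}


def synthetic_spectrum(band_center):
    """Return one Gaussian band plus noise as a stand-in for a measured spectrum."""
    band = np.exp(-0.5 * ((wavenumbers - band_center) / 15.0) ** 2)
    return band + 0.05 * rng.normal(size=wavenumbers.size)


def make_set(n_per_class):
    """Build a labeled set of synthetic spectra, each scaled to unit maximum."""
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            # Jitter the band position slightly from spectrum to spectrum.
            X.append(synthetic_spectrum(center + rng.normal(scale=5.0)))
            y.append(label)
    X = np.array(X)
    X = X / X.max(axis=1, keepdims=True)  # simple assumed preprocessing step
    return X, np.array(y)


# A small training library, in the spirit of the small set in the abstract.
X_train, y_train = make_set(10)
X_test, y_test = make_set(5)

# Each intensity value on the grid is one feature for the random forest.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Predicting sulfate positions would be a multi-label problem (a saccharide can carry several sulfate groups), which could be handled by training one such classifier per position; the accuracy printed here says nothing about performance on real spectra.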

Original language: English
Pages (from-to): 7859-7868
Number of pages: 10
Journal: Journal of the American Chemical Society
Volume: 145
Issue number: 14
Publication status: Published - 12 Apr 2023
