Abstract
Cerebrospinal fluid (CSF) is an essential matrix for the discovery of neurological disease biomarkers. However, the high dynamic range of protein concentrations in CSF hinders the detection of the least abundant protein biomarkers by untargeted mass spectrometry. It is thus beneficial to gain a deeper understanding of the secretion processes within the brain. Here, we aim to explore whether and how the secretion of brain proteins to the CSF can be predicted. By combining a curated CSF proteome and the brain elevated proteome of the Human Protein Atlas, brain proteins were classified as CSF or non-CSF secreted. A machine learning model was trained on a range of sequence-based features to differentiate between CSF and non-CSF groups and effectively predict the brain origin of proteins. The classification model achieves an area under the curve of 0.89 when using high confidence CSF proteins. The most important prediction features include subcellular localization, signal peptides, and transmembrane regions. The classifier generalized well to the larger brain detected proteome and correctly predicted novel CSF proteins identified by affinity proteomics. In addition to elucidating the underlying mechanisms of protein secretion, the trained classification model can support biomarker candidate selection.
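As a rough illustration of the two steps the abstract names, labeling brain proteins by CSF membership and scoring a classifier by area under the ROC curve, a minimal Python sketch follows. It is not the authors' pipeline: the function names, protein identifiers, and scores are hypothetical placeholders, and the AUC is computed by a plain pairwise comparison rather than any particular library.

```python
def label_brain_proteins(brain_elevated, csf_proteome):
    """Label each brain elevated protein 1 if it appears in the CSF proteome, else 0."""
    csf = set(csf_proteome)
    return {protein: int(protein in csf) for protein in sorted(set(brain_elevated))}


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of positive/negative
    pairs in which the positive has the higher score (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: identifiers and scores are placeholders, not study data.
brain_elevated = ["P1", "P2", "P3", "P4"]
csf_proteome = ["P1", "P3", "P9"]
labels = label_brain_proteins(brain_elevated, csf_proteome)

# Placeholder classifier outputs (probability of CSF secretion per protein).
scores = {"P1": 0.9, "P2": 0.4, "P3": 0.3, "P4": 0.1}
auc = roc_auc([labels[p] for p in labels], [scores[p] for p in labels])
print(labels, auc)
```

In practice the scores would come from a model trained on sequence-based features such as predicted subcellular localization, signal peptides, and transmembrane regions; the sketch only shows how labels and the evaluation metric relate.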
Original language | English |
---|---|
Pages (from-to) | 3068-3080 |
Number of pages | 13 |
Journal | Journal of Proteome Research |
Volume | 22 |
Issue number | 9 |
Publication status | Published - 1 Sept 2023 |
Bibliographical note
Publisher Copyright: © 2023 The Authors. Published by American Chemical Society.
Funding
Funders | Funder number |
---|---|
AC-Immune | |
BAZIS | |
Cergentis BV | |
Alzheimer's Association | |
Alzheimer's Drug Discovery Foundation | LSHM20106, 73305095007 |
Eli Lilly and Company | |
Roche | |
Horizon 2020 Framework Programme | 860197, LSHM21018 |
EU Joint Programme – Neurodegenerative Disease Research | |
ZonMw | |
Nederlandse Organisatie voor Wetenschappelijk Onderzoek | |
Alzheimer Nederland | |
Keywords
- brain proteome
- cerebrospinal fluid
- fluid biomarker
- machine learning
- protein secretion