Identifying novel risk factors for aneurysmal subarachnoid haemorrhage using machine learning

Jos P Kanning, Junfeng Wang, Shahab Abtahi, Mirjam I Geerlings, Ynte M Ruigrok*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Aneurysmal subarachnoid haemorrhage (aSAH) is a type of stroke with high mortality and morbidity. This study aimed to identify novel aSAH risk factors by combining machine learning (ML) and traditional statistical methods. Using the UK Biobank, we identified aSAH cases via hospital-based ICD codes and analysed 618 baseline variables covering demographics, lifestyle, medical history, and physical measurements. The CatBoost ML algorithm and Shapley Additive Explanations (SHAP) identified the top 25 variables most influential in predicting aSAH. Logistic regression further described these variables while adjusting for established aSAH risk factors. Among 501,847 participants, 893 aSAH cases were identified. ML identified 214 variables with non-zero SHAP values. Logistic regression of the top 25 variables revealed four potential novel aSAH risk factors. Increased aSAH risk was associated with mean sphered cell volume (OR 1.02, 95% CI 1.00-1.03) and tea intake (OR 1.03, 95% CI 1.01-1.05). Decreased aSAH risk was associated with peak expiratory flow (OR 0.80, 95% CI 0.66-0.96) and haematocrit percentage (OR 0.97, 95% CI 0.95-1.00). Future research should validate these findings and explore the potential non-linear relationships and interactions indicated by the ML models.
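
The variable-selection step described in the abstract (keeping variables with non-zero SHAP values, then taking the most influential ones) can be illustrated with a minimal sketch. The abstract does not say how SHAP values were aggregated across participants; ranking by mean absolute SHAP value is a common convention and is assumed here. The function name, the variable names, and the toy numbers are all hypothetical and are not taken from the study.

```python
from statistics import fmean


def rank_by_mean_abs_shap(shap_rows, feature_names, top_k):
    """Rank features by mean absolute SHAP value (an assumed aggregation rule).

    shap_rows: one list of per-feature SHAP values per participant.
    Returns the top_k feature names, the count of features with non-zero
    importance, and the full importance mapping.
    """
    importance = {
        name: fmean(abs(row[j]) for row in shap_rows)
        for j, name in enumerate(feature_names)
    }
    # Features whose SHAP value is zero for every participant carry no signal.
    nonzero = [name for name, value in importance.items() if value > 0.0]
    ranked = sorted(nonzero, key=lambda name: importance[name], reverse=True)
    return ranked[:top_k], len(nonzero), importance


# Toy SHAP matrix: 3 participants x 4 hypothetical variables (not study data).
feature_names = ["var_a", "var_b", "var_c", "var_d"]
shap_rows = [
    [0.10, -0.30, 0.0, 0.02],
    [-0.20, 0.25, 0.0, -0.01],
    [0.15, -0.35, 0.0, 0.03],
]

top, n_nonzero, importance = rank_by_mean_abs_shap(shap_rows, feature_names, top_k=2)
print(top, n_nonzero)
```

In the study's setting, the analogous call would use 618 variables and `top_k=25`, and the selected variables would then be carried into the logistic regression adjusted for established risk factors.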

Original language: English
Article number: 9256
Journal: Scientific Reports
Volume: 15
Issue number: 1
Publication status: Published - 18 Mar 2025

Bibliographical note

Publisher Copyright:
© The Author(s) 2025.

Funding

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No. 852173).

Funders                             Funder number
European Research Council
Horizon 2020 Framework Programme    852173

    Keywords

    • Adult
    • Aged
    • Female
    • Humans
    • Logistic Models
    • Machine Learning
    • Male
    • Middle Aged
    • Risk Factors
    • Subarachnoid Hemorrhage/epidemiology
    • United Kingdom/epidemiology
