TY - JOUR
T1 - Leveraging machine learning for predicting acute graft-versus-host disease grades in allogeneic hematopoietic cell transplantation for T-cell prolymphocytic leukaemia
AU - Chandra, G
AU - Wang, JF
AU - Siirtola, P
AU - Röning, J
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/5/11
Y1 - 2024/5/11
AB - Orphan diseases, exemplified by T-cell prolymphocytic leukemia, present inherent challenges due to limited data availability and complexities in effective care. This study delves into harnessing the potential of machine learning to enhance care strategies for orphan diseases, specifically focusing on allogeneic hematopoietic cell transplantation (allo-HCT) in T-cell prolymphocytic leukemia. The investigation evaluates how varying numbers of variables impact model performance, considering the rarity of the disease. Utilizing data from the Center for International Blood and Marrow Transplant Research, the study scrutinizes outcomes following allo-HCT for T-cell prolymphocytic leukemia. Diverse machine learning models were developed to forecast acute graft-versus-host disease (aGvHD) occurrence and its distinct grades post-allo-HCT. Assessment of model performance relied on balanced accuracy, F1 score, and ROC AUC metrics. The findings highlight the Linear Discriminant Analysis (LDA) classifier achieving the highest testing balanced accuracy of 0.58 in predicting aGvHD. However, challenges arose in its performance during multi-class classification tasks. While affirming the potential of machine learning in enhancing care for orphan diseases, the study underscores the impact of limited data and disease rarity on model performance.
KW - Acute graft-versus-host disease
KW - Allogeneic hematopoietic cell transplantation
KW - Data size
KW - Machine learning
KW - Model performance
KW - Orphan diseases
KW - T-cell prolymphocytic leukemia
UR - http://www.scopus.com/inward/record.url?scp=85192930555&partnerID=8YFLogxK
U2 - 10.1186/s12874-024-02237-y
DO - 10.1186/s12874-024-02237-y
M3 - Article
C2 - 38734644
SN - 1471-2288
VL - 24
JO - BMC Medical Research Methodology
JF - BMC Medical Research Methodology
IS - 1
M1 - 112
ER -