Novel multiclass classification machine learning approach for the early-stage classification of systemic autoimmune rheumatic diseases

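Keywords: Medicine; Rheumatic disease; Multiclass classification; Machine learning; Artificial intelligence; Immunology; Intensive care medicine; Rheumatoid arthritis; Support vector machine; Computer science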

Authors

Yuhang Li, Wei Wei, Renren Ouyang, Rujia Chen, Ting Wang, Yuan Xu, Feng Wang, Hongyan Hou

Source

Journal: Lupus Science & Medicine [BMJ]
Date: 2024-01-01 Volume (Issue): 11 (1): e001125

Links

bmj.com, doi.org

Identifier

DOI: 10.1136/lupus-2023-001125

Abstract

Objective Systemic autoimmune rheumatic diseases (SARDs) encompass a diverse group of complex conditions with overlapping clinical features, making accurate diagnosis challenging. This study aims to develop a multiclass machine learning (ML) model for early-stage SARDs classification using accessible laboratory indicators.
Methods A total of 925 patients with SARDs were included, categorised into systemic lupus erythematosus (SLE), Sjögren’s syndrome (SS) and inflammatory myositis (IM). Clinical characteristics and laboratory markers were collected, and nine key indicators were selected for model building: anti-dsDNA, anti-SS-A60, anti-Sm/nRNP, antichromatin, anti-dsDNA (indirect immunofluorescence assay), haemoglobin (Hb), platelet, neutrophil percentage and cytoplasmic patterns (AC-19, AC-20). Various ML algorithms were used to construct a tripartite classification ML model.
Results Patients were divided into two cohorts; cohort 1 was used to construct the tripartite classification model. Among the models assessed, the random forest (RF) model demonstrated superior performance in distinguishing SLE, IM and SS (area under curve=0.953, 0.903 and 0.836; accuracy=0.892, 0.869 and 0.857; sensitivity=0.890, 0.868 and 0.795; specificity=0.910, 0.836 and 0.748; positive predictive value=0.922, 0.727 and 0.663; and negative predictive value=0.854, 0.915 and 0.879, respectively). The RF model excelled in classifying SLE (precision=0.930, recall=0.985, F1 score=0.957). For IM and SS, the RF model achieved precision of 0.793 and 0.950, recall of 0.920 and 0.679, and F1 scores of 0.852 and 0.792, respectively. Cohort 2 served as an external validation set, on which the model achieved an overall accuracy of 87.3%. Individual classification performances for SLE, SS and IM were excellent, with precision, recall and F1 scores specified. SHAP analysis highlighted significant contributions from antibody profiles.
Conclusion This pioneering multiclass ML model, using basic laboratory indicators, enhances clinical feasibility and demonstrates promising potential for SARDs classification. The collaboration of clinical expertise and ML offers a nuanced approach to SARDs classification, with potential for enhanced patient care.
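
The F1 scores quoted in the Results are the harmonic mean of the corresponding precision and recall values. The short Python sketch below checks that arithmetic using only the figures given in the abstract; the function name `f1_score` and the dictionary layout are illustrative choices, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs for the RF model, copied from the abstract.
reported = {
    "SLE": (0.930, 0.985),
    "IM": (0.793, 0.920),
    "SS": (0.950, 0.679),
}

# Recompute F1 per class and round to the three decimals used in the abstract.
f1_by_class = {name: round(f1_score(p, r), 3) for name, (p, r) in reported.items()}
print(f1_by_class)
```

The recomputed values can be compared against the F1 scores stated in the abstract (0.957, 0.852 and 0.792) as a quick consistency check.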
