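Mixture model
Brier score
Computer science
Artificial intelligence
Support vector machine
Cluster analysis
Kernel (algebra)
Oversampling
Pattern recognition (psychology)
Mathematics
Bandwidth (computing)
Combinatorics
Computer network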
Authors
Meng Xing,Yanbo Zhang,Hongmei Yu,Zhenhuan Yang,Xueling Li,Qiong Li,Yanlin Zhao,Zhiqiang Zhao,Yanhong Luo
Identifier
DOI:10.1016/j.cmpb.2022.107103
Abstract
Diffuse large B-cell lymphoma (DLBCL) is a common non-Hodgkin's lymphoma in adults. Relapse mainly occurs within two years of diagnosis and carries a poor prognosis; relapse after two years is less frequent and has a better prognosis. In this work, we constructed a model to predict relapse within two years in DLBCL patients, with the aim of providing a reference for clinicians implementing individualized treatment.

We propose a secondary-level class imbalance method based on Gaussian mixture model (GMM) clustering resampling to balance the data. We then use a multi-kernel support vector machine (SVM) to characterize the heterogeneous clinical data. Finally, we combine the two to identify patients who relapse within two years.

Among all the class imbalance methods in this work, inverse weighted-GMM+SMOTEENN performs best. Compared with NO-GMM (directly using SMOTEENN without the GMM clustering process), its area under the ROC curve (AUC) increases by 8.75%, and its expected calibration error (ECE) and Brier score decrease by 2.07% and 3.09%, respectively. Among the four classification algorithms in this work, multiple kernel learning (MKL) has the lowest Brier score and ECE and the highest AUC, accuracy, recall, precision, and F1, giving it the best discrimination and calibration.

Our inverse weighted-GMM+SMOTEENN+MKL (GMM-SENN-MKL) method handles class imbalance and heterogeneous clinical data well and can be used to predict recurrence in DLBCL patients.
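
The abstract reports calibration using the Brier score and the expected calibration error (ECE) but does not say how they were computed. As a rough illustration only, the sketch below shows one common way to compute both for a binary classifier. The function names, the equal-width binning, the default of 10 bins, and the toy data are assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """Binned ECE on the positive-class probability (an assumed variant).

    Predictions are grouped into equal-width probability bins; the ECE is the
    sample-weighted average of |observed positive rate - mean predicted
    probability| over the non-empty bins.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    n = len(y_true)
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += (len(members) / n) * abs(observed - predicted)
    return ece


# Toy labels (1 = relapse within two years) and predicted probabilities.
# These values are made up for illustration and are not from the study.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.6, 0.9]
print(brier_score(labels, probs))
print(expected_calibration_error(labels, probs, n_bins=2))
```

Lower values are better for both metrics, which is why the abstract describes decreases in ECE and Brier score as improvements. Note that ECE depends on the number of bins and the binning scheme, so values are only comparable when computed the same way.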