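Principal component analysis
Artificial intelligence
Support vector machine
Pattern recognition (psychology)
Classifier (UML)
Random forest
Feature (linguistics)
Laser-induced breakdown spectroscopy
Feature extraction
Computer science
Mathematics
Laser
Physics
Optics
Linguistics
Philosophy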
Authors
Qianqian Wang,Geer Teng,Xiaolei Qiao,Yu Zhao,Jinglin Kong,Liqiang Dong,Xutai Cui
Abstract
The correct classification of pathogenic bacteria is important for clinical diagnosis and treatment. Compared with using whole spectral data, using feature lines as the inputs of the classification model can improve the correct classification rate (CCR) and reduce the analysis time. To select feature lines, the contribution of each spectral line to the CCR must be investigated. In this paper, two algorithms, importance weights based on principal component analysis (IW-PCA) and random forests (RF), were proposed to evaluate the importance of spectral lines. The laser-induced breakdown spectroscopy (LIBS) spectra of six common clinical pathogenic bacteria species were measured, and a support vector machine (SVM) classifier was used to classify them. In the proposed IW-PCA algorithm, the product of the loading of each line and the variance of the corresponding principal component (PC) was calculated, and the maximum product of each line over the first three PCs was used as the line's importance weight. In the RF algorithm, the Gini index reduction value of each line was taken as the line's importance weight. The experimental results demonstrated that the lines with high importance were more suitable for classification and can be chosen as feature lines. The optimal number of feature lines used in the SVM classifier can be determined by comparing the CCRs obtained with different numbers of feature lines. Importance weights evaluated by RF are more suitable than those evaluated by IW-PCA for extracting feature lines for LIBS combined with an SVM classifier. Furthermore, the two methods mutually verified the importance of the selected lines, and the lines rated important by both IW-PCA and RF contributed more to the CCR.
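The IW-PCA weighting described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `iw_pca_weights` and `top_lines` are hypothetical, "variance of the corresponding principal component" is assumed here to mean the explained variance ratio, and the loading is taken in absolute value because the sign of a PC is arbitrary. The abstract does not specify either choice.

```python
import numpy as np


def iw_pca_weights(spectra, n_components=3):
    """Importance weight per spectral line (sketch of the IW-PCA idea).

    spectra: array of shape (n_samples, n_lines), one spectrum per row.
    Returns an array of shape (n_lines,).
    """
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)  # center each line (column)

    # SVD of the centered data: rows of vt are the PC loading vectors,
    # and s**2 is proportional to the variance captured by each PC.
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, vt.shape[0])

    # Assumption: "variance" = explained variance ratio of each PC.
    var_ratio = s**2 / np.sum(s**2)

    # Product of |loading| and PC variance, for each of the first k PCs.
    products = np.abs(vt[:k]) * var_ratio[:k, None]  # shape (k, n_lines)

    # The weight of a line is its maximum product over those PCs.
    return products.max(axis=0)


def top_lines(weights, n_lines):
    """Indices of the n_lines most important lines, highest weight first."""
    return np.argsort(weights)[::-1][:n_lines]


# Synthetic stand-in for measured spectra: 30 samples, 5 lines,
# with only line index 2 varying strongly between three groups.
rng = np.random.default_rng(0)
demo = rng.normal(0.0, 0.01, size=(30, 5))
demo[:, 2] += np.repeat([0.0, 1.0, 2.0], 10)

weights = iw_pca_weights(demo)
selected = top_lines(weights, 2)
```

In the workflow the abstract outlines, the selected line indices would then be used as classifier inputs, and the number of lines would be tuned by comparing the CCR obtained for different values of `n_lines`. The RF-based weights would replace `iw_pca_weights` with the Gini importance reported by a random forest.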