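Hyperplane
k-nearest neighbors algorithm
Classifier (UML)
Pattern recognition (psychology)
Artificial intelligence
Computer science
Independence (probability theory)
Large margin nearest neighbor
Mathematics
Statistics
Combinatorics
Authors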
Rui Fan,Yongsheng Ding,Quan Zou,Yuan Liu
Identifier
DOI:10.1016/j.ijbiomac.2023.125774
Abstract
Vesicular transport proteins participate in various biological processes and play a significant role in the movement of substances within cells. These proteins are associated with numerous human diseases, making their identification particularly important. In this study, we developed a novel strategy for accurately identifying vesicular transport proteins. At its core is a multi-view classifier, the graph-regularized k-local hyperplane distance nearest neighbor model (HSIC-GHKNN), which combines a Hilbert-Schmidt independence criterion (HSIC)-based multi-view learning method with a local hyperplane distance nearest-neighbor classifier. We first extracted protein evolution information using two feature extraction methods, pseudo-position-specific scoring matrix (PsePSSM) and AATP, and addressed dataset imbalance using the Edited Nearest Neighbors (ENN) algorithm. Subsequently, we employed a local hyperplane distance nearest-neighbor classifier to classify each view and added an HSIC term to maintain independence between views. We then assessed the performance of our identification strategy and analyzed the PsePSSM and AATP feature sets to determine which factors influence the classification results. The experimental results demonstrate that the accuracy and Matthews correlation coefficient of our strategy on the independent test set are 85.8% and 0.548, respectively. Our approach outperformed existing methods on most evaluation metrics. In addition, the proposed multi-view classification model can easily be applied to similar identification tasks.
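To illustrate the HSIC term mentioned in the abstract, the following is a minimal sketch of the standard biased empirical HSIC estimator, tr(KHLH)/(n-1)^2. The abstract does not specify the kernel or the exact form used in HSIC-GHKNN, so the Gaussian kernel, the `sigma` bandwidth, and the function names `rbf_kernel` and `hsic` are all assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian kernel matrix for the rows of X (kernel choice is an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC between two samples with the same number of rows."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(200, 1))
    dependent_view = view_a.copy()              # fully dependent on view_a
    independent_view = rng.normal(size=(200, 1))  # drawn independently
    print("dependent:  ", hsic(view_a, dependent_view))
    print("independent:", hsic(view_a, independent_view))
```

A larger value indicates stronger statistical dependence between the two samples, which is why a term of this kind can serve as a penalty when views are meant to stay independent.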