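Random forest
Computer science
Identification (biology)
Set (abstract data type)
Sequence (biology)
Machine learning
Binary classification
Test set
Binary number
Artificial intelligence
Data mining
Computational biology
Support vector machine
Mathematics
Chemistry
Biology
Botany
Biochemistry
Arithmetic
Programming language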
Authors
Hua Deng,Meng Ding,Sheng Wang,Weihua Li,Guixia Liu,Yun Tang
Identifier
DOI:10.1016/j.compbiomed.2023.106844
Abstract
Anticancer peptides (ACPs), a series of short bioactive peptides, are promising candidates for fighting cancer because of their high activity, low toxicity, and low likelihood of inducing drug resistance. Accurate identification of ACPs and classification of their functional types are of great importance for investigating their mechanisms of action and developing peptide-based anticancer therapies. Here, we provide a computational tool, called ACP-MLC, that addresses binary classification and multi-label classification of ACPs for a given peptide sequence. Briefly, ACP-MLC is a two-level prediction engine, in which the 1st-level model predicts whether a query sequence is an ACP using the random forest algorithm, and the 2nd-level model predicts which tissue types the sequence might target using the binary relevance algorithm. Developed and evaluated on high-quality datasets, ACP-MLC yielded an area under the receiver operating characteristic curve (AUC) of 0.888 on the independent test set for the 1st-level prediction, and obtained a Hamming loss of 0.157, a subset accuracy of 0.577, a macro-averaged F1-score of 0.802, and a micro-averaged F1-score of 0.826 on the independent test set for the 2nd-level prediction. A systematic comparison demonstrated that ACP-MLC outperformed existing binary classifiers and other multi-label learning classifiers for ACP prediction. Finally, we interpreted the important features of ACP-MLC using the SHAP method. User-friendly software and the datasets are available at https://github.com/Nicole-DH/ACP-MLC. We believe that ACP-MLC will be a powerful tool for ACP discovery.
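The four multi-label metrics reported for the 2nd-level prediction can be illustrated with a minimal pure-Python sketch. It uses the standard textbook definitions and is not taken from the ACP-MLC code base; the function names, the toy label matrices, and the convention that an F1-score with no positives counts as 0 are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# One row per peptide, one 0/1 column per (hypothetical) tissue label.
Labels = Sequence[Sequence[int]]


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


def subset_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of peptides whose whole label set is predicted exactly."""
    exact = sum(list(rt) == list(rp) for rt, rp in zip(y_true, y_pred))
    return exact / len(y_true)


def _counts(y_true: Labels, y_pred: Labels, j: int) -> Tuple[int, int, int]:
    """True positives, false positives, and false negatives for label j."""
    tp = sum(1 for rt, rp in zip(y_true, y_pred) if rt[j] == 1 and rp[j] == 1)
    fp = sum(1 for rt, rp in zip(y_true, y_pred) if rt[j] == 0 and rp[j] == 1)
    fn = sum(1 for rt, rp in zip(y_true, y_pred) if rt[j] == 1 and rp[j] == 0)
    return tp, fp, fn


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined as 0 here when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_macro(y_true: Labels, y_pred: Labels) -> float:
    """Unweighted mean of the per-label F1-scores."""
    n_labels = len(y_true[0])
    return sum(_f1(*_counts(y_true, y_pred, j)) for j in range(n_labels)) / n_labels


def f1_micro(y_true: Labels, y_pred: Labels) -> float:
    """F1-score computed from counts pooled over all labels."""
    tp = fp = fn = 0
    for j in range(len(y_true[0])):
        a, b, c = _counts(y_true, y_pred, j)
        tp, fp, fn = tp + a, fp + b, fn + c
    return _f1(tp, fp, fn)


# Toy data (invented): 4 peptides, 3 hypothetical tissue labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

print(round(hamming_loss(y_true, y_pred), 3))     # 2 wrong slots out of 12
print(round(subset_accuracy(y_true, y_pred), 3))  # 2 exact rows out of 4
print(round(f1_macro(y_true, y_pred), 3))
print(round(f1_micro(y_true, y_pred), 3))
```

Lower is better for Hamming loss, while higher is better for the other three. Subset accuracy is the strictest of the four because one wrong label makes the whole peptide count as a miss, which is consistent with it being the lowest of the reported accuracy-type scores.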