Lysine
Microbiota
Computational biology
Genome
Human microbiome
Biology
Bacteriophage
Bioinformatics
Genetics
Gene
Escherichia coli
Authors
Yiran Fu, Shuting Yu, Jianfeng Li, Z Lao, Xiaofeng Yang, Zhanglin Lin
Source
Journal: Cell Reports
[Elsevier]
Date: 2024-08-01
Volume (issue): 43 (8): 114583
Identifier
DOI:10.1016/j.celrep.2024.114583
Abstract
Vast shotgun metagenomics data remain an underutilized resource for novel enzymes. Artificial intelligence (AI) has increasingly been applied to protein mining, but its conventional performance evaluation is interpolative in nature, and these trained models often struggle to extrapolate effectively when challenged with unknown data. In this study, we present a framework (DeepMineLys [deep mining of phage lysins from human microbiome]) based on the convolutional neural network (CNN) to identify phage lysins from three human microbiome datasets. When validated with an independent dataset, our method achieved an F1-score of 84.00%, surpassing existing methods by 20.84%. We expressed 16 lysin candidates from the top 100 sequences in E. coli, confirming 11 as active. The best one displayed an activity 6.2-fold that of lysozyme derived from hen egg white, establishing it as the most potent lysin from the human microbiome. Our study also underscores several important issues when applying AI to biology questions. This framework should be applicable for mining other proteins.
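For readers unfamiliar with the metric, the F1-score cited in the abstract is the harmonic mean of precision and recall. The sketch below shows the calculation. The confusion counts are hypothetical values chosen only for illustration and are not taken from the paper; the function name `f1_score` is likewise an assumption. The only figures drawn from the abstract are the 11 active lysins out of 16 expressed candidates.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall)
    from true-positive, false-positive, and false-negative counts."""
    if tp == 0:
        # With no true positives, precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (NOT from the paper).
example_f1 = f1_score(tp=84, fp=16, fn=16)

# Experimental confirmation rate stated in the abstract:
# 11 of the 16 expressed lysin candidates were active.
hit_rate = 11 / 16

print(f"Example F1: {example_f1:.2%}")
print(f"Confirmed active: {hit_rate:.2%}")
```

Because F1 balances precision against recall, it is a common choice when the positive class (here, true lysins) is rare relative to the background of metagenomic sequences.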