CRISPR
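Feature selection
Computer science
Selection (genetic algorithm)
Artificial intelligence
Feature (linguistics)
Machine learning
Biology
Genetics
Gene
Linguistics
Philosophy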
Authors
Jiashun Fu,Xuyang Liu,Ruijie Deng,Xiue Jiang,Wensheng Cai,Haohao Fu,Xueguang Shao
Identifier
DOI:10.1021/acs.jcim.4c02438
Abstract
CRISPR/Cas13a serves as a key tool for nucleic acid tests; therefore, accurate prediction of its activity is essential for creating robust and sensitive diagnosis. In this study, we create a dual-branch neural network model that achieves high prediction accuracy and classification performance across two independent CRISPR/Cas13a data sets, outperforming previously published models relying solely on sequence features. The model integrates direct sequence encoding with descriptive features and yields 99 key descriptive features out of 1553, extracted through statistical analysis, which critically influence guide–target interactions and Cas13a guide activity. By employing Shapley Additive Explanations and Integrated Gradients for feature importance analysis, we show that sequence composition, mismatch type and frequency, and the protospacer flanking site region are primary features. These findings underscore the importance of using descriptive features as complementary inputs to deep learning-based encoding and provide valuable insights into the mechanisms underlying guide–target interaction. All in all, this study not only introduces a reliable and efficient model for Cas13a guide activity prediction but also offers a foundation for future rational design efforts.
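As a rough illustration of the two kinds of input the abstract describes (direct sequence encoding plus descriptive features), the sketch below builds both for a single guide. It is not the authors' implementation: the function names, the three example features, and the convention that the target is supplied in the same sense and alignment as the guide (so a mismatch is simply a differing position) are all assumptions made for illustration. The paper's actual set of 1553 descriptive features is not reproduced here.

```python
# Hypothetical sketch of inputs for a dual-branch model; names and features are assumed.
BASES = "ACGU"


def one_hot(seq):
    """Sequence branch input: one 4-element indicator vector per base."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def descriptive_features(guide, target):
    """Descriptive branch input: a few hand-crafted features (illustrative only).

    Assumes `target` is given in the same sense and alignment as `guide`,
    so a mismatch is any position where the two strings differ.
    """
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    guide, target = guide.upper(), target.upper()
    gc_content = sum(base in "GC" for base in guide) / len(guide)
    mismatch_positions = [
        i for i, (g, t) in enumerate(zip(guide, target)) if g != t
    ]
    return {
        "gc_content": gc_content,
        "mismatch_count": len(mismatch_positions),
        "mismatch_positions": mismatch_positions,
    }


if __name__ == "__main__":
    guide, target = "GGAU", "GCAU"
    print(one_hot(guide))
    print(descriptive_features(guide, target))
```

In a dual-branch design of this kind, the one-hot matrix would typically feed a sequence encoder while the descriptive features feed a separate dense branch, with the two representations merged before the prediction head. How the authors combine them is not stated in the abstract.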