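Computer science
Machine learning
Artificial intelligence
Generalization
Usability
Scalability
Human-computer interaction
Mathematics
Database
Mathematical analysis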
Authors
Hua Shi,Shuang Li,Xi Su
Source
Journal: Methods
[Elsevier]
Date: 2022-02-26
Volume and pages: 204: 126-131
Citations: 3
Identifier
DOI:10.1016/j.ymeth.2022.02.009
Abstract
N6-methyladenine (6mA) in DNA, a type of DNA methylation in epigenetic modification, has attracted extensive attention in recent years. In order to improve our understanding of 6mA biological activities and mechanisms in plant genomes, we need to be able to accurately identify 6mA sites. Because traditional wet-lab experiments frequently necessitate a large amount of manpower and time, a plethora of computational methods, particularly machine learning, have emerged to achieve fast and accurate 6mA site prediction. Traditional machine learning methods, on the other hand, rely heavily on manual features and integrated learning to improve performance, resulting in a reliance on prior knowledge and a large model scale. Furthermore, many models are only trained and tested for one species, with no comparison of model generalization performance, resulting in models with limited practical usability. In order to increase the generalization capability of the model, we propose a lightweight structure predictor Plant6mA based on Transformer encoder. Based on results on independent test sets, our proposed Plant6mA has better generalization performance than the most advanced methods in predicting 6mA location in plant genomes. Plant6mA's MultiHead Attention mechanism effectively enhances its expressive ability by capturing potential biological information from multiple scales of the input sequence. Furthermore, we used a dimensionality reduction tool to visualize Plant6mA's training process and visually demonstrate the effectiveness of our model.
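The multi-head attention step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' Plant6mA implementation: the one-hot encoding, the model width, the number of heads, the random weights, and the example sequence are all assumptions chosen for illustration, and the positional encoding, layer normalization, and feed-forward sublayers of a full Transformer encoder are omitted.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len(seq), 4) matrix; unknown bases such as N stay all-zero."""
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            x[i, j] = 1.0
    return x

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention with several heads.

    x is (L, d_model); each weight matrix is (d_model, d_model).
    Returns the (L, d_model) output and the (num_heads, L, L) attention weights.
    """
    length, d_model = x.shape
    d_head = d_model // num_heads

    def split(t):
        # (L, d_model) -> (num_heads, L, d_head)
        return t.reshape(length, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)
    context = attn @ v
    merged = context.transpose(1, 0, 2).reshape(length, d_model)
    return merged @ w_o, attn

# Hypothetical settings: a 10-nt sequence, model width 16, 4 heads, random weights.
rng = np.random.default_rng(0)
d_model, num_heads = 16, 4
seq = "ACGTTGCAAN"
embed = rng.normal(size=(4, d_model))      # assumed linear embedding of the one-hot input
x = one_hot(seq) @ embed
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))
out, attn = multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(out.shape, attn.shape)
```

Each head attends over the whole sequence with its own projection, which is one way to read the abstract's remark about capturing information at multiple scales. In a trained model the weights would be learned rather than random.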