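Computer science
Artificial intelligence
Deep learning
Pattern recognition (psychology)
Autoencoder
Ribonucleic acid
Artificial neural network
Computational biology
Convolutional neural network
Machine learning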
Authors
Min Zeng, Yifan Wu, Chengqian Lu, Fuhao Zhang, Fang-Xiang Wu, Min Li
Identifier
DOI:10.1101/2021.03.13.435245
Abstract
Motivation: Long non-coding RNAs (lncRNAs) are a class of RNA molecules longer than 200 nucleotides. A growing body of evidence shows that the subcellular localization of lncRNAs can provide valuable insights into their biological functions. Existing computational methods for predicting lncRNA subcellular localization use k-mer features to encode lncRNA sequences. However, using only k-mer features loses the sequence order information.
Results: We proposed a deep learning framework, DeepLncLoc, to predict lncRNA subcellular localization. In DeepLncLoc, we introduced a new subsequence embedding method that keeps the order information of lncRNA sequences. The subsequence embedding method first divides a sequence into consecutive subsequences, then extracts the patterns of each subsequence, and finally combines these patterns to obtain a complete representation of the lncRNA sequence. After that, a text convolutional neural network is employed to learn high-level features and perform the prediction task. Compared to traditional machine learning models with k-mer features and to existing predictors, DeepLncLoc achieved better performance, which shows that DeepLncLoc could effectively predict lncRNA subcellular localization. Our study not only presented a novel computational model for predicting lncRNA subcellular localization but also provided a new subsequence embedding method, which is expected to be applicable to other sequence-based prediction tasks.
Availability: The DeepLncLoc web server, source code and datasets are freely available at http://bioinformatics.csu.edu.cn/DeepLncLoc/ and https://github.com/CSUBioGroup/DeepLncLoc.
Contact: limin@mail.csu.edu.cn
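The difference between whole-sequence k-mer features and an order-preserving subsequence representation can be illustrated with a minimal sketch. It is not the DeepLncLoc implementation: the function names, the equal-length split, and the use of raw k-mer counts as the per-subsequence "pattern" are all assumptions made for illustration, and the abstract does not specify how patterns are actually extracted.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k=3, alphabet="ACGT"):
    """Count overlapping k-mers in seq. Positions are discarded, so order is lost."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed-length vector: one entry per possible k-mer, in lexicographic order.
    return [counts["".join(p)] for p in product(alphabet, repeat=k)]


def split_consecutive(seq, n_parts):
    """Split seq into n_parts consecutive, non-overlapping pieces of near-equal length."""
    base, extra = divmod(len(seq), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + base + (1 if i < extra else 0)
        parts.append(seq[start:end])
        start = end
    return parts


def subsequence_embedding(seq, n_parts=4, k=3):
    """Hypothetical stand-in: one k-mer count vector per subsequence, kept in sequence order."""
    return [kmer_counts(part, k) for part in split_consecutive(seq, n_parts)]


# Two toy sequences with identical composition but opposite order.
a, b = "AAAACCCC", "CCCCAAAA"
global_a, global_b = kmer_counts(a, k=1), kmer_counts(b, k=1)
local_a = subsequence_embedding(a, n_parts=2, k=1)
local_b = subsequence_embedding(b, n_parts=2, k=1)
print(global_a == global_b)  # whole-sequence 1-mer counts cannot tell a from b
print(local_a == local_b)    # per-subsequence vectors differ because order is kept
```

In this toy case the whole-sequence counts are identical for both inputs, while the ordered list of per-subsequence vectors distinguishes them. A matrix of this shape (subsequences by features) is the kind of input a text convolutional network can slide filters over.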