计算机科学
人工智能
深度学习
机器学习
判决
构造(python库)
随机森林
人工神经网络
关系抽取
支持向量机
自然语言处理
信息抽取
程序设计语言
作者
Mengqi Luo,Shangfu Li,Yuxuan Pang,Lantian Yao,Renfei Ma,Hsi-Yuan Huang,Hsien-Da Huang,Tzong-Yi Lee
摘要
Abstract MicroRNA (miRNA)–target interaction (MTI) plays a substantial role in various cell activities, molecular regulations and physiological processes. Published biomedical literature is the carrier of high-confidence MTI knowledge. However, digging out this knowledge in an efficient manner from large-scale published articles remains challenging. To address this issue, we were motivated to construct a deep learning-based model. We applied the pre-trained language models to biomedical text to obtain the representation, and subsequently fed them into a deep neural network with gate mechanism layers and a fully connected layer for the extraction of MTI information sentences. Performances of the proposed models were evaluated using two datasets constructed on the basis of text data obtained from miRTarBase. The validation and test results revealed that incorporating both PubMedBERT and SciBERT for sentence level encoding with the long short-term memory (LSTM)-based deep neural network can yield an outstanding performance, with both F1 and accuracy being higher than 80% on validation data and test data. Additionally, the proposed deep learning method outperformed the following machine learning methods: random forest, support vector machine, logistic regression and bidirectional LSTM. This work would greatly facilitate studies on MTI analysis and regulations. It is anticipated that this work can assist in large-scale screening of miRNAs, thereby revealing their functional roles in various diseases, which is important for the development of highly specific drugs with fewer side effects. Source code and corpus are publicly available at https://github.com/qi29.
科研通智能强力驱动
Strongly Powered by AbleSci AI