计算机科学
核糖核酸
背景(考古学)
步伐
数据科学
人工智能
计算生物学
生物
遗传学
基因
地理
古生物学
大地测量学
作者
Xi Wang,Gu Rong,Zhiyuan Chen,Yongge Li,Xiao-Hong Ji,Guolin Ke,Han Wen
标识
DOI:10.1101/2023.07.11.548588
摘要
A bstract RNA molecules play a crucial role as intermediaries in diverse biological processes. Attaining a profound understanding of their function can substantially enhance our comprehension of life’s activities and facilitate drug development for numerous diseases. The advent of high-throughput sequencing technologies makes vast amounts of RNA sequence data accessible, which contains invaluable information and knowledge. However, deriving insights for further application from such an immense volume of data poses a significant challenge. Fortunately, recent advancements in pre-trained models have surfaced as a revolutionary solution for addressing such challenges owing to their exceptional ability to automatically mine and extract hidden knowledge from massive datasets. Inspired by the past successes, we developed a novel context-aware deep learning model named Uni-RNA that performs pre-training on the largest dataset of RNA sequences at the unprecedented scale to date. During this process, our model autonomously unraveled the obscured evolutionary and structural information embedded within the RNA sequences. As a result, through fine-tuning, our model achieved the state-of-the-art (SOTA) performances in a spectrum of downstream tasks, including both structural and functional predictions. Overall, Uni-RNA established a new research paradigm empowered by the large pre-trained model in the field of RNA, enabling the community to unlock the power of AI at a whole new level to significantly expedite the pace of research and foster groundbreaking discoveries.
科研通智能强力驱动
Strongly Powered by AbleSci AI