Gaining comprehensive biological insight into the transcriptome by performing a broad-spectrum RNA-seq analysis
RNA-Seq
Computational biology
Transcriptome
Computer science
Gene
RNA
Genetics
Biology
Gene expression
Bioinformatics
Authors
Sayed Mohammad Ebrahim Sahraeian, Marghoob Mohiyuddin, Robert Sebra, Hagen Tilgner, Pegah Tootoonchi Afshar, Kin Fai Au, Narges Bani Asadi, Mark Gerstein, Wing Hung Wong, M Snyder, Eric E. Schadt, Hugo Y. K. Lam
RNA-sequencing (RNA-seq) is an essential technique for transcriptome studies, and hundreds of analysis tools have been developed since its debut. Although recent efforts have attempted to assess the latest available tools, they have not evaluated the analysis workflows comprehensively enough to unleash the full power of RNA-seq. Here we conduct an extensive study analysing a broad spectrum of RNA-seq workflows. Going beyond the scope of expression analysis, our work also includes assessment of RNA variant-calling, RNA editing and RNA fusion detection techniques. Specifically, we examine both short- and long-read RNA-seq technologies, 39 analysis tools resulting in ~120 combinations, and ~490 analyses involving 15 samples with a variety of germline, cancer and stem cell data sets. We report the performance and propose a comprehensive RNA-seq analysis protocol, named RNACocktail, along with a computational pipeline achieving high accuracy. Validation on different samples reveals that our proposed protocol could help researchers extract more biologically relevant predictions by broad analysis of the transcriptome.

RNA-seq is widely used for transcriptome analysis. Here, the authors analyse a wide spectrum of RNA-seq workflows and present a comprehensive analysis protocol named RNACocktail, as well as a computational pipeline that leverages widely used tools for accurate RNA-seq analysis.