Biology
Computational biology
DNA sequencing
Genome
Transcriptome
RNA
Gene
Genetics
Gene expression
Authors
Haitao Luo, Dechao Bu, Liang Sun, Runsheng Chen, Yi Zhao
Identifier
DOI:10.1007/978-1-4939-1062-5_18
Abstract
Rapid progress in high-throughput technologies such as RNA sequencing means that more and more transcripts are being discovered along the genome every day, especially in poorly annotated species. This raises the challenge of how to classify newly identified transcripts as protein coding or noncoding. Here, we describe a de novo approach named the coding–noncoding index (CNCI), a signature-based tool that profiles adjoining nucleotide triplets (ANT) to distinguish protein-coding from noncoding sequences independently of known annotations. The main advantage of CNCI is its ability to accurately classify transcripts assembled from whole-transcriptome sequencing data in a cross-species manner, which allows it to be applied to all vertebrates and invertebrates using training data from well-annotated species (such as human and Arabidopsis). In this chapter, we illustrate the CNCI method in detail through an example of RNA-sequencing data generated from six biological replicates of six mouse tissues. CNCI software is available at http://www.bioinfo.org/software/cnci.
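To make the idea of profiling adjoining nucleotide triplets concrete, the following is a minimal Python sketch that counts how often each pair of neighbouring triplets occurs in one reading frame of a sequence. It is not the CNCI implementation: the abstract does not describe CNCI's scoring or classification steps, so none of that is reproduced here, and the function names `ant_counts` and `ant_frequencies` are hypothetical and chosen only for illustration.

```python
from collections import Counter

BASES = set("ACGT")


def ant_counts(seq, frame=0):
    """Count adjoining triplet pairs in one reading frame (illustrative only)."""
    # Treat RNA and DNA alike by mapping U to T.
    seq = seq.upper().replace("U", "T")
    # Split the sequence into complete, non-overlapping triplets.
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    pairs = Counter()
    for first, second in zip(triplets, triplets[1:]):
        # Skip pairs containing ambiguous bases such as N.
        if set(first + second) <= BASES:
            pairs[(first, second)] += 1
    return pairs


def ant_frequencies(seq, frame=0):
    """Normalise pair counts to relative frequencies; empty dict if no pairs."""
    counts = ant_counts(seq, frame)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    example = "ATGGCCATGGCC"  # made-up sequence, not from the chapter
    for pair, freq in sorted(ant_frequencies(example).items()):
        print(pair, round(freq, 3))
```

A profile of this kind could, under the assumptions above, be computed separately for known coding and noncoding training sequences and then compared; how CNCI actually uses such profiles is described in the chapter itself rather than in this abstract.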