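RNA-Seq
Pipeline (software)
Computer science
Workflow
Documentation
Database
Metadata
Modular design
Computational biology
Set (abstract data type)
Programming language
Transcriptome
Gene
Gene expression
World Wide Web
Biology
Genetics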
Authors
Bastian Seelbinder,Thomas Wolf,Stefan Priebe,Sylvie McNamara,Silvia Gerber,Reinhard Guthke,Jörg Linde
Abstract
In transcriptomics, the study of the total set of RNAs transcribed by the cell, RNA sequencing (RNA-seq) has become the standard tool for analysing gene expression. The primary goal is to detect genes whose expression changes significantly between two or more conditions, either for a single species or for two or more interacting species at the same time (dual RNA-seq, triple RNA-seq and so forth). RNA-seq analysis can be simplified because many data pre-processing steps can be standardised in a pipeline. In this publication we present the “GEO2RNAseq” pipeline for complete, quick and concurrent pre-processing of single, dual, and triple RNA-seq data. It covers all pre-processing steps, from raw sequencing data to the analysis of differentially expressed genes, and produces various tables and figures that report intermediate and final results. Raw data may be provided in FASTQ format or downloaded automatically from the Gene Expression Omnibus repository. GEO2RNAseq strongly incorporates experimental as well as computational metadata. GEO2RNAseq is implemented in R, is lightweight, easy to install via Conda, and easy to use, yet remains very flexible through modular programming and many extensions and alternative workflows. GEO2RNAseq is publicly available at https://anaconda.org/xentrics/r-geo2rnaseq and https://bitbucket.org/thomas_wolf/geo2rnaseq/overview, including source code, installation instructions, and comprehensive package documentation.