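Transcriptome
Gene
Log-normal distribution
Biology
Computational biology
Genome
Genomics
Genetics
Gene expression
Distribution (mathematics)
Gene expression profiling
Statistics
Mathematics
Mathematical analysis
Authors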
Laurence de Torrenté,Samuel Zimmerman,Masako Suzuki,Maximilian Christopeit,John M. Greally,Jessica C. Mar
Abstract
In genomics, we often assume that gene expression data follow a specific distribution. However, we rarely stop to question this assumption or consider whether it applies to all genes in the transcriptome. Our study investigated the prevalence of genes with non-Normal expression distributions in three different tumor types from The Cancer Genome Atlas (TCGA). Surprisingly, fewer than 50% of all genes were Normally distributed; other distributions, including Gamma, Bimodal, Cauchy, and Lognormal, were also represented. Relevant information about cancer biology was captured by the genes with non-Normal gene expression. When used for classification, the set of non-Normal genes was able to discriminate between cancer patients with poor versus good survival status. Our results highlight the value of studying a gene's distribution shape to model heterogeneity of transcriptomic data. These insights would have been overlooked by standard approaches that assume all genes follow the same type of distribution in a patient cohort.
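The idea of assigning a distribution shape to each gene can be illustrated with a minimal sketch. The abstract does not state which fitting or model-selection procedure the authors used, so everything here is an assumption made for illustration: the sketch compares only two of the named shapes (Normal and Lognormal), fits each by closed-form maximum likelihood, and picks the lower Akaike information criterion (AIC). The function names (`normal_loglik`, `lognormal_loglik`, `best_fit`) and the simulated expression vectors are hypothetical, and no TCGA data is used.

```python
import math
import random


def normal_loglik(values):
    """Maximized Normal log-likelihood, using the MLE mean and variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # MLE divides by n, not n - 1
    # At the MLE the log-likelihood simplifies to this closed form.
    # Assumes the values are not all identical (var > 0).
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(values):
    """Maximized Lognormal log-likelihood; undefined for non-positive values."""
    if min(values) <= 0:
        return float("-inf")  # a Lognormal cannot generate these values
    logs = [math.log(v) for v in values]
    # Change of variables: log f_X(x) = log f_Normal(log x) - log x
    return normal_loglik(logs) - sum(logs)


def aic(loglik, n_params=2):
    """Akaike information criterion; lower means a better trade-off."""
    return 2 * n_params - 2 * loglik


def best_fit(values):
    """Return the candidate name with the lowest AIC, plus all scores."""
    scores = {
        "Normal": aic(normal_loglik(values)),
        "Lognormal": aic(lognormal_loglik(values)),
    }
    return min(scores, key=scores.get), scores


# Simulated stand-ins for one gene's expression across a patient cohort.
rng = random.Random(0)
skewed_gene = [rng.lognormvariate(2.0, 1.0) for _ in range(2000)]
symmetric_gene = [rng.gauss(10.0, 2.0) for _ in range(2000)]

for name, gene in [("skewed_gene", skewed_gene), ("symmetric_gene", symmetric_gene)]:
    label, _ = best_fit(gene)
    print(name, "->", label)
```

Both candidates have two parameters, so the AIC comparison here reduces to comparing maximized likelihoods; the penalty term would matter once candidates with different parameter counts are added. Extending the sketch to Gamma, Cauchy, or Bimodal shapes would require numerical optimization or a mixture model, which this example deliberately leaves out.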