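Set (abstract data type)
Null (SQL)
Null distribution
Statistical
Goodness of fit
Computer science
Test statistic
Null hypothesis
Gene
Computational biology
Statistics
Data mining
Statistical hypothesis testing
Mathematics
Biology
Genetics
Programming language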
Authors
Mengqi Zhang,Sahar Gelfman,Cristiane Araújo Martins Moreno,Janice McCarthy,Matthew Harms,David B. Goldstein,Andrew S. Allen
Abstract
Gene set-based signal detection analyses are used to detect an association between a trait and a set of genes by accumulating signals across the genes in the gene set. Since signal detection is concerned with identifying whether any of the genes in the gene set are non-null, a goodness-of-fit (GOF) test can be used to compare whether the observed distribution of gene-level tests within the gene set agrees with the theoretical null distribution. Here, we present a flexible gene set-based signal detection framework based on tail-focused GOF statistics. We show that the power of the various statistics in this framework depends critically on two parameters: the proportion of genes within the gene set that are non-null and the degree of separation between the null and alternative distributions of the gene-level tests. We give guidance on which statistic to choose for a given situation and implement the methods in a fast and user-friendly R package, wHC (https://github.com/mqzhanglab/wHC). Finally, we apply these methods to a whole exome sequencing study of amyotrophic lateral sclerosis.