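Computer science
Outlier
Data mining
Sample size determination
Bayes' theorem
Parametric statistics
Bayesian probability
Software
Artificial intelligence
Statistics
Mathematics
Programming language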
Authors
W. Evan Johnson, Cheng Li, Ariel Rabinovic
Source
Journal: Biostatistics
[Oxford University Press]
Date: 2006-04-21
Volume (Issue): 8 (1): 118-127
Citations: 7083
Identifier
DOI:10.1093/biostatistics/kxj037
Abstract
Non-biological experimental variation or “batch effects” are commonly observed across multiple batches of microarray experiments, often rendering the task of combining data from these batches difficult. The ability to combine microarray data sets is advantageous to researchers to increase statistical power to detect biological phenomena from studies where logistical considerations restrict sample size or in studies that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes (>25) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that are robust to outliers in small sample sizes and perform comparably to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.