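Inference
Population genomics
Imputation (statistics)
Population
Genomics
Sample (material)
DNA sequencing
Computer science
Biology
Computational biology
Genome
Machine learning
Genetics
Artificial intelligence
Missing data
Gene
Sociology
Demography
Chemistry
DNA
Chromatography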
Authors
Runyang Nicolas Lou,Arne Jacobs,Aryn P. Wilder,Nina Overgaard Therkildsen
Source
Journal: Authorea
Date: 2021-04-21
Citations: 3
Identifier
DOI: 10.22541/au.160689616.68843086/v3
Abstract
Low-coverage whole genome sequencing (lcWGS) has emerged as a powerful and cost-effective approach for population genomic studies in both model and non-model species. However, with read depths too low to confidently call individual genotypes, lcWGS requires specialized analysis tools that explicitly account for genotype uncertainty. A growing number of such tools have become available, but it can be difficult to get an overview of what types of analyses can be performed reliably with lcWGS data, and how the distribution of sequencing effort between the number of samples analyzed and per-sample sequencing depths affects inference accuracy. In this introductory guide to lcWGS, we first illustrate how the per-sample cost for lcWGS is now comparable to RAD-seq and Pool-seq in many systems. We then provide an overview of software packages that explicitly account for genotype uncertainty in different types of population genomic inference. Next, we use both simulated and empirical data to assess the accuracy of allele frequency and genetic diversity estimation, detection of population structure, and selection scans under different sequencing strategies. Our results show that spreading a given amount of sequencing effort across more samples with lower depth per sample consistently improves the accuracy of most types of inference, with a few notable exceptions. Finally, we assess the potential for using imputation to bolster inference from lcWGS data in non-model species, and discuss current limitations and future perspectives for lcWGS-based population genomics research. With this overview, we hope to make lcWGS more approachable and stimulate its broader adoption.