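Biology
Genome
Human genome
Sequence (biology)
Human genetic variation
Structural variation
Genetics
Computational biology
Genetic variation
Whole genome sequencing
Evolutionary biology
Genomics
Set (abstract data type)
Variation (astronomy)
Reference genome
Gene
1000 Genomes Project
Computer science
Genotype
Single-nucleotide polymorphism
Physics
Programming language
Astrophysics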
Authors
Neil Weisenfeld, Shuangye Yin, Ted Sharpe, Bayo Lau, Ryan Hegarty, Laurie Holmes, Brian Sogoloff, Diana Tabbaa, Louise Williams, Carsten Russ, Chad Nusbaum, Eric S. Lander, Iain MacCallum, David B. Jaffe
Source
Journal: Nature Genetics
[Springer Nature]
Date: 2014-10-19
Volume (issue): pages: 46 (12): 1350-1355
Citations: 207
Abstract
Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome.