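Biology
1000 Genomes Project
Genomics
Allele frequency
Genome
Germline
Computational biology
Genetics
Germline mutation
DNA sequencing
Population
Gene
Cancer
Exome
Mutation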
CHEK2
Exome sequencing
Allele
Single-nucleotide polymorphism
Genotype
Sociology
Demography
Authors
Aimee L. Davidson,Conrad Leonard,Lambros T. Koufariotis,Michael T. Parsons,Georgina E. Hollway,John V. Pearson,Felicity Newell,Nicola Waddell,Amanda B. Spurdle
Abstract
Aggregate population genomics data from large cohorts are vital for assessing germline variant pathogenicity. However, there are no specifications on how sequencing quality metrics should be considered, and whether exome-derived and genome-derived allele frequencies should be considered in isolation. Germline genome sequence data were simulated for nine read-depths to identify a minimum acceptable read-depth for detecting variants. gnomAD exome-derived and genome-derived datasets were assessed for read-depth, for six key cancer genes selected for variant curation by ClinGen expert panels. Non-Finnish European allele frequency (AF) or filter AF of coding variants in these genes, assigned into frequency bins using modified ACMG-AMP criteria, was compared between exome-derived and genome-derived datasets. A 30X read-depth achieved acceptable precision and recall for detection of substitutions, but poor recall for small insertions/deletions. Exome-derived and genome-derived datasets exhibited low read-depth for different gene exons. Individual variants were mostly assigned to non-divergent AF bins (>95%) or filter AF bins (>97%). Two major bin divergences were resolved by applying the minimal acceptable read-depth threshold. These findings show the importance of assessing read-depth separately for population datasets sourced from different short-read sequencing technologies before assigning a frequency-based ACMG-AMP classification code for variant interpretation.
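
The workflow the abstract describes (check read-depth in each dataset separately, then assign a frequency bin and compare exome-derived and genome-derived results) can be illustrated with a minimal Python sketch. Only the 30X threshold comes from the abstract. The bin edges, bin labels, function names, and the per-variant `mean_depth` input are hypothetical placeholders for illustration; they are not the modified ACMG-AMP bins used by the authors, nor any gnomAD API.

```python
from typing import Optional

# 30X is the read-depth the abstract reports as acceptable for substitutions.
MIN_DEPTH = 30.0

# Hypothetical bin edges as (lower bound, label), highest first.
# These are placeholders, NOT the paper's modified ACMG-AMP bins.
BINS = [(0.05, "common"), (0.001, "intermediate"), (0.0, "rare")]


def assign_bin(af: float) -> str:
    """Map an allele frequency to the first bin whose lower bound it meets."""
    for lower, label in BINS:
        if af >= lower:
            return label
    raise ValueError("allele frequency must be non-negative")


def usable_bin(af: Optional[float], mean_depth: float) -> Optional[str]:
    """Return a bin only when the dataset has an AF and adequate read-depth."""
    if af is None or mean_depth < MIN_DEPTH:
        return None
    return assign_bin(af)


def compare(exome_af: Optional[float], exome_depth: float,
            genome_af: Optional[float], genome_depth: float) -> str:
    """Compare bins from the two datasets after depth filtering each one."""
    e = usable_bin(exome_af, exome_depth)
    g = usable_bin(genome_af, genome_depth)
    if e is None and g is None:
        return "no usable data"
    if e is None or g is None:
        return f"single source: {e or g}"
    if e == g:
        return f"concordant: {e}"
    return f"divergent: exome={e}, genome={g}"


if __name__ == "__main__":
    # A divergence that disappears once the low-depth genome call is excluded.
    print(compare(0.06, 45, 0.0005, 12))
    # A divergence that remains because both datasets are adequately covered.
    print(compare(0.06, 45, 0.0005, 35))
```

In this sketch a dataset that fails the depth check is dropped before any comparison, which mirrors how the abstract describes major bin divergences being resolved by applying the read-depth threshold. A real analysis would use the gene-specific thresholds set by the relevant ClinGen expert panel.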