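Biology
Annotation
Missense mutation
Allele
Computational biology
Genetics
Human genetic variation
Human genome
Trait
Single-nucleotide polymorphism
Genome
Genetic variation
Pathogenicity
1000 Genomes Project
Gene
Genotype
Mutation
Computer science
Microbiology
Programming language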
Authors
Martin Kircher, Daniela Witten, Preti Jain, Brian J. O’Roak, Gregory M. Cooper, Jay Shendure
Source
Journal: Nature Genetics
[Springer Nature]
Date: 2014-02-02
Volume (issue), pages: 46 (3): 310-315
Citations: 5607
Abstract
Jay Shendure, Greg Cooper and colleagues report a framework for annotation of genetic variation, Combined Annotation–Dependent Depletion (CADD), integrating diverse annotations into a single C score. They show that C scores correlate with annotations of functionality, pathogenicity and experimentally measured regulatory effects.

Current methods for annotating and interpreting human genetic variation tend to exploit a single information type (for example, conservation) and/or are restricted in scope (for example, to missense changes). Here we describe Combined Annotation–Dependent Depletion (CADD), a method for objectively integrating many diverse annotations into a single measure (C score) for each variant. We implement CADD as a support vector machine trained to differentiate 14.7 million high-frequency human-derived alleles from 14.7 million simulated variants. We precompute C scores for all 8.6 billion possible human single-nucleotide variants and enable scoring of short insertions-deletions. C scores correlate with allelic diversity, annotations of functionality, pathogenicity, disease severity, experimentally measured regulatory effects and complex trait associations, and they highly rank known pathogenic variants within individual genomes. The ability of CADD to prioritize functional, deleterious and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current single-annotation method.
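
The core idea in the abstract, a support vector machine trained to separate observed alleles from simulated variants and then used to score any variant, can be illustrated with a toy sketch. This is not the authors' implementation. The two features, their value ranges, the label convention (+1 for simulated, -1 for observed), the function names and the training hyperparameters are all assumptions made for illustration, and the synthetic classes are far more cleanly separated than real annotation data would be.

```python
import random


def make_toy_variants(n, rng):
    """Return (features, labels) for n 'observed' and n 'simulated' toy variants.

    Feature 0 is a made-up conservation-like annotation; feature 1 is pure noise.
    Label +1 = simulated, -1 = observed (a convention chosen for this sketch).
    """
    X, y = [], []
    for _ in range(n):
        X.append([rng.uniform(-1.0, -0.5), rng.uniform(-1.0, 1.0)])  # observed
        y.append(-1)
        X.append([rng.uniform(0.5, 1.0), rng.uniform(-1.0, 1.0)])  # simulated
        y.append(+1)
    return X, y


def decision_value(w, b, x):
    """Signed distance-like score of x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=50, lr=0.01, lam=0.01, seed=0):
    """Fit a linear SVM by stochastic subgradient descent on the
    L2-regularized hinge loss (hypothetical hyperparameters)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision_value(w, b, X[i]) < 1.0
            # Regularization step: shrink the weights slightly.
            w = [(1.0 - lr * lam) * wj for wj in w]
            if violated:
                # Hinge-loss step: push the boundary so X[i] clears the margin.
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


rng = random.Random(42)
X, y = make_toy_variants(200, rng)
w, b = train_linear_svm(X, y)

# A variant that looks like the simulated class should score higher than
# one that looks like the observed class.
high = decision_value(w, b, [1.0, 0.0])
low = decision_value(w, b, [-1.0, 0.0])
print(high > low)
```

In this sketch the decision value plays the role of a raw combined score: a single number per variant that summarizes several annotations. How CADD derives and scales its published C scores is described in the paper itself and is not reproduced here.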