Constructing germline research cohorts from the discarded reads of clinical tumor sequences

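Germline, Imputation (statistics), Genetics, Genotyping, Precision medicine, Biology, 1000 Genomes Project, Germline mutation, Computational biology, Human genetics, Genotype, Single-nucleotide polymorphism, Computer science, Mutation, Gene, Machine learning, Missing data
Authors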
Alexander Gusev,Stefan Groha,Kodi Taraszka,Yevgeniy R. Semenov,Noah Zaitlen
Source
Journal: Genome Medicine [BioMed Central]
Volume (issue): 13 (1) Citations: 29
Identifier
DOI:10.1186/s13073-021-00999-4
Abstract

Hundreds of thousands of cancer patients have had targeted (panel) tumor sequencing to identify clinically meaningful mutations. In addition to improving patient outcomes, this activity has led to significant discoveries in basic and translational domains. However, the targeted nature of clinical tumor sequencing has a limited scope, especially for germline genetics. In this work, we assess the utility of discarded, off-target reads from tumor-only panel sequencing for the recovery of genome-wide germline genotypes through imputation.

We developed a framework for inference of germline variants from tumor panel sequencing, including imputation, quality control, inference of genetic ancestry, germline polygenic risk scores, and HLA alleles. We benchmarked our framework on 833 individuals with tumor sequencing and matched germline SNP array data. We then applied our approach to a prospectively collected panel sequencing cohort of 25,889 tumors.

We demonstrate high to moderate accuracy of each inferred feature relative to direct germline SNP array genotyping: individual common variants were imputed with a mean accuracy (correlation) of 0.86, genetic ancestry was inferred with a correlation of > 0.98, polygenic risk scores were inferred with a correlation of > 0.90, and individual HLA alleles were inferred with a correlation of > 0.80. We demonstrate a minimal influence on the accuracy of somatic copy number alterations and other tumor features. We showcase the feasibility and utility of our framework by analyzing 25,889 tumors and identifying the relationships between genetic ancestry, polygenic risk, and tumor characteristics that could not be studied with conventional on-target tumor data.

We conclude that targeted tumor sequencing can be leveraged to build rich germline research cohorts from existing data and make our analysis pipeline publicly available to facilitate this effort.
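
The abstract reports imputation accuracy as a mean correlation against SNP array genotypes. The following is a minimal sketch of one way such a metric could be computed, assuming accuracy means the per-variant Pearson correlation between imputed dosages and array genotypes, averaged over variants. This is an illustration only and not the authors' pipeline; the function names, variant IDs, and toy values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences.

    Returns None when either sequence has zero variance, since the
    correlation is undefined in that case (e.g. a monomorphic variant).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def mean_imputation_accuracy(imputed, truth):
    """Average the per-variant correlation, skipping undefined variants.

    imputed, truth: dicts mapping a variant ID to a list of per-individual
    values (imputed dosages in [0, 2] and array genotypes in {0, 1, 2}).
    """
    correlations = []
    for variant_id, dosages in imputed.items():
        r = pearson(dosages, truth[variant_id])
        if r is not None:
            correlations.append(r)
    return sum(correlations) / len(correlations) if correlations else None


# Toy data for four individuals and two variants (made up for illustration).
imputed = {"var1": [0.1, 0.9, 1.8, 0.2], "var2": [0.0, 1.1, 2.0, 1.0]}
array = {"var1": [0, 1, 2, 0], "var2": [0, 1, 2, 1]}

accuracy = mean_imputation_accuracy(imputed, array)
print(f"mean per-variant correlation: {accuracy:.3f}")
```

Averaging per variant, rather than pooling all genotypes into one correlation, keeps common and rare variants from being weighted by their variance. Whether the paper uses exactly this aggregation is an assumption here.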
