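DNA methylation
Computational biology
Biology
Cancer
Genetics
Bisulfite sequencing
Cell-free fetal DNA
DNA sequencing
DNA
Gene
Gene expression
Fetus
Prenatal diagnosis
Pregnancy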
Authors
Qiang Wei, Chao Jin, Yan Wang, Shanshan Guo, Xu Guo, Xiaonan Liu, Jing An, Jinliang Xing, Bingshan Li
Abstract
Cell-free DNA (cfDNA) offers a convenient route to noninvasive cancer detection. Current methods focus on identifying genomic aberrations in circulating tumor DNA (ctDNA), e.g., mutations, copy number aberrations (CNAs), or methylation changes. In this study, we report a new computational method that unifies two orthogonal pieces of information, namely methylation and CNAs, derived from whole-genome bisulfite sequencing (WGBS) data to quantify low tumor content in cfDNA. The method implements a Bayes model to enrich ctDNA from WGBS data based on hypomethylation haplotypes and subsequently models CNAs for cancer detection. We generated WGBS data for a total of 262 samples, including high-depth data (>20×, deduplicated reads with high mapping quality) for 76 samples with matched triplets (tumor, adjacent normal, and cfDNA) and low-depth data (~2.5×, deduplicated reads with high mapping quality) for 186 samples. We identified a total of 54 Mb of hypomethylation haplotype regions for model building, the vast majority of which are not covered by the HumanMethylation450 arrays. We showed that our model substantially enriches ctDNA reads (by tens of fold), yielding clearly elevated CNAs that faithfully match the CNAs in the paired tumor samples. In the 19 hepatocellular carcinoma cfDNA samples, the estimated enrichment is as high as 16-fold, and in simulated data the model achieves over 30-fold enrichment for a ctDNA level of 0.5% at a sequencing depth of 600×. We also found that these hypomethylation regions are shared among many cancer types, demonstrating the potential of our framework for pan-cancer early detection.
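The abstract does not give the form of the Bayes model, so the following is only a minimal illustrative sketch of how a read-level Bayesian classifier on methylation patterns could enrich ctDNA reads. It assumes each CpG on a read is an independent Bernoulli draw; the function names and the methylation rates used in the example (0.2 for tumor, 0.9 for normal) are hypothetical and are not taken from the paper.

```python
from math import prod


def read_likelihood(states, meth_rate):
    """Likelihood of a read's CpG states (1 = methylated, 0 = unmethylated),
    assuming independent CpGs with a shared methylation rate (an assumption)."""
    return prod(meth_rate if s else 1.0 - meth_rate for s in states)


def tumor_posterior(states, tumor_rate, normal_rate, tumor_fraction):
    """Posterior probability that a read is tumor-derived, via Bayes' rule.
    tumor_fraction is the prior ctDNA fraction in the cfDNA sample."""
    lt = tumor_fraction * read_likelihood(states, tumor_rate)
    ln = (1.0 - tumor_fraction) * read_likelihood(states, normal_rate)
    return lt / (lt + ln)


def fold_enrichment(selected_tumor, selected_total, tumor_fraction):
    """Tumor fraction among the selected reads divided by the original fraction."""
    return (selected_tumor / selected_total) / tumor_fraction


# Hypothetical rates: the region is hypomethylated in tumor, methylated in normal.
unmethylated_read = [0, 0, 0, 0]
methylated_read = [1, 1, 1, 1]
p_unmeth = tumor_posterior(unmethylated_read, 0.2, 0.9, 0.005)
p_meth = tumor_posterior(methylated_read, 0.2, 0.9, 0.005)
```

Under these assumed rates, a fully unmethylated four-CpG read receives a high tumor posterior even with a 0.5% prior, while a fully methylated read receives a posterior near zero. For scale, `fold_enrichment(15, 100, 0.005)` corresponds to 30-fold enrichment: 15% of the selected reads would be tumor-derived, starting from a ctDNA level of 0.5%.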