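Ploidy
Telomere
Biology
Genome
Genetics
Computational biology
Sequence assembly
Chromosome
Genomics
De Bruijn graph
Graph
Gene
Computer science
Theoretical computer science
Gene expression
Transcriptome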
Authors
Mikko Rautiainen, Sergey Nurk, Brian P. Walenz, Glennis A. Logsdon, David Porubskỳ, Arang Rhie, Evan E. Eichler, Adam M. Phillippy, Sergey Koren
Identifier
DOI:10.1038/s41587-023-01662-6
Abstract
The Telomere-to-Telomere consortium recently assembled the first truly complete sequence of a human genome. To resolve the most complex repeats, this project relied on manual integration of ultra-long Oxford Nanopore sequencing reads with a high-resolution assembly graph built from long, accurate PacBio high-fidelity reads. We have improved and automated this strategy in Verkko, an iterative, graph-based pipeline for assembling complete, diploid genomes. Verkko begins with a multiplex de Bruijn graph built from long, accurate reads and progressively simplifies this graph by integrating ultra-long reads and haplotype-specific markers. The result is a phased, diploid assembly of both haplotypes, with many chromosomes automatically assembled from telomere to telomere. Running Verkko on the HG002 human genome resulted in 20 of 46 diploid chromosomes assembled without gaps at 99.9997% accuracy. The complete assembly of diploid genomes is a critical step towards the construction of comprehensive pangenome databases and chromosome-scale comparative genomics.
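As a rough illustration of the graph structure the abstract refers to, the sketch below builds a plain fixed-k de Bruijn graph from a few toy reads and walks one unambiguous path. It is not Verkko's implementation: Verkko uses a multiplex de Bruijn graph and further integrates ultra-long reads and haplotype markers, none of which is modeled here. The function names `build_de_bruijn` and `compact_unitig`, the toy reads, and the choice of k = 4 are all assumptions made for the example.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge: prefix (k-1)-mer -> suffix (k-1)-mer.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def compact_unitig(graph, start):
    """Extend forward from `start` while the next step is unambiguous.

    A step is taken only if the current node has exactly one successor and
    that successor has exactly one predecessor (and was not visited before).
    """
    in_degree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            in_degree[node] += 1

    path = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if in_degree[nxt] != 1 or nxt in seen:
            break
        path += nxt[-1]  # consecutive nodes overlap by k-2 bases
        seen.add(nxt)
        node = nxt
    return path


# Toy reads (hypothetical data), chosen so that node "ACG" has two successors.
reads = ["ACGTAC", "GTACGG"]
graph = build_de_bruijn(reads, k=4)
unitig = compact_unitig(graph, "CGT")
```

In this toy graph the node `ACG` branches to both `CGT` and `CGG`, which is the kind of ambiguity that, per the abstract, longer reads and haplotype-specific markers are used to resolve.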