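Nanopore sequencing
Telomere
Genome
Computational biology
Sequence assembly
Tandem repeat
Computer science
Polishing
Biology
Genetics
DNA
Gene
Engineering
Mechanical engineering
Gene expression
Transcriptome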
Authors
Ann M. Mc Cartney, Kishwar Shafin, Michael Alonge, Andrey V. Bzikadze, Giulio Formenti, Arkarachai Fungtammasan, Kerstin Howe, Chirag Jain, Sergey Koren, Glennis A. Logsdon, Karen H. Miga, Alla Mikheenko, Benedict Paten, Alaina Shumate, Daniela C. Soto, Ivan Sović, Jonathan Wood, Justin M. Zook, Adam M. Phillippy, Arang Rhie
Source
Journal: Nature Methods
[Springer Nature]
Date: 2022-03-31
Volume (issue), pages: 19 (6): 687-695
Citations: 59
Identifier
DOI: 10.1038/s41592-022-01440-3
Abstract
Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
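The quality values quoted in the abstract (70.2 before polishing, 73.9 after) are easier to interpret as per-base error rates. The sketch below assumes the usual Phred-scaled definition, QV = -10 · log10(error rate); the abstract itself does not spell out the formula, and the function names are illustrative rather than taken from the paper or its tools.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to an estimated per-base error rate.

    Assumes QV = -10 * log10(error_rate), the standard Phred convention.
    """
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Inverse conversion: per-base error rate back to a Phred-scaled QV."""
    return -10 * math.log10(error_rate)


# QV figures quoted in the abstract, before and after polishing.
before = qv_to_error_rate(70.2)
after = qv_to_error_rate(73.9)

print(f"QV 70.2: about 1 error per {1 / before:,.0f} bases")
print(f"QV 73.9: about 1 error per {1 / after:,.0f} bases")
```

Under this assumption, both values correspond to fewer than one estimated error per ten million bases, and the step from 70.2 to 73.9 lowers the estimated error rate further. Note that a k-mer-based QV is an estimate and need not track the count of individually corrected errors one for one.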