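Sequence assembly
Contiguity
Computer science
Hybrid genome assembly
Graph
Metric (unit)
Algorithm
Benchmark (surveying)
Genome
Human genome
Computational biology
Theoretical computer science
Biology
Genetics
Gene
Engineering
Operating system
Geography
Transcriptome
Gene expression
Operations management
Geodesy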
Authors
Mikhail Kolmogorov,Jeffrey Yuan,Yu Lin,Pavel A. Pevzner
Identifier
DOI:10.1038/s41587-019-0072-8
Abstract
Accurate genome assembly is hampered by repetitive regions. Although long single molecule sequencing reads are better able to resolve genomic repeats than short-read data, most long-read assembly algorithms do not provide the repeat characterization necessary for producing optimal assemblies. Here, we present Flye, a long-read assembly algorithm that generates arbitrary paths in an unknown repeat graph, called disjointigs, and constructs an accurate repeat graph from these error-riddled disjointigs. We benchmark Flye against five state-of-the-art assemblers and show that it generates better or comparable assemblies, while being an order of magnitude faster. Flye nearly doubled the contiguity of the human genome assembly (as measured by the NGA50 assembly quality metric) compared with existing assemblers.
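The abstract reports contiguity in terms of NGA50 without defining it. As commonly defined in assembly evaluation tools, NGA50 is the NG50 statistic computed over aligned blocks (contigs split at misassembly breakpoints, with unaligned sequence removed) rather than over raw contigs. The sketch below is an illustration only, not code from the paper or from Flye: the function name `ng50` and the block lengths are hypothetical, and it assumes the aligned block lengths have already been produced by an alignment step.

```python
def ng50(block_lengths, genome_size):
    """Return the largest length L such that blocks of length >= L
    together cover at least half of genome_size.

    Returns 0 if the blocks never reach half of the genome size.
    Passing aligned-block lengths gives NGA50; passing raw contig
    lengths gives NG50 (both names follow common usage, assumed here).
    """
    covered = 0
    # Walk blocks from longest to shortest, accumulating covered bases.
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0


# Hypothetical aligned-block lengths and reference genome size.
blocks = [50, 40, 30, 20, 10]
print(ng50(blocks, 200))
```

Using the reference genome size, rather than the total assembly length, as the denominator is what distinguishes NG50 from N50; calling `ng50(blocks, sum(blocks))` would give the N50 of the same blocks.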