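Contig
Sequence assembly
Loblolly pine
Hybrid genome assembly
Mega-
Genome size
Genome
Nanopore sequencing
Deep sequencing
Computer science
Computational biology
Biology
Pinus
Genetics
Gene
Transcriptome
Plant
Astronomy
Gene expression
Physics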
Authors
Aleksey V. Zimin, Kristian Stevens, Marc W. Crepeau, Daniela Puiu, Jill L. Wegrzyn, James A. Yorke, Charles H. Langley, David B. Neale, Steven L. Salzberg
Source
Journal: GigaScience
[Oxford University Press]
Date: 2017-01-01
Volume (issue): 6 (1)
Citations: 65
Identifier
DOI:10.1093/gigascience/giw016
Abstract
The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361 bp, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821 bp, 61% larger than the previous assembly.
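
The N50 statistic quoted in the abstract can be illustrated with a minimal Python sketch. It assumes the common definition (the length L such that sequences of length L or longer together cover at least half of the total assembled bases); the function name `n50` and the toy lengths are hypothetical and are not taken from the paper or from MaSuRCA. The last lines check the "more than three times" comparison using only the two contig N50 values stated in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: walk the lengths from longest to shortest and
    return the first length at which the running total reaches at least
    half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [2, 3, 4, 5, 6, 10]
toy_n50 = n50(toy_contigs)

# Contig N50 values as stated in the abstract (bp).
old_contig_n50 = 8206
new_contig_n50 = 25361
fold_change = new_contig_n50 / old_contig_n50
```

For the toy list the total is 30, so the running sum first reaches 15 at the second-longest contig. The same function could be applied to the lengths in an assembly FASTA file, though parsing is left out of this sketch.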