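Computer science
Software
Alignment-free sequence analysis
Hybrid genome assembly
Hash table
Reference genome
Index
Multiple sequence alignment
Data mining
DNA sequencing
Sequence alignment
Hash function
Algorithm
Biology
Genetics
Programming language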
DNA
Gene
Genotype
Single-nucleotide polymorphism
Peptide sequence
Authors
Heng Li, Richard Durbin
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2009-05-18
Volume (issue): 25 (14): 1754-1760
Citations: 43894
Identifier
DOI:10.1093/bioinformatics/btp324
Abstract
Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals.

Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows-Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is approximately 10-20x faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package.

Availability: http://maq.sourceforge.net
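
The abstract names backward search over the Burrows-Wheeler Transform as the core of BWA. The sketch below illustrates that idea for exact matching only; it is not BWA's implementation. It builds the suffix array naively and stores a full occurrence table, which is fine for a toy reference but would not scale to a genome, and it omits the mismatch and gap handling the abstract describes. The function names `build_fm_index` and `backward_search` are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def build_fm_index(reference: str):
    """Build a naive suffix array, C table and Occ table for reference + '$'."""
    text = reference + "$"  # '$' sorts before A, C, G, T in ASCII
    # Naive suffix array: sort suffix start positions by the suffix text.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (index -1 wraps to '$').
    bwt = "".join(text[i - 1] for i in suffix_array)

    # c_table[ch]: number of characters in text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if ch == b else 0)

    return suffix_array, c_table, occ


def backward_search(read, suffix_array, c_table, occ):
    """Return sorted 0-based positions where read occurs exactly in the reference."""
    lo, hi = 0, len(suffix_array)  # half-open interval of suffix-array rows
    for ch in reversed(read):  # consume the read from its last base to its first
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:  # empty interval: no suffix starts with this pattern
            return []
    return sorted(suffix_array[lo:hi])
```

For example, after `sa, c, occ = build_fm_index("GATTACAGATTACA")`, the call `backward_search("ATTA", sa, c, occ)` returns `[1, 8]`. Each base of the read narrows the suffix-array interval with two table lookups, so the search cost depends on the read length rather than on the reference length.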