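DNA sequencing
Computer science
Index
Hybrid genome assembly
Data science
Genomics
Deep sequencing
Computational biology
Massively parallel sequencing
Snapshot (computer storage)
Genome
Data mining
Reference genome
Biology
DNA
Genetics
Database
Single-nucleotide polymorphism
Gene
Genotype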
Authors
Ka‐Chun Wong, Jiao Zhang, Shankai Yan, Xiangtao Li, Qiuzhen Lin, Sam Kwong, Cheng Liang
Abstract
Recent advances in DNA sequencing technology, from first-generation sequencing (FGS) to third-generation sequencing (TGS), have continually transformed the genome research landscape. The resulting data throughput is unprecedented, severalfold higher than that of past technologies. DNA sequencing technologies generate data that are big, sparse, and heterogeneous, which has driven the rapid development of various data protocols and bioinformatics tools for handling sequencing data. In this review, a historical snapshot of DNA sequencing is taken with an emphasis on data manipulation and tools. The technological history of DNA sequencing is described and reviewed in detail. To manipulate the sequencing data generated, different data protocols are introduced and reviewed. In particular, data compression methods are highlighted and discussed to give readers a practical perspective on real-world settings. A wide variety of bioinformatics tools are also reviewed to help readers get the most from their sequencing data in different respects, such as sequencing quality control, genomic visualization, single-nucleotide variant calling, INDEL calling, structural variation calling, and integrative analysis. Toward the end of the article, we critically discuss the pitfalls of existing DNA sequencing technologies and potential solutions.