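Sequence assembly
Genome
Mango
Biology
Reference genome
Whole genome sequencing
Genetics
De novo transcriptome assembly
Transcriptome
Gene
Plant
Gene expression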
Authors
Nagendra Kumar Singh, Ajay Kumar Mahato, Pawan Kumar Jayaswal
Source
Journal: Compendium of Plant Genomes
Date: 2021-01-01
Pages: 165-186
Citations: 5
Identifier
DOI:10.1007/978-3-030-47829-2_10
Abstract
Mango (Mangifera indica L.) is one of the most important fruits of the world in terms of both value and volume. Wild Mangifera species are distributed throughout South and South-East Asia, but mango cultivars originated in India, which produces more than fifty percent of the world's mango and has more than a thousand varieties. The genome size of mango estimated by flow cytometry is 402 ± 10 Mbp (2n = 40). Here, we present a brief account of the global efforts to sequence the mango genome and of transcriptome studies analyzing trait-related differential gene expression. High heterozygosity of about 2.5% made it difficult to assemble the mango genome from Illumina short sequence reads of 100–150 bp, because it did not permit the assembly of the maternal and paternal genomes into an integrated mosaic genome assembly. The problem was overcome by using PacBio SMRT long sequence reads, assembled with long overlaps of 500 bp and a high mismatch allowance of 15% to accommodate the heterozygosity, resulting in the first draft genome assembly of 323 Mbp of mango variety 'Amrapali' (NCBI Accession no. LMWC00000000 v.1). The draft genome assembly was updated to a reference-quality assembly of 403 Mbp with 4312 scaffolds anchored to 20 chromosome pseudomolecules (LMWC01000000 v.2). The latest v.3 Amrapali assembly, assisted by BioNano optical fingerprinting and Hi-C conformation capture sequencing, has 2314 scaffolds with a high N50 value of 11.78 Mbp. Mapping sequence reads from 18 different transcriptome studies showed an average 96% coverage of the gene space, and BUSCO analysis of 1440 genes showed 93.4% coverage of the conserved eukaryote orthologs. A total of 46,395 protein-coding genes have been predicted, showing maximum homology with Citrus sinensis. The Amrapali genome has 45% repeats and large segmental duplications, indicating at least one recent (15.87–31.74 Mya) and one ancient (253.96–269.84 Mya) whole-genome duplication.
Mosaic genome assemblies of mango variety 'Tommy Atkins' from the USA and 'Kensington Pride' from Australia have also been presented at the annual Plant and Animal Genome (PAG) conferences. Recently, two more mosaic genome assemblies, of mango varieties 'Hong Jian Ha' and 'Alphonso', have been reported using PacBio SMRT sequencing with FALCON-Unzip and Canu software to separate the primary contigs (P-contigs) and haplotigs (H-contigs). However, there is still a need to develop clean phase-separated reference assemblies of the maternal and paternal genomes of highly heterozygous mango using the recent trio-binning approach. The reference genome will help accelerate breeding of dwarf, stress-tolerant, and high-quality mango varieties.