
The complete telomere-to-telomere genome assembly of lettuce

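Telomere, Telomere-binding protein, Genome, Biology, Genetics, DNA, Computational biology, Gene, DNA-binding protein, Transcription factor
Authors
Ke Wang, Jingyun Jin, Jingxuan Wang, Xinrui Wang, Jie Sun, Dian Meng, Xiangfeng Wang, Yong Wang, Li Guo
Source
Journal: Plant Communications [Elsevier BV]
Volume (issue): 5 (10): 101011 Citations: 3
Identifier
DOI:10.1016/j.xplc.2024.101011
Abstract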

Lettuce (Lactuca sativa L.) is an annual plant of the Asteraceae family, commonly used as a fresh-cut vegetable and a primary ingredient in salads. It is rich in vitamins, minerals, polyphenols, and carotenoids, providing numerous health benefits. In 2021, lettuce achieved a gross production value of $16.6 billion worldwide, with China, the United States, and Western Europe as leading lettuce producers (Food and Agriculture Organization of the United Nations, 2023). Most cultivated lettuce varieties are inbred (2n = 18) and exhibit low genetic diversity, rendering them susceptible to various abiotic and biotic stresses (Richard, 2004; Galieni et al., 2015). Hence, lettuce breeding efforts primarily focus on improving yield, quality, and disease resistance, and they depend heavily on genetic and genomic resources such as molecular markers, reference genomes, and multi-omics data.

The first lettuce genome was assembled using next-generation sequencing (NGS) reads in 2017 (Reyes-Chin-Wo et al., 2017). In 2022, the improved lettuce reference genome v11 of crisphead lettuce cultivar Salinas (GCA_002870075.4) was released; subsequently, Shen et al. (2023) assembled the genome of stem lettuce (L. sativa var. Augustana). Although these assemblies have greatly facilitated lettuce research (Wei et al., 2021; Gao et al., 2022; Pink et al., 2022; Shen et al., 2023), they remain highly fragmented and incomplete, containing hundreds of gaps and omitting key genetic elements such as centromeres, rDNA, and telomeres, which continues to hinder progress in genomic research, gene cloning, and molecular breeding. Here, we report the first complete telomere-to-telomere (T2T) genome of the L. sativa cv. PKU06 (Figure 1A), which is widely cultivated and consumed.
Sequencing data for this assembly comprised 112.4× coverage of PacBio high-fidelity (HiFi) long reads, 42.9× coverage of Oxford Nanopore Technology (ONT) ultra-long reads (N50 > 100 kb), and 118.8× coverage of Hi-C reads (Supplemental Table 1). Genome assembly was performed using an in-house pipeline (Supplemental Figure 1) as follows. First, the HiFi and ONT reads were assembled using hifiasm, resulting in a draft genome of 125 contigs. After removing microbial and plastid sequences, these contigs were anchored to nine chromosomes using Hi-C data (Supplemental Figure 2). Misplaced or mis-oriented contigs were manually corrected in Juicebox. This yielded a chromosome-scale assembly with only two remaining gaps on Chr4, which were subsequently filled with the ONT reads to achieve a gap-free assembly (Supplemental Figure 3). The two nucleolus organizer regions (NORs) on Chr1 and Chr8 were successfully resolved, containing a total of 8.63 Mb of rDNA repeat arrays with 884 copies (Figure 1B). The final complete T2T genome (LsT2T) (Figure 1A) is 2593 Mb in size with a contig N50 of 320.7 Mb, about 25.7 times (2565.6% of) the 12.5 Mb contig N50 of Salinas (Supplemental Table 2). In addition, we identified all 18 telomeres using the seven-base telomere repeats (CCCTAAA and TTTAGGG) (Supplemental Table 3). LsT2T showed high synteny (96.96%) to the Salinas genome, though it displayed structural variants likely due to differences between the two cultivars (Supplemental Figure 4). Notably, LsT2T closed 384 gaps present in the Salinas genome, substantially improving the contiguity of the lettuce genome (Supplemental Table 2).

Extensive validation confirmed the accuracy of LsT2T. First, the Hi-C interaction map of LsT2T showed no obvious structural assembly errors (Supplemental Figure 2). Second, the alignment of all raw sequencing data to LsT2T yielded mapping rates of 99.9%, 96.4%, and 99.9% for HiFi, ONT, and NGS reads, respectively (Supplemental Table 1).
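The contig N50 comparison can be checked with a little arithmetic. The sketch below is illustrative only: the `n50` helper and the toy contig lengths are hypothetical, and only the 320.7 Mb and 12.5 Mb values come from the text. With those two values, the LsT2T N50 is about 25.66 times the Salinas N50, which corresponds to 2565.6% of the Salinas value, or a percent increase of about 2465.6%.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical contig lengths (Mb), used only to exercise the function.
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)  # total 300, half 150; 100 + 80 reaches it

# Values quoted in the text (Mb).
lst2t_n50_mb = 320.7
salinas_n50_mb = 12.5

fold = lst2t_n50_mb / salinas_n50_mb   # ratio of the two N50 values
percent_increase = (fold - 1) * 100    # increase relative to Salinas

print(f"toy N50: {toy_n50} Mb")
print(f"fold change: {fold:.3f}x, percent increase: {percent_increase:.1f}%")
```

Reporting the ratio as a fold change avoids the ambiguity between "percent of" and "percent increase".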
Uniform genome-wide read coverage (Figure 1A) indicated a complete and highly accurate assembly. Interestingly, we observed sporadic instances of elevated coverage in ONT reads (Figure 1A; Supplemental Table 4) corresponding to chloroplast sequences, suggesting the integration of plastid genome sequences into the nuclear genome. Furthermore, LsT2T has a quality value of 58 and a BUSCO score of 97.6% (Supplemental Table 2), demonstrating its high accuracy and completeness.

Approximately 2.1 Gb of repetitive elements (REs), constituting 81.4% of the LsT2T genome, were annotated; these were composed predominantly of transposable elements (TEs) (Figure 1C; Supplemental Table 5). Notably, the majority of these TEs were LTR retrotransposons, with Gypsy and Copia elements representing 37.84% and 27.23% of the LsT2T genome, respectively. A total of 45507 protein-coding genes (Supplemental Table 6) were predicted in LsT2T using ab initio prediction, comparison with homologous proteins, and transcriptomic data from five different tissues sequenced using NGS and PacBio Iso-seq. Of these genes, 48.8% were functionally annotated using eggNOG-mapper, and 57.3% were expressed in at least one tissue at a threshold of TPM ≥ 1 (Supplemental Table 6). Analysis of newly assembled sequences in LsT2T compared to the Salinas genome revealed that these sequences consisted of 2.09% genes, 31.34% REs, 16.9% centromeres, and 43.4% rDNA arrays (Supplemental Figure 5B), highlighting the significance of a complete genome in uncovering essential genomic regions. In addition, comparative analysis of the protein-coding genes in the LsT2T, Salinas, and Augustana genomes through orthogroup identification revealed a high degree of similarity across the three genome annotations, despite the differences in cultivar types, assembly quality, and annotation pipelines.
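The "expressed in at least one tissue at TPM ≥ 1" criterion can be illustrated with a short sketch. This is not the authors' pipeline: the function names, the tissue names, and all numbers below are hypothetical, and the TPM formula is the standard definition (length-normalized read rates scaled to sum to one million), which the text does not spell out.

```python
def tpm(counts, lengths_kb):
    """Convert raw read counts to TPM for one sample.
    counts[i] and lengths_kb[i] refer to the same gene."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    return [r / total * 1_000_000 for r in rates]


def fraction_expressed(tpm_by_tissue, threshold=1.0):
    """Fraction of genes with TPM >= threshold in at least one tissue.
    tpm_by_tissue maps tissue name -> list of TPM values (same gene order)."""
    samples = list(tpm_by_tissue.values())
    n_genes = len(samples[0])
    expressed = sum(
        1 for i in range(n_genes)
        if any(sample[i] >= threshold for sample in samples)
    )
    return expressed / n_genes


# Hypothetical counts and gene lengths: three genes with equal read rates.
example_tpm = tpm([10, 20, 30], [1.0, 2.0, 3.0])

# Hypothetical TPM table for four genes in two tissues.
toy_table = {
    "leaf": [5.0, 0.2, 0.0, 0.9],
    "root": [0.5, 1.0, 0.0, 0.3],
}
toy_fraction = fraction_expressed(toy_table)  # genes 1 and 2 pass

print(example_tpm)
print(toy_fraction)
```

A gene counts as expressed if any single tissue reaches the threshold, so adding tissues can only raise the fraction.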
LsT2T and Salinas (leaf lettuce) were more similar to each other than to Augustana (stem lettuce) in terms of the number of shared orthogroups (Supplemental Figure 5C).

Centromeres, which are repeat-rich heterochromatic regions, are critical for accurate chromosome segregation during cell division (Cleveland et al., 2003). The centromeres of lettuce were identified through ChIP-seq profiling using a lettuce-specific CENH3 (centromere-specific histone 3) antibody, which clearly delineated the boundaries of nine centromeres (Figure 1D; Supplemental Table 7), ranging in size from 2.7 Mb (Chr6) to 4.5 Mb (Chr7). The position of centromeres varied across chromosomes, with the ratio of long arm to short arm ranging from 1.1 (Chr6) to 3.2 (Chr8) (Figure 1A; Supplemental Figure 4A). Low sequence similarity among the centromeres was observed (Supplemental Figure 6), suggesting strong diversification. Centromeric repeats predominantly consisted of Gypsy (56.6%), Copia (13.1%), and satellites (16.3%), differing from those in non-centromeric regions (Figure 1C). In addition, centromeric Gypsy elements were dominated by Tekay, Angela, and centromeric retrotransposons of maize (CRMs) (Supplemental Figure 7A). Notably, CRMs appeared more frequently in centromeric than non-centromeric regions, consistent with previous reports for maize and cotton (Chen et al., 2023; Chang et al., 2024). Phylogenetic analysis of Gypsy revealed that centromeric CRMs formed a unique clade, suggesting the expansion of centromeric CRMs distinct from non-centromeric CRMs (Supplemental Figure 7B). The proportions of satellites in the centromeres varied from 3.25% (Chr3) to 60.14% (Chr1) (Figure 1D; Supplemental Table 7). De novo identification of centromeric satellite monomers using TRASH revealed 30-bp, 62-bp, 287-bp, and 123-bp monomers as the predominant satellites (Supplemental Figure 7C). We also observed higher-order repeats (Figure 1E; Supplemental Figure 8), primarily composed of 62-bp monomers along with miscellaneous short repeats (Supplemental Figure 7C). Analysis of CENH3 enrichment demonstrated that CENH3 preferentially binds to Gypsy elements and satellite sequences (Figure 1E; Supplemental Figures 8 and 9), highlighting their importance in centromere function.

Although the lettuce genome has been decoded, its 3D genomic landscape remains largely unexplored. We utilized miniMDS to model the 3D structure of the lettuce genome using high-resolution Hi-C data (Supplemental Figure 10). The 2.59-Gb lettuce genome is organized into topologically associated domains (TADs) and A/B compartments, exhibiting a low frequency of A/B compartment switching. Notably, all centromeres were localized in the B compartment (Figure 1E; Supplemental Figure 11). The A compartment demonstrated a higher gene density and lower TE density than the B compartment, and both compartments displayed distinctive epigenetic markers (Figure 1E; Supplemental Figure 11).
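The gene-density comparison between A and B compartments can be sketched as a count of features per megabase of compartment sequence. The example below is a minimal illustration under stated assumptions: the `density_per_mb` helper, the compartment intervals, and the gene midpoints are all hypothetical and do not come from the paper, and the text does not say how density was actually calculated.

```python
def density_per_mb(intervals, feature_positions):
    """Features per Mb within a set of half-open (start, end) intervals in bp.
    feature_positions holds one coordinate per feature, e.g. a gene midpoint."""
    total_bp = sum(end - start for start, end in intervals)
    if total_bp <= 0:
        raise ValueError("intervals must cover a positive length")
    hits = sum(
        1 for pos in feature_positions
        if any(start <= pos < end for start, end in intervals)
    )
    return hits / (total_bp / 1_000_000)


# Hypothetical compartment calls on a toy 6-Mb chromosome.
a_intervals = [(0, 2_000_000), (5_000_000, 6_000_000)]  # 3 Mb in total
b_intervals = [(2_000_000, 5_000_000)]                  # 3 Mb in total

# Hypothetical gene midpoints (bp): six fall in A, one falls in B.
gene_midpoints = [100_000, 500_000, 1_500_000,
                  5_200_000, 5_500_000, 5_900_000,
                  3_000_000]

a_density = density_per_mb(a_intervals, gene_midpoints)
b_density = density_per_mb(b_intervals, gene_midpoints)

print(f"A: {a_density:.2f} genes/Mb, B: {b_density:.2f} genes/Mb")
```

The same function applies to TE density by passing TE coordinates instead of gene midpoints.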
ChIP-seq analysis of histone modifications revealed that H3K4me3 and H3K27me3, which mark transcription activation and repression, respectively, were enriched in A compartments, whereas B compartments showed enrichment for H3K9me2, typically associated with heterochromatin (Figure 1E; Supplemental Figure 11). This pattern is consistent with those observed in most plant 3D genomes reported thus far.

Given the susceptibility of cultivated lettuce to diseases, developing disease-resistant cultivars is crucial for environmentally friendly disease management. Nucleotide-binding site leucine-rich repeat (NLR) proteins are crucial for plant immunity against pathogens (Chou et al., 2023). Our systematic analysis identified 514 putative NLR genes in the LsT2T genome, which were classified into seven subfamilies based on a phylogenetic analysis of the NB-ARC domain (Figure 1F). This classification indicates high phylogenetic diversity. By contrast, the same approach identified only 484 NLR genes in the v11 genome. The majority of NLR genes in the LsT2T genome were tandemly duplicated and genomically clustered, particularly on Chr1 and Chr2 (Figure 1G). Interestingly, four new NLRs were identified in the filled gap regions of LsT2T (Figure 1H; Supplemental Figure 12), including one located within a gap region of Chr4 that was covered exclusively by ONT reads mapped to LsT2T. Transcriptomic analysis of the 514 NLR genes (Supplemental Table 8) revealed that 58 of these genes were significantly upregulated during gray mold (Botrytis cinerea) infection compared to mock treatments, and 38 of these genes encoding TIR-NB-ARC(-LRR) domains were predominantly upregulated (Figure 1F; Supplemental Table 9).
The most significantly upregulated NLR gene, lettuce_v2_00029769, is homologous to the Arabidopsis thaliana AT5G36930 gene, which encodes a TIR-NB-ARC-LRR type NLR. The future functional characterization of these infection-induced NLR genes, as revealed by the T2T genome, will provide deeper insights into the mechanisms of lettuce immunity against pathogens.

In summary, we generated the complete T2T genome of lettuce, the first for Asterids, and thoroughly dissected the complex genetic and epigenetic landscape of its centromeres. This genome will serve as an essential resource for advancing lettuce research and facilitating genetic improvements.

All raw sequencing data generated for this project have been deposited in the China National Center for Bioinformation under accession number CRA014517, accessible at https://ngdc.cncb.ac.cn/gsa/s/Pya57yDW. The genome assembly and annotation are available on Figshare at https://figshare.com/s/f5f0e8068d5a236ea408.

This project was supported by the Key R&D Program of Shandong Province (ZR202211070163) and the Natural Science Foundation for Distinguished Young Scholars of Shandong Province (ZR2023JQ010). L.G. is also supported by the Taishan Scholars Program of Shandong Province.
