Authors
Jiming Gao, Jing Wang, Ruishan Wang, Jian Wang, Ying Ma, Jian Yang, Bada-Lahu Tai, LI Chong-jiu, Xingyun Chai, Shungang Jiao, Tong Chen, Han Zheng, Xiang Li, Liping Kang, Chao Jiang, Junhui Zhou, Juan Liu, Lusheng Huang
Abstract
Syringa pinnatifolia, a shrub in the genus Syringa of the family Oleaceae (Figure S1), is valued in traditional Chinese medicine for its volatile chemical constituents. These principally include a suite of humulane-type monocyclic sesquiterpenoids with an 11-membered ring, notably zerumbone and its derivatives. Renowned for their cardioprotective, immunomodulatory and anti-carcinogenic properties, these compounds occur widely in various monocot species (Haque et al., 2017) yet are also conspicuously present in the eudicot S. pinnatifolia. The biosynthetic pathway of zerumbone has been elucidated in the monocot Zingiber zerumbet (Okamoto et al., 2011; Yu et al., 2008, 2011) but remains unclear in the eudicot S. pinnatifolia.

To characterize the biosynthetic pathway of zerumbone in S. pinnatifolia, we assembled a chromosome-level genome using 105.17 Gb of Illumina sequencing data, 309.64 Gb of PacBio long reads and 123.54 Gb of Hi-C paired-end reads (Tables S1–S3; Appendix S1). The PacBio data yielded a contig-level assembly spanning 998.20 Mb (Table S4), and subsequent integration of the Hi-C data anchored the contigs to 23 pseudochromosomes (Figure S2; Tables S5–S7). De novo annotation predicted 43 197 genes, and a BUSCO analysis indicated gene coverage of 93.9% (Figure S3). An MCMCtree (Yang, 2007) analysis suggested that S. pinnatifolia and S. oblata diverged at the boundary of the Paleogene and Neogene periods (Figure S4). The Ks distributions of anchored paralogs and orthologs indicated that at least two whole-genome duplication (WGD) events have occurred in S. pinnatifolia (Figure 1a), and an intragenomic synteny analysis revealed remnants of a WGD event common to Oleaceae species (Wang et al., 2022) (Figure S5).

Comparative analysis of the volatile components of S. pinnatifolia and Z.
zerumbet revealed identical compounds involved in the elucidated biosynthesis pathway of zerumbone, suggesting that the pathway is conserved in S. pinnatifolia (Figure S6). In the monocot Z. zerumbet, zerumbone biosynthesis begins with the cyclization of farnesyl pyrophosphate (FPP) to α-humulene by the terpene synthase ZSS1 (Yu et al., 2008). We identified 32 terpene synthase (TPS) genes in S. pinnatifolia, divided into the TPS-a, TPS-b, TPS-c, TPS-e/f and TPS-g subfamilies (Figure S7). We then screened the TPSs belonging to the TPS-a and TPS-g subfamilies (Table S8), which are primarily responsible for sesquiterpenoid production within the TPS family (Chen et al., 2011), for an enzyme that cyclizes FPP to α-humulene. Using FPP as the substrate, we assessed the function of the candidate TPS genes in an Escherichia coli heterologous expression system (Figure S8; Appendix S1). We found that TPS8, which belongs to the TPS-a subfamily in S. pinnatifolia, cyclizes FPP to α-humulene along with β-caryophyllene (Figures 1b and S9). In addition to TPS8, we characterized six TPSs that generate 14 different sesquiterpenoids and three TPSs that produce monoterpenoids (Figures S10–S17). TPS8 yields the same products as the α-humulene synthase of Z. zerumbet (ZSS1), and both enzymes belong to the TPS-a subfamily (Figure S18), although their sequence identity is only 40.93%. The ratio of the two products (α-humulene and β-caryophyllene) nevertheless differs between the enzymes: α-humulene accounts for approximately 88% of the two products of ZSS1 but only about 23% of those of TPS8 (Figure S19a). This difference may contribute to the higher zerumbone content in the monocot species, as the comparison of zerumbone content between the tissues where this compound mainly accumulates in S. pinnatifolia (stem) and Z.
zerumbet (rhizome) shows (Figure S19b). Analysing the homologues of TPS8 within the Oleaceae family, we identified three genes in the reported Oleaceae genomes (Figure 1c). We assessed their catalytic activity on FPP using the E. coli heterologous expression system and found that OeTPSa from O. europaea produces α-humulene and β-caryophyllene, as TPS8 does (Figure S20).

Cytochrome P450 (P450) enzymes perform various bio-oxidation reactions that collectively greatly expand the chemical diversity of terpenoids (Zheng et al., 2019). Our annotation identified 217 putative CYP450 genes in the S. pinnatifolia genome, divided into eight clans (CYP51, CYP71, CYP72, CYP74, CYP710, CYP85, CYP86 and CYP97) (Figure S21). CYP71, CYP72, CYP85 and CYP86 are multi-family clans, larger than the single-family clans, and the CYP71 clan includes the majority of the CYP450 families known to date to be involved in sesquiterpenoid metabolism (Zheng et al., 2019). Focusing on the CYP71 and related clans, we performed in vitro functional assays in which microsomes prepared from yeast cells, representing more than 100 P450s, were fed zerumbone as the substrate; a new peak appeared in the CYP76S105 reaction (Appendix S1). Comparison of its retention time (13.54 min) and mass spectrum (m/z 234.22) with those of a chemical standard confirmed the product as zerumbone epoxide, apparently resulting from C2–C3 epoxidation of zerumbone (Figures 1d and S22). We thus identified a downstream P450-catalysed step of the zerumbone biosynthesis pathway that has not been reported in the monocot Z. zerumbet (Figure 1e). Moreover, to our knowledge, an epoxidation reaction, which adds a single oxygen atom across a carbon–carbon double bond, has not previously been reported for the CYP76 family (Wang et al., 2021).

To understand the evolutionary history of the S.
pinnatifolia CYP76S105, we constructed a phylogenetic tree from transcripts homologous to CYP76S105 (the one transcript with the highest identity from each species) in the 1000 Plant Transcriptome (1KP) database and the reported Oleaceae genomes (Figure 1f; Table S9). Among the homologous transcripts, Ligustrum sinense LsCYP76S had the highest identity to CYP76S105 (92.86%) in the 1KP database, and O. europaea OeCYP76S had the highest identity (87.27%) among the reported Oleaceae genomes other than that of S. pinnatifolia. Microsynteny analysis illustrated the evolutionary trajectory of CYP76S105 homologues within the Oleaceae family and revealed two homologous genes (OeCYP76S-a/b) with identical nucleotide sequences at different chromosomal locations (Figure 1g). Within S. pinnatifolia itself, the gene with the highest identity to CYP76S105 (85.92%) was SpCYP76S-b, which is located close to CYP76S105 on chromosome 9. Using zerumbone as the substrate, we found that L. sinense LsCYP76S (designated CYP76S110) and O. europaea OeCYP76S (designated CYP76S63) can convert zerumbone to zerumbone epoxide (Figure S23), whereas the other candidates, including the highly similar S. pinnatifolia SpCYP76S-b, cannot. The two copies of CYP76S63 arose as duplicated genes from the WGD event and act on zerumbone; by contrast, the two members of the gene pair CYP76S105/SpCYP76S-b, formed by a tandem duplication (TD) event, do not encode enzymes with the same catalytic function on zerumbone. This illustrates the complex evolutionary history of gain and loss of gene function in the Oleaceae family (Figure 1g).

In summary, this study presents a chromosome-level assembly of S. pinnatifolia aimed at elucidating the zerumbone biosynthesis pathway in eudicots. We identified a TPS that cyclizes FPP to produce α-humulene, the sesquiterpene intermediate in the zerumbone biosynthesis pathway.
The discovery of an epoxidation reaction catalysed by CYP76 enzymes broadens our understanding of the functions of this family.

This work was supported by the CACMS Innovation Fund (CI2021A04101, CI2023E002) and the Fundamental Research Funds for the Central Public Welfare Research Institutes (Grant No. ZZ13-YQ-093).

The authors declare no competing interests.

J. L. and L. H. designed and led the project. B. T., X. C. and S. J. collected and provided the samples. Jing W. and T. C. performed the genome assembly, annotation and estimation. J. G. identified the TPS and CYP450 genes. J. G., Jian W., J. Y., X. L. and L. K. analysed the TPS products. J. G., R. W. and Y. M. contributed to the in vitro functional assays of the CYP450s. C. L. supported the analysis of the CYP450 products. H. Z. and J. Z. supported plasmid construction. J. G., Jing W. and J. L. wrote the manuscript. C. J. contributed to editing the manuscript. All authors read and approved the final manuscript.

The genome data have been deposited in the Genome Warehouse of the China National Genomics Data Center (NGDC) under BioProject accession number PRJCA023451 and BioSample number SAMC3365680.

Appendix S1. Supplemental materials and methods.
Figures S1–S23 and Tables S1–S9. Supplemental figures and tables.
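The epoxidation step reported for CYP76S105 can be checked with simple mass bookkeeping: adding one oxygen atom across a double bond raises the nominal mass by 16. The Python sketch below is illustrative only and is not part of the authors' methods. It assumes the standard molecular formula of zerumbone (C15H22O) and nominal integer atomic masses, and the helper names (`parse_formula`, `nominal_mass`, `epoxidize`) are hypothetical.

```python
import re

# Nominal (integer) masses of the most abundant isotopes; an assumption
# made for this sketch, not a value taken from the article.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def parse_formula(formula):
    """Parse a simple formula such as 'C15H22O' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def nominal_mass(counts):
    """Sum the nominal atomic masses over all atoms in the formula."""
    return sum(NOMINAL_MASS[element] * n for element, n in counts.items())


def epoxidize(counts):
    """Return the element counts after adding a single oxygen atom."""
    product = dict(counts)
    product["O"] = product.get("O", 0) + 1
    return product


# Zerumbone is assumed here to be C15H22O (standard literature formula).
zerumbone = parse_formula("C15H22O")
epoxide = epoxidize(zerumbone)

print(nominal_mass(zerumbone))  # 218
print(nominal_mass(epoxide))    # 234
```

Under these assumptions the product has a nominal mass of 234, which agrees at unit resolution with the m/z 234 signal reported for zerumbone epoxide.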