
RLKdb: A comprehensively curated database of plant receptor-like kinase families

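Biology; Computational biology; Database; Bioinformatics; Computer science
Authors
Zhiyuan Yin, Jinding Liu, Daolong Dou
Source
Journal: Molecular Plant [Elsevier BV]
Volume (issue): 17 (4): 513-515 Citations: 1
Identifier
DOI: 10.1016/j.molp.2024.02.014
Abstract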

Since the first plant receptor-like kinase (RLK) gene, ZmPK1, was cloned from Zea mays in 1990 (Walker and Zhang, 1990), this large gene family has been extensively studied and shown to play crucial roles in growth, development, and immunity (Tang et al., 2017). RLKs are widespread in the plant kingdom. However, the biological functions of most RLKs remain largely elusive (Dievart et al., 2020). Given that RLKs share a conserved monophyletic RLK/Pelle kinase domain, RLKs in several model plants are classified into distinct families by their extracellular domains (Shiu and Bleecker, 2001). However, independent domain shuffling in specific lineages drives the origin of novel families, which raises a question: what is the landscape of RLKs across the entire plant kingdom? Previously, sequence-homology-based methods have been widely used for RLK identification and classification, but they might miss distantly related proteins with similar structures and potential novel families not mentioned in the literature. The academic community urgently requires a dedicated database for a systematic overview of the RLK gene family, providing data support for in-depth research on RLK genes.
Here, we used a topology-based method to accurately isolate the RLKomes from proteomes. The obtained RLKomes were further classified into (sub)families based on extracellular domains. We constructed a comprehensively curated plant RLK database (https://biotec.njau.edu.cn/rlkdb), which contains valuable resources for investigating the origin and evolution of the RLK family and multiple online tools for personalized analysis. To obtain the landscape of RLKs in plants, we collected 300 plant genomes with chromosome-level assemblies for identification of RLKs. In addition to some significant model species, including Arabidopsis, rice, and maize, these plant genomes encompass representatives from 4 phyla, 12 classes, and 45 orders (Figure 1A; Supplemental Table 1). We adopted a previously described pipeline developed by our group to identify plant RLKs (Yin et al., 2023). In Arabidopsis thaliana, our pipeline identified 468 RLKs, representing a 72% increase compared to the Ensembl annotation (Martin et al., 2023). We further examined the reliability of our pipeline with reference to the 610 putative RLKs reported by Shiu and Bleecker (2001). Among these, we observed that our pipeline missed 144 putative RLKs while predicting two novel RLKs.
Of the missed RLKs, 16 putative RLK gene models were removed from the current genome assembly, and 128 putative RLKs do not have a transmembrane domain. Several methods were also used to identify leucine-rich repeat (LRR)-RLKs and some other families (Man et al., 2020, 2023; Ngou et al., 2022, 2024). Comparatively, our pipeline has high accuracy and is suitable for systematic and high-throughput identification of RLKomes covering all the different families. In total, 220,038 RLKs were identified from 300 plant genomes. The RLKome size ranges from 1 to 2459, with an average proteome percentage of 1.35% (Figure 1B; Supplemental Table 1). In the past three decades, more than a dozen RLK families have been described (Dievart et al., 2020), but a systematic and automatic pipeline for the classification of RLKomes is still lacking.
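The Arabidopsis benchmark figures above can be checked with a few lines of arithmetic. The sketch below also includes a toy topology filter; the `Topology` fields and the rule "at least one transmembrane helix, with the kinase domain downstream of it" are illustrative assumptions, not the published pipeline's actual criteria. The implied Ensembl count and the mean per genome are derived here from the reported numbers and are not stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Topology:
    """Predicted features of one protein (hypothetical field names)."""
    protein_id: str
    tm_helices: int              # number of predicted transmembrane helices
    kinase_start: Optional[int]  # start of the kinase domain, None if absent
    tm_end: Optional[int]        # end of the last transmembrane helix


def is_rlk_candidate(p: Topology) -> bool:
    """Assumed rule: a TM helix followed by a C-terminal kinase domain."""
    if p.tm_helices < 1 or p.kinase_start is None or p.tm_end is None:
        return False
    return p.kinase_start > p.tm_end


# Figures reported in the text for Arabidopsis thaliana.
reference = 610             # putative RLKs from Shiu and Bleecker (2001)
removed_gene_models = 16    # no longer present in the current assembly
no_tm_domain = 128          # lack a transmembrane domain
novel = 2                   # predicted only by the pipeline

missed = removed_gene_models + no_tm_domain   # 144
identified = reference - missed + novel       # 468

# Derived values (not stated in the text): a 72% increase to 468 implies
# roughly this many Ensembl-annotated RLKs, and 220,038 RLKs over 300
# genomes gives this mean RLKome size.
implied_ensembl = round(identified / 1.72)
mean_per_genome = 220_038 / 300

print(missed, identified, implied_ensembl, round(mean_per_genome, 1))
```

The reconciliation shows that the 468 reported RLKs are exactly the 610 reference entries minus the 144 missed ones plus the two novel predictions.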
PRGdb (http://prgdb.org/prgdb4/) is a database of pathogen receptor genes, but it only provides a complete list of RLKs and lacks detailed gene information and family classification. According to their distinct extracellular domain structures, RLKs were divided into 18 families. Among them, 15 families have known Pfam annotations. The remaining unannotated RLKs were clustered by protein sequence similarity, which further yielded the proline-rich extensin-like receptor kinase and unknown disordered 1 families. All remaining RLKs were assigned to the unclassified family. LRR (44.0%), G-LecRLK (13.9%), and wall-associated kinase (11.1%) are the largest families, which together make up 69% of the RLKdb (Figure 1C). The large and well-known families occur in almost all of the 300 plant genomes, while the thaumatin; glycoside hydrolase family 19; cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins; and proline-rich membrane anchor 1 families are only found in specific lineages. RLKdb has a concise and user-friendly web interface. Through the home page or the navigation menu, users can open an RLK family (Supplemental Figure 1) or RLKome page (Supplemental Figure 2) to explore the database. On the RLK family page, the first section contains the family description, its lineage coverage, and a list box for switching to other families (Supplemental Figure 1A). The following section is an interactive table of genomes that possess the corresponding RLK family (Supplemental Figure 1B). Through the load button in the table, users can load an RLK family of interest into the third section (Supplemental Figure 1C).
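The three-tier family assignment described above (annotated extracellular domain first, then sequence-similarity cluster, then the unclassified family) can be sketched as follows. The domain labels in `DOMAIN_TO_FAMILY` are hypothetical placeholders rather than real Pfam accessions, and the function names are illustrative only; the actual classifier behind RLKdb is not described at this level of detail.

```python
from collections import Counter

# Hypothetical mapping from an extracellular-domain label to a family name.
# These keys are placeholders, not actual Pfam identifiers.
DOMAIN_TO_FAMILY = {
    "lrr": "LRR",
    "g_lectin": "G-LecRLK",
    "wak": "WAK",
}


def classify(extracellular_domains, cluster_family=None):
    """Assign a family: annotated domain, else cluster label, else fallback."""
    for domain in extracellular_domains:
        if domain in DOMAIN_TO_FAMILY:
            return DOMAIN_TO_FAMILY[domain]
    if cluster_family is not None:
        # e.g. a family found by sequence-similarity clustering
        return cluster_family
    return "Unclassified"


def family_shares(families):
    """Percentage of RLKs assigned to each family."""
    counts = Counter(families)
    total = sum(counts.values())
    return {family: 100 * n / total for family, n in counts.items()}


# Shares reported in the text for the three largest families.
REPORTED_SHARES = {"LRR": 44.0, "G-LecRLK": 13.9, "WAK": 11.1}
top_three = round(sum(REPORTED_SHARES.values()), 1)
print(top_three)
```

Summing the three reported shares reproduces the 69% figure, and the 18 families decompose as 15 annotated families, 2 cluster-derived families, and 1 unclassified family.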
The RLK members and landscape of the family can be displayed in five panels: (1) the RLK table panel shows all RLK members, (2) the linkage map panel displays the positions of RLK members in the genome, (3) the length distribution panel exhibits the distribution of RLK protein lengths, (4) the domain topology panel presents the percentage of each functional domain topology and a domain word cloud, and (5) the phylogeny panel showcases the evolutionary relationships among RLK members. The RLKome page has a similar layout. Its initial section provides information about the plant genome, including details on species, lineage, taxonomy, genome assembly, cultivar, and more (Supplemental Figure 2A). The second section is a column chart showing the number of RLKs in each family of the RLKome. By clicking on an RLK family name, the corresponding RLK family can be retrieved and displayed in the same five panels as on the family page. By clicking on the hyperlinks associated with RLK IDs in the RLK table panel, users can access a dedicated RLK page displaying its detailed information (Figure 1D). On the RLK page, the first section provides a snapshot of the RLK protein structure, along with essential details such as species, data source, and family information (Supplemental Figure 3A).
The second section contains six panels: (1) the gene model panel shows gene exon-intron structure and domain topology in the protein (Supplemental Figure 3B), (2) the transcription factor binding site panel provides a table of transcription factor binding sites upstream of the RLK gene (Supplemental Figure 3C), (3) the primer panel offers five pairs of qPCR primers (Supplemental Figure 3D), (4) the structure panel exhibits the 3D structure of the RLK protein and its ligand binding sites (Supplemental Figure 3E), (5) the interaction panel presents the RLK's potential interacting proteins based on the experimentally validated protein interactions collected in the STRING database (Supplemental Figure 3F), and (6) the phylogeny panel includes a Sankey diagram to show the distribution of the corresponding RLK subfamily across plant species, an interactive table of RLK subfamily members, and a phylogenetic tree containing the members of the RLK subfamily (Supplemental Figure 3G). Through the phylogenetic tree and the Sankey diagram, users can intuitively see the relatedness of a particular RLK of interest across the diversity of plant species in the database. We also developed online tools that enable users to search and classify RLKs into different families (Figure 1E). The web-based tool allows a user to upload a proteome or transcriptome file in FASTA format (Supplemental Figure 4A). The sequences are processed through the pipeline on a multi-core, GPU-equipped Linux server. For a proteome file, the user will obtain an RLK annotation file containing information on signal peptide, transmembrane, kinase, and other domain regions, along with an RLK sequence file. In the case of a transcriptome file, users will receive an additional open reading frame annotation file that highlights coding regions in the transcript sequences.
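To make the transcriptome input path concrete, the sketch below parses FASTA text and reports the longest forward-strand open reading frame per transcript. This is only an illustration of what an open reading frame annotation could contain; the ORF caller actually used by the RLKdb tool is not described in the text, and `read_fasta` and `longest_forward_orf` are hypothetical helper names.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(text):
    """Parse FASTA text into {record_id: sequence} (IDs end at first space)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts).upper() for key, parts in records.items()}


def longest_forward_orf(seq):
    """Return (start, end) of the longest ATG..stop ORF, or None.

    Coordinates are 0-based and half-open; only the three forward
    frames are scanned, and an ORF must end in a stop codon.
    """
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


example = ">tx1 toy transcript\nCCATGAA\nATGA\n"
for record_id, sequence in read_fasta(example).items():
    print(record_id, longest_forward_orf(sequence))
```

A production workflow would also scan the reverse strand and would typically rely on an established FASTA parser and ORF predictor rather than hand-written loops.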
To enhance database accessibility, the BLAST and Foldseek programs have been integrated to support sequence similarity and structure similarity retrieval, respectively (Supplemental Figures 4B and 4C). In summary, we have accurately annotated the RLKomes and classified the RLK families of 300 plant genomes with chromosome-level assemblies. The RLKdb provides comprehensive information on the RLKome, the RLK family, and individual RLKs. An online tool for genome- and transcriptome-wide identification and classification of RLKs was also developed. These resources and tools will aid evolutionary and functional studies of plant RLKs. This study was supported by grants from the National Natural Science Foundation of China (32270208, 32202251, and 32230089), the Fundamental Research Funds for the Central Universities (KYCXJC2023001 and KYQN2023039), the Natural Science Foundation of Jiangsu Province (BK20221000), and the China Agricultural Research System (CARS-21).