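Genome
Biology
Coding
Annotation
Gene
Computational biology
Genomics
Genetics
Gene prediction
Human genome
Comparative genomics
Phylogenomics
Genome project
Gene annotation
Evolutionary biology
Phylogenetics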
Clade
Source
Journal: Cell Genomics
[Elsevier]
Date: 2023-08-01
Volume (issue): 3 (8): 100375
Citations: 9
Identifier
DOI:10.1016/j.xgen.2023.100375
Abstract
Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced. Identifying genes in these sequences is essential to understand the biology of the species. This is challenging due to the transcriptional complexity of eukaryotic genomes, which encode hundreds of thousands of transcripts of multiple types. Among these, a small set of protein-coding mRNAs play a disproportionately large role in defining phenotypes. Due to their sequence conservation, orthology can be established, making it possible to define the universal catalog of eukaryotic protein-coding genes. This catalog should substantially contribute to uncovering the genomic events underlying the emergence of eukaryotic phenotypes. This piece briefly reviews the basics of protein-coding gene prediction, discusses challenges in finalizing annotation of the human genome, and proposes strategies for producing annotations across the eukaryotic Tree of Life. This lays the groundwork for obtaining the catalog of all genes—the Earth's code of life.