Gene
Human genome
Computational biology
Annotation
Biology
Genome
Genetics
Ribonucleic acid
Gene annotation
Authors
Paulo Amaral, Sílvia Carbonell Sala, Francisco M. De La Vega, Tiago Faial, Adam Frankish, T Gingeras, Roderic Guigó, Jennifer Harrow, Artemis G. Hatzigeorgiou, Rory Johnson, Terence D. Murphy, Mihaela Pertea, Kim D. Pruitt, Shashikant Pujar, Hazuki Takahashi, Igor Ulitsky, Ales Varabyou, Christine A. Wells, Mark Yandell, Piero Carninci, Steven L. Salzberg
Source
Journal: Nature
[Springer Nature]
Date: 2023-10-04
Volume (issue): 622 (7981): 41-47
Citations: 47
Identifier
DOI:10.1038/s41586-023-06490-x
Abstract
Scientists have been trying to identify every gene in the human genome since the initial draft was published in 2001. In the years since, much progress has been made in identifying protein-coding genes, currently estimated to number fewer than 20,000, with an ever-expanding number of distinct protein-coding isoforms. Here we review the status of the human gene catalogue and the efforts to complete it in recent years. Beside the ongoing annotation of protein-coding genes, their isoforms and pseudogenes, the invention of high-throughput RNA sequencing and other technological breakthroughs have led to a rapid growth in the number of reported non-coding RNA genes. For most of these non-coding RNAs, the functional relevance is currently unclear; we look at recent advances that offer paths forward to identifying their functions and towards eventually completing the human gene catalogue. Finally, we examine the need for a universal annotation standard that includes all medically significant genes and maintains their relationships with different reference genomes for the use of the human gene catalogue in clinical settings. Although the catalogue of human protein-coding genes is nearing completion, the number of non-coding RNA genes remains highly uncertain, and for all genes much work remains to be done to understand their functions.