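Biomedicine
Field (mathematics)
Computer science
Data science
Coronavirus disease 2019 (COVID-19)
Biology
Bioinformatics
Medicine
Mathematics
Pathology
Pure mathematics
Infectious disease (medical specialty)
Disease
Authors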
Rita González-Márquez,Luca Schmidt,Benjamin M. Schmidt,Philipp Berens,Dmitry Kobak
Source
Journal: Patterns
[Elsevier]
Date: 2024-04-09
Volume (issue): 5 (6): 100968-100968
Citations: 9
Identifier
DOI:10.1016/j.patter.2024.100968
Abstract
The number of publications in biomedicine and life sciences has grown so much that it is difficult to keep track of new scientific works and to have an overview of the evolution of the field as a whole. Here, we present a two-dimensional (2D) map of the entire corpus of biomedical literature, based on the abstract texts of 21 million English articles from the PubMed database. To embed the abstracts into 2D, we used the large language model PubMedBERT, combined with t-SNE tailored to handle samples of this size. We used our map to study the emergence of the COVID-19 literature, the evolution of the neuroscience discipline, the uptake of machine learning, the distribution of gender imbalance in academic authorship, and the distribution of retracted paper mill articles. Furthermore, we present an interactive website that allows easy exploration and will enable further insights and facilitate future research.
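The embedding step described in the abstract (high-dimensional text representations projected to 2D with t-SNE) can be illustrated with a minimal sketch. This is not the authors' pipeline: scikit-learn's `TSNE` is swapped in for the large-scale t-SNE variant they mention, and random 768-dimensional vectors stand in for PubMedBERT abstract representations. The function name `embed_2d`, the cluster sizes, and the perplexity value are all assumptions made for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_2d(vectors, perplexity=30.0, seed=42):
    """Project high-dimensional vectors to 2D with t-SNE (hypothetical helper)."""
    tsne = TSNE(
        n_components=2,
        perplexity=perplexity,
        init="pca",          # PCA initialization tends to preserve global layout
        random_state=seed,   # fixed seed for a reproducible layout
    )
    return tsne.fit_transform(vectors)


# Synthetic stand-in for abstract embeddings (assumption: 768 dimensions,
# the hidden size of a BERT-base model). Two Gaussian blobs play the role
# of two research topics.
rng = np.random.default_rng(0)
topic_a = rng.normal(loc=0.0, scale=1.0, size=(60, 768))
topic_b = rng.normal(loc=3.0, scale=1.0, size=(60, 768))
vectors = np.vstack([topic_a, topic_b]).astype(np.float32)

coords = embed_2d(vectors)  # one (x, y) pair per input vector
print(coords.shape)
```

In a real setting the `vectors` array would hold one representation per abstract, and at the scale of millions of articles a t-SNE implementation built for large samples would be needed rather than this small-data default.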