Co-citation and Co-authorship Networks of Statisticians

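Citation; Computer science; Data science; Probability and statistics; Network analysis; Bibliometrics; Complex network; Information retrieval; Data mining; Statistics; World Wide Web; Mathematics; Quantum mechanics; Physics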

Authors

Pengsheng Ji, Jiashun Jin, Zheng Tracy Ke, Wan‐Shan Li

Source

Journal: Journal of Business & Economic Statistics [Informa]
Date: 2021-09-09; Volume (Issue): 40 (2), pp. 469-485; Cited by: 35

Links

arxiv.org, doi.org

Identifiers

DOI: 10.1080/07350015.2021.1978469

Abstract

We collected and cleaned a large dataset on publications in statistics. The dataset consists of the co-author relationships and citation relationships of 83,331 articles published in 36 representative journals in statistics, probability, and machine learning, spanning 41 years. The dataset allows us to construct many different networks, and motivates a number of research problems about the research patterns and trends, research impacts, and network topology of the statistics community. In this article we focus on (i) using the citation relationships to estimate the research interests of authors, and (ii) using the co-author relationships to study the network topology. Using co-citation networks we constructed, we discover a "statistics triangle," reminiscent of the statistical philosophy triangle (Efron 1998). We propose new approaches to constructing the "research map" of statisticians, as well as the "research trajectory" for a given author to visualize his/her research interest evolvement. Using co-authorship networks we constructed, we discover a multi-layer community tree and produce a Sankey diagram to visualize the author migrations in different sub-areas. We also propose several new metrics for research diversity of individual authors. We find that "Bayes," "Biostatistics," and "Nonparametric" are three primary areas in statistics. We also identify 15 sub-areas, each of which can be viewed as a weighted average of the primary areas, and identify several underlying reasons for the formation of co-authorship communities. We also find that the research interests of statisticians have evolved significantly in the 41-year time window we studied: some areas (e.g., biostatistics, high-dimensional data analysis, etc.) have become increasingly more popular. The research diversity of statisticians may be lower than we might have expected. For example, for the personalized networks of most authors, the p-values of the proposed significance tests are relatively large.
