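Benchmark (surveying)
Psychometrics
Psychology
Applied psychology
Natural language processing
Data science
Computer science
Econometrics
Clinical psychology
Mathematics
Cartography
Geography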
Authors
Yuan Li, Yue Huang, Hongyi Wang, Xiangliang Zhang, James Zou, Lichao Sun
Source
Journal: Cornell University - arXiv
Date: 2024-06-25
Identifier
DOI: 10.48550/arxiv.2406.17675
Abstract
Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities, increasingly adopting roles akin to human-like assistants. The broader integration of LLMs into society has sparked interest in whether they manifest psychological attributes and whether these attributes are stable, inquiries that could deepen the understanding of their behaviors. Inspired by psychometrics, this paper presents a framework for investigating psychology in LLMs, including psychological dimension identification, assessment dataset curation, and assessment with results validation. Following this framework, we introduce a comprehensive psychometrics benchmark for LLMs that covers six psychological dimensions: personality, values, emotion, theory of mind, motivation, and intelligence. This benchmark includes thirteen datasets featuring diverse scenarios and item types. Our findings indicate that LLMs manifest a broad spectrum of psychological attributes. We also uncover discrepancies between LLMs' self-reported traits and their behaviors in real-world scenarios. This paper demonstrates a thorough psychometric assessment of LLMs, providing insights into reliable evaluation and potential applications in AI and social sciences.