Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models

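Keywords: benchmark (surveying); psychometrics; psychology; applied psychology; natural language processing; data science; computer science; econometrics; clinical psychology; mathematics; cartography; geography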

Authors

Yuan Li, Yue Huang, Hongyi Wang, Xiangliang Zhang, James Zou, Lichao Sun

Source

Journal: Cornell University - arXiv
Date: 2024-06-25

Links

arxiv.org, doi.org

Identifier

DOI: 10.48550/arxiv.2406.17675

Abstract

Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities, increasingly adopting roles akin to human-like assistants. The broader integration of LLMs into society has sparked interest in whether they manifest psychological attributes, and whether these attributes are stable, inquiries that could deepen the understanding of their behaviors. Inspired by psychometrics, this paper presents a framework for investigating psychology in LLMs, including psychological dimension identification, assessment dataset curation, and assessment with results validation. Following this framework, we introduce a comprehensive psychometrics benchmark for LLMs that covers six psychological dimensions: personality, values, emotion, theory of mind, motivation, and intelligence. This benchmark includes thirteen datasets featuring diverse scenarios and item types. Our findings indicate that LLMs manifest a broad spectrum of psychological attributes. We also uncover discrepancies between LLMs' self-reported traits and their behaviors in real-world scenarios. This paper demonstrates a thorough psychometric assessment of LLMs, providing insights into reliable evaluation and potential applications in AI and social sciences.
