Can Large Language Models Assess Personality from Asynchronous Video Interviews? A Comprehensive Evaluation of Validity, Reliability, Fairness, and Rating Patterns

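Reliability (semiconductor), Psychology, Asynchronous communication, Validity, Personality, Computer science, Applied psychology, Rating scale, Reliability engineering, Psychometrics, Social psychology, Clinical psychology, Developmental psychology, Engineering, Telecommunications, Power (physics), Physics, Quantum mechanics
Authors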
Tianyi Zhang,Antonis Koutsoumpis,Janneke K. Oostrom,Djurre Holtrop,Sina Ghassemi,Reinout E. de Vries
Source
Journal: IEEE Transactions on Affective Computing [Institute of Electrical and Electronics Engineers]
Volume (Issue): 15 (3): 1769-1785 Citations: 3
Identifier
DOI:10.1109/taffc.2024.3374875
Abstract

The advent of Artificial Intelligence (AI) technologies has precipitated the rise of asynchronous video interviews (AVIs) as an alternative to conventional job interviews. These one-way video interviews are conducted online and can be analyzed using AI algorithms to automate and speed up the selection procedure. In particular, the swift advancement of Large Language Models (LLMs) has significantly decreased the cost and technical barrier to developing AI systems for automatic personality and interview performance evaluation. However, the generative and task-unspecific nature of LLMs might pose potential risks and biases when evaluating humans based on their AVI responses. In this study, we conducted a comprehensive evaluation of the validity, reliability, fairness, and rating patterns of two widely-used LLMs, GPT-3.5 and GPT-4, in assessing personality and interview performance from an AVI. We compared the personality and interview performance ratings of the LLMs with the ratings from a task-specific AI model and human annotators using simulated AVI responses of 685 participants. The results show that LLMs can achieve similar or even better zero-shot validity compared with the task-specific AI model when predicting personality traits. The verbal explanations for predicting personality traits generated by LLMs are interpretable by the personality items that are designed according to psychological theories. However, LLMs also suffered from uneven performance across different traits, insufficient test-retest reliability, and the emergence of certain biases. Thus, it is necessary to exercise caution when applying LLMs for human-related application scenarios, especially for significant decisions such as employment.