Lost in translation? Not for Large Language Models: Automated divergent thinking scoring performance translates to non-English contexts

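Creativity; Computer science; Argument (complex analysis); Object (grammar); Artificial intelligence; Scale (ratio); Psychology; Cognitive psychology; Natural language processing; Social psychology; Biochemistry; Quantum mechanics; Physics; Chemistry
Authors
Aleksandra Zielińska, Peter Organisciak, Denis Dumas, Maciej Karwowski
Source
Journal: Thinking Skills and Creativity [Elsevier]
Volume: 50, Article 101414; Citations: 5
Identifier
DOI: 10.1016/j.tsc.2023.101414
Abstract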

Divergent thinking (DT) has been at the heart of creativity measurement for over seven decades. At the same time, large-scale usage of DT tests is limited due to the tedious procedure of scoring the responses, which often requires several judges to assess thousands of participants’ ideas. Across two studies (N = 195 and N = 404), we examined the quality of artificial intelligence-based scoring models (Ocsai, Organisciak et al., 2023) to score Alternate Uses Tasks (Study 1: brick, Study 2: brick, can, rope). Based on more than 6000 ideas provided by participants in Polish and automatically translated to English, we fit a series of idea (response)- and prompt (object)-level structural equation models. When artificial intelligence-based and semantic distance scores were modeled together, latent correlations with human ratings ranged from r = 0.56 to r = 0.95 at the response (idea) level and from r = 0.61 to r = 0.99 at the object (prompt) level. A hierarchical (i.e., person-level) model with three DT tasks modeled together (Study 2) demonstrated a latent correlation between automatized and human ratings of r = 0.96 (Babbage) and r = 0.98 (DaVinci). Notably, the same results were obtained based on untranslated responses provided in Polish. Automated and human scores provided the same serial-order effect pattern and the same profile of differences under “be fluent” vs. “be creative” instructions. This investigation offers an initial yet compelling argument that the new algorithms provide a close-to-perfect score of DT tasks when benchmarked against human ratings, even when the responses are created in a different language and automatically translated to English or used in an untranslated form.
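As a rough illustration of two quantities named in the abstract, the Python sketch below computes a semantic distance score as 1 minus the cosine similarity between two embedding vectors, and an observed Pearson correlation between automated and human scores. It is only a minimal sketch under stated assumptions: the vectors and ratings are made-up toy values, the function names are hypothetical, and the study itself reports latent correlations from structural equation models, which this simple observed correlation does not reproduce.

```python
import math


def cosine_distance(u, v):
    """Semantic distance as 1 - cosine similarity of two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pearson_r(x, y):
    """Observed Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy (assumed) embeddings for the prompt "brick" and three responses.
prompt_vec = [1.0, 0.0, 0.0]
response_vecs = [
    [0.9, 0.1, 0.0],  # close to the prompt, e.g. a common use
    [0.5, 0.5, 0.0],  # moderately distant
    [0.1, 0.2, 0.9],  # far from the prompt, e.g. an unusual use
]

# Automated score per response: distance from the prompt embedding.
automated_scores = [cosine_distance(prompt_vec, v) for v in response_vecs]

# Toy (assumed) mean human originality ratings for the same responses.
human_scores = [1.5, 2.5, 4.0]

print("automated:", [round(s, 3) for s in automated_scores])
print("r(automated, human):", round(pearson_r(automated_scores, human_scores), 3))
```

In practice the embeddings would come from a language model and the correlation would be estimated at the latent level, so the numbers printed here carry no empirical meaning.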
