Topics
Creativity
Computer science
Argument (complex analysis)
Object (grammar)
Artificial intelligence
Scale (ratio)
Psychology
Cognitive psychology
Natural language processing
Social psychology
Biochemistry
Chemistry
Physics
Quantum mechanics
Authors
Aleksandra Zielińska, Peter Organisciak, Denis Dumas, Maciej Karwowski
Identifier
DOI: 10.1016/j.tsc.2023.101414
Abstract
Divergent thinking (DT) has been at the heart of creativity measurement for over seven decades. At the same time, large-scale usage of DT tests is limited due to the tedious procedure of scoring the responses, which often requires several judges to assess thousands of participants’ ideas. Across two studies (N = 195 and N = 404), we examined the quality of artificial intelligence-based scoring models (Ocsai, Organisciak et al., 2023) to score Alternate Uses Tasks (Study 1: brick, Study 2: brick, can, rope). Based on more than 6000 ideas provided by participants in Polish and automatically translated to English, we fit a series of idea (response)- and prompt (object)-level structural equation models. When artificial intelligence-based and semantic distance scores were modeled together, latent correlations with human ratings ranged from r = 0.56 to r = 0.95 at the response (idea) level and from r = 0.61 to r = 0.99 at the object (prompt) level. A hierarchical (i.e., person-level) model with three DT tasks modeled together (Study 2) demonstrated a latent correlation between automatized and human ratings of r = 0.96 (Babbage) and r = 0.98 (DaVinci). Notably, the same results were obtained based on untranslated responses provided in Polish. Automated and human scores provided the same serial-order effect pattern and the same profile of differences under “be fluent” vs. “be creative” instructions. This investigation offers an initial yet compelling argument that the new algorithms provide a close-to-perfect score of DT tasks when benchmarked against human ratings, even when the responses are created in a different language and automatically translated to English or used in an untranslated form.
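For readers unfamiliar with semantic-distance scoring of Alternate Uses Task responses, the sketch below illustrates the general idea under stated assumptions: it embeds the prompt object and each idea with the third-party sentence-transformers package (the model name "all-MiniLM-L6-v2" is chosen here purely for illustration) and treats one minus cosine similarity as a rough originality proxy. This is not the authors' pipeline; the study scored responses with Ocsai (fine-tuned GPT-3 Babbage and DaVinci models) alongside separate semantic-distance scores, and the example prompts and responses below are hypothetical.

```python
# Illustrative sketch only (not the Ocsai scoring used in the paper):
# score Alternate Uses Task ideas by semantic distance from the prompt object.
from sentence_transformers import SentenceTransformer, util

# Assumption: a general-purpose sentence-embedding model stands in for the
# study's scoring models.
model = SentenceTransformer("all-MiniLM-L6-v2")

prompt = "brick"
responses = [
    "build a wall",
    "use as a paperweight",
    "grind into pigment for painting",
]

# Embed the prompt and the responses, then use 1 - cosine similarity as a
# distance score: ideas further from the prompt's typical use score higher.
prompt_vec = model.encode(prompt, convert_to_tensor=True)
response_vecs = model.encode(responses, convert_to_tensor=True)
similarities = util.cos_sim(prompt_vec, response_vecs)[0]

for idea, sim in zip(responses, similarities):
    print(f"{idea!r}: semantic distance = {1 - float(sim):.3f}")
```

In the study itself, such response-level scores were aggregated to the prompt (object) and person levels and related to human ratings through structural equation models, which is where the reported latent correlations come from.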