Neuropsychology
Psychology
Cognitive psychology
Prefrontal cortex
Cognitive science
Neuroscience
Cognition
Authors
Riccardo Loconte, Graziella Orrù, Mirco Tribastone, Pietro Pietrini, Giuseppe Sartori
Abstract
The Artificial Intelligence (AI) research community has used ad-hoc benchmarks to measure the 'intelligence' level of Large Language Models (LLMs). Previous research has found that LLMs struggle with cognitive tasks that require the integrity of the human prefrontal lobes, known as 'prefrontal functions.' In December 2022, OpenAI released ChatGPT, a new chatbot that quickly gained popularity for its ability to understand and respond to human instructions. To investigate ChatGPT's level of 'intelligence,' we conducted a neuropsychological evaluation with the same tests routinely used to evaluate prefrontal functioning in humans, since human 'intelligence' requires the functional integrity of the frontal lobes. While ChatGPT is well-known to exhibit outstanding performance on generative linguistic tasks, its performance on prefrontal tests was inhomogeneous, with some tests well above average, others in the lower range, and others frankly impaired. Specifically, we have identified poor planning abilities and difficulty in recognising semantic absurdities and understanding others' intentions and mental states. This inconsistent profile highlights how LLMs' emergent abilities do not yet mimic human cognitive functioning. In addition, our results indicate that standardised neuropsychological batteries developed to assess human cognitive functions may be suitable for challenging ChatGPT's performance.