ChatGPT: Jack of all trades, master of none

Computer science, Master data, Aeronautics, Database, Engineering
Authors
Jan Kocoń,Igor Cichecki,Oliwier Kaszyca,Mateusz Kochanek,Dominika Szydło,Joanna Baran,Julita Bielaniewicz,Marcin Gruza,Arkadiusz Janz,Kamil Kanclerz,A. Kocoń,Bartłomiej Koptyra,Wiktoria Mieleszczenko-Kowszewicz,Piotr Miłkowski,Marcin Oleksy,Maciej Piasecki,Łukasz Radliński,Konrad Wojtasik,Stanisław Woźniak,Przemysław Kazienko
Source
Journal: Information Fusion [Elsevier]
Volume: 99, Article 101861. Citations: 395
Identifier
DOI:10.1016/j.inffus.2023.101861
Abstract

OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and revolutionized the approach in artificial intelligence to human-model interaction. The first contact with the chatbot reveals its ability to provide detailed and precise answers in various areas. Several publications on ChatGPT evaluation test its effectiveness on well-known natural language processing (NLP) tasks. However, the existing studies are mostly non-automated and tested on a very limited scale. In this work, we examined ChatGPT's capabilities on 25 diverse analytical NLP tasks, most of them subjective even to humans, such as sentiment analysis, emotion recognition, offensiveness, and stance detection. In contrast, the other tasks require more objective reasoning, like word sense disambiguation, linguistic acceptability, and question answering. We also evaluated the GPT-4 model on five selected subsets of NLP tasks. We automated the ChatGPT and GPT-4 prompting process and analyzed more than 49k responses. Our comparison of the results with available State-of-the-Art (SOTA) solutions showed that the average loss in quality of the ChatGPT model was about 25% for zero-shot and few-shot evaluation. For the GPT-4 model, the loss for semantic tasks is significantly lower than for ChatGPT. We showed that the more difficult the task (lower SOTA performance), the higher the ChatGPT loss. This holds especially for pragmatic NLP problems like emotion recognition. We also tested the ability to personalize ChatGPT responses for selected subjective tasks via Random Contextual Few-Shot Personalization, and we obtained significantly better user-based predictions. Additional qualitative analysis revealed a ChatGPT bias, most likely due to the rules imposed on human trainers by OpenAI.
Our results provide the basis for a fundamental discussion of whether the high quality of recent predictive NLP models can indicate a tool's usefulness to society and how the learning and validation procedures for such systems should be established.
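
The abstract reports an average "loss in quality" of ChatGPT relative to SOTA but does not spell out the formula. A minimal sketch of one plausible reading follows, assuming the loss is the relative drop of the model score with respect to the SOTA score, expressed as a percentage. The function name `relative_loss`, the task names, and all scores are hypothetical and are not taken from the paper.

```python
def relative_loss(sota: float, model: float) -> float:
    """Percentage drop of a model score relative to the SOTA score.

    Assumed definition: 100 * (sota - model) / sota.
    """
    if sota <= 0:
        raise ValueError("SOTA score must be positive")
    return 100.0 * (sota - model) / sota


# Hypothetical (SOTA score, model score) pairs; illustrative only.
scores = {
    "task_a": (80.0, 60.0),
    "task_b": (50.0, 35.0),
}

# Per-task loss, then the unweighted mean across tasks.
losses = {task: relative_loss(s, m) for task, (s, m) in scores.items()}
average_loss = sum(losses.values()) / len(losses)

for task, loss in losses.items():
    print(f"{task}: {loss:.1f}% loss vs. SOTA")
print(f"average: {average_loss:.1f}%")
```

Under this reading, a task with lower SOTA performance can show a larger relative loss even when the absolute gap is smaller, which is consistent with the abstract's observation that harder tasks have higher ChatGPT loss.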