Semantics derived automatically from language corpora contain human-like biases

Natural language processing · Artificial intelligence · Semantics (computer science) · Natural language
Authors
Aylin Caliskan,Joanna J. Bryson,Arvind Narayanan
Source
Journal: arXiv: Artificial Intelligence  Citations: 44
Identifier
DOI:10.1126/science.aal4230
Abstract

Artificial intelligence and machine learning are in a period of astounding growth. However, there are concerns that these technologies may be used, either with or without intention, to perpetuate the prejudice and unfairness that unfortunately characterize many human institutions. Here we show for the first time that human-like semantic biases result from the application of standard machine learning to ordinary language, the same sort of language humans are exposed to every day. We replicate a spectrum of standard human biases as exposed by the Implicit Association Test and other well-known psychological studies. We replicate these using a widely used, purely statistical machine-learning model, namely the GloVe word embedding, trained on a corpus of text from the Web. Our results indicate that language itself contains recoverable and accurate imprints of our historic biases, whether these are morally neutral as towards insects or flowers, problematic as towards race or gender, or even simply veridical, reflecting the status quo for the distribution of gender with respect to careers or first names. These regularities are captured by machine learning along with the rest of semantics. In addition to our empirical findings concerning language, we also contribute new methods for evaluating bias in text, the Word Embedding Association Test (WEAT) and the Word Embedding Factual Association Test (WEFAT). Our results have implications not only for AI and machine learning, but also for the fields of psychology, sociology, and human ethics, since they raise the possibility that mere exposure to everyday language can account for the biases we replicate here.
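
The abstract names the WEAT but does not spell out its statistic. As a rough illustration only, the sketch below computes a WEAT-style effect size in the form it is commonly described: the difference between two target sets' mean associations with two attribute sets, divided by the standard deviation of those associations over all target words. The function names, the 2-dimensional toy vectors, and the use of the sample standard deviation are assumptions made for this example; they are not taken from this page, and the toy vectors are not GloVe embeddings.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """Mean similarity of w to attribute set A minus its mean similarity to set B."""
    return mean(cosine(w, a) for a in attrs_a) - mean(cosine(w, b) for b in attrs_b)


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """WEAT-style effect size for target sets X, Y and attribute sets A, B.

    Assumption: the normalizer is the sample standard deviation of the
    association scores over all target words in X and Y.
    """
    sx = [association(x, attrs_a, attrs_b) for x in targets_x]
    sy = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)


# Hypothetical 2-d "embeddings", chosen by hand so that the first target set
# points roughly the same way as the first attribute set.
flowers = [(1.0, 0.1), (0.9, 0.2)]
insects = [(0.1, 1.0), (0.2, 0.9)]
pleasant = [(1.0, 0.0), (0.8, 0.1)]
unpleasant = [(0.0, 1.0), (0.1, 0.8)]

effect = weat_effect_size(flowers, insects, pleasant, unpleasant)
print(f"effect size: {effect:.3f}")
```

A positive value here means the first target set sits closer to the first attribute set than the second target set does; swapping the two target sets flips the sign. With real embeddings, each tuple would be replaced by the vector looked up for a word.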