
The Dark Side of Machine Learning Algorithms

Author
Mariya I. Vasileva
DOI: 10.1145/3394486.3411068
Abstract

Machine learning and access to big data are revolutionizing the way many industries operate, providing analytics and automation for many real-world tasks that were previously thought to be necessarily manual. With the pervasiveness of artificial intelligence and machine learning over the past decade, and their rapid spread across a variety of applications, algorithmic fairness has become a prominent open research problem. For instance, machine learning is used in courts to assess the probability that a defendant will recommit a crime; in the medical domain to assist with diagnosis or to predict predisposition to certain diseases; in social welfare systems; and in autonomous vehicles. The decision-making processes in these real-world applications have a direct effect on people's lives, and can cause harm to society if the deployed machine learning algorithms are not designed with fairness in mind. The ability to collect and analyze large datasets across many domains brings with it the danger of implicit data bias, which can be harmful. Data, especially big data, is often heterogeneous, generated by different subgroups with their own characteristics and behaviors. Furthermore, data collection strategies vary vastly across domains, and labelling of examples is performed by human annotators, so the labelling process can amplify any inherent biases the annotators harbor. A model learned on biased data may not only produce unfair and inaccurate predictions, but also significantly disadvantage certain subgroups and introduce unfairness into downstream learning tasks. There are multiple ways in which discriminatory bias can seep into data: for example, in medical domains, there are many instances in which the data used are skewed toward certain populations, which can have dangerous consequences for the underrepresented communities [1].
Another example involves large-scale datasets widely used in machine learning tasks, such as ImageNet and Open Images: [2] shows that these datasets suffer from representation bias, and advocates for the need to incorporate geo-diversity and inclusion. Yet another example is the popular face recognition and generation datasets, such as CelebA and Flickr-Faces-HQ, where the ethnic and racial breakdown of example faces shows significant representation bias, evident in downstream tasks like face reconstruction from an obfuscated image [8]. To fight discriminatory use of machine learning algorithms that leverage such biases, one first needs to define the notion of algorithmic fairness. Broadly, fairness is the absence of any prejudice or favoritism towards an individual or a group based on their intrinsic or acquired traits in the context of decision making [3]. Fairness definitions fall under three broad types: individual fairness (whereby similar predictions are given to similar individuals [4, 5]), group fairness (whereby different groups are treated equally [4, 5]), and subgroup fairness (whereby a group fairness constraint is selected, and the task is to determine whether the constraint holds over a large collection of subgroups [6, 7]). In this talk, I will discuss a formal definition of these fairness constraints, examine the ways in which machine learning algorithms can amplify representation bias, and discuss how bias in both the example set and the label set of popular datasets has been misused in a discriminatory manner. I will touch upon the issues of ethics and accountability, and present open research directions for tackling algorithmic fairness at the representation level.
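As a concrete illustration of the group fairness notion mentioned above, here is a minimal sketch of one common instantiation, demographic parity, which asks that the positive-prediction rate be (approximately) equal across groups. The function names and toy data below are hypothetical, not taken from the talk:

```python
def positive_rate(preds, groups, group):
    """Fraction of positive (1) predictions within a single group."""
    members = [p for p, g in zip(preds, groups) if g == group]
    return sum(members) / len(members)

def demographic_parity_gap(preds, groups):
    """Largest pairwise difference in positive-prediction rates
    across groups; 0 means perfect demographic parity."""
    rates = [positive_rate(preds, groups, g) for g in set(groups)]
    return max(rates) - min(rates)

# Toy example: binary predictions for individuals in groups "a" and "b".
preds  = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_gap(preds, groups)
print(gap)  # group "a" rate 0.75, group "b" rate 0.25, so the gap is 0.5
```

A fairness audit would flag a large gap as evidence that the classifier treats the groups unequally; individual and subgroup fairness require different, finer-grained checks.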