Learning from Imbalanced Data

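Keywords: Computer science, Data science, Raw data, Machine learning, Artificial intelligence, Big data, Field (mathematics), Data mining, Mathematics, Programming language, Pure mathematics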
Authors
Haibo He, Edwardo A. Garcia
Source
Journal: IEEE Transactions on Knowledge and Data Engineering [Institute of Electrical and Electronics Engineers]
Volume (issue): 21 (9): 1263-1284 Citations: 7675
Identifier
DOI: 10.1109/tkde.2008.239
Abstract

With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data.