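Computer science
Taxonomy (biology)
Generalization
Field (mathematics)
Data science
Limiting
Training set
Artificial intelligence
Machine learning
Mathematics
Botany
Mechanical engineering
Biology
Engineering
Mathematical analysis
Pure mathematics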
Authors
Markus Bayer, Marc‐André Kaufhold, Christian Reuter
Abstract
Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization capabilities, it can also address many other challenges and problems, from overcoming a limited amount of training data, to regularizing the objective, to limiting the amount of data used to protect privacy. Based on a precise description of the goals and applications of data augmentation and a taxonomy for existing works, this survey is concerned with data augmentation methods for textual classification and aims to provide a concise and comprehensive overview for researchers and practitioners. Derived from the taxonomy, we divide more than 100 methods into 12 different groupings and give state-of-the-art references expounding which methods are highly promising by relating them to each other. Finally, research perspectives that may constitute a building block for future work are provided.