Keywords
Computer science
Leverage (statistics)
Boosting (machine learning)
Transfer learning
Labeled data
Artificial intelligence
Machine learning
Construct (Python library)
Data modeling
Test data
Semi-supervised learning
Training set
Programming language
Database
Authors
Wenyuan Dai, Qiang Yang, Gui-Rong Xue, Yong Yu
Identifier
DOI:10.1145/1273496.1273521
Abstract
Traditional machine learning makes a basic assumption: the training and test data should be drawn from the same distribution. In many cases, however, this identical-distribution assumption does not hold. It may be violated when a task arrives from a new domain while labeled data are available only from a similar old domain. Labeling the new data can be costly, and discarding all the old data would be wasteful. In this paper, we present a novel transfer learning framework called TrAdaBoost, which extends boosting-based learning algorithms (Freund & Schapire, 1997). TrAdaBoost allows users to utilize a small amount of newly labeled data to leverage the old data and construct a high-quality classification model for the new data. We show that this method lets us learn an accurate model using only a tiny amount of new data and a large amount of old data, even when the new data are not sufficient to train a model alone, and that TrAdaBoost effectively transfers knowledge from the old data to the new. The effectiveness of our algorithm is analyzed theoretically and empirically to show that our iterative algorithm converges to an accurate model.
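The abstract describes the core mechanism only in words: each boosting round trains a weak learner on old (source) and new (target) data together, measures error on the new data, down-weights old-domain instances the learner misclassifies, and re-weights new-domain instances as in AdaBoost. The sketch below, a minimal illustration and not the authors' code, follows the TrAdaBoost update rules for binary labels in {0, 1}; the decision-stump weak learner, the `rounds` parameter, the numerical clipping, and the toy threshold data in the usage note are illustrative assumptions.

```python
import numpy as np

def train_stump(X, y, w):
    # Weighted decision stump: pick (feature, threshold, polarity)
    # minimizing weighted 0/1 error. This weak learner is an
    # illustrative choice, not prescribed by the paper.
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, 0)
                err = np.sum(w * (pred != y))
                if err < best_err:
                    best_err, best = err, (j, thr, pol)
    return best

def stump_predict(stump, X):
    j, thr, pol = stump
    return np.where(pol * (X[:, j] - thr) >= 0, 1, 0)

def tradaboost(Xs, ys, Xt, yt, rounds=10):
    """Sketch of TrAdaBoost. Xs/ys: old-domain (source) data;
    Xt/yt: small newly labeled (target) data. Returns a predictor."""
    n, m = len(Xs), len(Xt)
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    w = np.ones(n + m)
    # Fixed down-weighting rate for misclassified source instances
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / rounds))
    stumps, betas = [], []
    for _ in range(rounds):
        p = w / w.sum()
        stump = train_stump(X, y, p)
        miss = np.abs(stump_predict(stump, X) - y)  # per-instance 0/1 loss
        # Error is measured on the target (new) data only
        eps = np.sum(w[n:] * miss[n:]) / np.sum(w[n:])
        eps = min(max(eps, 1e-10), 0.499)  # keep beta_t well-defined
        beta_t = eps / (1.0 - eps)
        # Misclassified old-domain instances lose weight; target
        # instances are re-weighted as in AdaBoost
        w[:n] *= beta_src ** miss[:n]
        w[n:] *= beta_t ** (-miss[n:])
        stumps.append(stump)
        betas.append(beta_t)
    start = int(np.ceil(rounds / 2))  # vote over the later half of rounds
    def predict(Xq):
        score, thresh = np.zeros(len(Xq)), 0.0
        for stump, b in zip(stumps[start:], betas[start:]):
            a = -np.log(max(b, 1e-10))
            score += a * stump_predict(stump, Xq)
            thresh += a * 0.5
        return (score >= thresh).astype(int)
    return predict
```

As a toy usage, suppose the old domain labels points by the rule x >= 2 while the new domain uses x >= 4: with six old points and four new ones, the vote over the later rounds recovers the new-domain rule even though the old data alone would mislabel the region between the two thresholds.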