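Computer science
MNIST database
Artificial intelligence
Margin (machine learning)
Labeled data
Deep learning
Active learning (machine learning)
Benchmark (surveying)
Machine learning
Semi-supervised learning
Selection (genetic algorithm)
Entropy (arrow of time)
Set (abstract data type)
Data mining
Physics
Geodesy
Quantum mechanics
Programming language
Geography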
Identifier
DOI:10.1109/ijcnn.2014.6889457
Abstract
Deep learning has been shown to achieve outstanding performance in a number of challenging real-world applications. However, most existing work assumes a fixed set of labeled data, an assumption that does not necessarily hold in practice, because obtaining labeled data is usually expensive and time-consuming. Active labeling in deep learning aims to achieve the best learning result with a limited labeled data set by choosing the most appropriate unlabeled data to be labeled. This paper presents a new active labeling method, AL-DL, for cost-effective selection of data to be labeled. AL-DL uses one of three metrics for data selection: least confidence, margin sampling, or entropy. The method is applied to deep learning networks based on stacked restricted Boltzmann machines as well as stacked autoencoders. In experiments on the MNIST benchmark dataset, the method consistently outperforms random labeling by a significant margin.
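The three selection metrics named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used textbook definitions of least confidence, margin sampling, and entropy, applied to per-sample class probabilities such as a softmax output; the abstract does not give the paper's exact formulas, and the function names below (least_confidence, margin, entropy, select_for_labeling) are hypothetical, chosen only for illustration.

```python
import math


def least_confidence(probs):
    """1 minus the top class probability; larger means more uncertain."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two largest class probabilities; smaller means more uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy (natural log) of the prediction; larger means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pool_probs, k, metric="entropy"):
    """Return indices of the k most uncertain samples in an unlabeled pool.

    pool_probs is a list of per-sample class-probability lists. This is an
    illustrative helper, not the AL-DL procedure from the paper.
    """
    if metric == "least_confidence":
        score = least_confidence
    elif metric == "margin":
        # Negate the margin so that a higher score always means more uncertain.
        score = lambda p: -margin(p)
    elif metric == "entropy":
        score = entropy
    else:
        raise ValueError("unknown metric: " + metric)
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: score(pool_probs[i]),
                    reverse=True)
    return ranked[:k]


# Toy pool of three unlabeled samples with made-up 3-class predictions.
pool = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # nearly uniform, most uncertain
    [0.60, 0.30, 0.10],  # in between
]
print(select_for_labeling(pool, k=2, metric="margin"))  # [1, 2]
```

In an active labeling loop, the selected indices would be sent for annotation, added to the labeled set, and the network retrained before the next selection round. On this toy pool all three metrics pick the same two samples, although on real data they can disagree.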