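Computer science
Leverage (statistics)
Benchmark (surveying)
Key (lock)
Task (project management)
Annotation
Object (grammar)
Data mining
Labeled data
Information retrieval
Object detection
Artificial intelligence
Relevance (law)
Machine learning
Pattern recognition (psychology)
Computer security
Management
Geodesy
Political science
Law
Economics
Geography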
Authors
Shucheng Li,Boyu Chang,Bo Yang,Hao Wu,Sheng Zhong,Fengyuan Xu
Identifier
DOI:10.1145/3539618.3591661
Abstract
Automatic dataset preparation can help users avoid labor-intensive and costly manual data annotations. The difficulty in preparing a high-quality dataset for object detection involves three key aspects: relevance, naturality, and balance, which are not addressed by existing works. In this paper, we leverage information from the web, and propose a fully-automatic dataset preparation mechanism without any human annotation, which can automatically prepare a high-quality training dataset for the detection task with English text terms describing target objects. It contains three key designs, i.e., keyword expansion, data de-noising, and data balancing. Our experiments demonstrate that the object detectors trained with auto-prepared data are comparable to those trained with benchmark datasets and outperform other baselines. We also demonstrate the effectiveness of our approach in several more challenging real-world object categories that are not included in the benchmark datasets.