Authors
Tianying Liu, Lu Zhang, Yang Wang, Jihong Guan, Yanwei Fu, Jiajia Zhao, Shuigeng Zhou
Abstract
The generic object detection (GOD) task has been successfully tackled by recent deep neural networks trained on large numbers of annotated samples from common classes. However, it remains non-trivial to generalize these detectors to novel long-tailed object classes, which have only a few labeled training samples. To this end, Few-Shot Object Detection (FSOD) has become topical recently, as it mimics the human ability of learning to learn and transfers the learned generic object knowledge from the common, data-abundant classes to the novel long-tailed classes. Research in this emerging field has flourished in recent years, with various benchmarks, backbones, and methodologies proposed. Several insightful FSOD survey articles [58, 59, 74, 78] systematically study and compare these works, grouping them into fine-tuning/transfer-learning methods and meta-learning methods. In contrast, we review the existing FSOD algorithms from a new perspective, under a new taxonomy based on their contributions: data-oriented, model-oriented, and algorithm-oriented. On this basis, we conduct a comprehensive survey, with performance comparison, of recent achievements in FSOD. We also analyze the technical challenges and the merits and demerits of these methods, and envision future directions of FSOD. Specifically, we first give an overview of FSOD, including the problem definition, common datasets, and evaluation protocols. We then propose the taxonomy, which groups FSOD methods into three types. Following this taxonomy, we provide a systematic review of the advances in FSOD. Finally, we present further discussions on performance, challenges, and future directions.