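Computer science
Artificial intelligence
Computer vision
Theme (computing)
Image processing
Embedding
Adaptation (eye)
Pattern recognition (psychology)
Image (mathematics)
Optics
Physics
Operating system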
Authors
Jikai Wang,Wanglong Lu,Yu Wang,Kaijie Shi,Xianta Jiang,Hanli Zhao
Identifier
DOI:10.1117/1.jei.33.1.013028
Abstract
Grouping images into different themes is a challenging task in photo book curation. Unlike image object recognition, image theme recognition focuses on understanding the main subject or overall meaning conveyed by an image. However, existing general image recognition methods struggle to achieve satisfactory performance on this task. In this work, we aim to solve the image theme recognition task with few-shot training samples using pre-trained contrastive language-image models. We propose a text-prompt-guided few-shot image adaptation framework, which incorporates a text-embedding-guided classifier and an auxiliary classification loss to exploit embedded visual and text features, stabilize network training, and enhance recognition performance. We also present an annotated dataset, Theme25, for studying image theme recognition. We conducted experiments on our Theme25 dataset as well as the publicly available CIFAR100 and ImageNet datasets to demonstrate the superiority of our method over the compared state-of-the-art methods.
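The abstract names two components, a text-embedding-guided classifier and an auxiliary classification loss, without giving their details. The following is a minimal illustrative sketch of how such a pairing could look, not the authors' implementation. Everything in it is an assumption: the toy 4-dimensional vectors stand in for embeddings from a pre-trained contrastive language-image model, the classifier weights are simply copied from the text embeddings, the auxiliary head is a plain linear layer initialised to zero, and the names `cosine_logits`, `total_loss`, `scale`, and `aux_coeff` are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine_logits(image_feat, class_weights, scale=100.0):
    """Scaled cosine similarity between an image feature and each class weight.
    The scale factor is an assumed temperature, not a value from the paper."""
    f = l2_normalize(image_feat)
    return [scale * sum(a * b for a, b in zip(f, l2_normalize(w)))
            for w in class_weights]

def linear_logits(image_feat, weights):
    """Plain dot-product logits for the assumed auxiliary head."""
    return [sum(a * b for a, b in zip(image_feat, w)) for w in weights]

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def total_loss(image_feat, label, text_weights, aux_weights, aux_coeff=0.5):
    """Main loss from the text-guided classifier plus a weighted auxiliary loss."""
    main = cross_entropy(cosine_logits(image_feat, text_weights), label)
    aux = cross_entropy(linear_logits(image_feat, aux_weights), label)
    return main + aux_coeff * aux

# Toy stand-ins for text embeddings of three theme prompts (hypothetical data).
text_embeddings = [[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]]
text_weights = [list(e) for e in text_embeddings]  # classifier starts from text embeddings
aux_weights = [[0.0] * 4 for _ in range(3)]        # auxiliary head starts at zero

image_feat = [0.1, 0.9, 0.2, 0.0]                  # toy image embedding, closest to theme 1
logits = cosine_logits(image_feat, text_weights)
predicted = max(range(len(logits)), key=lambda k: logits[k])
loss = total_loss(image_feat, 1, text_weights, aux_weights)
print(predicted, round(loss, 4))
```

In a few-shot setting, both weight sets would be updated by gradient descent on the handful of labelled images per theme. Starting the main classifier from text embeddings gives it a sensible zero-shot prior before any training, which is one plausible reading of "text-embedding-guided".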