Boosting (machine learning)
Computer science
Artificial intelligence
Contextual image classification
Pattern recognition (psychology)
Machine learning
Computer vision
Image (mathematics)
Identification
DOI:10.1109/icicml60161.2023.10424766
Abstract
As the landscape of deep learning has evolved rapidly, numerous models and methodologies have emerged, revolutionizing the domain of image classification. In recent years, OpenAI's Contrastive Language-Image Pre-Training (CLIP) model, which uniquely bridges visual and textual information, has demonstrated robust generalization across diverse tasks, presenting fresh avenues and opportunities for image classification. Building upon the capabilities of the CLIP model, this research further explores the possibility that finer-grained labels may help improve the accuracy of image classification. The proposed method consists of three steps. First, identify existing or manually annotated sub-class labels that capture nuanced details within the primary categories. Second, use CLIP as a feature extractor, augmented with a fully connected layer; this setup facilitates supervised classification that leverages the granularity of the identified sub-class labels. Third, map the classified sub-labels back to their parent categories to obtain the final prediction. By combining the precision of finer-grained labels with CLIP's robust architecture, this method offers a promising avenue for bolstering classification accuracy. Code is available at https://github.com/24kcqsn/Image-Classification-Leveraging-Finer-Grained-Labels
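The sketch below illustrates the three-step pipeline described in the abstract, assuming the OpenAI `clip` package (https://github.com/openai/CLIP) and PyTorch. The sub-class-to-parent mapping `SUB_TO_PARENT`, the classifier head `SubLabelClassifier`, and the helper `predict_parent` are illustrative names, not taken from the paper's released code.

```python
# Minimal sketch: CLIP as a frozen feature extractor, a fully connected head
# trained on fine-grained sub-class labels, and a mapping back to parent classes.
import torch
import torch.nn as nn
import clip  # assumed: https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"

# Step 2a: load CLIP and freeze it so it acts purely as a feature extractor.
clip_model, preprocess = clip.load("ViT-B/32", device=device)
for p in clip_model.parameters():
    p.requires_grad = False

# Step 1: hypothetical mapping from sub-class index to parent-category index,
# e.g. {husky, beagle} -> dog, {tabby, siamese} -> cat.
SUB_TO_PARENT = {0: 0, 1: 0, 2: 1, 3: 1}
num_subclasses = len(SUB_TO_PARENT)

# Step 2b: a fully connected layer on top of CLIP image features,
# supervised with the fine-grained sub-class labels.
class SubLabelClassifier(nn.Module):
    def __init__(self, feat_dim: int, n_sub: int):
        super().__init__()
        self.fc = nn.Linear(feat_dim, n_sub)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = clip_model.encode_image(images).float()
        return self.fc(feats)

head = SubLabelClassifier(clip_model.visual.output_dim, num_subclasses).to(device)

# Step 3: map predicted sub-labels back to their parent categories.
def predict_parent(images: torch.Tensor) -> torch.Tensor:
    sub_pred = head(images).argmax(dim=-1)
    return torch.tensor([SUB_TO_PARENT[i.item()] for i in sub_pred], device=device)
```

Only the linear head is trained (with a standard cross-entropy loss over sub-class labels); the parent-level prediction is obtained purely by the deterministic sub-to-parent mapping, which matches the abstract's description of the final step.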