Computer science
Shot (pellet)
Artificial intelligence
Adaptation (eye)
Feature (linguistics)
Feature extraction
Computer vision
Pattern recognition (psychology)
Psychology
Linguistics
Philosophy
Neuroscience
Organic chemistry
Chemistry
Authors
Guangxing Wu, Junxi Chen, Wentao Zhang, Ruixuan Wang
Identifier
DOI:10.1145/3595916.3626396
Abstract
Large vision-language models such as CLIP have demonstrated impressive capabilities in zero-shot recognition. To apply CLIP to few-shot classification tasks, several CLIP-based methods have been proposed, achieving significant improvements. However, these methods either insufficiently leverage CLIP's prior knowledge during training or neglect the impact of feature adaptation. In this paper, we propose FAR, a novel approach that balances distribution-altered Feature Adaptation with pRior knowledge of CLIP to further improve the performance of CLIP on few-shot classification tasks. First, we introduce an adapter that enhances the effectiveness of CLIP adaptation by amplifying the differences between the fine-tuned CLIP features and the original CLIP features. Second, we leverage the prior knowledge of CLIP to mitigate the risk of overfitting. Through this framework, a good trade-off between feature adaptation and preservation of prior knowledge is achieved, enabling both components to be exploited effectively on downstream tasks. We evaluate our method on more than 10 classification datasets, where it consistently outperforms existing methods, demonstrating its effectiveness and robustness.
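The abstract gives no implementation details, so the following is only a minimal sketch of the idea it describes: a residual adapter that amplifies the shift away from the frozen CLIP features, with the final logits blended against CLIP's zero-shot logits to retain prior knowledge. All names and hyperparameters here (`FeatureAdapter`, `amplify`, `alpha`, `tau`) are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAdapter(nn.Module):
    """Hypothetical residual adapter over frozen CLIP image features.

    Sketches the abstract's idea of amplifying the difference between
    adapted and original CLIP features: a small MLP predicts an
    adaptation residual, which is scaled before being added back.
    """

    def __init__(self, dim: int, hidden: int = 256, amplify: float = 2.0):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, dim),
        )
        self.amplify = amplify  # scales the adaptation residual (assumed knob)

    def forward(self, clip_feat: torch.Tensor) -> torch.Tensor:
        residual = self.mlp(clip_feat)                  # learned adaptation direction
        adapted = clip_feat + self.amplify * residual   # amplified feature shift
        return F.normalize(adapted, dim=-1)             # keep unit norm like CLIP


def combined_logits(image_feat, text_feat, adapter, alpha=0.5, tau=100.0):
    """Blend adapted logits with frozen zero-shot CLIP logits.

    The zero-shot branch stands in for CLIP's prior knowledge, which the
    abstract says is used to mitigate overfitting; `alpha` controls the
    trade-off between adaptation and the prior.
    """
    zero_shot = tau * image_feat @ text_feat.t()          # frozen CLIP prior
    adapted = tau * adapter(image_feat) @ text_feat.t()   # adapted branch
    return alpha * adapted + (1.0 - alpha) * zero_shot
```

Under this reading, only the adapter's parameters would be trained on the few-shot data, while the blending with the zero-shot branch keeps predictions anchored to the original CLIP distribution.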