Computer science
Convolution (computer science)
Optics (focusing)
Distillation
Modal verb
Image retrieval
Artificial intelligence
Scale (ratio)
Task (project management)
Domain (mathematics)
Image (mathematics)
Information retrieval
Pattern recognition (psychology)
Data mining
Artificial neural network
Engineering
Chemistry
Organic chemistry
Physics
Mathematics
Systems engineering
Quantum mechanics
Polymer chemistry
Pure mathematics
Optics
Authors
Yu Liao,Rui Yang,Tao Xie,Hantong Xing,Dou Quan,Shuang Wang,Biao Hou
Identifier
DOI: 10.1109/igarss52108.2023.10281578
Abstract
With the rapid development of remote sensing (RS) technology, the remote sensing cross-modal image-text retrieval (RSCMITR) task has attracted increasing attention. Large-scale pre-trained models have achieved remarkable results in cross-modal retrieval of natural images, but current RSCMITR models do not exploit them, limiting improvements in retrieval performance. This paper proposes a lightweight network structure built on a large-scale pre-trained model and knowledge distillation: a lightweight model based on separable convolution and text convolution is designed, and knowledge distillation transfers the hidden knowledge of the large-scale model CLIP-RS to the lightweight model, enabling fast and accurate retrieval. The proposed method achieves state-of-the-art performance on four commonly used RSCMITR datasets.
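The abstract does not specify the distillation objective used to transfer knowledge from CLIP-RS to the lightweight student. A minimal sketch of the standard temperature-scaled knowledge-distillation loss (Hinton et al.) is given below, assuming the common choice of KL divergence between softened teacher and student output distributions; the function names and temperature value are illustrative, not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    # Softened probabilities; a higher temperature flattens the
    # distribution, exposing the teacher's "dark knowledge".
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    # KL(teacher || student) over temperature-softened distributions,
    # scaled by T^2 so gradients keep a comparable magnitude across T.
    p = softmax(teacher_logits, temperature)  # teacher (soft targets)
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl
```

In practice this term is typically combined with a standard task loss (e.g. a contrastive retrieval loss) via a weighting coefficient; when the student matches the teacher exactly, the distillation term vanishes.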