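Metric (unit)
Computer science
Ranking (information retrieval)
Semantic space
Relevance (law)
Semantic similarity
Information retrieval
Similarity (geometry)
Explicit semantic analysis
Space (punctuation)
Metric space
Artificial intelligence
Relevance feedback
Image retrieval
Natural language processing
Image (mathematics)
Semantic computing
Mathematics
Semantic Web
Semantic technology
Mathematical analysis
Operations management
Political science
Law
Economics
Operating system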
Authors
Sungkwon Choo, Seong Jong Ha, Joonsoo Lee
Identifier
DOI:10.1109/icip42928.2021.9506697
Abstract
Video-text retrieval requires finding an optimal space in which the similarity of two different modalities can be compared. Most approaches adopt a ranking loss as the primary training objective for finding this space. That loss only brings samples annotated as pairs closer to each other and does not consider the semantic relevance between different samples. As a result, even semantically similar pairs may fail to move close together. To address this problem, we propose semantic-preserving metric learning. The proposed method yields a metric space in which the similarity ratio between samples is proportional to the semantic relevance between their annotations. In extensive experiments on video-text datasets, the proposed method shows a close alignment between the learned metric space and the semantic space. It also demonstrates state-of-the-art retrieval performance.
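To make the limitation of the ranking loss concrete, the following is a minimal sketch, not the authors' formulation. It assumes a common bidirectional max-margin ranking loss over a video-caption similarity matrix, and it contrasts that with a hypothetical relevance-scaled variant in which the margin shrinks as two annotations become more semantically related. The function names, the `margin` value, and the `relevance` matrix are all illustrative assumptions; the abstract does not give the actual objective.

```python
def ranking_loss(sim, margin=0.2):
    """Bidirectional max-margin ranking loss (a common baseline, assumed here).

    sim[i][j] is the similarity between video i and caption j. Only the
    diagonal entries (i == j) are treated as annotated pairs.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Video i as query: caption j should score below caption i.
            total += max(0.0, margin + sim[i][j] - sim[i][i])
            # Caption i as query: video j should score below video i.
            total += max(0.0, margin + sim[j][i] - sim[i][i])
    return total / n


def relevance_scaled_ranking_loss(sim, relevance, margin=0.2):
    """Hypothetical variant: the margin is scaled by (1 - relevance).

    relevance[i][j] in [0, 1] is an assumed semantic relevance score between
    annotations i and j. This is an illustration of the idea of accounting
    for relevance, not the method proposed in the paper.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            m = margin * (1.0 - relevance[i][j])
            total += max(0.0, m + sim[i][j] - sim[i][i])
            total += max(0.0, m + sim[j][i] - sim[i][i])
    return total / n


# Two samples whose annotations are semantically near-identical, so their
# cross similarities (0.85) are almost as high as the paired ones (0.9).
sim = [[0.9, 0.85],
       [0.85, 0.9]]
fully_related = [[1.0, 1.0],
                 [1.0, 1.0]]

plain = ranking_loss(sim)
scaled = relevance_scaled_ranking_loss(sim, fully_related)
```

In this toy case the plain ranking loss is positive, so training would push the two related samples apart, while the relevance-scaled variant leaves them in place. Setting every relevance score to zero makes the two functions agree.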