人工智能
情报检索
计算机科学
自然语言处理
万维网
作者
Aichun Zhu,Zijie Wang,Jingyi Xue,Xili Wan,Jing Jin,Tian Wang,Hichem Snoussi
标识
DOI:10.1109/tnnls.2024.3368217
摘要
Text-based person retrieval is the process of searching a massive visual resource library for images of a particular pedestrian, based on a textual query. Existing approaches often suffer from a problem of color (CLR) over-reliance , which can result in a suboptimal person retrieval performance by distracting the model from other important visual cues such as texture and structure information. To handle this problem, we propose a novel framework to E xcavate A ll-round I nformation B eyond C olor for the task of text-based person retrieval, which is therefore termed EAIBC. The EAIBC architecture includes four branches, namely an RGB branch, a grayscale (GRS) branch, a high-frequency (HFQ) branch, and a CLR branch. Furthermore, we introduce a mutual learning (ML) mechanism to facilitate communication and learning among the branches, enabling them to take full advantage of all-round information in an effective and balanced manner. We evaluate the proposed method on three benchmark datasets, including CUHK-PEDES, ICFG-PEDES, and RSTPReid. The experimental results demonstrate that EAIBC significantly outperforms existing methods and achieves state-of-the-art (SOTA) performance in supervised, weakly supervised, and cross-domain settings.
科研通智能强力驱动
Strongly Powered by AbleSci AI