Keywords
RGB color model, computer science, benchmark, modality, artificial intelligence, computer vision, pattern recognition
Authors
Yanbo Liu, Guo Cao, Boshan Shi, Yingxiang Hu
Identifier
DOI:10.1109/tmm.2023.3262978
Abstract
To obtain more accurate density maps and crowd counts, existing methods often train on paired RGB and depth images. However, these methods do not capture and fuse the complementary features of RGB-D data well. To address this problem, we propose a collaborative cross-modal attention network, named CCANet, for accurate RGB-D crowd counting. CCANet is mainly composed of a collaborative cross-modal attention module (CCAM) and a collaborative cross-modal fusion module (CCFM). Specifically, CCAM adaptively interleaves RGB-D information through channel and spatial cross-modal attention to fully capture the complementary features of the two modalities. CCFM then integrates these features adaptively by weighing their relative importance. Extensive experiments on the ShanghaiTechRGBD and MICC benchmarks demonstrate the effectiveness of CCANet for RGB-D crowd counting. In addition, CCANet is generally applicable to multimodal crowd counting and achieves superior counting performance on the RGBT-CC benchmark.
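The paper itself provides no code here; as a rough illustration of the cross-modal attention idea the abstract describes (channel and spatial attention computed from the opposite modality, followed by weighted fusion), below is a minimal NumPy sketch. All function names, the feature shapes, and the softmax-based fusion weighting are assumptions for illustration, not the authors' CCAM/CCFM implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # Global average pooling over spatial dims -> one weight per channel (C,)
    return sigmoid(feat.mean(axis=(1, 2)))

def spatial_attention(feat):
    # Average over channels -> one weight per spatial location (H, W)
    return sigmoid(feat.mean(axis=0))

def cross_modal_fuse(f_rgb, f_d):
    """Toy cross-modal fusion: each modality's features (C, H, W) are
    re-weighted by attention maps derived from the OTHER modality,
    then combined with scalar softmax weights (an assumed stand-in
    for the paper's adaptive fusion)."""
    r = f_rgb * channel_attention(f_d)[:, None, None] * spatial_attention(f_d)
    d = f_d * channel_attention(f_rgb)[:, None, None] * spatial_attention(f_rgb)
    scores = np.array([r.mean(), d.mean()])
    w = np.exp(scores) / np.exp(scores).sum()  # softmax over two modalities
    return w[0] * r + w[1] * d

rng = np.random.default_rng(0)
f_rgb = rng.standard_normal((8, 4, 4))   # assumed RGB feature map (C, H, W)
f_d = rng.standard_normal((8, 4, 4))     # assumed depth feature map
fused = cross_modal_fuse(f_rgb, f_d)
print(fused.shape)  # (8, 4, 4)
```

The fused map keeps the input shape, so it could feed a downstream density-regression head; the actual CCANet modules are learned layers rather than these fixed pooling operations.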