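Computer science
Artificial intelligence
Leverage (statistics)
Modality (human-computer interaction)
Pattern
Discriminative
Machine learning
Missing data
Conditional random field
Pattern recognition (psychology)
Social science
Sociology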
Authors
Xiu-Shen Wei,Hongtao Yu,Anqi Xu,Faen Zhang,Yuxin Peng
Source
Journal: IEEE Transactions on Image Processing
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume: 33, pp. 3456-3469
Identifier
DOI:10.1109/tip.2024.3403051
Abstract
This work tackles fine-grained recognition with incomplete multi-modal data, a problem overlooked by previous work. For such a practical problem, a method should not only capture the fine-grained patterns of objects but also alleviate the challenges posed by missing modalities. We propose to leverage a meta-learning strategy so that the model learns both fast modal adaptation and, more importantly, missing-modality completion across a variety of incomplete multi-modality learning tasks. Building on this, we develop a meta-completion method, termed MECOM, which performs multimodal fusion and explicit missing-modality completion through our proposed cross-modal attention and decoupling reconstruction. To further improve fine-grained recognition accuracy, we design an additional partial stream (a counterpart to the holistic main stream of MECOM) and a selection of part-level features (corresponding to the parts of fine-grained objects); both are tailored to the fine-grained nature of the task and capture discriminative but subtle part-level patterns. Comprehensive quantitative and qualitative experiments, together with various ablation studies, on two fine-grained multimodal datasets and one generic multimodal dataset show the superiority of MECOM over competing methods. Our code is open-source and available at https://github.com/SEU-VIPGroup/MECOM.
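The abstract names cross-modal attention as one ingredient of the fusion step but does not spell it out. As a rough illustration only, the sketch below shows generic scaled dot-product attention in which features from one modality query features from another. It is not the MECOM implementation: the function names, the toy feature vectors, and the choice of plain scaled dot-product attention are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Generic scaled dot-product attention (hypothetical helper, not MECOM code).

    Each query vector (from modality A) attends over the key/value
    vectors (from modality B) and returns a weighted mix of the values.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Convex combination of the value vectors.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# Toy, made-up features: two "image" tokens attend over three "text" tokens.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
text_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_modal_attention(image_tokens, text_tokens, text_tokens)
```

Each row of `fused` is a weighted average of the text features, with larger weights on the text tokens most similar to the corresponding image token. How MECOM actually parameterizes its attention and handles a missing modality is described in the paper and the linked repository.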