DPNET: Dynamic Poly-attention Network for Trustworthy Multi-modal Classification

Topics
Modal verb, Computer science, Modality (human–computer interaction), Sensor fusion, Data mining, Representation (politics), Rank (graph theory), Feature (linguistics), Machine learning, Pattern, Artificial intelligence, Pattern recognition (psychology), Mathematics, Polymer chemistry, Social science, Linguistics, Chemistry, Philosophy, Combinatorics, Sociology, Politics, Political science, Law
Authors
Xin Zou, Chang Tang, Xiao Xiang Zheng, Zhenglai Li, Xiao He, Shan An, Xinwang Liu
Identifier
DOI: 10.1145/3581783.3612652
Abstract
With advances in sensing technology, multi-modal data collected from different sources are increasingly available. Multi-modal classification aims to integrate complementary information from multi-modal data to improve classification performance. However, existing multi-modal classification methods are generally weak at integrating global structural information and providing trustworthy multi-modal fusion, especially in safety-sensitive practical applications (e.g., medical diagnosis). In this paper, we propose a novel Dynamic Poly-attention Network (DPNET) for trustworthy multi-modal classification. Specifically, DPNET has four merits: (i) To capture intrinsic modality-specific structural information, we design a structure-aware feature aggregation module that learns a compact, structure-preserving global feature representation for each modality. (ii) A transparent fusion strategy based on modality confidence estimation is introduced to track information variation across modalities for dynamic fusion. (iii) To make multi-modal fusion more effective and efficient, we introduce a cross-modal low-rank fusion module that reduces the complexity of tensor-based fusion and emphasizes informative rank-wise features via a rank attention mechanism. (iv) A label confidence estimation module is devised to drive the network to produce more credible confidence estimates, and an intra-class attention loss is introduced to supervise network training. Extensive experiments on four real-world multi-modal biomedical datasets demonstrate that the proposed method achieves competitive performance compared with state-of-the-art methods.
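The cross-modal low-rank fusion in (iii) is described only at a high level, so the PyTorch sketch below illustrates one plausible reading: the full tensor-product fusion is factorized into R rank-1 terms (in the spirit of low-rank multimodal fusion), and a rank attention re-weights the R rank-wise fused features before aggregation. The class name, layer shapes, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LowRankFusionWithRankAttention(nn.Module):
    """Hypothetical sketch of cross-modal low-rank fusion with rank attention.

    Each modality's feature (with a constant 1 appended so unimodal terms
    survive the product) is projected onto R rank-wise slices; the slices
    are combined across modalities by elementwise product, and a learned
    attention re-weights the R rank-wise fused features before summing.
    """

    def __init__(self, input_dims, output_dim, rank=8):
        super().__init__()
        self.rank = rank
        self.output_dim = output_dim
        # One low-rank factor per modality: maps (d_m + 1) -> (rank * output_dim).
        self.factors = nn.ModuleList(
            nn.Linear(d + 1, rank * output_dim, bias=False) for d in input_dims
        )
        # Rank attention: scores each of the R rank-wise fused features.
        self.rank_attention = nn.Linear(output_dim, 1)

    def forward(self, features):
        # features: list of (batch, d_m) tensors, one per modality.
        fused = None
        for x, factor in zip(features, self.factors):
            ones = torch.ones(x.size(0), 1, device=x.device, dtype=x.dtype)
            z = torch.cat([x, ones], dim=1)                        # (batch, d_m + 1)
            proj = factor(z).view(-1, self.rank, self.output_dim)  # (batch, R, d_out)
            # Elementwise product accumulates cross-modal interactions
            # without materializing the full fusion tensor.
            fused = proj if fused is None else fused * proj
        weights = torch.softmax(self.rank_attention(fused), dim=1)  # (batch, R, 1)
        return (weights * fused).sum(dim=1)                         # (batch, d_out)
```

For example, fusing a 200-d and a 300-d modality into a 64-d representation, `LowRankFusionWithRankAttention([200, 300], 64, rank=8)` applied to `[torch.randn(4, 200), torch.randn(4, 300)]` yields a (4, 64) tensor. The confidence-based dynamic fusion and label confidence estimation of (ii) and (iv) would sit on top of such a module and are not sketched here.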