Computer science
Underwater
Modality (human-computer interaction)
Artificial intelligence
Sonar
Mask (illustration)
Feature (linguistics)
Coding (set theory)
Pattern recognition (psychology)
Image fusion
Modal verb
Image (mathematics)
Process (computing)
Computer vision
Geography
Archaeology
Set (abstract data type)
Chemistry
Polymer chemistry
Programming language
Visual arts
Art
Philosophy
Operating system
Linguistics
Authors
Shih-Wei Yang, Li-Hsiang Shen, Hong-Han Shuai, Kai-Ten Feng
Abstract
Underwater image recognition is crucial for underwater detection applications, and fish classification has become an emerging research area in recent years. Existing image classification models are usually trained on data collected from terrestrial environments and are therefore unsuitable for underwater images, whose incomplete and noisy features make them difficult to identify. To address this, we propose a cross-modal augmentation via fusion (CMAF) framework for acoustic-based fish image classification. Our approach separates the process into two branches, a visual modality and a sonar signal modality, where the latter provides complementary characteristic features. We augment the visual modality, design an attention-based fusion module, and adopt a masking-based training strategy with a mask-based focal loss to improve the learning of local features and address the class imbalance problem. Our proposed method outperforms state-of-the-art methods. Our source code is available at https://github.com/WilkinsYang/CMAF.
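To make the described components more concrete, below is a minimal sketch of an attention-based fusion of visual and sonar features together with a generic focal loss. This is not the authors' implementation (see the CMAF repository for that); all module names, dimensions, and hyper-parameters here are illustrative assumptions, and the focal loss shown is the standard multi-class form rather than the paper's mask-based variant.

```python
# Minimal sketch (assumed PyTorch): cross-attention fusion of visual and sonar
# embeddings plus a standard focal loss. Names and shapes are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalAttentionFusion(nn.Module):
    """Fuse visual patch features with sonar signal features via cross-attention."""

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, visual_tokens: torch.Tensor, sonar_tokens: torch.Tensor) -> torch.Tensor:
        # visual_tokens: (B, Nv, dim) image patch features used as queries
        # sonar_tokens:  (B, Ns, dim) sonar signal features used as keys/values
        fused, _ = self.attn(visual_tokens, sonar_tokens, sonar_tokens)
        return self.norm(visual_tokens + fused)  # residual connection + norm


def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               gamma: float = 2.0, alpha: float = 0.25) -> torch.Tensor:
    """Generic multi-class focal loss that down-weights easy examples to
    mitigate class imbalance (not the paper's mask-based variant)."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)  # probability assigned to the true class
    return (alpha * (1.0 - pt) ** gamma * ce).mean()


if __name__ == "__main__":
    fusion = CrossModalAttentionFusion(dim=256)
    visual = torch.randn(2, 49, 256)   # e.g. a 7x7 grid of image patch features
    sonar = torch.randn(2, 16, 256)    # e.g. 16 sonar signal tokens
    fused = fusion(visual, sonar)      # (2, 49, 256)
    logits = fused.mean(dim=1) @ torch.randn(256, 10)  # toy 10-class head
    loss = focal_loss(logits, torch.randint(0, 10, (2,)))
    print(fused.shape, loss.item())
```

The sketch only illustrates the general pattern the abstract describes: sonar features act as a complementary source attended to by the visual branch, and a focal-style loss re-weights the classification objective toward hard or under-represented classes.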