Artificial intelligence
Computer science
Computer vision
RGB color model
Fusion
Feature (linguistics)
Pattern recognition (psychology)
Feature extraction
Sensor fusion
Linguistics
Philosophy
Authors
Shengli Yan, Yuan Rao, Wenhui Hou
Identifier
DOI:10.1109/icassp48485.2024.10448205
Abstract
Unlike RGB images, depth images are robust to the complex scenes of densely planted orchards. In this paper, we propose a fruit detection method based on a multimodal feature fusion module (MMFF) for RGB and depth images. Our method adopts a dual-stream convolutional neural network for feature extraction, capturing multi-scale information from the RGB and depth images through feature pyramids. The multimodal feature fusion module filters similar and differing features between the modalities, suppressing the shared features and fusing the complementary ones. In addition, we use a multi-scale feature fusion method to incorporate more information and improve the accuracy of fruit detection. To validate the effectiveness of our method, experiments are conducted on a self-created multimodal pear dataset. Extensive experiments demonstrate that the proposed approach achieves state-of-the-art performance at a low computational cost.
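The sketch below illustrates the general idea described in the abstract: two parallel CNN streams (RGB and depth) feeding a fusion block at each pyramid level, where a gate down-weights responses shared by both modalities and keeps the complementary ones. This is a minimal PyTorch sketch for illustration only, not the authors' MMFF; all module names, channel widths, and the specific gating rule are assumptions.

```python
# Illustrative sketch only -- not the paper's implementation.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """3x3 conv -> BN -> ReLU with stride 2, used by both streams (assumed layout)."""
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class SimpleFusionBlock(nn.Module):
    """Toy cross-modal fusion: estimate shared responses, suppress them, fuse the rest."""
    def __init__(self, channels):
        super().__init__()
        # 1x1 gate producing a per-pixel "similarity" map in [0, 1].
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())
        self.out_conv = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f_rgb, f_depth):
        g = self.gate(torch.cat([f_rgb, f_depth], dim=1))
        f_rgb_c = f_rgb * (1 - g)      # down-weight features shared across modalities
        f_depth_c = f_depth * (1 - g)
        return self.out_conv(torch.cat([f_rgb_c, f_depth_c], dim=1))


class DualStreamBackbone(nn.Module):
    """Two parallel CNN streams fused at each pyramid scale (widths are assumptions)."""
    def __init__(self, widths=(32, 64, 128)):
        super().__init__()
        self.rgb_stream = nn.ModuleList()
        self.depth_stream = nn.ModuleList()
        self.fusions = nn.ModuleList()
        in_rgb, in_depth = 3, 1
        for w in widths:
            self.rgb_stream.append(ConvBlock(in_rgb, w))
            self.depth_stream.append(ConvBlock(in_depth, w))
            self.fusions.append(SimpleFusionBlock(w))
            in_rgb = in_depth = w

    def forward(self, rgb, depth):
        fused_pyramid = []
        x_r, x_d = rgb, depth
        for conv_r, conv_d, fuse in zip(self.rgb_stream, self.depth_stream, self.fusions):
            x_r, x_d = conv_r(x_r), conv_d(x_d)
            fused_pyramid.append(fuse(x_r, x_d))   # one fused feature map per scale
        return fused_pyramid


if __name__ == "__main__":
    model = DualStreamBackbone()
    rgb = torch.randn(1, 3, 256, 256)     # RGB input
    depth = torch.randn(1, 1, 256, 256)   # single-channel depth input
    for level, feat in enumerate(model(rgb, depth)):
        print(level, tuple(feat.shape))
```

In a detector, the fused multi-scale maps would then be passed to a detection head; the gating choice here (multiplying by 1 - similarity) is just one simple way to realize "suppress shared features, fuse complementary ones".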