YOLACTFusion: An instance segmentation method for RGB-NIR multimodal image fusion based on an attention mechanism

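Keywords: Artificial intelligence; Computer science; RGB color model; Computer vision; Segmentation; Image segmentation; Backbone network; Pattern recognition (psychology); Feature (linguistics); Computer network; Linguistics; Philosophy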

Authors

Cheng Liu,Qingchun Feng,Yuhuan Sun,Yajun Li,Mengfei Ru,Lijia Xu

Source

Journal: Computers and Electronics in Agriculture [Elsevier BV]
Date: 2023-08-30 Volume: 213: 108186-108186 Citations: 21

Identifier

DOI: 10.1016/j.compag.2023.108186

Abstract

The tomato plant's main-stem is a feasible lead for robots searching for the discretely growing targets of harvesting, pruning, or pollinating. Because the main-stem is highly reflective in the near-infrared (NIR) waveband, this study proposes a multimodal hierarchical fusion method (YOLACTFusion), based on the attention mechanism, to achieve instance segmentation of the main-stem from similar-colored surroundings (i.e., green leaves and green fruit) in robotic vision systems. The model feeds RGB images and 900–1100 nm NIR images into two ResNet50 backbone networks and uses a parallel attention mechanism to fuse feature maps of various scales into the head network, improving main-stem segmentation in the RGB images. The loss function for the multimodal image weights the original loss on the RGB image together with the position offset loss and classification loss on the NIR image. Furthermore, local depthwise separable convolution is used in the backbone network, and Conv-BN layers are merged to reduce computational complexity. The results show that the precision and recall of YOLACTFusion for main-stem detection reached 93.90% and 62.60%, respectively, and that the precision and recall for instance segmentation reached 95.12% and 63.41%, respectively. Compared with YOLACT, the mean average precision (mAP) of YOLACTFusion increased from 39.20% to 46.29% and the model size decreased from 199.03 MB to 165.52 MB, while the image processing efficiency remained similar. Overall, the results show that the proposed multimodal instance segmentation method significantly improves the detection and segmentation of tomato main-stems against a similar-colored background, making it a potential method for improving the visual perception of agricultural robots.
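
The abstract mentions merging Conv-BN layers to reduce computational complexity but gives no implementation details. A common way to do this at inference time is to fold the batch-normalization statistics into the preceding convolution's weights and bias. The following is a minimal sketch of that general technique, not the authors' code: it assumes a 1x1 convolution applied to a single pixel (so the convolution is a plain matrix-vector product), and every function name and numeric value is hypothetical, chosen only for illustration.

```python
import math


def conv1x1(x, weight, bias):
    """Apply a 1x1 convolution to one pixel: y[o] = sum_i weight[o][i] * x[i] + bias[o]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization using fixed running statistics."""
    return [
        g * (v - m) / math.sqrt(s + eps) + b
        for v, g, b, m, s in zip(y, gamma, beta, mean, var)
    ]


def fold_conv_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Return (weight, bias) of a single convolution equivalent to conv followed by BN."""
    folded_w, folded_b = [], []
    for o, row in enumerate(weight):
        # Per-output-channel scale that BN would apply after the convolution.
        scale = gamma[o] / math.sqrt(var[o] + eps)
        folded_w.append([w * scale for w in row])
        folded_b.append((bias[o] - mean[o]) * scale + beta[o])
    return folded_w, folded_b


# Hypothetical layer: 3 input channels, 2 output channels.
weight = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
bias = [0.1, -0.2]
gamma, beta = [1.2, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.6]
x = [1.0, 2.0, -1.0]

# Two-step reference: convolution, then batch normalization.
reference = batch_norm(conv1x1(x, weight, bias), gamma, beta, mean, var)

# One-step equivalent: a single convolution with folded parameters.
folded_w, folded_b = fold_conv_bn(weight, bias, gamma, beta, mean, var)
fused = conv1x1(x, folded_w, folded_b)

max_abs_diff = max(abs(a - b) for a, b in zip(reference, fused))
```

Because the folded layer produces the same outputs as the original pair up to floating-point rounding, the BN layer can be dropped after training, which removes its parameters and one pass over the feature map per layer. The same per-channel scaling applies to larger kernels, where each output channel's whole kernel is multiplied by its scale.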
