Authors
Aichen Wang, Weihao Qian, Ao Li, Yuanzhi Xu, Jin Hu, Yuwen Xie, Liyuan Zhang
Abstract
Tomato yield estimation depends on accurately determining fruit quantity and size, and object detection and semantic segmentation of fruits are effective ways to count fruits and measure their size. Detecting and segmenting tomato fruits in complex environments is difficult because of sample imbalance between classes, small targets, and susceptibility to occlusion at varying stages of ripeness. To address these challenges, this study proposes a foreground-foreground class balance method and an improved YOLOv8s network, NVW-YOLOv8s, that detects and segments tomato fruits simultaneously. The foreground-foreground class balance method first extracts, pixel by pixel, fruit samples from classes with fewer instances. It then composites them onto original images that contain only a limited number of samples of that class, thereby increasing the overall number of samples in that class. In the NVW-YOLOv8s network, a C2f-N module based on the normalization-based attention module (NAM) is designed for residual feature learning. This design improves the network's ability to extract and integrate feature information about tomato fruits in complex environments. In addition, a variable focal loss (VFL) is introduced as the classification loss function to address the imbalance between positive and negative samples, and a regression loss function based on Wise-IoU (WIoU) is incorporated to handle small fruit targets and occlusion. YOLOv8s models trained on the augmented balanced dataset improved detection and segmentation mAP@0.5 by 4.8 % and 5.4 %, respectively, compared with the model trained on the augmented original training dataset.
Test results on the augmented balanced dataset showed that the proposed NVW-YOLOv8s network achieved a detection mAP@0.5 of 91.4 % and F1-Score of 85.4 %, and a segmentation mAP@0.5 of 90.7 % and F1-Score of 84.8 %. These results exceed the baseline YOLOv8s network by 4.3 % and 5.5 % for detection and by 4.1 % and 5.0 % for segmentation (mAP@0.5 and F1-Score, respectively). Concurrent detection and segmentation ran at 60.2 FPS. The proposed method therefore meets the precision and real-time requirements for intelligent yield estimation in horticultural crops.
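The compositing step of the foreground-foreground class balance method can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `paste_instance`, its parameters, and the rule of clipping the patch at the image border are all assumptions made for illustration, and the abstract does not specify how paste locations are chosen or whether pasted fruits are blended or rescaled.

```python
import numpy as np


def paste_instance(target, source, mask, top, left):
    """Copy the masked pixels of a fruit crop onto a copy of `target`.

    target : H x W x C image that has few samples of the minority class
    source : h x w x C crop containing one minority-class fruit
    mask   : h x w array, nonzero where the crop shows the fruit
    top, left : paste position of the crop's upper-left corner (hypothetical
                parameters; the selection rule is not given in the abstract)

    Returns the augmented image and a full-size boolean mask of the pasted
    fruit, which could serve as the new instance's segmentation label.
    """
    if top < 0 or left < 0:
        raise ValueError("paste location must be non-negative")

    out = target.copy()
    h, w = mask.shape
    # Clip the patch so that it stays inside the target image (assumption).
    h = min(h, target.shape[0] - top)
    w = min(w, target.shape[1] - left)
    if h <= 0 or w <= 0:
        raise ValueError("paste location lies outside the target image")

    m = mask[:h, :w].astype(bool)
    # `region` is a view into `out`, so the assignment edits `out` in place.
    region = out[top:top + h, left:left + w]
    region[m] = source[:h, :w][m]

    pasted = np.zeros(target.shape[:2], dtype=bool)
    pasted[top:top + h, left:left + w] = m
    return out, pasted
```

Only pixels inside the mask are copied, so the background of the original image remains visible around the pasted fruit. A bounding box for the new instance can be derived from the returned mask, for example from the minimum and maximum row and column indices where it is true.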