Meals
Robotics
Food Science
Computer Science
Artificial Intelligence
Chemistry
Authors
Yuhe Fan, Lixun Zhang, Canxing Zheng, Yunqin Zu, Xingyuan Wang, Jinghui Zhu
Identifier
DOI:10.1016/j.jfoodeng.2024.111996
Abstract
Meal detection is an important technology for ensuring the success rate of meal-assisting robotics. However, owing to the strong interclass similarity and intraclass variability in the appearance, pose, and complex traits of meals across scenarios, detecting meals accurately and in real time is challenging. To address these problems, a novel method that optimizes YOLOv8s with deformable convolution and the CloFormer (CF) transformer was proposed to achieve efficient and accurate meal detection. The YOLOv8s architecture was enhanced by introducing deformable convolution to capture finer-grained spatial information and the CloFormer module to capture high-frequency local and low-frequency global information through shared weights and context-aware weights; the resulting model is denoted DCF-YOLOv8s. The proposed method was evaluated on meal datasets against the baseline model and several state-of-the-art (SOTA) detection models, and the results show that it achieves better performance. Specifically, it reaches 88.5% mean average precision (mAP) at 43.6 frames per second (FPS), validating its efficiency and accuracy in meal detection for meal-assisting robotics. Ablation experiments verified the effectiveness of the deformable convolution and CloFormer modules, as well as the importance of the adopted data augmentation methods. The proposed method can improve the success rate of meal fetching by intelligent meal-assisting robots and also contribute to food-engineering applications such as meal-quality monitoring and food management.
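The abstract does not include code, but the key operation it relies on, deformable convolution, can be illustrated in isolation. The sketch below is a minimal single-channel NumPy implementation of a 3×3 deformable convolution: each kernel tap samples the input at its regular grid position plus a learned (dy, dx) offset, using bilinear interpolation for fractional coordinates. All function names are hypothetical, and this is an illustrative sketch of the operation, not the authors' DCF-YOLOv8s code (which would use a learned offset branch inside a deep network).

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Bilinearly interpolate a (H, W) image at fractional coords (y, x),
    with zero padding outside the image boundary."""
    H, W = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = y0 + 1, x0 + 1

    def val(yy, xx):
        return img[yy, xx] if 0 <= yy < H and 0 <= xx < W else 0.0

    wy1, wx1 = y - y0, x - x0          # fractional parts
    wy0, wx0 = 1.0 - wy1, 1.0 - wx1
    return (wy0 * wx0 * val(y0, x0) + wy0 * wx1 * val(y0, x1)
            + wy1 * wx0 * val(y1, x0) + wy1 * wx1 * val(y1, x1))

def deformable_conv2d(img, kernel, offsets):
    """3x3 deformable convolution on a single-channel (H, W) image.

    kernel:  (3, 3) weights.
    offsets: (H, W, 9, 2) per-output-position (dy, dx) shifts for each
             of the 9 kernel taps (learned by an extra conv in practice).
    """
    H, W = img.shape
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for k, (dy, dx) in enumerate(grid):
                oy, ox = offsets[y, x, k]
                # sample at regular grid position + learned offset
                acc += kernel[dy + 1, dx + 1] * bilinear_sample(
                    img, y + dy + oy, x + dx + ox)
            out[y, x] = acc
    return out
```

With all offsets set to zero this reduces to an ordinary 3×3 convolution with zero padding; nonzero offsets let the sampling grid deform to follow irregular object shapes, which is why the paper uses it to capture finer-grained spatial information about food items. In production code one would use an optimized primitive such as `torchvision.ops.deform_conv2d` rather than these Python loops.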