Computer science
Computer vision
Artificial intelligence
Computer graphics (images)
Algorithm
Authors
Shouxin Zhang,Yanyan Wang,Shengzhe Shi,Qingqing Wang,Chun Wang,Sheng Liu
Identifier
DOI:10.1038/s41598-024-74416-2
Abstract
Existing indoor object-finding aids for the visually impaired suffer from poor portability, high hardware costs, and susceptibility to environmental conditions. To address these problems and assist the visually impaired in their daily lives, this study proposes an improved YOLOv5 algorithm and combines it with a RealSense D435i depth camera and a voice system to realize an indoor object-finding device with a Raspberry Pi 4 B at its core. The algorithm replaces the YOLOv5s backbone network with GhostNet to reduce the number of parameters and the computation of the model, incorporates a coordinate attention mechanism, and replaces the YOLOv5 neck network with a bidirectional feature pyramid network to enhance feature extraction. Compared to the YOLOv5 model, the model size was reduced by 42.4%, the number of parameters was reduced by 47.9%, and the recall rate increased by 1.2% at the same precision. In the device, the user names the object to be found by voice; the RealSense D435i acquires RGB and depth images, which are used to detect the object and measure its distance; and the device then announces the distance to the target object by voice to help the user find it.
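
The detection-and-ranging step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `estimate_distance_m` and `announcement`, the `shrink` parameter, the use of a median over the central part of the bounding box, and the assumption that depth readings are in millimetres with 0 meaning "no reading" are all hypothetical choices made for illustration. The bounding box is assumed to come from the detector and the depth image from a depth camera aligned to the RGB frame.

```python
from statistics import median


def estimate_distance_m(depth_mm, box, shrink=0.5):
    """Estimate the distance (metres) to a detected object.

    depth_mm: 2-D list of depth readings, assumed to be in millimetres,
              where 0 is assumed to mean "no valid reading".
    box:      (x1, y1, x2, y2) bounding box in pixels, x2/y2 exclusive.
    shrink:   fraction of the box width/height, centred, that is sampled,
              so that background pixels near the box edges are ignored.
    """
    x1, y1, x2, y2 = box
    dx = int((x2 - x1) * (1 - shrink) / 2)
    dy = int((y2 - y1) * (1 - shrink) / 2)
    height = len(depth_mm)
    width = len(depth_mm[0]) if height else 0
    # Clamp the sampled window to the image bounds.
    xs = range(max(x1 + dx, 0), min(x2 - dx, width))
    ys = range(max(y1 + dy, 0), min(y2 - dy, height))
    valid = [depth_mm[y][x] for y in ys for x in xs if depth_mm[y][x] > 0]
    if not valid:
        return None  # no usable depth inside the box
    # The median is less sensitive to stray readings than the mean.
    return median(valid) / 1000.0


def announcement(label, distance_m):
    """Build the sentence that a text-to-speech engine would read out."""
    if distance_m is None:
        return f"{label} detected, distance unavailable"
    return f"{label} is {distance_m:.1f} metres away"
```

For example, `announcement("cup", estimate_distance_m(depth, (0, 0, 4, 4)))` would produce the text handed to the voice system for a detection labelled "cup". Sampling only the centre of the box is one simple way to avoid mixing in background depth; other choices, such as a single centre pixel, are equally plausible.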