Grasping
Haptic perception
Computer vision
Perception
Artificial intelligence
Computer science
Fusion
Human-computer interaction
Visual perception
Psychology
Neuroscience
Linguistics
Philosophy
Programming language
Authors
Zhuangzhuang Zhang, Zhinan Zhang, Lihui Wang, Xiaoxiao Zhu, Huang Huang, Qixin Cao
Identifier
DOI:10.1016/j.rcim.2023.102601
Abstract
Humans can instinctively predict whether a given grasp will succeed from visual and rich haptic feedback. Towards the next generation of smart robotic manufacturing, robots must be equipped with similar capabilities to grasp unknown objects in unstructured environments. However, most existing data-driven methods take global visual images and tactile readings from real-world systems as input, making them incapable of predicting grasp outcomes for cluttered objects or of generating large-scale datasets. First, this paper proposes a visual-tactile fusion method to predict the outcome of grasping cluttered objects, the most common scenario in grasping applications. Concretely, the multimodal fusion network (MMFN) takes the local point cloud within the gripper as the visual input and images from two high-resolution tactile sensors as the tactile input. Second, because collecting data in the real world is costly and time-consuming, this paper proposes a digital twin-enabled robotic grasping system to collect large-scale multimodal datasets and investigates how domain randomization and domain adaptation can bridge the sim-to-real transfer gap. Finally, extensive validation experiments are conducted in physical and virtual environments. The results demonstrate the effectiveness of the proposed method in assessing grasp stability for cluttered objects and in performing zero-shot sim-to-real policy transfer on a real robot with the aid of the proposed migration strategy.
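To make the fusion idea concrete, the sketch below shows one plausible late-fusion layout for the inputs the abstract names: a gripper-local point cloud plus images from two tactile sensors, combined into a single grasp-success score. This is a minimal, hypothetical illustration with random weights and toy dimensions; the actual MMFN architecture, layer sizes, and training procedure are not specified by the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def encode_point_cloud(points, w):
    # PointNet-style idea: shared per-point projection, then max-pool
    # over points to get an order-invariant global feature.
    return relu(points @ w).max(axis=0)          # shape: (32,)

def encode_tactile(image, w):
    # Flatten a tactile image and project it to a feature vector.
    return relu(image.ravel() @ w)               # shape: (32,)

def predict_grasp(points, tactile_left, tactile_right, params):
    # Late fusion: concatenate the per-modality features, then score
    # with a logistic head to get a grasp-success probability.
    fused = np.concatenate([
        encode_point_cloud(points, params["w_pc"]),
        encode_tactile(tactile_left, params["w_t"]),
        encode_tactile(tactile_right, params["w_t"]),
    ])
    logit = fused @ params["w_out"]
    return 1.0 / (1.0 + np.exp(-logit))

# Toy shapes: 128 gripper-local points (x, y, z) and two 16x16
# tactile images; all weights are random placeholders.
params = {
    "w_pc":  rng.normal(size=(3, 32)),
    "w_t":   rng.normal(size=(16 * 16, 32)),
    "w_out": rng.normal(size=(96,)),
}
prob = predict_grasp(rng.normal(size=(128, 3)),
                     rng.normal(size=(16, 16)),
                     rng.normal(size=(16, 16)),
                     params)
print(prob)  # a probability in [0, 1]
```

The two tactile images share one encoder here, reflecting that both sensors produce the same kind of reading; whether the paper's MMFN shares weights across sensors is an assumption of this sketch.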