Ambiguity
Action (physics)
Artificial intelligence
Computer science
Robotics
Human–computer interaction
Engineering
Computer vision
Process management
Programming language
Physics
Quantum mechanics
Identifier
DOI:10.1016/j.jmsy.2024.05.003
Abstract
Human–robot collaboration (HRC) has been recognized as a potent pathway towards mass personalization in the manufacturing sector, by leveraging the synergy of human creativity and robotic precision. Previous approaches rely heavily on visual perception to autonomously comprehend the HRC environment. However, the inherent ambiguity in human–robot communication cannot be consistently neutralized by relying solely on visual cues. With the recently soaring popularity of large language models (LLMs), the consideration of language data as a complementary information source has increasingly drawn research attention, while the application of such large models, particularly within the context of HRC in manufacturing, remains largely under-explored. In response to this gap, a vision-language reasoning approach is proposed to mitigate the communication ambiguity prevalent in human–robot collaborative manufacturing scenarios. A referred object retrieval model is first designed to alleviate the object–reference ambiguity in the human language command. This model is then seamlessly integrated into an LLM-based robotic action planner to achieve an improved HRC performance. The effectiveness of the proposed approach is demonstrated empirically through a series of experiments conducted on the object retrieval model and its application in a human–robot collaborative assembly case.
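The abstract describes a pipeline in which a referred object retrieval model first grounds the ambiguous reference in the human's command, and an LLM-based planner then produces the robot actions. The snippet below is a minimal, hypothetical Python sketch of that idea only; the data classes, the token-overlap scoring (a stand-in for the paper's learned vision-language retrieval model), and the prompt format are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: resolve an ambiguous object reference from the human
# command, then hand a grounded action-planning prompt to an LLM.
from dataclasses import dataclass

@dataclass
class DetectedObject:
    label: str        # object class from a vision detector
    attributes: list  # e.g. color, size
    position: tuple   # (x, y, z) in the robot workspace, metres

def retrieval_score(expression: str, obj: DetectedObject) -> float:
    """Toy referred-object score: fraction of expression tokens matching the
    object's label or attributes (stand-in for a learned retrieval model)."""
    tokens = expression.lower().split()
    vocab = {obj.label.lower(), *(a.lower() for a in obj.attributes)}
    hits = sum(1 for t in tokens if t in vocab)
    return hits / max(len(tokens), 1)

def resolve_reference(expression: str, scene: list) -> DetectedObject:
    """Pick the scene object that best matches the human's referring phrase."""
    return max(scene, key=lambda obj: retrieval_score(expression, obj))

def build_planner_prompt(command: str, expression: str, target: DetectedObject) -> str:
    """Compose an LLM prompt in which the ambiguous reference has been
    grounded to a concrete, localized object."""
    return (
        "You are a robot action planner in a collaborative assembly cell.\n"
        f"Human command: {command}\n"
        f"The phrase '{expression}' refers to the {target.label} "
        f"({', '.join(target.attributes)}) at position {target.position}.\n"
        "Output a step-by-step pick-and-place plan."
    )

if __name__ == "__main__":
    scene = [
        DetectedObject("bolt", ["small", "silver"], (0.42, 0.10, 0.02)),
        DetectedObject("bracket", ["large", "black"], (0.55, -0.05, 0.03)),
    ]
    command = "Hand me the small silver one."
    target = resolve_reference("small silver one", scene)
    print(build_planner_prompt(command, "small silver one", target))
```

In such a setup, the retrieval step removes the object-reference ambiguity before planning, so the LLM receives an unambiguous, localized target rather than the raw natural-language phrase.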