Journal: IEEE Transactions on Automation Science and Engineering [Institute of Electrical and Electronics Engineers]; Date: 2022-11-14; Volume (Issue): 21 (1): 528-540; Citations: 14
Identifier
DOI:10.1109/tase.2022.3221352
Abstract
Autonomous robots play a central role in smart logistics, where they can replace or aid humans in tasks such as picking, moving, and storing items. In this paper, we investigate the problem of task scheduling in automated warehouses with heterogeneous autonomous robotic (HAR) systems. We formulate a long-term non-convex queueing control optimization problem to minimize the queue length of tasks to be processed in the warehouse. Traditional task scheduling solutions based on optimization approaches are inefficient at handling the stochastic nature of the goods/tasks flow and the large number of robots in the system because of their computational cost. We propose a deep reinforcement learning (DRL) based task scheduling algorithm that employs the proximal policy optimization (PPO) method to find an optimal task scheduling policy. Because the system is heterogeneous, we propose a proximal weighted federated learning-based algorithm for implementing a decentralized PPO algorithm; it improves the performance of the distributed PPO agents deployed in the workstations of geographically distributed warehouses. The simulation results demonstrate the performance improvement of our proposed algorithm compared to existing methods.
Note to Practitioners: Task scheduling for robotic swarms in smart warehouses is essential for e-commerce. State-of-the-art solutions have focused on efficient task scheduling for homogeneous robotic systems using machine learning techniques implemented in warehouse management systems (WMS). However, task scheduling for a heterogeneous autonomous robotic (HAR) system has not yet been fully investigated. This article provides a comprehensive task scheduling algorithm for HAR systems that leverages deep reinforcement learning and federated learning techniques. The proposed algorithm can be deployed in the geographically distributed warehouses of an e-commerce company and easily integrated into the WMS to optimally control the operation of the HAR system under stochastic goods/tasks flows in smart warehousing.
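
The abstract names a proximal weighted federated learning scheme but does not give its update rule. As a rough illustration only, the Python sketch below shows the two generic ingredients such a scheme usually combines: a weighted average of the agents' parameters (FedAvg-style) and a proximal penalty that keeps each local model close to the global one (FedProx-style). The function names, the parameter layout, the weights, and the coefficient `mu` are all assumptions made for this example; they are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical layout: each agent's parameters as name -> flat list of floats.
Params = Dict[str, List[float]]


def weighted_average(local_params: List[Params], weights: List[float]) -> Params:
    """Aggregate per-agent parameters as a weighted mean (FedAvg-style).

    The weights are an assumption: they could reflect, for example, how much
    data or how many tasks each warehouse agent has processed.
    """
    if not local_params or len(local_params) != len(weights):
        raise ValueError("need one weight per agent and at least one agent")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    aggregated: Params = {}
    for name, reference in local_params[0].items():
        aggregated[name] = [
            sum(w * p[name][i] for p, w in zip(local_params, weights)) / total
            for i in range(len(reference))
        ]
    return aggregated


def proximal_penalty(local: Params, global_params: Params, mu: float) -> float:
    """FedProx-style term (mu / 2) * ||w_local - w_global||^2.

    An agent would add this to its local training loss so that its update
    does not drift far from the shared global model.
    """
    squared_distance = sum(
        (a - b) ** 2
        for name in local
        for a, b in zip(local[name], global_params[name])
    )
    return 0.5 * mu * squared_distance


# Usage: two agents, the second weighted three times as heavily as the first.
agents = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
global_model = weighted_average(agents, weights=[1.0, 3.0])
penalty = proximal_penalty(agents[0], global_model, mu=0.1)
```

In a full system the parameters would be the PPO policy and value network weights of each workstation agent, and the penalty would be added to each agent's PPO loss between aggregation rounds. How the paper actually chooses the weights and the proximal term is not stated in the abstract.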