Towards efficient airline disruption recovery with reinforcement learning

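Reinforcement learning; Solver; Markov decision process; Computer science; Leverage (statistics); Mathematical optimization; Crew; Convergence (economics); Transformation; Operations research; Markov process; Artificial intelligence; Engineering; Mathematics; Telecommunications; Statistics; Aeronautics; Transmission (telecommunications); Economics; Programming language; Economic growth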

Authors

Yida Ding,Sebastian Wandelt,Guohua Wu,Yifan Xu,Xiaoqian Sun

Source

Journal: Transportation Research Part E-logistics and Transportation Review [Elsevier BV]
Date: 2023-10-05 Volume/issue: 179: 103295-103295 Citations: 24

Identifier

DOI: 10.1016/j.tre.2023.103295

Abstract

Disruptions to airline schedules precipitate flight delays/cancellations and significant losses for airline operations. The goal of the integrated airline recovery problem is to develop an operational tool that provides the airline with an instant and cost-effective solution concerning aircraft, crew members and passengers in the face of emerging disruptions. In this paper, we formulate a decision recommendation framework which incorporates various recovery decisions including aircraft and crew rerouting, passenger reaccommodation, departure holding, flight cancellation and cruise speed control. Given the computational hardness of solving the mixed-integer nonlinear programming (MINP) model with a commercial solver (e.g., CPLEX), we establish a novel solution framework by incorporating Deep Reinforcement Learning (DRL) into the Variable Neighborhood Search (VNS) algorithm with well-designed neighborhood structures and a state evaluator. We utilize Proximal Policy Optimization (PPO) to train the stochastic policy used to select neighborhood operations given the current state throughout the Markov Decision Process (MDP). Experimental results show that the objective value generated by our approach is within a 1.5% gap of the optimal/close-to-optimal objective of the CPLEX solver for the small-scale instances, with significant improvement in runtime. The pre-trained DRL agent can leverage features/weights obtained from the training process to accelerate objective convergence and further improve solution quality, which exhibits the potential of achieving Transfer Learning (TL). Given the inherent intractability of the problem on practical-size instances, we propose a method to control the size of the DRL agent's action space to allow for an efficient training process. We believe our study contributes to the efforts of airlines in seeking efficient and cost-effective recovery solutions.
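
The core idea in the abstract, a learned stochastic policy that chooses which neighborhood operation to apply at each search step, can be illustrated with a small sketch. This is not the authors' implementation: the toy objective `cost`, the three operators, and the function `policy_guided_search` are all hypothetical, and a simple gradient-bandit preference update stands in for PPO, which the abstract names but does not specify in detail. The sketch only shows the general shape of a neighborhood search whose operator choice is driven by a softmax policy.

```python
import math
import random

def cost(order):
    # Toy objective (assumption): displacement of each item from its sorted slot.
    # It stands in for a recovery cost; it is not the paper's objective.
    return sum(abs(v - i) for i, v in enumerate(order))

def swap(order, rng):
    # Neighborhood 1: exchange two positions.
    i, j = rng.sample(range(len(order)), 2)
    new = order[:]
    new[i], new[j] = new[j], new[i]
    return new

def reverse_segment(order, rng):
    # Neighborhood 2: reverse a contiguous segment.
    i, j = sorted(rng.sample(range(len(order)), 2))
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]

def relocate(order, rng):
    # Neighborhood 3: remove one item and reinsert it elsewhere.
    new = order[:]
    v = new.pop(rng.randrange(len(new)))
    new.insert(rng.randrange(len(new) + 1), v)
    return new

OPERATORS = [swap, reverse_segment, relocate]

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def policy_guided_search(start, steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0] * len(OPERATORS)  # policy parameters, one per operator
    best, best_cost = start[:], cost(start)
    for _ in range(steps):
        probs = softmax(prefs)
        k = rng.choices(range(len(OPERATORS)), weights=probs)[0]
        candidate = OPERATORS[k](best, rng)
        c = cost(candidate)
        reward = 1.0 if c < best_cost else 0.0
        # Gradient-bandit update (a stand-in for PPO): raise the preference
        # of the chosen operator when it produced an improvement.
        for a in range(len(prefs)):
            indicator = 1.0 if a == k else 0.0
            prefs[a] += lr * reward * (indicator - probs[a])
        if c < best_cost:
            best, best_cost = candidate, c
    return best, best_cost, softmax(prefs)

start = list(range(8))[::-1]
final_order, final_cost, operator_probs = policy_guided_search(start)
```

Here the policy has no state input, so it learns only which operator tends to help on average; the approach described in the abstract conditions the choice on the current state of the MDP, which would require a state representation and a trained network.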
