Deep reinforcement learning and parameter transfer based approach for the multi-objective agile earth observation satellite scheduling problem

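Reinforcement learning, Computer science, Markov decision process, Mathematical optimization, Scheduling (production processes), Artificial intelligence, Scalability, Job shop scheduling, Subway train timetable, Knapsack problem, Markov process, Algorithm, Mathematics, Database, Statistics, Operating system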

Authors

Luona Wei, Yuning Chen, Ming Chen, Yingwu Chen

Source

Journal: Applied Soft Computing [Elsevier BV]
Date: 2021-06-17; Volume/Issue: 110: 107607-107607; Citations: 51

Identifier

DOI: 10.1016/j.asoc.2021.107607

Abstract

The agile earth observation satellite scheduling problem (AEOSSP) consists of selecting and scheduling a number of tasks from a set of user requests in order to optimize one or multiple criteria. In this paper, we consider a multi-objective version of AEOSSP (called MO-AEOSSP) where the failure rate and the timeliness of scheduled requests are optimized simultaneously. Because the problem is NP-hard, traditional iterative, problem-tailored heuristic methods are sensitive to problem instances and incur massive computational overhead. We thus propose a deep reinforcement learning and parameter transfer based approach (RLPT) to tackle the MO-AEOSSP in a non-iterative manner. RLPT first decomposes the MO-AEOSSP into a number of scalarized sub-problems by a weighted-sum approach, where each sub-problem can be formulated as a Markov Decision Process (MDP). RLPT then applies an encoder–decoder neural network (NN), trained by a deep reinforcement learning procedure, to produce a high-quality schedule for each sub-problem. The resulting schedules of all scalarized sub-problems form an approximate Pareto front for the MO-AEOSSP. Once the NN of a sub-problem is trained, RLPT applies a parameter transfer strategy to reduce the training expense of its neighboring sub-problems. Experimental results on a large set of randomly generated instances show that RLPT outperforms three classical multi-objective evolutionary algorithms (MOEAs) in terms of solution quality, solution distribution and computational efficiency. Results on instances of various sizes also show that RLPT is highly general and scalable. To the best of our knowledge, this study is the first attempt to apply deep reinforcement learning to a satellite scheduling problem considering multiple objectives.
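
The decomposition and transfer steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the encoder–decoder network and its reinforcement learning training are replaced by a caller-supplied `train` function, all names (`weight_vectors`, `scalarize`, `pareto_front`, `solve_all`) are hypothetical, and both objectives are assumed to be expressed as costs to minimize. The sketch only shows how weighted-sum sub-problems are enumerated, how each sub-problem could start from its neighbor's parameters, and how the per-sub-problem results are filtered into an approximate Pareto front.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

# Assumption: (failure rate, timeliness cost), both to be minimized.
Objectives = Tuple[float, float]
Weights = Tuple[float, float]


def weight_vectors(n: int) -> List[Weights]:
    """Return n evenly spaced weight pairs (w1, w2) with w1 + w2 = 1.

    Consecutive entries are "neighboring" sub-problems.
    """
    if n < 2:
        raise ValueError("need at least two sub-problems")
    return [(i / (n - 1), 1.0 - i / (n - 1)) for i in range(n)]


def scalarize(obj: Objectives, w: Weights) -> float:
    """Weighted-sum cost of one schedule for one sub-problem."""
    return w[0] * obj[0] + w[1] * obj[1]


def pareto_front(points: Sequence[Objectives]) -> List[Objectives]:
    """Keep only non-dominated points (minimization), sorted, without duplicates."""
    def dominates(a: Objectives, b: Objectives) -> bool:
        return all(x <= y for x, y in zip(a, b)) and a != b

    unique = sorted(set(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def solve_all(
    weights: Sequence[Weights],
    train: Callable[[Weights, Optional[Any]], Tuple[Any, Objectives]],
) -> List[Objectives]:
    """Solve sub-problems in order, warm-starting each from its neighbor.

    `train(w, init)` is a hypothetical stand-in for training a model on the
    sub-problem with weights `w`; `init` is the previous sub-problem's
    parameters (None for the first one). It returns (params, objectives).
    """
    params: Optional[Any] = None
    results: List[Objectives] = []
    for w in weights:
        params, obj = train(w, params)  # parameter transfer: reuse neighbor's params
        results.append(obj)
    return pareto_front(results)


# Toy stand-in for training: pick the best of a few fixed candidate schedules.
CANDIDATES: List[Objectives] = [(0.10, 0.90), (0.30, 0.40), (0.60, 0.20), (0.70, 0.70)]


def toy_train(w: Weights, init: Optional[Any]) -> Tuple[Any, Objectives]:
    best = min(CANDIDATES, key=lambda c: scalarize(c, w))
    return best, best  # the "parameters" here are just the chosen schedule


front = solve_all(weight_vectors(5), toy_train)
```

In a real setting, `train` would load the neighbor's network weights before continuing training on the new weight vector, which is where the savings in training cost claimed in the abstract would come from.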
