Keywords
Reinforcement learning; Computer science; Constraint (computer-aided design); Set (abstract data type); Heuristic; Selection (genetic algorithm); Symbol; Field (mathematics); Artificial intelligence; Machine learning; Algorithm; Arithmetic; Mathematics; Programming language; Geometry; Pure mathematics
Authors
Shanshan Wang, Chenglong Xiao
Source
Journal: IEEE Transactions on Artificial Intelligence
[Institute of Electrical and Electronics Engineers]
Date: 2023-08-24
Volume/Issue: 5 (4): 1882-1894
Cited by: 1
Identifiers
DOI: 10.1109/tai.2023.3308099
Abstract
Extensible processors, which combine programmability and efficiency, are emerging as a promising approach in the field of embedded computing. Automated synthesis of custom instructions from high-level application descriptions is a vital step in the design of extensible processors. In automated custom instruction synthesis, selecting custom instructions from a large set of candidates under an area constraint is essentially a difficult combinatorial optimization problem. In this paper, we show that the custom instruction selection problem can be formulated as a sequential decision-making problem. Based on this formulation, we present three reinforcement learning-based approaches, namely SARSA, Q-learning, and Double Q-learning, for solving the custom instruction selection problem. Moreover, we also perform a comprehensive analysis and comparison of various combinations of learning specifications: the algorithm type and the update strategy for the ε-greedy policy. Experiments with 45 test instances reveal that the SARSA, Q-learning, and Double Q-learning algorithms outperform the meta-heuristic algorithm in terms of overall performance gains by 26.9%, 26.1%, and 26.4%, respectively. Among the three reinforcement learning algorithms, SARSA slightly outperforms the other two. Furthermore, the experimental results suggest that the strategy $F_3: \varepsilon = \kappa^i$, $0 < \kappa < 1$, is generally the most effective one for controlling the exploration and exploitation of the learning process.
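The selection step described in the abstract is essentially a budget-constrained sequential decision process, so a small tabular Q-learning loop can give a feel for how such a formulation might look. The sketch below is only a minimal illustration under stated assumptions: the toy candidate list, area budget, state encoding (the set of already-selected candidates), and reward (the raw performance gain of the chosen candidate) are made up for this example and are not the paper's actual formulation; only the ε-decay schedule mirrors the F_3 strategy (ε = κ^i) named in the abstract.

```python
# Hypothetical sketch: tabular Q-learning for sequential custom-instruction
# selection under an area budget. Instance data, state/action/reward design,
# and hyperparameters are illustrative assumptions, not the paper's method.
import random

random.seed(0)

# Toy candidates: (performance_gain, area_cost) -- assumed data.
CANDIDATES = [(12, 5), (9, 4), (7, 3), (15, 8), (4, 2), (6, 3), (10, 6), (3, 1)]
AREA_BUDGET = 12
N = len(CANDIDATES)


def step(selected, action):
    """Try to add candidate `action`; reward is its gain if it fits the budget."""
    gain, area = CANDIDATES[action]
    used = sum(CANDIDATES[i][1] for i in selected)
    if action in selected or used + area > AREA_BUDGET:
        return selected, 0.0, True          # duplicate or infeasible move ends the episode
    return selected | {action}, float(gain), False


def epsilon_greedy(Q, state, eps):
    """Pick a random action with probability eps, otherwise the greedy one."""
    if random.random() < eps:
        return random.randrange(N)
    return max(range(N), key=lambda a: Q.get((state, a), 0.0))


def q_learning(episodes=2000, alpha=0.1, gamma=0.95, kappa=0.999):
    """Learn Q(state, action) where the state is the frozenset of picks so far."""
    Q, eps = {}, 1.0
    for _ in range(episodes):
        selected, done = frozenset(), False
        while not done:
            a = epsilon_greedy(Q, selected, eps)
            nxt, r, done = step(selected, a)
            best_next = 0.0 if done else max(Q.get((nxt, b), 0.0) for b in range(N))
            old = Q.get((selected, a), 0.0)
            Q[(selected, a)] = old + alpha * (r + gamma * best_next - old)
            selected = nxt
        eps *= kappa                        # decay schedule in the spirit of F3: eps = kappa**i
    return Q


if __name__ == "__main__":
    Q = q_learning()
    # Greedy rollout with the learned Q-table to extract a selection.
    selected, done = frozenset(), False
    while not done:
        a = epsilon_greedy(Q, selected, eps=0.0)
        selected, _, done = step(selected, a)
    total_gain = sum(CANDIDATES[i][0] for i in selected)
    total_area = sum(CANDIDATES[i][1] for i in selected)
    print(f"selected={sorted(selected)} gain={total_gain} area={total_area}")
```

A SARSA variant of this sketch would differ only in the update target: it would bootstrap from the Q-value of the action actually chosen next under the ε-greedy policy, rather than the greedy maximum used here.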