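Supercomputer
Parallel computing
Computer science
Speedup
Vectorization (mathematics)
Performance improvement
Operations management
Economics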
Authors
Xin Liu,Jun Sun,Lin Zheng,Su Wang,Yao Liu,Tongquan Wei
Source
Journal: IEEE Transactions on Parallel and Distributed Systems
[Institute of Electrical and Electronics Engineers]
Date: 2020-11-10
Volume (Issue): 32 (4): 975-987
Citations: 27
Identifier
DOI:10.1109/tpds.2020.3037082
Abstract
The Sunway TaihuLight system is the first supercomputer offering a peak performance over 100 PFlops, which can be utilized to parallelize Non-dominated Sorting Genetic Algorithm II (NSGA-II), a standard approach to multi-objective optimization. However, the insufficient off-chip memory bandwidth and limited scratchpad memory capacity of the supercomputer hinder the performance gains from parallelizing NSGA-II. In this article, we propose an optimized parallel NSGA-II on the Sunway TaihuLight system, called swNSGA-II, which exploits the process- and thread-level parallelism of the system based on an improved island/master-slave model. To overcome the hurdles of low memory bandwidth and capacity, we propose a data sharing scheme based on register-level communication that can efficiently parallelize the non-dominated sorting and crowding-distance computation of NSGA-II. Several optimization techniques, including vectorization, direct memory access, and double buffering, are also adopted to further accelerate swNSGA-II. Experimental results show that the proposed swNSGA-II can achieve a speedup of 41284 on a path-planning use case, and a speedup of 62692 on ZDT1, as compared to conventional NSGA-II.
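For readers unfamiliar with the two NSGA-II kernels named in the abstract, the following is a minimal sequential sketch of non-dominated sorting and crowding-distance computation in their standard textbook form. It is not the swNSGA-II implementation and uses none of the Sunway-specific techniques (register-level communication, DMA, double buffering). It assumes all objectives are minimized, and the function names `dominates`, `non_dominated_sort`, and `crowding_distance` are illustrative choices, not names from the paper.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (assumption: all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(objs: List[Sequence[float]]) -> List[List[int]]:
    """Group solution indices into Pareto fronts; front 0 is the best."""
    n = len(objs)
    dominated = [[] for _ in range(n)]  # dominated[i]: indices that i dominates
    counts = [0] * n                    # counts[i]: how many solutions dominate i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(objs[i], objs[j]):
                dominated[i].append(j)
            elif dominates(objs[j], objs[i]):
                counts[i] += 1

    fronts: List[List[int]] = []
    current = [i for i in range(n) if counts[i] == 0]
    while current:
        fronts.append(current)
        nxt: List[int] = []
        for i in current:
            for j in dominated[i]:
                counts[j] -= 1
                if counts[j] == 0:  # every dominator of j is already ranked
                    nxt.append(j)
        current = nxt
    return fronts


def crowding_distance(objs: List[Sequence[float]], front: List[int]) -> Dict[int, float]:
    """Crowding distance for each index in one front (larger = less crowded)."""
    dist = {i: 0.0 for i in front}
    if not front:
        return dist
    for k in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][k])
        lo, hi = objs[ordered[0]][k], objs[ordered[-1]][k]
        # Boundary solutions are always kept, so they get infinite distance.
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue  # this objective does not separate the front
        for pos in range(1, len(ordered) - 1):
            gap = objs[ordered[pos + 1]][k] - objs[ordered[pos - 1]][k]
            dist[ordered[pos]] += gap / (hi - lo)
    return dist


if __name__ == "__main__":
    # Hypothetical two-objective population, used only for illustration.
    population = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
    fronts = non_dominated_sort(population)
    print(fronts)
    print(crowding_distance(population, fronts[0]))
```

The pairwise dominance loop costs O(MN²) for N solutions and M objectives, and each comparison is independent of the others, which is one reason this kernel is a natural target for the kind of parallelization the abstract describes.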