Translation lookaside buffer
Computer science
Virtual memory
Redundancy (engineering)
Parallel computing
Cache
Physical address
Paging
Authors
Yang Lin,Dunbo Zhang,Chaoyang Jia,Qiong Wang,Li Shen
Identifier
DOI:10.1109/paap54281.2021.9720477
Abstract
GPUs are now used across a broad range of domains. To support virtual memory, which most current applications require, address translation has been introduced on the GPU side. However, many applications exhibit irregular memory access patterns, in which accesses are poorly structured and often data dependent, and these patterns degrade performance, especially during virtual-to-physical address translation. Modern memory management units (MMUs) employ caching, e.g. the page walk buffer (PWB) and page walk cache (PWC), together with scheduling mechanisms to accelerate address translation after TLB misses. Constrained by their linear table structure, traditional PWBs and PWCs hold a lot of redundant information, which further limits the performance of irregular applications. Although a nonlinear structure can eliminate the redundancy, it requires sequential look-ups in the PWB and PWC, which causes an even greater performance loss. In this paper, we propose a unified multi-level PWB and PWC structure that eliminates the redundancy while enabling parallel look-up. We also design four corresponding address translation processes to ensure the efficiency of the new structure. We evaluate our design with real-world benchmarks on the GPGPU-Sim simulator. Results show that our design achieves a 42.6% IPC improvement.
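The redundancy argument in the abstract can be made concrete with a small sketch. It is one possible reading of the claim, not the structure proposed in the paper: it assumes an x86-64-style layout (4 page-table levels, 9 index bits per level, 4 KiB pages), and the helper names `make_vaddr`, `level_indices`, `linear_tag_fields`, and `multilevel_tag_fields` are hypothetical. It counts how many page-table index fields a flat table stores when every entry is tagged with a full prefix of upper-level indices, compared with per-level tables that store each distinct prefix only once.

```python
# Illustrative sketch only; not the design from the paper.
# Assumed layout (x86-64 style): 4 levels, 9 index bits per level, 4 KiB pages.
LEVELS = 4
BITS_PER_LEVEL = 9
PAGE_SHIFT = 12


def make_vaddr(*indices):
    """Build a virtual address from per-level indices, top level first (hypothetical helper)."""
    vpn = 0
    for idx in indices:
        vpn = (vpn << BITS_PER_LEVEL) | idx
    return vpn << PAGE_SHIFT


def level_indices(vaddr):
    """Split a virtual address into its per-level page-table indices, top level first."""
    vpn = vaddr >> PAGE_SHIFT
    mask = (1 << BITS_PER_LEVEL) - 1
    return tuple(
        (vpn >> (BITS_PER_LEVEL * (LEVELS - 1 - i))) & mask for i in range(LEVELS)
    )


def linear_tag_fields(vaddrs):
    """Flat table: every entry carries all upper-level indices as its tag."""
    entries = {level_indices(v)[: LEVELS - 1] for v in vaddrs}
    return len(entries) * (LEVELS - 1)


def multilevel_tag_fields(vaddrs):
    """Per-level tables: each distinct prefix stores only its last index, once."""
    total = 0
    for depth in range(1, LEVELS):
        total += len({level_indices(v)[:depth] for v in vaddrs})
    return total


# Four addresses that share the two upper indices and differ in the third.
addrs = [make_vaddr(1, 2, c, 0) for c in range(4)]
print(linear_tag_fields(addrs), multilevel_tag_fields(addrs))
```

In this toy case the flat table repeats the two shared upper-level indices in every entry, while the per-level layout stores them once. The sketch only counts stored fields; it does not model look-up latency, which is where the abstract says a nonlinear structure loses unless the levels can be probed in parallel.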