Data deduplication
Computer science
Operating system
Metadata
Journaling file system
File system
DRAM
Embedded system
Distributed computing
Computer file
Computer hardware
Authors
Chunlin Song,Xianzhang Chen,Duo Liu,Xiaoliu Feng,Xi Yu,Jia Li,Yujuan Tan,Ao Ren
Identifier
DOI:10.1109/iccd56317.2022.00111
Abstract
Block-level data deduplication is prevalent in storage systems of various scales, where it saves storage space and improves I/O performance by reducing write operations. However, deduplication introduces additional per-block metadata, which leads to I/O amplification. Furthermore, to ensure the correctness of deduplicated user data, deduplication systems must guarantee crash consistency. In this paper, we propose CADedup, which achieves high performance while ensuring crash consistency by using persistent memory. Taking advantage of the byte-addressability and near-DRAM latency of persistent memory, we design an efficient journaling mechanism to manage CADedup's deduplication metadata. Additionally, we adopt a hybrid storage architecture of DRAM and persistent memory to minimize space costs. We implement CADedup through the device-mapper interface in the Linux kernel and conduct extensive experiments on Intel Optane PMEM to evaluate it with widely used benchmarks. Experimental results show that, compared with a system without deduplication, CADedup achieves up to a 1×-3× improvement on many server storage workloads and has a negligible throughput drop in the worst case.
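As a rough illustration of the general idea the abstract describes, the following is a minimal sketch of block-level deduplication with a journaled metadata update. It is not CADedup's design or code: the class `DedupStore`, its field names, the 4096-byte block size, the SHA-256 fingerprint, and the Python list standing in for a persistent-memory journal are all assumptions made for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes (not taken from the paper)


class DedupStore:
    """Toy block-level deduplication store; purely illustrative."""

    def __init__(self):
        self.blocks = []       # physical blocks (stands in for the block device)
        self.journal = []      # append-only metadata log (stands in for PMEM)
        self.fp_to_pbn = {}    # fingerprint -> physical block number
        self.lbn_to_pbn = {}   # logical block number -> physical block number
        self.refcount = {}     # physical block number -> reference count
        self.data_writes = 0   # number of blocks actually written

    def _apply(self, lbn, pbn):
        # Remap a logical block and keep reference counts in step.
        # Blocks whose count drops to zero are not reclaimed in this sketch.
        old = self.lbn_to_pbn.get(lbn)
        if old is not None:
            self.refcount[old] -= 1
        self.lbn_to_pbn[lbn] = pbn
        self.refcount[pbn] = self.refcount.get(pbn, 0) + 1

    def write(self, lbn, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        fp = hashlib.sha256(data).hexdigest()
        pbn = self.fp_to_pbn.get(fp)
        if pbn is None:
            # Unseen content: store the block once.
            pbn = len(self.blocks)
            self.blocks.append(data)
            self.data_writes += 1
        # Log the metadata change before updating the volatile maps.
        self.journal.append((lbn, pbn, fp))
        self.fp_to_pbn[fp] = pbn
        self._apply(lbn, pbn)

    def read(self, lbn):
        return self.blocks[self.lbn_to_pbn[lbn]]

    def recover(self):
        # Rebuild the volatile maps by replaying the journal in order.
        self.fp_to_pbn, self.lbn_to_pbn, self.refcount = {}, {}, {}
        for lbn, pbn, fp in self.journal:
            self.fp_to_pbn[fp] = pbn
            self._apply(lbn, pbn)


store = DedupStore()
store.write(0, b"a" * BLOCK_SIZE)
store.write(1, b"a" * BLOCK_SIZE)  # duplicate content, no new data write
store.write(2, b"b" * BLOCK_SIZE)
print(store.data_writes)  # prints 2
```

In this sketch three logical writes cause only two data writes, at the cost of extra metadata (fingerprints, mappings, reference counts) that must survive a crash. Replaying the journal in `recover` is one simple way to restore that metadata; how CADedup actually lays out and replays its journal is not specified in the abstract.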