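Data deduplication
Computer science
Encoding (memory)
Similarity (geometry)
Data mining
Computer data storage
Algorithm
Database
Artificial intelligence
Computer hardware
Image (mathematics)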
Authors
Bowen Song,Limin Xiao,Guangjun Qin,Li Ruan,Shi-Da Qiu
Source
Journal: Communications in computer and information science
Date: 2017-01-01
Volume/Issue: : 245-253
Identifier
DOI:10.1007/978-981-10-3969-0_28
Abstract
Satellite applications such as remote sensing are overwhelmed by vast quantities of data. However, storage resources on a satellite are so limited that they must be used more efficiently. Remote sensing data are highly similar to one another, but the dissimilar parts are distributed irregularly. When a traditional deduplication algorithm splits a file into chunks, a large number of the chunks are highly similar but not identical, so deduplication performs poorly. We propose a deduplication algorithm based on data similarity and delta encoding to reduce storage usage. Data similarity analysis identifies similar data, and delta encoding reduces the storage needed for it. In experiments on remote sensing application data, we achieved deduplication ratios of up to 30:1 and analyzed how the chunk size affects the results.
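
The abstract does not say how similarity is measured or how deltas are encoded, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes fixed-size chunking, SHA-256 for exact-duplicate detection, and Python's `difflib.SequenceMatcher` as a stand-in for both the similarity measure and the delta encoder. The names `dedup`, `restore`, `make_delta`, `apply_delta`, and the `chunk_size` and `threshold` parameters are hypothetical.

```python
import hashlib
from difflib import SequenceMatcher


def similarity(a: bytes, b: bytes) -> float:
    # Stand-in similarity measure (assumption): difflib's ratio in [0, 1].
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def make_delta(base: bytes, target: bytes) -> list:
    # Encode target as copy-from-base ranges plus literal inserted bytes.
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no op: those base bytes are simply not copied.
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def dedup(data: bytes, chunk_size: int = 64, threshold: float = 0.8):
    # store maps digest -> ("full", chunk) or ("delta", base_digest, ops).
    store, bases, recipe = {}, [], []
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:  # not an exact duplicate
            best = max(bases, default=None,
                       key=lambda d: similarity(store[d][1], chunk))
            if best is not None and similarity(store[best][1], chunk) >= threshold:
                # Similar but not identical: keep only the delta.
                store[digest] = ("delta", best, make_delta(store[best][1], chunk))
            else:
                store[digest] = ("full", chunk)
                bases.append(digest)
        recipe.append(digest)
    return store, recipe


def restore(store: dict, recipe: list) -> bytes:
    out = bytearray()
    for digest in recipe:
        entry = store[digest]
        if entry[0] == "full":
            out += entry[1]
        else:
            out += apply_delta(store[entry[1]][1], entry[2])
    return bytes(out)
```

In this sketch an exact duplicate costs nothing beyond its recipe entry, while a near-duplicate chunk is stored as a short list of copy and insert operations against an already stored base chunk. Comparing each new chunk against every base is quadratic and is only reasonable for a toy example; a practical system would need a cheaper similarity index, which the abstract does not describe.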