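Computer science
Join (topology)
Query optimization
Database
Sort-merge join
Online aggregation
Distributed database
Hash join
Join
Information retrieval
Sargable
Query plan
Query language
Data mining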
Authors
Jintao Gao, Wenjie Liu, Zhanhuai Li, Jian Zhang, Li Shen
Identifier
DOI:10.1016/j.ins.2019.10.043
Abstract
The quality of fragment allocation is key to improving join query performance in a distributed database. Current strategies rely on heuristic rules to allocate fragments to locations, such as picking the location that holds the most required data or applying a greedy algorithm. Despite their benefits, these strategies are scenario-sensitive: in a distributed environment with varied query plans, different data distributions, and expensive network costs, they lack generalization ability and can easily produce low-quality allocation plans. To overcome this limitation, we propose a general fragment allocation strategy, AlCo (Allocate fragments based on Cost). AlCo evaluates multiple candidate allocation plans by cost, using a modified version of the genetic algorithm adopted from PostgreSQL. Our fitness function (the cost model) jointly considers the various changing factors to support generalization. To reduce the risk introduced by the randomness of the genetic algorithm, AlCo uses an upper bound computed by current heuristic methods to improve the robustness of the genetic algorithm. We implement AlCo in a distributed database system, and experiments on the TPC-H benchmark show that AlCo performs up to 2x–4x better than existing strategies and is robust and scalable.
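The idea in the abstract, a cost-driven genetic search over allocation plans that is bounded by a heuristic plan, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy cost model (bytes shipped plus the load of the busiest site), the greedy baseline, the way the baseline is used as an upper bound (seeding the population and serving as a fallback), and all names and parameters (`cost`, `greedy`, `allocate`, `net`, `cpu`) are assumptions made only for illustration.

```python
import random

# Hypothetical toy model: each join task needs a list of fragments, and each
# fragment is (size, home_site). An allocation maps every task to one site.
def cost(alloc, tasks, frags, n_sites, net=1.0, cpu=0.2):
    load = [0.0] * n_sites
    shipped = 0.0
    for task, site in zip(tasks, alloc):
        for f in task:
            size, home = frags[f]
            load[site] += size          # work done at the chosen site
            if home != site:
                shipped += size         # fragment must cross the network
    return net * shipped + cpu * max(load)

def greedy(tasks, frags, n_sites):
    # Heuristic baseline: run each task where most of its data already lives.
    plan = []
    for task in tasks:
        local = [0.0] * n_sites
        for f in task:
            size, home = frags[f]
            local[home] += size
        plan.append(local.index(max(local)))
    return plan

def allocate(tasks, frags, n_sites, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    fitness = lambda a: cost(a, tasks, frags, n_sites)
    baseline = greedy(tasks, frags, n_sites)
    # Assumption: the heuristic plan seeds the population.
    pop = [baseline] + [[rng.randrange(n_sites) for _ in tasks]
                        for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]  # keep the cheaper half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(tasks)) if len(tasks) > 1 else 0
            child = a[:cut] + b[cut:]   # one-point crossover
            if child and rng.random() < 0.2:
                child[rng.randrange(len(child))] = rng.randrange(n_sites)
            children.append(child)
        pop = parents + children
    best = min(pop, key=fitness)
    # Assumption: the heuristic cost acts as an upper bound on the result.
    return best if fitness(best) <= fitness(baseline) else baseline

frags = {"a": (100, 0), "b": (40, 1), "c": (60, 2), "d": (10, 1)}
tasks = [["a", "b"], ["b", "c"], ["c", "d"], ["a", "d"]]
plan = allocate(tasks, frags, n_sites=3)
print(plan, cost(plan, tasks, frags, 3))
```

Because the returned plan is never more expensive than the greedy one under the same cost function, an unlucky random run cannot do worse than the heuristic, which is one plausible reading of the robustness claim.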