GPU-Based Multi-Join Query Optimization
Published: 2018-09-09 20:14
[Abstract]: With the arrival of the information age, the demands placed on data processing keep rising. On the one hand, data is becoming more complex and data volumes are growing enormously; on the other hand, data processing is expected to deliver low latency and high throughput. The serial processing of traditional databases on single-machine platforms can no longer meet these needs, and parallel processing is an effective way to satisfy the demands of big-data processing. The general-purpose graphics processing unit (GPU), with its very high computing power and memory bandwidth, has become a powerful tool for parallel computing and provides hardware support for accelerating data processing. Multi-join queries are among the most common and most time-consuming operations in data processing, and their efficiency is an important factor in database performance. This thesis therefore uses the GPU as a hardware platform to study, design, and implement optimizations of multi-join operations. Multi-join query optimization on the GPU is divided into two stages. The first stage builds a cost model for joins and uses a heuristic algorithm to obtain a multi-join query tree of minimum cost. The second stage performs parallel optimization on the GPU over that minimum-cost query tree. Parallel optimization on the GPU can exploit parallelism both within each join and across joins; only by combining several forms of parallel optimization can the GPU's parallel processing capability be fully used and multi-join query performance maximized. First, the thesis designs and implements in detail the parallel optimization of two single-join algorithms on the GPU, sort-merge join and hash join, and analyzes and compares the serial implementations of the two. Second, it discusses parallel scheduling strategies across joins, namely the sequential parallel execution strategy, the hierarchical parallel execution strategy, and the right-deep tree execution strategy, and compares their strengths and weaknesses. Finally, experiments measure the performance of the sort-merge join and hash join algorithms on the GPU and on a multi-core CPU. The results show that the GPU-optimized sort-merge join and hash join outperform the parallel algorithms on the multi-core CPU, with speedups of 7.25 and 5.21, respectively. The thesis also tests the two algorithms on the GPU under different parallel scheduling strategies against a multi-join algorithm parallelized on a multi-core CPU; the results show that the GPU-optimized multi-join algorithm outperforms the multi-core CPU version. Using the GPU to improve the efficiency of multi-join query processing yielded some improvement and provides support for further improving the efficiency of database management.
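For reference, the two single-join algorithms named in the abstract can be sketched serially as below. This is only a minimal CPU illustration in Python, not the thesis's GPU implementation: the function names, the tuple-based row representation, and the `key_left`/`key_right` parameters are assumptions introduced here.

```python
from collections import defaultdict


def hash_join(left, right, key_left=0, key_right=0):
    """Equi-join two lists of tuples: build a hash table on `left`, probe with `right`."""
    table = defaultdict(list)
    for row in left:                       # build phase
        table[row[key_left]].append(row)
    out = []
    for row in right:                      # probe phase
        for match in table.get(row[key_right], ()):
            out.append(match + row)
    return out


def sort_merge_join(left, right, key_left=0, key_right=0):
    """Equi-join two lists of tuples: sort both inputs on the key, then merge."""
    left = sorted(left, key=lambda r: r[key_left])
    right = sorted(right, key=lambda r: r[key_right])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key_left], right[j][key_right]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Locate the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key_right] == lk:
                j_end += 1
            # Pair every left row in the matching run with every right row in it.
            while i < len(left) and left[i][key_left] == lk:
                for k in range(j, j_end):
                    out.append(left[i] + right[k])
                i += 1
            j = j_end
    return out


if __name__ == "__main__":
    r = [(1, "a"), (2, "b"), (2, "c")]
    s = [(2, "x"), (3, "y"), (2, "z")]
    # Both algorithms produce the same set of joined rows, possibly in different order.
    print(sorted(hash_join(r, s)) == sorted(sort_merge_join(r, s)))
```

In both functions the inner work is largely independent per row or per key run, which is the kind of structure a data-parallel version would try to exploit; how the thesis maps these phases onto GPU threads is not stated in the abstract.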
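[Degree-granting institution]: South China University of Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP338.6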
Article ID: 2233454
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2233454.html