Accelerating Clustering Algorithms on CUDA Graphics Processors
[Abstract]: In recent years, rapid advances in information processing and communication technology have greatly increased the volume of data generated and collected in many fields; datasets of gigabytes or terabytes are now common. Traditional serial data mining techniques can no longer process data at this scale and are being replaced by parallel data mining. Multi-core CPUs (Central Processing Units) are now a common platform for parallel data mining, but as data volumes grow and mining algorithms become more complex, the resulting high-density numerical computation consumes a large share of processor time, which harms the performance and power efficiency of the whole system. To offload this part of the large-scale computation, a computing unit better suited to it is needed. The GPU (Graphics Processing Unit), with its specialized architecture, is well suited to parallel computation over large, dense datasets: to meet the demands of increasingly intensive graphics workloads and electronic games, GPU designers equip these chips with a large number of arithmetic units and high memory bandwidth. In particular, since NVIDIA launched GPUs based on the CUDA (Compute Unified Device Architecture) architecture, together with its supporting development environment, in 2007, developing for the GPU has become much like developing for the CPU, allowing developers to get started quickly with CUDA-based GPU programming.
In many fields, such as scientific computing, financial engineering, and data mining, developers are using CUDA-architecture GPUs to improve system computing power. Beyond GPU computation in the traditional single-machine setting, there is also growing research on applying GPUs in distributed environments. This thesis selects two clustering algorithms commonly used in data mining, K-means clustering and single-linkage agglomerative hierarchical clustering, and implements both in parallel on an NVIDIA GTX 260, a CUDA-architecture GPU, to demonstrate the effectiveness and feasibility of using the GPU to improve clustering performance. Finally, motivated by the practical requirements of an enterprise Customer Relationship Management (CRM) system, it implements GPU clustering under the Hadoop framework. Clustering is a very common operation in data mining: its purpose is to group randomly scattered objects into one or more clusters according to some agreed measure of similarity or correlation. The K-means algorithm implemented in this thesis assigns each of N data objects to the nearest cluster by Euclidean distance in the data space; after several iterations, K clusters are formed. As a classic clustering algorithm, K-means is widely used in data mining, bioinformatics, image recognition, artificial intelligence, and other areas; the well-known Apache Mahout project uses K-means in its clustering operations.
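The two alternating steps of K-means described above can be sketched in plain Python (this is an illustrative sketch, not the thesis code; the function name `kmeans` and the simple first-K initialization are assumptions). The per-point distance computations in the assignment step are the independent, data-heavy work that the thesis offloads to the GPU:

```python
def kmeans(points, k, iters=10):
    """Minimal K-means sketch: assign each point to the nearest centroid
    by Euclidean distance, then recompute each centroid as the mean of
    its assigned points, repeating for a fixed number of iterations."""
    # Initialize centroids with the first k points (a simple common choice).
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        # Each point is independent, which is what makes this step easy
        # to parallelize across GPU threads.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids
```

On the GPU, the assignment loop would typically map one thread per point, with the centroid table kept in fast shared memory.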
For the other clustering algorithm implemented in the experiments, agglomerative hierarchical clustering, the N data objects are first initialized as N independent subclusters; using Euclidean distance as the measure of similarity between subclusters, the nearest pair of subclusters is merged, and this step is repeated until all data objects in the dataset belong to a single cluster, at which point the algorithm terminates. Although its operation and final result differ from K-means, both algorithms involve a large amount of data computation with a high degree of independence between the individual calculations. It is precisely this common property that allows both clustering algorithms to be parallelized on multi-core CPUs and GPUs, reducing program running time and improving data throughput. Because the enterprise CRM system must run in a distributed computing environment, the open-source Hadoop framework was chosen: Hadoop lets users perform distributed computation over large-scale data on clusters of commodity machines while providing strong fault tolerance and reliability, and companies such as Google and Facebook use distributed computing based on the Hadoop framework. In the experiments, three groups of input data with different sizes and target cluster counts are designed for the K-means runs, while the hierarchical clustering experiments use smaller datasets than K-means. Following the CUDA programming model, each program is divided into two parts: the code that is not computationally intensive runs on the CPU, while the large floating-point computations involved in clustering run on the GPU; this model is also called CPU+GPU heterogeneous computing. At the same time, the same computations are implemented on a multi-core CPU for comparison.
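The single-linkage merge loop described above can be sketched as follows (again an illustrative sketch under assumed names, not the thesis implementation). The inner all-pairs distance search is the dense, highly independent computation that both this algorithm and K-means share, and which the thesis maps onto the GPU:

```python
def single_link(points, target_k=1):
    """Sketch of single-linkage agglomerative clustering: start with N
    singleton clusters and repeatedly merge the pair of clusters whose
    closest members are nearest, until target_k clusters remain
    (target_k=1 matches the thesis, which merges down to one cluster)."""
    clusters = [[i] for i in range(len(points))]

    def dist2(i, j):
        # Squared Euclidean distance between two data points.
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j]))

    while len(clusters) > target_k:
        # Find the pair of clusters with the smallest single-link distance,
        # i.e. the minimum distance over all cross-cluster point pairs.
        # This all-pairs search is the step that parallelizes well.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist2(i, j) for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

A GPU version would compute the full pairwise distance matrix in parallel and reduce it to a minimum, leaving only the cheap merge bookkeeping on the CPU, in line with the CPU+GPU heterogeneous split described above.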
Finally, by comparing the running times of the two implementations, the speedup of the GPU implementation over the CPU implementation is obtained. This thesis also implements GPU clustering on top of the Hadoop framework; the implementation approach and results can serve as a prototype and reference for enterprise CRM design.
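The speedup figure used in this comparison is simply the ratio of the baseline (CPU) running time to the accelerated (GPU) running time; a minimal sketch, with hypothetical timings:

```python
def speedup(t_baseline, t_accelerated):
    """Speedup ratio: baseline (multi-core CPU) time divided by
    accelerated (GPU) time. A value greater than 1.0 means the
    accelerated implementation finished faster."""
    return t_baseline / t_accelerated

# Hypothetical example: a CPU run of 12.0 s vs. a GPU run of 3.0 s
# would give a 4x speedup.
```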
【Degree-granting institution】: Shanghai Jiao Tong University
【Degree level】: Master's
【Year conferred】: 2013
【CLC number】: TP311.13