Research on CPU-GPU-Based Parallelization of Conditional Random Fields
Published: 2018-04-14 12:37
Topics: conditional random field + cooperative parallel model. Source: Huazhong University of Science and Technology, master's thesis, 2013
【Abstract】: As a machine learning algorithm, the conditional random field (CRF) is widely used in part-of-speech tagging, information extraction, image segmentation, and related fields. In practical applications, a CRF can flexibly generate a large number of descriptive features to improve learning, but the feature dimension easily exceeds one million, so training spends a great deal of time computing the gradient and the likelihood function, which to some extent limits the CRF's ability to solve practical problems. Existing solutions parallelize CRFs on multi-core CPUs or on GPUs alone, but the architectural limitations of each platform make the resulting speedups unsatisfactory.

To address this problem, a new method is proposed that accelerates the CRF model through CPU-GPU cooperation. In the cooperative CPU-GPU parallel architecture, the CPU handles the training-optimization steps that have high memory complexity and heavy branching, avoiding the poor overall parallel performance that GPU-only parallelization suffers from limited device memory and weak branch handling. For the GPU side, a two-level parallel scheme is proposed to match the differing characteristics of the compute-intensive steps and maximize parallel efficiency: all elements of the state matrix are computed in parallel at the independent Node Level, while all possible paths through the state matrix are computed in parallel at the sequence-dependent Sentence Level. In addition, the data's memory layout is optimized for the GPU's memory-access characteristics, reducing unnecessary memory traffic and further improving parallel efficiency.

Experimental results show that, with no loss of model accuracy, the CPU-GPU cooperative parallel CRF achieves speedups of more than 10x in training and more than 15x in prediction over single-threaded CPU processing; compared with GPU-only parallelization, cooperative parallelization improves acceleration performance by 50%.
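The two-level decomposition described above can be illustrated with a minimal NumPy sketch of the linear-chain CRF forward algorithm (illustrative only; the function names, shapes, and vectorization strategy here are assumptions for exposition, not the thesis's actual CUDA implementation). The per-position state scores are fully independent (the "Node Level"), so on a GPU all of them can be computed in one parallel launch; the recursion over positions is inherently sequential (the "Sentence Level"), but the reduction over all state-to-state paths at each step is itself parallel.

```python
import numpy as np

def forward_log_alpha(emissions, transitions):
    """Forward pass of a linear-chain CRF in log space.

    emissions:   (T, K) per-position state scores. Node Level: every
                 entry is independent, so all T*K scores can be computed
                 by a single vectorized/parallel operation.
    transitions: (K, K) score for moving from state i to state j.
    Returns log-alpha of shape (T, K).
    """
    T, K = emissions.shape
    alpha = np.empty((T, K))
    alpha[0] = emissions[0]
    # Sentence Level: the loop over t is sequential (each step depends
    # on the previous one), but the K*K path scores inside each step
    # are formed and reduced in parallel.
    for t in range(1, T):
        scores = alpha[t - 1][:, None] + transitions + emissions[t][None, :]
        m = scores.max(axis=0)                       # stable log-sum-exp
        alpha[t] = m + np.log(np.exp(scores - m).sum(axis=0))
    return alpha

# Toy example: 3 positions, 2 states; logsumexp of the last row is log Z.
emissions = np.array([[0.5, 1.0], [0.2, 0.3], [1.5, 0.1]])
transitions = np.array([[0.1, 0.4], [0.2, 0.3]])
alpha = forward_log_alpha(emissions, transitions)
m = alpha[-1].max()
logZ = m + np.log(np.exp(alpha[-1] - m).sum())
```

The gradient and likelihood computations that dominate CRF training are built from exactly this kind of forward (and the symmetric backward) pass, which is why splitting the work into an embarrassingly parallel node layer and a sequence-recurrent layer maps well onto a GPU.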
【Degree-granting institution】: Huazhong University of Science and Technology
【Degree level】: Master's
【Year conferred】: 2013
【Classification number】: TP181;TP332
Article ID: 1749367
Link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1749367.html