
Research and Implementation of a Distributed Vector Computing Framework Based on MPI and MapReduce

Published: 2018-03-01 04:41

Keywords: distributed computing framework; machine learning; vector; MPI; MapReduce. Source: Zhejiang University, 2013 master's thesis. Document type: degree thesis.


[Abstract]: Machine learning is an interdisciplinary field that has emerged over the past 20 years and draws on probability theory, statistics, approximation theory, convex analysis, and other disciplines. Machine learning algorithms are now widely applied, for example in data mining, natural language processing, and search engines. Open-source single-machine implementations exist for many of these algorithms, but with the rapid growth of the Internet the volume of user data has increased sharply, and single-machine implementations no longer meet industrial needs. To obtain high-performance implementations, developers have to write parallel programs on computing frameworks such as MPI and Hadoop/MapReduce.

MPI is efficient, flexible to program, and scalable, which makes it well suited to high-performance computing. It also has drawbacks: its interface is large and costly to learn; writing a high-performance MPI program usually means handling data partitioning and network communication by hand, with no computing model comparable to MapReduce to relieve the programmer; algorithm implementations tend to be specialized, which hinders code reuse, and there is no unified abstraction for distributed data structures; and fault tolerance is poor.

To address these drawbacks, this thesis surveys MPI fault-tolerance schemes and the applications and improvements of MapReduce and, together with the design of an abstract vector interface, proposes a distributed computing framework on MPI based on vectors and MapReduce. The framework abstracts the matrix operations found in machine learning algorithms into operations on distributed vectors, and uses asynchronous send and receive to improve network transfer efficiency, overlapping CPU computation with network communication as far as possible. On top of this, a checkpoint mechanism is introduced to improve the fault tolerance of multi-round iterative algorithms in an MPI environment.

To verify the efficiency and correctness of the implementation, the PageRank algorithm was chosen for a comparative experiment. The results show that the proposed framework is suitable for, and can effectively solve, the distributed implementation of machine learning algorithms that fit the MapReduce model.
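The ideas in the abstract (a partitioned vector, a map step and a reduce step per iteration, and a checkpoint after each round) can be illustrated with a small PageRank sketch. This is not the thesis's framework or its API: it runs in a single process and only simulates partitions, it uses no MPI calls, and every name in it (`partition`, `map_phase`, `reduce_phase`, `save_checkpoint`, `load_checkpoint`, `pagerank`, the damping factor 0.85, the JSON checkpoint file) is a hypothetical choice made for illustration. It also assumes every vertex has at least one outgoing link.

```python
import json
import os
import tempfile

DAMPING = 0.85  # assumed damping factor; the source does not state one


def partition(keys, num_parts):
    """Split vertex ids into contiguous blocks, one per simulated worker."""
    keys = sorted(keys)
    size = (len(keys) + num_parts - 1) // num_parts
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def map_phase(block, ranks, out_links):
    """Map: every vertex in the block emits (target, share of its rank)."""
    for v in block:
        targets = out_links[v]
        for t in targets:
            yield t, ranks[v] / len(targets)


def reduce_phase(pairs, vertices):
    """Reduce: sum the shares per target, then apply the damping formula."""
    sums = {v: 0.0 for v in vertices}
    for target, share in pairs:
        sums[target] += share
    n = len(vertices)
    return {v: (1 - DAMPING) / n + DAMPING * s for v, s in sums.items()}


def save_checkpoint(path, iteration, ranks):
    """Write the state to a temp file, then rename it into place atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"iteration": iteration, "ranks": ranks}, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return (iteration, ranks) from the checkpoint, or None if absent."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        state = json.load(f)
    return state["iteration"], state["ranks"]


def pagerank(out_links, iterations, num_parts, checkpoint_path):
    """Iterate map/reduce rounds, resuming from a checkpoint when one exists."""
    vertices = sorted(out_links)
    restored = load_checkpoint(checkpoint_path)
    if restored:
        start, ranks = restored
    else:
        start, ranks = 0, {v: 1.0 / len(vertices) for v in vertices}
    blocks = partition(vertices, num_parts)
    for it in range(start, iterations):
        pairs = [p for block in blocks
                 for p in map_phase(block, ranks, out_links)]
        ranks = reduce_phase(pairs, vertices)
        save_checkpoint(checkpoint_path, it + 1, ranks)
    return ranks


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # toy graph, string ids
    ckpt = os.path.join(tempfile.mkdtemp(), "pagerank.json")
    print(pagerank(graph, 20, 2, ckpt))
```

If a run is interrupted, calling `pagerank` again with the same checkpoint path resumes at the last completed iteration instead of starting over. In a real MPI setting each block would live on a different rank and the pairs would travel over the network, which is where the asynchronous send and receive described in the abstract would come in.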
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP181



