Research on Software Development Behavior on Open Source Platforms Based on Supervised Learning

Published: 2019-04-10 06:35
[Abstract]: Since the end of the twentieth century, the rapid growth of open source software has been challenging a software industry long dominated by proprietary software, and the growing number of open source projects has had a significant effect on the industry's market structure. Distributed development models have evolved alongside the changing needs of open source development, and the pull-based distributed development model has opened a new direction for distributed software development. The study of developer behavior in open source development is an active topic in software evolution research: it helps developers understand the regularities of software evolution and thereby improve existing development processes. As more developers take part in open source development, code hosting platforms such as GitHub and BitBucket have begun to support distributed software development. Analyzing development behavior on GitHub requires processing large volumes of loosely related data, and extracting deeper value from that data often calls for more sophisticated analysis, including machine learning.

This thesis analyzes open source projects hosted on GitHub that use the pull-based development model, and identifies patterns in development turnaround, the acceptance of external contributions, and the time taken to process them. It analyzes developers' actions and builds a prediction model, based on how strongly different behavioral features influence whether a contribution is ultimately accepted, to predict whether an external contribution will be accepted. During feature extraction, features derived from historical records are added to supplement the feature set needed to build the model. The model addresses the problem of classifying the final state of a pull request, using a supervised learning algorithm suited to large-scale data, the support vector machine (SVM). The thesis compares the performance of the candidate prediction models and studies how to choose a suitable one. It also addresses shortcomings of existing SVM methods, namely the excessive computation required for kernel parameter optimization and insufficient learning performance and recognition rate. Finally, it discusses how well the prediction model fits the data.

The innovative contributions of this thesis are as follows:
1. It studies the acceptance of pull requests in open source projects. Common machine learning classifiers are used to select and classify features from large-scale GitHub data. To account for testing and for history-based behavior, the feature set is extended with test coverage, the contributor's historical rate of successfully submitted requests, and the project's historical rate of accepted requests.
2. To make grid search more efficient, it improves on the exhaustive search of the grid search algorithm and applies the result to the construction of the prediction model, proposing a grid detection parameter selection algorithm (GDPS) that combines pattern search with grid search. GDPS selects the optimal parameter pair for the SVM kernel function used in the prediction model, improving the learning performance and recognition rate of the SVM and yielding a more accurate prediction model.
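The abstract does not spell out how GDPS works, so the following is only a rough sketch of the general idea it names: a coarse grid search to locate a promising region, followed by a pattern search to refine the parameter pair. It is not the thesis's algorithm. The function names, the grid ranges, the step sizes, and `toy_score` (a hypothetical stand-in for the cross-validated accuracy of an SVM at a given pair of log2 kernel parameters) are all assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Tuple

Point = Tuple[float, float]          # (log2 C, log2 gamma), an assumed parameterisation
Score = Callable[[float, float], float]


def coarse_grid(score: Score, c_range: range, g_range: range) -> Point:
    """Evaluate the score on a coarse grid and return the best grid point."""
    return max(product(c_range, g_range), key=lambda p: score(*p))


def pattern_search(score: Score, start: Point,
                   step: float = 1.0, min_step: float = 0.125) -> Point:
    """Refine a starting point by probing its four axis neighbours.

    Move to the best neighbour if it strictly improves the score;
    otherwise halve the step. Stop once the step drops below min_step.
    """
    best, best_score = start, score(*start)
    while step >= min_step:
        c, g = best
        neighbours = [(c + step, g), (c - step, g), (c, g + step), (c, g - step)]
        cand = max(neighbours, key=lambda p: score(*p))
        cand_score = score(*cand)
        if cand_score > best_score:
            best, best_score = cand, cand_score
        else:
            step /= 2
    return best


def toy_score(log_c: float, log_g: float) -> float:
    """Hypothetical accuracy surface with a single peak at (3.25, -5.5)."""
    return 1.0 - 0.01 * ((log_c - 3.25) ** 2 + (log_g + 5.5) ** 2)


# Coarse pass: 6 x 5 = 30 evaluations instead of a dense exhaustive grid.
start = coarse_grid(toy_score, range(-5, 16, 4), range(-15, 4, 4))
# Fine pass: local refinement around the best coarse point.
best = pattern_search(toy_score, start)
print("coarse best:", start, "refined:", best)
```

In a real setting, `toy_score` would be replaced by a function that trains and cross-validates an SVM for the given parameter pair; the point of the two-stage design is that the expensive evaluations are concentrated near the region the coarse grid has already identified.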
[Degree-granting institution]: Harbin Engineering University
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP311.52


Article ID: 2455558


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2455558.html

