
Research on a Graph-Based SVM Classification Algorithm under the Semi-Supervised Learning Framework

Published: 2018-02-26 02:18

Keywords: SVM; semi-supervised classification; pseudo-labels; LRR graph; denoising　Source: North Minzu University (北方民族大学), 2017 master's thesis　Type: degree thesis

[Abstract]: In machine learning, the support vector machine (SVM) is one of the earlier supervised learning algorithms. It addresses the overfitting and "curse of dimensionality" problems of early neural networks and has been applied successfully in many fields. Semi-supervised learning can make effective use of both labeled and unlabeled samples and fully exploit the cluster structure of the whole sample set. Compared with supervised classification, it needs fewer labeled samples and still performs well. Graph-based semi-supervised learning is currently the most popular class of semi-supervised algorithms. This thesis proposes a graph-model-based SVM classification algorithm under the semi-supervised learning framework, which brings the feature information of unlabeled samples into training to further improve the classification accuracy of the SVM. First, a graph-based semi-supervised learning method assigns pseudo-labels to the unlabeled samples; then the pseudo-labeled samples and the labeled samples are fed together into the SVM. The generated pseudo-labeled samples may include noisy samples, so the pseudo-labeled set should be denoised first, to keep noise from weakening the benefit of enlarging the labeled set. In addition, the more accurate the pseudo-labels are, the fewer noisy samples there are, the more valuable the sample information is, and the less work is required. Therefore, in the preprocessing stage that enlarges the number of labeled samples in the training set, the thesis selects through experimental comparison a graph model with high classification accuracy and good performance, and combines it with the SVM to complete the experiments. The main research work is as follows:

(1) First stage: on UCI data sets and the USPS handwritten data set, the exponentially weighted graph (EW), the k-nearest-neighbor graph (kNN), the ℓ1-norm graph (LN), and the low-rank representation graph (LRR) are tested and analyzed. Each graph model is combined with the Gaussian fields and harmonic functions (GHF) propagation algorithm to complete the classification experiments, and the low-rank representation graph (LRR) is finally chosen for the sample preprocessing step.

(2) Second stage: the samples that received pseudo-labels through the low-rank representation graph (LRR) are passed through a k-nearest-neighbor graph algorithm that compares label values and removes noisy samples. Experiments on the UCI data sets and the USPS handwritten data set show that, when labeled samples are scarce, the proposed algorithm exploits the distribution information of the whole sample set better than the traditional SVM, turns the SVM into a new semi-supervised learning algorithm whose sample set can be expanded, and achieves higher final classification accuracy.
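The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the thesis's implementation: a fully connected exponentially weighted (Gaussian) graph stands in for the LRR graph, which needs a separate low-rank optimization step; the denoising rule (keep a pseudo-labeled sample only if its label matches the majority label of its k nearest neighbors) is one plausible reading of "comparing label values"; and the function names, toy data, `sigma`, and `k` are all hypothetical.

```python
import numpy as np


def ew_graph(X, sigma=1.0):
    """Fully connected graph with Gaussian weights (stand-in for the LRR graph)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # squared distances
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def ghf_pseudo_labels(W, y_labeled, labeled_idx, n_classes):
    """Harmonic solution F_u = (D_uu - W_uu)^{-1} W_ul F_l with one-hot F_l."""
    n = W.shape[0]
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    L = np.diag(W.sum(axis=1)) - W  # graph Laplacian
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    F_l = np.eye(n_classes)[y_labeled]  # one-hot labels of the labeled samples
    F_u = np.linalg.solve(L_uu, W_ul @ F_l)
    return unlabeled_idx, F_u.argmax(axis=1)


def knn_denoise(X, labels, candidate_idx, k=5):
    """Assumed rule: keep candidates whose label matches their kNN majority label."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    keep = []
    for i in candidate_idx:
        neighbors = np.argsort(d2[i])[:k]
        votes = np.bincount(labels[neighbors], minlength=labels.max() + 1)
        if votes.argmax() == labels[i]:
            keep.append(i)
    return np.array(keep, dtype=int)


# Toy data: two Gaussian blobs, with only two labeled samples per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 1.0, (20, 2)), rng.normal(3.0, 1.0, (20, 2))])
y_true = np.repeat([0, 1], 20)
labeled_idx = np.array([0, 1, 20, 21])

# Stage 1: propagate labels over the graph to obtain pseudo-labels.
W = ew_graph(X, sigma=1.0)
unlabeled_idx, pseudo = ghf_pseudo_labels(
    W, y_true[labeled_idx], labeled_idx, n_classes=2
)

labels = np.empty(len(X), dtype=int)
labels[labeled_idx] = y_true[labeled_idx]
labels[unlabeled_idx] = pseudo

# Stage 2: drop pseudo-labeled samples that disagree with their neighborhood.
kept = knn_denoise(X, labels, unlabeled_idx, k=5)

# Enlarged training set: original labeled samples plus surviving pseudo-labels.
train_idx = np.concatenate([labeled_idx, kept])
X_train, y_train = X[train_idx], labels[train_idx]
```

The resulting `X_train` and `y_train` could then be passed to any SVM implementation, for example `sklearn.svm.SVC().fit(X_train, y_train)` in scikit-learn. The SVM step is left out of the block so that it depends only on NumPy.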

[Degree-granting institution]: North Minzu University (北方民族大学)
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP181


Article ID: 1536118


Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/1536118.html
