Research on Models and Algorithms for Semi-Supervised Support Vector Machines

Published: 2018-01-03 15:09

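  Title keywords: Research on Models and Algorithms for Semi-Supervised Support Vector Machines. Source: Shanghai University, 2016 doctoral dissertation. Document type: degree thesis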


  Related topics: semi-supervised learning; support vector machine; quadratic surface support vector machine; kernel-free; classification problem; convex relaxation


[Abstract]: The support vector machine (SVM) is a machine learning method for small-sample classification problems, derived from the structural risk minimization principle of statistical learning theory. Because it attains a globally optimal solution and generalizes well, it has been widely applied in currently active areas such as compressed sensing, sparse optimization, pattern recognition, feature extraction, image processing, and medical diagnosis. The semi-supervised support vector machine is a learning method that uses labeled and unlabeled samples together. Since in practice a large number of unlabeled samples and only a small number of labeled samples are usually easy to obtain, semi-supervised SVMs are widely applied to large-scale data recognition and classification problems. Their main challenges are that the underlying mathematical model is a hard optimization problem, and that selecting a kernel function for nonlinear classification is both time-consuming and computationally difficult. Research on models and algorithms for semi-supervised classification therefore has important theoretical significance and broad practical value. This dissertation studies new semi-supervised SVM classification models and algorithms, and tests their classification performance on synthetic data sets and on benchmark data sets from classification databases.

First, to address the hardness of the optimization problem underlying the semi-supervised SVM model, and to study the semi-supervised SVM model with the squared hinge loss function, two conic relaxation methods are proposed. The optimization problem underlying the semi-supervised SVM model is a mixed-integer program. The dissertation first proposes a new semidefinite relaxation and estimates the maximum ratio between the optimal value of the original problem and that of the relaxation, that is, how closely the relaxation approximates the original problem. It then constructs a completely positive program that is equivalent to the original mixed-integer program. Since that problem is in general NP-hard, it is relaxed further, which yields a doubly nonnegative relaxation. Compared with the semidefinite relaxation, the doubly nonnegative relaxation gives a tighter lower bound on the optimal value of the original problem. Finally, both relaxations are solved with the convex optimization toolbox CVX and an alternating direction method. The numerical results show that both relaxation methods achieve high classification accuracy, and that the doubly nonnegative relaxation classifies better than the semidefinite relaxation.

Second, because choosing a suitable kernel function is difficult and time-consuming, a kernel-free semi-supervised quadratic surface support vector machine model is proposed for the first time. The model is a mixed-integer program and is in general NP-hard. The mixed-integer program is first reformulated equivalently as a nonconvex optimization problem with absolute-value constraints; a vector lifting technique is then used to relax it into a semidefinite program that can be solved in polynomial time, and the relaxation is solved with CVX. Numerical experiments show that the semi-supervised quadratic surface SVM achieves higher classification accuracy than both traditional semi-supervised SVM methods and supervised SVM methods. The results indicate not only that the kernel-free classification model is effective, but also that training on labeled and unlabeled samples together improves classification performance. A drawback of the method is that it tends to run out of memory when the data set is large.

Finally, to address the long computation time and large memory footprint of the semi-supervised quadratic surface SVM, a kernel-free semi-supervised center quadratic surface support vector machine model is proposed. The model exploits the structural advantages of the center support vector machine to simplify the optimization problem of the semi-supervised quadratic surface SVM into a mixed-integer program with equality constraints only, which is in general NP-hard. To solve it approximately, a semidefinite relaxation is applied first and linear matrix inequality constraints are then added, which relaxes the original problem into a semidefinite program. A primal alternating direction method is designed to solve the relaxed problem. The numerical results show that, compared with the semi-supervised quadratic surface SVM, the method effectively improves both computational efficiency and classification accuracy, and that the labels of the labeled samples and the features of the unlabeled samples both have a large influence on classification accuracy.
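The reformulation step described in the abstract, in which the integer labels of the unlabeled samples are replaced by an absolute value, can be illustrated with a small numerical sketch. The code below assumes a quadratic surface of the form f(x) = 0.5 * x^T W x + b^T x + c and a squared hinge loss; the function names, this parametrization, and the toy data are illustrative assumptions and are not taken from the dissertation. It only checks that minimizing the loss over a label y in {-1, +1} gives the same value as evaluating the loss at |f(x)|.

```python
import numpy as np


def quadratic_surface(X, W, b, c):
    """Evaluate f(x) = 0.5 * x^T W x + b^T x + c for every row x of X.

    Assumed parametrization of a kernel-free quadratic decision surface.
    """
    X = np.asarray(X, dtype=float)
    quad = 0.5 * np.einsum("ij,jk,ik->i", X, W, X)
    return quad + X @ b + c


def squared_hinge(margin):
    """Squared hinge loss max(0, 1 - margin)^2, applied elementwise."""
    return np.maximum(0.0, 1.0 - margin) ** 2


def unlabeled_loss_bruteforce(f_vals):
    """Try both labels y = +1 and y = -1 and keep the smaller loss."""
    return np.minimum(squared_hinge(f_vals), squared_hinge(-f_vals))


def unlabeled_loss_abs(f_vals):
    """Same quantity without the integer label: the loss at |f(x)|."""
    return squared_hinge(np.abs(f_vals))


# Hypothetical surface: f(x) = ||x||^2 - 1 (the unit circle in 2-D).
W = np.array([[2.0, 0.0], [0.0, 2.0]])
b = np.zeros(2)
c = -1.0

# Hypothetical unlabeled points.
X_unl = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [0.5, 0.5]])

f_vals = quadratic_surface(X_unl, W, b, c)
print(f_vals)
print(unlabeled_loss_bruteforce(f_vals))
print(unlabeled_loss_abs(f_vals))
```

The two loss functions agree because the better label for a fixed surface is always sign(f(x)). This is why the binary label variables can be dropped at the price of a nonconvex absolute-value term, which a relaxation then has to handle.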

[Degree-granting institution]: Shanghai University
[Degree level]: Doctorate
[Year degree granted]: 2016
[Classification number]: TP181

