

Coupled Support Vector Learning Methods and Their Applications

Published: 2018-05-30 02:11

  Topics: concept drift + transfer learning. Source: Jiangnan University, 2016 doctoral dissertation


[Abstract]: Traditional machine learning addresses single-learner problems. Multi-learner problems are attracting growing attention, yet no existing work describes them in a unified way from a macroscopic perspective. Multi-task learning simultaneously solves for several learners that are related but have distinct characteristics on related data sets. Transfer learning studies how data or models from related historical scenes, which are abundant but cannot be used directly, can benefit modeling of the current scene. Concept drift studies learning scenes that change continuously. All three investigate, directly or indirectly, multiple sub-learners and the relationships among them, and this dissertation refers to them collectively as coupled machine learning methods. It proposes a framework of coupled support vector learning, with the expectation that this viewpoint will shift the focus of multi-learner research toward the coupling characteristics between scenes.

The time adaptive support vector machine (TA-SVM) performs well on non-stationary data sets, but the correlation information it obtains only from the similarity of adjacent sub-classifiers is insufficient, which may make the trained model unreliable and limit its applicability. By defining a correlation decay function over the sequence of sub-classifiers, a new method for classifying non-stationary data is proposed: Evolving Support Vector Machines (ESVM). ESVM uses the decay function to express the degree of correlation between sub-classifiers and constrains the weighted differences between all pairs of sub-classifiers, yielding a sub-classifier sequence that changes more smoothly and matches the gradually changing concept hidden in the data. In comparative experiments on a variety of slowly changing data scenes, ESVM outperforms earlier methods.

Although TA-SVM solves for multiple sub-classifiers simultaneously while balancing local and global optimization, the direct coupling between sub-classifiers requires a matrix pseudo-inverse in the computation. As a result, it is difficult to guarantee in theory that its extended kernel function is a Mercer kernel, and for large data sets the high computational cost limits its practicality. To address this, the Improved Time Adaptive Support Vector Machine (ITA-SVM) is proposed. It describes the sub-classifier sequence with a base classifier and a set of increments, avoiding the matrix pseudo-inverse that arises when the sub-classifier sequence is solved directly. Combined with core vector machine (CVM) theory, a fast algorithm for ITA-SVM is given. On non-stationary data sets ITA-SVM achieves classification performance comparable to or better than TA-SVM, with asymptotically linear time complexity. Experiments verify the effectiveness of the method.

Traditional methods for building regression systems consider only a single scene during training. An important drawback follows: if key information is missing from the current scene, the trained system generalizes poorly. To address this, a regression system with transfer learning capability is proposed on the basis of support vector regression, namely Transfer learning Support Vector Regression (T-SVR). T-SVR makes full use of the data in the current scene and also learns effectively from historical knowledge, so it can compensate for missing information in the current scene by transferring knowledge from historical scenes. Specifically, by controlling the similarity between the current model and the historical model in the objective function, the current model obtains useful information from the historical scene when its own information is missing or insufficient, producing an enhanced model of the current scene. Experiments on simulated data and on a Fenjiu spectral data set confirm that T-SVR adapts better than traditional regression modeling methods when information is missing.

Multi-task learning aims to improve the performance of each sub-learner by using information from related tasks, and it has achieved good results both in theory and in practical applications such as gene sequencing and web page classification. Earlier methods, however, attend only to the relationships among tasks and do not fully consider algorithmic complexity. The rapid growth in the volume of information poses new challenges for multi-task learning, and high computational cost limits the practicality of earlier methods. This dissertation proposes the Fast regularized Multi Task Learning (Fr MTL) method. Fr MTL achieves classification performance comparable to regularized multi-task learning and, by relying on core vector machine techniques, attains asymptotically linear time complexity, so it retains fast decision speed on large data sets.
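The abstract does not give the ESVM objective, so the following is only a rough sketch of the coupling idea it describes: TA-SVM ties together adjacent sub-classifiers, whereas ESVM weights the difference between every pair of sub-classifiers by a decay function of their distance in the sequence. The linear weight vectors, the squared Euclidean distance, the exponential form of the decay, and all function names below are assumptions made for illustration and are not taken from the dissertation.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two weight vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exponential_decay(gap: int, rate: float = 1.0) -> float:
    """Hypothetical decay function: 1 for neighbours, smaller for larger gaps."""
    return math.exp(-rate * (gap - 1))


def adjacent_penalty(weights: Sequence[Vector]) -> float:
    """Coupling term over neighbouring sub-classifiers (t, t + 1) only."""
    return sum(
        squared_distance(weights[t], weights[t + 1])
        for t in range(len(weights) - 1)
    )


def decayed_penalty(
    weights: Sequence[Vector],
    decay: Callable[[int], float] = exponential_decay,
) -> float:
    """Coupling term over every pair (s, t), weighted by decay(t - s)."""
    total = 0.0
    for s in range(len(weights)):
        for t in range(s + 1, len(weights)):
            total += decay(t - s) * squared_distance(weights[s], weights[t])
    return total
```

In this sketch, a decay function that returns 1 for a gap of one and 0 otherwise reduces decayed_penalty to adjacent_penalty, which is one way to read the abstract's claim that ESVM generalizes the adjacent-only coupling. In a full method, a term of this kind would be added to the usual SVM loss for each sub-classifier.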
[Degree-granting institution]: Jiangnan University
[Degree level]: Doctorate
[Year degree conferred]: 2016
[Classification number]: TP181


Article ID: 1953446



Article link: http://sikaile.net/shoufeilunwen/xxkjbs/1953446.html

