

Research on Instance-Based Domain-Adaptive Incremental Learning Methods

Published: 2018-04-03 20:37

  Topic: text classification. Entry point: instance transfer. Source: Nanjing University of Science and Technology, master's thesis, 2017.


【Abstract】: With the rapid development of Internet technology, the amount of information people can obtain online grows daily. This explosive growth has both advantages and drawbacks, and how to use the information efficiently and fully has become a pressing problem for academia and industry alike. Text classification is a common technique for addressing it and, by learning setting, can be divided into domain-specific and domain-adaptive text classification. Many instance-transfer-based domain adaptation algorithms already exist, but they share a common phenomenon: overfitting caused by over-learning of instance weights. To our knowledge, no prior work has explicitly discussed this problem; this thesis studies it systematically. In addition, in natural language processing, traditional statistical machine learning models are usually single-task, i.e., the model is learned from the training data in a single pass, which limits the generalization ability and extensibility of the algorithm; this thesis makes incremental improvements to address that weakness. First, the thesis introduces ILA, a representative instance-based domain adaptation algorithm, and proposes regularization methods on top of it to strengthen the transfer learning effect. The regularization methods comprise six sub-methods: three based on early stopping, two that add penalty factors as regularization terms of the ILA model, and one that introduces dropout training into instance-weighted learning. Text classification experiments show that each regularization method improves the performance of the instance transfer algorithm to some extent, with dropout training yielding the most significant gains. Second, the thesis systematically studies the overfitting problem of weight learning in domain adaptation. Although the regularization methods above can indirectly mitigate overfitting, they do not solve the underlying problem, and they severely limit the efficiency and adaptability of the algorithm. The thesis therefore proposes loss-function-penalty methods, which penalize the loss to different degrees according to each instance's weight. Experiments show that these methods not only clearly alleviate overfitting but also offer strong adaptability and high efficiency; among them, the penalty applied to the few samples with the largest weights is the most effective and the most stable. Finally, the thesis proposes an incremental naive Bayes model based on lifelong learning, extending the traditional naive Bayes model with an incremental parameter update scheme and a lifelong learning mechanism. The model can store knowledge learned from large numbers of historical tasks, effectively assist the learning of new tasks that have only a few labeled samples, and update its parameters incrementally: each round of learning only updates the historical model, without retraining on historical data. Experiments on text classification show that the model can incrementally exploit knowledge learned in past tasks to guide the learning of new tasks, and also handles new features well and adapts across domains.
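Among the six regularization sub-methods, the abstract singles out dropout training over instance weights as the most effective. The thesis text here includes no code, so the following is only a minimal hypothetical sketch of that idea, applied to a plain instance-weighted logistic regression; the function names, hyperparameters, and toy data are all invented for illustration, and the actual ILA formulation may differ:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_weighted_lr(data, inst_weights, dropout_p=0.5, lr=0.1, epochs=200, seed=0):
    """Instance-weighted logistic regression with dropout over instance weights.

    data: list of (features, label) pairs with label in {0, 1}.
    inst_weights: per-instance transfer weights (assumed given).
    Each epoch, every instance's weight is independently zeroed with
    probability dropout_p, so the model cannot over-fit to a few
    heavily weighted source instances.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for (x, y), v in zip(data, inst_weights):
            # Dropout on the instance weight; inverted scaling keeps the
            # expected contribution of each instance unchanged.
            keep = rng.random() >= dropout_p
            v_eff = v / (1.0 - dropout_p) if keep else 0.0
            if v_eff == 0.0:
                continue  # this instance is dropped for this epoch
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = v_eff * (p - y)  # weighted gradient of the log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

# Toy usage: two linearly separable classes, uniform instance weights.
data = [([0.0, 1.0], 0), ([0.2, 0.9], 0), ([1.0, 0.1], 1), ([0.9, 0.0], 1)]
weights = [1.0, 1.0, 1.0, 1.0]
w, b = train_weighted_lr(data, weights)
pred = lambda x: int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > 0.5)
```

Dividing surviving weights by 1 - dropout_p keeps each instance's expected contribution unchanged, so the dropout acts purely as a regularizer on which instances drive each update.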
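The incremental naive Bayes idea described in the abstract (update the model from new labeled data without retraining on history) can be sketched as follows. The class name and API below are hypothetical, and the thesis's full lifelong-learning knowledge store across tasks is not reproduced; the sketch only shows how keeping the sufficient statistics (class and word counts) makes each update cost proportional to the new batch alone:

```python
import math
from collections import defaultdict

class IncrementalNaiveBayes:
    """Multinomial naive Bayes that stores its sufficient statistics
    (class and word counts), so a new batch of labeled documents updates
    the model in place without revisiting historical data."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                    # Laplace smoothing constant
        self.class_counts = defaultdict(int)  # class -> number of documents
        self.word_counts = defaultdict(lambda: defaultdict(int))  # class -> word -> count
        self.total_words = defaultdict(int)   # class -> total word occurrences
        self.vocab = set()

    def partial_fit(self, docs, labels):
        # Incremental update: only the stored counts change; past
        # documents never need to be re-read.
        for doc, c in zip(docs, labels):
            self.class_counts[c] += 1
            for word in doc:
                self.word_counts[c][word] += 1
                self.total_words[c] += 1
                self.vocab.add(word)

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        v = len(self.vocab)
        best, best_lp = None, float("-inf")
        for c, nc in self.class_counts.items():
            lp = math.log(nc / n_docs)  # log prior
            for word in doc:            # smoothed log likelihoods
                num = self.word_counts[c][word] + self.alpha
                den = self.total_words[c] + self.alpha * v
                lp += math.log(num / den)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

nb = IncrementalNaiveBayes()
nb.partial_fit([["cheap", "pills"], ["meeting", "agenda"]], ["spam", "ham"])
nb.partial_fit([["cheap", "offer"]], ["spam"])  # a later task: counts updated in place
```

Because the counts are additive, each call to `partial_fit` is equivalent to having trained on all batches at once, which is what lets the model accumulate knowledge from historical tasks without storing or re-processing their data.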
【Degree-granting institution】: Nanjing University of Science and Technology
【Degree level】: Master's
【Year awarded】: 2017
【CLC number】: TP181


Article No.: 1706812


Link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/1706812.html


