Research and Implementation of Text Classification Based on Deep Learning Theory and SVM Technology
[Abstract]: With the rapid development of Internet technology, enormous volumes of data and information are produced, and every day millions of Internet users rely on the Internet to find valuable and meaningful information. How to obtain the desired knowledge quickly and accurately from massive data has become an active research topic. To address this problem, researchers analyze, mine, and classify data to make information retrieval more efficient. The main work of this thesis is to use deep learning for feature extraction, together with a support vector machine (SVM), to mine and analyze massive text data and obtain the essential features of the text. Traditional text classification algorithms score terms with statistical measures such as expected cross entropy, information gain, and mutual information, and obtain the feature set by applying a threshold. When the training set is large, this approach tends to produce unclear feature items and to lose feature information. To address these problems, this thesis uses deep learning to combine existing data features, and proposes a classifier that pairs deep learning with an SVM to perform text classification. The main research contents and innovations of this thesis are as follows:
1. It reviews the research status and significance of existing text classification technology in China and abroad, explains the importance of text classification, and outlines the work undertaken in this thesis.
2. It studies traditional classification technology in three parts: text preprocessing, text feature extraction, and text classification. The Bayesian, KNN, and SVM classification algorithms are then described, and the applicable scope, advantages, and disadvantages of the three algorithms are analyzed.
3. It introduces the relevant theory of deep learning and proposes using a sparse autoencoder to map the original data in a high-dimensional space, and a deep belief network to project the output of the sparse autoencoder, yielding abstract text features. The process of text feature extraction based on the sparse autoencoder and the deep belief network is studied.
4. Based on deep learning and an improved multi-class SVM method, it designs a classifier that combines a sparse autoencoder, a deep belief network, and SVM classification to classify text.
Finally, experiments are designed to test the proposed method and to compare it with traditional text classification methods. Classification accuracy is evaluated under different parameter settings.
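The threshold-based feature selection that the abstract attributes to traditional methods can be illustrated with a small sketch. The example below computes information gain for each term and keeps the terms whose score meets a threshold. It is only an illustration under stated assumptions: the toy corpus, the threshold value, and the function names (`entropy`, `information_gain`, `select_features`) are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """IG(t) = H(C) - [P(t) * H(C | t present) + P(not t) * H(C | t absent)]."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = 0.0
    for subset in (with_term, without_term):
        if subset:  # skip empty partitions to avoid division by zero
            conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


def select_features(docs, labels, threshold):
    """Keep every vocabulary term whose information gain is >= threshold."""
    vocabulary = set().union(*docs)
    scores = {t: information_gain(docs, labels, t) for t in vocabulary}
    return {t: s for t, s in scores.items() if s >= threshold}


# Hypothetical toy corpus: each document is represented as a set of tokens.
docs = [
    {"goal", "match", "team"},
    {"team", "score", "match"},
    {"stock", "market", "price"},
    {"market", "price", "team"},
]
labels = ["sports", "sports", "finance", "finance"]

# The threshold of 0.5 bits is an arbitrary choice for illustration.
selected = select_features(docs, labels, threshold=0.5)
print(sorted(selected))
```

On this toy corpus, terms that appear in only one class ("match", "market", "price") separate the classes perfectly and are kept, while terms such as "team" fall below the threshold and are discarded. The discarded terms still carry some class information, which is one way to picture the loss of feature information that the abstract mentions.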
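[Degree-granting institution]: Jiangsu University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1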
Article ID: 2479130
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2479130.html