Research on Key Issues in Deep Learning-Based Text Classification
Published: 2022-01-25 15:01
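Text classification has a long history, and in recent years the rapid development of artificial intelligence and machine learning has produced many new methods for it. As the technology has advanced, the quality and quantity of text corpora have changed greatly, and the accumulation of large-scale corpora provides the data that more complex models require. At the same time, improvements in computing performance provide the resources needed to process and analyze large-scale corpora. With the progress of machine learning and deep learning, deep learning methods have shown strong advantages in many fields. This thesis examines basic research problems in text classification on the basis of deep learning. It introduces different deep learning methods, such as the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). We propose using CNN and LSTM separately, with Naive Bayes (NB) as the comparison method. With PyCharm as the development platform, we ran experiments on a public dataset for text sentiment classification and analyzed the results. The results show that the proposed methods outperform the baseline method.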
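To make the role of the Naive Bayes comparison method concrete, the following is a minimal sketch of a multinomial Naive Bayes sentiment classifier with add-one smoothing. It is not the thesis's code: the function names `train_nb` and `predict_nb`, the whitespace tokenization, and the four toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect per-class document counts and word counts (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # assumption: simple whitespace tokenization
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # skip words never seen during training
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not the dataset used in the thesis.
train_docs = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "boring plot awful acting",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = train_nb(train_docs, train_labels)
prediction = predict_nb(model, "loved the great acting")
print(prediction)
```

A baseline of this kind treats each document as a bag of words and ignores word order, which is the main limitation that CNN and LSTM models are meant to address.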
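[Source]: North China Electric Power University (Beijing), Beijing; Project 211 institution; institution directly under the Ministry of Education
[Pages]: 60
[Degree level]: Master's
[Table of contents]:
ABSTRACT (CHINESE)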
ABSTRACT
CHAPTER 1 INTRODUCTION
1.1 Text Classification
1.1.1 Definition
1.1.2 Basic Concepts of Text Classification
1.1.3 Text Classification Processes
1.1.4 Applications of Text Classification
1.2 Deep Learning
1.2.1 Definition
1.2.2 History of Deep Learning
1.2.3 Applications of Deep Learning in Text Mining
1.3 Literature Review
1.4 Research Motivation
1.5 Thesis Layout
CHAPTER 2 DEEP LEARNING TECHNIQUES
2.1 Data Preprocessing
2.1.1 Stemming
2.1.2 Word Segmentation
2.2 Text Representation
2.2.1 Word Embedding
2.2.2 One-hot Vector
2.3 Classification
2.3.1 Convolutional Neural Network (CNN)
2.3.2 Convolutional Neural Network (CNN) Applications
2.3.3 Long Short-Term Memory (LSTM)
2.3.4 Gated Recurrent Unit (GRU)
2.3.5 Long Short-Term Memory (LSTM) Applications
2.4 Evaluation
2.4.1 Confusion Matrix
2.4.2 Accuracy
2.4.3 Precision
2.4.4 Recall
2.4.5 False Positive Rate (FP), True Negative Rate (TN) and False Negative Rate (FN)
2.4.6 F-Measure
CHAPTER 3 CONVOLUTIONAL NEURAL NETWORK (CNN)-BASED TEXT CLASSIFICATION
3.1 Dataset
3.2 Baseline Method: Naive Bayes (NB)
3.2.1 Definition
3.2.2 Environment
3.2.3 Experiment
3.2.4 Results and Analysis
3.3 Proposed Method 1: Convolutional Neural Network (CNN)
3.3.1 Definition
3.3.2 Environment
3.3.3 Model
3.3.4 Experiment
3.3.5 Results and Analysis
CHAPTER 4 LONG SHORT-TERM MEMORY (LSTM)-BASED TEXT CLASSIFICATION
4.1 Definition
4.2 Environment
4.3 Model
4.4 Experiment
4.5 Results and Analysis
CHAPTER 5 CONCLUSIONS AND FUTURE WORKS
5.1 Conclusions
5.2 Future Works
REFERENCES
APPENDIX A-Some Codes From The Baseline Method: Naive Bayes (NB)
APPENDIX B-Codes From The Convolutional Neural Network (CNN) Model
APPENDIX C-Codes From The Long Short-Term Memory (LSTM) Model
ACKNOWLEDGEMENT
Article ID: 3608745
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/3608745.html