
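Research and Design of a Deep Learning-Based Verb Error Detection Algorithm

Published: 2018-04-25 16:22

Topic: English essay correction + rule-based grammar; Source: master's thesis, University of Science and Technology of China, 2017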


[Abstract]: Automatic grading of English essays has developed rapidly in recent years. It is gradually replacing manual grading by teachers and has become an important tool for easing the workload of English teachers. A survey of the literature shows that verb agreement errors and verb tense errors are the two most frequent types of grammatical error in English essays, so how well a system detects verb errors reflects its practicality and effectiveness. The mainstream automatic grading systems at present include Bingguo (冰果) and Juku (句酷). Our investigation found that these systems' detection of verb agreement and verb tense errors does not meet learners' needs. In response, this thesis develops a deep learning-based algorithm for detecting verb grammar errors.

Research and analysis show that verb agreement and verb tense errors correlate strongly with the words and phrases in the surrounding context, and that the deep learning model LSTM (Long Short-Term Memory) can retain useful contextual information during training. This thesis therefore uses LSTM as the model trained on the annotated corpus. Converting the text of an essay into numerical values for later computation is also an important step in automatic grading. Most mainstream tools use a bag-of-words model, which encodes each word according to its position in the dictionary. This encoding is simple and easy to use, but the resulting vectors lose word-order information and are prone to the curse of dimensionality. This thesis instead encodes text with a word embedding model, which maps the text, in order, into a low-dimensional vector space. This preserves the positional information of the words and avoids the curse of dimensionality.

Finally, the thesis collects a set of corpus samples and compares the proposed algorithm against Juku and Bingguo. The results show that the proposed algorithm is superior at verb error detection: its overall precision, recall, and F1 score all exceed those of the current mainstream automatic grading systems.
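The contrast the abstract draws between bag-of-words encoding and word embedding can be illustrated with a small sketch. This is not the thesis's implementation: the example sentences, the helper names (`build_vocab`, `bag_of_words`, `embed_sequence`), the count-vector form of bag-of-words, and the 3-dimensional randomly initialised embedding table are all assumptions made for illustration. A real system would learn the embedding vectors from a corpus and feed the resulting sequence to a sequence model such as an LSTM.

```python
import random


def build_vocab(sentences):
    """Assign each distinct token an integer id in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def bag_of_words(sentence, vocab):
    """Return a count vector of length len(vocab); word order is discarded."""
    counts = [0] * len(vocab)
    for token in sentence.split():
        counts[vocab[token]] += 1
    return counts


def embed_sequence(sentence, vocab, table):
    """Return one low-dimensional vector per token, kept in sentence order."""
    return [table[vocab[token]] for token in sentence.split()]


# The first sentence is correct; the second has a subject-verb agreement error.
sentences = ["the dog chases the cats", "the cats chases the dog"]
vocab = build_vocab(sentences)

# Hypothetical embedding table with random values (assumption for illustration).
EMBED_DIM = 3
rng = random.Random(0)
table = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in vocab]

bow_a, bow_b = (bag_of_words(s, vocab) for s in sentences)
seq_a, seq_b = (embed_sequence(s, vocab, table) for s in sentences)

print(bow_a == bow_b)  # True: both sentences contain the same words
print(seq_a == seq_b)  # False: the ordered sequences differ
```

The two sentences produce identical bag-of-words vectors, so a model that sees only those vectors cannot tell the correct sentence from the erroneous one. The embedded sequences differ because each token keeps its position, and each vector has `EMBED_DIM` entries regardless of vocabulary size.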
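[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1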


Article ID: 1802084


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1802084.html

