
A Domain-Specific Automatic Question Answering System Based on the LDA Model

Published: 2018-06-17 13:49

  Topics: word segmentation + LDA model; Source: Anhui University, master's thesis, 2013


[Abstract]: As the Internet grows, the amount of information it holds keeps increasing, and people want to find the information they need quickly. However, the effective utilization rate of current search engines is low, and their many shortcomings limit how efficiently people can obtain information. An automatic question answering system can retrieve what the user is looking for more intelligently, quickly, and accurately, and has become a widely studied topic among researchers in China and abroad in recent years.

This thesis aims to build an automatic question answering system for one domain, solutions to common computer faults, and examines the whole pipeline from question processing to the final answer. The research found that domain word segmentation and semantic similarity computation are the core of such a system, and that both leave considerable room for improvement relative to current system requirements and the state of research. The thesis improves these two components and validates each improvement experimentally in the corresponding section, showing that the changes do strengthen the retrieval results. It then designs and implements a prototype system that automatically returns solutions to computer fault questions posed by users.

First, the thesis surveys the methods commonly used for Chinese word segmentation. It analyzes the two classic approaches, dictionary-based and statistics-based segmentation, in depth, briefly introduces other methods, and compares the characteristics and effectiveness of each. It then proposes a segmentation method based on a domain dictionary and the mutual information of word strings. The method incorporates semantic information, takes the characteristics of domain-specific terminology into account, and uses the mutual information of word strings to resolve segmentation ambiguity. Experiments show that these changes improve segmentation performance on domain text.

Second, the thesis briefly discusses the concept of semantic similarity and the principles for computing it, and studies similarity methods based on edit distance, on dependency relations, and on semantic distance and ontologies. It then proposes a new method that improves on the classic similarity computations. The new method trains an LDA model on a domain corpus to obtain a domain-specific word-topic distribution. Because it accounts for the semantic relatedness of words that fall under the same topic, the resulting similarity scores are more reliable.

Finally, the thesis presents the system design of the question answering system for common computer faults. The design gives the system framework high cohesion and low coupling, which greatly reduces the cost of upgrades and later maintenance. A demo version of the system was developed on the .NET Framework under Windows XP, and practical testing shows that it runs well.
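The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, not the thesis's actual method. It assumes that an overlapping ambiguity "ABC" is resolved by keeping the pair of adjacent strings with the higher pointwise mutual information, and that sentence similarity is the cosine of averaged per-word topic vectors taken from a trained LDA model. All function names, counts, and topic weights below are hypothetical illustration values.

```python
import math

def mutual_information(x, y, count, pair_count, total):
    """Pointwise mutual information log2(P(xy) / (P(x) * P(y))) from raw counts."""
    c_xy = pair_count.get((x, y), 0)
    if c_xy == 0 or count.get(x, 0) == 0 or count.get(y, 0) == 0:
        return float("-inf")
    return math.log2(c_xy * total / (count[x] * count[y]))

def resolve_overlap(a, b, c, count, pair_count, total):
    """Choose between AB|C and A|BC by comparing the cohesion of (a, b) and (b, c)."""
    left = mutual_information(a, b, count, pair_count, total)
    right = mutual_information(b, c, count, pair_count, total)
    return [a + b, c] if left >= right else [a, b + c]

def topic_vector(words, word_topic):
    """Average the topic vectors of the words that the model knows."""
    known = [word_topic[w] for w in words if w in word_topic]
    if not known:
        return None
    return [sum(col) / len(known) for col in zip(*known)]

def topic_similarity(words_a, words_b, word_topic):
    """Cosine similarity between the averaged topic vectors of two word lists."""
    va = topic_vector(words_a, word_topic)
    vb = topic_vector(words_b, word_topic)
    if va is None or vb is None:
        return 0.0
    dot = sum(p * q for p, q in zip(va, vb))
    norm = math.sqrt(sum(p * p for p in va)) * math.sqrt(sum(q * q for q in vb))
    return dot / norm if norm else 0.0

# Hypothetical corpus counts for the classic ambiguous string "结合成".
COUNT = {"结": 50, "合": 80, "成": 100}
PAIR_COUNT = {("结", "合"): 30, ("合", "成"): 10}
TOTAL = 1000

# Hypothetical word-topic weights (topic 0: display faults, topic 1: network faults).
WORD_TOPIC = {
    "screen": [0.9, 0.1],
    "black": [0.7, 0.3],
    "monitor": [0.8, 0.2],
    "router": [0.1, 0.9],
    "network": [0.2, 0.8],
}
```

With these toy values, `resolve_overlap("结", "合", "成", COUNT, PAIR_COUNT, TOTAL)` keeps "结合" together, and `topic_similarity(["screen", "black"], ["monitor"], WORD_TOPIC)` scores higher than the comparison against `["router", "network"]` even though the two questions share no surface word, which is the effect the abstract attributes to the topic-based measure.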
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year conferred]: 2013
[Classification number]: TP391.1


Article ID: 2031272


Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2031272.html

