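Research on Question Answering System Technology Based on Deep Learning
Topics: question answering systems + word vectors; Source: Zhejiang University, master's thesis, 2017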
[Abstract]: Question answering (QA) systems are an active research topic in natural language processing. They let users ask questions directly in natural language and return precise, concise answers rather than a list of related web pages. In recent years, deep learning has brought many breakthroughs to the field, and research on deep-learning-based QA algorithms has become the most popular research direction in natural language processing. It has produced a large body of papers and development frameworks, such as SyntaxNet, released by Google in 2016, which has greatly reduced the cost of developing high-performance QA systems.
This thesis applies deep learning to the construction of QA systems. The work carried out is as follows:
1. Built a high-accuracy task-oriented QA system using word vectors and a convolutional neural network (CNN), improved an existing CNN-based question classification algorithm, and explored the relationship between model initialization parameters and model performance.
2. Built an end-to-end open-domain QA system based on a bidirectional long short-term memory (BiLSTM) model and an attention mechanism, addressing the weakness in question semantic representation of earlier end-to-end QA algorithms based on a unidirectional LSTM.
3. Ran comparative experiments on commonly used datasets such as Facebook bAbI and the Ubuntu Dialogue Corpus. The comparison demonstrates the effectiveness and soundness of the QA algorithms designed in this thesis, and the results are analyzed in some detail.
4. Built a QA microservice with TensorFlow and Docker that is cheap to maintain and easy to deploy, solving the difficulty of deploying the TensorFlow framework as an online service.
The main contributions of this thesis are as follows:
1. Discovered the relationship between the performance of the question semantic similarity algorithm based on word vectors and a CNN and the dimensionality of the word vectors, and verified it experimentally.
2. Experimented with expanding the dimensionality of the word vector input of that algorithm by replication interpolation, which resolved the drop in model performance as the number of question categories grows.
3. Improved existing end-to-end QA models based on recurrent neural networks by using a BiLSTM and an attention mechanism, raising performance metrics such as average question-answer length.
4. Implemented a complete QA microservice based on TensorFlow and Docker, using Spring Boot to wrap the algorithm scripts. This resolved compatibility problems with TensorFlow Serving and enabled elastic deployment and scaling at low maintenance cost.
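The abstract names two techniques without spelling them out: expanding word vector dimensionality by replication, and attention over bidirectional LSTM states. The NumPy sketch below illustrates one plausible reading of each. It is not the thesis's implementation: the function names `expand_by_replication` and `attention_pool` are hypothetical, the component-wise repetition is only an assumed interpretation of "replication interpolation", and plain dot-product attention stands in for whatever scoring function the thesis actually uses.

```python
import numpy as np


def expand_by_replication(embeddings: np.ndarray, factor: int) -> np.ndarray:
    """Repeat every component of each word vector `factor` times.

    Maps a (seq_len, dim) matrix to (seq_len, dim * factor).
    Assumed reading of "replication interpolation"; the thesis may
    define the operation differently.
    """
    return np.repeat(embeddings, factor, axis=1)


def attention_pool(hidden: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Dot-product attention over per-token hidden states.

    hidden: (seq_len, d), e.g. concatenated forward and backward LSTM states.
    query:  (d,), e.g. a representation of the question.
    Returns the (d,) attention-weighted sum of the hidden states.
    """
    scores = hidden @ query            # one score per token, shape (seq_len,)
    scores = scores - scores.max()     # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over tokens
    return weights @ hidden


# Toy "sentence" of two tokens with 2-dimensional word vectors.
sentence = np.array([[1.0, 2.0], [3.0, 4.0]])
expanded = expand_by_replication(sentence, 2)
print(expanded.shape)  # (2, 4)
```

With a zero query every token receives the same weight, so `attention_pool` reduces to the mean of the hidden states, which is a convenient sanity check.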
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.1
Article ID: 1960603
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1960603.html