Research on Question Retrieval Techniques in Chinese Community Question Answering Systems
Source: Beijing Institute of Technology, master's thesis, 2016. Document type: degree thesis
Related topics: vector space model; TFIDF algorithm; community question answering system; question similarity
[Abstract]: With the rapid development of Internet technology, the volume of information online is growing exponentially, and the increasing reach of the network is gradually changing the main way people obtain information. In recent years community question answering systems have grown rapidly and have had a revolutionary impact on how people acquire information. At the same time, the sheer volume of information overloads readers. How to find content of interest quickly and accurately within massive amounts of information has therefore become a new problem. Community question answering systems are now widely used. In these systems, similar-question retrieval recommends the answers to similar questions to the user, which avoids repeated submission of the same question and helps users obtain answers faster. This thesis addresses question similarity computation in community question answering systems and proposes an improved TFIDF algorithm. First, questions are classified according to the user's query intent, and term weights are adjusted according to the distribution of feature words across the categories. Second, the topic words of each question are added to the feature terms used in the TFIDF computation. Experimental results show that the algorithm clearly improves retrieval performance compared with the traditional TFIDF algorithm and a reference improved algorithm.
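For orientation, the traditional TFIDF baseline that the abstract compares against can be sketched as a vector space model with cosine similarity. This is a minimal illustration, not the thesis's improved algorithm: the abstract does not give the formula for the category-based weight adjustment or for the topic-word features, so neither is reproduced here. The function names (`tfidf_vectors`, `vectorize`, `cosine`, `rank_similar`), the smoothed IDF variant, and the toy archive are all assumptions made for this example, and questions are assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build TF-IDF vectors for a list of tokenized questions.

    Returns (vectors, idf), where each vector is a {term: weight} dict.
    """
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per question
    # Assumed IDF variant: log(N / df) + 1, so terms present in every
    # question still keep a non-zero weight.
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [vectorize(tokens, idf) for tokens in docs]
    return vectors, idf


def vectorize(tokens, idf):
    """Turn one tokenized question into a TF-IDF vector, ignoring unseen terms."""
    tf = Counter(t for t in tokens if t in idf)
    total = sum(tf.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(query_tokens, docs):
    """Rank archived questions by similarity to the query, best first.

    Returns a list of (score, index) pairs; ties are broken by index.
    """
    vectors, idf = tfidf_vectors(docs)
    query_vec = vectorize(query_tokens, idf)
    scores = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical archive of already-tokenized questions.
archive = [
    ["how", "install", "python", "windows"],
    ["best", "way", "learn", "python"],
    ["how", "cook", "rice"],
]
ranking = rank_similar(["install", "python", "linux"], archive)
for score, index in ranking:
    print(index, round(score, 3), " ".join(archive[index]))
```

In a Chinese community question answering setting, the token lists would come from a word segmenter applied to each question. An improved scheme of the kind the abstract describes would then modify the per-term weights inside `vectorize` before the cosine comparison.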
[Degree-granting institution]: Beijing Institute of Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.3
Article ID: 1412348
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1412348.html