
Credibility Analysis of Answers in Interactive Q&A Communities

Published: 2018-06-27 20:28

  Topics: interactive Q&A community + multi-word expression; Source: Beijing Information Science and Technology University, 2013 master's thesis


[Abstract]: In recent years, with the development of Web 2.0, users have become editors of web content as well as readers of it, which has given rise to a large number of user-generated content (UGC) applications. The question answering community (interactive Q&A community) is one such application. Its basic model is that users post questions according to their own needs and other users answer them. Because the answerers are diverse, the credibility of their answers varies widely, and answers of differing credibility have a significant effect on both the asker and later readers of the question. Judging the credibility of answers has therefore become a central problem for Q&A communities. This thesis studies answer credibility analysis in interactive Q&A communities and divides the work into three parts: extraction of multi-word expressions from questions, credibility classification of answers, and identification of the most credible answer.
First, multi-word expression extraction. Multi-word expressions extracted from questions are used mainly for question understanding and for building a trusted information base. Based on the characteristics of multi-word expressions in community questions, an extraction method suited to interactive Q&A communities is proposed. The method first extracts candidate multi-word expressions from questions using mutual information and a stop-word list, then divides the candidates into four categories: correct strings, incomplete strings, redundant strings, and erroneous strings.
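The candidate-extraction step described above can be sketched as follows. The abstract states only that mutual information and a stop-word list are used, so everything else here is an assumption made for illustration: the input is already tokenized, only adjacent token pairs are scored, the score is pointwise mutual information in bits, and the function name `candidate_mwes`, the thresholds `min_pmi` and `min_count`, and the English toy questions are all hypothetical.

```python
import math
from collections import Counter


def candidate_mwes(sentences, stop_words, min_pmi=1.0, min_count=2):
    """Return {(w1, w2): pmi} for adjacent token pairs that pass the filters.

    Assumed scoring (not taken from the thesis): pointwise mutual information
    PMI(w1, w2) = log2( P(w1 w2) / (P(w1) * P(w2)) ).
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs only

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    candidates = {}
    for (first, second), count in bigrams.items():
        # Drop rare pairs and pairs that contain a stop word.
        if count < min_count or first in stop_words or second in stop_words:
            continue
        p_pair = count / total_bigrams
        p_first = unigrams[first] / total_unigrams
        p_second = unigrams[second] / total_unigrams
        pmi = math.log2(p_pair / (p_first * p_second))
        if pmi >= min_pmi:
            candidates[(first, second)] = pmi
    return candidates


# Toy, pre-tokenized questions (illustrative only, not data from the thesis).
QUESTIONS = [
    ["how", "to", "treat", "high", "blood", "pressure"],
    ["is", "high", "blood", "pressure", "dangerous"],
    ["how", "to", "lower", "blood", "pressure"],
]
STOP_WORDS = {"how", "to", "is"}

candidates = candidate_mwes(QUESTIONS, STOP_WORDS)
for pair, score in sorted(candidates.items()):
    print(pair, round(score, 2))
```

On this toy input the surviving candidates include a pair such as `("high", "blood")`, which is a fragment of a longer expression. That is the kind of candidate the abstract's "incomplete string" category appears to describe, and it is what a later correction step would need to repair.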
Drawing on the search engine's optimization of query strings and on the retrieval results of each candidate on the Internet, a correction method for candidate multi-word expressions is designed, which completes the extraction. Experiments on questions from the Sina iAsk (新浪愛問知識人) question base show that extraction reaches a precision of 84%, a recall of 52%, and an F-value of 0.64, a reasonably good result.
Second, answer credibility classification. In view of the characteristics of interactive Q&A communities, two features of the answer text are proposed, text normativity and uncertainty of tone, so that credibility can be classified from more angles. A Logistic Regression model combines these with classical text features, statistical features, and user features to analyze answer credibility. Experiments on question-answer pairs from the medical and health domain of Sina iAsk show that, on top of the classical features, the proposed text-normativity and uncertain-tone features improve the accuracy of credibility classification, which verifies their effectiveness.
Third, identification of the most credible answer. A method for building a trusted information base is proposed, together with an approach that combines the trusted information base with the traditional basic features of question-answer pairs, which considerably improves the identification results. Trusted question-answer pairs and trusted material related to the question are selected as the main content of the trusted information base, and a suitable organizational structure is designed to link the two parts, which makes the base convenient to use. A method for using the trusted information base is also proposed, and experiments verify that building the base is effective for identifying the most credible answer. With the proposed method, identification of the most credible answer achieves good experimental results.
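The credibility-classification step (the second part of the abstract) can be illustrated with a minimal logistic regression written in plain Python. This is a sketch under stated assumptions, not the thesis's implementation: the two feature names `uncertain_tone_ratio` and `normativity_score` are hypothetical stand-ins for the proposed uncertain-tone and text-normativity features, the six rows are made-up toy values rather than experimental data, and the training loop is ordinary batch gradient descent. The helper `f_measure` assumes the reported F-value is the balanced F1 score.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            error = sigmoid(score) - y  # derivative of log-loss w.r.t. score
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (credible) if the predicted probability is at least 0.5."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


def f_measure(precision, recall):
    """Balanced F1 score (assumed to be the abstract's F-value)."""
    return 2 * precision * recall / (precision + recall)


# Toy rows: [uncertain_tone_ratio, normativity_score]; label 1 = credible.
# Hypothetical values for illustration only.
ROWS = [
    [0.00, 0.9], [0.10, 0.8], [0.05, 0.7],
    [0.50, 0.3], [0.60, 0.2], [0.40, 0.4],
]
LABELS = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(ROWS, LABELS)
predictions = [predict(weights, bias, x) for x in ROWS]
print(predictions)
```

The `f_measure` helper also gives a quick consistency check on the extraction figures: with precision 0.84 and recall 0.52, the balanced F1 is 2 × 0.84 × 0.52 / (0.84 + 0.52) ≈ 0.642, which rounds to the reported 0.64.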
[Degree-granting institution]: Beijing Information Science and Technology University
[Degree level]: Master's
[Year conferred]: 2013
[Classification number]: TP393.092; TP391.1





Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2075185.html

