

Research on Community Detection Techniques in Community Question Answering Systems and Their Applications

Published: 2018-08-28 20:10
[Abstract]: Community-based Question and Answering (CQA) systems, such as Yahoo! Answers and Baidu Zhidao, gather the wisdom of the crowd to provide free, personalized solutions to questions. However, CQA systems have no explicit community structure, so their "community" nature has not been fully exploited. CQA systems are also highly open: their knowledge content is shared and accessible to search engines. This openness leaves them vulnerable to intrusion by fake accounts, which makes account behavior patterns complex and causes knowledge quality to decline sharply.

To address these problems, it is necessary to study in depth the account behavior patterns and network properties of these systems. This research also helps with related problems such as relevant-user recommendation, merging of similar question-and-answer content, discovery of emerging topics, fake-user identification, and personalized Q&A services, all of which can improve knowledge quality in CQA systems.

Taking Baidu Zhidao, the largest CQA system in China, as a representative, this thesis analyzes account behavior patterns in CQA systems. By exploring the question-answer relationships between accounts, it builds two network models and presents the basic network properties of a CQA system. To detect interest-centered account communities in CQA systems, we propose MSLPA (Multilayer speaker-listener label propagation algorithm), a community detection algorithm for CQA systems based on the label propagation algorithm SLPA. MSLPA is evaluated in terms of network scale, community topics, aggregation quality, and hierarchical structure. Compared with several existing community detection algorithms, MSLPA can find meaningful, overlapping, hierarchically structured account communities in large-scale CQA networks, avoids generating large numbers of tiny communities, and effectively groups related accounts.

Building on MSLPA community detection, this thesis proposes a method for identifying fake accounts in CQA systems. It first defines a set of highly discriminative account attributes, comprising individual account attributes with a clear physical meaning and properties of the communities to which an account belongs. The individual attributes are obtained by statistical analysis, and the community properties come from the community detection results of this thesis. The new attribute set is then fed to a simple J48 decision tree classifier, which labels each account as normal or fake. Experimental results show that the method performs well and substantially improves classification accuracy.
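
The abstract names SLPA as the starting point for MSLPA but gives no details of either. As a rough illustration only, the following Python sketch implements the plain speaker-listener label propagation idea on a small undirected graph. It is not the thesis's MSLPA: the multilayer extension is not described here and is not attempted. The function name slpa, the adjacency-dictionary input format, and the default values for iterations and threshold are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def slpa(adjacency, iterations=20, threshold=0.2, seed=0):
    """Plain SLPA sketch on an undirected graph given as {node: [neighbors]}.

    Hypothetical helper for illustration; not the MSLPA algorithm of the thesis.
    Node ids must be mutually comparable (for example all ints or all strings).
    """
    rng = random.Random(seed)
    # Each node's memory starts with its own id as its only label.
    memory = {node: [node] for node in adjacency}
    nodes = list(adjacency)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for listener in nodes:
            neighbors = adjacency[listener]
            if not neighbors:
                continue
            # Speaker rule: every neighbor sends one label drawn from its memory.
            # A uniform draw over the list picks labels in proportion to frequency.
            heard = Counter(rng.choice(memory[speaker]) for speaker in neighbors)
            # Listener rule: keep the most frequently heard label (random tie-break).
            top = max(heard.values())
            candidates = sorted(label for label, c in heard.items() if c == top)
            memory[listener].append(rng.choice(candidates))
    # Post-processing: a node joins the community of every label whose share
    # of its memory reaches the threshold, so communities may overlap.
    communities = defaultdict(set)
    for node, labels in memory.items():
        for label, count in Counter(labels).items():
            if count / len(labels) >= threshold:
                communities[label].add(node)
    return dict(communities)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2-3).
    graph = {
        0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
        3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
    }
    for label, members in slpa(graph).items():
        print(label, sorted(members))
```

Because a node keeps every label that is frequent enough in its memory, one account can belong to several communities at once, which matches the overlapping communities the abstract mentions. Lowering threshold tends to produce more overlap, and raising it tends to produce fewer and crisper communities.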
[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.092


Article ID: 2210486


Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2210486.html

