Research on Relationship Prediction Techniques for Polar Heterogeneous Information Networks
Published: 2018-04-21 18:32
Topics: link prediction + polarity prediction; Source: Shandong University, master's thesis, 2014
[Abstract]: We live in an interconnected world. Most data and information objects and components are interrelated or interact with one another, forming countless huge, interconnected complex networks. Without loss of generality, such interconnected networks are called information networks. Analyzing and mining information networks has become a topic of broad interest to researchers in computer science, sociology, biology, and other fields.
Information networks are divided into homogeneous and heterogeneous information networks. A homogeneous information network has only one type of node and one type of relation; in a friendship network, for example, every node is a person and every edge denotes friendship. Most real-world networks, however, are heterogeneous. A heterogeneous information network has multiple node types, and the relations between different node types are themselves of different types. The IMDB network, for example, contains movie, director, and actor nodes, together with relation types that carry different semantics, such as the "directs" relation between movies and directors and the "acts in" relation between movies and actors. As networks have developed, people increasingly express their feelings when socializing online, so the edges of a network acquire polarity: an edge is either positive (trust, liking, friendship, etc.) or negative (distrust, dislike, opposition, etc.). We call a heterogeneous information network whose edges carry polarity a polar heterogeneous information network.
Many methods for analyzing and mining information networks have been studied, and relationship prediction is one important task among them. In a polar heterogeneous information network, relationship prediction comprises link prediction and polarity prediction, which predict the existence and the polarity of an edge, respectively. Link prediction is valuable for analyzing evolving networks, recommendation, clustering, and similar areas; polarity prediction can be applied to recommendation, decision making, network evolution models, and many other areas.
Link prediction and polarity prediction have both produced many research results, but most link prediction work is based on non-polar information networks and most polarity prediction work is based on homogeneous information networks, whereas most real-world networks are polar heterogeneous information networks. How to solve relationship prediction in polar heterogeneous information networks is therefore a new challenge. This thesis explores the relationship prediction problem in polar heterogeneous information networks. The main contributions are as follows:
1. A link prediction method for polar heterogeneous information networks. We propose a rule-based method, called RulePredict, to solve the link prediction problem. In the RulePredict model, we first systematically extract features, which comprise positive features that promote the existence of a link and negative features that reduce the likelihood that a link exists. Whether a link appears follows a binomial distribution with probability p, where p is a function of all feature values. We then learn the weight of each feature with a supervised learning method based on generalized least squares, and apply the learned weights to the test data to predict whether a link exists.
2. A polarity prediction method for polar heterogeneous information networks. We propose a new method, HeteSign, to solve the polarity prediction problem. We first define node similarity values under different relations; each node similarity value is treated as a feature with a corresponding weight. The similarity between two nodes is defined as a mathematical expression of these features and weights. We then compute a polarity score for each link and decide from the score whether the link is positive or negative. The score is expressed as a function of node similarity and the links that already exist in the network; because positive and negative edges differ in importance, existing links are assigned corresponding coefficients. Within a supervised learning framework, the weights and coefficients are obtained by maximum likelihood estimation.
3. The effectiveness of both methods is verified on the real-world IMDB and Epinions networks. The experimental results show that our methods achieve better accuracy than the other methods compared.
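The two models summarized in items 1 and 2 can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the logistic form of p, the weighted-sum similarity, the neighbor-based polarity score, and all function and parameter names (`link_probability`, `node_similarity`, `polarity_score`, `coef_pos`, `coef_neg`) are hypothetical and are not taken from RulePredict or HeteSign. The weights are fixed by hand here; fitting them by generalized least squares or maximum likelihood is left out.

```python
import math


def link_probability(pos_feats, neg_feats, pos_w, neg_w, bias=0.0):
    """Assumed form of p: a logistic function of a weighted feature sum.

    Positive features raise p and negative features lower it. The
    logistic link is an illustrative choice, not taken from the thesis.
    """
    z = bias
    z += sum(w * f for w, f in zip(pos_w, pos_feats))
    z -= sum(w * f for w, f in zip(neg_w, neg_feats))
    return 1.0 / (1.0 + math.exp(-z))


def node_similarity(sim_values, weights):
    """Assumed form: weighted sum of per-relation similarity values."""
    return sum(w * s for w, s in zip(weights, sim_values))


def polarity_score(evidence, coef_pos, coef_neg):
    """Assumed polarity score for a candidate link (u, v).

    evidence holds one (similarity, sign) pair per existing link k -> v:
    similarity is the node similarity between u and k, and sign is +1 or
    -1. Positive and negative links get separate coefficients.
    """
    score = 0.0
    for similarity, sign in evidence:
        coef = coef_pos if sign > 0 else coef_neg
        score += coef * sign * similarity
    return score


def predict_polarity(score):
    """Return +1 for a non-negative score, otherwise -1."""
    return 1 if score >= 0 else -1


# Toy values, chosen by hand for illustration only.
p = link_probability([1.0, 0.5], [2.0], pos_w=[0.8, 0.4], neg_w=[0.3])
sim_a = node_similarity([0.9, 0.5], weights=[0.5, 0.5])
sim_b = node_similarity([0.2, 0.4], weights=[0.5, 0.5])
score = polarity_score([(sim_a, +1), (sim_b, -1)], coef_pos=1.0, coef_neg=2.0)
print(p, score, predict_polarity(score))
```

In this toy setup a similar node with a positive link outweighs a less similar node with a negative link, even though the negative link is given twice the coefficient, so the candidate link is predicted to be positive.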
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.02