Research on Information Diffusion in Online Social Networks

Published: 2018-03-26 19:39

Topic: social networks; Focus: information diffusion; Source: Harbin Institute of Technology, 2014 doctoral dissertation

[Abstract]: The rapid growth of social networking sites such as Facebook and Twitter indicates that social media has become a focus and trend in the development of network technology. Users of social media can establish various relationships (following, friendship, and so on), giving rise to a variety of virtual online social networks. Users in these networks can not only publish information but also spread it by sharing and forwarding. Online social networks therefore underpin both the publication and the diffusion of information. Research on information diffusion in online social networks can help users obtain useful information, help enterprises promote products, and help governments manage public opinion, so it has great practical value. Taking real online social network data and information diffusion data as its object of study, this dissertation builds an overall framework for research on information diffusion in online social networks and studies four problems within that framework: user interest description, information diffusion models, information diffusion maximization, and the combination of information diffusion with user recommendation. The work consists of the following four parts.

First, traditional information retrieval usually describes a user's interests with a term vector whose weights are computed by TF-IDF. Social media contain ternary relation data linking users, resources, and tags. The traditional term-vector model cannot fully exploit this ternary relation to describe user interests accurately, and it also suffers from polysemy, where one word carries several meanings. To address these problems, the dissertation proposes a tag network model of user interest. In a tag network, nodes represent tags and edges represent relationships between tags. Both nodes and edges are weighted: node weights represent the user's degree of interest, and edge weights represent the strength of association between interests. In particular, an improved TF-IDF method is proposed for computing tag weights. Experiments on the MovieLens and CiteULike datasets confirm the effectiveness of the proposed method.

Second, information diffusion prediction models can be applied to public-opinion early warning and to the identification of explosively spreading information, so they have both research significance and practical value. Most existing prediction models have two shortcomings: they cannot make time-dependent predictions of diffusion, and training them usually takes a long time. To address these problems, the dissertation proposes a novel information diffusion prediction model, the GT model. Unlike earlier models, the GT model does not treat nodes as passively acting under the influence of their neighbors. Instead, it treats them as autonomous, intelligent, rational individuals: each user computes the payoff of each available choice and then chooses rationally. The model introduces time-dependent user payoffs, which gives it the ability to predict the temporal dynamics of the diffusion process. The dissertation also proposes computing user payoff by combining global influence with social influence. Experiments on Sina Weibo and Flickr datasets verify the effectiveness of the model in predicting the temporal dynamics of information diffusion.

Third, current research on information diffusion maximization is carried out almost entirely on unsigned social networks, which contain only positive relationships such as friendship or trust. Diffusion maximization in signed social networks remains a challenging and neglected problem. If a study ignores the polarity of relationships between users and crudely treats a signed network as unsigned, both the positive and the negative influence of users are mistaken for positive influence. To solve this problem, the dissertation extends information diffusion maximization to signed social networks, defines the polarity-related information diffusion maximization (PRIM) problem and a polarity-related independent cascade model, and proposes a greedy algorithm to solve the problem. Experiments on two signed social network datasets (Epinions and Slashdot) show that the proposed method outperforms both a greedy algorithm that ignores relationship polarity and other heuristic methods on the PRIM problem.

Fourth, a social network has two main functions: social interaction and information diffusion. User recommendation helps users find suitable friends based on their preferences and the network structure, which strengthens the social interaction function. At the same time, user recommendation creates new links in the network, which accelerates the network's evolution and changes its structure, and this directly affects information diffusion. Most user recommendation methods overlook this effect. To address it, the dissertation proposes the concept of user diffusion degree and a method for computing it. User diffusion degree can be used to re-rank the results produced by a traditional recommendation algorithm so that the recommendations also promote information diffusion. Experiments on Email and Amazon data confirm the effectiveness of the proposed user diffusion degree. In addition, the dissertation proposes a hypergraph-based user recommendation algorithm that can be used together with user diffusion degree. Results on a Sina Weibo dataset show that it outperforms earlier methods on recommendation metrics.
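
The abstract refers to TF-IDF as the usual way of weighting the terms that describe a user's interests. As a rough illustration of that baseline only, the sketch below computes standard TF-IDF weights for tags, treating each user's tag list as one "document". It is not the improved TF-IDF variant proposed in the dissertation, which the abstract does not specify. The function name `tag_tfidf` and the toy data are assumptions made for this example.

```python
import math
from collections import Counter

def tag_tfidf(user_tags):
    """Standard TF-IDF weights for tags (hypothetical helper, baseline only).

    user_tags: dict mapping user -> list of tags the user applied (repeats allowed).
    Returns: dict mapping user -> {tag: weight}.
    """
    n_users = len(user_tags)

    # Document frequency: number of users who used each tag at least once.
    df = Counter()
    for tags in user_tags.values():
        df.update(set(tags))

    weights = {}
    for user, tags in user_tags.items():
        counts = Counter(tags)
        total = sum(counts.values())
        # tf = share of the user's tag uses; idf = log(users / users with tag).
        weights[user] = {
            tag: (count / total) * math.log(n_users / df[tag])
            for tag, count in counts.items()
        }
    return weights

# Toy data, invented for illustration.
toy_tags = {
    "alice": ["scifi", "scifi", "drama"],
    "bob": ["drama", "comedy"],
}
tag_weights = tag_tfidf(toy_tags)
```

In this toy input, `drama` is used by every user, so its inverse document frequency is `log(1) = 0` and it contributes nothing to either profile, while `scifi` and `comedy` receive positive weights.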

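The abstract describes solving the PRIM problem with a greedy algorithm under a polarity-related independent cascade model, but it does not define that model. The sketch below therefore rests on stated assumptions: seeds start in a positive state, a newly activated node takes its activator's state multiplied by the edge sign, and the objective is the expected number of positively activated nodes. The function names, the Monte Carlo estimate, and the toy graph are all hypothetical and may differ from the dissertation's actual formulation.

```python
import random

def simulate(graph, seeds, rng):
    """One run of an ASSUMED sign-aware independent cascade.

    graph: dict node -> list of (neighbor, probability, sign), sign in {+1, -1}.
    Returns the number of nodes that end up in the positive state.
    """
    state = {s: 1 for s in seeds}  # assumption: seeds are positive
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, prob, sign in graph.get(u, []):
                # Each newly active node gets one attempt per inactive neighbor.
                if v not in state and rng.random() < prob:
                    state[v] = state[u] * sign  # polarity propagates along the edge
                    next_frontier.append(v)
        frontier = next_frontier
    return sum(1 for s in state.values() if s == 1)

def expected_positive_spread(graph, seeds, runs=200, seed=0):
    """Monte Carlo estimate of the expected number of positive nodes."""
    rng = random.Random(seed)
    return sum(simulate(graph, seeds, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, k, runs=200):
    """Pick k seeds, each time adding the node with the largest marginal gain."""
    nodes = set(graph) | {v for edges in graph.values() for v, _, _ in edges}
    chosen = []
    for _ in range(k):
        base = expected_positive_spread(graph, chosen, runs)
        best = max(
            sorted(nodes - set(chosen)),  # sorted for a deterministic tie-break
            key=lambda n: expected_positive_spread(graph, chosen + [n], runs) - base,
        )
        chosen.append(best)
    return chosen

# Toy signed graph, invented for illustration: "a" reaches more nodes than "d",
# but two of its edges are negative.
toy_graph = {
    "a": [("b", 1.0, +1), ("c", 1.0, -1), ("g", 1.0, -1)],
    "d": [("e", 1.0, +1), ("f", 1.0, +1)],
}
first_seed = greedy_seeds(toy_graph, 1, runs=5)
```

On this toy graph, a sign-blind count would favor `a` (four activated nodes), whereas the positive-spread objective favors `d` (three positive nodes against two for `a`), which is the kind of difference the abstract attributes to ignoring relationship polarity.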
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctoral
[Year degree awarded]: 2014
[Classification number]: TP393.092

Article ID: 1669338


Article link: http://sikaile.net/guanlilunwen/ydhl/1669338.html
