Research on Information Diffusion in Online Social Networks
Published: 2018-03-26 19:39
Topic: social networks. Focus: information diffusion. Source: Harbin Institute of Technology, doctoral dissertation, 2014
[Abstract]: The rapid growth of social networking sites such as Facebook and Twitter indicates that social media has become a focus and trend in the development of network technology. Users of social media can establish various relationships (following, friendship, and so on), which give rise to a variety of virtual online social networks. Users in these networks can not only publish information but also spread it by sharing, forwarding, and similar actions. Online social networks therefore support both the publication and the diffusion of information. Research on information diffusion in online social networks can help users obtain useful information, help companies promote products, and help governments manage public opinion, so it has considerable practical value. Taking real online social network data and information diffusion data as its object of study, this dissertation builds an overall framework for research on information diffusion in online social networks and investigates four problems within that framework: describing user interests, modeling information diffusion, maximizing information diffusion, and combining information diffusion with user recommendation. The work consists of the following four parts.

First, traditional information retrieval research usually describes a user's interests with a term vector, with each term weighted by TF-IDF. Social media contains ternary relational data linking users, resources, and tags. The traditional term vector model cannot make full use of this ternary relation to describe user interests accurately, and it also suffers from polysemy. To address these problems, the dissertation proposes a tag network model for describing user interests. In a tag network, nodes represent tags and edges represent relationships between tags. Both nodes and edges are weighted: node weights express the user's degree of interest, and edge weights express the strength of association between interests. The dissertation also proposes an improved TF-IDF method for computing tag weights. Experiments on the MovieLens and CiteULike datasets confirm the effectiveness of the proposed method.

Second, information diffusion prediction models can be applied to early warning of public opinion and to the identification of explosively spreading information, so they are of both research and practical importance. Most current prediction models have two shortcomings: they cannot make time-dependent predictions of diffusion, and they take a long time to train. To address these problems, the dissertation proposes a new information diffusion prediction model, the GT model. Unlike earlier models, the GT model does not treat nodes as passively acting under the influence of their neighbors. Instead it treats them as autonomous, intelligent, rational individuals: each user computes the payoff of the available choices and chooses rationally. Because the model introduces a time-dependent user payoff, it can predict the temporal dynamics of the diffusion process. The dissertation also proposes computing user payoff by combining global influence with social influence. Experiments on Sina Weibo and Flickr datasets verify the effectiveness of the model in predicting the temporal dynamics of information diffusion.

Third, current research on information diffusion maximization is carried out almost entirely on unsigned social networks, which contain only positive relationships such as friendship or trust. Information diffusion maximization on signed social networks remains a challenging and neglected problem. If the polarity of relationships between users is ignored and a signed network is crudely treated as unsigned, then both the positive and the negative influence of users are mistaken for positive influence. To address this, the dissertation extends the problem to signed social networks, defines the polarity-related information diffusion maximization (PRIM) problem and a polarity-related independent cascade model, and proposes a greedy algorithm for solving the problem. Experiments on two signed social network datasets (Epinions and Slashdot) show that the proposed method solves the PRIM problem better than a greedy algorithm that ignores relationship polarity and better than other heuristic methods.

Fourth, social networks have two main functions: social interaction and information diffusion. User recommendation helps users find suitable friends based on their preferences and on the network structure, which strengthens the social interaction function. At the same time, user recommendation creates new links in the network, which accelerates the evolution of the network and changes its structure, and this directly affects information diffusion. Most user recommendation methods overlook this effect. To address it, the dissertation proposes the concept of user diffusion degree and a method for computing it. User diffusion degree can be used to re-rank the results of a traditional recommendation algorithm so that the recommendations promote information diffusion. Experiments on Email and Amazon data confirm the effectiveness of the proposed user diffusion degree. In addition, the dissertation proposes a hypergraph-based user recommendation algorithm that can be used together with user diffusion degree. Results on a Sina Weibo dataset show that it outperforms earlier methods on recommendation metrics.
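The abstract does not give the details of the polarity-related independent cascade model or of the greedy algorithm, so the following Python is only a minimal sketch of the general idea under stated assumptions. It assumes that seed nodes start in a positive state, that a node activated over an edge takes its activator's state multiplied by the edge sign (+1 for trust, -1 for distrust), and that the objective is the expected number of positively activated nodes, estimated by Monte Carlo simulation. The function names, the graph layout, and the polarity rule are all hypothetical and are not taken from the dissertation.

```python
import random


def simulate_signed_ic(graph, seeds, rng):
    """Run one sign-aware independent cascade (illustrative assumption).

    graph maps node -> list of (neighbor, probability, sign), sign in {+1, -1}.
    Returns a dict mapping every activated node to its state (+1 or -1).
    """
    state = {s: 1 for s in seeds}  # assumed: seeds are positively activated
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, prob, sign in graph.get(u, []):
                # Each newly active node gets one attempt per outgoing edge.
                if v not in state and rng.random() < prob:
                    state[v] = state[u] * sign  # assumed polarity rule
                    next_frontier.append(v)
        frontier = next_frontier
    return state


def expected_positive_spread(graph, seeds, runs=200, seed=0):
    """Monte Carlo estimate of the number of positively activated nodes."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        state = simulate_signed_ic(graph, seeds, rng)
        total += sum(1 for s in state.values() if s == 1)
    return total / runs


def greedy_seeds(graph, k, runs=200, seed=0):
    """Pick up to k seeds, each time adding the node with the largest gain."""
    nodes = set(graph) | {v for nbrs in graph.values() for v, _, _ in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        base = expected_positive_spread(graph, chosen, runs, seed) if chosen else 0.0
        best_node, best_gain = None, float("-inf")
        for cand in sorted(nodes - set(chosen)):  # sorted for reproducible ties
            gain = expected_positive_spread(graph, chosen + [cand], runs, seed) - base
            if gain > best_gain:
                best_node, best_gain = cand, gain
        chosen.append(best_node)
    return chosen
```

Calling `greedy_seeds(graph, k)` on a dictionary of signed, weighted edges returns a list of seed nodes. A sign-blind baseline of the kind the abstract compares against could be obtained by counting every activated node rather than only the positive ones.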
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctorate
[Year degree conferred]: 2014
[Classification number]: TP393.092
Article ID: 1669338
Article link: http://sikaile.net/guanlilunwen/ydhl/1669338.html