Link Prediction in Multilayer Collaboration Networks

Published: 2018-05-29 03:50

  Topic of this thesis: social network analysis (Social + network); reference: Beijing Jiaotong University, master's thesis, 2017


[Abstract]: The study of complex networks and their properties has attracted wide attention from scholars in many fields. A complex network is an abstract representation of a real-world network. Such networks are highly dynamic by nature and evolve continuously. Moreover, complex networks started out small, but in the era of big data their scale is growing at an astonishing rate, so studying and analyzing large dynamic networks is a major challenge for network scientists. Many real-world systems can be modeled as collaboration networks among actors (the actors represented by nodes can be users, authors, papers, projects, proteins, and so on), and such networks change over time. A complex network can also be an online social network that describes social relationships between people, such as interactions between friends. Collaboration networks capture certain business relationships (for example, academic co-authorship or co-publication networks and product co-purchasing networks). Biological systems (such as protein interaction networks) and computer science networks (such as the Internet and peer-to-peer networks) are also complex networks. These systems use the nodes of a graph to represent actors, and the links between nodes to represent the interactions, collaborations, or influences among them. All types of complex networks share some common topological properties, such as a small diameter or average distance, a power-law degree distribution, a high clustering coefficient, and community structure.

In recent years, a large body of research has focused on the properties of single-layer networks in complex systems. However, studies have shown that for large complex systems such as transportation networks and social networks, which are huge and structurally diverse, studying a single layer alone makes it hard to capture the system's link patterns. A natural idea is to fuse the multiple layers of a complex system and study them together. But if the layers are simply superimposed and link prediction is performed on the merged network, much useful information is lost. For complex systems it is therefore necessary to consider the multilayer structure of the network rather than studying its links from a single-layer perspective only; the layers must be considered simultaneously to discover the regularities in the link patterns. In this study, to better capture the characteristics of complex networks, we build a multilayer network model that takes the characteristics of two networks into account at the same time, and on this basis we propose a link prediction method for the multilayer network model.

The theory of multiplexity originated in social networks. In social network analysis, people are connected by many kinds of relationships, and their interactions differ with the type of relationship. For example, two people in a social network may be close friends, neighbors, or colleagues. The multiple relationships between two people are referred to as relational diversity. This is not unique to social network analysis: the entities of other real-world complex systems also exhibit relational diversity.

The core problem of this study is link prediction, that is, predicting whether a link exists between two nodes. It is important in complex network analysis and is a major direction in graph mining. Researchers had already contributed much to this area, but it was only in the early 21st century, after machine learning and data mining were applied to it, that methods emerged for extracting knowledge by jointly exploring the structural properties of highly linked data. A drawback of traditional machine learning methods is that they cannot accurately understand and exploit the relational information between entities, whereas the newer methods make full use of relational data. When mining data on a graph, the edges supply the associations between entities, so graph mining methods are effective for mining relationships between entities. Predicting new links in a network, the so-called "future link prediction" problem, means predicting the links that will appear in the future by studying the record of links appearing or disappearing over a period of time. Link prediction is widely applied in different fields, for example recommending friends on social networking sites and identifying hidden criminal relationships. In medicine and biology, a suitable method with sufficiently high prediction accuracy can guide experiments, which narrows their scope, raises their success rate, greatly reduces their cost, and saves much time and labor. Link prediction research is also closely tied to the question of network evolution mechanisms: understanding how a network evolves yields the regularities of its change, and these regularities are likely the main driving force behind link formation in complex networks.

In recent years scholars have proposed many link prediction models. Most existing methods consider only simple single-layer networks in which all links are of the same type. In reality, however, many networks are heterogeneous and involve different types of links and nodes. For example, when looking at interactions between scientists, different types of links can be defined: two scientists may be linked if they have co-published research papers, if they have presented papers at the same conference, or if they work in the same research field; they may also be linked if they cite other scientists' work in their own articles. Co-authorship networks or networks of scientists can be modeled better as multilayer networks, and the heterogeneous link information can be used to good effect to improve link prediction results. Collaboration networks, especially scientific collaboration networks, carry rich information that can be used for research on network analysis tasks such as link prediction, community detection, and node identification. Because they contain different kinds of link information, these networks have also been used to study the heterogeneous nature of complex networks. In the large, complex system that is the scientific collaboration network, it is therefore very meaningful to study how to use multilayer network information to predict potential future links.

In this work, we therefore study how to use a multilayer network to predict the links likely to form in one of its layers in the future, making full use of the rich features and information in the multilayer network. In the scientific collaboration network we construct, a link in a given layer represents a collaboration between two authors in a particular journal; such predictions are potentially valuable for discovering potential collaborations and recommending collaborators. As a general approach to multilayer link prediction, our method also generalizes well, because it is conceptually and structurally related to many practical network analysis problems. We evaluate our link prediction method mainly on a scientific collaboration network, with data from the APS (American Physical Society) dataset.

Link prediction is a binary classification problem. Of the many classification algorithms in machine learning, we chose logistic regression based on our data and experimental results. Logistic regression is a supervised learning algorithm; in supervised learning, each example consists of a feature vector and a corresponding output label. In our problem, the input is a set of features for a pair of nodes, including single-layer network features and multilayer network features, and the output is whether a link exists between the pair during a future period. Our contribution is that, unlike traditional single-layer link prediction, we build a feature set based on the multilayer network, making full use of the multiple relationships between two nodes across different networks.

Specifically, we extract the relevant information from the dataset and build three single-layer networks from the paper data of one specific field. Two of them are generated from 2000-2004 data from two journals, and the third is generated from 2005-2009 data from one of those journals. Our goal is to predict the collaborations between scientists that may appear in the journal "PHYSICAL REVIEW LETTERS" during 2005-2009. Unlike ordinary single-layer link prediction, and to make full use of the interactions between scientists, we build a multilayer network whose first layer holds the collaborations between scientists in "PHYSICAL REVIEW LETTERS" during 2000-2004 and whose second layer holds the collaborations between scientists in "PHYSICAL REVIEW E" over the same period. We use the data in this multilayer network to predict collaborations between scientists in "PHYSICAL REVIEW LETTERS" during 2005-2009. In the proposed multilayer link prediction model, we construct two classes of link features: the first class consists of basic single-layer network features, and the second consists of more complex features based on the multilayer network. We apply our method to the dataset described above.

To demonstrate the effectiveness of the proposed multilayer link prediction model, we designed a comparison experiment. In it, we use only the links in the 2000-2004 "PHYSICAL REVIEW LETTERS" collaboration network to predict the links in that journal's 2005-2009 collaboration network, which is the traditional single-layer link prediction approach. The results show that the proposed multilayer model, which uses the 2000-2004 information from both "PHYSICAL REVIEW LETTERS" and "PHYSICAL REVIEW E" and thereby enriches the information about each pair of collaborators, achieves higher accuracy than traditional single-layer link prediction. On the AUC metric in particular, the mean for our method improves substantially over the mean for the baseline method. Beyond AUC, our method also improves on the F1 score, accuracy, and recall.
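The pipeline described in the abstract (features for a node pair drawn from two layers, fed to logistic regression) can be illustrated with a minimal sketch. The abstract does not list the features the thesis used, so the three features here (common neighbors in the target layer, common neighbors in the auxiliary layer, and whether the pair is already linked in the auxiliary layer) are assumptions chosen for illustration. The author names `a` to `e`, the toy edge lists, and all function names are hypothetical, and the hand-written gradient-descent trainer stands in for whatever logistic regression implementation the thesis relied on.

```python
import math
from itertools import combinations


def build_layer(edges):
    """Build an undirected layer as an adjacency dict {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(u, v, target, aux):
    """Hypothetical feature vector for the pair (u, v); not the thesis's feature set."""
    no = set()
    cn_target = len(target.get(u, no) & target.get(v, no))  # single-layer feature
    cn_aux = len(aux.get(u, no) & aux.get(v, no))            # multilayer feature
    linked_aux = 1.0 if v in aux.get(u, no) else 0.0         # multilayer feature
    return [float(cn_target), float(cn_aux), linked_aux]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability that the pair with features x forms a link."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data (invented): two past layers and the future links of the target layer.
prl_past = build_layer([("a", "b"), ("b", "c"), ("c", "d")])  # target layer, earlier period
pre_past = build_layer([("a", "c"), ("b", "d"), ("d", "e")])  # auxiliary layer, same period
prl_future = {("a", "c"), ("b", "d")}                          # links to predict

# Candidates are the pairs not yet linked in the target layer.
nodes = sorted(set(prl_past) | set(pre_past))
pairs = [(u, v) for u, v in combinations(nodes, 2) if v not in prl_past.get(u, set())]
X = [pair_features(u, v, prl_past, pre_past) for u, v in pairs]
y = [1.0 if pair in prl_future else 0.0 for pair in pairs]

w, b = train_logistic(X, y)
probs = {pair: predict_proba(x, w, b) for pair, x in zip(pairs, X)}
for pair, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(pair, round(p, 3))
```

Dropping the two auxiliary-layer features from `pair_features` gives a single-layer baseline in the spirit of the comparison experiment. On real data the model would be scored on held-out pairs with metrics such as AUC and F1 rather than on its own training pairs as in this toy run.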
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: O157.5

Related master's theses (top 1)

1 WALEED JAMIL; Link Prediction in Multilayer Collaboration Networks [D]; Beijing Jiaotong University; 2017



Article ID: 1949374

Article link: http://sikaile.net/kejilunwen/yysx/1949374.html

