Research on Privacy Protection Methods for Weighted Social Network Publishing Based on a Vector Model

Published: 2018-01-23 15:15

  Keywords: weighted social network; privacy protection; vector model; edge space; differential privacy; random projection. Source: Jiangsu University, doctoral dissertation, 2015. Document type: degree thesis


[Abstract]: A social network is a relatively stable system of relationships formed through interaction among social individuals. It serves as a representation model for many social phenomena and is one of the most representative real-world complex networks. As social networking services multiply, more and more individuals register on them, and large amounts of personal information are collected. To support scientific research and data sharing, data collectors need to publish social network data sets; because these data sets contain sensitive information about individuals, publishing them puts individual privacy at risk of disclosure. As public awareness of privacy grows, privacy leakage has become a major obstacle to data publishing, so privacy-preserving processing is required before a social network is published. Most existing work on privacy-preserving social network publishing targets unweighted networks. In an unweighted network, a connection between individuals is a Boolean relation: it indicates only whether an interaction exists and cannot express differences in interaction strength. A growing body of empirical research shows that connections between individuals differ in coupling strength and are not purely Boolean. Examples include the closeness of relationships between people, bandwidth on the Internet, the number of flights or seats between airports in an airline network, and the number of collaborations between scientists in a collaboration network, all of which are important factors affecting network properties. It is therefore necessary to introduce into the network topology a quantity that measures the degree of coupling between nodes, namely a weight on the edge between two nodes that measures the strength of their relationship.
Because edge weights are introduced, a weighted social network carries richer information than an unweighted one, so studying privacy protection for weighted social network publishing is both necessary and worthwhile. For weighted social networks, this dissertation proposes designing privacy protection methods around a local perturbation strategy based on a vector model in order to publish the data. The specific contributions are as follows. (1) Publishing scenarios for weighted social networks are delimited according to two performance indicators, privacy protection quality and published data utility, and concrete scenario definitions are given. To protect privacy in social network publishing, one must first determine the publishing scenario, that is, identify the attacker's background knowledge, the intended use of the published data, and the private information to be protected; only then can an effective protection strategy be chosen and a privacy protection method designed. For social network publishing, the two key indicators of a privacy protection method's performance are privacy protection quality and published data utility. Depending on the characteristics of the data and the actual publishing requirements, a data publisher may face three choices: first, to maximize published data utility subject to an acceptable level of privacy protection quality; second, to maximize privacy protection quality subject to an acceptable level of published data utility; third, to balance the two and seek a trade-off between them.
Three publishing scenarios are defined for these three choices. In each scenario, the nodes of the weighted social network (including the weights of the edges between nodes) are taken as the private information, the published data is intended for analysis of network structural characteristics (focusing on average path length, average clustering coefficient, and weight distribution), and the attacker is assumed to hold one of three kinds of background knowledge about nodes (degree, subgraph, or edge weight). (2) Vectors are proposed as the publishing model for weighted social networks. Building on the edge space theory of graphs, a weighted social network is described by vectors. To reduce vector dimensionality, two node-based partitioning methods, random partitioning and clustering-based partitioning, are used to build the vector model. Partitioning represents the weighted social network as a number of subgraphs; each subgraph is represented by a vector, and the set of all subgraph vectors serves as the publishing model. Compared with a dense graph on the same number of nodes, the partitioned subgraphs are sparse. Perturbing the subgraph vectors applies a local perturbation strategy to the weighted social network and thereby achieves privacy protection for its publication. (3) For the requirement of improving published data utility, a random perturbation method based on vector similarity is proposed. The method uses weighted Euclidean distance as the measure of vector similarity and builds a candidate set of publishable vectors for each subgraph according to a threshold chosen by the publisher; it then randomly selects vectors from the subgraph candidate sets to build the publishing vector set of the weighted social network, and finally constructs the published weighted social network from that vector set.
The proposed method forces the attacker to attempt re-identification within a very large result set in which every vector is equally likely, which increases the uncertainty of identification. It also increases the similarity among the vectors in each subgraph's candidate set, which keeps the published social network as similar as possible to the original one and improves published data utility. (4) For the requirement of improving privacy protection quality, a vector mapping method based on the differential privacy model is proposed. Exploiting the strong privacy guarantee of differential privacy, the method designs WSQuery, a query model for weighted social networks that satisfies differential privacy. WSQuery captures the structure of the weighted social network and returns an ordered sequence of triples as the query result set. Based on WSQuery, an algorithm satisfying differential privacy, WSPA, is designed; WSPA maps the query result set to a real-valued vector and achieves privacy protection by injecting Laplace noise into that vector. To address the relatively high error of WSPA, an improved algorithm, LWSPA, is proposed. LWSPA partitions the triple sequence in the query result set and builds a differentially private algorithm for each subsequence, which reduces the error, meets the utility requirements of the published data, and improves privacy protection quality. (5) For the requirement of trading off privacy protection quality against published data utility, a vector mapping method based on random projection is proposed.
The method describes the weighted social network with high-dimensional vectors and applies the low-distortion mapping of random projection to reduce the original high-dimensional vector set to a low-dimensional target vector set. The dimensionality reduction both removes redundancy and, through the numerical distortion introduced by the transformation, protects private information. Building on the basic vector-set random projection method, and to prevent the original data set from being reconstructed if the random projection matrix is leaked, an improved vector-set random projection method is proposed. It constructs the elements of the random matrix from a combination of two random functions, and the random mapping realized with this matrix is proved to satisfy the conditions of the Johnson-Lindenstrauss lemma. The method achieves relatively high published data utility while improving privacy protection quality, and thus realizes a trade-off between the two. (6) The three proposed vector-model-based privacy protection methods are evaluated in simulation experiments on six real data sets and compared experimentally with existing algorithms; the performance of each method is analyzed and the effectiveness of the proposed methods is verified. The execution time of the algorithms implementing the three methods is analyzed. Six algorithms related to the three proposed methods are compared under concrete privacy attacks: privacy protection quality is tested against node identification attacks using three kinds of background knowledge (degree, subgraph, and weight), and published data utility is tested on three structural parameters (average shortest path length, average clustering coefficient, and weight distribution).
The experimental results and analysis show that the three proposed privacy protection methods meet the requirements of their respective publishing scenarios and balance privacy protection quality against published data utility reasonably well.
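
The selection step described in item (3) of the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm: the abstract does not say how candidate vectors are generated or how the distance coefficients are chosen, so the function names (`edge_vector`, `weighted_euclidean`, `publish_vector`), the per-coordinate coefficients `coeffs`, and the caller-supplied `candidates` list are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def edge_vector(nodes, weights):
    """Represent a weighted subgraph as a vector with one coordinate per node pair.

    `nodes` is an ordered list of node ids and `weights` maps
    frozenset({u, v}) to an edge weight. Pairs without an edge get 0.0.
    This layout is an assumed, edge-space style encoding.
    """
    return [weights.get(frozenset(pair), 0.0) for pair in combinations(nodes, 2)]


def weighted_euclidean(x, y, coeffs):
    """Weighted Euclidean distance: sqrt(sum_i c_i * (x_i - y_i)^2)."""
    return math.sqrt(sum(c * (a - b) ** 2 for a, b, c in zip(x, y, coeffs)))


def publish_vector(original, candidates, coeffs, threshold, rng):
    """Pick, uniformly at random, one candidate within `threshold` of `original`.

    How candidates are produced is left to the caller (hypothetical input).
    """
    close = [v for v in candidates
             if weighted_euclidean(original, v, coeffs) <= threshold]
    if not close:
        raise ValueError("no candidate vector lies within the threshold")
    return rng.choice(close)
```

A smaller threshold keeps the published vector closer to the original (higher utility) but shrinks the set an attacker has to search, so the threshold is the knob that the publisher tunes in this sketch.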
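
Item (4) of the abstract relies on injecting Laplace noise into a real-valued vector. The sketch below shows only the generic Laplace mechanism, not WSPA or LWSPA, whose details the abstract does not give. The triple layout `(u, v, weight)`, the function names, and the `sensitivity` and `epsilon` parameters are assumptions; in particular, `sensitivity` is taken to be the L1 sensitivity of the whole weight vector, which the caller must supply.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_weights(triples, epsilon, sensitivity, rng):
    """Add Laplace(sensitivity / epsilon) noise to the weight of each triple.

    `triples` is a sequence of (u, v, weight); endpoints are left unchanged.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = sensitivity / epsilon
    return [(u, v, w + laplace_noise(scale, rng)) for u, v, w in triples]
```

A smaller `epsilon` gives a larger noise scale, which is the usual trade of utility for stronger privacy. Note that this sketch perturbs weights only and leaves the edge set visible, so on its own it does not hide which edges exist.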
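
The low-distortion mapping mentioned in item (5) of the abstract can be illustrated with a standard Gaussian random projection. This is a textbook construction, swapped in for illustration: the abstract says the improved method builds matrix elements from a combination of two random functions but does not specify them, so that construction is not reproduced here. All function names and the dimensions used are assumptions.

```python
import math
import random


def random_projection_matrix(k, d, rng):
    """Build a k x d matrix with i.i.d. N(0, 1/k) entries.

    The 1/sqrt(k) scaling makes squared lengths preserved in expectation.
    """
    return [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)]
            for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional vector to k dimensions by matrix-vector product."""
    return [sum(r * xj for r, xj in zip(row, x)) for row in matrix]


def distance(x, y):
    """Plain Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
```

Because the projection is linear, the distance between two projected vectors equals the length of the projected difference, so pairwise distances are approximately preserved when `k` is large enough. Anyone who obtains the matrix can attempt to estimate the original vectors, which is the leakage risk the abstract cites as motivation for the improved method.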

[Degree-granting institution]: Jiangsu University
[Degree level]: Doctorate
[Year degree conferred]: 2015
[Classification number]: TP309

