Research on Privacy Protection Methods for the Joint Publication of Set-Valued Data and Social Networks

Published: 2019-04-18 15:17

[Abstract]: With the rapid growth and spread of the Internet, applications such as WeChat, Facebook, and online shopping platforms generate massive amounts of data. The latent associations among these data have great social and economic value, for example in group behavior analysis and in supporting business decisions. Because such data usually contain private information about many users, they must be privacy-protected before being released to data miners; otherwise private information is easily disclosed. Data privacy protection has been an active research area in recent years and has produced many results, but existing work mainly protects a single type of data. In the big data era, data mining has become multi-source: for example, social network data and transactional data are mined together to solve the cold-start problem of shopping recommendation systems. With multi-source data, the attacker's additional background knowledge raises new privacy problems, and existing privacy protection methods are no longer suitable for the joint publication of multi-source data. Compared with relational data, set-valued data are high-dimensional and sparse, so privacy protection methods designed for relational data do not carry over: protecting set-valued data with the k-anonymity model, for instance, causes excessive information loss. The ρ-uncertainty model balances privacy protection and information loss better, and many recent studies on set-valued data privacy are based on it. There are also many protection models for social network data, such as k-degree anonymity and l-diversity, which meet their privacy requirements by adding or deleting edges or nodes. These models protect a single type of data, but when social network data and set-valued data are published jointly, the additional background knowledge pushes the probability of disclosing the victim's information above ρ, which violates the privacy requirement. For the joint publication of social network data and set-valued data, this thesis therefore proposes a grouped ρ-uncertainty privacy protection model.

The main work is as follows. First, the thesis analyzes the existing privacy protection models for set-valued data and for social network data and proposes an attack model for joint data publication, against which the existing single-data-type models are no longer adequate. Given background knowledge of any items in the set-valued data, the ρ-uncertainty model guarantees that the probability of inferring a sensitive item does not exceed ρ. This holds when the set-valued data are published alone. When they are published together with a social network, however, an attacker who also knows how many friends the victim has in the social application, that is, the degree of the victim's node in the social network data, can infer the victim's sensitive items in the set-valued data with probability greater than ρ, so the privacy requirement is not met.

Second, to counter this attack model, the thesis combines the ρ-uncertainty model with the degree anonymity model and proposes the grouped ρ-uncertainty privacy protection model. The model first requires a generalization tree built from item attributes; for example, apple and banana generalize to fruit. The set-valued data are then grouped according to the generalization tree: records whose non-sensitive items share the same parent node in the tree are placed in one group. The model requires every group to satisfy ρ-uncertainty, and the thesis proves that if every group satisfies ρ-uncertainty, then the data as a whole also satisfy it. Finally, the nodes of the social network are grouped consistently with the grouping of the set-valued data and anonymized within each group, so that all nodes in a group have the same degree. Under the background knowledge described above, the probability of inferring the victim's sensitive items is then below ρ, which meets the anonymity requirement.

Third, based on the grouped ρ-uncertainty model, the thesis designs a privacy protection algorithm. To reduce information loss and improve data utility, the algorithm processes the set-valued data by combining local generalization with partial suppression. It applies local generalization top-down, and when the data do not meet the privacy requirement it uses partial suppression to reach it. Specializing an item downward in the generalization tree reduces information loss, but the partial suppression it may trigger increases the loss, so the algorithm evaluates the information loss before and after each step: the step is accepted if the resulting information loss is lower and rejected otherwise. When anonymizing the social network data, the algorithm preserves the community structure as far as possible to improve data utility: it preferentially deletes edges between communities and preferentially adds edges within communities, which limits the effect of edge additions and deletions on the community structure.

Finally, to validate the practicality of the algorithm, the thesis evaluates the utility of the set-valued data through measures such as information loss and the utility of the social network data through measures such as the Jaccard similarity coefficient. The experimental results show that the algorithm protects privacy while retaining good data utility.
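The attack model and the grouped defence described in the abstract can be illustrated with a small Python sketch. This is not the thesis's algorithm: the toy records, the sensitive item `medicine`, the threshold `RHO = 0.5`, the group labels `A` and `B`, and the helper names are all hypothetical. The check is also simplified, since it only considers attacker knowledge of up to two non-sensitive items, and the per-group degrees are simply assumed to be achievable by edge additions and deletions.

```python
from itertools import combinations

RHO = 0.5                      # hypothetical privacy threshold
SENSITIVE = {"medicine"}       # hypothetical sensitive item


def max_confidence(records, sensitive=SENSITIVE, max_known=2):
    """Highest confidence of any rule q -> s, where q is a set of at most
    `max_known` non-sensitive items and s is a sensitive item."""
    best = 0.0
    items = sorted(set().union(*records) - sensitive)
    for size in range(max_known + 1):
        for q in combinations(items, size):
            matching = [r for r in records if set(q) <= r]
            if not matching:
                continue
            for s in sensitive:
                conf = sum(1 for r in matching if s in r) / len(matching)
                best = max(best, conf)
    return best


def joint_attack_confidence(users, victim_degree):
    """Attacker also knows the victim's degree, so only records of users
    with that published degree remain candidates."""
    candidates = [rec for rec, deg in users.values() if deg == victim_degree]
    return max_confidence(candidates)


# user -> (purchase record, degree in the social network); toy data
users = {
    "u1": ({"apple", "milk"}, 2),
    "u2": ({"apple", "bread"}, 2),
    "u3": ({"apple", "medicine"}, 5),
    "u4": ({"banana", "milk"}, 3),
    "u5": ({"banana", "bread", "medicine"}, 3),
    "u6": ({"banana", "bread"}, 3),
}

# Set-valued data alone: no rule exceeds RHO.
alone = max_confidence([rec for rec, _ in users.values()])

# Joint publication: degree 5 is unique, so the victim is pinned down.
joint = joint_attack_confidence(users, 5)

# Grouped defence (assumed grouping): every node in a group is published
# with the same degree, so degree knowledge only narrows to a whole group.
groups = {"A": ["u1", "u2", "u3"], "B": ["u4", "u5", "u6"]}
group_degree = {"A": 2, "B": 3}
anonymized = {
    u: (users[u][0], group_degree[g])
    for g, members in groups.items()
    for u in members
}
worst = max(joint_attack_confidence(anonymized, d)
            for d in set(group_degree.values()))

print(alone <= RHO, joint > RHO, worst <= RHO)
```

In this toy data the set-valued records satisfy the threshold on their own, the unique degree of `u3` lets the attacker infer the sensitive item with certainty, and equalizing degrees inside each group brings the worst-case confidence back to the threshold because each group satisfies the check by itself.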
[Degree-granting institution]: Guangxi Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP309


Article ID: 2460136


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2460136.html
