Research on Privacy Protection Methods for Transactional Data Stream Publishing

Published: 2018-08-07 10:53

[Abstract]: In recent years, the rapid development of network information technology and its spread into everyday life have generated large volumes of data on the Internet. These data carry a great deal of information that businesses and government departments can use for commercial decision-making, scientific research, and similar purposes, which makes data sharing necessary. However, the data also contain a large amount of sensitive personal information. If data owners publish the data without appropriate processing, individuals' private information may be disclosed and those individuals may be harmed. Privacy protection in data publishing has therefore become an active research topic: it studies how to process data so that the published data do not disclose individual privacy while still retaining high utility.

Most current privacy protection techniques for data publishing target static datasets. With the spread of the Internet of Things and the arrival of the big data era, web click data, telephone call records, and purchase data from large supermarkets typically take the form of dynamically changing data streams, which are massive, real-time, and continuously changing. Traditional privacy protection techniques are not suited to this environment. Transactional data streams are a common form of data stream, for example shopping data from a store, in which each record is a set of items; such data usually contain users' sensitive information. Existing privacy protection methods for data stream publishing mainly target relational data streams. Because transactional data streams are high-dimensional and sparse, techniques designed for relational data streams cannot be applied to them directly. Because they are also real-time and dynamic, traditional methods for static transactional data cannot be applied directly in a streaming environment either.

This thesis studies privacy protection for transactional data stream publishing and proposes a method based on sliding windows. The main work is as follows. First, it combines the sliding window with the ρ-uncertainty privacy model to define a sliding-window-based ρ-uncertainty model, which requires every sliding window to satisfy ρ-uncertainty. As new data arrive and old data expire, the contents of the window change continuously and the window may stop meeting the privacy requirement; the thesis analyzes how the deleted sub-window and the added sub-window affect the sliding window, and gives a measure of information loss. Second, it builds a tree of the affected association rules from the deleted and added sub-windows, so that the sensitive association rules that cause the current window to violate ρ-uncertainty can be found quickly, and then uses suppression to delete as few items as possible until the current window satisfies ρ-uncertainty. To reduce information loss further, the thesis also proposes a method that combines suppression and generalization, using the information loss measure to decide whether an item should be deleted or generalized to bring the current window into compliance. Finally, it presents the system design and the detailed implementation of each module. The proposed method is compared with the direct application of a static transactional data anonymization method in terms of anonymization efficiency and data utility. The experimental results show that the proposed method anonymizes the data quickly while effectively preserving data utility.
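The sliding-window ρ-uncertainty requirement described in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm: the abstract does not give the formal definition, so the code assumes the usual reading of ρ-uncertainty (no association rule from a non-empty itemset to a sensitive item may reach confidence ρ or higher), caps the antecedent size, and re-checks every window from scratch instead of using the incremental affected-rule tree. All function names, the parameter `max_antecedent`, the greedy global suppression rule, and the loss count (number of suppressed item occurrences) are illustrative assumptions.

```python
from collections import deque
from itertools import combinations


def violating_rules(window, sensitive, rho, max_antecedent=2):
    """Return (antecedent, sensitive_item, confidence) for every rule
    antecedent -> sensitive_item whose confidence in the window is >= rho.
    Assumption: empty antecedents are ignored and antecedents are capped
    at max_antecedent items to keep the enumeration small."""
    support = {}
    for transaction in window:
        items = sorted(transaction)
        for size in range(1, max_antecedent + 2):
            for combo in combinations(items, size):
                support[combo] = support.get(combo, 0) + 1

    rules = []
    for itemset, count in support.items():
        for item in itemset:
            if item not in sensitive:
                continue
            antecedent = tuple(i for i in itemset if i != item)
            if not antecedent:
                continue
            confidence = count / support[antecedent]
            if confidence >= rho:
                rules.append((antecedent, item, confidence))
    return rules


def suppress_until_safe(window, sensitive, rho, max_antecedent=2):
    """Greedy global suppression (an assumed heuristic): repeatedly drop the
    sensitive item of the highest-confidence violating rule from every
    transaction. Returns the anonymized copy of the window and the number
    of item occurrences removed, used here as a crude information loss."""
    anonymized = [set(t) for t in window]
    removed = 0
    while True:
        rules = violating_rules(anonymized, sensitive, rho, max_antecedent)
        if not rules:
            return anonymized, removed
        _, item, _ = max(rules, key=lambda rule: rule[2])
        for transaction in anonymized:
            if item in transaction:
                transaction.remove(item)
                removed += 1


def publish_stream(stream, window_size, sensitive, rho):
    """Yield an anonymized snapshot for each full sliding window.
    Naive baseline: every window is re-anonymized from scratch."""
    window = deque(maxlen=window_size)
    for transaction in stream:
        window.append(frozenset(transaction))
        if len(window) == window_size:
            yield suppress_until_safe(window, sensitive, rho)


if __name__ == "__main__":
    stream = [
        {"bread", "insulin"},
        {"bread", "milk"},
        {"bread", "insulin"},
        {"milk"},
    ]
    for snapshot, loss in publish_stream(stream, 3, {"insulin"}, 0.6):
        print(snapshot, "suppressed occurrences:", loss)
```

Recomputing each window in full is the kind of direct use of a static method that the thesis compares against; the incremental approach it proposes only revisits rules touched by the deleted and added sub-windows.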
[Degree-granting institution]: Guangxi Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.13; TP309


Article ID: 2169797


Link to this article: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2169797.html
