Research on Anonymization Techniques for Data Publishing in ERP Information Systems
Published: 2018-09-01 12:50
[Abstract]: With the rapid development of the Internet and other information technologies, every field now holds massive amounts of data, and the ability of industries to collect, analyze, and mine that data has improved greatly. Data mining in particular helps uncover the value hidden in data. An important step in data mining is data publishing, which draws on third-party technology and public expertise to extract more value from the data and to support better strategic decisions. Data publishing, however, faces a major problem: privacy leakage and information security, which has become a bottleneck for further progress in data analysis and mining. To publish data without disclosing user privacy, the usual practice is to replace the attribute or attributes that uniquely identify an individual with meaningless symbols. This offers weak protection, because an attacker can use background knowledge and other available information to re-identify users and then obtain their sensitive information. Academia has proposed many techniques in response, among which anonymization is a classic privacy protection method. Because the data inside ERP information systems is highly authentic and of good quality, it is highly valuable for publishing and analysis. Against this background, this thesis studies privacy protection methods for data publishing in ERP information systems. The main work and contributions are as follows.

First, for an attack model constructed on ERP information, a privacy protection method based on k-anonymity is proposed. The experimental dataset SAP GBI 2.3 is analyzed and, taking the general characteristics of ERP information system data into account, a suitable data structure and related assumptions are proposed. On that basis a sales-order-based attack model is built and a practical data utility measure is introduced. A k-anonymity algorithm based on weighted matching is then developed for this attack model and compared with two other algorithms, using the data utility measure as the yardstick, to verify its effectiveness and superiority. The key contribution is a generally applicable sales-order-based attack model tailored to the characteristics of ERP data, together with an efficient anonymization algorithm built on it.

Second, for the ERP information system of one specific domain, the railway ERP system, a new data structure and anonymization method are proposed. Given the diversity of railway ERP data, and taking users' social network information and geographic location information into account, a hypergraph-based geosocial network (GSN) model is proposed. On this basis an attack model and an anonymity model are constructed, several data utility measures are defined, and a (κ, m)-anonymity algorithm and a (κ, m, l)-anonymity algorithm for geosocial networks are developed. Extensive experiments, with the data utility measures as the yardstick, evaluate the results at different stages and verify the effectiveness of the algorithms. The key contribution is a hypergraph-based GSN model that reflects the complex characteristics of railway ERP data, along with practical attack and anonymity models, reliable data utility measures, and a fairly complete set of anonymization algorithms built on them.
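As an illustration of the k-anonymity notion the abstract relies on, the following minimal Python sketch checks whether a table is k-anonymous over a set of quasi-identifiers. It is not the thesis's weighted-matching algorithm. The sample sales orders, the field names (`customer`, `city`, `month`, `amount`), and the helper names are all hypothetical and chosen only for illustration. It shows why replacing the identifier with a meaningless symbol is insufficient on its own, and how generalizing a quasi-identifier enlarges the groups.

```python
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return min(counts.values()) if counts else 0

def is_k_anonymous(records, quasi_identifiers, k):
    """A table is k-anonymous if every quasi-identifier combination occurs at least k times."""
    return smallest_group(records, quasi_identifiers) >= k

def generalize(record):
    """Hypothetical generalization: mask the customer ID and coarsen month to quarter."""
    year, month = record["month"][:4], int(record["month"][5:7])
    quarter = (month - 1) // 3 + 1
    return {**record, "customer": "*", "month": f"{year}-Q{quarter}"}

# Hypothetical sales orders (assumed data, not taken from SAP GBI 2.3).
RAW_ORDERS = [
    {"customer": "C1", "city": "Beijing", "month": "2017-01", "amount": 120},
    {"customer": "C2", "city": "Beijing", "month": "2017-02", "amount": 80},
    {"customer": "C3", "city": "Tianjin", "month": "2017-01", "amount": 200},
    {"customer": "C4", "city": "Tianjin", "month": "2017-03", "amount": 50},
]
QUASI_IDENTIFIERS = ["city", "month"]

# Masking only the identifier leaves every (city, month) pair unique.
MASKED_ONLY = [{**r, "customer": "*"} for r in RAW_ORDERS]
GENERALIZED = [generalize(r) for r in RAW_ORDERS]

if __name__ == "__main__":
    print(smallest_group(MASKED_ONLY, QUASI_IDENTIFIERS))
    print(is_k_anonymous(GENERALIZED, QUASI_IDENTIFIERS, 2))
```

In this toy data, an attacker who knows a customer's city and order month can still single out one row after identifier masking, whereas coarsening the month to a quarter places each row in a group of two. Real anonymization algorithms additionally try to minimize the information lost by such generalization, which is what a data utility measure quantifies.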
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP309
Article ID: 2217252
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2217252.html