Research on Provenance Data Reduction Methods

Published: 2018-02-14 08:34

Keywords: data provenance; data reduction; centrality analysis; graph clustering　Source: Shandong University, master's thesis, 2017　Document type: degree thesis

[Abstract]: Data provenance is the tracing, reproduction, and presentation of the original data from which a target data item was derived, together with the process by which that data evolved. Because it offers distinct advantages in monitoring data loss, reconstructing data, and verifying the security and trustworthiness of data, it has broad application prospects in big data engineering and information security. However, ever since provenance systems first appeared, the size of provenance data has been the bottleneck limiting their use. To keep the target data traceable, provenance data is often far larger than the target data itself, and the problem is even more pronounced for provenance systems aimed at big data engineering. Massive provenance data severely reduces the efficiency of provenance queries and drives up storage, computation, and management costs. In addition, because the associations among the data are overly complex and fine-grained, it makes provenance results harder to understand, greatly lowers the quality of data provenance, and directly hinders the adoption of provenance technology.

The approaches currently used in China and abroad to reduce provenance data, mainly redundancy-removing compression and noise filtering, cannot fundamentally solve the problem of scale. Based on the characteristics of provenance data and the structure of the provenance graph, this thesis coarsens large-scale provenance data by separating out cold data and fine-grained association data, and proposes effective methods for reducing the size of provenance data.

The main work of this thesis includes:
1. A study of a type-based layered reduction method for provenance data. The method uses the transitivity of dependencies between data items to reconstruct the dependencies between data objects, partitions the provenance data into layers by type, and strips away the "cold data" layers, whose granularity is smaller and which are used less frequently, thereby simplifying the provenance data and improving provenance efficiency.
2. A study of a reduction method based on centrality difference. The method divides task-layer data at boundaries determined by the difference in centrality between data nodes, and extracts the boundary data nodes with high influence within each task as the key provenance, thereby reducing the size of the provenance data.
3. A study of a reduction method based on correlation clustering. The method clusters data coarsely by correlation, and stores in tiers or prunes the non-boundary data that describes task details, achieving reduction from the perspective of coarse-grained clustering.

The innovations of this thesis are:
1. A type-based layered reduction method for provenance data is proposed. After partitioning provenance data into layers by object type, the method strips away the less frequently used "cold data" layers to reduce the size of the provenance data.
2. A reduction method based on centrality difference is proposed. The method uses centrality difference to identify coarse-grained task boundaries and extracts the boundary data nodes with high influence within each task as the key provenance, thereby reducing the size of the provenance data.
3. A reduction method based on correlation clustering is proposed. The method clusters provenance data according to the correlation among the data and then strips away the intra-cluster association data, thereby reducing the provenance data.

The proposed reduction methods were each evaluated on the Harvard University PASSv2 standard provenance trace dataset. The experimental results verify the feasibility and effectiveness of the proposed methods.
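The first method in the abstract, stripping a "cold" layer while relying on the transitivity of dependencies, can be illustrated with a small sketch. This is not the thesis's algorithm, whose details the abstract does not give. It assumes a provenance graph stored as `(node, depends_on)` pairs, and the function name `strip_cold_layer`, the type labels `file`, `process`, and `pipe`, and the choice of which types count as cold are all hypothetical.

```python
from collections import defaultdict


def strip_cold_layer(edges, node_type, cold_types):
    """Drop nodes whose type is in cold_types and reconnect the rest.

    edges      : iterable of (node, depends_on) pairs (hypothetical format)
    node_type  : dict mapping every node to a type label
    cold_types : set of type labels treated as the "cold" layer

    If A depends on a cold node C and C depends on B, transitivity gives
    the direct dependency A -> B, so C can be removed without losing the
    link between the nodes that are kept.
    """
    deps = defaultdict(set)
    for node, parent in edges:
        deps[node].add(parent)

    def is_cold(n):
        return node_type[n] in cold_types

    reduced = set()
    for node in node_type:
        if is_cold(node):
            continue
        # Walk through chains of cold nodes until a kept node is reached.
        stack = list(deps[node])
        seen = set()
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            if is_cold(cur):
                stack.extend(deps[cur])
            else:
                reduced.add((node, cur))
    return reduced


# Illustrative data only: a five-node dependency chain.
node_type = {
    "out.csv": "file",
    "sort": "process",
    "pipe1": "pipe",
    "grep": "process",
    "in.log": "file",
}
edges = [
    ("out.csv", "sort"),
    ("sort", "pipe1"),
    ("pipe1", "grep"),
    ("grep", "in.log"),
]

file_level = strip_cold_layer(edges, node_type, {"process", "pipe"})
print(file_level)
```

With both `process` and `pipe` treated as cold, the four original edges collapse to the single file-level dependency from `out.csv` to `in.log`. Treating only `pipe` as cold keeps the process nodes and removes one edge.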
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP311.13;TP309


Related master's theses (top 1)

1 密鴻吉; Research on Provenance Data Reduction Methods [D]; Shandong University; 2017


Article ID: 1510292


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1510292.html
