Research on Efficient Storage Management of Provenance and Its Applications in Security

Published: 2018-09-10 08:28
[Abstract]: New information is being generated explosively around the world every day. The capacity demanded of storage systems has grown accordingly, from petabytes (PB) and exabytes (EB) to today's mass storage systems built to hold "Big Data". Although new storage devices keep emerging and new storage architectures keep being proposed, the analysis and understanding of the massive data itself has stagnated. For example, when we obtain important data from the cloud, we may ask: where did this data come from, has anyone used it before, and how reliable and secure is it?
Provenance, a kind of metadata that records the history of a data object, is well suited to answering such questions: how a data object was created, what modifications it has undergone, and how the ancestors of two data objects differ. In the systems field, the provenance of a piece of data is all of the process information and related data that influenced its final state. Because provenance reveals a data object's past and how it was produced, it has broad practical value. Provenance is already used by scientists to validate important experimental datasets, to improve the efficiency of desktop search, and to audit important financial accounts, and some research is applying it to data deduplication, distributed security, and other areas. However, few studies so far address the characteristics of provenance itself. For example, provenance is voluminous, yet few good algorithms compress it substantially while still supporting efficient queries over it. Likewise, provenance records how data was generated, yet little work uses it to ensure data reliability or to analyze system intrusions based on that generation history.
A hybrid compression method that combines web graph compression with dictionary encoding is proposed to compress provenance efficiently. Exploiting the similarity between provenance graphs and web graphs, the method takes full advantage of the locality and similarity among provenance graph nodes and eliminates repeated strings inherent in provenance information. Compared with earlier compression methods, it further compresses the information on the edges of the provenance graph, works at a finer compression granularity, and supports efficient provenance queries. Experiments on a large number of provenance traces show that, among the compression schemes compared, it offers the best trade-off among compression ratio, compression time, and query performance.
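The two ingredients named above can be illustrated with a minimal Python sketch. It is not the thesis's algorithm, whose details the abstract does not give: dictionary encoding is shown as a plain string-to-ID table, and web graph compression is represented by one of its basic building blocks, gap encoding of sorted adjacency lists with variable-length integers. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def build_dictionary(names: List[str]) -> Tuple[Dict[str, int], List[str]]:
    """Dictionary encoding: map each distinct string to a small integer ID."""
    table: Dict[str, int] = {}
    for name in names:
        if name not in table:
            table[name] = len(table)
    reverse = [""] * len(table)
    for name, idx in table.items():
        reverse[idx] = name
    return table, reverse


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varints(data: bytes) -> List[int]:
    """Decode a concatenation of varints produced by encode_varint."""
    values: List[int] = []
    current, shift = 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            values.append(current)
            current, shift = 0, 0
    return values


def compress_adjacency(ancestors: List[int]) -> bytes:
    """Gap-encode a node's ancestor IDs: sort them, then store differences.

    Nearby IDs (locality) give small gaps, which take fewer bytes.
    """
    out = bytearray()
    previous = 0
    for node in sorted(set(ancestors)):
        out += encode_varint(node - previous)
        previous = node
    return bytes(out)


def decompress_adjacency(data: bytes) -> List[int]:
    """Invert compress_adjacency by summing the gaps."""
    nodes: List[int] = []
    total = 0
    for gap in decode_varints(data):
        total += gap
        nodes.append(total)
    return nodes
```

In this sketch, repeated path or process names are stored once in the dictionary, and each node's edge list shrinks when its ancestors have nearby IDs. A query only needs to decode the adjacency list of the node it visits, which is one reason this family of encodings stays query-friendly.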
A provenance-based data reconstruction method is proposed that reconstructs individual data objects, can reconstruct objects in parallel, and supports reconstruction priorities. By retracing the process that generated a data file, the method can accurately rebuild lost or damaged files. Earlier solutions for storage reliability (for example, log files, snapshots, or backups) focus on protecting an entire disk or system; by comparison, this method can rebuild a single data object, rebuild multiple data objects in parallel, and rebuild important data files first. The provenance-based data reconstruction system collects a file's provenance while the file is being read normally, automatically rebuilds files after they are lost or damaged, and can recover other affected files during the rebuild. Experimental results show that provenance-based reconstruction performs significantly better than log-based reconstruction. Although factors such as the size of the provenance database affect reconstruction, experiments show that their impact on provenance-based reconstruction performance is small.
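The idea of rebuilding a lost object by retracing how it was generated can be sketched as follows. This is an illustrative toy under stated assumptions, not the system described in the thesis: files live in an in-memory dictionary, each recorded process is a deterministic Python callable, and the names `ProvenanceRecord`, `rebuild`, and `rebuild_by_priority` are hypothetical. Parallel rebuilding is omitted.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ProvenanceRecord:
    """How one object was produced: its inputs and the step that made it."""
    inputs: Tuple[str, ...]
    transform: Callable[..., str]  # stand-in for re-running the recorded process


def rebuild(name: str, store: Dict[str, str],
            provenance: Dict[str, ProvenanceRecord]) -> str:
    """Return the object's content, regenerating it from its ancestors if lost."""
    if name in store:
        return store[name]
    record = provenance.get(name)
    if record is None:
        raise KeyError(f"{name} is missing and has no provenance to rebuild it from")
    # Missing inputs are rebuilt first, so other affected objects are recovered too.
    args = [rebuild(dep, store, provenance) for dep in record.inputs]
    store[name] = record.transform(*args)
    return store[name]


def rebuild_by_priority(lost: Dict[str, int], store: Dict[str, str],
                        provenance: Dict[str, ProvenanceRecord]) -> List[str]:
    """Rebuild lost objects in priority order (lower number = more important)."""
    queue = [(priority, name) for name, priority in lost.items()]
    heapq.heapify(queue)
    order: List[str] = []
    while queue:
        _, name = heapq.heappop(queue)
        rebuild(name, store, provenance)
        order.append(name)
    return order
```

Because each object carries its own record, a single object can be restored without rolling back a whole disk or replaying a global log, which is the contrast the paragraph above draws with snapshots and backups.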
A method that uses provenance information for intrusion detection is proposed. By collecting provenance for the processes that interact with the system, it determines in detail how an intruding process accessed and modified files, which makes it quick and convenient to judge whether the system has been intruded and to locate system vulnerabilities. The method overcomes the complexity and inefficiency of manually analyzing traditional system and network logs. Moreover, logs usually record only part of the information in system events, such as HTTP connections or login records, which makes the whole analysis very difficult. The provenance-based method treats network connections that interact with the system as file objects, collects provenance on the dependencies between system processes and file objects, and then builds a provenance graph from which an administrator can find the intrusion path. Analyzing each event along the intrusion chain identifies the system vulnerability and the source of the attack. Experimental results show that, compared with traditional methods, the provenance-based intrusion detection mechanism has a lower false alarm rate and a higher detection rate, incurs only a small space overhead, and has almost no impact on system performance.
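Finding an intrusion path in a provenance graph amounts to walking dependencies backward from a suspicious object until a network connection is reached. The sketch below shows one simple way to do that with breadth-first search. The graph layout and the `socket:`, `proc:`, and `file:` name prefixes are assumptions made for illustration; the abstract does not specify the thesis's data model or detection logic.

```python
from collections import deque
from typing import Dict, List, Optional


def find_intrusion_path(depends_on: Dict[str, List[str]],
                        suspicious: str) -> Optional[List[str]]:
    """Search backward from a suspicious object to a network connection.

    depends_on maps each node to the nodes it was derived from.
    Returns the shortest chain, ordered from the connection to the
    suspicious object, or None if no connection is among its ancestors.
    """
    parent: Dict[str, Optional[str]] = {suspicious: None}
    queue = deque([suspicious])
    while queue:
        node = queue.popleft()
        if node.startswith("socket:"):  # assumed marker for a network connection
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parent[current]
            return path
        for ancestor in depends_on.get(node, []):
            if ancestor not in parent:
                parent[ancestor] = node
                queue.append(ancestor)
    return None
```

For a hypothetical graph in which `file:/etc/passwd` was written by `proc:sh`, which was spawned by `proc:httpd`, which read from `socket:203.0.113.5:80`, the function returns that chain starting at the socket, giving the administrator each event to inspect in turn.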
A method that uses object-based active storage to significantly improve the performance of provenance processing and network transmission is proposed. Because provenance data is produced continuously and in large volume, transmitting it becomes a significant network bottleneck. Object-based active storage addresses this problem well. On the one hand, active storage offloads provenance processing from the host to the storage device, which greatly reduces the amount of provenance data that has to be sent from the storage device over the network; on the other hand, object-based storage devices have more processing power than traditional block devices and can handle provenance more intelligently and automatically. Inside an object storage device, ordinary data files and provenance database records are treated as user objects, while data processing tasks are treated as function objects, which are scheduled flexibly to carry out the system's tasks, such as provenance compression, provenance queries, and data reconstruction. The evaluation shows that object-based active storage significantly improves the performance of provenance-based data reconstruction.
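[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Doctoral
[Year degree conferred]: 2013
[Classification number]: TP333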

Article ID: 2233914

Link to this article: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2233914.html

