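Research and Implementation of a Distributed Provenance Information Storage System
Published: 2018-03-23 02:35
Topic: provenance information. Focus: storage system. Source: University of Electronic Science and Technology of China, 2016 master's thesis. Type: degree thesis.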
[Abstract]: With the rapid development and adoption of cloud computing and big data technologies, the storage and management of massive data has become a focus of attention, placing higher demands on the flexibility, scalability, and concurrency of data storage. Numerous Internet applications generate large volumes of diverse unstructured data, whereas traditional relational databases use two-dimensional tables to describe data and the relationships among data, and are therefore ill-suited to storing flexible, changeable unstructured data. To meet these demands, many new storage devices and technologies have emerged, such as SSDs, NoSQL, and distributed storage, which adapt to unstructured data scenarios, improve storage and read/write efficiency, and keep storage costs as low as possible. Faced with massive data, people often care about the life cycle of particular data: when it was created, which users have used it, how many copies exist, and so on. This information is of great importance to data management and system security maintenance, and is commonly called provenance information. Provenance information describes an object's historical trajectory and dynamic derivation process, as well as the interactions and driving relationships among objects. Over time this data grows ever larger and the relationships among objects grow ever more complex, so how to describe and store massive provenance information effectively, such that users can access it simply and efficiently, is the core question of this thesis.
To address the storage of massive provenance information, this thesis designs and implements a high-performance provenance information storage system, DBPS (Double Buffer Provenance Store). Based on the characteristics of provenance information, DBPS adopts a multi-level storage architecture, comprising a cache layer and a persistent storage layer, on top of a distributed architecture built around a central node. In the cache layer, DBPS uses a dual-cache architecture with read/write separation and designs data storage structures and indexes specific to provenance information, making it provenance-aware. In the persistent storage layer, it uses a key-value database as the underlying persistent storage engine, improving read/write efficiency while reducing the consumption of storage resources. In contrast to DBPS, most provenance systems and applications store provenance information directly in existing databases such as relational or graph databases, which requires complex data processing when reading and writing provenance information and yields lower read/write efficiency. Experimental results show that DBPS is highly efficient when creating and querying provenance object data and relatively less efficient when modifying and deleting data. Because modifications and deletions are infrequent in practical applications, the overall performance of DBPS for accessing provenance information is strong and meets users' needs well.
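The abstract does not give DBPS's internal data structures, so the following is only a minimal sketch of the general idea of a read/write-separated dual cache in front of a key-value store. The class name `DoubleBufferStore`, the `flush_threshold` parameter, the record fields, and the use of a plain dictionary as a stand-in for the persistent key-value engine are all assumptions made for illustration, not the thesis's actual design.

```python
from typing import Dict, Optional


class DoubleBufferStore:
    """Toy read/write-separated cache over a key-value store (hypothetical)."""

    def __init__(self, flush_threshold: int = 2) -> None:
        # Write buffer: absorbs newly created provenance records.
        self.write_buffer: Dict[str, dict] = {}
        # Read cache: serves repeated queries without touching the store.
        self.read_cache: Dict[str, dict] = {}
        # Stand-in for a persistent key-value engine (assumption).
        self.kv_store: Dict[str, dict] = {}
        self.flush_threshold = flush_threshold

    def create(self, object_id: str, record: dict) -> None:
        """Buffer a record and flush once the buffer is full."""
        self.write_buffer[object_id] = record
        # Drop any stale cached copy of this object.
        self.read_cache.pop(object_id, None)
        if len(self.write_buffer) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Move all buffered records into the key-value store."""
        self.kv_store.update(self.write_buffer)
        self.write_buffer.clear()

    def query(self, object_id: str) -> Optional[dict]:
        """Look up a record: read cache, then write buffer, then store."""
        if object_id in self.read_cache:
            return self.read_cache[object_id]
        record = self.write_buffer.get(object_id)
        if record is None:
            record = self.kv_store.get(object_id)
        if record is not None:
            self.read_cache[object_id] = record
        return record


store = DoubleBufferStore(flush_threshold=2)
store.create("file:a", {"created_by": "alice", "derived_from": []})
store.create("file:b", {"created_by": "bob", "derived_from": ["file:a"]})
print(store.query("file:b"))
```

In this sketch, creation only touches the write buffer and queries are mostly served from the read cache, which is one plausible reason a design of this kind would favor the create and query operations that the abstract reports as efficient.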
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP333
Article ID: 1651551
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1651551.html