內存數(shù)據(jù)庫存儲結構及索引的研究與設計
發(fā)布時間:2018-09-17 13:59
【摘要】:20世紀80年代后,隨著內存價格不斷走低,存儲芯片的集成度越來越高,數(shù)據(jù)庫中的全部數(shù)據(jù)或者大部分數(shù)據(jù)存在主存中已完全成為可行,內存數(shù)據(jù)庫的研究與應用開始逐步興起。內存數(shù)據(jù)庫在實時和高并發(fā)的應用中所展現(xiàn)出的數(shù)據(jù)訪問能力是磁盤數(shù)據(jù)庫不可比擬的,然而內存數(shù)據(jù)庫的研究與應用畢竟起步比較晚,它的很多設計理念和傳統(tǒng)的磁盤數(shù)據(jù)庫不再完全一致。由于磁盤和內存這兩種存儲介質間的的巨大的差別,基于磁盤結構的數(shù)據(jù)庫系統(tǒng)的數(shù)據(jù)結構、算法、查詢策略、索引結構等,不一定適用于內存數(shù)據(jù)庫系統(tǒng)。 本文以內存數(shù)據(jù)庫為研究方向,以內存數(shù)據(jù)庫中的存儲結構和索引為重點研究對象,關注于緩存和TLB。 在存儲結構上,深入的分析了廣泛應用于當前內存數(shù)據(jù)庫中N-Array存儲結構,指出它的不足:較差的緩存利用率。隨后文章給出了改進方案:利用離散存儲結構的思想,在單個內存頁面中對要存儲的業(yè)務表的屬性值進行分區(qū)儲存。這樣相同屬性值在內存頁面中空間局部性得到提高,在屬性查詢上該存儲結構能提供更高的緩存利用率。 在索引結構上,根據(jù)內存數(shù)據(jù)庫索引結構的研究歷程,,從當今廣泛應用于內存數(shù)據(jù)庫的T樹到研究火熱的緩存敏感樹。在緩存敏感索引樹的領域中重點分析了緩存敏感樹CSS樹、CSB+樹、HT樹,指出他們各自的優(yōu)缺點及適用情形。CSS樹更新代價太大、CSB+樹由于一味注重緩存而忽略了TLB對性能的影響。在HT樹的設計中通過把葉子節(jié)點設計成Hash桶來降低索引樹的高度從而減少TLB失配。在HT樹設計思想的指引下,本文對CSB+樹提出改進,擴大樹節(jié)點的扇出度,在兼顧緩存的同時控制索引樹的高度。相比于CSB+樹,改進的CSB+樹中增加了樹節(jié)點內分區(qū)和樹節(jié)點內的索引。最后給出了改進CSB+樹的查詢和插入時最主要的算法。最后給出了改進的CSB+樹和CSB+樹的緩存失配和TLB失配實驗對比。
[Abstract]:Since the 1980s, as the price of memory has been falling, the integration of memory chips has become more and more high, and it has become completely feasible to store all or most of the data in the database in the main memory. The research and application of memory database began to rise gradually. The data access ability of memory database in real-time and high concurrency application is incomparable to disk database. However, the research and application of memory database start relatively late. Many of its design ideas are no longer fully consistent with traditional disk databases. Because of the huge difference between disk and memory, the data structure, algorithm, query strategy and index structure of the database system based on disk structure may not be suitable for the memory database system. This paper takes the memory database as the research direction, and focuses on the storage structure and index in the memory database, focusing on the cache and TLB.. On the storage structure, this paper deeply analyzes the N-Array storage structure which is widely used in the current memory database, and points out its deficiency: poor cache utilization. Then an improved scheme is presented: using the idea of discrete storage structure, the attribute values of the business table to be stored are partitioned in a single memory page. In this way, the spatial locality of the same attribute value is improved in the memory page, and the storage structure can provide higher cache utilization on the attribute query. In the index structure, according to the research history of the index structure of the memory database, from the T tree which is widely used in the memory database nowadays to the cache sensitive tree which is the hot research. In the field of cache sensitive index tree, this paper mainly analyzes the cache sensitive tree CSS tree / CSS tree HT tree, and points out their respective advantages and disadvantages and their application. The cost of updating the CSS tree is too high. The TLB tree neglects the effect of TLB on the performance because of paying attention to cache blindly. In the design of HT tree, the TLB mismatch is reduced by reducing the height of the index tree by designing the leaf node as a Hash bucket. Under the guidance of the design idea of HT tree, this paper proposes an improvement on CSB tree to expand the fan out degree of tree node, and to control the height of index tree while taking into account the cache. Compared with the CSB tree, the improved CSB tree adds partitioning within the tree node and index within the tree node. Finally, the most important algorithms for query and insertion of improved CSB tree are given. Finally, the buffer mismatch between the improved CSB tree and the CSB tree is compared with the TLB mismatch experiment.
【學位授予單位】:湖南大學
【學位級別】:碩士
【學位授予年份】:2013
【分類號】:TP311.13;TP333
本文編號:2246148
[Abstract]:Since the 1980s, as the price of memory has been falling, the integration of memory chips has become more and more high, and it has become completely feasible to store all or most of the data in the database in the main memory. The research and application of memory database began to rise gradually. The data access ability of memory database in real-time and high concurrency application is incomparable to disk database. However, the research and application of memory database start relatively late. Many of its design ideas are no longer fully consistent with traditional disk databases. Because of the huge difference between disk and memory, the data structure, algorithm, query strategy and index structure of the database system based on disk structure may not be suitable for the memory database system. This paper takes the memory database as the research direction, and focuses on the storage structure and index in the memory database, focusing on the cache and TLB.. On the storage structure, this paper deeply analyzes the N-Array storage structure which is widely used in the current memory database, and points out its deficiency: poor cache utilization. Then an improved scheme is presented: using the idea of discrete storage structure, the attribute values of the business table to be stored are partitioned in a single memory page. In this way, the spatial locality of the same attribute value is improved in the memory page, and the storage structure can provide higher cache utilization on the attribute query. In the index structure, according to the research history of the index structure of the memory database, from the T tree which is widely used in the memory database nowadays to the cache sensitive tree which is the hot research. In the field of cache sensitive index tree, this paper mainly analyzes the cache sensitive tree CSS tree / CSS tree HT tree, and points out their respective advantages and disadvantages and their application. The cost of updating the CSS tree is too high. The TLB tree neglects the effect of TLB on the performance because of paying attention to cache blindly. In the design of HT tree, the TLB mismatch is reduced by reducing the height of the index tree by designing the leaf node as a Hash bucket. Under the guidance of the design idea of HT tree, this paper proposes an improvement on CSB tree to expand the fan out degree of tree node, and to control the height of index tree while taking into account the cache. Compared with the CSB tree, the improved CSB tree adds partitioning within the tree node and index within the tree node. Finally, the most important algorithms for query and insertion of improved CSB tree are given. Finally, the buffer mismatch between the improved CSB tree and the CSB tree is compared with the TLB mismatch experiment.
【學位授予單位】:湖南大學
【學位級別】:碩士
【學位授予年份】:2013
【分類號】:TP311.13;TP333
【參考文獻】
相關期刊論文 前10條
1 鄒樂天;;內存數(shù)據(jù)庫索引技術研究[J];電腦與信息技術;2007年03期
2 許麗花;;內存數(shù)據(jù)庫的關鍵技術研究[J];電腦知識與技術;2011年36期
3 姜華;;內存數(shù)據(jù)庫的數(shù)據(jù)組織結構分析[J];電信快報;2010年12期
4 蟲蟲;;CPU技術面面觀[J];電腦知識與技術(經驗技巧);2008年07期
5 李國徽,楊進才;內存數(shù)據(jù)庫查詢優(yōu)化[J];華中科技大學學報(自然科學版);2003年04期
6 何坤;;基于內存數(shù)據(jù)庫的分布式數(shù)據(jù)庫架構[J];程序員;2010年07期
7 吳紹春,胡國玲,李國輝,舒良才;一種內存數(shù)據(jù)庫定義及相關技術探討[J];江漢石油學院學報;1996年04期
8 鄭增威,吳震華,林懷忠;主存數(shù)據(jù)庫中針對小內存的優(yōu)化[J];計算機工程與應用;2003年16期
9 王珊;肖艷芹;劉大為;覃雄派;;內存數(shù)據(jù)庫關鍵技術研究[J];計算機應用;2007年10期
10 周游弋;董道國;金城;;高并發(fā)集群監(jiān)控系統(tǒng)中內存數(shù)據(jù)庫的設計與應用[J];計算機應用與軟件;2011年06期
本文編號:2246148
本文鏈接:http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2246148.html
最近更新
教材專著