包含完整性約束的概率關(guān)系數(shù)據(jù)庫(kù)更新和查詢優(yōu)化方法研究
本文關(guān)鍵詞: 概率關(guān)系數(shù)據(jù)庫(kù) 可能世界 基于約束的更新 函數(shù)依賴 關(guān)系查詢 出處:《華中科技大學(xué)》2016年博士論文 論文類型:學(xué)位論文
【摘要】:隨著數(shù)據(jù)清洗、傳感器網(wǎng)絡(luò)、追蹤移動(dòng)物體等應(yīng)用對(duì)不確定數(shù)據(jù)的管理要求越來越高,概率關(guān)系數(shù)據(jù)模型作為一個(gè)對(duì)不確定數(shù)據(jù)進(jìn)行有效管理的重要模型,自2003年開始引起學(xué)術(shù)界和工業(yè)界開始高度關(guān)注。從形式上說,一個(gè)概率關(guān)系數(shù)據(jù)庫(kù)是一組傳統(tǒng)關(guān)系數(shù)據(jù)庫(kù)上(可能世界)的概率分布,而完整性約束是關(guān)系數(shù)據(jù)上的重要信息,因此,提出一個(gè)包含完整性約束的概率關(guān)系數(shù)據(jù)庫(kù)模型,并研究該模型上的更新與查詢方法具有重要意義。針對(duì)目前大部分的不確定數(shù)據(jù)模型研究著重于描述具體數(shù)據(jù)之間的約束關(guān)系,而沒有考慮模式級(jí)別的約束關(guān)系的問題,提出了一個(gè)包含完整性約束的概率關(guān)系數(shù)據(jù)庫(kù)模型。不確定數(shù)據(jù)模式級(jí)別的完整性約束信息能捕捉動(dòng)態(tài)更新下的數(shù)據(jù)間的關(guān)聯(lián)關(guān)系,因此,利用基于約束的概率關(guān)系數(shù)據(jù)庫(kù)更新,自動(dòng)更新數(shù)據(jù)間的關(guān)聯(lián)關(guān)系,有效防止了概率關(guān)系數(shù)據(jù)庫(kù)包含不合理的可能世界的發(fā)生。由于現(xiàn)有將不確定數(shù)據(jù)從可能世界集合表示方式轉(zhuǎn)化為基于變量的表示方式的數(shù)據(jù)模型轉(zhuǎn)化方法導(dǎo)致元組表達(dá)式十分冗長(zhǎng),通過分析元組表達(dá)式的生成規(guī)則,提出了一個(gè)高效的數(shù)據(jù)模型轉(zhuǎn)化方法。該轉(zhuǎn)化方法基于一個(gè)消除表達(dá)式中重復(fù)變量的公式,減少了后續(xù)查詢?cè)谔幚碓M表達(dá)式的計(jì)算開銷。實(shí)驗(yàn)表明該數(shù)據(jù)模型轉(zhuǎn)化方法在沒有增加額外時(shí)間開銷的前提下,大大簡(jiǎn)化了元組表達(dá)式,且提高了后續(xù)查詢的處理效率。為了解決目前基于約束的概率關(guān)系數(shù)據(jù)庫(kù)更新方法枚舉概率關(guān)系數(shù)據(jù)庫(kù)中元組的表達(dá)式里出現(xiàn)的所有變量的取值,而導(dǎo)致的高時(shí)間復(fù)雜度的問題,提出了一個(gè)高效的更新方法。該方法只需考慮在約束中出現(xiàn)的變量取值,且采用變量替換機(jī)制更新元組的表達(dá)式,避免了概率關(guān)系數(shù)據(jù)庫(kù)中其他變量的參與。實(shí)驗(yàn)表明該方法在各種參數(shù)配置下,都優(yōu)于現(xiàn)有的更新方法。針對(duì)目前基于約束的概率關(guān)系數(shù)據(jù)庫(kù)更新方法,在獲取相關(guān)變量滿足約束的取值集合這個(gè)十分耗時(shí)的重要步驟中,沒有考慮針對(duì)常見的函數(shù)依賴約束的特征進(jìn)行優(yōu)化的問題,提出了兩種更新優(yōu)化策略。剪枝策略將相關(guān)元組的表達(dá)式單獨(dú)遍歷,避免了遍歷一個(gè)由各相關(guān)元組表達(dá)式組合而成的復(fù)雜表達(dá)式,減少了遍歷到的變量數(shù)量,從而減少了獲取相關(guān)變量滿足約束的取值集合的時(shí)間。在剪枝策略的基礎(chǔ)上,變量消除策略合并多個(gè)滿足約束且對(duì)應(yīng)相同可能世界的變量取值來最小化新生成的變量數(shù)目,利于后續(xù)的查詢處理。實(shí)驗(yàn)結(jié)果表明剪枝策略能進(jìn)一步提高基于約束的概率關(guān)系數(shù)據(jù)庫(kù)更新方法的效率,而變量消除策略能在不帶來額外開銷的情況下減少新生成的變量數(shù)量。針對(duì)目前大部分的概率關(guān)系數(shù)據(jù)庫(kù)上的一般查詢優(yōu)化方法著重于研究加速查詢結(jié)果世系邏輯表達(dá)式,而沒有考慮在查詢處理過程中生成簡(jiǎn)化的結(jié)果世系表達(dá)式的問題,提出了一個(gè)利用模式級(jí)別的約束信息來簡(jiǎn)化查詢結(jié)果世系數(shù)據(jù)表達(dá)式的優(yōu)化方法。分別利用函數(shù)依賴約束和引用完整性約束這兩種模式級(jí)別的信息對(duì)兩種關(guān)系操作的世系數(shù)據(jù)給出了簡(jiǎn)化的生成方式。假設(shè)查詢對(duì)于概率關(guān)系數(shù)據(jù)庫(kù)有重要應(yīng)用價(jià)值。為了避免目前基于生成新數(shù)據(jù)庫(kù)版本通用處理方法會(huì)帶來額外更新開銷的問題,提出了一種利用條件概率來處理假設(shè)查詢的優(yōu)化方法。該方法通過計(jì)算結(jié)果在假設(shè)條件下的條件概率,避免了不必要的概率關(guān)系數(shù)據(jù)庫(kù)更新。實(shí)驗(yàn)結(jié)果驗(yàn)證了一般查詢優(yōu)化方法和假設(shè)查詢優(yōu)化方法的有效性。
[Abstract]:With data cleaning, sensor networks, motion tracking applications of uncertain data management requirements more and more high, the probability of the relational data model as an important model for the uncertain data of effective management, since the beginning of 2003 caused by industrial and academic circles began to pay close attention to. From the form, a probabilistic relational database is a group of traditional relational database (World) probability distribution, and integrity constraints are important information, so the relationship between data, propose a contains integrity constraints probabilistic relational database model, and it is important to study the update and query method of the model. At present, most of the research on uncertain data the model focuses on the relationship between the specific constraint description data, without considering the constraint mode level of the problem, propose a probability of integrity constraints The relational database model. The uncertain data model level integrity constraint information can capture the dynamic relationship between the update data, using the update probability constraints based on relational database, automatically update the correlation between data, effectively prevent the probabilistic relational database contains irrational world may occur. Because the existing will not to determine the data from possible world representation into the set of variables representing the data model conversion method based on the result tuple expression is long, by generating rules analysis tuple expression, proposes an efficient conversion method. The data model conversion method based on a eliminate duplication of variables in the expression formula, reduced in subsequent query the computational overhead processing tuple expression. The experimental results show that the data model transformation method in the absence of additional time overhead. Under the premise, greatly simplifies the tuple expression and improves the efficiency of subsequent processing, query. In order to solve the present value of all the variables based on the tuple expression probabilistic database constraint update method for enumeration of probabilistic relational database in, due to the high complexity of the problem, we propose an efficient update method. This method only need to consider the variables appearing in the constraint in the expression and the use of variable substitution mechanism to update tuples, avoid other variable probability in a relational database. Experimental results show that the method in the parameter configuration, updating method is superior to the existing in the probabilistic relational database based on constraint update method in to obtain relevant variables to satisfy the constraint set value of this very time-consuming important step, no consideration for common function dependency features into For optimization problems, put forward two kinds of optimization strategies. The pruning strategy update tuple expression separate traversal, traverse a complex expression by each tuple expression together to avoid, reduce the number of variables to traverse, thus reducing the access to relevant variables to meet set constraint value based on time. Pruning strategy, the number of variables to eliminate variables merge multiple constraints and strategies corresponding to the same possible worlds to new generation minimization, for subsequent query processing. The experimental results show that the pruning strategy can further improve the efficiency of a probabilistic relational constraint update method based on variable elimination strategy can reduce the number of the new generation do not bring extra variables in the overhead. At present, most of the probabilistic relational database query optimization method generally focuses on research The accelerated query results lineage logic expression, without considering the simplified generation during query processing results of lineage expression problems, we proposed a model level constraint information to simplify the query results of data lineage expression optimization method. Using function dependency constraints and reference to the relationship between the two operation of the lineage of data integrity the two constraint level information model was introduced to produce simplified. Suppose the query has important application value for the probabilistic relational database. In order to avoid the formation of new processing method based on the generic version of the database will bring additional update overhead, proposes the use of a conditional probability method assumes that queries to conditional probability the processing. Through the calculation results under the assumption that the probability of avoiding unnecessary updates in relational database. The experimental results verify The general query optimization method and the hypothesis query optimization method are effective.
【學(xué)位授予單位】:華中科技大學(xué)
【學(xué)位級(jí)別】:博士
【學(xué)位授予年份】:2016
【分類號(hào)】:TP311.13
【相似文獻(xiàn)】
相關(guān)期刊論文 前8條
1 李俊麗;;一種改進(jìn)的概率關(guān)系模型及其應(yīng)用研究[J];中北大學(xué)學(xué)報(bào)(自然科學(xué)版);2011年02期
2 王建衛(wèi);郝忠孝;;概率關(guān)系模式與概率XML模式轉(zhuǎn)換算法的研究[J];計(jì)算機(jī)應(yīng)用研究;2011年02期
3 汪榮貴,沈明玉,偶春生;Bayes網(wǎng)絡(luò)與關(guān)系模型的集成:概率關(guān)系模型[J];微電子學(xué)與計(jì)算機(jī);2002年03期
4 姬東耀,張軍英;一個(gè)概率關(guān)系專家數(shù)據(jù)庫(kù)模型[J];計(jì)算機(jī)工程與應(yīng)用;1999年06期
5 高紅梅,馬元元,孫志揮;基于概率關(guān)系數(shù)據(jù)庫(kù)模型的研究[J];東南大學(xué)學(xué)報(bào)(自然科學(xué)版);2000年02期
6 李石君,鄭鵬,周洞汝;一種靈活的概率關(guān)系數(shù)據(jù)庫(kù)模式及代數(shù)[J];計(jì)算機(jī)工程與應(yīng)用;1999年11期
7 江彤;金宗安;謝東;;概率數(shù)據(jù)庫(kù)的聚集查詢[J];計(jì)算機(jī)工程;2010年11期
8 ;[J];;年期
相關(guān)會(huì)議論文 前1條
1 姬東耀;蔡希堯;;一個(gè)概率關(guān)系專家數(shù)據(jù)庫(kù)模型[A];數(shù)據(jù)庫(kù)研究進(jìn)展97——第十四屆全國(guó)數(shù)據(jù)庫(kù)學(xué)術(shù)會(huì)議論文集(下)[C];1997年
相關(guān)博士學(xué)位論文 前1條
1 張彩彩;包含完整性約束的概率關(guān)系數(shù)據(jù)庫(kù)更新和查詢優(yōu)化方法研究[D];華中科技大學(xué);2016年
相關(guān)碩士學(xué)位論文 前5條
1 李俊麗;一種改進(jìn)的概率關(guān)系模型及其概率查詢問題的研究[D];太原科技大學(xué);2011年
2 徐春麗;基于動(dòng)態(tài)概率關(guān)系模型的規(guī)劃識(shí)別研究及應(yīng)用[D];江蘇科技大學(xué);2013年
3 趙夢(mèng)非;概率關(guān)系模型在負(fù)荷管理系統(tǒng)中的應(yīng)用研究[D];太原科技大學(xué);2008年
4 江彤;概率關(guān)系查詢?cè)u(píng)估與Top-k查詢技術(shù)研究[D];湖南大學(xué);2010年
5 金宗安;概率數(shù)據(jù)庫(kù)及有效查詢技術(shù)的研究[D];中南大學(xué);2009年
,本文編號(hào):1550057
本文鏈接:http://sikaile.net/shoufeilunwen/xxkjbs/1550057.html