Research on Keyword-Based Temporal Information Retrieval Methods for Relational Databases
Source: Dalian Maritime University, 2017 master's thesis. Document type: degree thesis
Keywords: temporal data graph; temporal inverted index; temporal information retrieval; relational database; keyword search
[Abstract]: With the arrival of the big data era, the large volumes of accumulated data have become an important asset across industries. Time is an important indicator of the value of data, and temporal data is attracting growing attention. How to store, manage, and retrieve temporal data effectively has therefore become a research hotspot in the database and information retrieval fields. Database information retrieval sits at the intersection of databases and information retrieval; it lets ordinary users run keyword searches over databases efficiently, and the area has produced many research results. Research on temporal information retrieval shows that incorporating temporal information into retrieval techniques makes it possible to handle users' temporal queries effectively and to retrieve the information they need quickly and efficiently. However, existing keyword search methods for relational databases do not consider the temporal nature of data and lack support for retrieving temporal data. To address this problem, this thesis takes the time dimension as its starting point and studies keyword-based temporal information retrieval over relational databases. First, it introduces the relevant theory of temporal information processing, which provides ideas and theoretical support for the temporal retrieval method studied here. Then, building on existing relational database keyword search methods, it introduces the time dimension, proposes a temporal information retrieval model, and designs a keyword-based temporal information retrieval method for relational databases.
The method has three parts: (1) a temporal data graph is constructed by analyzing the temporal entities stored in the database and the temporal relationships between them; (2) because existing indexing methods cannot support fast lookup of temporal keyword nodes, a temporal inverted index is designed, which improves lookup efficiency by partitioning the set of temporal nodes for each keyword by time; (3) a temporal retrieval algorithm, T-STAR, is designed; it mainly uses a time pruning strategy that prunes edges which do not satisfy the temporal constraints, ensuring that the results satisfy the time constraints of the temporal query, and a method for computing temporal edge weights is proposed so that the results better satisfy content relevance. Finally, a prototype system for keyword-based temporal retrieval over relational databases is implemented, and the proposed method is evaluated experimentally on the Employees and NBA temporal datasets using two metrics, P@K and MAP. The experimental results show that the method effectively improves database information retrieval effectiveness while maintaining retrieval efficiency, and meets users' temporal retrieval needs.
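The abstract does not give the data structures behind the temporal inverted index or the time pruning step, so the following is only a minimal illustrative sketch of the two ideas, not the thesis's implementation of the index or of T-STAR. It assumes integer timestamps (for example, years), closed validity intervals, and fixed-width time buckets as the partitioning scheme. The names `Interval`, `TemporalInvertedIndex`, `bucket_width`, and `prune_edges` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed validity interval [start, end]; integer timestamps are an assumption."""
    start: int
    end: int

    def overlaps(self, other: "Interval") -> bool:
        return self.start <= other.end and other.start <= self.end


class TemporalInvertedIndex:
    """Hypothetical layout: keyword -> time bucket -> {(node_id, validity interval)}."""

    def __init__(self, bucket_width: int = 10) -> None:
        # Fixed-width buckets are an assumed partitioning scheme.
        self.bucket_width = bucket_width
        self._postings = defaultdict(lambda: defaultdict(set))

    def _buckets(self, iv: Interval) -> range:
        # All bucket numbers that the interval touches.
        return range(iv.start // self.bucket_width,
                     iv.end // self.bucket_width + 1)

    def add(self, keyword: str, node_id: str, valid: Interval) -> None:
        # A node is posted into every time partition its interval spans.
        for bucket in self._buckets(valid):
            self._postings[keyword][bucket].add((node_id, valid))

    def lookup(self, keyword: str, query: Interval) -> set:
        # Only partitions touched by the query interval are scanned.
        partitions = self._postings.get(keyword, {})
        hits = set()
        for bucket in self._buckets(query):
            for node_id, valid in partitions.get(bucket, ()):
                if valid.overlaps(query):
                    hits.add(node_id)
        return hits


def prune_edges(edges, query):
    """Keep only (u, v, interval) edges whose interval overlaps the query interval."""
    return [(u, v, iv) for (u, v, iv) in edges if iv.overlaps(query)]


index = TemporalInvertedIndex(bucket_width=10)
index.add("engineer", "emp1", Interval(1990, 1995))
index.add("engineer", "emp2", Interval(2001, 2004))
print(index.lookup("engineer", Interval(1994, 1999)))
```

Partitioning by time means a query only scans the buckets its interval touches instead of the keyword's full posting set, at the cost of storing a long-lived node in several buckets. The overlap test used for pruning is one possible reading of "satisfying the temporal constraint"; a containment test would be stricter.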
[Degree-granting institution]: Dalian Maritime University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP311.13; TP391.3