Research on Fuzzy-Logic-Based Information Retrieval Methods for Relational Databases
Topic: relational database + object-level retrieval; Source: Dalian Maritime University, master's thesis, 2013
[Abstract]: Keyword search over relational databases lets users retrieve stored information without knowing SQL or the underlying database schema, as simply and conveniently as using a search engine. It has therefore become a research focus in the database query field. Because databases are designed in normalized form, retrieval results are presented at the tuple level, which leaves the results incomplete and their semantics hard to understand. Object-level information retrieval was proposed in response.

Object-level information retrieval expresses the semantics of retrieval results directly and returns more complete results. However, when the user enters keywords that are fuzzy, retrieval quality is poor. Introducing fuzzy mathematics handles the fuzzy retrieval of numeric keywords well. To process a numeric keyword, the database is first analyzed, a membership function is then proposed specifically for it, and the membership function and fuzzification operators are applied to the numeric keyword. For relevance ranking, the weights of object nodes are computed with the fuzzy inference method of fuzzy logic.

Ranking keyword search results must take into account not only the characteristics of information retrieval itself but also those of the database, chiefly tuple importance, attribute importance, and the IR score of the keywords on each attribute. Tuples and attributes in a database are retrieved with different frequencies, which indicates that they differ in importance to users. Traditional TF/IDF works well for computing the IR score of keywords on attributes, but only for non-numeric attributes; it is impractical for numeric ones. Membership functions are therefore applied to numeric keywords, which enables fuzzy retrieval on numeric attributes and effective relevance ranking.

Using these methods, the thesis designs and implements a prototype system for fuzzy-logic-based, object-level information retrieval over relational databases. The prototype is validated experimentally on the DBLP dataset and evaluated with two metrics, P@K and MAP. The experimental results show that the proposed method effectively improves the ranking of retrieval results.
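As a rough illustration of the ideas in the abstract, the sketch below scores a numeric attribute against a fuzzy keyword and then evaluates the resulting ranking. It is not the thesis's implementation: the abstract does not give the membership function, so a triangular one is assumed here, and the record names, publication years, bounds, and relevance judgments are all hypothetical. `average_precision` also assumes every relevant item appears somewhere in the ranked list; MAP is then the mean of that value over queries.

```python
from typing import Sequence


def triangular_membership(x: float, left: float, peak: float, right: float) -> float:
    """Degree in [0, 1] to which x matches a fuzzy number centred on `peak`.

    Assumed shape: the thesis abstract does not specify its membership function.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def precision_at_k(relevant: Sequence[bool], k: int) -> float:
    """P@K: fraction of the top-k ranked results that are relevant."""
    return sum(relevant[:k]) / k


def average_precision(relevant: Sequence[bool]) -> float:
    """Mean of precision@i over the ranks i that hold a relevant result."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(per_query: Sequence[Sequence[bool]]) -> float:
    """MAP: average of average_precision over all queries."""
    return sum(average_precision(r) for r in per_query) / len(per_query)


# Hypothetical records: publication year of four papers.
years = {"paper_a": 2005, "paper_b": 2003, "paper_c": 2006, "paper_d": 2010}

# Fuzzy keyword "around 2005", modelled with assumed bounds 2002..2008.
scores = {
    name: triangular_membership(year, left=2002, peak=2005, right=2008)
    for name, year in years.items()
}

# Rank records by descending membership degree.
ranked = sorted(scores, key=scores.get, reverse=True)

# Hypothetical relevance judgments for the ranked list, in rank order.
judgments = [True, False, True, False]
p_at_2 = precision_at_k(judgments, k=2)
ap = average_precision(judgments)
```

A triangular function is used only because it is the simplest common shape; a trapezoidal or Gaussian function would slot into the same ranking step without changing the evaluation code.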
[Degree-granting institution]: Dalian Maritime University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP311.132.3; TP391.3
Article ID: 1921933
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1921933.html