Research on Fuzzy-Logic-Based Information Retrieval Methods for Relational Databases
Topic: relational databases + object-level retrieval; Source: master's thesis, Dalian Maritime University, 2013
[Abstract]: Keyword search over relational databases lets users retrieve stored information without knowing SQL or the underlying database schema, as simply and conveniently as using a search engine. It has therefore become a research focus in the database query field. Because databases are designed in normalized form, search results are presented at the tuple level, which leaves them incomplete and semantically hard to interpret. Object-level information retrieval was proposed to address this.

Object-level information retrieval expresses the semantics of a result directly and returns more complete results. It performs poorly, however, when the user enters keywords that are themselves fuzzy. Introducing fuzzy mathematics handles fuzzy retrieval for numeric keywords well. To process a numeric keyword, the database is first analyzed, a membership function tailored to it is then defined, and the membership function is applied to the keyword together with a fuzzification operator. For relevance ranking, the weight of each object node is computed by fuzzy inference from fuzzy logic.

Ranking keyword search results must take into account not only the characteristics of information retrieval itself but also properties of the database, chiefly tuple importance, attribute importance, and the IR score of a keyword on an attribute. Tuples and attributes are retrieved with different frequencies, which indicates that they differ in importance to users. The traditional TF/IDF measure yields good IR scores for keywords on non-numeric attributes but is impractical for keywords on numeric attributes. Membership functions are therefore applied to numeric keywords, which enables fuzzy retrieval on numeric attributes and effective relevance ranking.

Using the methods above, this thesis designs and implements a prototype system for fuzzy-logic-based, object-level information retrieval over relational databases. The prototype was validated experimentally on the DBLP dataset and evaluated with two metrics, P@K and MAP. The experimental results show that the proposed method effectively improves the ranking of search results.
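The abstract does not give the thesis's actual membership functions or its evaluation code, so the following is only a minimal Python sketch of the general idea: score a numeric attribute against a fuzzy keyword such as "about 2010" with a membership function, rank by that score, and evaluate the ranking with P@K and MAP. The triangular shape, the `spread` parameter, all function names, and the toy rows are illustrative assumptions, not taken from the thesis or from DBLP.

```python
from typing import Callable, Iterable, List, Sequence, Set, Tuple


def triangular_membership(center: float, spread: float) -> Callable[[float], float]:
    """Membership function for the fuzzy set "about `center`".

    Assumed shape (not from the thesis): 1.0 at `center`, falling
    linearly to 0.0 at a distance of `spread` on either side.
    """
    if spread <= 0:
        raise ValueError("spread must be positive")

    def mu(x: float) -> float:
        return max(0.0, 1.0 - abs(x - center) / spread)

    return mu


def rank_by_membership(
    rows: Iterable[Tuple[str, float]], mu: Callable[[float], float]
) -> List[Tuple[str, float]]:
    """Score (id, numeric value) rows with `mu`, drop zero scores,
    and return them sorted by descending membership degree."""
    scored = [(row_id, mu(value)) for row_id, value in rows]
    matched = [(row_id, score) for row_id, score in scored if score > 0.0]
    return sorted(matched, key=lambda pair: pair[1], reverse=True)


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """P@K: fraction of the top-k results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for item in ranked[:k] if item in relevant) / k


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """AP: mean of the precision values at each rank holding a relevant
    item, divided by the total number of relevant items."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Sequence[Tuple[Sequence[str], Set[str]]]
) -> float:
    """MAP: mean of AP over (ranking, relevant set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Toy (paper id, publication year) rows; purely illustrative data.
    rows = [("p1", 2008), ("p2", 2010), ("p3", 2013), ("p4", 2011), ("p5", 2005)]
    about_2010 = triangular_membership(center=2010, spread=4)
    ranking = rank_by_membership(rows, about_2010)
    ids = [row_id for row_id, _ in ranking]
    print(ranking)
    print(precision_at_k(ids, {"p1", "p2"}, k=2))
    print(average_precision(ids, {"p1", "p2"}))
```

In this sketch a row whose value lies `spread` or more away from the keyword gets degree 0 and is not returned, which is one simple way to turn an exact-match numeric lookup into a graded one. A real system would presumably combine this degree with tuple and attribute importance, as the abstract describes.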
[Degree-granting institution]: Dalian Maritime University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP311.132.3; TP391.3