Research on Relevance Ranking Algorithms for Object-Level Retrieval Results in Relational Databases

Published: 2019-03-12 18:40
[Abstract]: With the growth of the Internet, web search engines have been enormously successful: users can find the information they need with a few simple keywords. Relational databases, the dominant form of database today, instead retrieve content through a structured query language and require users to know both that language and the database schema. This creates a natural demand for relational databases to support efficient keyword queries, since keyword search frees users from writing SQL statements.

Compared with web search, keyword search over relational databases has its own characteristics: tuples are connected by semantic relationships; attribute values hide equivalence and transitive relations; and the text stored in a database is short. Information-retrieval methods that merely perform tuple-level keyword search on a relational database are therefore a poor fit, and a relevance ranking algorithm tailored to relational databases is needed. Drawing on the characteristics of both relational databases and information retrieval, this thesis studies an object-level relevance ranking algorithm that addresses the scattering of information across tuples in tuple-level ranking. The approach has three steps: first, build a full-text index over the relational database and merge tuples according to the schema graph to obtain the required objects; second, run keyword search over the constructed objects; third, rank the retrieved results by relevance.

The proposed ranking algorithm first discovers the transitive relations between attribute values. The more often an attribute value occurs, the more closely it is related to the keyword, and information entropy is used to assign a weight to each attribute. Because entropy depends on how the data are distributed, computing it reflects the current distribution of attribute values, reveals how attribute values relate to the keywords, and yields an information-retrieval relevance score. Second, the algorithm considers the structure of each object, namely the tuples it contains and the edges between them, to obtain a database-structure relevance score. The final relevance score combines the two.

Using these methods, the thesis designs an overall framework for relevance ranking of object-level retrieval results in relational databases and implements the algorithm. The algorithm is validated on a data set of tables from the mobile-phone domain, and the results confirm that it is usable and feasible. The ranking process not only returns the objects that contain the keywords but also distinguishes between objects that contain the same keywords. Compared with traditional keyword-search ranking algorithms, the method improves the ranking quality of keyword search over relational databases.
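The abstract gives no formulas, so the following Python sketch is only an illustration of the idea it describes: weight attributes by the information entropy of their value distributions, score each object by the keywords it matches, and combine that content score with a structure score. Every name here (`attribute_entropy`, `entropy_weights`, `content_score`, `structure_score`, `rank`, the `alpha` mixing factor, and the object layout with `fields` and `edges`) is hypothetical, and the choices that higher entropy means higher weight and that fewer edges means a better structure score are assumptions, not the thesis's method.

```python
import math
from collections import Counter


def attribute_entropy(values):
    """Shannon entropy (in bits) of the value distribution of one attribute."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_weights(rows):
    """Normalized per-attribute weights derived from entropy.

    ASSUMPTION: attributes with higher entropy get higher weight.
    """
    entropies = {a: attribute_entropy([r[a] for r in rows]) for a in rows[0]}
    total = sum(entropies.values())
    if total == 0:  # every attribute is constant: fall back to equal weights
        return {a: 1.0 / len(entropies) for a in entropies}
    return {a: e / total for a, e in entropies.items()}


def content_score(obj, keywords, weights):
    """Sum the weights of the attributes whose text contains any keyword."""
    score = 0.0
    for attr, value in obj["fields"].items():
        text = str(value).lower()
        if any(k.lower() in text for k in keywords):
            score += weights.get(attr, 0.0)
    return score


def structure_score(obj):
    """ASSUMPTION: objects joined through fewer edges are more compact."""
    return 1.0 / (1 + obj["edges"])


def rank(objects, keywords, weights, alpha=0.7):
    """Rank object ids by a weighted mix of content and structure scores."""
    scored = [
        (alpha * content_score(o, keywords, weights)
         + (1 - alpha) * structure_score(o), o["id"])
        for o in objects
    ]
    return [oid for _, oid in sorted(scored, key=lambda t: -t[0])]


# Made-up sample data in the spirit of the mobile-phone tables.
rows = [
    {"brand": "Nokia", "model": "N8"},
    {"brand": "Nokia", "model": "N9"},
    {"brand": "Apple", "model": "iPhone 4"},
    {"brand": "Apple", "model": "iPhone 4S"},
]
weights = entropy_weights(rows)
objects = [
    {"id": "A", "fields": rows[0], "edges": 1},
    {"id": "B", "fields": rows[1], "edges": 3},
]
ranking = rank(objects, ["nokia", "n9"], weights)
```

In this toy data both objects contain "nokia", but only object B also matches "n9" in the higher-entropy `model` attribute, so it ranks first despite having more edges. This mirrors the abstract's claim that the ranking can distinguish objects containing the same keyword.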
[Degree-granting institution]: Dalian Maritime University
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP311.13

Article ID: 2439053
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2439053.html