Design and Implementation of a Chinese Financial Blog Search Engine Based on Domain Ontology
[Abstract]: With the rapid growth of blogs, the number of blog pages has increased geometrically, and finding the pages of interest among them has become an important problem. Specialized search engines for blog pages (blog search engines) emerged in response. This thesis studies an ontology-based financial blog search engine. Existing blog search engines fall short in three respects. First, they cannot support document-level queries, because they lack an effective method for computing the similarity of blog pages. Second, search results often fail to match the user's query intent, either because the similarity used is not semantic similarity or because the similarity values are inaccurate. Third, content-relevant results are not reliably ranked first, which depends on the ranking algorithm applied to the retrieval results. The thesis investigates these shortcomings in depth and summarizes its work in the following two areas.

1. Blog page similarity calculation. Building on a study of existing methods for computing blog page similarity, the thesis proposes an ontology-based method (the CSFBO method). In this method, financial keywords represent the content of a blog page, so the similarity of two blog pages is reduced to the similarity between their financial keywords. Keyword extraction is therefore critical. Starting from the traditional TF*IDF algorithm, the method assigns different weights to different parts of a blog page according to the characteristics of blog pages, which improves the extraction of financial keywords and, in turn, the accuracy of the similarity calculation.

2. Blog search result ranking. The thesis analyzes the BlogRank and B2Rank algorithms and, taking into account the characteristics of financial blogs, the factors that influence financial blog ranking, and the shortcomings of existing ranking algorithms, proposes a ranking algorithm for financial blog search results (the SFBS algorithm).

A financial domain ontology is constructed, the improved algorithms are applied, a financial blog search engine based on domain ontology is implemented, and a large amount of web data is collected for testing. The system implementation verifies the effectiveness of the improved algorithms and indicates that they have practical value.
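The position-weighted TF*IDF step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's CSFBO implementation: the field names (`title`, `tags`, `body`), the weight values, the smoothed IDF formula, and the function names are all assumptions chosen for illustration, and the ontology-based filtering of financial terms is omitted.

```python
import math
from collections import Counter

# Hypothetical per-field weights (assumed values, not taken from the thesis):
# a term in the title counts more than the same term in the body.
FIELD_WEIGHTS = {"title": 3.0, "tags": 2.0, "body": 1.0}


def weighted_tf(doc):
    """Return weighted term counts for one page.

    `doc` maps a field name to a list of already-segmented tokens.
    Fields missing from FIELD_WEIGHTS default to weight 1.0.
    """
    tf = Counter()
    for field, tokens in doc.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for token in tokens:
            tf[token] += weight
    return tf


def idf_table(docs):
    """Smoothed inverse document frequency over a list of pages."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        # Count each term once per page, regardless of field.
        df.update({tok for tokens in doc.values() for tok in tokens})
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def top_keywords(doc, idf, k=3):
    """Rank a page's terms by weighted TF times IDF; ties break alphabetically."""
    tf = weighted_tf(doc)
    scores = {t: f * idf.get(t, 1.0) for t, f in tf.items()}
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


pages = [
    {"title": ["stock"], "body": ["stock", "market", "today"]},
    {"title": ["bond"], "body": ["market", "today", "yield"]},
]
idf = idf_table(pages)
print(top_keywords(pages[0], idf, k=2))
```

In this toy corpus, "stock" appears in both the title and the body of the first page and in no other page, so both the field weighting and the IDF term push it to the top. A real system would tokenize Chinese text with a word segmenter and restrict candidates to terms found in the financial ontology.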
[Degree-granting institution]: Jiangxi University of Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP391.3