Design and Implementation of a Chinese Knowledge Engineering and Knowledge Service Platform
[Abstract]: The rapid development of the Internet has driven rapid growth in the number of network users, and with it a rapidly growing demand for network resources. Retrieving the information users need from this vast pool of resources is a major challenge for the Internet's development. At present, search engines and some Q&A communities, such as Baidu and Baidu Zhidao, are the main providers of information retrieval for Internet users, and they have come a long way in offering a wide range of data services. However, the accuracy of the returned data is limited: when a user needs highly accurate knowledge, neither search engines nor Q&A communities serve the need well. To address this mismatch between the rapid expansion of network information and users' demand for knowledge, this paper proposes using Chinese knowledge engineering techniques to build a Chinese knowledge base, and establishing a platform that provides Chinese knowledge services. The platform aims to give network users high-quality, efficient knowledge sharing. To construct the knowledge base, this paper proposes extracting attribute-value pairs from the infoboxes of encyclopedia pages and training a classification model on those pairs. Combining this model with modern Chinese automatic word segmentation, part-of-speech tagging, and named entity tagging, the system extracts attribute-value pairs from encyclopedia pages that do not contain an infobox. The extracted pairs populate an attribute-value database, which lets users locate the knowledge they are searching for precisely.
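As a rough illustration of the infobox step, the sketch below pulls attribute-value pairs out of an infobox and stores them per entity. It assumes MediaWiki-style `| key = value` markup. The abstract does not say which encyclopedia markup was parsed, so the sample page, the function name `extract_attribute_pairs`, and the dictionary layout are all hypothetical.

```python
# Minimal sketch, assuming MediaWiki-style "| key = value" infobox lines.
# The sample text and all names here are illustrative, not from the thesis.

SAMPLE_PAGE = """{{Infobox person
| name = 鲁迅
| birth_date = 1881年9月25日
| occupation = 作家
| spouse =
}}
鲁迅是中国现代文学的奠基人之一。"""


def extract_attribute_pairs(page_text):
    """Return (attribute, value) pairs from infobox lines, skipping empty values."""
    pairs = []
    for line in page_text.splitlines():
        line = line.strip()
        # Infobox fields start with "|" and contain "=".
        if not line.startswith("|") or "=" not in line:
            continue
        key, _, value = line[1:].partition("=")
        key, value = key.strip(), value.strip()
        if key and value:
            pairs.append((key, value))
    return pairs


pairs = extract_attribute_pairs(SAMPLE_PAGE)

# A toy attribute-value store keyed by entity name (hypothetical layout).
attribute_db = {"鲁迅": dict(pairs)}
print(attribute_db["鲁迅"]["occupation"])
```

Pairs collected this way could also serve as labeled examples for the classification model the abstract mentions. That training step is not sketched here because the abstract does not describe the model or its features.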
At the same time, users retrieving a piece of knowledge are often also interested in other knowledge related to it, so this paper proposes a Wikipedia-based method for calculating entity relatedness. The method uses the co-occurring link information contained in Wikipedia pages to calculate the degree of relatedness between two named entities. For the knowledge service, this paper uses the link-analysis-based HITS algorithm to rank retrieval results, and then calculates the similarity between each page and the user's question to determine the order of the answer pages.
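For readers unfamiliar with HITS, the following is a minimal sketch of the standard hub/authority power iteration on a small link graph. It illustrates the general algorithm only. The toy graph, the function name `hits`, and the iteration count are assumptions, and the abstract does not describe how the thesis combines HITS scores with page-question similarity.

```python
import math

# Minimal sketch of standard HITS; not the thesis's implementation.


def hits(graph, iterations=50):
    """graph maps each page to the list of pages it links to.

    Returns (authority, hub) score dictionaries, each L2-normalized.
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of pages linking to each page.
        new_auth = {n: 0.0 for n in nodes}
        for src, targets in graph.items():
            for t in targets:
                new_auth[t] += hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}

        # Hub update: sum the authority scores of the pages each page links to.
        new_hub = {n: sum(auth[t] for t in graph.get(n, ())) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}

    return auth, hub


# Hypothetical toy graph: A and B both link to C, and C links to D.
toy_graph = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
authority, hub_scores = hits(toy_graph)
ranked = sorted(authority, key=authority.get, reverse=True)
print(ranked[0])
```

In this toy graph, page C receives links from two hubs and ends up with the highest authority score, so it would be ranked first among the retrieved pages.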
[Degree-granting institution]: North China University of Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP393.09; TP391.3
Article ID: 2374655
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2374655.html