Research on Methods for Constructing a Person Relationship Graph from Weibo
[Abstract]: With the rapid growth of the Internet, users generate a large volume of data every day, much of it unstructured text that contains useful information. This thesis focuses on extracting structured data from such text. Natural-language documents describe a great many social relations between people, and extracting those relations automatically is valuable for analyzing and studying them. A bootstrapping relation extraction system can be applied effectively to the Weibo environment, and this thesis proposes four improvements to that model.
First, a graph-based ranking algorithm is proposed. The bootstrapping relation extraction model extracts pairs of person entities that stand in a given relation. To improve its performance, the algorithm re-ranks the model's output by taking into account the similarity between each extracted result and the seed set.
Second, a seed set construction model based on the target relation is proposed. Traditional seed set construction requires substantial manual intervention, which lowers experimental efficiency. The proposed method builds a Chinese semantic knowledge base from Baidu Encyclopedia, classifies the relations in that knowledge base, and then combines the knowledge base with a search engine to construct the seed set. Only three categories of relations are considered.
Third, the entity-pair similarity measure is improved. The graph-based ranking algorithm depends on an entity-pair graph, and therefore on how the similarity between two entity pairs is calculated. The thesis replaces the original measure with latent relational analysis (LRA), which reduces dimensionality and noise and thereby improves the accuracy of the calculation.
Fourth, the content-pattern similarity measure is improved. The graph-based ranking algorithm also requires a content-pattern graph, and so depends on how the similarity between content patterns is calculated. Content patterns are represented as path-enclosed trees, and a convolution tree kernel computes the similarity between them, which improves the accuracy of the similarity scores.
Finally, a visual map of person relationships is constructed, and experiments demonstrate the applicability and feasibility of the approach. The proposed method can be applied to any type of relation extraction and is highly extensible.
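The abstract does not spell out the graph-based ranking step, so the following is only a minimal sketch of one plausible form, not the thesis's actual algorithm. It assumes each entity pair is described by a sparse vector of content-pattern counts, links pairs by cosine similarity, and scores candidates with a random walk that restarts at the seed pairs. The function names, the pattern strings, the entity-pair labels, and the `damping` value are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(vectors, seeds, damping=0.85, iterations=50):
    """Rank non-seed entity pairs by a random walk with restart at the seeds."""
    nodes = list(vectors)
    # Edge weight = cosine similarity of pattern vectors (no self-loops).
    weights = {
        a: {b: cosine(vectors[a], vectors[b]) for b in nodes if b != a}
        for a in nodes
    }
    out_sum = {a: sum(weights[a].values()) for a in nodes}
    # The walk restarts uniformly over the seed pairs.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(restart)
    for _ in range(iterations):
        new_score = {}
        for n in nodes:
            incoming = sum(
                score[m] * weights[m][n] / out_sum[m]
                for m in nodes
                if m != n and out_sum[m] > 0
            )
            new_score[n] = (1 - damping) * restart[n] + damping * incoming
        score = new_score
    ranked = sorted(
        (n for n in nodes if n not in seeds),
        key=lambda n: score[n],
        reverse=True,
    )
    return ranked, score


# Hypothetical data: pattern counts observed for each candidate entity pair.
vectors = {
    ("A", "B"): {"X's wife Y": 3, "X married Y": 2},  # seed pair
    ("C", "D"): {"X's wife Y": 1, "X married Y": 1},
    ("E", "F"): {"X works with Y": 4},
    ("G", "H"): {"X married Y": 2, "X works with Y": 1},
}
seeds = {("A", "B")}

ranked, score = rank_candidates(vectors, seeds)
for pair in ranked:
    print(pair, round(score[pair], 4))
```

In this toy data the pair that shares both patterns with the seed ranks first and the pair that shares none ranks last. In the thesis, the edge weights would instead come from the LRA-based and tree-kernel-based similarity measures described above.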
[Degree-granting institution]: Xihua University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1; TP393.092