Research on a Personalized Campus Calendar System Based on Information Extraction
[Abstract]: With the rapid growth of the Internet, information has become increasingly diverse and complex, which makes it harder for users to find what they need. Extracting the required information from large volumes of everyday data has therefore become a focus of natural language processing. Information extraction, the technology studied in this thesis, addresses this need: it pulls large amounts of unordered, irregular information out of text and stores it in structured form, which plays an important role in advancing information technology. This thesis studies information extraction centered on events and time, and designs and implements a personalized campus calendar system. The main contributions are as follows. First, a Chinese entity relation extraction algorithm that combines rules with statistical models is designed and implemented. A machine learning method that combines a conditional random field model with a maximum entropy model supplements the rule-based results, improving both accuracy and recall. This method achieved good results in the Slot Filling task of the TAC-KBP evaluation. Second, a personalized campus calendar system is proposed and implemented. The system extracts event information and organizes the time information associated with each event, giving users clues for understanding an event as a whole. Time expressions in the text are extracted and normalized with a rule-based method. On this basis, a method based on the word activation force (WAF) model is proposed for identifying expressions of event start and end times. The temporal ordering of events provides further information about how events evolve. The system has been successfully applied in COSE, a campus entity search engine system.
Third, a WAF-based method for expanding a sentiment polarity lexicon and a machine-learning method for judging the sentiment orientation of text are proposed. These methods achieved good results in the opinion word extraction and polarity judgment task of the COAE 2011 evaluation. The resulting model adds sentiment orientation judgment to the campus calendar system, a function that can be further applied to monitoring campus public opinion.
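The abstract mentions rule-based extraction and normalization of time expressions but does not give the rules. The following is a minimal sketch of what such a step could look like, not the thesis's actual rule set. The function name `normalize_time_expressions`, the two patterns (absolute dates such as "2013年5月20日" and a few relative day words), and the choice of ISO 8601 dates as the normalized form are all assumptions made for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical rules (not taken from the thesis):
# absolute dates like "2013年5月20日" or "5月20日", with the year optional.
ABSOLUTE = re.compile(r"(?:(\d{4})年)?(\d{1,2})月(\d{1,2})[日号]")
# A tiny illustrative table of relative day words and their day offsets.
RELATIVE = {"今天": 0, "明天": 1, "后天": 2}


def normalize_time_expressions(text, reference):
    """Return (surface string, ISO date) pairs in order of appearance.

    `reference` is the date the text was published; it supplies the
    missing year for "5月20日" and anchors relative words like "明天".
    """
    found = []
    for m in ABSOLUTE.finditer(text):
        year = int(m.group(1)) if m.group(1) else reference.year
        try:
            resolved = date(year, int(m.group(2)), int(m.group(3)))
        except ValueError:
            continue  # skip impossible dates such as "13月40日"
        found.append((m.start(), m.group(0), resolved.isoformat()))
    for word, offset in RELATIVE.items():
        for m in re.finditer(word, text):
            resolved = reference + timedelta(days=offset)
            found.append((m.start(), word, resolved.isoformat()))
    found.sort()  # order by position in the text
    return [(surface, iso) for _, surface, iso in found]


notice = "讲座将于2013年5月20日举行,报名截止到明天。"
print(normalize_time_expressions(notice, date(2013, 5, 10)))
# [('2013年5月20日', '2013-05-20'), ('明天', '2013-05-11')]
```

Resolving every expression against a reference date is one common design choice for campus notices, where the year is often omitted. A real rule set would need many more patterns (times of day, weekdays, ranges) than this sketch covers.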
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.1