
Research on Web Information Fusion Methods Based on Granular Computing

Published: 2018-03-27 13:19

  Topic: Web mining  Entry point: granular computing  Source: Wuhan University of Technology, 2013 master's thesis


[Abstract]: With the development and spread of the Internet, enterprises increasingly operate online. The Internet has become the world's largest and broadest repository of information and knowledge, the main channel of global information dissemination, and a highly valuable source of information. Its rapid growth has made Web information more diverse, so people face considerable challenges even as they gain useful knowledge from this vast space of choices. Traditional search engines return redundant, imprecise, and fragmented results, leaving users with a heavy information-processing burden. Information fusion technology is already widely used in military, economic, and biomedical fields, and its ability to improve the confidence of information and reduce redundancy offers a new approach to Web information processing. Existing information fusion techniques show good prospects for structured data, but they are not suited to Web information, which is unstructured, large in volume, and dynamically changing.

To address these problems, this thesis processes unstructured Web information from two cognitive perspectives, "construction-integration" and "event-indexing", and studies multi-granularity fusion methods for Web information. Drawing on existing granular computing theory and Web information fusion theory, it uses Web information extraction to obtain Web information as a knowledge source, applies Web mining to extract and analyze that information in depth, and converts large amounts of uncertain, unstructured Web information into quantified, structured text information. On this basis it studies a multi-granularity fusion model and multi-granularity fusion algorithms for Web information. The main work is as follows:

(1) Web information extraction is used to extract the title, body text, publication time, information source, and other fields contained in Web information as the knowledge source. Because text information is unstructured, Web content mining is applied to mine the content in depth and express the text quantitatively, and Web structure mining is applied to mine the structural information in the text. A Web information representation model is designed that comprises concepts, content attributes, and link-structure attributes.

(2) The construction-integration cognitive model, which reflects the granularity of knowledge, is studied, and a Web information granularity space model is designed from the construction-integration perspective in combination with fuzzy quotient space theory. The event-indexing cognitive model, which reflects the relatedness of knowledge, is studied, and a Web information granularity association model is designed from the event-indexing perspective according to the characteristics of Web information.

(3) Methods for calculating text feature weights are studied, and an incremental topic clustering algorithm is studied to handle the dynamic updating of Web information. For information on the same topic, multi-granularity partitioning and representation are carried out using the Web information granularity space model, and an algorithm for generating the information granularity space is studied. The content attributes and structure attributes of Web information are analyzed, and a Web information granularity association fusion algorithm is studied in combination with the Web information multi-granularity association model. News items from the Sina news center serve as experimental data for a case analysis, which shows that the proposed method is effective.
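The abstract mentions an incremental topic clustering algorithm for dynamically updated Web information but does not describe it on this page. As a rough illustration of the general idea only, the sketch below uses a generic single-pass clustering scheme: each new document joins the most similar existing topic if its cosine similarity reaches a threshold, and otherwise starts a new topic. The class name `SinglePassClusterer`, the raw term-frequency weighting, and the threshold of 0.3 are all assumptions made for this example and are not taken from the thesis.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SinglePassClusterer:
    """Hypothetical single-pass incremental clusterer (not the thesis's algorithm)."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed similarity cut-off
        self.centroids = []         # summed term counts, one Counter per topic
        self.members = []           # document ids, one list per topic

    def add(self, doc_id, tokens):
        """Assign one tokenised document to a topic and return the topic index."""
        vec = Counter(tokens)  # raw term frequency stands in for feature weights
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            # Cosine ignores scale, so summing counts tracks the mean direction.
            self.centroids[best].update(vec)
            self.members[best].append(doc_id)
            return best
        # No sufficiently similar topic exists: open a new one.
        self.centroids.append(vec)
        self.members.append([doc_id])
        return len(self.centroids) - 1


clusterer = SinglePassClusterer(threshold=0.3)
clusterer.add("a", ["stock", "market", "rises"])
clusterer.add("b", ["stock", "market", "falls"])
clusterer.add("c", ["football", "match", "final"])
print(clusterer.members)  # [['a', 'b'], ['c']]
```

Because documents are processed one at a time and existing assignments are never revisited, new pages can be added as they arrive without re-clustering the whole collection. The result depends on arrival order and on the chosen threshold.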
[Degree-granting institution]: Wuhan University of Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP202



Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1671678.html

