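Design and Implementation of a Semantic Web System Based on Data Mining
Published: 2018-03-01 11:25
Keywords: Semantic Web; ontology; data mining; Apriori algorithm. Source: University of Electronic Science and Technology of China, master's thesis, 2014. Document type: degree thesis.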
[Abstract]: With the rapid development of the Internet, it now reaches into news, government, education, advertising, and most other areas of society. As Internet applications have spread, the focus of data mining has shifted from traditional database-based applications to Web-based ones. Driven by information technology, the Web has become the main platform for producing, processing, publishing, and handling information, and the volume of data on it is growing explosively. To help users quickly find useful information in this mass of Web data, mining useful information from Web services and documents has become a focus of current research. Web mining extracts hidden information and patterns from Web documents on the Internet. However, most Web data is unstructured or semi-structured, so traditional data mining techniques perform poorly on it. The Semantic Web is an extension of the existing Web: it makes the Web not only a platform for displaying information but also something whose content computers can understand.
This thesis studies, on the one hand, how to extract new semantic ontology structures from the Web in order to advance Web mining, and on the other hand validates by example how the studied semantic web structure applies to Web mining. The specific work is as follows. First, for the Semantic Web itself, the Protégé tool is used to study how to create an ontology and how to add instances and properties to it. Second, the mature Apriori association rule mining algorithm is applied to the ontology that has been created and populated with instances, in order to obtain new knowledge from it. Finally, for result analysis, the reasoning function of the Java-based Jena toolkit is used to perform inference over the established Semantic Web, and the resulting inference rules are compared with the knowledge obtained by the Semantic Web system studied in this thesis. The inference rules obtained are placed in an inference rule base and used to extend the Semantic Web knowledge base, which makes ontology extraction semi-automatic and captures as much new knowledge as possible from irregular data on the network. This helps overcome some shortcomings of the traditional Web and improves the efficiency with which information on the network is used.
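The Apriori step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes each ontology instance has already been flattened into a set of string items (the item names in the demo are hypothetical), and the function names `apriori` and `association_rules` are chosen here for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                # Subsets of a frequent itemset are frequent, so the lookup succeeds.
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical items, e.g. property values attached to ontology instances.
    demo = [
        {"type:Student", "takes:DataMining", "memberOf:WebLab"},
        {"type:Student", "takes:DataMining"},
        {"type:Student", "memberOf:WebLab"},
        {"takes:DataMining", "memberOf:WebLab"},
        {"type:Student", "takes:DataMining", "memberOf:WebLab"},
    ]
    for lhs, rhs, conf in association_rules(apriori(demo, 0.5), 0.7):
        print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

Rules that pass the confidence threshold would be the candidates for the inference rule base the abstract describes; how the thesis actually encodes ontology instances as transactions is not stated in this listing.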
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.09; TP311.52