Large-Scale Graph Data Mining Based on Deep Learning

Published: 2018-09-11 05:56

[Abstract]: As big-data thinking takes hold and deep learning is widely studied and applied, graph structures are increasingly used to represent large-scale, intricately connected real-world data, and mining the information hidden in large-scale graph data has become an active research topic. In an era of information explosion, traditional search engines based on keyword matching struggle to meet users' need to obtain information quickly, accurately, and easily. Knowledge graphs address these new query needs by building a semantics-based graph of information entities.

This thesis first reviews research on knowledge graphs by scholars, research institutions, and companies, and gives a comprehensive introduction to their development and construction: the origin, development, and eventual formation of the knowledge graph concept; the data sources used to build knowledge graphs; and the methods involved in construction, including ontology and entity extraction, graph construction, updating, and maintenance, as well as mining of the internal structure of knowledge graphs and their external extension applications. It then discusses future directions and challenges for knowledge graphs.

To address the computational complexity and data sparsity of large-scale graph data mining, the thesis designs a deep-learning-based network representation learning algorithm as an improvement on word2vec. Representing graph nodes as low-dimensional vectors makes it possible to apply mature machine learning algorithms and the theory and tools of linear algebra to graph data mining. Targeting multi-label classification of graph nodes, the algorithm uses partial label information to guide the walk between nodes, and then applies a logistic regression classifier to the nodes' feature representations for multi-label classification. Experimental results show that guided walking clearly improves label classification accuracy. In addition, using the node vectors produced by the network representation learning algorithm, the thesis designs a combination method for generating edge feature representations and builds a deep belief network classification model to perform link prediction on complex networks.
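The abstract does not say how label information steers the walk or which combination method builds edge features, so the following is only a minimal sketch under stated assumptions. It assumes that a walk prefers neighbors sharing a known label with the current node (the `bias` weight is hypothetical), and that an edge feature is formed from its two endpoint vectors by a simple element-wise operator. The names `guided_walk` and `edge_features` are illustrative and are not taken from the thesis.

```python
import random


def guided_walk(adj, labels, start, length, bias=3.0, rng=None):
    """Random walk biased toward neighbors that share a known label.

    adj    : dict mapping node -> list of neighbor nodes
    labels : dict mapping node -> set of labels (only some nodes are labeled)
    bias   : hypothetical weight given to label-sharing neighbors (others get 1.0)
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = adj.get(current, [])
        if not neighbors:
            break  # dead end: stop the walk early
        current_labels = labels.get(current, set())
        # Assumption: a neighbor sharing at least one label is `bias` times more likely.
        weights = [
            bias if current_labels & labels.get(n, set()) else 1.0
            for n in neighbors
        ]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def edge_features(u_vec, v_vec, op="hadamard"):
    """Combine two node vectors into one edge feature vector (assumed operators)."""
    if op == "hadamard":
        return [a * b for a, b in zip(u_vec, v_vec)]
    if op == "average":
        return [(a + b) / 2 for a, b in zip(u_vec, v_vec)]
    if op == "concat":
        return list(u_vec) + list(v_vec)
    raise ValueError(f"unknown operator: {op}")


if __name__ == "__main__":
    adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
    labels = {"a": {"x"}, "b": {"x"}}  # node "c" is unlabeled
    print(guided_walk(adj, labels, "a", 5, rng=random.Random(0)))
    print(edge_features([1.0, 2.0], [3.0, 4.0]))
```

In a pipeline like the one the abstract outlines, walks of this kind would serve as the "sentences" for a word2vec-style skip-gram model, the resulting node vectors would go to a logistic regression classifier for multi-label classification, and the edge feature vectors would go to the link prediction classifier.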
[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.13;TP181

