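Research on Mining and Analysis of Entity Relationships in E-mail Communication

Published: 2018-03-06 05:26

Topic: community detection　Focus: entity profiling　Source: University of Electronic Science and Technology of China, 2014 doctoral dissertation　Type: degree thesis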

[Abstract]: Entity-relationship mining has to keep pace with the rapid growth of network data. E-mail networks are among the most widely used communication networks: they are strongly social, have a very large user base, and their data implicitly encode real-world relationship structures, so social network analysis of e-mail networks has become an increasingly active research area. Partitioning and presenting the social structure of e-mail network data, and predicting unknown links, are key tasks of social network analysis in entity-relationship mining. They have broad prospects in commercial applications such as e-commerce and social recommendation, and in operational areas such as counter-terrorism and criminal investigation. Community detection and link prediction in particular have long been active research topics. For entity-relationship mining on large volumes of e-mail communication data, the efficiency and accuracy of community detection and the recall and accuracy of link prediction remain obstacles to practical use. Starting from existing social network analysis algorithms, this thesis studies in depth the accuracy and computational efficiency of community structure detection algorithms, and the recall and accuracy of link prediction algorithms, for e-mail entity-relationship mining. The main contributions are as follows.

(1) A new measure model for community structure detection. To address the unstable partitioning results of the modularity method, the model builds on the idea of information centrality and weights both the degree of association between nodes and the degrees of the nodes. It accurately identifies cluster centers and also provides a basis for computing similarity between nodes in the network. A new community detection algorithm (the BSM algorithm) is built on this model. Experiments on simulated and real network datasets show that it is more stable and more accurate than the modularity method, which also confirms the validity of the measure model.

(2) A fast algorithm model for community detection in large-scale complex networks. The work proceeds in two steps. First, to address the low efficiency of the first iteration of the fast Louvain algorithm, a pruning strategy is introduced, giving an improved algorithm (the FLA algorithm). Second, because the fast Louvain algorithm is based on modularity optimization and tends to converge to a local optimum, the function being optimized is modified to incorporate information such as node degrees and edge weights, giving a new algorithm, CDDW, built on top of FLA. Experiments on simulated and real network datasets show that the new algorithm model both greatly reduces computational cost and improves the accuracy of the community partition of the whole network.

(3) A new ensemble learning model for link prediction. Mainstream link prediction algorithms generally suffer from low recall and accuracy. The thesis treats link prediction as a binary classification problem and uses the error-feedback mechanism of the Boosting framework to design and implement a new link prediction model, AdaPred. To further improve accuracy and recall, a new link prediction algorithm is proposed and integrated into the AdaPred model. Empirical studies on real data, including paper co-authorship networks and e-mail networks, show that the prediction accuracy and recall of AdaPred are clearly better than those of the other algorithms compared.

(4) A visual analysis system for entity relationships in e-mail communication networks. Visualization helps move social network analysis toward practical application and will have a lasting effect on its adoption. Taking entity-relationship mining in e-mail networks as its entry point, the thesis develops an application-oriented visual analysis platform. The platform's data analysis capabilities are on a par with the international state of the art, and it is general and extensible. The prototype system passed third-party testing and the acceptance review of a National 863 Program project, with a rating of excellent.

In summary, the thesis addresses several important challenges that social network analysis faces on its way to practical use, and on that basis designs and implements a prototype visual analysis system. The results offer an efficient and feasible solution for the wider application of social network analysis. The analysis techniques used rely on network topology rather than on additional contextual information, so they scale well and can be extended to a broader range of social network data analysis scenarios.
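For readers unfamiliar with the modularity method that the abstract uses as its baseline (and that the Louvain algorithm optimizes), the following is a minimal sketch of standard Newman-Girvan modularity for an undirected, unweighted graph, Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside community c, d_c the sum of degrees in c, and m the total number of edges. It is not the thesis's BSM, FLA, or CDDW algorithm, whose details the abstract does not give; the function name `modularity` and the toy graph are assumptions made for illustration only.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition (illustrative helper).

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    community_of: dict mapping each node to a community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal_edges = defaultdict(int)  # L_c: edges with both ends in c
    degree_sum = defaultdict(int)      # d_c: sum of node degrees in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal_edges[cu] += 1
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

print(round(modularity(edges, two_groups), 3))  # 0.357
print(round(modularity(edges, one_group), 3))   # 0.0
```

Splitting the toy graph at the bridge scores higher than lumping all nodes together, which is the signal a modularity-optimizing method follows when it merges or moves nodes.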

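[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Doctoral
[Year degree granted]: 2014
[Classification number]: TP393.098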


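Related doctoral dissertations (top 1)

1 Wu Zufeng; Research on Mining and Analysis of Entity Relationships in E-mail Communication [D]; University of Electronic Science and Technology of China; 2014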


Article ID: 1573521


Article link: http://sikaile.net/guanlilunwen/ydhl/1573521.html
