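Research on Several Improvements to the Density Peak Clustering Algorithm and Their Application to Earthquake Classification

Published: 2018-03-18 15:18

  Topic: density peak clustering algorithm. Focus: Halo point identification. Source: Jilin University of Finance and Economics, 2017 master's thesis. Document type: degree thesis.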


[Abstract]: Advances in science and technology are moving human society from the industrial-economy era into the information-economy era. Information has become a key production resource, and processing terabyte-scale databases effectively has become one of the most pressing problems in data mining. Clustering, an important tool in data mining, has accordingly become an active research topic. The Density Peak Clustering (DPC) algorithm was published in Science in 2014 and has since been widely adopted across many fields. DPC nevertheless has shortcomings: (1) it cannot effectively handle data points in the low-density regions of a dataset and wrongly assigns outliers and intermediate nodes to clusters; (2) cluster centers are selected with human involvement, which reduces the objectivity and accuracy with which the true clusters are recovered; (3) it cannot effectively handle data with complex structure, and performs poorly on data with complex manifolds, varying densities, or varying cluster sizes.

To address these problems, this thesis proposes the following improvements.

(1) Because DPC cannot effectively handle points in low-density regions and wrongly assigns outliers and intermediate nodes to clusters, the thesis proposes a Halo point identification method based on density peaks (An Improved Recognition Method on Halo Node for Density Peak Clustering Algorithm, HaloDPC). By introducing the density-reachability idea of the classic DBSCAN algorithm and the structural similarity model of the SCAN algorithm, the method processes the low-density regions of the dataset separately and mines the information hidden in them.

(2) Because human involvement in choosing cluster centers reduces the objectivity and accuracy of DPC, the thesis proposes a fragment clustering method based on density peak clustering (Density Fragment Clustering without Peaks, DFC). The algorithm splits the local density sequence of the original DPC algorithm into fragments using a descending-order rule and the cutoff distance, and then clusters the fragments on the basis of structural similarity, so that cluster centers are obtained automatically.

(3) Because DPC has difficulty with complex data, the thesis proposes a semi-supervised affinity propagation clustering algorithm based on density peaks (Semi-supervised Affinity Propagation based on Density Peaks, SAP-DP). Traditional affinity propagation clusters hyperspherical and compact data quickly. However, the responsibility and availability messages in that algorithm are coupled too tightly, so its handling of streamline-shaped data, composite data, and other complex structures is too uniform; it struggles to find the correct number of clusters and does not reach reasonable results. Traditional DPC can locate cluster centers in data of arbitrary shape, so the thesis brings this strength into affinity propagation and exploits the sensitivity of density peaks to composite data. To combine the two algorithms effectively, the thesis introduces a semi-supervised approach: it establishes pairwise constraints and uses the mutual propagation of the two kinds of constraint information to update the clustering similarity matrix, improving the running efficiency and accuracy of the algorithm.

(4) The thesis extends the improved algorithms to a practical domain by applying them to a grading test on national earthquake data. Simulation experiments indicate that the improved algorithms estimate earthquake magnitude efficiently and accurately and show considerable potential for practical use. The thesis also examines the strengths and weaknesses of the improved algorithms on real data, providing a basis for further improving their accuracy and practicality.
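
For readers unfamiliar with the baseline that the thesis modifies, the following is a minimal sketch of plain DPC as it is commonly described: each point gets a local density (here the number of neighbours within the cutoff distance) and a separation (the distance to the nearest point of higher density), centers are the points where the product of the two is largest, and every other point inherits the label of its nearest higher-density neighbour. It is not the thesis's HaloDPC, DFC, or SAP-DP. The function names `density_peaks` and `assign_clusters`, the parameter `d_c`, the cutoff-kernel density, and the fixed `n_clusters` argument (standing in for the manual center selection criticized above) are all assumptions made for illustration.

```python
import numpy as np

def density_peaks(points, d_c):
    """Return rho (local density), delta (separation), and helper arrays.

    Hypothetical helper: cutoff-kernel density with cutoff distance d_c.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distance matrix.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # rho_i: number of other points closer than d_c (exclude the point itself).
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points in descending density; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: by convention, delta is its largest distance.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]  # points treated as denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher, order

def assign_clusters(rho, delta, nearest_higher, order, n_clusters):
    """Pick the n_clusters points with largest rho * delta as centers,
    then propagate labels down the density ordering."""
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest point is always kept as a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(len(rho), -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            # The nearest denser point was visited earlier, so it has a label.
            labels[i] = labels[nearest_higher[i]]
    return labels, centers

# Two small, well-separated groups (made-up data for illustration only).
offsets = [(0, 0), (0.1, 0), (0, 0.1), (-0.1, 0), (0, -0.1)]
data = [(x, y) for x, y in offsets] + [(5 + x, 5 + y) for x, y in offsets]
rho, delta, nearest_higher, order = density_peaks(data, d_c=0.15)
labels, centers = assign_clusters(rho, delta, nearest_higher, order, n_clusters=2)
```

In this sketch the number of clusters is supplied by hand, which mirrors shortcoming (2) above, and every point is forced into some cluster, which mirrors shortcoming (1).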

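[Degree-granting institution]: Jilin University of Finance and Economics
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP311.13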



