Research on a Density Peak Clustering Algorithm Based on the Normal Distribution
[Abstract]: Clustering is an important class of machine learning algorithms that partitions a data set into several categories according to similarity. Cluster analysis is widely used in machine learning, pattern recognition, bioinformatics and image processing. In 2014, Alex Rodriguez et al. proposed a new density-based algorithm in Science: density peak clustering (clustering by fast search and find of density peaks, DPC). The algorithm uses the density of each data point, together with its distance to the nearest point of higher density, to find potential cluster centers. DPC is simple and clear, produces its clustering result in a single step, and clusters well. During clustering, however, a person must analyze the decision graph and select the potential cluster centers by hand, which reduces the efficiency of the algorithm. To make the clustering automatic, this thesis uses the product Z of density and distance as a new judgement index and selects cluster centers by probability and statistics, according to the characteristics of each point in the decision graph. Because only potential cluster centers have both a high density and a large distance, their Z values are much larger than those of non-center points. Assuming that Z follows a normal distribution, an upper bound can be determined by statistical methods, and every point whose Z value exceeds that bound is automatically treated as a cluster center. The experimental results show that a probabilistic statistical method such as the normal distribution correctly identifies the potential cluster centers, and that its selections are similar to those obtained by manually analyzing the decision graph.
Compared with other well-regarded clustering algorithms, the density peak clustering algorithm based on the normal distribution performs better on data sets of different shapes and obtains better clustering results.
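The procedure described in the abstract can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not say how the density is computed or how the upper bound is derived, so the sketch assumes a Gaussian-kernel density with a cutoff distance `dc` and an upper bound of mean plus `k` standard deviations of Z. The function name `density_peak_centers` and the parameters `dc` and `k` are hypothetical.

```python
import math
import random
import statistics

def density_peak_centers(points, dc, k=3.0):
    """Sketch of density peak clustering with a normal-distribution bound on Z.

    points: list of coordinate tuples.
    dc: cutoff distance for the density kernel (assumed parameter).
    k: standard deviations above the mean used as the bound (assumed).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho_i: Gaussian-kernel count of neighbours, self excluded.
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    parent = [-1] * n
    # The densest point has no denser neighbour; use its largest distance.
    delta[order[0]] = max(dist[order[0]])
    for rank in range(1, n):
        i = order[rank]
        # Nearest point that comes earlier in the density ordering.
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i], parent[i] = dist[i][j], j
    # Judgement index Z = density * distance.
    z = [r * d for r, d in zip(rho, delta)]
    # Assumed upper bound: mean + k * standard deviation of Z.
    bound = statistics.fmean(z) + k * statistics.pstdev(z)
    centers = [i for i in range(n) if z[i] > bound]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Each remaining point inherits the label of its denser nearest neighbour;
    # a point stays -1 if its chain of parents never reaches a center.
    for i in order:
        if labels[i] == -1 and parent[i] != -1:
            labels[i] = labels[parent[i]]
    return centers, labels

# Usage on two synthetic, well-separated blobs (illustrative data only).
rng = random.Random(0)
blob_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(100)]
blob_b = [(rng.gauss(8, 0.5), rng.gauss(8, 0.5)) for _ in range(100)]
centers, labels = density_peak_centers(blob_a + blob_b, dc=0.5)
print(len(centers), "centers found")
```

The bound replaces the manual step of reading the decision graph: points with both high density and large distance stand out as outliers in the distribution of Z, so no center count has to be supplied. The choice of `k` trades missed centers against spurious ones and would need tuning on real data.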
[Degree-granting institution]: Zhejiang University of Technology
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP311.13
[Citation]: 鄭P; Research on a Density Peak Clustering Algorithm Based on the Normal Distribution [D]; Zhejiang University of Technology; 2016
Article ID: 2236063
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2236063.html