Research on Three Key Problems of Cluster Analysis Based on Particle Swarm Optimization
Thesis topics: cluster analysis + particle swarm optimization. Source: Nanchang University, master's thesis, 2017
[Abstract]: Particle swarm optimization (PSO) is now widely applied in pattern recognition, spam detection, data clustering, robotics, recommender systems, and many other fields. In different application settings, however, traditional PSO still has problems that urgently need deeper study in validity verification, velocity and position update rules, and convergence performance. This thesis therefore addresses three key problems: clustering validity indices, clustering algorithms, and the application scenario of community detection in complex networks. It proposes a clustering validity index that dynamically terminates the clustering process, and focuses on PSO-based cluster analysis algorithms and a community detection algorithm for complex networks. The main work is as follows:
1. Building on several clustering measures proposed in this thesis, a validity evaluation method that dynamically determines the optimal number of clusters is proposed. The method uses a validity index proposed in this thesis, the ratio of the difference of the sum of squared distances (RDSED). The RDSED value is computed from the previously proposed difference of the sum of squared distances (DSED), and the search for the optimal number of clusters is terminated dynamically. Experiments on synthetic and real data sets show that the proposed RDSED index and method can effectively evaluate clustering partitions and determine the optimal number of clusters.
2. A hybrid clustering algorithm based on PSO and K-means, KIPSO, is proposed. Unlike traditional particle encoding schemes, KIPSO uses a compact particle encoding, preprocesses the data, and uses the average distance between data objects and cluster centers as the fitness function. By combining PSO with K-means, the algorithm has both the strong global search ability of PSO and the local search ability of K-means. Experiments on synthetic and real data sets show that the method is more accurate and converges better.
3. A discrete PSO algorithm based on evolutionary strategies is proposed for community detection in complex networks. The algorithm redefines the velocity and position of particles and the way they are updated, and adopts two evolutionary strategies to avoid becoming trapped in local optima. Experiments on the GN benchmark network data sets and on real network data sets show that the algorithm discovers network communities effectively, with stable partition quality and global convergence.
Contributions of this thesis: it measures the dissimilarity of the indices used in clustering validity verification in terms of separation and compactness, and dynamically terminates the verification process; it optimizes traditional PSO-based clustering algorithms; and it defines a PSO-based community detection algorithm for complex networks in a new discrete application scenario. Multiple groups of experiments verify that the proposed methods and algorithms are effective and feasible.
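The hybrid of PSO and K-means described in the abstract can be illustrated with a minimal Python sketch. This is not the KIPSO algorithm itself: the abstract does not specify the compact particle encoding or the preprocessing step, so the sketch assumes the conventional encoding in which each particle holds k candidate cluster centers, and it omits preprocessing. The only element taken from the abstract is the fitness function (the average distance between data objects and their cluster centers). The function names, the inertia weight `w`, the coefficients `c1` and `c2`, and the choice of one K-means step per PSO iteration are illustrative assumptions.

```python
import numpy as np

def mean_distance_fitness(centers, data):
    """Mean Euclidean distance from each point to its nearest center (lower is better)."""
    dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return dists.min(axis=1).mean()

def kmeans_step(centers, data):
    """One Lloyd iteration: assign points to the nearest center, then move
    every non-empty center to the mean of its assigned points."""
    dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    new_centers = centers.copy()
    for j in range(len(centers)):
        members = data[labels == j]
        if len(members) > 0:
            new_centers[j] = members.mean(axis=0)
    return new_centers

def pso_kmeans(data, k, n_particles=10, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Hypothetical PSO + K-means hybrid (assumed design, not the thesis's KIPSO)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    # Assumed encoding: each particle is a (k, d) array of centers drawn from the data.
    pos = np.stack([data[rng.choice(n, size=k, replace=False)]
                    for _ in range(n_particles)])
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([mean_distance_fitness(p, data) for p in pos])
    g = pbest_fit.argmin()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Standard PSO velocity and position update (global search).
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        for i in range(n_particles):
            # One K-means step per particle (local search).
            pos[i] = kmeans_step(pos[i], data)
            fit = mean_distance_fitness(pos[i], data)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), fit
        g = pbest_fit.argmin()
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit

# Usage on two synthetic, well-separated 2-D blobs.
demo_rng = np.random.default_rng(1)
data = np.vstack([demo_rng.normal(0.0, 0.3, (50, 2)),
                  demo_rng.normal(5.0, 0.3, (50, 2))])
centers, fit = pso_kmeans(data, k=2)
```

In this sketch the PSO update moves each particle toward its personal best and the global best, while the K-means step pulls the centers toward the means of their assigned points. That is one plausible way to combine global and local search, not necessarily the one used in the thesis.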
[Degree-granting institution]: Nanchang University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP18;TP311.13
Article ID: 2040556
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2040556.html