基于雅克比矩陣的軟劃分聚類算法分析
本文選題:軟劃分聚類分析 切入點(diǎn):雅克比矩陣 出處:《北京交通大學(xué)》2017年博士論文
【摘要】:本文主要研究軟劃分聚類算法分析中兩個(gè)重要問題:參數(shù)選擇和收斂性質(zhì)分析。大部分軟劃分聚類算法中均存在參數(shù)選擇問題,參數(shù)的選擇直接影響聚類算法的速度和精度。討論軟劃分聚類算法的收斂性質(zhì),例如EM算法的自退火性質(zhì),可以幫助我們更好的理解這些聚類算法。除此之外,聚類算法的收斂速率可能會(huì)影響算法處理大數(shù)據(jù)的能力。本文提出一種基于雅克比矩陣的軟劃分聚類算法分析框架,在此框架下對軟劃分聚類算法參數(shù)的上下界、算法的收斂性質(zhì)以及收斂速率等問題進(jìn)行了討論。本文取得的主要研究成果如下:(1)本文提出了基于雅克比矩陣的軟劃分聚類算法分析框架。建立該軟劃分聚類算法分析框架的基本假設(shè)是:重合類是大部分軟劃分聚類算法的不動(dòng)點(diǎn),但為了避免聚類算法輸出重合類結(jié)果,重合類不能是軟劃分聚類算法的穩(wěn)定點(diǎn)。在這個(gè)基本假設(shè)下,我們將軟劃分聚類算法重寫為差分方程形式,通過分析軟劃分聚類算法差分方程的雅克比矩陣,從而對聚類算法的參數(shù)選擇和收斂性質(zhì)分析等等問題進(jìn)行討論。與其他軟劃分聚類算法分析方法相比,基于雅克比矩陣的軟劃分聚類算法分析方法可以用于分析一般具有隸屬度和聚類中心迭代更新過程的算法,而不要求聚類算法有明確的目標(biāo)函數(shù)。(2)本文在基于雅克比矩陣的軟劃分聚類算法分析框架下,從理論上分析了基于混合高斯模型的最大期望(EM)算法和確定性退火最大期望(DA-EM)算法的性質(zhì)。一方面,我們通過分析DA-EM算法差分方程的雅克比矩陣,提出了一種選擇DA-EM算法確定性退火參數(shù)理論下界的方法。另一方面我們在基于雅克比矩陣的軟劃分聚類算法分析框架下證明了 EM算法具有自退火性質(zhì),也就是說重合類不是EM算法的穩(wěn)定點(diǎn)。因?yàn)镈A-EM模型在確定性退火參數(shù)等于1時(shí)等于EM模型,因此我們將EM算法作為DA-EM算法的一個(gè)特殊形式,利用DA-EM的雅克比矩陣對EM算法進(jìn)行理論分析。(3)GK算法是在FCM的基礎(chǔ)上改進(jìn)的一種模糊聚類算法。與FCM算法等軟劃分聚類算法一樣,GK聚類算法的結(jié)果也會(huì)受到模糊指數(shù)m參數(shù)值的影響,然而文獻(xiàn)中缺乏對GK聚類算法的參數(shù)選擇問題的討論。我們在基于雅克比矩陣的軟劃分聚類算法分析框架下,建立GK聚類算法的穩(wěn)定點(diǎn)和樣本數(shù)據(jù)間的關(guān)系,從而給出選擇模糊指數(shù)m的理論根據(jù)。同時(shí),我們研究了模糊指數(shù)m的取值對聚類算法的收斂速率的影響。最后,我們通過實(shí)驗(yàn)證明了理論結(jié)果的正確性。(4)模糊指數(shù)m值會(huì)嚴(yán)重影響GK聚類算法的聚類結(jié)果。因此,本文我們提出了一種新的基于確定性退火機(jī)制的GK聚類算法,以減小參數(shù)選擇對聚類結(jié)果的影響。我們在GK聚類算法的目標(biāo)函數(shù)中加入隸屬度的香農(nóng)(信息)熵約束,并且用確定性退火機(jī)制調(diào)節(jié)退火參數(shù)。與此同時(shí),我們分析了確定性退火GK(DA-GK)聚類算法退火參數(shù)取值理論下界。除此之外,我們比較了 DA-GK聚類算法和其他聚類算法的聚類結(jié)果,并分析了 DA-GK聚類算法的計(jì)算復(fù)雜度。實(shí)驗(yàn)結(jié)果表明DA-GK算法具備良好的聚類性能。
[Abstract]:This paper mainly studies the soft clustering algorithm in the analysis of two important issues: the analysis of parameter selection and convergence properties. The parameter selection problem exists in most soft clustering algorithm, parameter selection directly affects the speed and accuracy of clustering algorithm. Discuss the convergence properties of soft clustering algorithms such as EM algorithm, self annealing properties, can help we have a better understanding of the clustering algorithm. In addition, the convergence rate of the algorithm clustering algorithm may affect the ability to handle large data. This paper proposes an analysis framework of soft clustering algorithm based on Jacobian matrix, under this framework on the parameters of soft clustering algorithm to partition the upper and lower bounds, convergence and convergence rate of the algorithm are discussed. The main achievements of this dissertation are as follows: (1) framework is proposed in this thesis. Analysis of soft clustering algorithm based on Jacobian matrix The establishment of the soft clustering algorithm framework is the basic assumptions of coincidence is most soft clustering algorithm of the fixed point, but in order to avoid clustering algorithm output results coincide, coincidence classes can not be a stable soft clustering algorithm. In this basic assumption, we will soft clustering algorithm for rewriting difference through the analysis of difference equations, Jacobian matrix equations of soft clustering algorithm, clustering algorithm to parameter selection and convergence analysis and so on are discussed. Compared with other soft clustering algorithm analysis method, soft clustering algorithm of Jacobian matrix analysis method can be used to analyze the general membership degree and cluster center with iterative update process based on the algorithm without clustering algorithm has a clear objective function. (2) based on the analysis frame of soft clustering algorithm based on Jacobian matrix The plane, from the theoretical analysis of the mixed Gauss model based on expectation maximization (EM) algorithm and the deterministic annealing expectation maximization (DA-EM) algorithm. On the one hand, we analyze the difference of Jacobian matrix equation DA-EM algorithm, put forward a method to select the DA-EM algorithm of deterministic annealing parameter theory. On the other hand the lower bound in our framework it is proved that the EM algorithm with self annealing properties of soft clustering algorithm based on Jacobian matrix, i.e. stable point is not EM algorithm. Because the DA-EM model in the deterministic annealing parameter is equal to 1 is equal to the EM model, so we use EM algorithm as a special form of the DA-EM algorithm, the theory of analysis of the EM algorithm using the Jacobian matrix of DA-EM. (3) the GK algorithm is an improved fuzzy clustering algorithm based on FCM. As with the FCM algorithm and soft clustering algorithm, GK clustering The results will be fuzzy algorithm influence index m parameter, discuss the parameters of GK clustering algorithm is however lacking in the literature selection problem. In our analysis based on soft clustering algorithm of Jacobian matrix framework, stable relationship establishment of GK clustering algorithm and the sample data, which gives the fuzzy selection index m according to the theory. At the same time, we studied the influence of convergence rate of the value of the fuzzy clustering algorithm of the M index. Finally, we prove the correctness of the theoretical results through experiments. (4) the fuzzy index m value will seriously affect the clustering results of GK clustering algorithm. Therefore, in this paper, we propose a new GK clustering the algorithm based on deterministic annealing mechanism, in order to reduce the influence of parameters selection on the clustering results. The membership degree function we target in the GK clustering algorithm in Shannon (information entropy) constraints, and deterministic annealing The lighter adjusting annealing parameters. At the same time, we analyzed the deterministic annealing GK (DA-GK) clustering algorithm of annealing parameters theoretical bounds. Besides, we compare the clustering results of DA-GK clustering algorithm and other clustering algorithms, and analyzes the calculation of the DA-GK complexity of clustering algorithms. The experimental results show that the DA-GK algorithm has good clustering performance.
【學(xué)位授予單位】:北京交通大學(xué)
【學(xué)位級別】:博士
【學(xué)位授予年份】:2017
【分類號】:TP311.13
【參考文獻(xiàn)】
相關(guān)期刊論文 前8條
1 蔡曉妍;戴冠中;楊黎斌;;基于譜聚類的復(fù)雜網(wǎng)絡(luò)社團(tuán)發(fā)現(xiàn)算法[J];計(jì)算機(jī)科學(xué);2009年09期
2 盧秋根;;模糊聚類算法的研究與實(shí)現(xiàn)[J];電腦知識(shí)與技術(shù);2008年27期
3 唐英干,趙立興,關(guān)新平;基于混合模型和DAEM算法的自適應(yīng)圖像分割[J];儀器儀表學(xué)報(bào);2005年06期
4 張敏,于劍;基于劃分的模糊聚類算法[J];軟件學(xué)報(bào);2004年06期
5 張祥德,唐青松;確定性退火技術(shù)及其在點(diǎn)匹配中的應(yīng)用[J];東北大學(xué)學(xué)報(bào);2003年11期
6 于劍,程乾生;關(guān)于FCM算法中的權(quán)重指數(shù)m的一點(diǎn)注記[J];電子學(xué)報(bào);2003年03期
7 高新波,裴繼紅,謝維信;模糊c-均值聚類算法中加權(quán)指數(shù)m的研究[J];電子學(xué)報(bào);2000年04期
8 楊廣文,李曉明,王義和,鄭緯民,王鼎興;確定性退火技術(shù)[J];計(jì)算機(jī)學(xué)報(bào);1998年08期
,本文編號:1697418
本文鏈接:http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1697418.html