A Large-Scale Spectral Clustering Algorithm Based on Fast Landmark Sampling

Published: 2018-05-01 07:29

Topics: landmark point sampling + big data. Source: 《電子與信息學報》 (Journal of Electronics & Information Technology), 2017, Issue 02

[Abstract]: Traditional spectral clustering is limited in practice by its high computational complexity. Landmark-based spectral clustering addresses this by working with the similarity matrix between a set of landmark points and every point in the data set, which effectively reduces the cost of computing the spectral embedding. On large data sets, however, the existing approach of selecting landmarks at random harms the stability of the clustering results, while using k-means centroids as landmarks has an unknown convergence time and requires repeated passes over the data. This paper applies approximate singular value decomposition to landmark-based spectral clustering and designs a fast landmark sampling algorithm. The algorithm draws landmarks according to sampling probabilities computed from the lengths of the row vectors of the approximate singular vector matrix. Compared with random sampling, it ensures the stability and accuracy of the clustering results; compared with the k-means centroid strategy, it reduces the algorithmic complexity. The paper also analyzes theoretically how well the sampled landmarks preserve the information in the original data, and verifies the algorithm's performance experimentally.
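The sampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the abstract does not say which approximate SVD is used or whether the row lengths are squared, so the randomized range finder, the squared row norms, and the names `approx_left_singular_vectors`, `sample_landmarks`, `rank`, and `oversample` are all assumptions made for the example.

```python
import numpy as np


def approx_left_singular_vectors(X, rank, oversample=10, seed=0):
    """Approximate the top-`rank` left singular vectors of X (n x d).

    Assumption: a randomized range finder stands in for the paper's
    (unspecified) approximate SVD. X is projected onto a random
    low-dimensional subspace, orthonormalized, and the small projected
    matrix is decomposed exactly.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = min(rank + oversample, d)
    omega = rng.standard_normal((d, k))   # random test matrix, d x k
    Q, _ = np.linalg.qr(X @ omega)        # orthonormal basis, n x k
    B = Q.T @ X                           # small matrix, k x d
    U_small, _, _ = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_small)[:, :rank]        # n x rank


def sample_landmarks(X, n_landmarks, rank, seed=0):
    """Sample row indices of X to serve as landmarks.

    Assumption: each point's sampling probability is proportional to the
    squared length of its row in the approximate singular vector matrix.
    """
    U = approx_left_singular_vectors(X, rank, seed=seed)
    scores = np.sum(U ** 2, axis=1)       # squared row norms, one per point
    probs = scores / scores.sum()         # normalize to a distribution
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=n_landmarks, replace=False, p=probs)
    return idx, probs


if __name__ == "__main__":
    X = np.random.default_rng(1).standard_normal((200, 8))
    idx, probs = sample_landmarks(X, n_landmarks=20, rank=4)
    print(sorted(idx.tolist()))
```

Unlike the k-means centroid strategy, a scheme of this kind needs no iteration to convergence: the data matrix is multiplied a fixed number of times, and points whose rows carry more weight in the dominant singular subspace are more likely to be chosen as landmarks.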
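[Affiliations]: PLA Information Engineering University; State Key Laboratory of Mathematical Engineering and Advanced Computing
[Funding]: National 973 Program (2012CB315905); National Natural Science Foundation of China (61502527, 61379150)
[Classification No.]: TP311.13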



Article No.: 1828350


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1828350.html
