
Design of a Multidimensional Clustering Mining Algorithm Based on Grid-Density Partitioning

Published: 2018-03-19 10:28

  Topic: clustering algorithms. Focus: grid. Source: Xi'an University of Finance and Economics, 2014 master's thesis. Document type: degree thesis.


[Abstract]: Cluster analysis is an important component of data mining and one of its analytical activities. The clustering algorithm is the core of any cluster analysis and determines the quality of all its results. How to further improve clustering efficiency and reduce users' cost and burden, while keeping the algorithm stable and effective, has therefore become a meaningful research direction.

Because traditional clustering algorithms place high demands on computer hardware and take a long time on massive data sets, this thesis proposes a new clustering algorithm based on grids and density. Grid-based clustering is generally fast and efficient, but its clustering quality is not very good; density-based clustering can find clusters of arbitrary shape, but its time complexity is high when processing high-dimensional data. Because the two approaches are complementary, partitioning the sample space with a combined grid-density strategy can greatly improve clustering efficiency.

The idea of the algorithm is as follows. First, create the grid by performing an initial grid partition of the data space. Second, partition the sample space: using the computed grid density threshold, divide the grid cells into a high-density region and a low-density region; sort all cells in the high-density region by density, find the densest cell, and use the nearest low-density cells around it to find the first high-density cluster; remove the points of that cluster, re-sort the remaining high-density cells, and repeat until the final partition of the space is formed. Finally, compute the centroid of each sub-cluster, merge neighboring sub-clusters to form a new cluster centroid, and keep merging until the number of clusters equals the given number, which yields the final clustering result.

The thesis first describes the algorithm theoretically and verifies that its design is reasonable and scientifically sound. It then reports an empirical analysis on several groups of data randomly generated in Matlab, which verifies that the algorithm markedly improves running time while its between-group sum of squared deviations differs little from that of the classical K-means algorithm.
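The steps in the abstract can be sketched in code. The following is a minimal illustration in Python, not the thesis's Matlab implementation, and it rests on several assumptions the abstract leaves open: cells are axis-aligned hypercubes of side `cell_size`, a cell's density is its point count, a high-density cluster is grown from the densest remaining cell through adjacent high-density cells and stops at low-density neighbours, and sub-clusters are merged by closest centroid pair. The names `grid_density_cluster`, `centroid`, `cell_size`, and `density_threshold` are hypothetical. Points in low-density cells are returned separately, since the abstract does not say how they are assigned.

```python
import math
from collections import defaultdict
from itertools import product


def centroid(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(pts) for axis in zip(*pts))


def grid_density_cluster(points, cell_size, density_threshold, k):
    """Return (clusters, low_density_points) for a list of equal-length tuples."""
    if not points:
        return [], []

    # Step 1: initial grid partition; each point falls into exactly one cell.
    cells = defaultdict(list)
    for p in points:
        cells[tuple(math.floor(x / cell_size) for x in p)].append(p)

    # Step 2: split cells into high- and low-density sets by the threshold.
    dense = {c for c, pts in cells.items() if len(pts) >= density_threshold}
    low_density_points = [p for c, pts in cells.items() if c not in dense for p in pts]

    # Step 3: seed from the densest remaining cell and grow through adjacent
    # dense cells; growth stops where the neighbours are low-density.
    # (Assumption: the abstract does not define adjacency or the stopping rule.)
    dim = len(points[0])
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]
    remaining = set(dense)
    subclusters = []
    while remaining:
        seed = max(remaining, key=lambda c: (len(cells[c]), c))
        remaining.remove(seed)
        stack, members = [seed], []
        while stack:
            cell = stack.pop()
            members.extend(cells[cell])
            for o in offsets:
                neighbour = tuple(a + b for a, b in zip(cell, o))
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    stack.append(neighbour)
        subclusters.append(members)

    # Step 4: merge the two sub-clusters with the closest centroids until
    # only k remain (fewer than k are returned if fewer were found).
    while len(subclusters) > max(k, 1):
        cents = [centroid(s) for s in subclusters]
        pairs = ((i, j) for i in range(len(cents)) for j in range(i + 1, len(cents)))
        i, j = min(pairs, key=lambda ij: math.dist(cents[ij[0]], cents[ij[1]]))
        subclusters[i].extend(subclusters.pop(j))  # j > i, so index i is unaffected
    return subclusters, low_density_points
```

A call such as `grid_density_cluster(points, cell_size=1.0, density_threshold=2, k=2)` takes a list of coordinate tuples and returns the merged clusters together with the points left in low-density cells. Because only occupied cells are stored, the cost of the grid step grows with the number of points rather than with the total number of cells, which is one plausible reading of why the abstract expects a time advantage over K-means.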
[Degree-granting institution]: Xi'an University of Finance and Economics
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: C81





Article link: http://sikaile.net/shekelunwen/shgj/1633868.html

