Research on Improved K-means Clustering Algorithms

Published: 2018-03-07 18:21

  Topic: cluster analysis  Focus: K-means algorithm  Source: Anhui University, 2015 master's thesis  Type: degree thesis


[Abstract]: The rapid advance of information technology and the rise of Web technology are driving the acquisition and storage of data toward automation, speed, and intelligence. Data mining emerged in response to massive, irregular data resources, and cluster analysis is one of its important research branches. Cluster analysis is an unsupervised, exploratory classification technique: with no prior knowledge, it partitions an unlabeled data set according to the similarity between data objects, yielding a collection of distinct clusters. It is currently applied in many fields, such as statistics, e-commerce, Web analysis, biomedicine, and marketing analysis. K-means is a classic partition-based clustering algorithm. It selects initial cluster centers, assigns the data to clusters, and adjusts each cluster's center to the mean of the cluster just formed. Through repeated iterations it aims to maximize similarity within clusters and minimize similarity between clusters. K-means is simple in principle, easy to implement, and offers good scalability and time complexity on large data sets. It nevertheless has several shortcomings. It is sensitive to the choice of initial cluster centers, and a poor choice can cause large errors in the clustering result. The final result is often a local optimum rather than a global one. In addition, the number of clusters k must be given in advance. Taking adaptive feature weights and genetic algorithms as its theoretical basis, this thesis addresses some of these shortcomings, keeps the clustering result from falling into a local optimum, and improves the accuracy and stability of the algorithm.

To address the inflexibility of fixed feature weights in traditional K-means and its heavy dependence on the initial cluster centers, attribute weights are adjusted on the principle that the more important an attribute is, the larger its weight, so that the relative importance of the attributes can be seen clearly. Without K being specified, the algorithm uses the density of the data objects to select several representative objects from the high-density set as initial cluster centers, and obtains the optimal K by comparing values of the criterion function. During iteration, the algorithm updates the attribute feature weights according to the criterion that objects within a cluster should be as similar as possible and clusters as different from each other as possible.

The genetic algorithm is then combined with adaptive weights and applied to K-means: on top of attribute weighting, the global search capability of the genetic algorithm is used to obtain better cluster centers, and K-means is then used for final refinement. This approach reduces the dependence of K-means on the initial centers and improves clustering quality. After being tested on experimental data sets, the algorithm is applied to image segmentation, one application area of clustering, and its segmentation results are compared. The experiments validate the two improved algorithms on standard data sets, analyze them in terms of accuracy, number of iterations, and cluster centers, and compare them with traditional K-means, confirming the efficiency of the improved K-means clustering algorithms.
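The baseline K-means loop that the abstract describes (choose initial centers, assign each object to its nearest center, move each center to the mean of its cluster, repeat) can be sketched as follows. This is a minimal illustration of standard K-means only, not the thesis's improved method: the abstract gives no formulas for the adaptive weights, the density-based initialization, or the genetic search, so none of them are reproduced here. The names `kmeans`, `squared_distance`, and the `init` parameter are hypothetical and chosen for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Plain K-means on a list of numeric tuples.

    `init` optionally supplies the k starting centers. If it is omitted,
    k points are sampled at random, which is exactly the step whose
    outcome standard K-means is sensitive to.
    """
    rng = random.Random(seed)
    centers = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centers are stable
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2, init=[(0, 0), (10, 10)])
print(labels)
```

Passing different `init` values, or changing `seed`, is a simple way to observe the sensitivity to initial centers that motivates the thesis. On data less cleanly separated than this toy example, different starts can converge to different local optima.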
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP311.13

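[Citing literature]

Related journal articles (top 1)

1 Hu Tao; Wang Tao; Shi Yongshuai; Ju Mingyuan; Application of an improved K-means algorithm to smart electricity-consumption data analysis [J]; Information Technology and Informatization; 2016, No. 09

Related master's theses (top 1)

1 Zheng Weina; Research on feature clustering algorithms in image classification [D]; Yanshan University; 2016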

Article ID: 1580416


Article link: http://sikaile.net/guanlilunwen/yingxiaoguanlilunwen/1580416.html

