Research and Application of Multi-Relational Clustering Algorithms Based on Membrane Systems
Published: 2018-10-12 10:56
[Abstract]: Membrane systems are a young branch of natural computing. They are distributed parallel computing models abstracted from the way chemical substances are processed in organs, tissues, cells, and other biological structures. Because they are highly parallel, fault tolerant, and distributed, membrane systems have been widely applied in many fields and have been used to solve many practical problems.
Traditional clustering methods usually assume that data objects are independent of one another. However, most application data today is stored in relational databases in multi-relational form, so traditional clustering methods no longer meet the needs of such data. This thesis studies two problems of multi-relational clustering: poor clustering quality and low clustering efficiency. Taking the membrane system as the underlying model, it first proposes an initial-center selection method that improves the K-means clustering algorithm, then builds two efficient multi-relational clustering algorithms on top of it, and finally applies the proposed algorithms to a collaborative filtering recommender system:
(1) A K-means algorithm based on optimized initial cluster centers (OIK-means). The algorithm first computes the density of each object from its similarity to other objects, then screens candidate centers by computing the minimum distance between each object and any object of higher density, then excludes outliers using the average density, and finally determines the K initial centers. OIK-means was tested on artificial datasets and UCI datasets and compared with traditional K-means on the accuracy of initial-center selection.
(2) A multi-relational clustering algorithm based on comprehensive similarity (ISMC). Using the idea of tuple ID propagation, the algorithm assigns a weight to each table in the relational database and improves the traditional similarity computation: it combines the intra-class and inter-class similarity of objects into a comprehensive similarity according to these weights, and then clusters the objects in the target table with OIK-means based on that similarity. ISMC was tested on the UCI Movie dataset and compared with the TPC, ReCOM, and LinkClus algorithms.
(3) A genetic K-means multi-relational clustering algorithm based on membrane systems (GKM). Approaching the problem from the new angle of combining membrane systems with multi-relational clustering, the algorithm uses an evolution-communication tissue-like P system composed of three cells, with a different genetic evolution mechanism in each cell. This hybrid genetic mechanism improves the convergence of the algorithm and increases the diversity of objects, so that multi-relational datasets can be clustered accurately. GKM was tested on the UCI Movie dataset and compared with the ReCOM, LinkClus, and ISMC algorithms.
(4) An efficient collaborative filtering recommendation method based on membrane systems and multi-relational clustering (MCMCF), which applies membrane-system-based multi-relational clustering to a collaborative filtering recommender system. The method takes full advantage of the maximal parallelism (Max) and distributed execution of membrane systems. The comprehensive similarity computation effectively addresses the data sparsity problem, and the multi-relational clustering reduces the scale of the nearest-neighbor search, which improves both the recommendation quality and the running efficiency of the algorithm.
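
The four steps of the OIK-means center selection described in item (1) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not define the density or say how candidates are ranked, so the neighbor-count density, the `radius` parameter, the density-times-distance score, and the function name `select_initial_centers` are all assumptions made for this example.

```python
import math


def select_initial_centers(points, k, radius):
    """Pick up to k initial centers: dense points far from any denser point.

    Assumptions (not taken from the thesis): density is the number of other
    points closer than `radius`, and candidates are ranked by
    density * distance-to-nearest-denser-point.
    """
    n = len(points)
    if n == 0:
        return []

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Step 1: density of each object = number of neighbors within `radius`.
    density = [
        sum(1 for j in range(n) if j != i and dist[i][j] < radius)
        for i in range(n)
    ]

    # Step 2: for each object, the minimum distance to any denser object.
    # Objects are visited from densest to sparsest; ties are broken by index
    # so that two equally dense points in one cluster do not both look isolated.
    order = sorted(range(n), key=lambda i: -density[i])
    delta = [0.0] * n
    delta[order[0]] = max(dist[order[0]])  # densest point: use its farthest distance
    for rank in range(1, n):
        i = order[rank]
        delta[i] = min(dist[i][j] for j in order[:rank])

    # Step 3: exclude outliers, i.e. objects below the average density.
    mean_density = sum(density) / n
    eligible = [i for i in range(n) if density[i] >= mean_density]

    # Step 4: keep the k best-scoring candidates as initial centers.
    eligible.sort(key=lambda i: density[i] * delta[i], reverse=True)
    return [points[i] for i in eligible[:k]]
```

The returned points would then be handed to an ordinary K-means loop as its starting centers, in place of randomly chosen ones.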
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP311.13
Article ID: 2265858
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2265858.html