Research on a DNA-GA Algorithm Based on Tissue-like P Systems and Its Application in Clustering
Topic: P systems + DNA-GA; Source: Shandong Normal University, master's thesis, 2017
[Abstract]: DNA-GA is essentially a genetic algorithm (GA) built on DNA encoding, combining evolutionary computation with DNA computing. Compared with traditional binary encoding, DNA encoding is more flexible and supports a richer set of genetic operations, so DNA-GA can express more genetic information than a conventional GA. As a result, it can better overcome some limitations of GA, such as premature convergence and the Hamming cliff problem of binary encoding, and it has attracted wide attention from researchers in recent years. Designing more effective DNA-GA algorithms therefore has both theoretical and practical significance.
Membrane computing, whose models are known as P systems, is a class of distributed parallel computing models abstracted from the structure and function of biological cells, tissues, and organs. In terms of computational efficiency, P systems can, in theory, solve NP-hard problems in linear time (by trading space for time), which makes them attractive for computational intelligence. Membrane computing has so far been applied in many fields, including computer science, biology, linguistics, approximate optimization, computer graphics, economics, and cryptography. Compared with the theoretical work, however, applied research on membrane computing is still at an early stage, and researchers expect breakthroughs in applications of P systems.
Cluster analysis is an unsupervised learning technique, that is, it learns without labeled training data. The clustering process considered here assigns each object in the data space to a cluster according to Euclidean distance: objects that are close together are placed in the same cluster and objects that are far apart in different clusters, so that objects within a cluster are as similar as possible and objects in different clusters are as dissimilar as possible. Cluster analysis is now widely used in pattern analysis, machine learning, data mining, document retrieval, image segmentation, pattern recognition, and other fields.
Building on this background, this thesis takes the tissue-like P system from membrane computing as its foundation and proposes a DNA-GA algorithm based on tissue-like P systems (TPDNA-GA). The work makes three main contributions. First, some genetic operations of the basic DNA-GA are modified, yielding an improved DNA-GA built on a new reconstruction crossover operator. Second, the improved DNA-GA is combined with a tissue-like P system, using the system's maximal parallelism and membrane rules to improve DNA-GA performance; this includes defining the fitness function and improving the membrane rules so as to find the best clustering of the data set being processed. Three standard test functions are used to validate the performance of the new algorithm. Third, TPDNA-GA is combined with K-means, the two are studied and compared, and performance is analyzed on standard test data sets.
Finally, the TPDNA-GA clustering process is applied to Web documents: a concrete document clustering procedure is proposed, and experiments on data from Reuters-21578 verify and compare clustering accuracy, indicating that the algorithm can make everyday document retrieval more convenient.
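The abstract does not spell out the encoding or the fitness function, so the following is only a minimal sketch of how a DNA-encoded chromosome could be decoded into cluster centers and scored by Euclidean distance. The base-to-digit mapping (A, C, G, T to 0 through 3), the function names, and the use of the within-cluster sum of squared errors as the quantity to minimize are all assumptions made for illustration, not the thesis's definitions.

```python
import math

# Assumed mapping for illustration; DNA-GA papers use differing conventions.
BASE_TO_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}


def decode_gene(gene, low, high):
    """Read a base string as a base-4 integer and scale it into [low, high]."""
    value = 0
    for base in gene:
        value = value * 4 + BASE_TO_DIGIT[base]
    max_value = 4 ** len(gene) - 1
    return low + (high - low) * value / max_value


def decode_centers(chromosome, k, dim, gene_len, low, high):
    """Split a chromosome into k cluster centers with dim coordinates each."""
    if len(chromosome) != k * dim * gene_len:
        raise ValueError("chromosome length does not match k * dim * gene_len")
    genes = [chromosome[i:i + gene_len]
             for i in range(0, len(chromosome), gene_len)]
    coords = [decode_gene(g, low, high) for g in genes]
    return [coords[c * dim:(c + 1) * dim] for c in range(k)]


def within_cluster_sse(points, centers):
    """Assign each point to its nearest center by Euclidean distance.

    Returns the cluster labels and the sum of squared distances, which a
    GA-style search would try to minimize (hypothetical fitness choice).
    """
    labels, sse = [], 0.0
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        nearest = min(range(len(centers)), key=dists.__getitem__)
        labels.append(nearest)
        sse += dists[nearest] ** 2
    return labels, sse


# Two genes of length 2 per center, coordinates scaled into [0, 15].
points = [(0, 0), (0, 1), (9, 9), (10, 10)]
centers = decode_centers("AAAAGCGC", k=2, dim=2, gene_len=2, low=0, high=15)
labels, sse = within_cluster_sse(points, centers)
print(centers, labels, sse)
```

With a base-4 alphabet, each additional base multiplies the resolution of a coordinate by four, which is one way to see why DNA encoding can carry more information per symbol than binary encoding.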
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year conferred]: 2017
[Classification numbers]: Q811.4; TP311.13