A Hybrid Ant Colony Optimization Algorithm for Cost-Sensitive Attribute Reduction
Source: Southwest Petroleum University, master's thesis, 2017. Document type: degree thesis
Keywords: rough set; attribute reduction; cost-sensitive learning; hybrid ant colony
[Abstract]: With the continual expansion of computer applications, the dimensionality of collected data is growing explosively. The essence of attribute reduction is to lower data dimensionality while preserving the classification ability of the decision system, helping people make decisions efficiently. Cost-sensitive learning is one of the ten most challenging problems in data mining; the associated attribute reduction problems aim to obtain the minimal test cost, time cost, or misclassification cost, among others. Many bio-inspired algorithms have been proposed for cost-sensitive attribute reduction. Bee colony algorithms are fast, but premature convergence leads them to local optima, that is, solutions whose cost is not the lowest. Ant colony algorithms have strong optimization ability but low running efficiency. This thesis proposes a general framework for hybrid ant colony optimization (hybrid ant colony optimization general framework, HA), which combines two search strategies: partial search and complete search. In the partial search strategy, each pioneer ant selects k attributes, where k is an empirical value determined by the size of an initial reduct. Partial search dispenses with operations such as frequent computation of the positive region and deletion of redundant attributes, which makes HA efficient. In the complete search strategy, each harvester ant selects a feasible solution. In both strategies, artificial ants optimize paths by updating the pheromone along them. For the minimal test cost and minimal time cost attribute reduction problems, this thesis implements corresponding concrete algorithms based on HA. The experiments use four real data sets from the UCI repository, each with three different cost distributions. Extensive parameter tuning and comparison with existing algorithms show that: 1) parameter configuration has an important effect on speed and solution quality; 2) when solution quality (for example, the finding optimal factor) is comparable, the proposed algorithm runs more efficiently than existing algorithms; 3) when running efficiency is similar, the proposed algorithm yields higher-quality solutions than existing algorithms.
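The two-strategy search described in the abstract can be illustrated with a minimal Python sketch for the minimal test cost case. It is not the thesis's implementation: the abstract does not give the selection rule, the pheromone update, or the parameter values, so the pheromone/cost weighting, the evaporation rate `rho`, the deposit amounts, and the names `positive_region_size`, `pick`, and `hybrid_aco` are all assumptions made for illustration. Test costs are assumed positive and `k` is assumed not to exceed the number of attributes.

```python
import random
from collections import defaultdict

def positive_region_size(rows, labels, attrs):
    """Count objects whose equivalence class under `attrs` carries one decision label."""
    seen_labels = defaultdict(set)
    sizes = defaultdict(int)
    for row, y in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(y)
        sizes[key] += 1
    return sum(n for key, n in sizes.items() if len(seen_labels[key]) == 1)

def pick(candidates, pheromone, costs, rng):
    """Roulette-wheel choice; weight = pheromone / test cost (assumed heuristic)."""
    weights = [pheromone[a] / costs[a] for a in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

def hybrid_aco(rows, labels, costs, k, n_pioneers=10, n_harvesters=10,
               rho=0.1, seed=0):
    """Return (attribute subset, total test cost) preserving the positive region."""
    rng = random.Random(seed)
    all_attrs = list(range(len(costs)))
    target = positive_region_size(rows, labels, all_attrs)
    pheromone = [1.0] * len(costs)
    best, best_cost = all_attrs, sum(costs)  # the full attribute set is always feasible

    def deposit(subset, amount):
        # Evaporate everywhere, then reinforce the attributes on this ant's path.
        for a in all_attrs:
            pheromone[a] *= 1.0 - rho
        for a in subset:
            pheromone[a] += amount

    # Partial search: each pioneer picks exactly k attributes. The positive
    # region is computed once per ant, and no redundancy removal is done.
    for _ in range(n_pioneers):
        remaining, subset = all_attrs[:], []
        for _ in range(k):
            a = pick(remaining, pheromone, costs, rng)
            subset.append(a)
            remaining.remove(a)
        coverage = positive_region_size(rows, labels, subset) / max(target, 1)
        deposit(subset, coverage / sum(costs[a] for a in subset))

    # Complete search: each harvester adds attributes until the positive
    # region of the full table is preserved, i.e. until the subset is feasible.
    for _ in range(n_harvesters):
        remaining, subset = all_attrs[:], []
        while positive_region_size(rows, labels, subset) < target:
            a = pick(remaining, pheromone, costs, rng)
            subset.append(a)
            remaining.remove(a)
        cost = sum(costs[a] for a in subset)
        if cost < best_cost:
            best, best_cost = sorted(subset), cost
        deposit(subset, 1.0 / (1.0 + cost))
    return best, best_cost

# Toy decision table (hypothetical): the decision equals attribute 2,
# which is also the XOR of attributes 0 and 1.
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
labels = [0, 1, 1, 0]
costs = [5, 4, 1]
best, best_cost = hybrid_aco(rows, labels, costs, k=1)
print(best, best_cost)
```

The search is stochastic, so the returned subset is only guaranteed to be feasible, not optimal or free of redundant attributes. In this sketch the pioneers are cheap because each one evaluates the positive region once, whereas each harvester re-evaluates it after every added attribute; that difference is the efficiency trade-off the abstract attributes to mixing the two strategies.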
[Degree-granting institution]: Southwest Petroleum University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP18