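Research on Multi-Label Classification Algorithms and Their Applications
Published: 2018-03-01 16:14
Keywords: multi-label classification; label correlation; ensemble algorithm; k-labelsets; context recommendation. Source: Shandong University, 2017 master's thesis. Document type: degree thesis.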
[Abstract]: In recent years we have entered an era of data explosion. As data volumes grow and storage capacity improves, we can collect data sources of many forms and store them in repositories. Analyzing and mining the stored data extracts valuable information that supports decisions in business, scientific research, and other activities. Classification is one form of such analysis: it extracts models that describe important classes of data and uses them to predict the discrete category of a data object. Depending on how many class labels are predicted for each sample, classification problems divide into single-label and multi-label classification. Most problems in traditional supervised learning are single-label, but in many tasks each sample must be associated with several class labels, for example in text classification (a book that belongs to several genres) and in medical diagnosis (a patient with several concurrent conditions). Single-label techniques cannot handle these problems, so multi-label classification has attracted wide attention from researchers in China and abroad. Existing multi-label algorithms have not yet reached satisfactory performance, and researchers have tried to improve them by modeling label correlation and by combining classifiers in ensembles. Among existing methods, RAkEL is a relatively effective multi-label classification algorithm based on classifier ensembles. However, the label combinations used to build its sub-classifiers are chosen at random and it does not fully exploit label correlation information, so its classification performance leaves room for improvement. This thesis brings label correlation and classifier ensembles into a unified framework and proposes a new multi-label classification algorithm derived from RAkEL. In experiments, the proposed method improves on RAkEL across several evaluation metrics and is competitive with other multi-label classification algorithms.

The thesis also explores the application of multi-label classification to recommender systems. Context-aware recommender systems use contextual information to further improve recommendation accuracy and user satisfaction, but the problem they address is still how to recommend a set of items to a target user. This thesis studies a different real-world scenario: once a user has chosen an item, we recommend the most suitable context in which to use it. For example, a user who has already decided to watch a particular movie needs advice on where to watch it (at home or in a theater) and with whom (family or friends) to get the best viewing experience. Context recommendation can suggest the most suitable context for consuming an item, which improves the experience, and it can also help users decide which item to choose. We transform this recommendation problem into a multi-label classification problem. We first verify that this transformation is effective, then adapt a multi-label classification algorithm to obtain a method suited to context recommendation, and run experiments on datasets from two domains. The experimental results show that the proposed algorithm gives personalized suggestions and outperforms the original algorithm on several metrics.
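As a rough illustration of the baseline the abstract refers to, the sketch below shows the general idea of standard RAkEL (random k-labelsets): each ensemble member is a label-powerset classifier trained on a randomly drawn subset of k labels, and per-label votes are averaged and thresholded. This is not the thesis's improved algorithm, whose details the abstract does not give. The names `NearestNeighborLP`, `rakel_fit`, and `rakel_predict`, the 1-nearest-neighbor base learner, and the parameters `k`, `m`, `seed`, and `threshold` are all assumptions made for illustration.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of positions where two equal-length feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


class NearestNeighborLP:
    """Label-powerset model over one label subset (1-NN is an assumed base learner)."""

    def __init__(self, label_subset):
        self.label_subset = label_subset

    def fit(self, X, Y):
        # Each distinct combination of the k chosen labels acts as one class.
        self.X = X
        self.classes = [tuple(y[j] for j in self.label_subset) for y in Y]
        return self

    def predict(self, x):
        # Copy the label combination of the closest training sample.
        nearest = min(range(len(self.X)), key=lambda i: hamming(self.X[i], x))
        return dict(zip(self.label_subset, self.classes[nearest]))


def rakel_fit(X, Y, k=2, m=4, seed=0):
    """Train m label-powerset models, each on a randomly drawn k-labelset."""
    rng = random.Random(seed)
    n_labels = len(Y[0])
    models = []
    for _ in range(m):
        subset = tuple(sorted(rng.sample(range(n_labels), k)))
        models.append(NearestNeighborLP(subset).fit(X, Y))
    return models


def rakel_predict(models, x, n_labels, threshold=0.5):
    """Average the votes of all models covering a label, then threshold."""
    votes, counts = Counter(), Counter()
    for model in models:
        for label, value in model.predict(x).items():
            votes[label] += value
            counts[label] += 1
    # A label that no labelset happened to include gets no votes and defaults to 0.
    return [int(counts[j] > 0 and votes[j] / counts[j] > threshold)
            for j in range(n_labels)]
```

Because the labelsets are drawn at random, a label can be left out of every subset when `m` is small, and correlated labels are grouped together only by chance. These are the two weaknesses the abstract attributes to RAkEL.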
Degree-granting institution: Shandong University
Degree level: Master's
Year degree granted: 2017
Classification number: TP181; TP391.3
Article ID: 1552640
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1552640.html