

Research on Relational Domain Knowledge Fusion Methods for Data Mining

Published: 2018-12-10 12:21
[Abstract]: Most existing data mining techniques operate on data at its raw level. The corresponding mining methods either incorporate no domain knowledge or rely on users to integrate domain knowledge manually during knowledge discovery. In real application domains, however, data differ in level: some data are raw, while others are closely related to further data, and an appropriate combination or generalization granularity of the related data may better reveal the underlying regularities. Making full use of the domain knowledge associated with raw data to guide mining therefore makes it possible to "observe and analyze the same problem at very different granularities", to acquire knowledge at a reasonable data level, and to move freely between data levels, which has become an important research topic. In practice, a large amount of data carries domain knowledge in the form of attribute extensions, and such knowledge is mostly stored as relational tables. This thesis therefore focuses on the representation of relational domain knowledge and on methods for fusing it with data mining, so that knowledge discovery can be carried out automatically and effectively. The main contributions are as follows:

(1) A structured representation model for domain knowledge based on the relational model, DKMRM (Domain Knowledge of Multi-Relations Model), is proposed. The model uses the relational model to map or project the domain knowledge of relevant attributes in a data table, forming context relation tables of domain knowledge and, in turn, a complex multi-relational representation model. When mining a relational database system, this model and the necessary transformation strategies allow some raw data to be generalized or instantiated to a reasonable level, yielding knowledge in a form that better fits the user's individual needs.

(2) Data mining based on DKMRM is studied, and a relational domain knowledge fusion method for data mining is proposed. Taking classification as a case study, a framework for classification mining that fuses relational domain knowledge is established. To address the limitations of traditional mining methods, the framework handles key issues such as the transfer source, the transfer path, the termination strategy, and the deviation statistics of the transfer.

(3) Two multi-relational classification algorithms are proposed: CC-DKMR (Classification of Characters based on Domain Knowledge of Multi-Relations), based on attribute selection, and CS-DKMR (Classification of Sheets based on Domain Knowledge of Multi-Relations), based on relational table selection. They seek mining patterns at different levels of data granularity, together with a flexible mechanism for switching between levels, so as to obtain more valuable knowledge from domain knowledge. Experiments show that the approach is effective.

(4) An evaluation method that fuses domain knowledge into the evaluation stage of data mining is proposed. It addresses the "oracle" problem of data mining algorithms (programs), to which traditional evaluation methods are difficult to adapt. Based on metamorphic testing, the method makes effective use of domain knowledge, analyzes concrete cases of classification, association, and clustering algorithms, and constructs metamorphic relations for the specific algorithms. Experimental results show that the method achieves its evaluation purpose and can feasibly be extended to other domains.
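As an illustration of contribution (1), the idea of generalizing a raw attribute through a domain-knowledge relation can be sketched in a few lines of Python. This is only a minimal sketch, not the DKMRM model itself: the table contents, the `city_knowledge` relation, and the helper names `generalize` and `label_counts` are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical raw data table: each row is (city, class_label).
raw_rows = [
    ("Hefei", "yes"),
    ("Wuhu", "yes"),
    ("Nanjing", "no"),
    ("Wuxi", "no"),
]

# Hypothetical domain-knowledge relation for the "city" attribute:
# each raw value is linked to coarser levels (province, region).
city_knowledge = {
    "Hefei": {"province": "Anhui", "region": "East China"},
    "Wuhu": {"province": "Anhui", "region": "East China"},
    "Nanjing": {"province": "Jiangsu", "region": "East China"},
    "Wuxi": {"province": "Jiangsu", "region": "East China"},
}


def generalize(rows, knowledge, level):
    """Replace each raw attribute value with its value at a coarser level."""
    return [(knowledge[value][level], label) for value, label in rows]


def label_counts(rows):
    """Count class labels for every distinct attribute value."""
    counts = {}
    for value, label in rows:
        counts.setdefault(value, Counter())[label] += 1
    return counts


# At city level every value occurs once; at province level a pattern appears;
# at region level the attribute no longer separates the classes.
by_city = label_counts(raw_rows)
by_province = label_counts(generalize(raw_rows, city_knowledge, "province"))
by_region = label_counts(generalize(raw_rows, city_knowledge, "region"))
```

In this toy data the province level is the granularity at which the class labels separate cleanly, which is the kind of "reasonable level" the abstract refers to; choosing that level automatically is the part the thesis addresses and is not attempted here.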
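As an illustration of contribution (4), metamorphic testing checks relations between the outputs of related runs instead of comparing each output with a known correct answer, which is what makes it usable when no test oracle exists. The sketch below applies two commonly used metamorphic relations to a toy 1-nearest-neighbor classifier. The classifier, the data, and the two relations are assumptions chosen for the example, not the relations constructed in the thesis.

```python
import random


def nearest_neighbor_predict(train, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # min() returns the first minimum, so ties would depend on sample order;
    # the toy data below is chosen so that no ties occur.
    _, label = min(train, key=lambda item: sq_dist(item[0], query))
    return label


train = [
    ((0.0, 0.0), "A"),
    ((0.0, 1.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.0), "B"),
]
queries = [(0.2, 0.1), (5.4, 4.9), (1.0, 2.0)]

# Source run: predictions on the original training set.
baseline = [nearest_neighbor_predict(train, q) for q in queries]

# Relation 1: permuting the training samples must not change predictions.
shuffled = train[:]
random.Random(0).shuffle(shuffled)
after_shuffle = [nearest_neighbor_predict(shuffled, q) for q in queries]
mr1_holds = after_shuffle == baseline

# Relation 2: consistently renaming class labels must rename the
# predictions in the same way.
rename = {"A": "X", "B": "Y"}
renamed_train = [(point, rename[label]) for point, label in train]
after_rename = [nearest_neighbor_predict(renamed_train, q) for q in queries]
mr2_holds = after_rename == [rename[label] for label in baseline]
```

A violated relation signals a defect in the implementation even though the correct prediction for any single query is never specified.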
[Degree-granting institution]: Hefei University of Technology
[Degree level]: Doctorate
[Year degree conferred]: 2016
[Classification number]: TP311.13




Article ID: 2370556


Article link: http://sikaile.net/shoufeilunwen/xxkjbs/2370556.html

