Bayesian Network Structure Learning for Sparse Data
Topic: sparse data. Focus: Bayesian networks. Source: Shandong Normal University, master's thesis, 2017
[Abstract]: Graphical models are widely used to represent and analyze causal relationships and conditional independence among random variables. The main classes of graphical models are directed acyclic graphs, undirected graphs, and chain graphs. A directed acyclic graph, also called a Bayesian network, contains only directed edges, and these edges cannot form a directed cycle. Bayesian networks are used to describe causal relationships among random variables. This thesis proposes an algorithm for learning the structure of a Bayesian network.
Bayesian network structure learning algorithms fall into three main classes: (1) constraint-based algorithms that rely on independence tests; (2) score-and-search algorithms; and (3) hybrid algorithms that combine independence tests with score-based search. In 2008, Xie and Geng proposed a recursive decomposition algorithm for Bayesian network structure learning, which recursively splits a large-scale structure learning problem into smaller ones. The algorithm relies mainly on constructing an undirected independence graph, which leads to two difficulties: first, when the data are sparse and the variables are numerous, the undirected independence graph cannot be constructed accurately; second, when there are many variables, constructing the undirected graph is itself complex. In 2013, Cai et al. proposed the Scalable cAusation Discovery Algorithm (SADA). This algorithm splits the variable set V into three sets (V1, C, V2) and requires only that, given C, no edge directly connects V1 and V2. V1 and V2 need not be conditionally independent, so the algorithm can cope with sparse data.
However, the algorithm of Cai et al. may introduce false edges during merging. To address this, the thesis proposes a relearning check algorithm that combines the strengths of the algorithms of Cai et al. and of Xie and Geng, and solves the problem of Bayesian network structure learning for sparse data. The proposed algorithm first partitions the variable set of a Bayesian network by repeatedly calling the causal partition algorithm. Then, on each causal partition, it performs local structure learning and merges the results into a global structure that may contain false edges. Finally, it identifies the causal cut set and its neighbor set, invokes the relearning check algorithm on them to carry out corrective learning and obtain the correct skeleton of the Bayesian network, and applies Meek's rules to determine the equivalence class.
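The partition condition and the merge-then-recheck idea in the abstract can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the thesis's or SADA's implementation: the function names (`is_causal_partition`, `merge_skeletons`, `recheck_region`) and the toy graph are hypothetical, the true graph is assumed known so no independence tests are run, and the local skeletons are given by hand rather than learned.

```python
# Toy illustration (hypothetical names; no independence tests are performed).

def is_causal_partition(edges, v1, c, v2):
    """Return True if v1, c, v2 are pairwise disjoint and no edge
    directly connects a variable in v1 with a variable in v2."""
    v1, c, v2 = set(v1), set(c), set(v2)
    if v1 & c or v1 & v2 or c & v2:
        return False
    for a, b in edges:
        if (a in v1 and b in v2) or (a in v2 and b in v1):
            return False
    return True


def merge_skeletons(local_a, local_b):
    """Union of two local skeletons; each edge becomes an unordered pair."""
    return {frozenset(e) for e in local_a} | {frozenset(e) for e in local_b}


def recheck_region(skeleton, c):
    """The cut set c together with its neighbors in the merged skeleton,
    i.e. the variables on which a corrective relearning step would run."""
    c = set(c)
    region = set(c)
    for e in skeleton:
        if e & c:          # the edge touches the cut set
            region |= e
    return region


# Assumed true DAG: A -> C, B -> C, C -> D, C -> E, D -> F
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("C", "E"), ("D", "F")]

print(is_causal_partition(edges, {"A", "B"}, {"C"}, {"D", "E", "F"}))
print(is_causal_partition(edges, {"A"}, {"B"}, {"C", "D", "E", "F"}))

# Hand-written local skeletons on V1 ∪ C and V2 ∪ C (assumed, not learned).
local_a = [("A", "C"), ("B", "C")]
local_b = [("C", "D"), ("C", "E"), ("D", "F")]
merged = merge_skeletons(local_a, local_b)
print(sorted(recheck_region(merged, {"C"})))
```

In this toy graph the first partition is valid because every path from {A, B} to {D, E, F} passes through C, while the second is rejected because the edge A -> C crosses from V1 to V2. The recheck region excludes F, which is not adjacent to the cut set.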
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: F224
[Citation]: Yang Yu; Bayesian Network Structure Learning for Sparse Data [D]; Shandong Normal University; 2017
Article ID: 1728547
Article link: http://sikaile.net/jingjifazhanlunwen/1728547.html