Variational Auto-Encoders Based on Gaussian Mixture Models
Published: 2018-11-28 18:12
[Abstract]: Unsupervised learning, as a way of learning about the real world from unlabeled data, can free people from the burden of labeling data. Feynman said: "What I cannot create, I do not understand." There are many ways to evaluate unsupervised learning, and generation tasks are among the most direct: only when we can generate/create our real world can we claim to understand it completely. Generative models have therefore become one of the most popular families of unsupervised learning algorithms in recent years. This thesis studies one of the most popular generative models for complex distributions: the variational auto-encoder (VAE), a model that generates data automatically by reducing a high-dimensional, complex image distribution to a low-dimensional, simple one, and then generating images from that low-dimensional distribution.

In current VAEs, the approximate posterior of the latent variable z is usually a single simple distribution, such as a Gaussian, which makes the low-dimensional representation too simple. The real world, however, contains many non-Gaussian distributions; in particular, for highly skewed, multimodal distributions, a single Gaussian approximation is often insufficient, and the latent space of a dataset can be arbitrarily complex.

Based on this, we make the following contributions. First, to increase the flexibility of the approximate posterior, we replace it with a Gaussian mixture model (GMM); this substantially improves the marginal likelihood the VAE attains on the datasets studied. Second, to increase the flexibility further, we introduce normalizing flows into the VAE and combine them with the GMM. Normalizing flows can specify arbitrarily complex, flexible, scalable approximate posteriors: a simple initial density is transformed into the desired complex distribution by applying a sequence of invertible transformations. Finally, we re-derive the variational lower bound of the VAE under the GMM posterior and obtain the corresponding optimization algorithm. Because of the normalizing flows, each component of the Gaussian mixture can approximate a full covariance matrix, i.e., all covariance matrices of the mixture are non-diagonal; the resulting model is therefore called the non-diagonal Gaussian mixture variational auto-encoder (NDGMVAE). NDGMVAE allows the latent variable z to match the latent space more faithfully. Furthermore, to sharpen the images the VAE generates, we improve the encoder and decoder architectures, using recent convolutional neural networks (CNNs) and networks with a gating mechanism, and we compare the variational lower bounds of VAEs with different architectures.

To show that the new posterior is more flexible and matches the latent space more faithfully, we run experiments on the MNIST, OMNIGLOT, and Histopathology datasets, focusing on the variational lower bound of the log-likelihood on each dataset; we also provide visualizations on MNIST, OMNIGLOT, and Freyfaces, and compare the latent distributions learned on MNIST. In addition, we experiment with different numbers of mixture components, different mixture weights, and different flow lengths. Overall, the improved GMM-based VAE shows a clear gain in performance and across applications of variational inference, and it has theoretical advantages as well.
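For reference, the bound referred to above takes the following form. This is only a sketch built on the standard flow-based bound of Rezende & Mohamed (2015), with the initial density q_0 replaced by a C-component diagonal-Gaussian mixture; the thesis's own re-derivation for the mixture case may differ in its details:

% Sketch: flow-based variational bound with a Gaussian-mixture initial density.
\begin{align}
q_0(z_0 \mid x) &= \sum_{c=1}^{C} \pi_c(x)\,
    \mathcal{N}\!\big(z_0;\ \mu_c(x),\ \mathrm{diag}(\sigma_c^2(x))\big), \\
z_K &= f_K \circ \cdots \circ f_1(z_0), \\
\ln q_K(z_K \mid x) &= \ln q_0(z_0 \mid x)
    - \sum_{k=1}^{K} \ln \left| \det \frac{\partial f_k}{\partial z_{k-1}} \right|, \\
\mathcal{L}(x) &= \mathbb{E}_{q_0(z_0 \mid x)}\!\left[ \ln p(x, z_K)
    - \ln q_K(z_K \mid x) \right] \;\le\; \ln p(x).
\end{align}

Each invertible map f_k reshapes the mixture while its log-det-Jacobian keeps the density tractable, which is what lets every component act as if it had a non-diagonal covariance.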
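A minimal code sketch of this posterior family follows, assuming PyTorch and assuming planar flows as the flow family (the abstract says only "a sequence of invertible transformations"); all class and variable names are illustrative, not the thesis's actual code:

import torch
import torch.nn as nn
import torch.nn.functional as F

class PlanarFlow(nn.Module):
    """One planar transform f(z) = z + u * tanh(w^T z + b) with tractable log-det."""
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(0.01 * torch.randn(dim))
        self.w = nn.Parameter(0.01 * torch.randn(dim))
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        wz_b = z @ self.w + self.b                       # (batch,)
        z_new = z + self.u * torch.tanh(wz_b).unsqueeze(-1)
        psi = 1.0 - torch.tanh(wz_b) ** 2                # tanh'(w^T z + b)
        # log|det J| = log|1 + psi * u^T w|  (invertibility constraint on u omitted)
        log_det = torch.log((1.0 + psi * (self.u @ self.w)).abs() + 1e-8)
        return z_new, log_det

class MixturePosterior(nn.Module):
    """Diagonal-Gaussian mixture q0(z0|x) followed by K planar flows."""
    def __init__(self, h_dim, z_dim, n_comp, n_flows):
        super().__init__()
        self.n_comp, self.z_dim = n_comp, z_dim
        self.logits = nn.Linear(h_dim, n_comp)            # mixture weights pi_c(x)
        self.mu = nn.Linear(h_dim, n_comp * z_dim)        # component means
        self.log_var = nn.Linear(h_dim, n_comp * z_dim)   # component log-variances
        self.flows = nn.ModuleList(PlanarFlow(z_dim) for _ in range(n_flows))

    def forward(self, h):
        B = h.size(0)
        pi = F.softmax(self.logits(h), dim=-1)                        # (B, C)
        mu = self.mu(h).view(B, self.n_comp, self.z_dim)
        std = (0.5 * self.log_var(h)).exp().view(B, self.n_comp, self.z_dim)
        # Pick a component per example, then reparameterize within it.
        # (A sketch only: gradients through the discrete choice are ignored here;
        #  the thesis derives its own bound for the mixture case.)
        c = torch.multinomial(pi, 1).squeeze(-1)                      # (B,)
        idx = c.view(B, 1, 1).expand(B, 1, self.z_dim)
        mu_c = mu.gather(1, idx).squeeze(1)
        std_c = std.gather(1, idx).squeeze(1)
        z = mu_c + std_c * torch.randn_like(std_c)                    # z0
        # log q0(z0|x) under the full mixture (logsumexp over components).
        log_q = torch.logsumexp(
            pi.log() + torch.distributions.Normal(mu, std)
                .log_prob(z.unsqueeze(1)).sum(-1), dim=-1)
        # Push z0 through the flows, subtracting log-det-Jacobians.
        for flow in self.flows:
            z, log_det = flow(z)
            log_q = log_q - log_det                                   # log q_K(z_K|x)
        return z, log_q

Given z and log_q from this module, a Monte Carlo estimate of the bound above is log p(x|z) + log p(z) - log_q, averaged over the batch.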
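The abstract does not specify which gating mechanism the encoder and decoder use; one common form, shown purely as an illustration, is a gated convolution in the style of gated PixelCNN, where half of the output channels gate the other half:

import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    """Conv layer with multiplicative gating: out = h * sigmoid(g)."""
    def __init__(self, in_ch, out_ch, kernel_size, padding):
        super().__init__()
        # Produce 2*out_ch channels, then split into signal and gate halves.
        self.conv = nn.Conv2d(in_ch, 2 * out_ch, kernel_size, padding=padding)

    def forward(self, x):
        h, g = self.conv(x).chunk(2, dim=1)
        return h * torch.sigmoid(g)

# Example: a gated layer usable inside a VAE encoder for 28x28 MNIST images.
layer = GatedConv2d(1, 32, kernel_size=3, padding=1)
out = layer(torch.randn(8, 1, 28, 28))   # -> (8, 32, 28, 28)

The multiplicative gate lets each layer modulate its own features, which is one reason gated architectures tend to yield tighter variational bounds than plain convolutional stacks in comparisons like those reported above.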
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year conferred]: 2017
[CLC number]: TP391.41