Financial Data Analysis Based on Parallel Statistical Computing
Published: 2018-02-11 03:49
Keywords: statistical computing; regression; nonparametric inference; stochastic processes  Source: Shandong University, 2012 doctoral dissertation  Document type: degree thesis
【Abstract】: Modern computer systems are powerful enough that many statistical computations finish in an instant. In some important cases, however, the computation still takes days, in particular statistical inference on massive samples or on large, complex survey data. The usual workaround is to use a faster but less accurate method, or to skip a potentially important computation altogether. The development of parallel statistical computing is therefore essential.

In this thesis we study wage data, opportunities for accelerating the analysis of bankruptcy data, and statistical methods for pension-fund data. We find that parallel statistical computation offers good speed performance on large statistical inference problems. The thesis consists of five chapters, summarized as follows.

Chapter 1. Parallel statistical computing is a compelling topic: many statistical computations are intensely parallel, so research at the intersection of parallel computing and statistics is important. This chapter focuses on regression, nonparametric inference, and stochastic processes. In particular, we survey parallel multisplitting methods, parallel statistical solvers for linear least-squares regression, parallel statistical algorithms for nonlinear regression, the theoretical structure of the parallel bootstrap in nonparametric inference, parallel solution methods for Markov chains, and parallel Markov chain Monte Carlo. Importantly, we also review general-purpose (non-graphics) applications of parallel GPU computing. We conclude that further research on parallel statistical algorithms is needed, and we describe several important open problems.

Chapter 2. For fitting multivariate linear models, subset selection and running time are key concerns. To address them, we introduce a new parallel estimator. We first give conditions under which it is equivalent to the generalized least-squares estimator, taking into account the rank of the projection and the eigenvalues. We then bound its error when a stable solution exists. The proposed method is applied to the bankruptcy data, yielding an estimating equation for one data set, and execution times are reported for two simulated data sets.

Chapter 3. We develop the convergence theory of multiplicative and damped additive Schwarz methods for solving large-sample equations. For generalized linear models and generalized additive models with large samples, we propose Schwarz methods for solving quasi-likelihood and penalized quasi-likelihood equations. The Schwarz method operates on a sequence of submodels, each corresponding to a subset of the elements of the two-step estimated parameter; combining the submodels yields the solution of the full model. The technique can also be used for model comparison, with the fitted values of a submodel serving as starting values for a larger model.

Chapter 4. The parallel bootstrap is a very useful statistical method with outstanding time performance, but its theory has not yet been developed. This chapter introduces a working correlation matrix for the method, called the parallel bootstrap matrix. We study properties of this resampling scheme, as well as the optimal subsample length for smooth function models. We present a study of the time performance of parallel bootstrap estimation, and report results on subsample-length selection for financial time series data.

Chapter 5. We study numerical methods for the quasi-stationary distribution of a Markov chain. The matrix here is substochastic, i.e., every row sum is at most 1. We develop Schwarz methods for computing this distribution; in particular, we obtain semi-convergence of the additive, multiplicative, and two-level Schwarz methods. Two examples of quasi-stationary distributions of Markov chains illustrate the proposed approach.
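Chapters 3 and 5 build on Schwarz (subspace-correction) iterations, which solve one large system by repeatedly solving small overlapping subsystems that can be distributed across processors. As a rough illustration of the damped additive variant only (not the thesis's actual algorithm; the matrix, block layout, and damping factor are illustrative choices), the following Python sketch solves a symmetric positive-definite linear system by combining damped local corrections:

```python
import numpy as np

def additive_schwarz(A, b, blocks, n_iter=300, theta=0.5):
    """Damped additive Schwarz iteration for A x = b.

    blocks: list of index lists, possibly overlapping, covering all rows.
    Each local solve uses only a principal submatrix of A, and the local
    corrections are independent of each other, so they can run in parallel.
    theta damps the summed corrections so overlap does not cause divergence.
    """
    x = np.zeros_like(b, dtype=float)
    for _ in range(n_iter):
        r = b - A @ x                        # global residual
        dx = np.zeros_like(x)
        for idx in blocks:                   # parallelizable loop
            Ai = A[np.ix_(idx, idx)]         # local principal submatrix
            dx[idx] += np.linalg.solve(Ai, r[idx])
        x += theta * dx
    return x
```

The additive variant applies all local corrections simultaneously, which is what makes it parallel; the multiplicative variant applies them sequentially, typically converging in fewer iterations at the cost of less parallelism.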
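Chapter 4's parallel bootstrap distributes bootstrap replicates across workers; for time series, each replicate is assembled from blocks of consecutive observations, and the block (subsample) length is the tuning parameter the chapter studies. A minimal sketch of a parallel moving-block bootstrap for the sample mean, assuming a thread pool and a fixed block length (this is an illustration, not the thesis's estimator):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def block_replicates(x, block_len, n_boot, seed):
    """One worker: n_boot moving-block bootstrap replicates of the mean."""
    rng = np.random.default_rng(seed)
    n = len(x)
    n_blocks = -(-n // block_len)                    # ceil(n / block_len)
    out = np.empty(n_boot)
    for i in range(n_boot):
        # draw random block start points, concatenate, trim to length n
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        sample = np.concatenate([x[s:s + block_len] for s in starts])[:n]
        out[i] = sample.mean()
    return out

def parallel_block_bootstrap(x, block_len, n_boot=1000, n_workers=4):
    """Replicates are independent, so the work is embarrassingly parallel:
    split them across workers and concatenate the results."""
    per = -(-n_boot // n_workers)                    # replicates per worker
    with ThreadPoolExecutor(max_workers=n_workers) as ex:
        parts = ex.map(lambda s: block_replicates(x, block_len, per, s),
                       range(n_workers))
        return np.concatenate(list(parts))[:n_boot]
```

Because the workers never communicate, speedup scales with the number of workers; the statistical question, addressed in the chapter, is how the choice of `block_len` affects the resulting estimates for dependent data.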
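The quasi-stationary distribution of Chapter 5 is the normalized left Perron vector ν of a substochastic matrix P (row sums at most 1): νP = λν with λ < 1, scaled so that ν sums to 1. Purely as an illustration of the object being computed (the thesis develops Schwarz methods instead), a standard baseline is power iteration with renormalization to compensate for the probability mass lost at each step:

```python
import numpy as np

def quasi_stationary(P, tol=1e-12, max_iter=10000):
    """Power iteration for the quasi-stationary distribution of a
    substochastic matrix P (row sums <= 1, at least one strictly < 1)."""
    n = P.shape[0]
    nu = np.full(n, 1.0 / n)          # start from the uniform distribution
    for _ in range(max_iter):
        nu_new = nu @ P
        nu_new /= nu_new.sum()        # renormalize: mass escapes each step
        if np.abs(nu_new - nu).sum() < tol:
            break
        nu = nu_new
    return nu
```

At convergence, ν @ P = λ ν, where the decay rate λ (the Perron eigenvalue of P) is recovered as the total mass (ν @ P).sum().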
【Degree-granting institution】: Shandong University
【Degree level】: Doctoral
【Year awarded】: 2012
【CLC classification】: F224;F830
Article ID: 1502140
Link: http://sikaile.net/guanlilunwen/huobilw/1502140.html