Stellar Spectral Classification and Parameter Measurement Based on Lick Line Indices in a Hadoop Environment
Published: 2018-09-18 15:50
[Abstract]: Celestial spectra contain a wealth of astrophysical information; analyzing them yields the physical properties, chemical composition, and atmospheric parameters of celestial objects. Large-scale survey telescopes such as LAMOST and SDSS produce massive volumes of spectral data; since LAMOST entered regular operation in particular, each observing night has produced roughly 20,000 to 40,000 spectra. Such data volumes place higher demands on fast and effective spectral processing. Against this background, this thesis studies automatic processing techniques for massive stellar spectra. These techniques fall into two categories: automatic classification of stellar spectra, and automatic measurement of stellar atmospheric physical parameters. Based on the relative strength of spectral lines and the continuum, together with other spectral features, stars are divided into seven main types: O, B, A, F, G, K, and M. The shape of a star's continuum and the profiles of its spectral lines are determined by physical conditions in the stellar atmosphere, whose most basic parameters are the surface effective temperature (Teff), the surface gravity (log g), and the chemical abundance (Fe/H). Many existing methods classify spectra and measure parameters directly from wavelength and flux information, but spectral data are high-dimensional and usually require a series of preprocessing steps such as normalization and dimensionality reduction, which makes the computation very expensive. This thesis studies spectral classification and atmospheric parameter measurement based on Lick line indices. To handle massive spectral data, it implements the Lick line index computation, and spectral classification by Bayesian decision, on the Hadoop platform. Using the high throughput and fault tolerance of Hadoop HDFS together with the parallelism of the Hadoop MapReduce programming model, it improves the efficiency of analyzing and processing large-scale spectral data. The contributions of this thesis are: 1. using Lick line indices as features, stellar spectra are classified with a Bayesian algorithm, and stellar atmospheric parameters are measured with kernel partial least squares regression; 2. the Lick line index computation and the Bayesian classification process are parallelized on the Hadoop MapReduce distributed computing framework.
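As a rough illustration of the kind of per-spectrum computation the abstract refers to, the sketch below computes a Lick-style atomic index as an equivalent width: a pseudo-continuum is drawn as a straight line through the mean fluxes of a blue and a red side band, and the normalized line depth is integrated over the feature band. This is a minimal sketch, not the thesis's implementation. The band edges are illustrative placeholders rather than an official Lick band table, the record layout read by `map_record` is a hypothetical format chosen for the example, and band edges are snapped to the nearest sample points instead of handling fractional pixels.

```python
import math

# Illustrative band edges in angstroms (placeholders, not an official Lick table).
BLUE, FEATURE, RED = (4828.0, 4848.0), (4848.0, 4877.0), (4877.0, 4892.0)


def _trapz(xs, ys):
    """Trapezoidal integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def _band(wave, flux, lo, hi):
    """Return the samples whose wavelength lies inside [lo, hi]."""
    pts = [(w, f) for w, f in zip(wave, flux) if lo <= w <= hi]
    if len(pts) < 2:
        raise ValueError("band [%s, %s] has fewer than two samples" % (lo, hi))
    xs, ys = zip(*pts)
    return list(xs), list(ys)


def lick_index(wave, flux, blue, feature, red, molecular=False):
    """Lick-style index: equivalent width in angstroms, or magnitudes if molecular."""
    bx, by = _band(wave, flux, *blue)
    rx, ry = _band(wave, flux, *red)
    # Mean flux in each side band, assigned to the band midpoint.
    fb = _trapz(bx, by) / (bx[-1] - bx[0])
    fr = _trapz(rx, ry) / (rx[-1] - rx[0])
    mb = 0.5 * (blue[0] + blue[1])
    mr = 0.5 * (red[0] + red[1])
    slope = (fr - fb) / (mr - mb)
    # Flux divided by the linear pseudo-continuum inside the feature band.
    fx, fy = _band(wave, flux, *feature)
    ratio = [f / (fb + slope * (w - mb)) for w, f in zip(fx, fy)]
    if molecular:
        return -2.5 * math.log10(_trapz(fx, ratio) / (fx[-1] - fx[0]))
    return _trapz(fx, [1.0 - r for r in ratio])


def map_record(line):
    """Map step for one hypothetical record: 'id<TAB>w1,w2,...<TAB>f1,f2,...'."""
    spec_id, waves, fluxes = line.rstrip("\n").split("\t")
    wave = [float(x) for x in waves.split(",")]
    flux = [float(x) for x in fluxes.split(",")]
    return spec_id, lick_index(wave, flux, BLUE, FEATURE, RED)


if __name__ == "__main__":
    # Synthetic spectrum: flat continuum with a rectangular absorption dip.
    wave = [4820.0 + i for i in range(81)]
    flux = [0.5 if 4855.0 <= w <= 4865.0 else 1.0 for w in wave]
    print(lick_index(wave, flux, BLUE, FEATURE, RED))
```

Because each spectrum is processed independently, a function like `map_record` could in principle serve as the map step of a MapReduce job (for example through Hadoop Streaming), emitting one (spectrum ID, index) pair per input record. The resulting low-dimensional index vectors would then be the features passed to a classifier or regressor.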
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree conferred]: 2015
[Classification number]: P144
Article ID: 2248386
Article link: http://sikaile.net/kejilunwen/tianwen/2248386.html