Feature Extraction of Cataclysmic Variable Stars Based on Data Mining Techniques
Published: 2018-05-18 16:08
Topic of this thesis: data mining + classification. Source: Shandong University, master's thesis, 2011
[Abstract]: Celestial spectra contain a wealth of physical information. With the LAMOST telescope in operation, upwards of ten thousand spectra are obtained on each observing night. Traditional methods of spectral analysis are inefficient and slow, and cannot keep up with the growing volume of data. Data mining, a product of information technology reaching a certain stage of development, extracts useful hidden information from large volumes of noisy data. It supports correlation prediction, classification, clustering, outlier detection, time-series analysis, and many other tasks, and is particularly effective for high-dimensional data.
The massive spectral data set from LAMOST not only plays an important role in large-sample astronomy but will also yield many by-products. A relatively large number of rare objects, time-variable objects, and unknown objects are hidden among so many spectra. Cataclysmic variable stars (CVs) are rare objects and are regarded as the "best astrophysical laboratory" for studying accretion disks. Optical observations of them, and spectroscopic observations in particular, are of great significance for studying the physical properties and kinematics of cataclysmic variables, the theory of accretion disks, and stellar evolution.
The main work of this thesis is to study the spectra of cataclysmic variables and, based on the characteristics of different wavelength bands, such as emission and absorption in the Balmer series, the hump phenomenon, and the double-peak phenomenon, to use data mining techniques to extract the spectral features of known cataclysmic variables for screening candidate cataclysmic variables. Although the spectra of the various classes of cataclysmic variables share some common features, different classes, and even different objects of the same class, have their own peculiarities. In addition, spectra taken during certain outburst stages show no obvious difference from the spectra of some non-cataclysmic variables. The following work was therefore carried out:
(1) The main characteristics of cataclysmic variables, especially their spectral properties, are studied, and principal component analysis (PCA) is used to construct the principal components of the spectra and extract spectral features. Taking the principal components as axes and projecting the sample points directly onto them yields sample feature points on a two-dimensional plane, which greatly reduces the dimensionality of the spectral data.
(2) The application of common data mining methods, namely support vector machines, artificial neural networks, K-means, and K-nearest neighbors, to classification and clustering is studied. A newer method, the ant colony algorithm, is also studied, with its classification and clustering models examined separately.
(3) In the MATLAB environment, following the general steps of data mining, support vector machines, artificial neural networks, K-means, K-nearest neighbors, and random forests are each applied to the same data set in cataclysmic variable mining experiments. The results of the methods are analyzed and compared comprehensively in terms of running time, number of CV-like objects found, and other measures. The candidate cataclysmic variables selected by the different methods are compared, and the reasons for the differences are analyzed.
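The steps in items (1) and (3) can be illustrated with a small sketch. It is not the thesis's MATLAB implementation: it is written in Python using only the standard library, it runs on synthetic "spectra" (a flat continuum with a single emission or absorption line standing in for Balmer-line behavior), and it uses K-nearest neighbors as just one of the listed methods. All function names (`toy_spectrum`, `fit_pca`, `project`, `knn_predict`) and all parameter values are assumptions made for illustration. PCA is done here by power iteration with deflation, a simple substitute for a library eigen-decomposition.

```python
import math
import random
import time

def toy_spectrum(rng, is_cv, n_bins=20, line_bin=10):
    """Synthetic flux vector (assumption): continuum plus one line,
    in emission if is_cv is True, in absorption otherwise, plus noise."""
    sign = 1.0 if is_cv else -1.0
    return [1.0 + sign * 0.8 * math.exp(-0.5 * ((b - line_bin) / 1.5) ** 2)
            + rng.gauss(0.0, 0.05) for b in range(n_bins)]

def fit_pca(rows, k=2, iters=200):
    """Return (means, components): top-k principal axes of the sample
    covariance, found by power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, means)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(0)
    components = []
    for _ in range(k):
        v = [rng.random() for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflate: subtract the found direction so the next pass finds the next axis.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)] for i in range(d)]
    return means, components

def project(row, means, components):
    """Coordinates of one spectrum on the principal axes."""
    centered = [x - m for x, m in zip(row, means)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]

def knn_predict(point, train_points, train_labels, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(range(len(train_points)),
                     key=lambda i: math.dist(point, train_points[i]))[:k]
    votes = sum(train_labels[i] for i in nearest)
    return 1 if 2 * votes > k else 0

# Label 1 = CV-like (emission line), label 0 = other (absorption line).
rng = random.Random(42)
train_labels = [1] * 40 + [0] * 40
train_spectra = [toy_spectrum(rng, label == 1) for label in train_labels]
test_labels = [1] * 10 + [0] * 10
test_spectra = [toy_spectrum(rng, label == 1) for label in test_labels]

# Step (1): reduce each 20-bin spectrum to a point on a two-dimensional plane.
means, components = fit_pca(train_spectra, k=2)
train_2d = [project(s, means, components) for s in train_spectra]
test_2d = [project(s, means, components) for s in test_spectra]

# Step (3): classify in the reduced space, recording time and candidate count.
start = time.perf_counter()
predicted = [knn_predict(p, train_2d, train_labels) for p in test_2d]
elapsed = time.perf_counter() - start

accuracy = sum(p == t for p, t in zip(predicted, test_labels)) / len(test_labels)
n_candidates = sum(predicted)
print(f"accuracy={accuracy:.2f} candidates={n_candidates} time={elapsed:.6f}s")
```

Swapping `knn_predict` for another classifier while keeping the same `train_2d` and `test_2d` would give the kind of side-by-side comparison of running time and candidate counts that item (3) describes. Real spectra are far less cleanly separated than this toy data.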
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree awarded]: 2011
[Classification number]: TP311.13;P145.4
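[Citing literature]
Related master's theses (top 2)
1 Zhang Ruimin; Construction of a Theoretical Stellar Spectral Template Library in a Parallel Environment [D]; Shandong University; 2012
2 Liu Jie; Research on Automatic Measurement of Stellar Atmospheric Physical Parameters Based on Template Matching [D]; Shandong University; 2012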
Article ID: 1906503
Article link: http://sikaile.net/kejilunwen/tianwen/1906503.html