Disaggregation of Soil Map Polygons Based on the Random Forest Algorithm

Published: 2018-07-26 18:26

[Abstract]: Traditional soil maps are drawn as polygons and compiled from lengthy field surveys and the interpretation of aerial photographs, so both the survey method and the mapping method are inefficient, time-consuming, and labor-intensive, and the accuracy of these maps no longer meets the needs of modern science. Traditional soil maps face three main problems. First, the map scale determines the minimum polygon size: the larger the scale, the smaller the smallest polygon that can be shown. Because of this scale limit, small polygons inside large ones are omitted during compilation, which simplifies the map both spatially and in its attributes. Second, hand-drawn polygons ignore the gradual spatial variation of soil; abrupt changes at polygon boundaries turn soil patterns and properties that actually vary continuously into abrupt ones. Third, mapping that relies on expert experience and manual drafting is time-consuming and laborious and is prone to human error. Nevertheless, traditional soil maps embody a large amount of expert knowledge; they are a valuable historical resource and remain an important reference for current research. This study takes the Sheshui River basin in Huajiahe Town, Hong'an County, Huanggang City, Hubei Province as the study area. Starting from the traditional soil map produced by the Second National Soil Survey, and using existing terrain data and multispectral data, it applies a random forest model on a GIS platform and in the R language environment to extract knowledge of soil-environment relationships, then uses the model to spatially disaggregate the original soil map polygons, yielding a soil distribution map with more detailed spatial information.
The research proceeded in the following steps:
1) Extract landscape factor data related to the soil-forming environment of the study area. The initial environmental variables comprise soil parent material data, terrain data, and multispectral data. Slope, aspect, topographic wetness index, and three curvature measures (including curvature along the contour and horizontal curvature) were derived from elevation data; normalized difference vegetation index, normalized difference water index, first principal component, skewness, information entropy, variance, and mean were derived from the multispectral data. Together with parent material, these form the environmental covariates used as model inputs.
2) Design the sampling points. Sampling was weighted by polygon area, with at least 10 sample points guaranteed in every polygon, giving 6686 sample points in total. The environmental factor data were extracted at the sample points, and the sample data were then grouped by parent material.
3) Screen the environmental factors. To ensure mapping accuracy and efficiency, factors that contribute little to the model need to be removed; this study used the variable importance measure provided by the importance() function in R for the screening.
4) Determine the model parameters. The two key parameters of the random forest model, mtry and ntree, were chosen from the model's out-of-bag error and from its stability, respectively.
5) Apply the model. The data were modeled with the randomForest package in R, giving four sets of models, one for each of the four parent material units. For every raster cell in the study area, these models vote on the soil type from the cell's environmental factor values; the winning vote gives the soil type at that location, and together the cells form the soil map of the study area.
The results show that, compared with the traditional soil map, the disaggregated map contains markedly more polygons and a more detailed spatial distribution, revealing more detail.
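
The area-weighted sampling design in step 2 can be illustrated with a short Python sketch (Python is used here only for illustration; the study itself worked in R). The abstract does not give the allocation formula, so the proportional rule, the function name allocate_samples, the polygon identifiers, and the areas below are all assumptions; only the minimum of 10 sample points per polygon comes from the text.

```python
def allocate_samples(polygon_areas, target_total, min_per_polygon=10):
    """Assign a sample count to each polygon in proportion to its area.

    polygon_areas: dict mapping polygon id -> area (any consistent unit).
    target_total:  number of samples to distribute before the floor is applied.
    min_per_polygon: floor applied to every polygon (10 in the abstract).
    """
    total_area = sum(polygon_areas.values())
    if total_area <= 0:
        raise ValueError("total polygon area must be positive")

    allocation = {}
    for polygon_id, area in polygon_areas.items():
        proportional = round(target_total * area / total_area)
        # Small polygons are raised to the floor so none is under-sampled.
        allocation[polygon_id] = max(min_per_polygon, proportional)
    return allocation


# Hypothetical polygons and areas (hectares), invented for illustration.
areas = {"P1": 500.0, "P2": 450.0, "P3": 50.0}
counts = allocate_samples(areas, target_total=100)
print(counts)                # {'P1': 50, 'P2': 45, 'P3': 10}
print(sum(counts.values()))  # 105
```

Because the floor is applied after the proportional split, the realized total can exceed target_total, as it does in this example.
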
In this study the RF model performed well on the classification problem, which indicates that soil-environment knowledge extracted with an RF model is credible and that the approach can serve as an efficient method for fine-scale digital soil mapping. In addition, the variable importance measure provided by the random forest algorithm ranks the variables so that factors contributing little to the model can be removed. This maintains classification accuracy while greatly improving computational efficiency, and it provides a reliable method and basis for disaggregating soil map polygons over large areas in the future.
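
The factor screening mentioned here amounts to ranking the covariates by importance and discarding the weakest ones. The sketch below shows that idea in Python for illustration only; in the study the scores come from the importance() function in R. The cutoff rule (a minimum share of total importance), the function name screen_factors, and every score shown are assumptions, not values from the thesis.

```python
def screen_factors(importance_scores, min_share=0.02):
    """Rank factors by importance and split them into kept and dropped lists.

    importance_scores: dict mapping factor name -> non-negative importance score.
    min_share: smallest fraction of the total importance a factor needs to be kept.
    """
    total = sum(importance_scores.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")

    # Sort from most to least important.
    ranked = sorted(importance_scores.items(), key=lambda item: item[1], reverse=True)
    kept = [name for name, score in ranked if score / total >= min_share]
    dropped = [name for name, score in ranked if score / total < min_share]
    return kept, dropped


# Made-up scores for a few of the covariates named in the abstract.
scores = {"slope": 40.0, "ndvi": 30.0, "twi": 29.0, "aspect": 1.0}
kept, dropped = screen_factors(scores)
print(kept)     # ['slope', 'ndvi', 'twi']
print(dropped)  # ['aspect']
```

A fixed share is only one possible rule; a ranked list could equally be cut at a fixed number of factors or at the point where removing more factors starts to raise the out-of-bag error.
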
[Degree-granting institution]: Huazhong Agricultural University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: S159.9

Article ID: 2146909

Article link: http://sikaile.net/kejilunwen/nykj/2146909.html
