Research on Tobacco Leaf Grading Based on Clustering and Weighted K-Nearest Neighbors
Topic: spectral analysis technology. Focus: tobacco leaf grading. Source: Zhengzhou University, master's thesis, 2017. Document type: degree thesis.
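[Abstract (translated from the Chinese)]: At the tobacco purchasing stage, grading leaves correctly and objectively both raises growers' motivation to plant and protects the economic interests of cigarette manufacturers. Manual grading as practiced today is highly subjective and consumes considerable labor and material resources; different experts may assign the same leaf to different grades. Objective, fast, and accurate intelligent grading is therefore urgently needed. Current research on intelligent tobacco grading concentrates on two approaches: grading from leaf images and grading from infrared spectra. Because a leaf's spectral features better reflect oil content, chroma, body, maturity, and other factors closely tied to grade, this thesis studies grading based on spectra. The recognition rate and overall speed of an intelligent grading system depend heavily on the grading model chosen and on the number of spectral features that must be acquired per sample. To build a real-time grading system with a high recognition rate, the thesis carries out the following work:
1. Spectrum acquisition, preprocessing, and detection of isolated samples. Reflectance spectra of 642 tobacco leaves (13 grades) were acquired with a UV-3600 spectrometer. The spectra were normalized to reduce the influence on grading of noise caused by baseline drift and of differences in scale between feature values. Because some samples may carry the wrong grade label (isolated samples), the training set used to build the grading model has to be selected. Cosine distance, Euclidean distance, and the correlation coefficient were each used, with thresholds chosen by statistical analysis, to detect the isolated samples in each grade and to fix the training set.
2. Construction of grading models and improvement of the K-nearest-neighbor algorithm. Grading models based on the support vector machine (SVM), extreme learning machine (ELM), K-nearest neighbors (KNN), and weighted KNN were built, with the recognition rate of the model used as the fitness function. On the full spectrum, the best test-set accuracies of ELM and SVM were 85.75% and 91.02%, respectively. Two weighting schemes were used for weighted KNN. In the first, every training sample of a grade receives the same weight, equal to the reciprocal of the number of samples in that grade. In the second, the K nearest neighbors are found first and given weights that decrease with distance, and the leaf is assigned to the grade with the largest sum of weights. Combining the two schemes gives a recognition rate of 90.77%. Weighted KNN classifies better than conventional KNN and has lower computational complexity than SVM and ELM, so it was chosen as the classifier for tobacco grade.
3. Preliminary feature screening based on the idea of clustering. Considering both the within-class scatter and the between-class scatter of each feature, a discriminant function D was constructed to separate good features from bad ones, and the features to the right of an inflection point of the D curve were deleted. The best grading result was obtained at the sixth inflection point, leaving 326 features; test-set accuracy rose from 90.77% to 94.59%, so the recognition rate improved while the number of features fell.
4. Deeper feature selection. Particle swarm optimization (BPSO), a genetic algorithm (GA), and correlation coefficient analysis (CC) were applied to screen features further. BPSO performed well: the number of features fell from 451 to 143, which saves 68.3% of the time spent acquiring spectra, and the recognition rate rose from 90.77% to 93.69%, a gain of 2.92 percentage points.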
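The weighted K-nearest-neighbor rule described in item 2 can be illustrated with a minimal sketch. The abstract does not give the exact weight formulas or say how the two schemes are combined, so several choices here are assumptions made only for illustration: Euclidean distance, an inverse-distance weight `1 / (d + eps)`, and multiplying the class-size weight by the distance weight. The function name `weighted_knn_predict` and its parameters are hypothetical.

```python
import math
from collections import Counter, defaultdict


def weighted_knn_predict(train_x, train_y, query, k=5, eps=1e-9):
    """Assign `query` to the grade with the largest summed neighbor weight.

    Assumed weighting (not taken from the thesis):
      class weight    = 1 / (number of training samples in that grade)
      distance weight = 1 / (distance + eps), which decreases with distance
      combined weight = class weight * distance weight
    """
    class_counts = Counter(train_y)

    # Euclidean distance from the query to every training spectrum.
    neighbors = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    neighbors.sort(key=lambda pair: pair[0])

    # Sum the combined weights of the k nearest neighbors per grade.
    scores = defaultdict(float)
    for distance, label in neighbors[:k]:
        class_weight = 1.0 / class_counts[label]
        distance_weight = 1.0 / (distance + eps)
        scores[label] += class_weight * distance_weight

    return max(scores, key=scores.get)
```

With the toy data `train_x = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0]]`, `train_y = ["A", "A", "A", "B"]`, and `k=3`, the query `[0.9, 0.9]` has two "A" neighbors and one "B" neighbor, so a plain majority vote would pick "A"; under the weighting assumed here the single, much closer "B" sample dominates.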
[Abstract]: In the tobacco leaf purchasing stage, correct and objective grading of tobacco leaves can both improve the planting enthusiasm of tobacco farmers and protect the economic interests of cigarette enterprises. Manual grading at the present stage is highly subjective and costly in labor and materials; for the same leaf, different experts may assign different grades. Therefore, objective, fast and accurate intelligent grading is urgently needed. Research on intelligent grading of tobacco leaves focuses on two approaches: grading based on leaf images and grading based on infrared spectra. Because the spectral characteristics of a tobacco leaf better reflect oil content, chroma, body, maturity and other factors closely related to grade, this thesis studies tobacco grading based on spectra. The recognition rate and overall speed of an intelligent grading system depend strongly on the selected grading model and on the number of spectral features collected per sample. In order to realize a real-time intelligent grading system with a high recognition rate, the following work was carried out:
1. Collection and preprocessing of tobacco leaf spectra and detection of isolated samples. The reflectance spectra of 642 tobacco leaves (13 grades) were collected with a UV-3600 spectrometer. To reduce the effect on grading of noise caused by baseline drift and of differences between feature values, the spectra were normalized. Because misgraded samples (isolated samples) may exist, the training set used to build the grading model must be selected. Cosine distance, Euclidean distance and the correlation coefficient method were each used, with suitable thresholds chosen by statistical analysis, to detect the isolated samples in each grade and to determine the training set.
2. Construction of grading models and improvement of the K-nearest-neighbor algorithm. Tobacco grading models based on the support vector machine (SVM), extreme learning machine (ELM), K-nearest neighbors (KNN) and weighted KNN were constructed, using the recognition rate of the grading model as the fitness function. On the full spectrum, the best test-set accuracies of ELM and SVM were 85.75% and 91.02%, respectively. Weighted KNN uses two schemes. In one, all training samples of a grade have the same weight, the reciprocal of the number of samples in that grade. In the other, the K nearest neighbors are found, weights negatively related to distance are applied, and the leaf is graded by the sum of weights for each grade. The recognition rate of the two schemes combined reaches 90.77%. Weighted KNN classifies better than traditional KNN and has lower computational complexity than SVM and ELM, so it was chosen as the classifier for tobacco grade.
3. Preliminary feature screening based on the clustering idea. The within-class dispersion and between-class dispersion of each feature are considered together to construct a discriminant function D that distinguishes good features from bad ones. According to the D values, the features to the right of an inflection point are deleted. The best grading result is obtained at the sixth inflection point, leaving 326 features; test-set accuracy increases from 90.77% to 94.59%, which both improves the recognition rate and reduces the number of features.
4. Selection of deep features. Particle swarm optimization (BPSO), a genetic algorithm (GA) and correlation coefficient analysis (CC) were used to screen the features further. BPSO achieved good results: the number of features is reduced from 451 to 143, which saves 68.3% of the time spent collecting spectra, and the recognition rate increases from 90.77% to 93.69%, a gain of 2.92 percentage points.
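The preliminary feature screening in item 3 ranks features by how well they separate the grades. The abstract does not state the form of the discriminant function D or how inflection points are located, so the sketch below substitutes a Fisher-style ratio (scatter of the class means divided by the average scatter inside each class) and a fixed `keep` count in place of the inflection-point rule. The names `feature_scores` and `select_top_features` are hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def feature_scores(samples, labels):
    """Return one between-class / within-class scatter ratio per feature.

    A larger score means the feature separates the grades better.
    This ratio is an assumed stand-in for the thesis's function D.
    """
    rows_by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        rows_by_class[label].append(row)

    scores = []
    for j in range(len(samples[0])):
        # Values of feature j, grouped by grade.
        columns = [[row[j] for row in rows] for rows in rows_by_class.values()]
        class_means = [fmean(col) for col in columns]
        between = pvariance(class_means)                  # scatter of class means
        within = fmean([pvariance(col) for col in columns])  # mean in-class scatter
        scores.append(between / (within + 1e-12))         # guard against zero scatter
    return scores


def select_top_features(samples, labels, keep):
    """Return the indices of the `keep` highest-scoring features, in ascending order."""
    scores = feature_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])
```

In this sketch a feature whose class means differ widely while its values vary little inside each grade scores high; a feature whose class means coincide scores near zero and is dropped first.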
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TS442
Article ID: 1599902
Article link: http://sikaile.net/shoufeilunwen/boshibiyelunwen/1599902.html