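Research on Tobacco Leaf Grading Based on Clustering and Weighted K-Nearest Neighbors
Topic: spectral analysis technology  Focus: tobacco leaf grading  Source: Zhengzhou University, 2017 master's thesis  Document type: degree thesis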
[Abstract]: At the tobacco purchasing stage, grading tobacco leaves correctly and objectively both encourages tobacco farmers to plant and protects the economic interests of cigarette manufacturers. Manual grading, the current practice, is highly subjective and costly in labor and materials: different experts may assign the same leaf to different grades. An objective, fast, and highly accurate intelligent grading method is therefore urgently needed. Current research on intelligent tobacco grading concentrates on two approaches, one based on leaf images and one based on infrared spectra. Because the spectral features of a leaf better reflect oil content, chroma, body, and maturity, all of which are closely related to grade, this thesis studies tobacco grading based on spectra. The recognition rate and overall speed of an intelligent grading system depend strongly on the chosen grading model and on the amount of characteristic spectral data collected from the samples. To build a real-time intelligent grading system with a high recognition rate, this thesis carries out the following work:

1. Spectrum acquisition, preprocessing, and isolated-sample detection. Reflectance spectra of 642 tobacco leaves (13 grades) were collected with a UV-3600 spectrometer. The spectra were normalized to reduce the effect on grading of noise from baseline drift and of differences among feature values. Because some samples may have been assigned to the wrong grade (isolated samples), the training set used to build the grading model had to be selected. Cosine distance, Euclidean distance, and the correlation coefficient method were each used, with thresholds chosen by statistical analysis, to detect the isolated samples in each grade and to determine the training set.

2. Construction of grading models and improvement of the K-nearest neighbor algorithm. Grading models were built with support vector machine (SVM), extreme learning machine (ELM), K-nearest neighbor (KNN), and weighted K-nearest neighbor, using the recognition rate of the grading model as the fitness function. On the full spectrum, the best test-set accuracies of ELM and SVM were 85.75% and 91.02%, respectively. Two weighting schemes were used for weighted KNN. In the first, all training samples in a grade carry the same weight, equal to the reciprocal of the number of samples in that grade. In the second, the K nearest neighbors are found first and each is given a weight negatively correlated with its distance; the leaf is assigned a grade by computing the sum of weights for each grade. Combining the two schemes gives a recognition rate of up to 90.77%. Weighted KNN classifies better than conventional KNN and has lower computational complexity than SVM and ELM, so it was chosen as the classifier for grade determination.

3. Preliminary feature screening based on clustering. Considering both the within-class scatter and the between-class scatter of each feature, a discriminant function D was constructed to judge how good a feature is. Based on the D values, features to the right of an inflection point were deleted. The best grading result was obtained at the sixth inflection point, leaving 326 features; test-set accuracy rose from 90.77% to 94.59%, so the recognition rate improved while the number of features fell.

4. Deeper feature screening. Particle swarm optimization (BPSO), a genetic algorithm (GA), and correlation coefficient analysis (CC) were used to screen features further. BPSO gave the better result: the number of features fell from the original 451 to 143, which saves 68.3% of the time spent collecting spectra, and the recognition rate rose from the original 90.77% to 93.69%, an increase of 2.92 percentage points.
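The isolated-sample detection in item 1 can be illustrated with a minimal sketch. The abstract states only that cosine distance (among other measures) and a statistically chosen threshold were used; the comparison against the grade's mean spectrum, the rule "mean plus n_sigma standard deviations", and the names `cosine_distance` and `find_isolated` are assumptions made here for illustration, not the thesis's actual procedure.

```python
import math
from statistics import mean, pstdev


def cosine_distance(a, b):
    """Return 1 - cos(angle) between two spectra given as equal-length lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def find_isolated(spectra, n_sigma=2.0):
    """Return indices of spectra that lie unusually far from the grade's mean spectrum.

    Assumption: a sample is "isolated" when its cosine distance to the mean
    spectrum exceeds mean(distance) + n_sigma * std(distance).
    """
    n_bands = len(spectra[0])
    # Mean spectrum of the grade, band by band
    center = [mean(s[i] for s in spectra) for i in range(n_bands)]
    dists = [cosine_distance(s, center) for s in spectra]
    threshold = mean(dists) + n_sigma * pstdev(dists)
    return [i for i, d in enumerate(dists) if d > threshold]


# Nine similar spectra from one grade plus one spectrum with a reversed shape
grade_spectra = [[1.0 + 0.01 * i, 2.0, 3.0] for i in range(9)] + [[3.0, 2.0, 1.0]]
isolated = find_isolated(grade_spectra)
```

Samples returned by `find_isolated` would be left out of the training set for that grade. Swapping `cosine_distance` for Euclidean distance or a correlation-based measure changes only the distance function.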
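The two weighting schemes described in item 2 can be combined as in the sketch below. The abstract says only that the distance weight is negatively correlated with distance and that the two schemes are combined; the inverse-distance form `1 / (d + eps)`, the multiplication of the two weights, and the name `weighted_knn_predict` are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


def weighted_knn_predict(train_x, train_y, query, k=5, eps=1e-9):
    """Predict a grade by summing neighbor weights per grade.

    Each of the k nearest neighbors contributes
        (1 / number of training samples in its grade) * (1 / (distance + eps)).
    Both the inverse-distance form and the product are assumptions.
    """
    class_sizes = Counter(train_y)
    # Euclidean distance from the query to every training spectrum
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    scores = defaultdict(float)
    for dist, label in neighbors[:k]:
        class_weight = 1.0 / class_sizes[label]   # scheme 1: per-grade weight
        distance_weight = 1.0 / (dist + eps)      # scheme 2: closer counts more
        scores[label] += class_weight * distance_weight
    return max(scores, key=scores.get)


# Toy data: grade "A" has three samples, grade "B" has one
train_x = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0]]
train_y = ["A", "A", "A", "B"]
predicted = weighted_knn_predict(train_x, train_y, [0.9, 0.9], k=3)
```

With `k=3`, a plain majority vote would pick "A" for the query `[0.9, 0.9]` because two of the three neighbors belong to "A"; the weighted sum instead favors the single very close "B" sample from the smaller grade.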
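The feature screening in item 3 relies on a discriminant function D built from within-class and between-class scatter. The abstract does not give the formula, so the sketch below substitutes a standard Fisher-style ratio (between-class scatter divided by within-class scatter) as a stand-in; the names `discriminant_scores` and `rank_features` are hypothetical, and the thesis's actual D and its inflection-point cutoff may differ.

```python
from statistics import mean


def discriminant_scores(X, y):
    """Score each feature by between-class scatter / within-class scatter.

    X is a list of samples (lists of feature values); y holds the grade labels.
    A Fisher-style ratio is assumed here in place of the thesis's function D.
    """
    labels = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = mean(column)
        between, within = 0.0, 0.0
        for label in labels:
            values = [row[j] for row, lab in zip(X, y) if lab == label]
            class_mean = mean(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def rank_features(X, y):
    """Return feature indices ordered from most to least discriminative."""
    scores = discriminant_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Feature 0 separates the two grades; feature 1 varies mostly within each grade
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 5.2], [3.1, 0.9]]
y = ["A", "A", "B", "B"]
order = rank_features(X, y)
```

Keeping only the leading entries of `order` up to a chosen cutoff mirrors the idea of discarding the features on one side of an inflection point in the sorted scores.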
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TS442
Article ID: 1599902
Article link: http://sikaile.net/shoufeilunwen/boshibiyelunwen/1599902.html