Research on Origin Confirmation Technology for Geographical Indication Rice Based on Machine Learning Methods
[Abstract]: This study examines the feasibility of applying machine learning methods to confirming the origin of geographical indication (GI) rice and builds origin-confirmation models for adjacent producing regions, providing a theoretical basis for a GI rice protection system. A total of 166 rice samples were collected from Meihekou City, Jilin Province, and its adjacent regions. The contents of 10 mineral elements (including Cu, Zn, Fe, K, Na, Mg, Pb and Cd) in the samples were determined by atomic absorption spectrophotometry (AAS). The resulting data were divided by stratified sampling into a training set and a test set at a 7:3 ratio. Random forest (RF) and support vector machine (SVM) models were built and compared with a multivariate statistical discriminant model built by linear discriminant analysis (LDA). The main conclusions are as follows: (1) Both machine learning methods, random forest and support vector machine, can be applied to confirming the origin of GI rice; the prediction accuracies of the RF and SVM models are 96% and 94%, respectively. (2) For the RF model, 8 of the 10 elements were selected as the feature subset, and the initial parameters mtry=3, ntree=500 were optimized to mtry=1, ntree=600. After optimization, the RF model can be built from only 8 elements, and its accuracy on the external test set rises from 94% to 96%, indicating improved generalization ability. (3) SVM models were built with four kernel functions (linear, Gaussian, polynomial and sigmoid). After parameter optimization, the accuracy of all four models improved; the linear-kernel model had the highest accuracy, the fewest support vectors and the fewest parameters to optimize, so feature selection was then carried out on the linear-kernel model. The accuracy of the optimized model on the external test set rises from 91.67% to 94%. (4) After feature selection and optimization, the accuracy of the LDA model on the external test set is about 92%. By comparison, the models built with the machine learning methods (RF and SVM) impose no prior assumptions on the input data, generalize better, and predict unknown samples more accurately than the LDA model. (5) Across the three methods, the two machine learning models outperform the LDA model in both accuracy and generalization ability. Comparing the degree of overfitting and the cost of model construction shows that the RF model is optimal: it combines high prediction accuracy, strong generalization ability, a low degree of overfitting and a low model construction cost.
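The stratified 7:3 split and the accuracy figures quoted in the abstract can be illustrated with a minimal Python sketch. It is not the author's code: the abstract does not name the software used or the number of samples per region, so the function names, the two class labels and the class sizes (80 and 86) below are assumptions chosen only so that the total is 166.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test, sampling each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)  # random order within the class
        n_train = round(len(members) * train_fraction)
        train_idx.extend(members[:n_train])
        test_idx.extend(members[n_train:])
    return sorted(train_idx), sorted(test_idx)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels: 166 samples in total; the per-class counts are invented.
labels = ["Meihekou"] * 80 + ["adjacent"] * 86
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps the share of each origin roughly equal in the training and test sets, which matters when the external test set is small and a single misclassified sample shifts the reported accuracy by a couple of percentage points.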
[Degree-granting institution]: Jilin Agricultural University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: S511; TP181
Article No.: 2195536
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2195536.html