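Research on Industrial Product Defect Analysis Methods Based on Multi-Classifier Ensembles
Published: 2018-01-14 20:43
Source: Zhejiang University, 2017 master's thesis. Document type: degree thesis
Keywords: industrial data, cost-sensitive learning, multi-class classifier, class imbalance, ensemble methods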
[Abstract]: Analyzing defects in manufactured industrial products is an important way to improve a company's manufacturing process, and it has research significance and practical value for product quality and sales revenue. With the rapid development of computer technology and the widespread deployment of automation systems, collecting and storing information during manufacturing has become much easier, and data with latent information and value keeps accumulating. At the same time, machine learning and data mining methods have been developed and applied rapidly across many industries. Because of the nature of the industry, however, manufacturing makes far less use of such data than other sectors and has not yet realized its value. This thesis therefore addresses the main characteristics of manufacturing product data, summarizes a general processing workflow for industrial product defect analysis, preprocesses and statistically analyzes the data, and recasts the problem of relating a product's quality inspection results to its defect data as building a classification model of product quality and defects with statistical learning methods. The defect data, however, contains multiple defect classes and an imbalance in the number of samples per class, which is a major obstacle to building a classification model. To remove both obstacles at once, this thesis proposes a multi-class classifier model that combines cost sensitivity with ensemble methods: samples are reweighted and rescaled to incorporate classification cost, and multiple decision trees are then combined into an ensemble to build the multi-class model. Experimental results show that the model handles multi-class problems with imbalanced classes effectively while balancing classification cost against prediction accuracy. In addition, fitting the decision tree ensemble yields importance measures for the relevant attributes, which can serve as a basis for tracing the main factors behind defects.
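The abstract does not give the exact reweighting formula, so the following is only a minimal sketch of one common way to do cost-sensitive sample reweighting with rescaling. The function names, the inverse-frequency default cost, and the example labels are all assumptions made for illustration and are not taken from the thesis.

```python
from collections import Counter


def inverse_frequency_costs(labels):
    """Hypothetical default cost per class: rarer classes cost more.

    Assumption for illustration only; the thesis may define costs differently.
    """
    counts = Counter(labels)
    n = len(labels)
    k = len(counts)
    return {cls: n / (k * m) for cls, m in counts.items()}


def rescale_sample_weights(labels, class_costs):
    """Give each sample a weight proportional to its class cost,
    then rescale so the weights sum to the number of samples."""
    raw = [class_costs[y] for y in labels]
    scale = len(labels) / sum(raw)
    return [w * scale for w in raw]


# Illustrative, made-up labels: 8 defect-free items and 2 scratched items.
labels = ["ok"] * 8 + ["scratch"] * 2
costs = inverse_frequency_costs(labels)
weights = rescale_sample_weights(labels, costs)
```

Weights of this kind could then be passed to any tree ensemble that accepts per-sample weights when fitting (for example, the `sample_weight` argument of `fit` in scikit-learn's tree ensembles), so that minority defect classes contribute as much total weight as the majority class.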
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP18
Article ID: 1425246
Article link: http://sikaile.net/guanlilunwen/yingxiaoguanlilunwen/1425246.html