Research on a Data-Mining-Based Early-Warning Model for Telecom Customer Churn
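Keywords: customer churn; data mining; class balancing; feature selection; combined early-warning model. Source: Central China Normal University, master's thesis, 2017. Document type: degree thesis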
[Abstract]: As an important part of customer relationship management, customer churn management is receiving growing attention from enterprises. Customer churn early warning is an effective churn management method: by building an early-warning model to predict and analyze potentially churning customers, issuing timely warnings, and taking corresponding retention measures, it can effectively reduce unnecessary churn and, to some extent, reduce enterprise losses. Telecom operators have very large customer bases, and therefore rich customer data, along with a strong need for churn early-warning management. Against this background, this thesis studies a data-mining-based early-warning model for telecom customer churn, using the ability of data mining to extract useful information from massive data to build models that warn of potential churn among telecom customers. After reviewing the work of domestic and international scholars, the thesis surveys recent research on early-warning model construction and on the application of data mining algorithms to it. It also introduces the concept of customer churn, the relevant data mining theory, and the techniques used to build early-warning models, which together form its theoretical foundation. For data preparation, the thesis takes customer data from a telecom operator in one city as its empirical subject and works through four steps, both in method and in practice: removal of useless features, missing-value imputation, data discretization, and balancing of imbalanced classes, so that the models are built on high-quality data. For key feature selection, given the high feature dimensionality of telecom customer data, it compares three commonly used methods: the chi-square test, principal component analysis, and the Fisher ratio. The comparison shows that churn warning models built on different algorithms achieve different predictive performance under different feature selection methods. Of the three, the Fisher ratio is better at selecting an optimized feature subset than the chi-square test or principal component analysis, and it yields better predictions for models built on each of the algorithms tested. For model construction, the thesis proposes a combined early-warning model for telecom customer churn. Unlike a typical combined model, it adds a Fisher-ratio feature selection step and optimizes the training set according to the best feature subset of each individual model. Three data mining algorithms, the C5.0 decision tree, the BP neural network, and the support vector machine (SVM), are used to build the base churn warning models. A Lagrangian function is then solved for the combination weights that minimize the deviation between the combined warning and the individual warnings, and the predictions of the three base models are linearly combined with these weights to produce the combined model's prediction. The empirical results show that the combined model predicts better than each individual base model and can, to some extent, reduce revenue loss for telecom operators.
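The abstract does not give the Fisher ratio formula or the exact objective behind the Lagrangian weights, so the following is only a rough sketch under stated assumptions. It assumes the standard two-class Fisher ratio (squared difference of class means over the sum of class variances) and the classic minimum-squared-error combination with weights constrained to sum to 1, which may differ from the thesis's own formulation. The function names `fisher_ratio` and `combination_weights` are hypothetical, and NumPy is an assumed dependency.

```python
import numpy as np


def fisher_ratio(X, y):
    """Score each feature with the two-class Fisher ratio (assumed form):

        score_j = (mean_churn_j - mean_stay_j)^2 / (var_churn_j + var_stay_j)

    X: (n_samples, n_features) array, y: binary labels (1 = churn).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    churn, stay = X[y == 1], X[y == 0]
    between = (churn.mean(axis=0) - stay.mean(axis=0)) ** 2
    within = churn.var(axis=0) + stay.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def combination_weights(preds, y):
    """Weights minimizing the squared error of the combined prediction
    subject to sum(w) == 1 (an assumed objective, not taken from the thesis).

    With E = errors^T errors, the Lagrangian
        L(w, lam) = w^T E w - lam * (1^T w - 1)
    is stationary at w = E^{-1} 1 / (1^T E^{-1} 1).

    preds: (n_samples, n_models) churn scores, y: true labels.
    """
    preds = np.asarray(preds, dtype=float)
    errors = preds - np.asarray(y, dtype=float)[:, None]
    E = errors.T @ errors          # cross-product matrix of model errors
    ones = np.ones(E.shape[0])
    z = np.linalg.solve(E, ones)   # E^{-1} 1 without forming the inverse
    return z / (ones @ z)


if __name__ == "__main__":
    # Toy data: three hypothetical base-model scores for six customers.
    y = np.array([0, 1, 0, 1, 1, 0])
    preds = np.array([
        [0.2, 0.4, 0.1],
        [0.7, 0.9, 0.5],
        [0.1, 0.3, 0.4],
        [0.9, 0.6, 0.8],
        [0.6, 0.8, 0.9],
        [0.3, 0.1, 0.2],
    ])
    w = combination_weights(preds, y)
    print("weights:", w)
    print("combined scores:", preds @ w)
```

In the setting the abstract describes, the three columns of `preds` would hold the scores from the C5.0, BP neural network, and SVM models, each trained on its own Fisher-selected feature subset. Note that this closed-form solution does not force the weights to be non-negative; a constrained solver would be needed if that is required.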
[Degree-granting institution]: Central China Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.13; F626
Article ID: 1554664
Article link: http://sikaile.net/jingjilunwen/xxjj/1554664.html