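Analysis, Research, and Application of the Bisecting K-Means Algorithm Based on Financial Customer Profiles
Published: 2018-06-08 00:12
Topics: data warehouse + customer profile. Source: master's thesis, University of Chinese Academy of Sciences (School of Engineering Management and Information Technology), 2016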
[Abstract]: With the rapid development of the Internet in recent years, a large number of enterprises have entered the e-commerce field and use e-commerce platforms to market and promote their products. Information technology has advanced quickly alongside the Internet, and the Internet finance model has gradually emerged. The arrival of the big data era is both a challenge and an opportunity for financial institutions. Internet finance is not simply the provision of financial services over the Internet; that is only its surface form. Behind it lie the accumulation of large volumes of data and strong data processing capability, which correspond to the two key foundations of Internet finance: big data and cloud computing. Internet finance relies on big data and cloud computing to provide customers with a range of Internet financial services. The brokerage e-commerce platform studied in this thesis, which has Internet finance attributes, is a comprehensive platform that combines product sales, consulting services, investment adviser contracting, and securities trading, and that is built on big data and cloud computing.

At present there is no financial platform aimed specifically at the precise classification of brokerage e-commerce customers, and customer profiles are still used only to describe user information in simple terms. This thesis uses cloud computing to model users' basic information, asset information, transaction records, platform activity traces, and other behavioral data. On the basis of the customer profiles, it applies cluster analysis to build a classification model and stratify customers, and then develops a personalized marketing plan for each customer tier so that product marketing and promotion are better targeted.

Customer stratification is usually implemented with clustering algorithms, and K-means is one of the most commonly used data mining algorithms. Through an in-depth analysis of K-means, the author finds that choosing appropriate initial centroids is the key to the algorithm's execution. Centroids are generally selected at random to avoid human intervention, but this causes different runs to produce different total sums of squared errors (Sum of the Squared Error, SSE), which ultimately affects the accuracy and stability of the results. To overcome the drawback of random centroid selection, the American scholar Pang-Ning Tan proposed the bisecting K-means algorithm. Its basic idea is to split the set of all points into two clusters, select one of the existing clusters according to some criterion and split it again, and continue in this way until K clusters are produced. Based on the experimental results, the thesis concludes that bisecting K-means is less sensitive to the choice of centroids and that its efficiency and accuracy are much higher than those of K-means. The thesis therefore focuses on analyzing, studying, and applying bisecting K-means: after the algorithm classifies brokerage customers, customers in different tiers are matched with products of different risk levels, which achieves differentiated, precision marketing at the strategy level.

The main work completed in this thesis includes: (1) establishing a unified data center that extracts and classifies each category of customer data in a uniform way and uses a series of methods to filter and integrate the data so that it meets the requirements of the experiments; (2) establishing a customer profile system with a unified set of profile indicators, and using these indicators to select the customers that serve as the basis for cluster analysis; (3) classifying customer data with an optimized cluster analysis method, stratifying customers, and developing personalized marketing plans to improve the customer conversion rate.

Recognizing the importance of customer research for current Internet finance e-commerce platforms, and building on a systematic review of the classic literature, this study models customers' big data on a cloud computing platform, classifies users with a classification algorithm on the basis of customer profiles, and targets users precisely. The data model is verified and refined through actual personalized marketing and promotion, which improved the brokerage's customer conversion rate and achieved the expected results.
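The bisecting procedure summarized in the abstract can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the thesis's implementation: the function names (`bisecting_kmeans`, `two_means`, `sse`, `mean`), the rule of always splitting the cluster with the largest SSE, the number of trial splits, and the toy data are all hypothetical choices, since the abstract only says that a cluster is selected "according to some criterion".

```python
import random


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sse(points, centroid):
    """Sum of squared Euclidean distances from the points to one centroid."""
    return sum(sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in points)


def two_means(points, rng, iters=20):
    """Ordinary K-means with K=2 on one cluster; returns two lists of points.

    One of the lists may be empty if the two random seeds coincide.
    """
    centroids = rng.sample(points, 2)  # random initial centroids
    halves = ([], [])
    for _ in range(iters):
        halves = ([], [])
        for p in points:
            d0 = sse([p], centroids[0])
            d1 = sse([p], centroids[1])
            halves[0 if d0 <= d1 else 1].append(p)
        if not halves[0] or not halves[1]:
            break  # degenerate split; the caller discards it
        new_centroids = [mean(halves[0]), mean(halves[1])]
        if new_centroids == centroids:
            break  # assignments have converged
        centroids = new_centroids
    return halves


def bisecting_kmeans(points, k, trials=5, seed=0):
    """Split clusters in two until k clusters exist (or no split is possible)."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        candidates = [i for i, c in enumerate(clusters) if len(c) >= 2]
        if not candidates:
            break
        # Assumed selection rule: split the cluster with the largest SSE.
        i = max(candidates, key=lambda j: sse(clusters[j], mean(clusters[j])))
        worst = clusters.pop(i)
        best = None
        for _ in range(trials):
            a, b = two_means(worst, rng)
            if not a or not b:
                continue
            cost = sse(a, mean(a)) + sse(b, mean(b))
            if best is None or cost < best[0]:
                best = (cost, a, b)  # keep the trial with the lowest total SSE
        if best is None:
            clusters.append(worst)  # e.g. all points identical; stop here
            break
        clusters.extend(best[1:])
    return clusters


# Hypothetical toy data: two well-separated groups of 2-D points.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
groups = bisecting_kmeans(toy, k=2)
```

Running several trial bisections and keeping the one with the lowest total SSE is what reduces the dependence on any single random initialization; each trial still starts from random centroids, so the result is less sensitive to initialization rather than independent of it.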
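[Degree-granting institution]: University of Chinese Academy of Sciences (School of Engineering Management and Information Technology)
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP311.13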