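Research on Airline Passenger Segmentation Methods Based on Booking Behavior
Published: 2018-06-02 06:48
Topics: customer segmentation + airline passengers; Source: master's thesis, Jiangsu University of Science and Technology, 2015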
[Abstract]: In recent years, rapid growth in the domestic economy has sharply increased the number of civil aviation passengers, and domestic civil aviation has entered a period of rapid development. To cope with fierce competition in the civil aviation market, airlines urgently need to analyze the travel preferences of different passenger groups and formulate competitive strategies accordingly. This thesis therefore takes the customer information recorded when passengers purchase tickets as its data source and uses cluster analysis to partition the customer base and then analyze passengers' travel preferences.

Unlike the purely numeric data that traditional clustering algorithms handle, the booking-behavior data analyzed here has two distinguishing features. First, the source data is mixed-type, containing both numeric and categorical attributes. Second, the data is very large and is stored in distributed form across the individual airlines. The thesis therefore first adapts an existing clustering algorithm to mixed-type data from a single airline, analyzing that airline's passenger travel preferences from a local perspective. It then designs a distributed clustering algorithm that draws on passenger information from different airlines at the same time, analyzing travel preferences from a global perspective. The work comprises two main parts:

(1) Based on the information recorded during booking, passenger segmentation is formulated as a clustering problem over mixed-type data, and the k-prototypes algorithm is used to partition airline passengers. Because some attributes describing ticket purchases are discrete, have many categories, and are semantically ambiguous, civil aviation domain knowledge is used to transform their representation, which simplifies the category information and makes the knowledge implicit in the attribute data explicit. A quantitative model of passenger value is also constructed to characterize passenger value, so that travel preferences can be analyzed on top of an effective segmentation.

(2) To handle large-scale distributed mixed-type datasets, the k-prototypes algorithm is extended to run in parallel and combined with domain knowledge, yielding the Domain based Parallel K-prototypes (DPKP) algorithm. Each airline's passenger segmentation and data analysis are carried out at that airline's own site, which improves running efficiency while protecting the airlines' commercial privacy.

Experimental results show that the proposed clustering algorithms are suitable for partitioning airline passenger data, improving both the accuracy of the clustering results and the time efficiency of clustering. Finally, using passenger datasets provided by domestic airlines together with the proposed algorithms, the thesis builds an airline passenger segmentation model, segments the passengers, analyzes the travel needs of the different groups according to the segmentation results, and formulates corresponding marketing strategies as strategic recommendations for airlines.
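The abstract does not describe how DPKP exchanges information between airline sites, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes the standard k-prototypes dissimilarity (squared Euclidean distance on numeric attributes plus a weight `gamma` times the number of categorical mismatches) and a common parallel pattern in which each site sends only per-cluster aggregates to a coordinator. All function names, the sample attributes (scaled fare, scaled booking lead time, cabin, channel), and the value of `gamma` are hypothetical.

```python
from collections import Counter

def dissimilarity(record, prototype, num_idx, cat_idx, gamma):
    """Squared Euclidean distance on numeric attributes plus
    gamma times the number of mismatched categorical attributes."""
    numeric = sum((record[i] - prototype[i]) ** 2 for i in num_idx)
    mismatches = sum(record[i] != prototype[i] for i in cat_idx)
    return numeric + gamma * mismatches

def local_summary(records, prototypes, num_idx, cat_idx, gamma):
    """Runs at one site: assign each local record to its nearest prototype
    and return only per-cluster aggregates, never the raw records."""
    k = len(prototypes)
    summary = [{"n": 0,
                "sums": {i: 0.0 for i in num_idx},
                "counts": {i: Counter() for i in cat_idx}} for _ in range(k)]
    for r in records:
        nearest = min(range(k), key=lambda j: dissimilarity(
            r, prototypes[j], num_idx, cat_idx, gamma))
        s = summary[nearest]
        s["n"] += 1
        for i in num_idx:
            s["sums"][i] += r[i]
        for i in cat_idx:
            s["counts"][i][r[i]] += 1
    return summary

def merge_summaries(summaries, prototypes, num_idx, cat_idx):
    """Runs at the coordinator: combine site aggregates into new prototypes
    (mean for numeric attributes, mode for categorical attributes)."""
    updated = []
    for j, old in enumerate(prototypes):
        proto = list(old)
        n = sum(s[j]["n"] for s in summaries)
        if n > 0:  # an empty cluster keeps its previous prototype
            for i in num_idx:
                proto[i] = sum(s[j]["sums"][i] for s in summaries) / n
            for i in cat_idx:
                total = Counter()
                for s in summaries:
                    total.update(s[j]["counts"][i])
                proto[i] = total.most_common(1)[0][0]
        updated.append(proto)
    return updated

def distributed_k_prototypes(sites, prototypes, num_idx, cat_idx,
                             gamma=1.0, max_iter=20):
    """Alternate local assignment and global merging until prototypes stop changing."""
    for _ in range(max_iter):
        summaries = [local_summary(recs, prototypes, num_idx, cat_idx, gamma)
                     for recs in sites]
        updated = merge_summaries(summaries, prototypes, num_idx, cat_idx)
        if updated == prototypes:
            break
        prototypes = updated
    return prototypes

# Hypothetical records: [scaled fare, scaled booking lead time, cabin, channel]
site_a = [[0.9, 0.1, "business", "agent"],
          [0.8, 0.2, "business", "agent"],
          [0.2, 0.9, "economy", "web"]]
site_b = [[0.1, 0.8, "economy", "web"],
          [0.85, 0.15, "business", "web"],
          [0.15, 0.85, "economy", "web"]]

prototypes = distributed_k_prototypes(
    [site_a, site_b],
    prototypes=[list(site_a[0]), list(site_a[2])],
    num_idx=[0, 1], cat_idx=[2, 3])
print(prototypes)
```

In this sketch only counts, sums, and category frequencies cross site boundaries, which is one way to read the abstract's claim that per-airline analysis stays at each airline's own site. Numeric attributes are assumed to be scaled to comparable ranges beforehand, since `gamma` balances them against categorical mismatches.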
[Degree-granting institution]: Jiangsu University of Science and Technology
[Degree level]: Master's
[Year conferred]: 2015
[Classification number]: TP311.13
Article ID: 1967816
Article link: http://sikaile.net/guanlilunwen/yingxiaoguanlilunwen/1967816.html