
An Interaction Study of Beijing Second-Hand Housing Based on Machine Learning and GAM Methods

Published: 2019-03-25 13:30
[Abstract]: In recent years China's economy has grown rapidly and living standards have kept rising, which has stimulated investment demand. Real estate has become a major investment target, pushing housing prices up. Since the 2008 economic crisis in particular, Beijing housing prices have soared, with some homes reaching more than 100,000 yuan per square meter, and housing pressure in the city is severe. As of May 2016, second-hand homes accounted for as much as 80% of market transactions in Beijing, and second-hand prices had multiplied several times over within just a few years.

To find a model well suited to studying price differences among Beijing second-hand homes, and to observe how the influencing factors produce those differences, this thesis uses data on 16,210 second-hand homes in six urban districts of Beijing from May 2016. It first applies K-means clustering to analyze housing types. It then considers six methods: ordinary least squares linear regression (OLS), log OLS, K-nearest neighbor (KNN) regression, log KNN regression, the nonlinear generalized additive model (GAM), and log GAM, where "log" means the house price is log-transformed. Each method is studied both with and without interaction terms among the collected predictors, giving twelve models in total, and a stability method is used to select the best one. Finally, OLS, log OLS, GAM, and log GAM are used for modeling and analysis.

The results show that the collected homes fall into four types: prime-location, suburban, mass-market, and large-unit. In terms of generalization ability, log KNN regression is the best model without interaction terms, log GAM is the best with interaction terms, and log GAM is the best of all twelve models. In terms of interpretation, GAM reveals complex nonlinear relationships between the continuous predictors and house price, whether or not interaction terms are included and whether or not the price is log-transformed. In terms of goodness of fit, log GAM with interaction terms fits best. Interaction models predict better than non-interaction models, and several predictors interact with one another. Studying these interactions yields useful information. For example, the linear model with interaction terms shows that subway access has a larger effect on price in Haidian District than in Xicheng District, which indicates that subway access raises second-hand prices faster in Haidian than in Xicheng.

The conclusion is that nonparametric interaction models are better suited to studying second-hand housing: continuous variables affect price nonlinearly, and several variables interact. This thesis studies cross-sectional price differences at a single point in time. The purpose of building a better model is to give home buyers an objective reference when making decisions. A price derived from a large sample of Beijing second-hand homes is a more objective and reliable reference than one obtained by simply comparing two or three listings, so decisions based on it will also be more rational.
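The Haidian versus Xicheng comparison in the abstract can be illustrated with a small sketch of a log OLS model with one interaction term. Everything below is an assumption made for illustration: the data are synthetic, the variable names and the generating coefficients (0.10, 0.05, 0.08) are hypothetical, and only two districts and one binary subway indicator are used. It is not the thesis's data, predictor set, or fitted model.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def make_synthetic_listings(n=2000, seed=0):
    """Hypothetical listings: (haidian, subway, log_price). Not real data."""
    rng = random.Random(seed)
    listings = []
    for _ in range(n):
        haidian = rng.randint(0, 1)  # 1 = Haidian, 0 = Xicheng (baseline)
        subway = rng.randint(0, 1)   # 1 = near a subway station
        log_price = (11.0 + 0.10 * haidian + 0.05 * subway
                     + 0.08 * haidian * subway + rng.gauss(0, 0.05))
        listings.append((haidian, subway, log_price))
    return listings

listings = make_synthetic_listings()
# Design matrix columns: intercept, district, subway, district x subway.
design = [[1.0, h, s, h * s] for h, s, _ in listings]
target = [lp for _, _, lp in listings]
b0, b_haidian, b_subway, b_inter = fit_ols(design, target)

# With an interaction term, the subway effect depends on the district.
subway_effect_xicheng = b_subway
subway_effect_haidian = b_subway + b_inter
print("subway effect on log price, Xicheng:", round(subway_effect_xicheng, 3))
print("subway effect on log price, Haidian:", round(subway_effect_haidian, 3))
```

In this sketch the interaction coefficient is the difference between the two districts' subway effects, so a positive estimate corresponds to the abstract's statement that subway access matters more in Haidian. Because the response is the log price, each effect is approximately a proportional change in price rather than an absolute one.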
[Degree-granting institution]: Taiyuan University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: F299.23


Article ID: 2447018



Article link: http://sikaile.net/jingjifazhanlunwen/2447018.html

