Research on Weibo-Based Location Inference Techniques
[Abstract]: Weibo has become a platform where anyone can publish and share information quickly, anytime and anywhere. To support location-based services, inferring a user's location from scattered and heterogeneous information has become a difficult and active research problem in the Weibo era. Building on existing location inference techniques from China and abroad, and assuming that some location knowledge is already available, this thesis studies Weibo-based location inference, with the aim of improving accuracy at different geographic granularities and addressing the sparsity of location information.

First, to infer locations at the city and street levels, a Weibo location inference method based on language models is proposed. It exploits the characteristics of city-level and street-level geographic information in Weibo posts and uses an improved local-word extraction algorithm to build the language-model-based estimator. The experimental results show that the method achieves city-level location inference with F-measures of 0.32 and 0.34 under the unigram and bigram language models, respectively, and reaches recall of 24.9% and 16.36% at city and street granularity, respectively. The results also show that the accuracy and recall of existing Weibo location inference techniques still need to be improved, and in particular that the sparsity of location information must be addressed.

Second, to address the low accuracy of location estimation when Weibo location information is sparse, a user location estimation method based on Weibo content is proposed. Geography-related local words are first extracted from users' Weibo content, and the weight of each local word in each region is computed. A user's location is then inferred from how well the user's word-segmented Weibo content matches the local words. The experimental results show that this content-based method reaches an accuracy of 68.49% at the provincial level and 66.52% at the city level, outperforming existing location estimation methods based on the baseline algorithm, a gazetteer, and TEDAS.

Finally, to further improve accuracy, a user location estimation method based on Weibo content and mutual friends (users who follow each other) is proposed. It combines content-based location estimation with estimation from the locations of mutual friends. The experimental results show that this combined method is more accurate than the methods based on Weibo content alone, mutual friends alone, the baseline algorithm, a gazetteer, and TEDAS. When Weibo location information is sparse, its accuracy is 81.39% at the provincial level and 78.85% at the city level.
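The content-based approach summarized in the abstract (extract local words, weight them per region, then match a user's segmented posts against them) can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so everything concrete here is an assumption made for illustration: the weight is taken to be the share of a word's occurrences that fall in a region, a word counts as "local" when its largest share reaches `min_share`, and the function names, thresholds, and toy data are hypothetical. Posts are assumed to be word-segmented already.

```python
from collections import Counter, defaultdict


def local_word_weights(labeled_posts, min_count=2, min_share=0.6):
    """Build {word: {region: weight}} from (region, tokens) pairs.

    Assumption (not from the thesis): weight = share of the word's
    occurrences observed in that region; a word is kept as "local"
    only if it is frequent enough and concentrated in one region.
    """
    counts = defaultdict(Counter)  # word -> region -> occurrences
    for region, tokens in labeled_posts:
        for tok in tokens:
            counts[tok][region] += 1

    weights = {}
    for word, by_region in counts.items():
        total = sum(by_region.values())
        if total < min_count:
            continue  # too rare to trust
        shares = {r: c / total for r, c in by_region.items()}
        if max(shares.values()) >= min_share:
            weights[word] = shares
    return weights


def infer_region(tokens, weights):
    """Sum local-word weights per region; return the best region or None."""
    scores = Counter()
    for tok in tokens:
        for region, w in weights.get(tok, {}).items():
            scores[region] += w
    if not scores:
        return None  # no local word matched
    return scores.most_common(1)[0][0]


# Toy, hand-made training data (hypothetical).
training = [
    ("Hangzhou", ["西湖", "龙井", "今天", "天气"]),
    ("Hangzhou", ["西湖", "断桥", "今天"]),
    ("Beijing", ["故宫", "胡同", "今天", "天气"]),
    ("Beijing", ["故宫", "烤鸭", "今天"]),
]
weights = local_word_weights(training)
print(infer_region(["今天", "去", "西湖"], weights))
```

In this toy data, words spread evenly across regions (such as "今天") are filtered out, so a post containing only such words yields no estimate. That mirrors the sparsity problem the abstract describes and is where a second signal such as mutual friends would help.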
[Degree-granting institution]: Hangzhou Dianzi University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.092; TP391.1