Research on User Credit Profiling Methods Based on Social Big Data
[Abstract]: In recent years, the mobile Internet and social media have risen rapidly, gradually replacing traditional blogs and BBS forums as the main platform on which people socialize, learn, and entertain themselves. At the same time, as the Internet, and the mobile Internet in particular, has won acceptance across society, the number of connected users and the volume of user-generated content (UGC) have grown explosively. Compared with earlier Internet media (such as email, forums, and blogs), social media records data that is richer and more timely. Microblog-style social media in particular has become a public platform for releasing information, for interaction between users, and for the discovery and diffusion of events. To make full use of this data source, the academic community has carried out wide-ranging research on social network theory, user behavior patterns, the development of public events, and rumor detection methods. In general, extracting the valuable information contained in social media big data requires new data processing and analysis methods. However, social media data is short, low in quality, fast-changing, and weakly correlated, which poses new challenges that traditional data mining methods cannot handle. To address the "sequential", "behavioral", and "multi-source" challenges of social media data and better achieve user credit profiling based on social data, this paper carries out research on three aspects: an efficient sequence mining algorithm for microblog-style data, user credit profiling based on a latent behavior model of microblog users, and user credit profiling that fuses multi-source information through feature design and ensemble learning. In addition, drawing on the research on user credit profiling algorithms for microblog data, this paper summarizes user profiling algorithms on social big data and discusses future prospects.
Specifically, the main research content, innovations, and academic contributions of this paper fall into the following three aspects:
1) Microblog data is presented to users as a timeline, which is in essence event-type sequence data. Event sequence mining has moved beyond considering only the frequency of an item to also considering its utility, which leads to high utility episode mining. This paper proposes multiple optimization strategies for existing high utility episode mining algorithms, which substantially improve running speed and memory efficiency. More importantly, the prefix-tree mining framework introduced in this paper gives a tighter estimate of the pruning threshold, which makes high utility episode mining on event sequences fast and practical (Chapter 3).
2) Every post in microblog data contains text content and contextual information related to user behavior. Text and behavior can both provide data support for a user credit model, but the straightforward approach is simply to extract features and then use them. To profile user credit on the basis of behavior patterns instead, this paper uses probabilistic graphical modeling to combine the observable user text with a variety of behavior features and obtain each user's latent behavior patterns, which serve as input for credit prediction. The resulting probabilistic topic model, LUBD-CM, assumes that each microblog post is generated from a single topic and that both the behavior data and the text data of the post are constrained by the assigned topic. The experimental results show that, in predicting user credit labels, LUBD-CM improves on a simplified variant of LUBD-CM, on traditional LDA, and on the naive Bayes algorithm (Chapter 4).
3) Beyond user-generated content, user data on social platforms also includes personal information and social network relationships. Different sources of user social data contain different types of credit-related information. However, the "immediacy" of microblog social data means that data quality is generally very low, so the data is difficult to use directly as input to standard classifiers such as SVMs and decision trees and still obtain high prediction performance for user labels. To fuse the information useful for credit profiling in multi-source heterogeneous social data, this paper starts from domain knowledge about personal credit, analyzes a wide variety of possible feature designs to select better social features, and uses a two-layer ensemble learning framework to fully mine the useful information hidden in the various social features. The result is a user credit profile prediction system that brings together stacking, boosting, and ensemble integration methods (Chapter 5).
It is worth noting that the sequence mining and user profiling methods this paper develops for microblog social data are largely applicable to other types of user-generated social data as well (such as Facebook data and WeChat data). Although this paper focuses on predicting and profiling users' credit attributes, the new methods are also applicable to other types of personal labels such as age, sex, or marital status.
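The single-topic assumption described for LUBD-CM in contribution 2) can be illustrated as a generative process. The following is a minimal Python sketch under stated assumptions, not the model from the thesis: the topic names, vocabulary, behavior types, and every probability are hypothetical, the abstract does not say how the topic distribution is parameterized, and inference (recovering latent topics from observed posts) is not shown.

```python
import random

# Hypothetical toy parameters: two latent topics, each with its own
# distribution over words and over behaviors. None of these values
# come from the thesis.
TOPIC_PRIOR = {"shopping": 0.5, "finance": 0.5}
WORD_DIST = {
    "shopping": {"sale": 0.6, "coupon": 0.3, "loan": 0.1},
    "finance": {"loan": 0.5, "repay": 0.4, "sale": 0.1},
}
BEHAVIOR_DIST = {
    "shopping": {"repost": 0.7, "original_post": 0.3},
    "finance": {"repost": 0.2, "original_post": 0.8},
}


def draw(dist, rng):
    """Draw one key from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def generate_post(n_words, rng):
    """Generate one post: a single topic, then words and a behavior tied to it."""
    topic = draw(TOPIC_PRIOR, rng)  # one topic is assigned to the whole post
    words = [draw(WORD_DIST[topic], rng) for _ in range(n_words)]
    behavior = draw(BEHAVIOR_DIST[topic], rng)  # behavior depends on the same topic
    return {"topic": topic, "words": words, "behavior": behavior}


rng = random.Random(0)  # fixed seed so repeated runs give the same posts
posts = [generate_post(4, rng) for _ in range(3)]
for post in posts:
    print(post)
```

Because the words and the behavior of a post are drawn from distributions indexed by the same topic, observing either one carries information about the other. That coupling is what the abstract describes as both data sources being constrained by the assigned topic.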
[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Doctoral
[Year degree granted]: 2017
[Classification number]: TP311.13
Guo Guangming (郭光明); Research on User Credit Profiling Methods Based on Social Big Data [D]; University of Science and Technology of China; 2017
Article number: 2171683
Article link: http://sikaile.net/shoufeilunwen/xxkjbs/2171683.html