Consumption Intent Mining Based on Weibo
Published: 2018-03-05 17:11
Topic: consumption intent mining  Entry point: SVM algorithm  Source: Harbin Institute of Technology, 2014 master's thesis  Type: degree thesis
[Abstract]: As a new form of social media, Weibo has accumulated a large user base and considerable influence. Because posting is simple and posts spread quickly, Weibo users publish a large volume of content-rich information. A considerable share of these posts express a user's desire to buy some product, that is, a consumption intent. Text carrying consumption intent is highly valuable for both scientific research and commercial applications, and it also matters for prediction tasks on social media. This thesis studies Weibo-based consumption intent mining from three aspects:
(1) Acquisition and classification of a consumption intent corpus. The thesis first discusses how to obtain an initial consumption intent corpus, collects one from Yitao Qiugou, JD.com (Jingdong), and Weibo, and preprocesses it. Consumption intent recognition is treated as a binary classification problem, and features along multiple dimensions are extracted from the collected corpus. Finally, classification models based on SVM, Naive Bayes, and deep learning are proposed; of these, the deep-learning-based method achieves the highest F-measure.
(2) From consumption intent to consumption behavior. In the preceding experiments, positive examples of consumption intent were obtained by manual annotation. Although an annotation standard was defined, multiple annotators still produced inconsistent labels. Moreover, a user who expresses a consumption intent does not necessarily go on to make the purchase. The thesis proposes a method for distributing large-scale questionnaires through social media, which automatically collects a large amount of data on users' consumption behavior. These data are used to evaluate the earlier consumption intent classification models and to build a consumption behavior classifier.
(3) Application of consumption intent to a prediction task. The thesis examines consumption intent for one specific product category, movies, and applies it to box-office prediction. Experiments show that combining consumption intent features with the features conventionally used for box-office prediction yields an R value higher than that of all previous work. A box-office prediction system was also built; it automatically collects and analyzes data from multiple sources and outputs a box-office forecast for each movie before its release.
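The binary-classification setup and the F-measure mentioned in the abstract can be illustrated with a minimal sketch. It uses Naive Bayes, one of the three model families the thesis names, because it fits in a few lines of standard-library Python. Everything concrete here is an assumption made for illustration: the function names (`train_naive_bayes`, `predict`, `f_measure`), the bag-of-words features, and the invented English toy posts. The thesis itself worked on Chinese Weibo text with features along multiple dimensions, which this sketch does not reproduce.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for w in tokens:
            if w in vocab:  # words never seen in training are ignored
                count = word_counts[label][w] + 1
                score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f_measure(gold, pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy posts, already tokenized; 1 = expresses consumption intent.
train_docs = [
    ["want", "buy", "new", "phone"],
    ["planning", "buy", "laptop"],
    ["want", "watch", "movie", "weekend"],
    ["weather", "nice", "today"],
    ["had", "lunch", "friends"],
    ["traffic", "bad", "today"],
]
train_labels = [1, 1, 1, 0, 0, 0]
model = train_naive_bayes(train_docs, train_labels)
```

Calling `predict(model, tokens)` on a tokenized post returns 1 or 0, and `f_measure(gold, pred)` scores a list of predictions against gold labels. A real system would replace the toy tokens with segmented Weibo text and richer features.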
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.092; TP311.13
Article ID: 1571153
Article link: http://sikaile.net/guanlilunwen/ydhl/1571153.html