Intent Recognition Based on Microblogs

Published: 2019-06-01 16:35

[Abstract]: Microblogging is an emerging social platform on which hundreds of millions of users post massive amounts of data every day. Some of these posts carry an intent, expressed either explicitly or implicitly. Identifying these intents accurately has great commercial value. This thesis studies intent recognition in microblogs from the following three aspects.

(1) Recognition of explicit intent in microblogs. Posts with explicit intent usually contain intent trigger words such as "want" or "hope". We propose a new Wikipedia-based explicit intent recognition model. For each intent, we first select a set of concepts that best represent it, the seed concepts. We then look up the seed concepts in Wikipedia and, by following the links between concepts, obtain a set of related concepts; these expanded concepts carry the same intent to some degree. Next, we use the resulting concept set to build a Wikipedia link graph for the intent and run a random walk algorithm on the graph to assign an intent score to each concept. Finally, we map a microblog into the corresponding intent space to obtain its intent score, and decide from that score whether the post has the intent. If some words in a post are not covered by Wikipedia, we use explicit semantic analysis (ESA) to map them to the most relevant Wikipedia concepts and then compute the intent score.

(2) Recognition of implicit intent in microblogs. Posts with implicit intent usually contain no trigger words, but their intent can be inferred. Most existing work targets explicit intent; this thesis also addresses implicit intent. We use an encoder-decoder model to "translate" a post with implicit intent into a corresponding explicit expression, and then apply explicit intent recognition. Encoder-decoder models are mainly used for sequence-to-sequence (seq2seq) problems such as machine translation, speech recognition, and image captioning; converting implicit intent into explicit intent is also a seq2seq problem, so the model applies. The traditional RNN-based encoder-decoder encodes the input sentence into a fixed-length semantic vector and then decodes that vector into the output sentence. Bahdanau et al. later proposed the attention model, which improves on the RNN-based encoder-decoder by encoding the input sentence into a semantic representation whose length is not fixed, so that translation remains good even for long sentences. We compare the two models experimentally, and the results show that the attention model outperforms the plain RNN-based encoder-decoder. To train the models, we built a corpus that pairs microblogs with implicit intent with corresponding explicit-intent microblogs. Once the attention model has produced an explicit expression of the intent, the Wikipedia-based model proposed in this thesis is used to recognize it.

(3) Recognition of intent in microblogs in general. We propose an intent recognition model based on word vectors and a convolutional neural network (CNN). The model is general: it can recognize both explicit and implicit intent. This generality comes from two sources: word-vector representations carry rich semantic features, and the CNN can extract the semantic features of a sentence. Therefore, when intent recognition is treated as a multi-class classification problem, that is, deciding whether a microblog has a given intent, the model can classify posts regardless of whether the intent is expressed explicitly or implicitly, because the word vectors and the CNN extract the semantic features and then recognize the intent correctly.
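The concept-scoring step in part (1) of the abstract can be illustrated with a small sketch. The abstract does not say which random walk variant is used, so this assumes a random walk with restart at the seed concepts; the link graph, the restart probability, and the rule that a post's score is the sum of its concepts' scores are all hypothetical choices made for illustration, not the thesis's method.

```python
def random_walk_scores(links, seeds, restart=0.15, iterations=50):
    """Score concepts by a random walk with restart at the seed concepts.

    links maps each concept to the concepts it links to (hypothetical data).
    """
    seeds = list(dict.fromkeys(seeds))  # drop duplicates, keep order
    nodes = set(links) | set(seeds)
    for targets in links.values():
        nodes.update(targets)

    # The walker starts, and restarts, uniformly over the seed concepts.
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(seed_mass)

    for _ in range(iterations):
        nxt = {n: restart * seed_mass[n] for n in nodes}
        for node in nodes:
            targets = links.get(node, [])
            moving = (1.0 - restart) * scores[node]
            if targets:
                # Follow one outgoing link chosen uniformly at random.
                for target in targets:
                    nxt[target] += moving / len(targets)
            else:
                # Concept with no outgoing links: jump back to the seeds.
                for seed in seeds:
                    nxt[seed] += moving / len(seeds)
        scores = nxt
    return scores


def intent_score(concepts_in_post, scores):
    """Assumed scoring rule: sum the scores of the concepts found in a post."""
    return sum(scores.get(c, 0.0) for c in concepts_in_post)


# Hypothetical link graph for a "food" intent with one seed concept.
links = {
    "Food": ["Restaurant", "Recipe"],
    "Restaurant": ["Food", "Menu"],
    "Recipe": ["Food"],
    "Menu": ["Restaurant"],
    "Football": ["Stadium"],
    "Stadium": ["Football"],
}
scores = random_walk_scores(links, seeds=["Food"])
print(sorted(scores.items(), key=lambda kv: -kv[1]))
print(intent_score(["Restaurant", "Menu"], scores))
```

Concepts that cannot be reached from the seeds keep a score of zero, so a post that mentions only such concepts gets an intent score of zero under this rule. A decision threshold on the post score would still have to be chosen separately.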
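Part (2) of the abstract contrasts a single fixed-length encoding with attention over all encoder states. The sketch below shows one decoding step of additive attention in the style of Bahdanau et al., under stated assumptions: the weight matrices `w_dec`, `w_enc`, and the vector `v` would normally be learned, and the tiny values used here are hypothetical placeholders rather than anything from the thesis.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(x - peak) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(decoder_state, encoder_states, w_dec, w_enc, v):
    """One attention step: score each encoder state against the decoder state,
    normalise the scores, and return the weights and the context vector."""
    projected_dec = matvec(w_dec, decoder_state)
    scores = []
    for state in encoder_states:
        projected_enc = matvec(w_enc, state)
        hidden = [math.tanh(a + b) for a, b in zip(projected_dec, projected_enc)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: weighted sum of the encoder states.
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical 2-dimensional states for a 3-token input sentence.
identity = [[1.0, 0.0], [0.0, 1.0]]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, context = additive_attention(
    [1.0, 0.0], encoder_states, identity, identity, [1.0, 1.0]
)
print(weights, context)
```

The decoder recomputes the weights at every output step, so each output word sees a different context vector built from all encoder states. The number of encoder states grows with the input length, which is the sense in which the encoding is not fixed-length.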
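The classifier in part (3) of the abstract combines word vectors with a convolutional network. As a rough illustration only, the following forward pass uses one convolution filter, max-over-time pooling, and a softmax output. The abstract gives no architecture details, so the layer layout, the embeddings, the filter, and the output weights are hypothetical, hand-set values; a real model would learn them from labelled microblogs.

```python
import math


def conv_max_pool(embedded, filt):
    """Slide one filter over consecutive word windows, apply ReLU,
    and keep the largest activation (max-over-time pooling)."""
    width, dim = len(filt), len(filt[0])
    # Pad short inputs with zero vectors so at least one window exists.
    padded = embedded + [[0.0] * dim] * max(0, width - len(embedded))
    activations = []
    for start in range(len(padded) - width + 1):
        window = padded[start:start + width]
        total = sum(
            w * x
            for filt_row, vec in zip(filt, window)
            for w, x in zip(filt_row, vec)
        )
        activations.append(max(0.0, total))
    return max(activations)


def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(x - peak) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def classify(tokens, embeddings, filters, out_weights):
    """Embed the tokens, extract one feature per filter, and return
    class probabilities from a linear layer followed by softmax."""
    dim = len(next(iter(embeddings.values())))
    embedded = [embeddings.get(t, [0.0] * dim) for t in tokens]
    features = [conv_max_pool(embedded, f) for f in filters]
    logits = [sum(w * f for w, f in zip(row, features)) for row in out_weights]
    return softmax(logits)


# Hypothetical 2-dimensional word vectors and a single width-2 filter.
embeddings = {"want": [1.0, 0.0], "pizza": [0.0, 1.0]}
filters = [[[1.0, 0.0], [0.0, 1.0]]]   # responds to "want" followed by "pizza"
out_weights = [[2.0], [-2.0]]          # class 0: food intent, class 1: no intent
print(classify(["i", "want", "pizza"], embeddings, filters, out_weights))
```

Because the classifier works on learned features rather than on trigger words, the same pipeline can in principle handle explicit and implicit phrasings, which is the generality the abstract claims.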
[Degree-granting institution]: Xihua University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.1; TP393.092

Li Chenxing; Intent Recognition Based on Microblogs [D]; Xihua University; 2017


Article ID: 2490420


Article link: http://sikaile.net/guanlilunwen/ydhl/2490420.html
