Research on Models and Methods for Opinion Target Identification

Published: 2018-08-19 15:48
[Abstract]: With the development of Internet technology, e-commerce has become an increasingly indispensable part of daily life, and the volume of user opinions and reviews has grown rapidly along with it. These reviews contain users' evaluations of the functions, attributes, and items in a given domain. Using this information effectively helps improve product quality and reveals what consumers actually need, which has driven the emergence and development of opinion target identification (literally, "evaluation object" identification). The opinion target in a review is the entity toward which the opinion holder expresses sentiment, and it usually consists of one or more words. Opinion target identification is the task of accurately extracting the true targets from a given product review. In terms of method, approaches can be divided into supervised, unsupervised, and semi-supervised learning; in terms of application, the task can be divided into single-domain and cross-domain problems. This thesis studies models and methods for single-domain opinion target identification and analyzes the strengths and weaknesses of each by comparing their experimental results. Its main research content can be summarized in three points. First, an opinion target identification method based on unsupervised learning. The thesis uses association rule mining, a data mining technique, to extract the most frequent noun phrases in the corpus as candidates, then filters them further by the semantic relatedness of words to obtain the candidate set of opinion targets in each sentence.
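As a rough illustration of the frequency-based candidate step, the sketch below computes, for each noun, the fraction of review sentences that contain it and keeps the nouns that reach a minimum support. It assumes the sentences are already POS-tagged with Penn Treebank-style tags. The function name `frequent_noun_candidates`, the `min_support` threshold, and the toy reviews are hypothetical, and the sketch reduces association rule mining to single-item support counting rather than reproducing the thesis's actual pipeline.

```python
from collections import Counter

def frequent_noun_candidates(tagged_sentences, min_support=0.5):
    """Return nouns whose sentence-level support is at least min_support."""
    if not tagged_sentences:
        return set()
    support = Counter()
    for sentence in tagged_sentences:
        # Count each noun once per sentence (support, not raw frequency).
        nouns = {word.lower() for word, tag in sentence if tag.startswith("NN")}
        support.update(nouns)
    n = len(tagged_sentences)
    return {noun for noun, count in support.items() if count / n >= min_support}

# Toy, hand-tagged reviews (hypothetical data for illustration only).
reviews = [
    [("The", "DT"), ("screen", "NN"), ("is", "VBZ"), ("bright", "JJ")],
    [("Great", "JJ"), ("screen", "NN"), ("and", "CC"), ("battery", "NN")],
    [("The", "DT"), ("battery", "NN"), ("died", "VBD")],
    [("Shipping", "NN"), ("was", "VBD"), ("slow", "JJ")],
]
candidates = frequent_noun_candidates(reviews, min_support=0.5)
```

Here "screen" and "battery" each appear in two of the four sentences and are kept, while "shipping" appears in only one and is dropped. A semantic-relatedness filter, as the abstract describes, would then prune this candidate set further.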
On this basis, the thesis adopts an opinion target (evaluation object) identification method based on syntactic parse trees and the double propagation algorithm, which are used to identify targets made up of noun phrases and targets that occur with low frequency, respectively. Second, an opinion target identification method based on sequence models. Because a review is a context-dependent sequence of words, a sequence model can make effective use of context and improve identification accuracy. The thesis extracts word-level features, syntactic features, and features from external corpora as model inputs, and uses a conditional random field to learn the relationships among them. Experiments show that the combination of features has a large effect on the results, and that with suitable features the sequence model achieves excellent results. Third, opinion target identification based on recurrent neural networks. A recurrent neural network is an end-to-end model that removes the need for tedious preprocessing and feature extraction. The thesis compares several common recurrent neural network models on the task and analyzes their strengths and weaknesses. To address the problem that recurrent neural networks cannot effectively capture the dependencies between output labels, the thesis also proposes a new type of recurrent neural network, the output-aware recurrent neural network. Experiments show that it not only outperforms the other recurrent neural networks but also converges faster.
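Both the conditional random field and the recurrent networks treat the task as labeling each word in a review. One common encoding for multi-word targets in such models is B/I/O tagging (begin, inside, outside); the abstract does not say which scheme the thesis uses, so this is an assumption. The sketch below shows how per-token tags could be decoded back into target phrases. The function `decode_bio_targets` and the example sentence are hypothetical.

```python
def decode_bio_targets(tokens, tags):
    """Collect target phrases from per-token B/I/O tags."""
    targets, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new target starts; flush any target still open.
            if current:
                targets.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open span, closes any open target.
            if current:
                targets.append(" ".join(current))
            current = []
    if current:
        targets.append(" ".join(current))
    return targets

tokens = ["The", "battery", "life", "is", "great", "but", "the", "screen", "scratches"]
tags = ["O", "B", "I", "O", "O", "O", "O", "B", "O"]
targets = decode_bio_targets(tokens, tags)
```

With this encoding, a dependency between output labels means, for example, that an "I" tag is only valid after a "B" or another "I". That is the kind of constraint a conditional random field models directly and that the abstract says plain recurrent networks fail to capture.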
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.1

Article ID: 2192122


Article link: http://sikaile.net/jingjilunwen/dianzishangwulunwen/2192122.html

