An Image Captioning Model Fusing Attention and Dynamic Semantic Guidance

Published: 2018-08-18 18:06

[Abstract]: Current image captioning models describe the details of objects within an image inadequately. To address this, an image captioning model is proposed that combines dynamic semantic guidance from the image with an adaptive attention mechanism. The model predicts the word at the next time step from the information at the previous time step, and uses the adaptive attention mechanism to select the image region that the model should process at the next time step. In addition, the model builds dense attribute information for the image as extra supervision, so that it can combine image semantic information with attention information when describing image content. The model was trained and tested on the Flickr8K and Flickr30K image sets and validated with several evaluation methods. The experimental results show a considerable performance gain. Compared with the Guiding-Long Short-Term Memory model, the scores rise by 4.1, 1.8, 2.4, 0.8, and 3.1, which are relative improvements of 6.3%, 4.0%, 7.9%, 3.9%, and 17.3%. Compared with Soft-Attention, the scores rise by 1.9, 2.4, 3.3, 1.5, and 2.74, which are relative improvements of 2.8%, 5.5%, 11.1%, 7.5%, and 14.8%.
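The abstract does not give the equations behind the adaptive attention mechanism, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes dot-product scoring between the decoder state and each region feature, and it assumes a "sentinel" vector that lets the model fall back on non-visual information. The names `adaptive_attention`, `regions`, `sentinel`, and `hidden` are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def adaptive_attention(regions, sentinel, hidden):
    """One attention step over image regions plus a sentinel (assumed form).

    regions:  list of region feature vectors from the image encoder
    sentinel: fallback vector carrying non-visual information (assumption)
    hidden:   decoder state from the previous time step
    Returns the attention weights (last one belongs to the sentinel)
    and the weighted context vector.
    """
    candidates = regions + [sentinel]
    # Score every candidate against the decoder state (dot product is assumed).
    scores = [dot(c, hidden) for c in candidates]
    weights = softmax(scores)
    # The context is the weighted sum of all candidates, component by component.
    context = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(hidden))
    ]
    return weights, context


if __name__ == "__main__":
    # Toy inputs: two image regions, a zero sentinel, one decoder state.
    regions = [[1.0, 0.0], [0.0, 1.0]]
    sentinel = [0.0, 0.0]
    hidden = [2.0, 0.0]
    weights, context = adaptive_attention(regions, sentinel, hidden)
    print("weights:", weights)
    print("context:", context)
```

In this toy run the first region aligns best with the decoder state, so it receives the largest weight. A larger sentinel weight would mean the model relies less on the image when predicting the next word.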
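[Author affiliation]: Engineering Research Center of Internet of Things Technology Applications (Ministry of Education), Jiangnan University
[Funding]: Fundamental Research Funds for the Central Universities, No. JUSRP51510
[Classification]: TP183; TP391.41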


Article ID: 2190264


Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2190264.html
