Research on Sign Language Recognition Methods Based on Deep Learning

Published: 2018-03-03 17:35

Topic: sign language recognition  Focus: deep learning  Source: Jilin University, 2017 master's thesis  Type: degree thesis

[Abstract]: Sign language is an important communication tool for deaf and mute people and is widely used among them; research on its complex and varied gestures can also advance gesture-based human-computer interaction. However, precisely because sign language is so varied and is used in complex environments, sign language recognition has long been a difficult research problem. Traditional methods often require the signer to wear expensive data gloves that capture sign language information, or colored gloves, to make feature extraction from the gestures easier. Such methods can reach high accuracy under restricted conditions, but they generalize poorly: switching to a different sign language dataset usually means extracting features by hand all over again. This thesis introduces a family of deep learning methods into sign language recognition. For static sign language recognition, it proposes two models based on deep convolutional neural networks, static sign language recognition model 1 (SLR-CNN1) and static sign language recognition model 2 (SLR-CNN2). SLR-CNN1 is used to verify that deep convolutional neural networks are feasible for sign language recognition. SLR-CNN2 further improves static recognition accuracy; it introduces global average pooling into the recognition model, which greatly reduces the number of parameters and helps prevent overfitting. Extensive experiments show that deep convolutional neural networks can automatically learn useful sign language features, including subtle variations between signs, and can therefore recognize sign language effectively.
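The parameter saving attributed to global average pooling can be illustrated with a small calculation. The sketch below is not the SLR-CNN2 architecture, which the abstract does not describe; the layer sizes (256 channels of 6 x 6, 24 sign classes) and the function names are assumptions chosen only for illustration. It also assumes one small dense layer after the pooling step, which is one common variant.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, H x W


def global_average_pool(feature_maps: List[FeatureMap]) -> List[float]:
    """Collapse each H x W channel to its mean, giving one value per channel."""
    pooled = []
    for channel in feature_maps:
        total = sum(sum(row) for row in channel)
        count = sum(len(row) for row in channel)
        pooled.append(total / count)
    return pooled


def dense_params(inputs: int, outputs: int) -> int:
    """Number of weights plus biases in a fully connected layer."""
    return inputs * outputs + outputs


# Hypothetical sizes (assumptions, not taken from the thesis).
channels, height, width, classes = 256, 6, 6, 24

# Classifier fed by the flattened feature maps: 256 * 6 * 6 inputs.
flatten_head = dense_params(channels * height * width, classes)

# Classifier fed by the pooled vector: 256 inputs.
gap_head = dense_params(channels, classes)

print(flatten_head, gap_head)  # 221208 6168
```

Under these assumed sizes the classifier shrinks from 221,208 to 6,168 parameters, which is the kind of reduction that leaves the network less room to overfit.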
The thesis also uses the Caffe deep learning framework to train two sign language recognition models suitable for practical deployment. For dynamic sign language recognition, it combines deep convolutional neural networks with long short-term memory (LSTM) recurrent neural networks and proposes dynamic sign language recognition model 1 (SLR-LSRCN1) and dynamic sign language recognition model 2 (SLRLSRCN2). The Caffe source code was modified so that the framework accepts consecutive video frames as model input. Extensive experiments show that combining convolutional and recurrent neural networks recognizes dynamic sign language effectively, and a dynamic recognition model suitable for practical deployment was trained on this basis. Finally, to validate deep learning algorithms on sign language recognition, the thesis combines existing databases with self-recorded data and labels a large sample library for static sign language recognition, which makes algorithm validation and experimentation more convenient. By bringing deep learning to the sign language recognition task, the thesis offers a new approach that is highly extensible and robust.
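The general idea of pairing a per-frame feature extractor with an LSTM over consecutive video frames can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SLR-LSRCN models or the modified Caffe pipeline: `toy_frame_features` is a hypothetical stand-in for the convolutional network, the weights are random rather than trained, and all names and sizes are invented for the example.

```python
import math
import random
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Frame = List[List[float]]            # one grayscale frame, H x W
Gate = Tuple[List[Vector], Vector]   # (weight rows, biases)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(gate: Gate, z: Sequence[float]) -> Vector:
    """Compute W z + b for one gate."""
    weights, bias = gate
    return [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weights, bias)]


def lstm_step(x: Vector, h: Vector, c: Vector,
              params: Dict[str, Gate]) -> Tuple[Vector, Vector]:
    """One LSTM time step on the concatenated vector [x, h]."""
    z = x + h
    i = [sigmoid(v) for v in affine(params["input"], z)]
    f = [sigmoid(v) for v in affine(params["forget"], z)]
    o = [sigmoid(v) for v in affine(params["output"], z)]
    g = [math.tanh(v) for v in affine(params["candidate"], z)]
    c_next = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_next = [ov * math.tanh(cv) for ov, cv in zip(o, c_next)]
    return h_next, c_next


def toy_frame_features(frame: Frame) -> Vector:
    """Hypothetical stand-in for the per-frame CNN: mean and max intensity."""
    pixels = [p for row in frame for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def make_params(input_size: int, hidden_size: int, rng: random.Random,
                scale: float = 0.1) -> Dict[str, Gate]:
    """Random placeholder weights; a real model would learn these."""
    width = input_size + hidden_size

    def gate() -> Gate:
        rows = [[rng.uniform(-scale, scale) for _ in range(width)]
                for _ in range(hidden_size)]
        return rows, [0.0] * hidden_size

    return {name: gate() for name in ("input", "forget", "output", "candidate")}


def encode_clip(frames: Sequence[Frame], features: Callable[[Frame], Vector],
                params: Dict[str, Gate], hidden_size: int) -> Vector:
    """Run the LSTM over the frames in order and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for frame in frames:
        h, c = lstm_step(features(frame), h, c, params)
    return h


# Three tiny 2 x 2 "frames" standing in for consecutive video frames.
clip = [
    [[0.0, 0.1], [0.2, 0.3]],
    [[0.4, 0.5], [0.6, 0.7]],
    [[0.8, 0.9], [1.0, 0.5]],
]
params = make_params(input_size=2, hidden_size=4, rng=random.Random(0))
clip_code = encode_clip(clip, toy_frame_features, params, hidden_size=4)
```

In a full recognizer, the final hidden state `clip_code` would feed a classifier over the sign vocabulary, and the feature extractor and LSTM weights would be trained jointly on labeled clips.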
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41


Article link: http://sikaile.net/shoufeilunwen/xixikjs/1562013.html
