

Activation-Function-Oriented Optimization of RNN Algorithms

Published: 2017-12-27 21:09

  Title: Activation-Function-Oriented Optimization of RNN Algorithms  Source: Zhejiang University, 2017 master's thesis  Document type: degree thesis


  Related topics: recurrent neural network; long short-term memory (LSTM) unit; activation function; non-saturation region; linear fitting


【Abstract】: Recurrent neural networks (RNNs) are an important branch of artificial neural networks. By introducing feedback connections in the hidden layer, they can process sequential data effectively. Their strong capacity to store and exploit contextual information has made them a research focus in speech recognition, natural language processing, computer vision, and related fields. On one hand, RNNs commonly use sigmoid-type activation functions, whose saturation regions limit the convergence speed of RNN training, so optimization of the activation function has become an active research topic. On the other hand, RNNs are implemented mostly in software, so hardware acceleration of the algorithm is of great practical significance. Addressing these problems, this thesis builds on prior work in three parts. First, a theoretical survey of recurrent neural networks: the distinctive gate structure of the long short-term memory (LSTM) unit solves the vanishing-gradient problem along the time dimension that afflicts conventional RNNs, making LSTM a key component of modern RNN architectures. The training procedure of LSTM-based RNNs is analyzed, covering both forward propagation and backpropagation; during backpropagation, the activation function and its derivative directly determine the convergence speed of training. Second, optimization of the RNN algorithm from the activation-function side: the saturation regions of conventional sigmoid-type activations slow convergence, and although the rectified linear unit proposed in earlier work avoids vanishing gradients in the saturation region, it introduces the exploding-gradient problem. Since sigmoid-type functions with different coefficients have non-saturation regions of different widths, the thesis further analyzes how the coefficient affects training convergence speed, and experiments confirm that widening the non-saturation region effectively accelerates convergence. Third, optimization of the RNN algorithm from the hardware-implementation side, where the activation function is the most difficult part to realize and therefore of particular research interest: for error optimization, a correction term for the fitted line is introduced, halving the approximation error at unchanged hardware cost; for the segmentation method, the number of segments allotted to different sub-intervals is adjusted, reducing the error further at unchanged hardware cost; and for extensibility, parameterized Sigmoid and Tanh functions are implemented on top of the Sigmoid hardware implementation.
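The gate structure the abstract credits with solving the time-dimension vanishing-gradient problem can be sketched as a single LSTM forward step. This is a minimal NumPy illustration of the standard formulation, not code from the thesis; the stacked parameter layout (`W`, `U`, `b` holding all four gates) is an assumption for compactness.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM forward step. W (4H x D), U (4H x H), b (4H,)
    stack the parameters of the input, forget, candidate and
    output gates, in that order."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b          # pre-activations for all gates
    i = sigmoid(z[0:H])                 # input gate
    f = sigmoid(z[H:2*H])               # forget gate
    g = np.tanh(z[2*H:3*H])             # candidate cell state
    o = sigmoid(z[3*H:4*H])             # output gate
    c = f * c_prev + i * g              # additive memory update: gradients
                                        # flow through c largely unattenuated
    h = o * np.tanh(c)                  # hidden state
    return h, c
```

The additive update of `c` (rather than a repeated squashing of the whole state) is what lets gradients survive across many time steps; the sigmoid gates `i`, `f`, `o` are exactly where the saturation issues discussed next arise.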
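The coefficient analysis mentioned above concerns the family σ(ax) = 1/(1 + e^(−ax)): its derivative is a·σ(ax)(1 − σ(ax)), peaking at a/4, and the region where the gradient stays usefully large widens as a shrinks. The sketch below (my own numerical illustration, not the thesis's analysis; the threshold ε = 0.01 defining "non-saturated" is an assumption) measures that width directly.

```python
import numpy as np

def scaled_sigmoid(x, a=1.0):
    """Sigmoid with coefficient a: 1 / (1 + exp(-a*x))."""
    return 1.0 / (1.0 + np.exp(-a * x))

def nonsat_width(a, eps=0.01):
    """Width of the interval where d/dx scaled_sigmoid(x, a) > eps.
    Outside this interval the unit is effectively saturated and
    backpropagated gradients vanish."""
    x = np.linspace(-50.0, 50.0, 200001)
    s = scaled_sigmoid(x, a)
    grad = a * s * (1.0 - s)            # analytic derivative of sigmoid(a*x)
    active = x[grad > eps]
    return float(active[-1] - active[0]) if active.size else 0.0
```

A smaller coefficient widens the non-saturation region (e.g. `nonsat_width(0.5) > nonsat_width(1.0) > nonsat_width(2.0)`) but also lowers the peak gradient a/4, which is the trade-off behind the convergence-speed comparison the abstract describes.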
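The hardware-side error optimization can be illustrated in software: approximate the sigmoid by chords between uniformly spaced breakpoints, then shift each chord by half its midpoint deviation. Since the sigmoid is concave on [0, ∞), every chord lies below the curve, so the shift splits the one-sided error roughly in half at no extra segment cost, which is the same idea as the thesis's fitted-line correction term. This is a floating-point sketch, not the thesis's fixed-point hardware design; the range [0, 8) and 16 segments are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pwl_sigmoid(x, n_seg=16, lo=0.0, hi=8.0, correct=True):
    """Piecewise-linear sigmoid on [lo, hi): chords between uniform
    breakpoints, with an optional half-error offset correction."""
    edges = np.linspace(lo, hi, n_seg + 1)
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_seg - 1)
    x0, x1 = edges[idx], edges[idx + 1]
    y0, y1 = sigmoid(x0), sigmoid(x1)
    y = y0 + (y1 - y0) * (x - x0) / (x1 - x0)   # chord interpolation
    if correct:
        # The chord lies below the concave sigmoid on [0, hi]; shifting
        # each segment up by half its midpoint deviation splits the
        # one-sided error, roughly halving the maximum error.
        mid = 0.5 * (x0 + x1)
        chord_mid = y0 + 0.5 * (y1 - y0)
        y = y + 0.5 * (sigmoid(mid) - chord_mid)
    return y
```

Negative inputs are typically served by the symmetry σ(−x) = 1 − σ(x), and the same table can be reused for tanh via the identity tanh(x) = 2σ(2x) − 1, which is the usual route to the parameterized Sigmoid/Tanh sharing the abstract mentions.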
【Degree-granting institution】: Zhejiang University
【Degree level】: Master's
【Year conferred】: 2017
【CLC classification】: TP183





