Research on Optimal Control Based on Approximate Dynamic Programming and Its Application in Power Systems

Published: 2018-07-13 13:32

【Abstract】: Optimal control based on approximate dynamic programming (ADP) has been one of the active research topics in the control field in recent years. ADP, which incorporates ideas from reinforcement learning, uses function approximation structures to approximate the cost function and the control policy in the dynamic programming equation so that the principle of optimality is satisfied, thereby obtaining the optimal cost function and the optimal control policy. ADP thus avoids the "curse of dimensionality" that arises when dynamic programming is used to solve optimal control problems, and it has attracted wide attention. However, ADP theory and its algorithms are not yet mature, and many theoretical and technical problems in using ADP for the optimal control of dynamic systems remain open. With support from the National Natural Science Foundation of China project "Dynamic global optimization and energy-saving control theory of smart grids and its applications (50977008)", among others, this thesis further studies several optimal control problems for dynamic systems based on ADP theory and proposes iterative ADP algorithms suited to different situations. It also applies ADP to power systems, extending the range of application of the method. The main work and contributions are as follows:

1. For the optimal tracking control problem of unknown continuous-time linear systems, a new ADP-based optimal tracking control scheme is proposed. First, the optimal tracking problem of the original system is transformed into an optimal regulation problem for an augmented system, and the optimal control solution of the augmented system is proved to be equivalent to the standard solution of the optimal tracking control problem of the original system. Then, a new online ADP algorithm is given that solves the augmented algebraic Riccati equation online, so that the optimal tracking controller of the unknown system is obtained online.

2. An ADP-based adaptive optimal control scheme is proposed that solves the optimal control problem for a class of discrete-time affine nonlinear systems. Two neural networks, called the critic network and the action network, are used as online parametric structures to approximate the cost function and the optimal control law, respectively. Taking the neural network approximation errors into account, Lyapunov theory is used to prove that the system state and the neural network weight estimation errors are uniformly ultimately bounded, and that the resulting control input stays within a small neighborhood of the optimal control input.

3. For the H∞ control problem of a class of discrete-time nonlinear systems with external disturbances, a new online adaptive policy learning scheme is proposed. Three neural networks are used as online parametric structures to design a critic network, an action network, and a disturbance network, and online update laws for the network weights are given. Taking the neural network approximation errors into account, Lyapunov theory is used to prove that the system state and all network weight estimation errors are uniformly ultimately bounded, and that the resulting control input stays within a small neighborhood of the optimal control input.

4. A new iterative two-stage DHP (dual heuristic programming) algorithm is proposed that solves the optimal control problem for a class of nonlinear switched systems with saturating actuators. A nonquadratic functional is used to handle the actuator saturation constraint, which guarantees that the control is a smooth function within the saturation bounds, and a novel iterative two-stage DHP algorithm is derived to solve the constrained HJB equation. A rigorous mathematical proof establishes the convergence of the proposed algorithm.

5. For the optimal tracking control problem of a class of discrete-time nonlinear switched systems, an iterative ADP algorithm is designed to obtain the optimal tracking hybrid control policy. First, the optimal tracking control problem is transformed into an optimal regulation problem for an error switched system. Second, a new iterative two-stage ADP algorithm is given to solve the HJB equation of the error system. Finally, a convergence analysis of the algorithm is given, which guarantees that the resulting tracking hybrid control policy is optimal.

6. An iterative two-stage ε-ADP algorithm is designed that solves the finite-time optimal control problem for a class of discrete-time nonlinear switched systems. First, an iterative two-stage ADP algorithm is given to solve the HJB equation, together with a rigorous convergence analysis. Next, an ε-optimal control policy is given, so that the iterative two-stage ADP algorithm obtains, in a finite number of steps, an approximate optimal cost function that lies within an ε error bound of the optimal value, thereby achieving finite-time optimal control of discrete-time nonlinear switched systems.

7. For the load frequency control problem of unknown power systems, an ADP-based online H∞ robust load frequency controller design is proposed. First, H∞ control is used to handle the system uncertainty. Then, two-player zero-sum differential game theory is used to solve the H∞ control problem. Using ADP techniques and Kronecker product theory, a data-based online ADP algorithm is given that learns the solution of the game algebraic Riccati equation from online measurements of the system state and control input, thereby solving the load frequency control problem for a completely unknown power system.
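As a rough illustration of the iterative ADP idea and the ε stopping rule mentioned in the abstract, the sketch below runs value iteration on a scalar linear-quadratic problem. It is not the thesis's algorithm: the thesis treats unknown, nonlinear, and switched systems with neural network approximators, whereas this example assumes a known scalar model x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2, represents the iterative cost function exactly as V_i(x) = p_i * x^2, and starts from V_0 = 0. The function name `value_iteration_lqr` and all parameters are hypothetical names chosen for this example.

```python
def value_iteration_lqr(a, b, q, r, eps=1e-9, max_iter=10_000):
    """Value iteration for the scalar system x[k+1] = a*x[k] + b*u[k]
    with stage cost q*x^2 + r*u^2 (illustrative assumption, known model).

    The iterative cost function is V_i(x) = p_i * x^2 with V_0 = 0.
    Returns (p, gain, history), where the control law is u = -gain * x.
    """
    p = 0.0
    history = [p]
    for _ in range(max_iter):
        # Greedy control gain with respect to the current cost function V_i.
        gain = a * b * p / (r + b * b * p)
        # Bellman backup: V_{i+1}(x) = q*x^2 + r*u^2 + V_i(a*x + b*u).
        p_next = q + r * gain ** 2 + (a - b * gain) ** 2 * p
        history.append(p_next)
        converged = abs(p_next - p) <= eps
        p = p_next
        if converged:
            # Epsilon-style stopping rule: successive iterates are within eps.
            break
    gain = a * b * p / (r + b * b * p)
    return p, gain, history


if __name__ == "__main__":
    p, gain, history = value_iteration_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
    print(f"p = {p:.6f}, gain = {gain:.6f}, iterations = {len(history) - 1}")
```

For a = b = q = r = 1 the fixed point satisfies p^2 - p - 1 = 0, so the iterates should approach (1 + sqrt(5)) / 2. The stopping rule here bounds the change between successive iterates, which is a simpler criterion than the ε error bound on the optimal cost that the abstract describes.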
【Degree-granting institution】: Northeastern University (東北大學)
【Degree level】: Doctoral
【Year conferred】: 2014
【Classification number】: O221;TM711

Article ID: 2119539

Article link: http://sikaile.net/kejilunwen/dianlilw/2119539.html
