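Research and Application of Bridge Game Algorithms under Incomplete Information
Published: 2018-07-03 11:00
Topic: bridge game + machine game; Source: University of Electronic Science and Technology of China, 2017 master's thesis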
[Abstract]: Artificial intelligence has attracted growing attention in recent years and became a hot topic at this year's National Two Sessions. Research on machine game playing has contributed many methods and theories to artificial intelligence, such as game search. Machine games are divided into complete-information games and incomplete-information games. Because incomplete-information games are closer to real-world problems such as war, stock markets, and commerce, they have drawn increasing attention from researchers. Building on existing work, this thesis studies bridge game algorithms under incomplete information. The main contributions are as follows.
(1) A GHA-BP bidding learning strategy with sliding-window allocation of sampling time is proposed. The fuzziness of bidding rules has long been the primary problem in computer bidding. Existing bidding learning strategies resolve this fuzziness to some extent, but they allocate sampling time unreasonably, and bid prediction based on the ID3 algorithm is not accurate enough, which limits how well a bidding strategy can be learned. To address these problems, the thesis proposes an improved bidding learning strategy that uses a sliding window to predict and allocate sampling time and a GHA-BP neural network to classify and learn the fuzzy bidding rules. Experiments show that the strategy allocates the limited time more reasonably, improves the accuracy of the bidding learning strategy, and reduces the difference from double-dummy bidding results.
(2) A heuristic Monte Carlo card-play strategy is proposed. The incompleteness of information has long been the primary problem in card play. Existing Monte Carlo card-play strategies offer a feasible approach through random sampling, but conventional sampling is blind and therefore inefficient. The thesis proposes an improved Monte Carlo card-play strategy that applies heuristics when generating sample deals. Experiments show that, in the same amount of time, the strategy produces more samples that satisfy the bidding constraints and simulates the current state of the deal more accurately, which leads to more reasonable card-play decisions.
(3) Based on the bridge game algorithms above, a computer bridge game system is designed and implemented. The system consists of a control system and a bridge AI program. The control system handles communication between bridge AI programs, and the bridge AI implements and validates the bidding and card-play algorithms proposed in the thesis.
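The conventional (blind) sampling step that the abstract criticizes can be sketched as plain rejection sampling: deal the unseen cards at random and keep only deals that satisfy the constraints inferred from the bidding. The sketch below is an illustration only, not the thesis's method. The seat names, the `sample_deals` function, and the use of a high-card-point range as the bidding constraint are all assumptions, and the heuristic generation proposed in the thesis is not reproduced here because the abstract does not specify it.

```python
import random

RANKS = "23456789TJQKA"
SUITS = "SHDC"
DECK = [r + s for s in SUITS for r in RANKS]  # e.g. "AS" = ace of spades
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}        # standard high-card points


def hcp(hand):
    """Sum the high-card points of a hand (A=4, K=3, Q=2, J=1)."""
    return sum(HCP.get(card[0], 0) for card in hand)


def sample_deals(my_hand, constraints, n_samples, max_tries=100_000, seed=None):
    """Blind rejection sampling of the three hidden hands.

    constraints (an assumed format) maps a seat name to an inclusive
    (min_hcp, max_hcp) range standing in for information from the bidding.
    Returns (accepted_deals, tries_used).
    """
    rng = random.Random(seed)
    known = set(my_hand)
    hidden = [c for c in DECK if c not in known]
    seats = ["LHO", "partner", "RHO"]  # hypothetical seat labels
    deals, tries = [], 0
    while len(deals) < n_samples and tries < max_tries:
        tries += 1
        rng.shuffle(hidden)
        # Slicing copies the cards, so later shuffles do not alter kept deals.
        deal = {s: hidden[i * 13:(i + 1) * 13] for i, s in enumerate(seats)}
        if all(lo <= hcp(deal[s]) <= hi for s, (lo, hi) in constraints.items()):
            deals.append(deal)
    return deals, tries


if __name__ == "__main__":
    my_hand = DECK[:13]  # all thirteen spades, purely illustrative
    deals, tries = sample_deals(my_hand, {"partner": (12, 14)}, 20, seed=0)
    print(f"accepted {len(deals)} deals in {tries} tries")
```

The ratio of accepted deals to tries is the sampling efficiency. Tighter constraints make blind sampling reject more deals, which is the inefficiency the heuristic strategy in the abstract is meant to reduce.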
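[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP18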
Article ID: 2093418
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2093418.html