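Research on the Emergence Mechanisms of Social Norms in Networked Multi-Agent Systems
Published: 2018-06-19 02:39
Topics: complex networks + evolutionary games; Source: Dalian University of Technology, 2016 doctoral dissertation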
[Abstract]: Social norms can be divided into coordination norms and cooperation norms, and they play an extremely important role in maintaining the order and operating efficiency of networked multi-agent systems. The challenge in current research is how to establish social norms quickly and effectively in a given scenario. Because the agents in a networked multi-agent system are self-interested and capable of reasoning and learning, they continually adjust their behavior to improve their own payoffs, based on the payoffs they obtain in the system and on other local information (such as their neighbors' behaviors and payoffs). Bottom-up emergence has therefore become an effective way to establish social norms in multi-agent systems. However, under different network structures and game conflict models, different emergence mechanisms give agents different guiding information and thus strongly affect the outcome of norm emergence. Designing suitable emergence mechanisms for different network structures and game scenarios is therefore the core scientific problem of this dissertation.

Building on a review of existing work, the dissertation addresses four problem scenarios: the coordination game on ring networks, the prisoner's dilemma game on static networks, the prisoner's dilemma game on mobile networks, and the networked repeated prisoner's dilemma game. It proposes corresponding emergence mechanisms along three dimensions (strategy update rules, game-matrix adjustment mechanisms, and dynamic network construction mechanisms) and analyzes the microscopic mechanism of the emergence process in each case. The main contributions are as follows.

1) Because the ring network has the largest diameter, it is the most prone to forming local coordination norms. Existing mechanisms such as the myopic best response rule (MBR), the highest cumulative reward rule, and Q-learning cannot effectively promote the emergence of a global norm on ring networks, so the dissertation proposes a best response rule with a freezing period (FBR). After adopting a new action, an individual enters a freezing period, during which it repeats its previous action with high probability and applies the traditional MBR rule with only a very small probability. Simulation results and microscopic analysis show that FBR turns the motion of the interfaces between local norms from an unbiased random walk into a biased random walk, which raises the interface diffusion rate. With a moderate freezing-period length, FBR promotes the emergence of a global coordination norm on ring networks more quickly.

2) Previous work has proposed a variety of strategy update rules and game-matrix adjustment mechanisms to promote the emergence of cooperation norms on static networks. Although these mechanisms let cooperators survive over a fairly large region of parameter space, they usually cannot guarantee the emergence of a global cooperation norm. To address this, the dissertation proposes the spatially extended Fermi update rule (N-FUR) and the multi-game matrix adjustment mechanism (MG). In N-FUR, an individual takes the weighted sum of the learning target's payoff and the average payoff of the target's neighbors as the target's fitness. This lets very small cooperator clusters survive and expand in a sea of defectors, which raises the critical value for the emergence of a global cooperation norm. The MG mechanism assigns game matrices with positive and negative sucker's payoffs (S) to equal proportions of the individuals in the system. It first raises the cooperation level in the subpopulation that uses the positive-S matrix, and then raises the cooperation fraction and total social payoff of the whole population through the asymmetric flow of strategy imitation from the positive-S subpopulation to the negative-S subpopulation.

3) In mobile networks, poorly designed movement rules easily break up cooperator clusters, making it easy for defectors to invade them and hindering the emergence of cooperation norms. To address this, the dissertation proposes the degree-dependent vector average movement rule (DVAM). Under this rule, an individual updates its movement direction using the weighted average of its neighbors' movement directions, with higher-degree neighbors given larger weights. Simulation and analysis show that this mechanism helps the system form large cooperator clusters more quickly and thereby resist invasion by defectors. DVAM promotes the emergence of cooperation norms on mobile networks more effectively than the traditional random movement and vector average movement rules.

4) In networked repeated games, the number of strategies available to an individual grows exponentially. Existing imitation-based strategy update rules are therefore unsuitable, because they would credit individuals with overly strong observation and reasoning abilities, while aspiration-based learning rules and global extremal update rules cannot promote the emergence of cooperation norms in this scenario. To address this, the dissertation proposes the localized extremal update rule (LEUR). In LEUR, an individual compares its payoff only with those of its neighbors and, when its own payoff is the smallest in its neighborhood, replaces its strategy with a randomly chosen new one. Simulation results and microscopic analysis show that when the neighborhood radius is 2, the number of active individuals and the number of small clusters they form are both at their largest, which lets the system evolve to the tit-for-tat (TFT) state with the highest average payoff.
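
The LEUR update described in contribution 4) can be sketched as follows. This is only an illustrative sketch, not the dissertation's implementation: the abstract does not specify the topology, how ties are handled, or how new strategies are drawn, so the ring topology, the strict-minimum test, the synchronous update, and the names `ring_neighborhood`, `leur_step`, and `draw_strategy` are all assumptions made for this example. Payoffs are taken as given rather than computed from a repeated prisoner's dilemma.

```python
import random


def ring_neighborhood(i, n, radius):
    """Indices within `radius` hops of agent i on a ring of n agents (i excluded).

    Assumption: a ring topology; the abstract does not name the network used.
    """
    return [(i + d) % n for d in range(-radius, radius + 1) if d != 0]


def leur_step(strategies, payoffs, radius, draw_strategy, rng=random):
    """One synchronous LEUR-style update (hypothetical helper).

    An agent whose payoff is strictly the lowest in its neighborhood replaces
    its strategy with a newly drawn one; every other agent keeps its strategy.
    Assumption: ties do not trigger an update.
    """
    n = len(strategies)
    updated = list(strategies)
    replaced = []
    for i in range(n):
        neighbor_payoffs = [payoffs[j] for j in ring_neighborhood(i, n, radius)]
        if payoffs[i] < min(neighbor_payoffs):
            updated[i] = draw_strategy(rng)  # randomly chosen new strategy
            replaced.append(i)
    return updated, replaced


# Example with a neighborhood radius of 2, the value highlighted in the abstract.
# The strategy labels and payoff values below are made up for illustration.
STRATEGY_POOL = ["ALLC", "ALLD", "TFT", "WSLS"]
strategies = ["ALLD"] * 8
payoffs = [3.0, 1.0, 2.5, 2.0, 0.5, 2.2, 2.8, 3.1]
new_strategies, replaced = leur_step(
    strategies, payoffs, radius=2,
    draw_strategy=lambda rng: rng.choice(STRATEGY_POOL),
)
```

In this example only agents 1 and 4 have the strictly lowest payoff within two hops, so only they draw new strategies. Passing the sampler in as `draw_strategy` keeps the sketch independent of how strategies for the repeated game are actually encoded.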
[Degree-granting institution]: Dalian University of Technology
[Degree level]: Doctoral
[Year degree conferred]: 2016
[Classification number]: O157.5; O225
Article ID: 2038002
Article link: http://sikaile.net/kejilunwen/yysx/2038002.html