
Intelligent decision-making algorithm based on bounded FART-Q
ZHOU Yanan, GONG Guanghong
School of Automation Science and Electrical Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
Abstract: Fuzzy adaptive resonance theory (ART) with bounded side length is proposed to address a problem that arises when fuzzy ART is applied to intelligent decision-making. By integrating the modified fuzzy ART with the Q learning algorithm, a bounded fuzzy ART-Q learning (FART-Q) intelligent decision-making network is built. The original fuzzy ART classifies inputs solely by the fuzzy similarity between the input vector and the weight vector, without considering the physical meaning of the state variables, and may therefore produce unreasonable classifications. To solve this problem, the modified algorithm strengthens the resonance condition of fuzzy ART with a bound on side length. This makes it possible both to limit the side length according to the physical meaning of the state variables and to reduce the number of categories. A minefield navigation simulation was conducted to verify the feasibility and effectiveness of bounded FART-Q. Compared with the original fuzzy ART, the modified algorithm classifies more reasonably, with a higher success rate and less computation time.
Key words: artificial neural network     adaptive resonance theory     fuzzy set theory     Q learning     intelligent decision-making

1 Fuzzy adaptive resonance theory

1.1 Fuzzy ART neural network

1.1.1 Fuzzy operations and fuzzy subsets

1.1.2 Fuzzy ART classification algorithm

Fig. 1 Fuzzy ART network

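1) Initialize the neural network with N=0. Add the first neuron to the output layer, so that N=1, and set w_{1l}=1 for all l=1,2,…,L.

2) Present the input vector I to be classified and compute the choice function for every neuron in the output layer:

3) For the neuron J with the largest choice function, verify the resonance condition:

4) Let the neuron learn:

5) Output the classification result:

6) If J=N, add a new neuron to the output layer (N:=N+1) and set w_{Nl}=1 for all l=1,2,…,L.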
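
The six steps above can be sketched in C++. The formulas for the choice function, the resonance condition, and the learning rule are not reproduced in this copy, so the sketch assumes the standard fuzzy ART forms: T_j = |I∧W_j| / (α + |W_j|), |I∧W_J| / |I| ≥ ρ, and W_J := β(I∧W_J) + (1−β)W_J. The names FuzzyArt, fuzzyAnd, and norm1 are hypothetical, and the input is assumed to be non-zero (for example complement-coded).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using Vec = std::vector<double>;

// Fuzzy AND: component-wise minimum of two vectors of equal length.
Vec fuzzyAnd(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    Vec out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = std::min(a[i], b[i]);
    return out;
}

// City-block norm |x|: sum of the components (all assumed to lie in [0,1]).
double norm1(const Vec& x) { return std::accumulate(x.begin(), x.end(), 0.0); }

// Hypothetical minimal fuzzy ART network following steps 1) to 6).
struct FuzzyArt {
    double alpha, beta, rho;   // choice, learning-rate and vigilance parameters
    std::vector<Vec> weights;  // one weight vector per output neuron

    // Step 1: start with a single uncommitted neuron whose weights are all 1.
    FuzzyArt(std::size_t L, double a, double b, double r)
        : alpha(a), beta(b), rho(r), weights(1, Vec(L, 1.0)) {}

    // Classifies input I, lets the winner learn, and returns its index J.
    std::size_t classify(const Vec& I) {
        std::vector<bool> reset(weights.size(), false);
        for (;;) {
            // Step 2: choice function for every neuron that was not reset.
            std::size_t J = weights.size();
            double best = -1.0;
            for (std::size_t j = 0; j < weights.size(); ++j) {
                if (reset[j]) continue;
                double T = norm1(fuzzyAnd(I, weights[j])) / (alpha + norm1(weights[j]));
                if (T > best) { best = T; J = j; }
            }
            assert(J < weights.size());  // the all-ones neuron always resonates
            // Step 3: resonance test; on failure, reset J and search again.
            Vec m = fuzzyAnd(I, weights[J]);
            if (norm1(m) / norm1(I) < rho) { reset[J] = true; continue; }
            // Step 4: learning.
            for (std::size_t l = 0; l < m.size(); ++l)
                weights[J][l] = beta * m[l] + (1.0 - beta) * weights[J][l];
            // Step 6: if the uncommitted neuron won, append a fresh one.
            if (J + 1 == weights.size()) weights.push_back(Vec(I.size(), 1.0));
            return J;  // Step 5: classification result
        }
    }
};
```

Because the last neuron always has all-ones weights, I∧W equals I for it and the resonance test passes, so the search loop terminates for any non-zero input.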

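1.2 Fuzzy ART with bounded side length

x∈(0,1] can serve as one dimension of the input vector of the fuzzy ART classification network.

In I_1, x=0.1, i.e. φ=−0.8π; in I_2, x=0.7, i.e. φ=0.4π.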
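
The two value pairs above are consistent with a linear normalization of an angle φ∈(−π,π] onto x∈(0,1]. The definition of x is not reproduced in this copy, so the mapping x=(φ+π)/(2π) in the following sketch is an assumption fitted to those two examples, and the function name normalizeAngle is hypothetical.

```cpp
#include <cassert>

// Maps an angle phi in (-pi, pi] linearly onto x in (0, 1].
// Assumed form x = (phi + pi) / (2*pi); it reproduces both examples in the
// text (phi = -0.8*pi gives x = 0.1, phi = 0.4*pi gives x = 0.7).
double normalizeAngle(double phi) {
    const double kPi = 3.14159265358979323846;
    assert(phi > -kPi && phi <= kPi);
    return (phi + kPi) / (2.0 * kPi);
}
```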

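When M=2, let the input be I=(a,a^c); the weight vector can correspondingly be written as W_j=(u_j,v_j^c), where u_j and v_j are both two-dimensional vectors. Let u_j and v_j each represent a point in the two-dimensional plane. In the earlier example W_1=(0.1,0.5,0.3,0.4), so u_1=(0.1,0.5) and v_1=(0.7,0.6), and R_1 is the rectangular region shown in Fig. 2.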
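
A short sketch may help to make the geometric reading concrete. It complement-codes a point a into I=(a,a^c) and recovers the two corner points u_j and v_j from a weight vector W_j=(u_j,v_j^c). The type Box and the function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Complement coding: a = (a_1..a_M) becomes I = (a, a^c) with a^c_m = 1 - a_m.
Vec complementCode(const Vec& a) {
    Vec I(a);
    for (double x : a) {
        assert(x >= 0.0 && x <= 1.0);
        I.push_back(1.0 - x);
    }
    return I;
}

// Corner points of the region R_j encoded by W_j = (u_j, v_j^c).
struct Box {
    Vec u;  // lower corner u_j
    Vec v;  // upper corner v_j
};

// Splits W into its two halves and undoes the complement on the second half.
Box categoryBox(const Vec& W) {
    assert(W.size() % 2 == 0);
    const std::size_t M = W.size() / 2;
    Box box;
    for (std::size_t m = 0; m < M; ++m) {
        box.u.push_back(W[m]);
        box.v.push_back(1.0 - W[M + m]);
    }
    return box;
}
```

Calling categoryBox on (0.1,0.5,0.3,0.4) yields the corners (0.1,0.5) and (0.7,0.6) of the example, up to floating-point rounding.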

Fig. 2 Category area covered by W_1

When Σ=(1,1,…,1), Eq. (16) always holds and the algorithm degenerates to the conventional fuzzy ART.
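
Eq. (16) itself is not reproduced in this copy. The sketch below shows one plausible reading, stated here as an assumption: a neuron may resonate only if the category region, once enlarged to contain the input, has a side length of at most Σ_m in every dimension m. The function name withinSideBounds is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Assumed side-length test: I = (a, a^c) is the complement-coded input,
// W = (u, v^c) the weight vector, sigma the per-dimension side-length bound.
// Returns true if the region enlarged to include a respects every bound.
bool withinSideBounds(const Vec& I, const Vec& W, const Vec& sigma) {
    const std::size_t M = sigma.size();
    assert(I.size() == 2 * M && W.size() == 2 * M);
    for (std::size_t m = 0; m < M; ++m) {
        double lower = std::min(I[m], W[m]);                // new lower corner
        double upper = 1.0 - std::min(I[M + m], W[M + m]);  // new upper corner
        if (upper - lower > sigma[m]) return false;
    }
    return true;
}
```

Under this reading every side length lies in [0,1], so a bound of 1 in each dimension can never be violated, which agrees with the statement that Σ=(1,1,…,1) recovers the conventional fuzzy ART.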

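1.3 Advantages of the side-length constraint

1) As described in Section 1.2, it avoids unreasonable classifications caused by a category having one excessively long side.

2) When a large number of inputs are classified, it can reduce the number of categories. Eq. (14) shows that fuzzy ART limits the sum of the side lengths of a category region. The volume of a category region can be defined as: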
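
The volume formula is not reproduced in this copy. The sketch below assumes the usual definition, the product of the side lengths, and also computes the side-length sum, which equals M − |W_j| for a committed category. In standard fuzzy ART with complement coding and fast learning, the resonance condition keeps this sum at or below M(1−ρ). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Side lengths v_m - u_m of the region encoded by W = (u, v^c).
// Intended for committed categories; an all-ones weight gives negative values.
Vec sideLengths(const Vec& W) {
    assert(W.size() % 2 == 0);
    const std::size_t M = W.size() / 2;
    Vec sides(M);
    for (std::size_t m = 0; m < M; ++m) sides[m] = (1.0 - W[M + m]) - W[m];
    return sides;
}

// Sum of the side lengths, equal to M - |W|.
double sideSum(const Vec& W) {
    double sum = 0.0;
    for (double s : sideLengths(W)) sum += s;
    return sum;
}

// Assumed volume definition: product of the side lengths.
double volume(const Vec& W) {
    double product = 1.0;
    for (double s : sideLengths(W)) product *= s;
    return product;
}
```

For W_1=(0.1,0.5,0.3,0.4) the sides are 0.6 and 0.1, giving a sum of 0.7 and a volume of 0.06. A square with the same sum, for example W=(0.1,0.1,0.55,0.55) with sides 0.35 and 0.35, has a volume of 0.1225: for a fixed side-length sum, the product is largest when the sides are equal.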

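2 Bounded FART-Q intelligent decision-making network

Combining Q learning with fuzzy ART makes intelligent decision-making possible. The bounded FART-Q intelligent decision-making network is shown in Fig. 3: the fuzzy ART network outputs the state category s_i, and the action a_K with the largest Q value is selected, i.e.

Fig. 3 Structure of intelligent decision-making network with bounded FART-Q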

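#include <vector>

struct Action_Q
{
    int action_index;    // action index
    double Q_value;      // Q value
};

class CCategory
{
    int category_id;                      // category ID
    CVector weight;                       // weight vector (CVector is defined elsewhere)
    std::vector<Action_Q> action_reward;  // array of action-Q value pairs
};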

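1) Normalize the situation information obtained from sensors and other sources to generate the classification input vector I.

2) Feed I into the fuzzy ART network for classification to obtain the result s_i, and adjust the fuzzy ART network through learning.

3) From the action-Q value pairs of state s_i, select the action a_K with the largest Q value and execute it.

4) Obtain the situation input I′ after a_K has been executed and classify it with the fuzzy ART network to get the next state s′. Obtain the reward r for executing a_K, feed s′ and r back to the action-Q value pairs, and learn the Q value using Eq. (19).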
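
Steps 3) and 4) can be sketched as follows. Eq. (19) is not reproduced in this copy, so the update below assumes the standard one-step Q learning rule, Q(s,a_K) := Q(s,a_K) + η[r + γ max Q(s′,a′) − Q(s,a_K)], with η read as the learning rate and γ as the discount factor; the paper's own rule may differ. ActionQ mirrors the Action_Q structure, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Action-Q value pair, mirroring the Action_Q structure.
struct ActionQ {
    int action_index;
    double q_value;
};

// Step 3: position of the pair with the largest Q value (first one on ties).
std::size_t greedyAction(const std::vector<ActionQ>& pairs) {
    assert(!pairs.empty());
    std::size_t best = 0;
    for (std::size_t k = 1; k < pairs.size(); ++k)
        if (pairs[k].q_value > pairs[best].q_value) best = k;
    return best;
}

// Step 4 under the assumed standard rule: 'current' holds the pairs of state
// s, k is the position of the executed action a_K, r is the reward, and
// 'next' holds the pairs of the next state s'.
void updateQ(std::vector<ActionQ>& current, std::size_t k, double r,
             const std::vector<ActionQ>& next, double eta, double gamma) {
    assert(k < current.size());
    const double maxNext = next[greedyAction(next)].q_value;
    double& q = current[k].q_value;
    q += eta * (r + gamma * maxNext - q);
}
```

With η=0.5 and γ=0.1, a pair with Q=0.6 that receives r=0.8 and sees a best next-state value of 0.5 moves to 0.6 + 0.5×(0.8 + 0.05 − 0.6) = 0.725.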

3 Intelligent decision-making simulation experiment

3.1 Overview of the minefield navigation experiment

Fig. 4 Minefield navigation experiment

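1) Detection: the vehicle has one sensor in each of five directions (left, front-left, front, front-right, and right), and each sensor detects the distance d_i (i=1,2,…,5) to an obstacle in its direction. Another sensor senses the relative direction of the destination, b (1×5). Each component of b represents one direction, as shown in Fig. 5: the component whose direction range contains the destination is 1, and the others are 0.

Fig. 5 Ranges of the destination's direction relative to the vehicle
2) Movement: at each step the vehicle can move one cell in one of five directions: left, front-left, front, front-right, or right.

3) Learning: after each move the vehicle receives the corresponding reward r (see Table 1), and the Q learning algorithm learns the effect of the executed action from that reward. If the vehicle is closer to the destination after the move, r=0.8; otherwise r=0.2. If the vehicle reaches the destination, r=1.0; if it hits an obstacle, r=0.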

Result after move | Closer to destination | Not closer to destination | Success | Failure
Reward r | 0.8 | 0.2 | 1.0 | 0
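
The reward scheme can be written as a small lookup. The enumeration MoveResult and the function names are hypothetical. Checking success and failure before the distance comparison is an assumption about precedence that the text does not state.

```cpp
#include <cassert>

// Possible outcomes of a single move, matching the four reward cases.
enum class MoveResult { Closer, NotCloser, Success, Failure };

// Determines the outcome of a move. Reaching the destination or hitting an
// obstacle is assumed to take precedence over the distance comparison.
MoveResult classifyMove(bool reachedDestination, bool hitObstacle,
                        double distanceBefore, double distanceAfter) {
    assert(distanceBefore >= 0.0 && distanceAfter >= 0.0);
    if (reachedDestination) return MoveResult::Success;
    if (hitObstacle) return MoveResult::Failure;
    return distanceAfter < distanceBefore ? MoveResult::Closer
                                          : MoveResult::NotCloser;
}

// Reward r for each outcome, using the values given in the text.
double reward(MoveResult result) {
    switch (result) {
        case MoveResult::Success:   return 1.0;
        case MoveResult::Failure:   return 0.0;
        case MoveResult::Closer:    return 0.8;
        case MoveResult::NotCloser: return 0.2;
    }
    assert(false && "unreachable");
    return 0.0;
}
```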

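3.2 Experimental results and analysis

Group | α | β | ρ | Σ | η | γ
Group 1 | 0.1 | 1.0 | 0.5 | (1,1,…,1) | 0.5 | 0.1
Group 2 | 0.1 | 1.0 | 0.8 | (1,1,…,1) | 0.5 | 0.1
Group 3 | 0.1 | 1.0 | 0.8 | (0.5,0.5,…,0.5) | 0.5 | 0.1

Fig. 6 Comparison of average success rate among three test groups

Group | Total time/ms | Total moves | Average time per step/ms | Categories after 3 000 episodes | Success rate over 3 000 episodes/%
Group 1 | 10 957.3 | 29 557.5 | 0.367 | 22.7 | 64.37
Group 2 | 49 994.1 | 28 864.4 | 1.732 | 186.6 | 89.24
Group 3 | 30 250.9 | 24 126.1 | 1.253 | 124.1 | 95.26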

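4 Conclusions

1) This paper proposes a fuzzy ART algorithm with bounded side length and combines it with Q learning to build the bounded FART-Q intelligent decision-making network.

2) Three groups of minefield navigation simulation experiments verify that the network can make intelligent decisions quickly. In the experiments the input vector has 20 dimensions (M=10), and with more than 120 categories each decision step takes 1 to 2 ms on average.

3) Compared with the conventional fuzzy ART, fuzzy ART with bounded side length classifies more reasonably, which both raises the success rate of decisions and reduces their computation time.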


Article information

ZHOU Yanan, GONG Guanghong

Intelligent decision-making algorithm based on bounded FART-Q

Journal of Beijing University of Aeronautics and Astronautics, 2015, 41(1): 96-101.
http://dx.doi.org/10.13700/j.bh.1001-5965.2014.0076