﻿ 变精度下不完备邻域决策系统的属性约简算法
«上一篇
 文章快速检索 高级检索

 智能系统学报  2017, Vol. 12 Issue (3): 386-391  DOI: 10.11992/tis.201705027 0

### 引用本文

WANG Yinglong, ZENG Qi, QIAN Wenbin, et al. Attribute reduction algorithm of the incomplete neighborhood decision system with variable precision[J]. CAAI Transactions on Intelligent Systems, 2017, 12(3): 386-391. DOI: 10.11992/tis.201705027.

### 文章历史

1. 江西农业大学 计算机与信息工程学院, 江西 南昌 330045;
2. 江西农业大学 软件学院, 江西 南昌 330045

Attribute reduction algorithm of the incomplete neighborhood decision system with variable precision
WANG Yinglong1, ZENG Qi1, QIAN Wenbin2, YANG Jun2
1. School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China;
2. School of Software, Jiangxi Agricultural University, Nanchang 330045, China
Abstract: Neighborhood rough set model has been widely used in numerical data processing complete, but the discussion of attribute reduction for numeric and symbolic mixed incomplete data is relatively small. Therefore, to resolve this problem, by combining the neighborhood rough set, first, the upper and lower approximation operators and the attribute reduction of the incomplete neighborhood decision system were analyzed based on the variable precision model. Subsequently, based on the generalized neighborhood relation, a rough set model was constructed using the neighborhood granulation method. Furthermore, a method evaluating the attribute significance degree was proposed. Based on this method, an attribute reduction algorithm for the incomplete neighborhood decision system was designed, which can deal with incomplete values directly type and symbolic mixed data. Finally, through the example analysis, the algorithm can solve the attribute reduction result of incomplete neighborhood decision system with variable precision.
Key words: rough set theory    neighborhood relation    incomplete information system    variable precision classification    granular computing    multi-granulation    reducation    decision-theoretic rough sets

1 基本知识

1)∀x, yU, ΔA(x, y)≥0, 当ΔA(x, y)=0时, ∀aiA, ai(x)=ai(y);

2)∀x, yU, ΔA(x, y)=ΔA(y, x);

3)∀x, y, zU, ΔA(x, z)≤ΔA(x, y)+ΔA(y, z)。

 $\Delta A\left( {x,y} \right) = \sqrt {\sum\limits_{{a_i} \in A} {{{\left| {{a_i}\left( x \right) - {a_i}\left( y \right)} \right|}^2}} }$

 $\Delta A\left( {x,y} \right) = \left\{ \begin{array}{l} 0,\;\;\;\;\;{a_i}\left( x \right) = {a_i}\left( y \right)\\ \infty ,\;\;\;\;{a_i}\left( x \right) \ne {a_i}\left( y \right) \end{array} \right.$

δ=0时，变为经典粗糙集模型。

 $\begin{array}{l} R\left( x \right) = \left\{ {\left( {x,y} \right) \in {U^2}:\forall a \in x \cap {f_1}\left( x \right) = } \right.\\ \;\;\;\;\;\;\;\;\;\;{f_1}\left( y \right),a\left( x \right) \in \delta \left( {y,a} \right) \cup a\left( y \right) \in \\ \;\;\;\;\;\;\;\;\;\;\left. {\delta \left( {x,a} \right) \cup a\left( x \right) = * \cup a\left( y \right) = * } \right\} \end{array}$

${\overline {{P_B}} }$(x)是可能包含x的邻域信息粒子组合的集合，${\underline {{P_B}} }$(x)是包含x的邻域信息粒子组合的集合，与x无关的邻域信息粒子集合为U${\overline {{P_B}} }$(x), 记为~x

 $e\left( {X,Y} \right) = \left\{ \begin{array}{l} 1 - \frac{{\left| {X \cap Y} \right|}}{{\left| X \right|}},\;\;\;\;\;\left| X \right| > 0\\ 0,\;\;\;\;\left| X \right| = 0 \end{array} \right.$

2 不完备可变精度粗糙集模型

 $\begin{array}{*{20}{c}} {\overline {P_B^k} \left( {{d_i}} \right) = \left\{ {x \in U\left| {e\left( {{N_B}\left( x \right),{d_i}} \right) < 1 - k} \right.} \right\}}\\ {\underline {P_B^k} \left( {{d_i}} \right) = \left\{ {x \in U\left| {e\left( {{N_B}\left( x \right),{d_i}} \right) \le k} \right.} \right\}} \end{array}$

1) ${\underline {{P_B^k}} }$(di)⊆${\underline {{P_B}} }$(di)，${\overline {{P_B^k}} }$(di)⊆${\overline {{P^B}} }$(di);

2) 对于B′⊆BA，则存在${\underline {{P_{B'}^k}} }$ (di)⊆${\underline {{P_B^k}} }$(di), ${\overline {{P_{B'}^k}} }$(di)⊆${\overline {{P_B^k}} }$(di)。

2) 由定义5可得，对于∀xU, 则有e(NB(x), di) < 1-k成立，因为B′⊆BA，则e(NB(x), di)=$1 -{\rm{ }}\frac{{\left| {{N_{B\prime }}\left(x \right) \cap {d_i}} \right|}}{{\left| {{N_{B\prime }}\left(x \right)} \right|}} < 1 -{\rm{ }}\frac{{\left| {{N_B}\left(x \right) \cap {d_i}} \right|}}{{\left| {{N_B}\left(x \right)} \right|}}$=e(NB(x), di) < 1－k，即x${\underline {{P_B}} }$(di)，所以${\underline {{P_{B'}^k}} }$(di)⊆ ${\underline {{P_B^k}} }$(di)。类似可证${\overline {{P_{B'}^k}} }$(di)⊆${\overline {{P_B^k}} }$(di)，证毕。

δ=0.1，k=0.2，因为样本x1x5的决策属性D取值不同，就算连续型的属性值都在邻域范围内，符号型条件属性取值相同，也不能视为同一类; 因为当k=0.2时，即两个样本在C1, C2, C3, C4属性中只能有一个属性取值不同或不在同一邻域中，所以x1, x2属于同一类，x1x3, x4不属于同一类。

 $\gamma _B^k = \frac{{\left| {{\rm{POS}}_B^k\left( {{d_i}} \right)} \right|}}{{\left| U \right|}}$

1) γBk(d)=γCk(d);

2) ∀aB, γB-ak(d)≠γBk(d)。

 ${\rm{SIG}}\left( {a,B,D} \right) = \gamma _{B \cup \left\{ a \right\}}^k\left( D \right) - \gamma _B^k\left( D \right)$
3 变精度下不完备邻域系统的属性约简 3.1 变精度下不完备邻域系统的属性约简算法

1) 初始化RED=ϕ;

2) 根据决策属性D的值对论域U进行划分U/D={D1, D2, …, Dm};

3) 根据可变精度k值和邻域半径δ值，计算邻域关系NA及对应属性依赖度γCik(D);

4) 选取γCik(D)中的最大值，令RED=Ci;

5) 对于∀CiC－RED，根据定义6和定义7计算属性的重要度SIG(Ci, RED, D)= γRED∪{Ci}k(D)－γREDk(D)，选取属性重要度最大的条件属性:其满足SIG(Ci, RED, D)= max{SIG(Ci, RED, D)}，并将属性Ci放入RED;

6) 如果γREDk(D)≠γCk(D)，则算法回到5)，否则执行7);

7) 输出约简RED，算法结束。

3.2 与经典粗糙集及邻域模型比较

1) 经典粗糙集的属性约简适用于离散型属性约简，需先离散化连续型数据，这将不可避免地造成信息的丢失。而变精度不完备邻域系统的属性约简模型既可处理离散型属性约简，也可直接用于连续型属性约简。本文的属性约简模型是对经典粗糙集模型的扩展。

2) 对于含有不完备型数据的决策系统，经典的粗糙集模型较难直接处理，而本文提出的属性约简模型可直接对数据进行分析，并在可变精度的调节下，能得到数据不同层次的信息粒度。

3) 变精度不完备邻域系统的属性约简模型是对邻域模型的进一步扩展，基于邻域的属性约简需计算各样本的邻域，而本文的属性约简模型因为在可变精度的调控下先对样本进行初步筛选，再进行邻域计算，有效减少了计算量。

4 实例分析

 ${D_1}\left( x \right) = \left\{ {{x_3},{x_4},{x_5},{x_8},{x_9}} \right\}$
 ${D_2}\left( x \right) = \left\{ {{x_1},{x_2},{x_6},{x_7},{x_{10}}} \right\}$

 $\gamma _{{C_1}}^{0.2}\left( D \right) = \frac{1}{2}\;\;\;\;\;\gamma _{{C_2}}^{0.2}\left( D \right) = \frac{4}{5}$
 $\gamma _{{C_3}}^{0.2}\left( D \right) = \frac{7}{{10}}\;\;\;\;\;\gamma _{{C_4}}^{0.2}\left( D \right) = \frac{1}{{10}}$

 ${\rm{SIG}}\left( {{C_1},{\rm{RED,}}D} \right) = \gamma _{{C_1} \cup {\rm{RED}}}^{0.2}\left( D \right) - \gamma _{{\rm{RED}}}^{0.2}\left( D \right) = \frac{1}{{10}}$
 ${\rm{SIG}}\left( {{C_3},{\rm{RED,}}D} \right) = \gamma _{{C_3} \cup {\rm{RED}}}^{0.2}\left( D \right) - \gamma _{{\rm{RED}}}^{0.2}\left( D \right) = \frac{1}{5}$
 ${\rm{SIG}}\left( {{C_4},{\rm{RED,}}D} \right) = \gamma _{{C_4} \cup {\rm{RED}}}^{0.2}\left( D \right) - \gamma _{{\rm{RED}}}^{0.2}\left( D \right) = 0$

5 结束语

 [1] PAWLAK Z. Rough sets and intelligent data analysis[J]. Information sciences, 2002, 147(1): 1-12. (0) [2] ZHANG Junbo, WONG Jiansyuan, PAN Yi, et al. A parallel matrix-based method for computing approximations in incomplete information systems[J][J]. IEEE transactions on knowledge and data engineering, 2015, 27(2): 326-339. DOI:10.1109/TKDE.2014.2330821 (0) [3] WU Weizhi, QIAN Yuhua, LI Tongjun, et al. On rule acquisition in incomplete multi-scale decision tables[J]. Information sciences, 2017, 378: 282-302. DOI:10.1016/j.ins.2016.03.041 (0) [4] 张文修, 吴伟志, 梁吉业, 等. 粗糙集理论与方法[M]. 北京: 科学出版社, 2001: 123-131. (0) [5] 刘芳, 李天瑞. 基于边界域的不完备信息系统属性约简方法[J]. 计算机科学, 2016, 43(3): 242-245. LIU Fang, LI Tianrui. Method for attribute reduction based on rough sets boundary regions[J]. Computer science, 2016, 43(3): 242-245. DOI:10.11896/j.issn.1002-137X.2016.03.044 (0) [6] WU Jianrong, KAI Xuewen, LI Jiaojiao. Atoms of monotone set-valued measures and integrals[J]. Fuzzy sets and systems, 2015, 183: 972-979. (0) [7] 王国胤, 张清华. 不同知识粒度下粗糙集的不确定性研究[J]. 计算机学报, 2008, 31(9): 1588-1598. WANG Guoyin, ZHANG Qinghua. Uncertainty of rough set in different knowledge granularities[J]. Chinese journal of computers, 2008, 31(9): 1588-1598. (0) [8] 钱文彬, 杨炳儒, 谢永红, 等. 一种基于属性度量的快速属性约简算法[J]. 小型微型计算机系统, 2014, 35(6): 1407-1411. QIAN Wenbin, YANG Bingru, XIE Yonghong, et al. A quick algorithm for attribute reduction based on attribute measure[J]. Journal of chinese computer systems, 2014, 35(6): 1407-1411. (0) [9] 鞠恒荣, 马兴斌, 杨习贝, 等. 不完备信息系统中测试代价敏感的可变精度分类粗糙集[J]. 智能系统学报, 2014, 9(2): 219-223. JU Hengrong, MA Xingbin, YANG Xibei, et al. Test-cost-sensitive based variable precision classification rough set in incomplete information system[J]. CAAI transactions on intelligent systems, 2014, 9(2): 219-223. (0) [10] 陈昊, 杨俊安, 庄镇泉. 变精度粗糙集的属性核和最小属性约简算法[J]. 计算机学报, 2012, 35(5): 1011-1017. CHEN Hao, YANG Junan, ZHUANG Zhenquan. The core of attributes and minimal attributes reduction in variable precision rough set[J]. Chinese journal of computers, 2012, 35(5): 1011-1017. (0) [11] 张清华, 薛玉斌, 王国胤. 粗糙集的最优近似集[J]. 软件学报, 2016, 27(2): 295-308. ZHANG Qinghua, XUE Yubin, WANG Guoyin. Optimal approximate sets of rough sets[J]. Journal of software, 2016, 27(2): 295-308. (0) [12] 孟慧丽, 马媛媛, 徐久成. 基于下近似分布粒度熵的变精度悲观多粒度粗糙集粒度约简[J]. 计算机科学, 2016, 43(2): 83-85, 104. MENG Huili, MA Yuanyuan, XU Jiucheng. Granularity reduct of variable precision pessimistic multi-granulation rough set based on granularity entropy of lower approximate distribution[J]. Computer science, 2016, 43(2): 83-85, 104. DOI:10.11896/j.issn.1002-137X.2016.02.018 (0) [13] 续欣莹, 刘海涛, 谢珺, 等. 信息观下基于不一致邻域矩阵的属性约简[J]. 控制与决策, 2016, 31(1): 130-136. XU Xinying, LIU Haitao, XIE Jun, et al. Attribute reduction based on inconsistent neighborhood matrix under information view[J]. Control and decision, 2016, 31(1): 130-136. (0) [14] 胡清华, 于达仁, 谢宗霞. 基于邻域粒化和粗糙逼近的数值属性约简[J]. 软件学报, 2008, 19(3): 640-649. HU Qinghua, YU Daren, XIE Zongxia. Numerical attribute reduction based on neighborhood granulation and rough approximation[J]. Journal of software, 2008, 19(3): 640-649. (0) [15] QIAN Yuhua, LI Shunyong, LIANG Jiye. Pessimistic rough set based decisions: a multigranulation fusion strategy[J]. Information sciences, 2014, 264: 196-210. DOI:10.1016/j.ins.2013.12.014 (0) [16] LIN Guoping, QIAN Yuhua, LI Jinjin. Neighborhood based multigranulation rough sets[J]. International journal of approximate reasoning, 2012, 7(53): 1080-1093. (0) [17] 沈家兰, 汪小燕, 申元霞. 可变程度多粒度粗糙集[J]. 小型微型计算机系统, 2016, 37(05): 1012-1016. SHEN Jialan, WANG Xiaoyan, SHEN Yuanxia. Variable Grade multi-granulation rough set[J]. Journal of Chinese computer systems, 2016, 37(5): 1012-1016. (0) [18] 许韦, 吴陈, 杨习贝. 基于容差关系的不完备可变精度多粒度粗糙集[J]. 计算机应用研究, 2013, 30(6): 1712-1715. XU Wei, WU Chen, YANG Xibei. Incomplete variable precision multigranularity rough set based on tolerance relation[J]. Application research of computers, 2013, 30(6): 1712-1715. (0) [19] 徐久成, 张灵均, 孙林, 等. 广义邻域关系下不完备混合决策系统的约简[J]. 计算机科学, 2013, 40(4): 244-248. XU Jiucheng, ZHANG Lingjun, SUN Lin, et al. Reduction in incomplete hybrid decision systems based on generalized neighborhood relationship[J]. Computer science, 2013, 40(4): 244-248. (0)