文章快速检索 高级检索

1. 江苏理工学院 计算机工程学院, 江苏 常州 213015;
2. 南京信息工程大学 江苏省大数据分析技术重点实验室, 江苏 南京 210044

An incremental attribute reduction algorithm for group objects
QIAN Jin1,2, ZHU Yayan1
1. School of Computer Engineering, Jiangsu University of Technology, Changzhou 213015, China ;
2. Jiangsu Key Laboratory of Big Data Analysis Technology/B-DAT, Nanjing University of Information Science & Technology, Nanjing 210044, China
Abstract: Real-world datasets change in size dynamically. Non-incremental attribute reduction methods usually need to re-compute source data when obtaining a new reduction without considering the information in the existing reduction, which consumes a great deal of computational time and storage space. Therefore, in this paper, some reduction invariance properties for dynamic datasets are discussed. An incremental attribute reduction algorithm for group objects using the previous reduction is proposed to quickly update a reduction with high inheritance rate and thus improve the efficiency of incremental learning. Finally, the incremental approach proposed is compared with an existing incremental attribute reduction algorithm for a single object, the non-incremental attribute reduction algorithms on the UCI, and synthetic datasets. Experimental results show that this incremental attribute reduction algorithm for group objects can deal with dynamic data rapidly, as it has better inheritance of reduction.
Key words: rough set theory     attribute Reduction     group objects     inheritance rate of Reduct     incremental learning

1 粗糙集概念

1)POSRed(D)=POSC(D)；

2)∀a∈Red，POSRed－{a}(D)≠POSRed(D)。

2 面向成组对象集的增量式属性约简算法

1)y无法用Red区分，当且仅当∃x∈U使得∀a∈Red[f(x,a)=f(y,a)];

2)y可以用Red区分，当且仅当∀x∈U使得∃a∈Red[f(x,a)≠f(y,a)]。

①若∃Xi∈BNDCU[yXi]，则POSRedU′= POSCU′

②若∃Xi∈POSCU[yXi]，分为2种情况：a) ∀xXi[f(x,d)=f(y,d)]，则POSRedU′= POSRedU∪ {y}，而POSCU′=POSCU∪ {y}，即POSRedU′=POSCU′b)∃xXi [f(x,d)≠f(y,d)]，则POSCU′= POSCU - Xi，而POSRedU′= POSRedU - Xi，则POSCU′=POSRedU′。总之，POSRedU′= POSCU′

③若[y]CU/C∧[y]RedU/Red，若∃x∈POSRedU使得∀z∈[x]Red[f(z,d)=f(y,d)]，则POSRedU′= POSRedU∪{y} ，而POSCU′= POSCU∪{y}，则POSRedU′=POSCU′

④当[y]CU/C∧[y]RedU/Red，若∃x∈POSRedU使得∃z∈[x]Red [f(x,d)≠f(y,d)]，则POSRedU′=POSRedU- [x]Red，而POSCU′=POSCU∪{y}，即POSRedU′≠ POSCU′

1)A=RedU

2)计算U/A={A1,A2,…,Ar}，U/C={X1,X2,…,Xt}，计算POSCU和POSAU//y属于旧的正区域

3)如果y∈POSCU，则转到9)；//y属于旧的边界域

4)如果y∈BNDCU，则转到9)；//y属于新的正区域对象

5)如果y∉U/A,则转到9)；否则A′h=Ah∪{y}；

6)判断A′h 一致性，若A′h中对象的决策一致，则转到9)；//y与其他对象在A属性集上产生冲突

7)While (|POSAA′h|≠|POSCA′h|)

8) For each attribute cA

9)RedU∪{y}=A，输出RedU∪{y}

1)A=RedUUbnd=Ø；

2)计算 U/A={A1,A2,…,Ar}，U/A={A1,A2,…,Ar′}和(U∪U)/A={A1,A2,…,A′h,Ah+1,…,Ar,Ah+1,…,Ar′}；

3)U/C={X1,X2,…,Xt}，U/C={X1 ,X2,…,Xt′}和(U∪U)/C={X′1,X′2,…,X′h′,Xh′+1,…,Xr,Xh′+1,…,Xt}；

4)如果h=0 和h′=0，转5；

5)计算POSAU和POSCU，如果POSAU=POSCU，转到10)；否则转到6)。//新增对象集中约简不能区分的矛盾对象

6)Ubnd=UΔ - POSAU;

7)for i=1 to h {如果A′i不一致，则Ubnd=UbndA′i；}//累加同一等价类中约简不能区分的矛盾对象

8)计算POSAUbnd和POSCUbnd

9)While (|POSAUbnd| ≠|POSCUbnd|)

//若这样的属性有多个，则任选一个；A=A ∪ {a}；}

10)For each attribute cA

11)RedUU=A,输出RedUU

 U c1 c2 c3 c4 d U△ c1 c2 c3 c4 d x1 1 0 1 0 2 y1 2 1 0 0 1 x2 0 1 0 1 2 y2 1 0 1 0 4 x3 1 1 1 1 3 y3 0 1 0 1 2 x4 1 0 1 0 1 y4 0 2 1 1 1 x5 0 2 2 1 2 y5 0 1 1 1 2 x6 1 2 0 0 2 y6 0 1 1 2 1

 NewObject Case RedU RedU′ IR RedU RedU′ IR RedU RedU′ IR 21001 1) c2c1 c2c1 1 c2c3 c2c1 0.5 c3c4 c3c4c1 1 10104 ① c2c1 c2c1 1 c2c3 c2c3 1 c3c4 c3c4 1 01012 ② c2c1 c2c1 1 c2c3 c2c3 1 c3c4 c3c4 1 02111 ② c2c1 c2c1 1 c2c3 c2c3 1 c3c4 c3c4c1 1 01112 ③ c2c1 c2c1 1 c2c3 c2c1 0.5 c3c4 c3c4c1 1 01121 ④ c2c1 c2c1c3 1 c2c3 c2c3c1 1 c3c4 c3c4 1 合计 平均传承率 1 平均传承率 0.83 平均传承率 1

3 实验验证

 图 1 增量式约简算法和非增量式约简算法运行时间比较 Fig. 1 Comparison of incremental and non-incremental Reduction algorithms on running time

 序号 数据集 对象数 属性个数 类别数 1 Chess 3 196 36 2 2 mushroom 8 124 22 2 3 nursery 12 960 9 5 4 Connect 67 557 42 3 5 Dataset1 200 000 30 5 6 Dataset2 500 000 30 9

 图 2 约简长度比较 Fig. 2 Comparison of Reduct length

 % 数据集 20% 40% 60% 80% 100% Chess 100 100 95.45 95.45 95.45 mushroom 100 100 100 100 100 nursery 100 100 100 100 100 Connect 100 100 100 100 100 Dataset1 100 100 100 100 100 Dataset2 100 100 90.9 90.9 100

4 结论

 [1] PAWLAK Z. Rough sets[J]. International journal of computer & information sciences , 1982, 11 (5) : 341-356 [2] SKOWRON A, RAUSZER C. The discernibility matrices and functions in information systems[M]//SLOWINSKI R. Intelligent Decision Support, Handbook of Applications and Advances of the Rough Sets Theory. Netherlands:Springer, 1992:311-362. [3] 苗夺谦, 胡桂荣. 知识约简的一种启发式算法[J]. 计算机研究与发展 , 1999, 36 (6) : 681-684 MIAO Duoqian, HU Guirong. A heuristic algorithm for reduction of knowledge[J]. Journal of computer research and development , 1999, 36 (6) : 681-684 [4] 王国胤, 于洪, 杨大春. 基于条件信息熵的决策表约简[J]. 计算机学报 , 2002, 25 (7) : 759-766 WANG Guoyin, YU Hong, YANG Dachun. Decision table reduction based on conditional information entropy[J]. Chinese journal of computers , 2002, 25 (7) : 759-766 [5] HU Feng, WANG Guoyin, HUANG Hai, et al. Incremental attribute reduction based on elementary sets[M]//SLEZAK D, WANG Guoyin, SZCZUKA M, et al. Proceedings of the 10th International Conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. Berlin Heidelberg:Springer, 2005:185-193. [6] 杨明. 一种基于改进差别矩阵的属性约简增量式更新算法[J]. 计算机学报 , 2007, 30 (5) : 815-822 YANG Ming. An incremental updating algorithm for attribute reduction based on improved discernibility matrix[J]. Chinese journal of computers , 2007, 30 (5) : 815-822 [7] 冯少荣, 张东站. 一种高效的增量式属性约简算法[J]. 控制与决策 , 2011, 26 (4) : 495-500 FENG Shaorong, ZHANG Dongzhan. Effective incremental algorithm for attribute reduction[J]. Control and decision , 2011, 26 (4) : 495-500 [8] 尹林子, 阳春华, 王晓丽, 等. 基于标记可辨识矩阵的增量式属性约简算法[J]. 自动化学报 , 2014, 40 (3) : 397-404 YIN Linzi, YANG Chunhua, WANG Xiaoli, et al. An incremental algorithm for attribute reduction based on labeled discernibility matrix[J]. Acta automatica sinica , 2014, 40 (3) : 397-404 [9] SHU Wenhao, SHEN Hong. Updating attribute reduction in incomplete decision systems with the variation of attribute set[J]. International journal of approximate reasoning , 2014, 55 (3) : 867-884 DOI:10.1016/j.ijar.2013.09.015 [10] CHEN Hongmei, LI Tianrui, LUO Chuan, et al. A Decision-theoretic rough set approach for dynamic data mining[J]. IEEE transactions on fuzzy systems , 2015, 23 (6) : 1958-1970 DOI:10.1109/TFUZZ.2014.2387877 [11] LIANG Jiye, WANG Feng, DANG Chuangyin, et al. A group incremental approach to feature selection applying rough set technique[J]. IEEE transactions on knowledge and data engineering , 2014, 26 (2) : 294-308 DOI:10.1109/TKDE.2012.146 [12] QIAN Jin, YE Feiyue, LV Ping. An incremental attribute reduction algorithm in decision table[C]//Proceedings of the 7th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). Yantai, China:IEEE, 2010, 4:1848-1852. [13] QIAN Jin, MIAO Duoqian, ZHANG Zehua, et al. Hybrid approaches to attribute reduction based on indiscernibility and discernibility relations[J]. International journal of approximate reasoning , 2011, 52 (2) : 212-230 DOI:10.1016/j.ijar.2010.07.011 [14] QIAN Jin, MIAO Duoqian, ZHANG Zehua, et al. Parallel attribute reduction algorithms using MapReduce[J]. Information sciences , 2014, 279 : 671-690 DOI:10.1016/j.ins.2014.04.019 [15] 康向平, 苗夺谦. 一种基于概念格的集值信息系统中的知识获取方法[J]. 智能系统学报 , 2016, 11 (3) : 287-293 KANG Xiangping, MIAO Duoqian. A knowledge acquisition method based on concept latticein set-valued information systems[J]. CAAI transactions on intelligent systems , 2016, 11 (3) : 287-293
DOI: 10.11992/tis.201606005

0

#### 文章信息

QIAN Jin, ZHU Yayan

An incremental attribute reduction algorithm for group objects

CAAI Transactions on Intelligent Systems, 2016, 11(4): 496-502
http://dx.doi.org/10.11992/tis.201606005