﻿ 基于三支决策的非重叠社团划分
«上一篇
 文章快速检索 高级检索

 智能系统学报  2017, Vol. 12 Issue (3): 293-300  DOI: 10.11992/tis.201705013 0

### 引用本文

FANG Liandi, ZHANG Yanping, CHEN Jie, et al. Three-way decision based on non-overlapping community division[J]. CAAI Transactions on Intelligent Systems, 2017, 12(3): 293-300. DOI: 10.11992/tis.201705013.

### 文章历史

1. 安徽大学 计算机科学与技术学院, 安徽 合肥 230601;
2. 安徽大学 计算机智能与信号处理教育部重点实验室, 安徽 合肥 230601;
3. 安徽大学 国际商学院, 安徽 合肥 230601

Three-way decision based on non-overlapping community division
FANG Liandi1,2, ZHANG Yanping1,2, CHEN Jie1,2, WANG Qianqian3, LIU Feng1,2, WANG Gang1,2
1. School of Computer Science and Technology, Anhui University, Hefei 230601, China;
2. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, China;
3. School of Business, Anhui University, Hefei 230601, China
Abstract: This paper proposes an algorithm called N-TWD based on the theory of three-way decision, which can further divide overlapping communities formed by the initial clustering into non-overlapping communities. First, it utilizes a hierarchical clustering algorithm to get an overlapping community structure. The nodes in the non-overlapping parts of the community of the left side between two communities with overlapping parts were defined as positive regions. Then, the nodes on its right are denoted as the negative region, and nodes in the overlapping parts are denoted as the boundary region. The degree of belonging (BP, BN) between the positive and negative regions was calculated using the nodes in the boundary region. Moreover, a further division was done based on the degree of belonging. After division, the belonging of the rest nodes in the boundary region would be determined by voting to ultimately get a non-overlapping community structure. The experimental results for four classical social networks and one real-world data-set indicate that the proposed algorithm has a lower time complexity and gets a higher modularity value than other community division algorithms (GN, NFA, LPA, CACDA).
Key words: complex network    community division    overlapping node    three-way decision    granulation coefficient    hierarchical clustering    community structure    node belonging degree

N-TWD算法对于社团重叠部分(边界域)获取了节点，与已明确划分的社团(正域/负域)之间的紧密关系依次进行划分，而不仅仅只根据邻居数目进行投票，更好地体现了节点的真实归属。本文采用4个真实的社交网络数据集和1个真实数据集对N-TWD算法进行了验证，实验结果证明了三支决策方法对部分重叠节点处理的可行性和有效性。

1 相关工作 1.1 三支决策理论

 $若P\left( {X\left| {{{\left[X \right]}_R}} \right.} \right) \ge \alpha, 则x \in {\rm{POS}}\left( X \right)$ (1)
 $若\beta < P\left( {X\left| {{{\left[X \right]}_R}} \right.} \right) < \alpha, 则x \in {\rm{BND}}\left( X \right)$ (2)
 $若P\left( {X\left| {{{\left[X \right]}_R}} \right.} \right) \le \beta, 则x \in {\rm{NEG}}\left( X \right)$ (3)

 $\alpha = \frac{{{\lambda _{PN}}-{\lambda _{BN}}}}{{\left( {{\lambda _{PN}}-{\lambda _{BN}}} \right) + \left( {{\lambda _{BP}}-{\lambda _{PP}}} \right)}}$ (4)
 $\beta = \frac{{{\lambda _{PN}}-{\lambda _{NN}}}}{{\left( {{\lambda _{BN}}-{\lambda _{NN}}} \right) + \left( {{\lambda _{NP}}-{\lambda _{BP}}} \right)}}$ (5)
1.2 重叠社团中的3个域

1) 正域(POS)：左边社团的非重叠节点。

2) 负域(NEG)：右边社团的非重叠节点。

3) 边界域(BND)：重叠部分的节点。

 图 1 重叠社团3个域的划分 Fig.1 The division of three domains in overlapping community
1.3 社团评价标准

 $粒化系数\;f\left( C \right) = \frac{{\left\lfloor {{\text{BND}}} \right\rfloor }}{{\left\lfloor {{\text{POS + NEG + BND}}} \right\rfloor }}$ (6)

 $Q = \sum\limits_i {\left( {{e_{ii}} - a_i^2} \right)} = {\rm{Tre}} - \left\| {{\mathit{\boldsymbol{e}}^2}} \right\|$ (7)

2 基于三支决策的非重叠社团划分算法

2.1 节点相似度与社团归属度

 ${\rm{SVV}}\left( {{v_i}, {v_j}} \right) = \frac{{\left| {\mathit{\Gamma }\left( {{v_i}} \right) \cap \mathit{\Gamma }\left( {{v_j}} \right)} \right|}}{{k\left( {{v_i}} \right) \times k\left( {{v_j}} \right)}}$ (8)

 $\begin{array}{*{20}{c}} {{B_P} = {\rm{Belongness\_POS}}\left( {v, {\rm{POS}}} \right) = }\\ {\frac{{\sum\limits_i^{{N_P}} {{\rm{SVV}}\left( {{v_i}, {v_j}} \right)} }}{{{N_P}}}} \end{array}$ (9)
 $\begin{array}{*{20}{c}} {{B_N} = {\rm{Belongness\_NEG}}\left( {v, {\rm{NEG}}} \right) = }\\ {\frac{{\sum\limits_i^{{N_N}} {{\rm{SVV}}\left( {{v_i}, {v_j}} \right)} }}{{{N_N}}}} \end{array}$ (10)

 $若\frac{{{B_P} - {B_N}}}{{{{\left| {{B_P} - {B_N}} \right|}_{\max }}}} > \gamma, 则\;v \in {\rm{POS}}\left( X \right)$ (11)
 $若\frac{{{B_N} - {B_P}}}{{{{\left| {{B_P} - {B_N}} \right|}_{\max }}}} > \gamma, 则\;v \in {\rm{NEG}}\left( X \right)$ (12)
 $若\frac{{{\left| {{B_P} - {B_N}} \right|}}}{{{{\left| {{B_P} - {B_N}} \right|}_{\max }}}} < \gamma, 则\;v \in {\rm{BND}}\left( X \right)$ (13)

 图 2 计算归属度后3个域的划分 Fig.2 The division of three domains after calculation of belonging degree
 $\begin{array}{*{20}{c}} {{B_{P9}} = 0.11, {B_{N9}} = 0.05;}\\ {{B_{P9}} - {B_{N9}} = 0.11 - 0.05 = 0.06;}\\ {{B_{N9}} - {B_{P9}} = 0.05 - 0.11 = - 0.06;} \end{array}$
 $\begin{array}{*{20}{c}} {{B_{P10}} = 0.12, {B_{N10}} = 0.09;}\\ {{B_{P10}} - {B_{N10}} = 0.12 - 0.09 = 0.03;}\\ {{B_{N10}} - {B_{P10}} = 0.09 - 0.12 = - 0.03;} \end{array}$

2.2 算法流程描述

1) 初始化网络，根据公式(6) 计算粒化系数POS(X)、NEG(X)、BND(X)。

2) 计算边界域中节点与正域、负域中的社团的归属度BPBN

 $\begin{array}{*{20}{c}} {{\rm{while}}\;\exists v \in {\rm{BND}}\left( X \right)、\frac{{{\left| {{B_P} - {B_N}} \right|}}}{{{{\left| {{B_P} - {B_N}} \right|}_{\max }}}} > \gamma \;{\rm{do}}}\\ {{\rm{if}}\frac{{{B_P} - {B_N}}}{{{{\left| {{B_P} - {B_N}} \right|}_{\max }}}} > \gamma }\\ {{\rm{then}}\;v \in {\rm{POS}}\left( X \right), {\rm{POS}}\left( X \right) = {\rm{POS}}\left( X \right) + \left\{ v \right\}}\\ {{\rm{else}}\;v \in {\rm{NEG}}\left( X \right), {\rm{NEG}}\left( X \right) = {\rm{NEG}}\left( X \right) + \left\{ v \right\}}\\ {{\rm{end}}\;{\rm{while}}} \end{array}$

3) 在2) 完成后，对于仍然留在边界域中的节点进行投票处理：

 $\begin{array}{*{20}{c}} {{\rm{while}}\;\exists v \in {\rm{BND}}\left( X \right), \frac{{{\left| {{B_P} - {B_N}} \right|}}}{{{{\left| {{B_P} - {B_N}} \right|}_{\max }}}} < \gamma \;{\rm{do}}}\\ {L = {\rm{Label}}\left( v \right)}\\ {{\rm{if}}\;N\left( L \right) \in {\rm{POS}}\left( X \right)}\\ {{\rm{then}}\;v \in {\rm{POS}}\left( X \right), {\rm{POS}}\left( X \right) = {\rm{POS}}\left( X \right) + \left\{ v \right\}}\\ {{\rm{else}},\;v \in {\rm{NEG}}\left( X \right), {\rm{NEG}}\left( X \right) = {\rm{NEG}}\left( X \right) + \left\{ v \right\}}\\ {{\rm{end}}\;{\rm{while}}} \end{array}$

4) 输出POS(X)和NEG(X)。

3 实验分析 3.1 基准数据实验

 图 3 Q值与λ值关系图 Fig.3 The connection between λ and Q

 图 4 γ值与Q值关系图 Fig.4 The connection between γ and Q

3.2 应用数据实验

 图 5 无权无向网络 Fig.5 A unweighted and undirected network

 图 6 真实数据集中Q值与λ值关系图 Fig.6 The connection between λ and Q of true dataset
4 结束语

 [1] SHEN H, CHENG X, CAI K, et al. Detect overlapping and hierarchical community structure in networks[J]. Physica a: statistical mechanics and its applications, 2009, 388(8): 1706-1712. DOI:10.1016/j.physa.2008.12.021 (0) [2] LIU Y, PAN L, JIA X, et al. Three-way decision based overlapping community detection[C]//Rough Sets and Knowledge Technology. Springer, Berlin, 2013: 279-290. http://link.springer.com/chapter/10.1007/978-3-642-41299-8_27 (0) [3] 柯望. 基于层次粒化的社团发现方法研究[D]. 合肥: 安徽大学, 2016. KE Wang. Reach on community detection algorithm based on Hierarchical Granulation[D]. Hefei: Anhui University, 2016. http://cdmd.cnki.com.cn/Article/CDMD-10357-1016128267.htm (0) [4] KERNIGHAN B W, LIN S. An efficient heuristic procedure for partitioning graphs[J]. The bell system technical journal, 1970, 49(2): 291-307. DOI:10.1002/bltj.1970.49.issue-2 (0) [5] ZHANG Y P, WANG Y. Detecting communities using spectral bisection method based on normal matrix[J]. Computer engineering and applications, 2006, 46(27): 43-45. (0) [6] POTHEN A, SIMON H D, LIOU K P. Partitioning sparse matrices with eigenvectors of graphs[J]. SIAM journal on matrix analysis and applications, 1990, 11(3): 430-452. DOI:10.1137/0611030 (0) [7] GIRVAN M, NEWMAN M E J. Community structure in social and biological networks[J]. Proceedings of the national academy of sciences, 2002, 99(12): 7821-7826. DOI:10.1073/pnas.122653799 (0) [8] 金弟, 杨博, 刘杰, 等. 复杂网络簇结构探测——基于随机游走的蚁群算法[J]. 软件学报, 2012, 23(3): 451-464. JIN Di, YANG Bo, LIU Jie, et al. Ant colony optimization based on random walk for community detection in complex networks[J]. Journal of software, 2012, 23(3): 451-464. (0) [9] NEWMAN M E J. Fast algorithm for detecting community structure in networks[J]. Physical review E, 2004, 69(6): 066133. DOI:10.1103/PhysRevE.69.066133 (0) [10] RADICCHI F, CASTELLANO C, CECCONI F, et al. Defining and identifying communities in networks[J]. Proceedings of the national academy of sciences of the United States of America, 2004, 101(9): 2658-2663. DOI:10.1073/pnas.0400054101 (0) [11] 赵姝, 柯望, 陈洁, 等. 基于聚类粒化的社团发现算法[J]. 计算机应用, 2014, 34(10): 2812-2815. ZHAO Shu, KE Wang, CHEN Jie, et al. Community detection algorithm based on clustering granulation[J]. Journal of computer applications, 2014, 34(10): 2812-2815. DOI:10.11772/j.issn.1001-9081.2014.10.2812 (0) [12] YAO Y Y. Three-way decisions with probabilistic rough sets[J]. Information sciences, 2010, 180(3): 341-353. DOI:10.1016/j.ins.2009.09.021 (0) [13] YAO Y. Two semantic issues in a probabilistic rough set model[J]. Fundamental informaticae, 2011, 108(3/4): 249-265. (0) [14] 贾修一, 商琳, 周献中, 等. 三支决策理论与应用[M]. 南京: 南京大学出版社, 2012. (0) [15] 于洪, 王国胤, 李天瑞, 等. 三支决策:复杂问题求解方法与实践[M]. 北京: 科学出版社, 2015. (0) [16] YU H, ZHANG C, WANG G. A tree-based incremental overlapping clustering method using the three-way decision theory[J]. Knowledge-based systems, 2016, 91: 189-203. DOI:10.1016/j.knosys.2015.05.028 (0) [17] YU H, WANG Y, JIAO P. A three-way decisions approach to density-based overlapping clustering[M]. Berlin Heidelberg: Springer, 2014: 92-109. (0) [18] YU H, ZHANG C, HU F. An Incremental Clustering Approach Based on Three-Way Decisions[C]//Rough Sets and Current Trends in Computing.Springer, Berlin Heidelberg, 2014, 8536: 152-159. https://link.springer.com/chapter/10.1007/978-3-319-08644-6_16 (0) [19] LIU Y, PAN L, JIA X, et al. Three-way decision based overlapping community detection[C]//Rough Sets and Knowledge Technology, 2013, 8171: 279-290. https://link.springer.com/chapter/10.1007/978-3-642-41299-8_27 (0) [20] YAO Y. An Outline of a theory of three-way decisions[C]//Rough Sets and Currnet Trends in Computing. Berlin Heidelberg:Springer, 2012: 1-17. https://link.springer.com/chapter/10.1007/978-3-642-32115-3_1 (0) [21] YAO Y Y, WONG S K M. A decision theoretic framework for approximating concepts[J]. International journal of man-machine studies, 1992, 37(6): 793-809. DOI:10.1016/0020-7373(92)90069-W (0) [22] JIA X, TANG Z, LIAO W, et al. On an optimization representation of decision-theoretic rough set model[J]. International journal of approximate reasoning, 2014, 55(1): 156-166. DOI:10.1016/j.ijar.2013.02.010 (0) [23] QIAN Y, ZHANG H, SANG Y, et al. Multi-granulation decision-theoretic rough sets[J]. International journal of approximate reasoning, 2014, 55(1): 225-237. DOI:10.1016/j.ijar.2013.03.004 (0) [24] NEWMAN M E J, GIRVAN M. Finding and evaluating community structure in networks[J]. Physical review E, 2004, 69(2): 026113. DOI:10.1103/PhysRevE.69.026113 (0) [25] 贾修一, 李伟湋, 商琳, 等. 一种自适应求三枝决策中决策阈值的算法[J]. 电子学报, 2011, 39(11): 2520-2525. JIA Xiuyi, LI Weiwei, SHANG Lin, et al. An adaptive learning parameters algorithm in three-way decision-throretic rough set model[J]. Chinese journal of electronics, 2011, 39(11): 2520-2525. (0) [26] LEICHT E A, HOLME P, NEWMAN M E J. Vertex similarity in networks[J]. Physical review E, 2006, 73(2): 026120. DOI:10.1103/PhysRevE.73.026120 (0) [27] RAGHAVAN U N, Albert R, Kumara S. Near linear time algorithm to detect community structures in large-scale networks[J]. Physical reviewe, 2007, 76(3): 036106. (0)