﻿ 基于三支决策的序列数据代价敏感分类算法
«上一篇
 文章快速检索 高级检索

 智能系统学报  2019, Vol. 14 Issue (6): 1255-1261  DOI: 10.11992/tis.201905049 0

### 引用本文

LIU Mulei, XU Feifei. A sequence data, cost-sensitive classification algorithm based on three-way decisions[J]. CAAI Transactions on Intelligent Systems, 2019, 14(6): 1255-1261. DOI: 10.11992/tis.201905049.

### 文章历史

A sequence data, cost-sensitive classification algorithm based on three-way decisions
LIU Mulei , XU Feifei
School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200090, China
Abstract: Cost-sensitive classification is different from the general classification method, which pays more attention to the classification accuracy of high-cost categories, but tolerates the accuracy of global classification. Three-way decisions are a solution to a cost-sensitive classification problem and lack support for sequence data. Combined with the ability of the LSTM model in sequence data processing, a method for classifying sequence data a using three-way decision method (3WD) is proposed. First, a general classification of the original data was done through the LSTM network; second, an overall cost estimate was performed on the classification result of step one; finally, the high-risk result was delayed or rejected. Methods were tested on four data sets and two sets of comparative experiments were performed. Experimental results showed that the new method distinguished the classification results of the LSTM model without changing the original structure.
Key words: cost-sensitive    three-way decision    LSTM    sequence data classification    classification algorithm    high-cost categorie    cost estimate

1 相关工作 1.1 三支决策

(P)当 ${{{P}}_{{r}}}\left( {{{X|}}\left[ {{x}} \right]} \right) \geqslant \alpha$ 时， $x \in {\rm POS}\left( {\rm{X}} \right);$

(B)当 $\beta < {{{P}}_{{r}}}\left( {{{X|}}\left[ {{x}} \right]} \right) <\alpha$ 时， $x \in{\rm BND}\left( {{X}} \right);$

(N)当 ${{{P}}_{{r}}}\left( {{{X|}}\left[ {{x}} \right]} \right) \leqslant \beta$ 时， $x \in {\rm NEG}\left( {{X}} \right);$

 ${\rm{\alpha }} = \frac{{{{\rm{{\textit{λ}} }}_{{\rm{PN}}}} - {{\rm{{\textit{λ}} }}_{{\rm{BN}}}}}}{{\left( {{{\rm{{\textit{λ}} }}_{{\rm{PN}}}} - {{\rm{{\textit{λ}} }}_{{\rm{BN}}}}} \right) + \left( {{{\rm{{\textit{λ}} }}_{{\rm{BP}}}} - {{\rm{{\textit{λ}} }}_{{\rm{PP}}}}} \right)}}$ (1)
 ${\rm{\beta }} = \frac{{{{\rm{\textit{λ} }}_{{\rm{BN}}}} - {{\rm{\textit{λ} }}_{{\rm{NN}}}}}}{{\left( {{{\rm{\textit{λ} }}_{{\rm{BN}}}} - {{\rm{\textit{λ} }}_{{\rm{NN}}}}} \right) + \left( {{{\rm{\textit{λ} }}_{{\rm{NP}}}} - {{\rm{\textit{λ} }}_{{\rm{NP}}}}} \right)}}$ (2)

 $0 \leqslant {\rm{\beta }} < \alpha \leqslant 1$ (3)
1.2 长短时记忆网络

LSTM是由Hoehreiterhe与Schmiduhber于1997年提出后经过大量的改进，目前被广泛应用[9]，成为目前处理序列与时序问题上的热门方案。LSTM是由一般的RNN改进而来。LSTM与一般的RNN的主要区别是在LSTM中的神经元不再是由单纯的神经元组成而是由4个功能不同的门来共同作用。其中包括了输入门、输出门、状态门，以及遗忘门。具体的结构如图1所示。

LSTM的独特结构是为了使其能够解决长期依赖问题而专门设计的。不同于RNN网络，LSTM的重复结构是由更加复杂的3个门相互连接而成。包括遗忘门、输入门与输出门。

 ${f_t} = \sigma \left( {{W_f} \cdot \left[ {{h_{t - 1}},{x_t}} \right] + {b_f}} \right)$ (4)
 ${i_t} = {\rm{\sigma }}\left( {{W_i} \cdot \left[ {{h_{t - 1}},{x_t}} \right] + {b_i}} \right)$ (5)
 $\widetilde {{C_t}} = {\rm tanh}\left( {{W_C} \cdot \left[ {{h_{t - 1}},{x_t}} \right] + {b_C}} \right)$ (6)
 ${C_t} = {f_t}{\cdot }{C_{t - 1}} + {i_t}{\cdot}\widetilde {{C_t}}$ (7)
 ${o_t} = {\rm{\sigma }}\left( {{W_o} \cdot \left[ {{h_{t - 1}},{x_t}} \right] + {b_o}} \right)$ (8)
 ${h_t} = {o_t}{\cdot}{\rm tanh}\left( {{C_t}} \right)$ (9)

1.3 代价敏感分类

 $L\left( {x,i} \right) = \mathop \sum \limits_j P\left( {j{\rm{|}}x} \right)c\left( {i,j} \right)$ (10)

2 基于LSTM的三支决策分类算法

 Download: 图 2 基于LSTM的三支决策算法流程 Fig. 2 Flow of 3WD based on LSTM

2.1 前置分类器

2.2 三支决策

 $\begin{array}{l} \left( {\rm Pli} \right){\rm IF}\;\;{\rm Pr}\left( {X{\rm{|}}{u_i}} \right) \geqslant {\alpha _i},\;\;{\rm THEN}\;\;{u_i} \in {\rm POS}\left( X \right)\\ \left( {\rm Bli} \right)\;{\rm IF}\;{\beta _i} < Pr\left( {X{\rm{|}}{u_i}} \right) < {\alpha _i},\;{\rm THEN}\;{u_i} \in {\rm BND}\left( X \right)\\ \left( {\rm Nli} \right)\;{\rm IF}\;{\rm Pr}\left( {X{\rm{|}}{u_i}} \right) \leqslant {\beta _i},\;{\rm THEN}\;{u_i} \in {\rm NEG}\left( X \right) \end{array}$
2.3 算法概述

BEGIN：

1) 　输入 $f$ $t:$ 分类特征，分类表

2) 　 $t' =$ 由LSTM模型预测或分类数据 $t$

3) 　输入 $v:$ 代价函数表

4) 　计算边界 $\alpha$ $\beta$

5) 　FOR 样本 $i$ IN $t'$ :

计算分类概率 ${p_i}$

IF ${p_i} \geqslant \alpha$ :

$i \in {\rm pos}$

ELSE IF $\alpha > {\rm pi} \geqslant \beta$ :

$i \in {\rm bnd}$

ELSE:

$i \in {\rm neg}$

6) 　 $d=$ 计算整体代价

7) 　IF $d >$ 目标值 $d'$ :

GOTO 2

END

3 实验与结果

PM2.5数据集来自于UCI数据库，该数据集记录了从2010年1月1日至2014年12月31日北京市的空气质量指数和气象数据。数据集为时间序列数据，特征为连续特征，任务可作为分类或回归任务。数据一共43 824条记录，特征共13个，部分数据缺失。

 Download: 图 3 原始数据集中的特征分布 Fig. 3 Frequency of features in this dataset
 Download: 图 4 原数据集中分类结果与时间的变化关系 Fig. 4 Relations between classify result and time change

 Download: 图 5 1949—1960年国际航班旅客人数 Fig. 5 Number of international airline traveler between 1949—1960
3.1 PM2.5数据集

 ${\rm{\lambda }} = \mathop \sum \limits_{{i}} {{\rm{\lambda }}_{{{ix}}}}{{{w}}_{{{ix}}}} + {{b}}$ (11)

3.2 国际旅行旅客人数数据集

 $\textit{λ} = \mathop \sum \limits_{i \in \left\{ {P,B,N} \right\},j \in \left\{ {X,\neg X} \right\}} {\textit{λ} _{i,j}}{w_{i,j}} + b$ (12)

 ${{\rm{\textit{λ} }}_{{{i}},{{j}}}} \sim {{f}}\left( {{\hat{\rm \theta }},{{{t}}_{{n}}}} \right)$ (13)

 Download: 图 9 随 ${t_n}$ 而代价越来越大的判断曲线 ${\textit{λ} _t}$ Fig. 9 As ${{{t}}_{{n}}}$ increases, the prediction results become more and more inaccurate