Text sentiment analysis combining hybrid neural network and conditional random field

CAAI Transactions on Intelligent Systems, 2021, Vol. 16, Issue (2): 202-209. DOI: 10.11992/tis.201907041

### Cite this article

ZHAI Xueming, WEI Wei. Text sentiment analysis combining hybrid neural network and conditional random field[J]. CAAI Transactions on Intelligent Systems, 2021, 16(2): 202-209. DOI: 10.11992/tis.201907041.


Text sentiment analysis combining hybrid neural network and conditional random field
ZHAI Xueming , WEI Wei
School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, China
Abstract: To address problems in text sentiment analysis such as the long training time of neural network models and insufficient learning of contextual information, this paper proposes a model that combines a hybrid neural network with a conditional random field (CRF). Using the neural network as the language model, the model fuses the semantic information and structural features extracted by the convolutional neural network with those captured by the bidirectional gated recurrent unit. The CRF is then used as the classifier: it determines the probability distribution over sentiments, from which the sentiment category can be accurately identified. Tested on the NLPCC 2014 dataset, the model achieves an accuracy of 91.74% and obtains better accuracy and F values than other classification models.
Key words: convolutional neural network (CNN)    gated recurrent unit (GRU)    conditional random field (CRF)    text sentiment analysis    language model    semantic feature    contextual information    classifier

1 Sentiment analysis process

2 Sentiment analysis model combining a hybrid neural network and a conditional random field

2.1 Construction of the hybrid neural network

Fig. 3 Structure of the hybrid neural network

 $y_n^d = f\left( {{W^d} \circ {x_{n:n + \delta - 1}} + {b^d}} \right)$ (1)

 ${{{y}}^d} =[ y_1^d\;\;y_2^d\;\;\cdots\;\;y_{N - \delta + 1}^d]$ (2)
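Equations (1) and (2) describe a single convolution filter of width δ sliding over the word-vector matrix to produce the feature map $y^d$. A minimal NumPy sketch follows; the nonlinearity $f$ is not specified above, so ReLU is assumed here, and all weights are random placeholders:

```python
import numpy as np

def conv_feature_map(x, W, b, delta):
    """Slide a width-delta filter W over the word-vector sequence x
    (shape: N x d) and apply a nonlinearity, per Eqs. (1)-(2)."""
    N = x.shape[0]
    feats = []
    for n in range(N - delta + 1):
        window = x[n:n + delta]                         # x_{n:n+delta-1}
        feats.append(max(0.0, np.sum(W * window) + b))  # f assumed to be ReLU
    return np.array(feats)                              # y^d, length N - delta + 1

# toy run: a 5-word sentence with 4-dim embeddings and a width-2 filter
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
W = rng.normal(size=(2, 4))
y = conv_feature_map(x, W, b=0.1, delta=2)
print(y.shape)  # feature map of length N - delta + 1 = 4
```

In practice one such map is produced per filter, and multiple filter widths are used in parallel.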

Bi-GRU is an improvement on the GRU: it captures contextual information in both the forward and backward directions simultaneously and therefore achieves higher accuracy than a plain GRU. In addition, Bi-GRU has low complexity, weak dependence on word vectors, and fast response time. In a Bi-GRU structure, a recurrent network runs both before and after each training sequence. At time t, the activation ${h}_{t}$ of a Bi-GRU unit is jointly controlled by the activation ${h}_{t-1}$ at time t−1, the candidate activation ${\stackrel{~}{h}}_{t}$, and the update gate z. The computation is given in Eqs. (3) and (4), where $\odot$ denotes element-wise multiplication:

 ${\tilde h_t} = {\rm{tanh}}\left( {{{W}}{x_t} + {{U}}\left( {{r_t} \odot {h_{t - 1}}} \right)} \right)$ (3)
 $h_t^j = \left( {1 - z_t^j} \right)h_{t - 1}^j + z_t^j\tilde h_t^j$ (4)
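Equations (3) and (4) amount to one GRU step. In the sketch below, the update and reset gates $z_t$ and $r_t$ follow the standard GRU formulation that the text leaves implicit, and all weight matrices are random placeholders; a Bi-GRU would run one such pass in each direction and concatenate the resulting states:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, params):
    """One GRU update following Eqs. (3)-(4)."""
    Wz, Uz, Wr, Ur, W, U = params
    z = sigmoid(Wz @ x_t + Uz @ h_prev)            # update gate z_t
    r = sigmoid(Wr @ x_t + Ur @ h_prev)            # reset gate r_t
    h_tilde = np.tanh(W @ x_t + U @ (r * h_prev))  # Eq. (3)
    return (1 - z) * h_prev + z * h_tilde          # Eq. (4)

# toy forward pass: 5 words, 4-dim embeddings, 3-dim hidden state
rng = np.random.default_rng(0)
params = (rng.normal(size=(3, 4)), rng.normal(size=(3, 3)),
          rng.normal(size=(3, 4)), rng.normal(size=(3, 3)),
          rng.normal(size=(3, 4)), rng.normal(size=(3, 3)))
h = np.zeros(3)
for x_t in rng.normal(size=(5, 4)):
    h = gru_step(x_t, h, params)
print(h.shape)  # (3,)
```

Because each step is a convex combination of the previous state and a tanh candidate, the hidden state stays bounded in (−1, 1).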

Fig. 4 Feature extraction process of the Bi-GRU

 ${{S}}' = {f_c} \oplus {f_g}$ (5)

A linear transformation of S′ gives the score k of the sentiment category of the corresponding word:

 $k = {{{U}}_1}{{S}}' + b$ (6)

 $\vec {{E}} = \mathop \sum \limits_{t = 1}^{m - 1} - \log\sigma \left( {{{\overrightarrow {{{{h}}_t}} }^{\rm{T}}}{v_{{w_{t + 1}}}}} \right) + \mathop \sum \limits_{i = 1,{w_i} \in N}^l \log\sigma \left( {{{\overrightarrow {{{{h}}_t}} }^{\rm{T}}}{v_{{w_i}}}} \right)$ (7)
 $\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\leftarrow}$}}{{{E}}} = \mathop \sum \limits_{t = 1}^{m - 1} - \log\sigma \left( {{{\overleftarrow {{{{h}}_t}} }^{\rm{T}}}{v_{{w_{t - 1}}}}} \right) + \mathop \sum \limits_{i = 1,{w_i} \in N}^l \log\sigma \left( {{{\overleftarrow {{{{h}}_t}} }^{\rm{T}}}{v_{{w_i}}}} \right)$ (8)

 $N = \left\{ {{w_j}|{w_j} \in V\& {w_j} \notin \left\{ {{w_t},{w_{t - 1}}, \cdots ,{w_1}} \right\}} \right\}$ (9)

 ${p_{{w_t}}}\left( {y{\rm{|}}S'} \right) = \left[ \!\!{\begin{array}{*{20}{c}} {p({y_1} = 1|S')} \\ {p({y_2} = 1|S')} \\ {\begin{array}{*{20}{c}} \vdots \\ {p({y_m} = 1|S')} \end{array}} \end{array}} \!\!\right] = \frac{1}{{\displaystyle\sum \limits_{i = 1}^m {e^{{k_i}}}}}\left[\!\! {\begin{array}{*{20}{c}} {{e^{{k_1}}}} \\ {{e^{{k_2}}}} \\ {\begin{array}{*{20}{c}} \vdots \\ {{e^{{k_m}}}} \end{array}} \end{array}} \!\!\right]$ (10)
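Equations (5), (6), and (10) together map the fused features to a class distribution: concatenate the CNN and Bi-GRU features, score each class linearly, then normalize with softmax. A sketch with placeholder dimensions (3-dim CNN features, 4-dim Bi-GRU features, two sentiment classes):

```python
import numpy as np

def fuse_and_classify(f_c, f_g, U1, b):
    """Concatenate CNN and Bi-GRU features (Eq. 5), score the classes
    linearly (Eq. 6), and normalize with softmax (Eq. 10)."""
    s = np.concatenate([f_c, f_g])  # S' = f_c (+) f_g
    k = U1 @ s + b                  # class scores k
    e = np.exp(k - k.max())         # max-shifted for numerical stability
    return e / e.sum()              # p(y | S')

rng = np.random.default_rng(1)
p = fuse_and_classify(rng.normal(size=3), rng.normal(size=4),
                      rng.normal(size=(2, 7)), np.zeros(2))
print(p, p.sum())  # two class probabilities summing to 1
```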

2.2 CRF classifier model

The CRF classifier and the neural-network classifier each have strengths and weaknesses. A CRF model requires corpus information to be annotated manually in advance, with features such as part of speech and degree designed by hand, whereas a neural network can learn feature vectors automatically from the training data and achieve better results [19]. However, neural-network models usually need longer training time, and some of their outputs are invalid for named-entity recognition, so a CRF is needed afterwards to impose entity rules on the sequence-labeling process. This paper combines the CRF and the neural-network model according to their respective characteristics, yielding a joint model with better performance [20].

The CRF model learns and predicts over multiple features of the samples. Although a CRF can generate feature vectors and perform classification on its own, this paper uses the features extracted by the hybrid neural network as intermediate quantities, replacing the vector values in the original formula.

In the CRF classifier, the emission probability is the probability ${p}_{w}$ that a word in the sequence belongs to each sentiment class; the transition probability is the probability of moving from one label class to an adjacent one. In a traditional CRF classifier, the emission probabilities are generated from feature templates, but to capture richer contextual information this paper uses the features obtained automatically by the hybrid neural network as the emission probabilities. The emission probability of the proposed CRF classifier is computed as

 ${p_{{w_t}}}\left( {w{\rm{|}}S',y} \right) = \frac{1}{{\displaystyle \sum \limits_{i = 1}^m {e^{{k_i}}}}}\left[ {\begin{array}{*{20}{c}} {{e^{{k_1}}}} \\ {{e^{{k_2}}}} \\ {\begin{array}{*{20}{c}} \vdots \\ {{e^{{k_m}}}} \end{array}} \end{array}} \right]$ (11)

 ${p_{tw}} = \mathop \prod \limits_{t = 1}^n {\mathit{\Phi }}\left( {{y_{{w_{t - 1}}}},{y_{{w_t}}}} \right)*{p_{{w_t}}}\left( {y{\rm{|}}S'} \right)$ (12)

 ${\rm{loss}} = {y_{tw}} - \max\left( {{p_{tw}}} \right)$ (13)
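Equations (11)–(13) score a label sequence as a product of transition terms Φ and neural emission probabilities, and the loss compares the gold sequence with the highest-scoring one. Below is a max-product (Viterbi-style) decoding sketch, assuming `emissions` holds the per-word class probabilities of Eq. (11) and `trans` the learned transition scores; both matrices are illustrative placeholders:

```python
import numpy as np

def sequence_score(emissions, trans, path):
    """Product of transition and emission terms along a label path,
    as in Eq. (12). emissions: (n, m); trans: (m, m)."""
    score = emissions[0, path[0]]
    for t in range(1, len(path)):
        score *= trans[path[t - 1], path[t]] * emissions[t, path[t]]
    return score

def best_path(emissions, trans):
    """Max-product decoding over all label sequences, giving the
    max(p_tw) term used in the loss of Eq. (13)."""
    n, m = emissions.shape
    delta = emissions[0].copy()
    back = np.zeros((n, m), dtype=int)
    for t in range(1, n):
        cand = delta[:, None] * trans * emissions[t][None, :]
        back[t] = cand.argmax(axis=0)   # best previous label for each label
        delta = cand.max(axis=0)
    path = [int(delta.argmax())]
    for t in range(n - 1, 0, -1):       # follow backpointers
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(delta.max())

# toy example: 3 words, 2 sentiment labels
emissions = np.array([[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]])
trans = np.array([[0.6, 0.4], [0.3, 0.7]])
path, score = best_path(emissions, trans)
print(path, score)
```

Dynamic programming makes the maximum over all $m^n$ label sequences tractable in $O(nm^2)$ time.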

3 Experiments and analysis

3.1 Experimental dataset

3.2 Evaluation metrics

 $A=\frac{m+p}{m+n+l+p}$ (14)
 $R=\frac{m}{m+p}$ (15)
 $F=\frac{2m}{2m+l+p}$ (16)
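For reference, the metrics above can be sketched from confusion counts. The symbol table defining m, n, l, and p is not reproduced in this excerpt, so the sketch uses the standard TP/TN/FP/FN names instead; note that 2·TP/(2·TP+FP+FN) matches the form of Eq. (16):

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, recall, and F1 from standard confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)  # = 2*tp/(2*tp+fp+fn)
    return accuracy, recall, f1

a, r, f = metrics(tp=50, tn=30, fp=10, fn=10)
print(a, r, f)  # 0.8, 0.8333..., 0.8333...
```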

3.3 Experimental parameters

Fig. 5 Trend of the F value with the number of iterations

Fig. 6 Trend of the F value with the word-vector dimension

3.4 Experimental results and comparative analysis

CNN: word vectors are fed into a CNN for classification.

Bi-GRU: word vectors are fed into a bidirectional gated recurrent unit for classification.

CRF: word vectors are fed into a conditional random field for classification.

CNN+Bi-GRU: the CNN and Bi-GRU are trained jointly; the word vectors are fed into both networks, their outputs are fused, and Softmax performs the classification.

Bi-GRU+CRF: the Bi-GRU and CRF models are combined in a chain; the trained word vectors are the input of the Bi-GRU model, whose output is the input of the CRF model, which outputs the final sentiment result.

C-BG+CRF: the sentiment analysis model proposed in this paper, combining the hybrid neural network with a conditional random field.

Fig. 7 Comparison of F values of the six models

4 Conclusion

[1] CAMBRIA E. Affective computing and sentiment analysis[J]. IEEE Intelligent Systems, 2016, 31(2): 102-107.
[2] CHEN Long, GUAN Ziyu, HE Jinhong, et al. A survey on sentiment classification[J]. Journal of Computer Research and Development, 2017, 54(6): 1150-1170. DOI: 10.7544/issn1000-1239.2017.20160807.
[3] YANG Ligong, ZHU Jian, TANG Shiping. Survey of text sentiment analysis[J]. Journal of Computer Applications, 2013, 33(6): 1574-1578, 1607. DOI: 10.3724/SP.J.1087.2013.01574.
[4] TABOADA M, BROOKE J, TOFILOSKI M, et al. Lexicon-based methods for sentiment analysis[J]. Computational Linguistics, 2011, 37(2): 267-307. DOI: 10.1162/COLI_a_00049.
[5] DING Shengchun, WU Jingchanyuan, LI Hongmei. Chinese micro-blogging opinion recognition based on SVM model[J]. Journal of the China Society for Scientific and Technical Information, 2016, 35(12): 1235-1243. DOI: 10.3772/j.issn.1000-0135.2016.012.001.
[6] LIANG Jun, CHAI Yumei, YUAN Huibin, et al. Deep learning for Chinese micro-blog sentiment analysis[J]. Journal of Chinese Information Processing, 2014, 28(5): 155-161. DOI: 10.3969/j.issn.1003-0077.2014.05.019.
[7] COLLOBERT R, WESTON J, BOTTOU L, et al. Natural language processing (almost) from scratch[J]. Journal of Machine Learning Research, 2011, 12: 2493-2537.
[8] KIM Y. Convolutional neural networks for sentence classification[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha, Qatar, 2014: 1746-1751.
[9] YIN Wenpeng, KANN K, YU Mo, et al. Comparative study of CNN and RNN for natural language processing[J]. 2017.
[10] TANG Duyu, QIN Bing, LIU Ting. Document modeling with gated recurrent neural network for sentiment classification[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon, Portugal, 2015: 1422-1432.
[11] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780. DOI: 10.1162/neco.1997.9.8.1735.
[12] ZHU Xiaodan, SOBHANI P, GUO Hongyu. Long short-term memory over recursive structures[C]//Proceedings of the 32nd International Conference on Machine Learning. Lille, France, 2015: 1604-1612.
[13] BAI Jing, LI Fei, JI Donghong. Attention based BiLSTM-CNN Chinese microblogging position detection model[J]. Computer Applications and Software, 2018, 35(3): 266-274. DOI: 10.3969/j.issn.1000-386x.2018.03.051.
[14] LIU Yang. The research of time series prediction based on GRU neural network[D]. Chengdu: Chengdu University of Technology, 2017.
[15] WEI Wei, XIANG Yang, CHEN Qian. Survey on Chinese text sentiment analysis[J]. Journal of Computer Applications, 2011, 31(12): 3321-3323.
[16] QI Xiaoying. Semantic intelligence analysis of artificial intelligence news events based on NLPIR[J]. China Computer & Communication, 2019, 31(20): 104-107.
[17] MIKOLOV T, CHEN Kai, CORRADO G, et al. Efficient estimation of word representations in vector space[C]//Proceedings of Workshop at ICLR. [S.l.], 2013.
[18] MNIH A, TEH Y W. A fast and simple algorithm for training neural probabilistic language models[C]//Proceedings of the 29th International Conference on Machine Learning. Edinburgh, UK, 2012: 419-426.
[19] WANG Hao, DENG Sanhong. Comparative study on HMM and CRFs applying in information extraction[J]. New Technology of Library and Information Service, 2007(12): 57-63.
[20] WANG Hongfei. Research of sentiment analysis for Chinese micro blog based on conditional random field[D]. Guangzhou: Guangdong University of Technology, 2013.