Anomaly detection of gas turbine exhaust temperature based on DAE-iForest
Ship Science and Technology, 2023, Vol. 45, Issue 24: 132-136. DOI: 10.3404/j.issn.1672-7649.2023.24.024
LI Kun-tai, YU You-hong
College of Power Engineering, Naval University of Engineering, Wuhan 430033, China
Abstract: The exhaust temperature of a gas turbine can be used to detect anomalies in high-temperature components such as the combustion chamber and the first few turbine blade stages. Early and reliable anomaly detection is crucial to keeping the gas turbine operating efficiently. With the wide application of machine learning, data-driven condition monitoring methods have become increasingly popular. To detect anomalies in the exhaust temperature distribution when fault data are unavailable, a deep autoencoder (DAE) is used to learn features, and an isolation forest (iForest) is used to learn the normal behavior of the feature data, thereby achieving anomaly detection. Compared with other one-class anomaly detection methods, this method achieves the best detection performance metrics and provides effective and sensitive anomaly detection of gas turbine exhaust temperature.
Key words: gas turbine; exhaust gas temperature; anomaly detection; deep autoencoder (DAE); isolation forest
0 Introduction

1 Model and methods
1.1 Deep autoencoder

 Fig. 1 Structure diagram of the DAE

 $h = {f_\theta }(x) = s({\boldsymbol{W}}x + b),$ (1)
 $x' = {g_\theta }(h) = s({\boldsymbol{W}}'h + b').$ (2)

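The DAE takes the reconstruction error on the training set $D$ as its objective function and determines the network parameters $\theta = \{ {\boldsymbol{W}},{\boldsymbol{W}}',b,b'\}$ by optimizing this objective with the backpropagation algorithm: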

 ${J_{AE}}(\theta ) = \frac{1}{{{N_D}}}\sum\limits_{x \in D} {L(x,g(f(x)))}.$ (3)

 $L = \sum\nolimits_{i = 1}^{{d_x}} {{{({x_i} - {{x'}_i})}^2}}.$ (4)
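As a rough illustration of Eqs. (1) to (4), the following sketch evaluates one encoder layer, one decoder layer, and the mean reconstruction error in plain Python. It assumes the activation $s$ is a sigmoid and uses a single hidden layer with randomly initialized weights; the sizes `d_x` and `d_h` and all function names are hypothetical, and no training (backpropagation) is performed.

```python
import math
import random

def sigmoid(z):
    # Assumed choice for the activation s; the text does not specify it.
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W v + b for a matrix W (list of rows) and vectors v, b."""
    return [sum(w * vj for w, vj in zip(row, v)) + bi
            for row, bi in zip(W, b)]

def encode(x, W, b):
    # Eq. (1): h = s(W x + b)
    return [sigmoid(z) for z in affine(W, x, b)]

def decode(h, W_out, b_out):
    # Eq. (2): x' = s(W' h + b')
    return [sigmoid(z) for z in affine(W_out, h, b_out)]

def reconstruction_error(x, x_rec):
    # Eq. (4): squared error summed over the d_x input dimensions
    return sum((xi - ri) ** 2 for xi, ri in zip(x, x_rec))

def objective(D, W, b, W_out, b_out):
    # Eq. (3): mean reconstruction error over the training set D
    total = sum(reconstruction_error(x, decode(encode(x, W, b), W_out, b_out))
                for x in D)
    return total / len(D)

rng = random.Random(0)
d_x, d_h = 4, 2  # hypothetical input and hidden sizes
W = [[rng.uniform(-0.5, 0.5) for _ in range(d_x)] for _ in range(d_h)]
b = [0.0] * d_h
W_out = [[rng.uniform(-0.5, 0.5) for _ in range(d_h)] for _ in range(d_x)]
b_out = [0.0] * d_x
D = [[rng.random() for _ in range(d_x)] for _ in range(8)]  # toy data in [0, 1)
J = objective(D, W, b, W_out, b_out)
```

A deep autoencoder would stack several such encoder and decoder layers; the hidden vector returned by `encode` and the value returned by `reconstruction_error` correspond to the quantities $h$ and $L$ used later in the detection procedure.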
1.2 Isolation forest algorithm

 Fig. 2 Structure diagram of the iForest

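1) Randomly select one attribute $x_i$ from the $p$-dimensional attributes.

2) Randomly select a split value $p$ between the maximum and minimum values of $x_i$.

3) Split $X$ into $X_l$ and $X_r$ according to whether the $x_i$ attribute of each sample in $X$ is less than or greater than $p$.

4) Treat $X_l$ and $X_r$ as the new $X$ and repeat the above steps to build an iTree, until a child node contains only one instance, all data in $X$ have the same value, or the iTree reaches its height limit.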
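The four steps above can be sketched as a short recursive function. This is a minimal illustration rather than the authors' implementation: it skips the subsampling used by the original iForest algorithm, picks the split attribute only among attributes that still vary (a simplification of step 1), sets the height limit to $\lceil \log_2 n \rceil$ as a common choice, and measures path length as the plain number of splits without the adjustment the original algorithm adds for unresolved leaves. The names `build_itree` and `depth_of` and the toy data are hypothetical.

```python
import math
import random

def build_itree(X, height, max_height, rng):
    """Build one iTree from X, a list of equal-length lists of floats."""
    # Stop when the node holds at most one instance or the height limit is hit.
    if len(X) <= 1 or height >= max_height:
        return {"size": len(X)}
    # Attributes that still vary; if none, all data in X share the same value.
    varying = [i for i in range(len(X[0]))
               if min(x[i] for x in X) < max(x[i] for x in X)]
    if not varying:
        return {"size": len(X)}
    i = rng.choice(varying)                      # step 1: random attribute
    values = [x[i] for x in X]
    p = rng.uniform(min(values), max(values))    # step 2: random split value
    X_l = [x for x in X if x[i] < p]             # step 3: partition X
    X_r = [x for x in X if x[i] >= p]
    return {"attr": i, "split": p,               # step 4: recurse on both parts
            "left": build_itree(X_l, height + 1, max_height, rng),
            "right": build_itree(X_r, height + 1, max_height, rng)}

def depth_of(x, node, depth=0):
    """Number of splits traversed before x reaches a leaf."""
    if "attr" not in node:
        return depth
    child = node["left"] if x[node["attr"]] < node["split"] else node["right"]
    return depth_of(x, child, depth + 1)

rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(62)]
inlier, outlier = [0.0, 0.0], [10.0, 10.0]       # toy points, not turbine data
X += [inlier, outlier]
max_height = math.ceil(math.log2(len(X)))
forest = [build_itree(X, 0, max_height, rng) for _ in range(50)]

def mean_depth(x):
    return sum(depth_of(x, t) for t in forest) / len(forest)
```

On this toy data the isolated point is separated after far fewer splits on average than the point in the middle of the cluster, which is the property the anomaly score below relies on.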

 $IF = \{ {t_1},...,{t_T}\}.$ (5)

 $h(x) = \frac{1}{T}\sum\limits_{t \in IF} {{h_t}(x)}.$ (6)

 $s(x,n) = {2^{ - \frac{{h(x)}}{{c(n)}}}}.$ (7)

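$c(n)$ is the normalization factor for $h(x)$: the average number of steps needed to isolate one sample from the other $n$ samples, used as the baseline average path length for $n$ samples. It is defined as: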

 $c(n) = \left\{ {\begin{array}{*{20}{l}} {2H(n - 1) - 2(n - 1)/n,}&{n > 2,}\\ {1,}&{n = 2,}\\ {0,}&{{\rm{otherwise}}.} \end{array}} \right.$ (8)

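When $h(x) \to c(n)$, $s(x,n) \to 0.5$, meaning the test instance $x$ shows no distinct anomaly; when $h(x) \to 0$, $s(x,n) \to 1$, meaning $x$ can be regarded as an anomaly; when $h(x) \to n - 1$, $s(x,n) \to 0$, meaning that an anomaly score close to 0 indicates $x$ is very likely normal.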
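The three limiting cases can be checked numerically with a small sketch of Eqs. (7) and (8). Here $H(k)$ is taken to be the $k$-th harmonic number computed exactly (the original iForest paper approximates it by $\ln k$ plus Euler's constant); the function names and the sample size of 256 are illustrative assumptions.

```python
def harmonic(k):
    """k-th harmonic number H(k) = 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))

def c(n):
    # Eq. (8): normalization factor for the average path length
    if n > 2:
        return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n
    if n == 2:
        return 1.0
    return 0.0

def anomaly_score(h, n):
    # Eq. (7): s(x, n) = 2 ** (-h(x) / c(n)); requires n >= 2 so that c(n) > 0
    return 2.0 ** (-h / c(n))

n = 256  # hypothetical sample size
score_typical = anomaly_score(c(n), n)   # h(x) -> c(n)
score_anomaly = anomaly_score(0.0, n)    # h(x) -> 0
score_normal = anomaly_score(n - 1, n)   # h(x) -> n - 1
```

With these definitions `score_typical` equals 0.5, `score_anomaly` equals 1, and `score_normal` is very close to 0, matching the three cases described above.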

1.3 Anomaly detection procedure based on DAE-iForest

 Fig. 3 Flow of the DAE-iForest anomaly detection algorithm

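1) Data preprocessing. First, the normal data are divided into a training set, a validation set, and a test set; the data are then mean-normalized.

2) Training the DAE. The DAE is trained on the training set ${X_{train}}$, and its hyperparameters are tuned on the validation set during training. The trained deep autoencoder preserves the key information of an input sample in a low-dimensional space, including the extracted hidden features and the features that cause the sample's reconstruction error. In addition, the DAE minimizes the reconstruction error of normal data, so anomalous data yield larger reconstruction errors, which makes the DAE well suited to unsupervised anomaly detection.

3) Training the iForest. The DAE is used to compute the hidden features $h$ and the reconstruction error $L$ of each sample, which are merged into the final feature $\mu = [h,L]$ to train the isolation forest. This yields the anomaly scores of the normal samples, from which the threshold is determined.

4) Testing and anomaly detection. A test instance $x$ is fed into the DAE model to obtain its hidden features $h$ and reconstruction error $L$, which are merged into the final feature $\mu = [h,L]$. $\mu$ is then fed into the iForest model to obtain an anomaly score, which is compared with the anomaly score threshold $Th$: if the score is greater than $Th$, the sample is anomalous; otherwise it is normal.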
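The data handling around the two models in steps 1), 3) and 4) can be sketched as follows. The DAE and the iForest themselves are not implemented here; the hidden features, reconstruction error, and anomaly scores are placeholder numbers. Two points are assumptions rather than details given in the text: mean normalization is taken to be $(x - \text{mean})/(\max - \min)$ per column, and the threshold $Th$ is taken as a high quantile of the anomaly scores of normal samples. All function names and values are hypothetical.

```python
def fit_mean_normalizer(rows):
    """Per-column mean and range (max - min) computed from the training rows."""
    cols = list(zip(*rows))
    means = [sum(col) / len(col) for col in cols]
    ranges = [(max(col) - min(col)) or 1.0 for col in cols]  # guard zero range
    return means, ranges

def normalize(row, means, ranges):
    # Assumed definition of mean normalization: (x - mean) / (max - min)
    return [(v - m) / r for v, m, r in zip(row, means, ranges)]

def merge_features(h, L):
    # Final feature mu = [h, L]: hidden features followed by reconstruction error
    return list(h) + [L]

def choose_threshold(normal_scores, quantile=0.95):
    # Assumed rule: Th is a high quantile of the scores of normal samples
    ordered = sorted(normal_scores)
    idx = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[idx]

def is_anomaly(score, Th):
    # Step 4: a score greater than Th marks the sample as anomalous
    return score > Th

# Made-up exhaust temperature readings (two thermocouples, three samples)
train = [[500.0, 510.0], [520.0, 505.0], [510.0, 515.0]]
means, ranges = fit_mean_normalizer(train)
x_norm = normalize([530.0, 500.0], means, ranges)

mu = merge_features([0.2, 0.7], 0.05)            # placeholder h and L
normal_scores = [0.41, 0.44, 0.39, 0.47, 0.43]   # placeholder iForest scores
Th = choose_threshold(normal_scores)
flags = [is_anomaly(s, Th) for s in (0.62, 0.40)]
```

The normalizer is fitted on the training data only and then reused for validation and test samples, so that the test instance passes through exactly the same scaling as the data the models were trained on.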

2 Case validation
2.1 Data description

 Fig. 4 Comparison between the normal and fault distributions of the exhaust temperature of the low-pressure turbine of a gas turbine
2.2 Experimental setup and evaluation metrics

 ${\text{Accuracy}} = \frac{{TP + TN}}{{TP + FP + TN + FN}},$ (9)
 ${\text{Precision}} = \frac{{TP}}{{TP + FP}},$ (10)
 ${\text{Recall}} = \frac{{TP}}{{TP + FN}},$ (11)
 ${F_1}\text{-score} = \frac{{2 \times {\text{Precision}} \times {\text{Recall}}}}{{{\text{Precision}} + {\text{Recall}}}}.$ (12)
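Eqs. (9) to (12) translate directly into code. The sketch below computes the four metrics from confusion-matrix counts; the counts in the example are invented for illustration and are not results from the paper, and returning 0 when a denominator is zero is a convention chosen here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)            # Eq. (9)
    precision = tp / (tp + fp) if (tp + fp) else 0.0      # Eq. (10)
    recall = tp / (tp + fn) if (tp + fn) else 0.0         # Eq. (11)
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # Eq. (12)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts, with anomalies treated as the positive class
metrics = classification_metrics(tp=90, fp=10, tn=95, fn=5)
```

Treating anomalous samples as the positive class means that recall measures the fraction of anomalies that are caught, while precision measures how many raised alarms are genuine.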

2.3 Comparison and analysis of experimental results

 Fig. 5 Confusion matrix of classification results

 Fig. 6 ROC curves of different models

3 Conclusion
