Wastewater treatment processes (WWTPs) are complex and energyintensive systems as they require continuous operation with effluent treatment [1]. In fact, to meet the effluent discharge standards and reduce the effluent quality (EQ), wastewater treatment plants always run at full load status. It is worth noting that EQ is used to represent the fines which are required to be paid due to the discharge of pollutants in to the receiving water bodies. Under full load status, both of the dissolved oxygen concentration (
At present, the mechanism models of EC and EQ are mainly designed by the usage of process variables in the WWTP. Hernández et al. utilized a nonradial data envelopment analysis (DEA) method to calculate the energy index for a wastewater treatment plant located in Spain [3]. Yang et al. applied a computational fluid dynamic (CFD) model to optimize the operating condition by taking into consideration the flow field and dissolved oxygen concentration [4]. Vera et al. employed a statistical software, called INFOSTAT, to establish an EC model. However, these mechanism models have certain limitations, such as a single decision variable, complex structure, and many parameters [5]. Thus, the prediction accuracy of these mechanismbased models cannot satisfy the requirements of multiobjective optimal control in the WWTP.
The datadriven modeling method has the advantages of online detection, short time, and high accuracy. It is widely employed to identify the nonlinear dynamic systems. With the ability to approach any nonlinear system with arbitrary precision, neural networks have been widely applied to predict the critical parameters of water quality in WWTPs [6]. Noori et al. utilized a support vector machine to predict the biochemical oxygen demand (BOD) in order to determine the uncertainty of the datadriven method [7]. Qiao et al. introduced a softcomputing method for BOD, where the TS fuzzy neural network with the error backpropagation (EBP) algorithm is utilized to design the prediction model [8]. Kusiak et al. adopted a multilayer perceptronbased datamining method to establish the EC and wastewater flow rate models [9]. Han et al. applied an adaptive regressive kernel function (ARKF) approach to develop an EC model [10]. However, the research on the modeling of EC and EQ from the perspective of multiobjective optimal control is rare. In addition, since the initial network structure is generated randomly and the parameters are usually adjusted by the EBP algorithm during the learning process, the prediction accuracy of the neural networkbased datadriven models needs to be further improved.
Recent studies have shown that fuzzy neural network (FNN), which combines the learning capability of a neural network with the interpretability of a fuzzy system, has tremendous advantage, and has an extremely extensive application in datadriven modeling [11]. To develop the EC and EQ models with the usage of FNN, two issues must be dealt with, including the determination of network structure and the learning of network parameters. To determine the network size, some flexible clustering methods, such as fuzzy cmeans clustering [12], GustafsonKessel clustering [13], and Knearest neighbors clustering [14], are applied to partition the input space. These methods are mainly used for offline modeling. Another class of previously proposed methods, considering that whole training samples might not be available at once, utilizes online modeling [15], [16]. However, for these online modeling approaches, the main problem regarding how to produce new rules or prune redundant rules based on some approximate criteria still remains open. To update the parameters of FNN, the EBP algorithm is adopted frequently. However, as widely reported in the literature, the EBP algorithm has some drawbacks, such as slow convergence, local optimum, and low precision [17]. To enhance the learning performance of FNN, various types of learning algorithms have been proposed for the training of FNN, such as hybrid learning algorithm [18], adaptive gradient descent approach [19], and evolutionary algorithm [20]. However, these algorithms have either poor convergence ability or long training time.
In this paper, a novel fuzzy clustering approach based on the density peaks (DP) is applied to determine the network size. Fuzzy rules with appropriate centers can enhance the generalization performance of FNN. Meanwhile, an adaptive learning algorithm based on the improved LevenbergMarquardt approach is designed to optimize the parameters of FNN during the learning process. The convergence speed and approximation accuracy of FNN can be enhanced by the combination of density peaks clustering method and adaptive parameter learning algorithm.
The rest of this paper is outlined as follows. The architecture of FNN is introduced in Section Ⅱ. Section Ⅲ details the proposed DPAFNN, including the DPbased clustering method and the improved LM parameter learning algorithm. Subsequently, the convergence of DPAFNN is discussed in Section Ⅳ. In Section Ⅴ, the proposed DPAFNN is utilized to develop the nonlinear dynamic models of EC and EQ. Conclusion of the paper is provided in Section Ⅵ.
Ⅱ. FUZZY NEURAL NETWORKFor a multiinput and singleoutput (MISO) system, the nonlinear continuous time form of the dynamical system can be defined as
$ \begin{equation} \label{eq1} y(t)=f(x(t)) \end{equation} $  (1) 
where
To model the MISO nonlinear system, a fourlayer MISO FNN is introduced. The structure of the MISO FNN is shown in Fig. 1. The four layers are, respectively, the input layer, the membership function (MF) layer, the rule layer, and the output layer [15]. The mathematical description of this FNN is expressed as follows.
Download:


Fig. 1 Architecture of the FNN. 
The Input Layer: there are
$ \begin{equation} \label{eq2} u_{i} (t)=x_{i} (t), \quad i=1, 2, \dots, n \end{equation} $  (2) 
where
The MF Layer: there are
$ \begin{equation} \label{eq3} \mu_{ij} (t)=e^{\frac{(x_{i} (t)c_{ij} (t))^{2}}{\sigma_{ij}^{2} (t)}}, i=1, 2, \dots, n; j=1, 2, \dots, r \end{equation} $  (3) 
where
The Rule Layer: there are
$ \begin{eqnarray} v_{j} (t)=\prod\limits_{i=1}^n {\mu_{ij} (t)} =e^{\sum\limits_{i=1}^n {\frac{(x_{i} (t)c_{ij} (t))^{2}}{\sigma_{ij}^{2} (t)}} } \nonumber\\ i=1, 2, \dots, n; ~j=1, 2, \dots, r \end{eqnarray} $  (4) 
and
$ \begin{eqnarray} h_{j} (t)=\frac{v_{j} (t)}{\sum\limits_{j=1}^r {v_{j} (t)} }=\frac{e^{\sum\limits_{i=1}^n {\frac{(x_{i} (t)c_{ij} (t))^{2}}{\sigma _{ij}^{2} (t)}} }}{\sum\limits_{j=1}^r {e^{\sum\limits_{i=1}^n {\frac{(x_{i} (t)c_{ij} (t))^{2}}{\sigma_{ij}^{2} (t)}} }} } \nonumber\\ i=1, 2, \dots, n; ~j=1, 2, \dots, r \end{eqnarray} $  (5) 
where
The Output Layer: the output value of this layer is the summation of incoming signals
$ \begin{eqnarray} y(t)&={\pmb w}(t){\pmb h}(t)=\sum\limits_{j=1}^r {w_{j} (t)h_{j} (t)} \nonumber\\ & =\frac{\sum\limits_{j=1}^r {w_{j} (t)e^{\sum\limits_{i=1}^n {\frac{(x_{i} (t)c_{ij} (t))^{2}}{\sigma _{ij}^{2} (t)}} }} }{\sum\limits_{j=1}^r {e^{\sum\limits_{i=1}^n {\frac{(x_{i} (t)c_{ij} (t))^{2}}{\sigma_{ij}^{2} (t)}} }} } \end{eqnarray} $  (6) 
where
In the following, the principle of the DPbased clustering method is given firstly. Then, the adaptive learning algorithm is described in detail. Finally, the procedure of DPAFNN modeling is provided.
A. Density Peaks Clustering MethodRecently, a powerful clustering approach based on the density peaks (DP) has been proposed by Rodriguez and Laio in [21]. In this study, the DPbased clustering method is applied to partition the input space and to obtain a suitable classification of the training samples. Then, the centers and variances of the clusters are utilized as the parameters of FNN. In the DP algorithm, there are two assumptions: 1) the cluster centers are surrounded by neighbors which have a lower local density; 2) the cluster centers have a relatively large distance from any point which has a higher local density. For the convenience of the searching, two indicators, including the local density
In the DP algorithm, the local density $ \rho_{i} $ is given as
$ \begin{equation} \label{eq7} \rho_{i} =\sum\limits_j {\chi (d_{ij} d_{c} )} \end{equation} $  (7) 
where
In the DP algorithm,
$ \begin{equation} \label{eq8} \delta_{i} =\min\limits_{j: \rho_{j} >\rho_{i} } (d_{ij} ). \end{equation} $  (8) 
When the point
$ \begin{equation} \label{eq9} \gamma_{i} =\rho_{i} \delta_{i}. \end{equation} $  (9) 
In the clustering process, data point will be chosen as cluster center if its
To avoid trapping into local optimal, the EBP algorithm which has been widely used in FNNs is replaced by an improved LM algorithm in this study. Thereby, the modeling accuracy can be enhanced with the usage of the improved LM parameter optimization algorithm. For the pth sample, the error between the desired output and network output of FNN can be described as
$ \begin{equation} \label{eq10} e^{p}(t)=y_{d}^{p} (t)y^{p}(t), \quad p=1, 2, \dots, P \end{equation} $  (10) 
where
Derived from LM algorithm [22], all the parameters in the FNN, i.e., centers, widths, and output weights are adjusted by the following formula
$ \begin{equation} \label{eq11} \boldsymbol{\Theta }(t+1)=\boldsymbol{\Theta }(t)(\boldsymbol{\Psi }(t)+\eta (t)\times {\pmb I})^{1}\times \boldsymbol{\Omega }(t) \end{equation} $  (11) 
where
$ \begin{equation} \label{eq12} \eta (t)=2\left\ {\pmb{e}(t)} \right\^{2} \end{equation} $  (12) 
where
$ \boldsymbol{\Psi }(t)=\sum\limits_{p=1}^P {\pmb{\psi }_{p} (t)} $  (13) 
$ \boldsymbol{\Omega }(t)=\sum\limits_{p=1}^P {\pmb{\omega}_{p} (t)} $  (14) 
where the submatrix
$ \pmb{\psi }_{p} (t)={\pmb j}_{p}^{T} (t){\pmb j}_{p} (t) $  (15) 
$ \pmb{\omega }_{p} (t)={\pmb j}_{p}^{T} (t)e_{p} (t) $  (16) 
where
$ \begin{equation} \label{eq17} \begin{aligned} {\pmb j}_{p} (t)&=\left[{\frac{\partial e_{p} (t)}{\partial w_{1} (t)}, \cdots, \frac{\partial e_{p} (t)}{\partial w_{r} (t)}, \frac{\partial e_{p} (t)}{\partial c_{11} (t)}, \dots \frac{\partial e_{p} (t)}{\partial c_{ij} (t)}, \dots, } \right. \\ &\quad~ \left. \frac{\partial e_{p} (t)}{\partial c_{nr} (t)}, \frac{\partial e_{p} (t)}{\partial \sigma_{11} (t)}, \dots, \frac{\partial e_{p} (t)}{\partial \sigma_{ij} (t)}, \dots, \frac{\partial e_{p} (t)}{\partial \sigma_{nr} (t)} \right]\!\!. \end{aligned} \end{equation} $  (17) 
According to the update rule of gradient descent learning approach, the elements of the Jacobian row vector are presented as
$ \frac{\partial e_{p} (t)}{\partial w_{j} (t)} =\frac{\partial e_{p} (t)}{\partial y_{p} (t)}\; \frac{\partial y_{p} (t)}{\partial w_{j} (t)}\; =h_{j} (t) $  (18) 
$ \begin{align} \label{eq19}\nonumber \frac{\partial e_{p} (t)}{\partial c_{ij} (t)} & =\frac{\partial e_{p} (t)}{\partial y_{p} (t)}\frac{\partial y_{p} (t)}{\partial h_{j} (t)}\frac{\partial h_{j} (t)}{\partial v_{j} (t)}\frac{\partial v_{j} (t)}{\partial \mu_{ij} (t)}\frac{\partial \mu_{ij} (t)}{\partial c_{ij} (t)} \\ & =w_{j} (t)\frac{\sum\limits_{i\ne j}^r {v_{i} (t)}}{(\sum\limits_{i=1}^r {v_{i} (t)} )^{2}}\prod\limits_{k\ne i} {\mu_{kj} (t)} \frac{\partial \mu_{ij} (t)}{\partial c_{ij} (t)} \\ \end{align} $  (19) 
$ \begin{align} \label{eq20}\nonumber \frac{\partial e_{p} (t)}{\partial \sigma_{ij} (t)} & =\frac{\partial e_{p} (t)}{\partial y_{p} (t)}\frac{\partial y_{p} (t)}{\partial h_{j} (t)}\frac{\partial h_{j} (t)}{\partial v_{j} (t)}\frac{\partial v_{j} (t)}{\partial \mu_{ij} (t)}\frac{\partial \mu_{ij} (t)}{\partial \sigma _{ij} (t)} \\ & =w_{j} (t)\frac{\sum\limits_{i\ne j}^r {v_{i} (t)}}{(\sum\limits_{i=1}^r {v_{i} (t)} )^{2}}\prod\limits_{k\ne i} {\mu_{kj} (t)} \frac{\partial \mu_{ij} (t)}{\partial \sigma_{ij} (t)} \end{align} $  (20) 
where
$ \begin{align*} \frac{\partial \mu_{ij} (t)}{\partial c_{ij} (t)}& =\frac{2(x_{i} (t)c_{ij} (t))\exp ({(x_{i} (t)c_{ij} (t))^{2}}/{\sigma_{ij}^{2} (t)})}{\sigma _{ij}^{2} (t)} \\ \frac{\partial \mu_{ij} (t)}{\partial \sigma_{ij} (t)}& =\frac{2(x_{i} (t)c_{ij} (t))^{2}\exp ({(x_{i} (t)c_{ij} (t))^{2}}/{\sigma_{ij}^{2} }(t))}{\sigma_{ij}^{3} (t)}. \end{align*} $ 
The proposed improved adaptive LM algorithm is a secondorder parameter estimation algorithm. While computing the submatrix
To reveal the proposed modeling method clearly, the design steps of the models of EC and EQ with the usage of DPAFNN are summarized as follows.
Step 1: Generate enough process data about EC and EQ using BSM1 and execute data normalization operation.
Step 2: Calculate the local density
Step 3: Calculate the minimum distance
Step 4: Compute the cluster center selection parameter
Step 5: Create two initial fourlayer FNN. The number of neurons in the input layer is the same as the number of input variables. The initial centers of fuzzy rules are the centers of clusters. The initial widths of all fuzzy rules are set to be the variances of clusters. The initial output weights of DPAFNN are set to be the outputs of training samples which are selected as the centers of fuzzy rules. The maximum number of iterations is set to be
Step 6: Calculate the network output
Step 7: Make the error
Step 8: Make the accumulation of submatrices
Step 9: Update the learning rate
Step 10: Stop and evaluate the test samples if the stopping criteria is satisfied. Otherwise, go to Step 6.
Ⅳ. CONVERGENCE ANALYSISThe convergence of the proposed DPAFNN is of great importance for modeling EC and EQ. In this section, our analysis suggests that the convergence of DPAFNN can be guaranteed.
We choose the Lyapunov function as follows
$ \begin{equation} \label{eq21} V(\boldsymbol{\Theta }(t))=\frac{1}{2}\pmb{e}^{T}(t)\pmb{e}(t) \end{equation} $  (21) 
where
$ \begin{eqnarray} \Delta V(\boldsymbol{\Theta }(t))&\hspace{0.2cm}=&\hspace{0.2cm}V(\boldsymbol{\Theta }(t+1))V(\boldsymbol{\Theta}(t)) \nonumber\\[2mm] &&\hspace{0.6cm}=\frac{1}{2}e^{T}(t+1)e(t+1)\frac{1}{2}e^{T}(t)e(t) \nonumber\\[2mm] &&\hspace{0.6cm}=\nabla E^{T}(\boldsymbol{\Theta }(t))\Delta \boldsymbol{\Theta }(t)\nonumber\\[2mm] &&\hspace{0.6cm}\quad~ +\frac{1}{2}\Delta \boldsymbol{\Theta }^{T}(t)\nabla ^{2}E(\boldsymbol{\Theta }(t))\Delta \boldsymbol{\Theta }(t) \end{eqnarray} $  (22) 
where
$ \begin{equation} \label{eq23} \left\ {\Delta {\boldsymbol{\Theta }}(t)} \right\\leq \min \left\{ {\left\ {\Delta {\boldsymbol{\Theta }}(t1)} \right\, \frac{\left\ {\boldsymbol{\Omega }({\boldsymbol{\Theta }}(t1))} \right\}{\left\ {\boldsymbol{\Psi }({\boldsymbol{\Theta }}(t1))} \right\}} \right\}. \end{equation} $  (23) 
There will be
$ \begin{equation} \label{eq24} \begin{cases} \Delta {\boldsymbol{\Theta }}(t)=(\boldsymbol{\Psi }({\boldsymbol{\Theta }}(t))+\eta (t)\times {{\pmb I}})^{1}\times \boldsymbol{\Omega }({\boldsymbol{\Theta}}(t)) \\[3mm] \nabla {{\pmb E}}({\boldsymbol{\Theta }}(t))=\boldsymbol{\Omega }({\boldsymbol{\Theta }}(t)) \\[3mm] \nabla^{2}{{\pmb E}}({\boldsymbol{\Theta }}(t))=\boldsymbol{\Psi }({\boldsymbol{\Theta }}(t))+\eta (t)\times {{\pmb I}}. \end{cases} \end{equation} $  (24) 
It follows from (23) and (24) that
$ \begin{equation} \label{eq25} \Delta V(\boldsymbol{\Theta }(t))=\frac{1}{2}\Delta \boldsymbol{\Theta}^{T}(t)\nabla^{2}E(\boldsymbol{\Theta }(t))\Delta \boldsymbol{\Theta }(t). \end{equation} $  (25) 
When the condition (21) is satisfied, the matrix
$ \begin{equation} \label{eq26} \lim\limits_{t\to +\infty } \pmb{e}(t)=\boldsymbol{0}. \end{equation} $  (26) 
According to the Lyapunov stability theorem, we know that the DPAFNN is convergent.
Ⅴ. SIMULATION STUDIES A. Wastewater Treatment ProcessThe most popular technology of wastewater treatment is the activated sludge process (ASP). To evaluate the performance of different control strategies, a benchmark simulation platform based on ASP has been developed with the framework of COST Actions 682 and 624BSM1 [24].
In the BSM1, EC is the sum of aeration energy (AE) and pumping energy (PE). AE and PE are calculated as
$ AE=\frac{S_\rm{O, sat} }{T\times 1.8\times 1000}\int_{kT}^{(k+1)T} {\sum\limits_{i=1}^5 {V_{i} \cdot K_{Lai} (t)dt} } $  (27) 
$ PE=\frac{1}{T\times 1000}\int_{kT}^{(k+1)T} {(4Q_{a} (t)+8Q_{r} (t)+50Q_{w} (t))dt} $  (28) 
where
Moreover, EQ is defined as
$ \begin{align} \label{eq29}\nonumber &EQ=\frac{1}{T\cdot 1000} \\ &\times\!\!\int_{kT}^{(k+1)T} \begin{pmatrix} 2\cdot SS_{e} (t)\!+\!COD_{e} (t)\!+\!3\cdot S_{NKj, e} (t) \\ +10\cdot S_\rm{NO, e} (t)\!+\!2\cdot BOD_{5, e} (t) \\ \end{pmatrix} Q_{e} (t)dt \end{align} $  (29) 
where
In the BSM1, there are two control loops. One is the control loop of the dissolved oxygen concentration
Download:


Fig. 2 Schematic diagram of the modeling of EC and EQ. 
Download:


Fig. 3 The values of input variables in the predicting process. 
In this section, the effectiveness of the proposed DPAFNN datadriven modeling method is demonstrated by using the BSM1 platform. A total of 500 samples are collected from BSM1. The preceding 400 samples are chosen as the training set, and the remaining 100 samples are applied as the testing set. Through the practical test, the parameters of DPAFNN are set as follows:
To assess the performance of the proposed DPAFNN, the root meansquareerror (RMSE) and the predicting accuracy are adopted as two performance criteria in the experiments. For comparative analysis, the obtained results of DPAFNN are compared with some existing methods, including modelbased and FNNbased methods. All the results are averaged on 30 independent runs. RMSE and Accuracy are defined as
$ RMSE=\sqrt{\frac{1}{P}\sum\limits_{p=1}^P {(y_{d}^{p} y^{p})^{2}} } $  (30) 
$ Accuracy=\frac{1}{P}\sum\limits_{p=1}^P {\Big(1\frac{\left {y_{d}^{p} y^{p}} \right}{\left {y_{d}^{p} } \right}\Big)} \times 100 \%. $  (31) 
For the EC modeling, Fig. 4 shows the plot of
Download:


Fig. 4 The value of 
Download:


Fig. 5 RMSE values during training. 
Download:


Fig. 6 The predicting values of EC. 
Download:


Fig. 7 The predicting error of EC. 
The performance of the proposed DPAFNN for EC modeling is compared with the FNNEBP, BSMMBR [25], TCIM [26], MLR [27], ARKF [10], and DFNN [15]. The details of performance comparison are given in Table Ⅱ. As observed in Table Ⅱ, the proposed DPAFNN achieves the smallest testing RMSE (0.97) and the best testing accuracy (99.39%). The performance of DPAFNN is superior to that of FNNEBP and other mechanism methods. Although the MLR gets the least training time (0.03 s), its testing accuracy (95.17%) is lower than that of DPAFNN. The comparisons indicate that the proposed DPAFNN algorithm can be utilized to address the EC modeling problem in the WWTP.
For the EQ modeling, the variation of RMSE values in the training process is presented in Fig. 8. Based on the results shown in Fig. 8, the proposed DPAFNN method achieves faster convergence speed than that of FNNEBP. The predicting results of the EQ values are shown in Fig. 9, and the graphs of the predicting errors between the desired and predicted outputs are displayed in Fig. 10. The predicting values demonstrate that the proposed DPAFNN has high modeling precision.
Download:


Fig. 8 RMSE values during training. 
Download:


Fig. 9 The predicting values of EQ. 
Download:


Fig. 10 The predicting error of EQ. 
The performance of the DPAFNN for EQ modeling is compared with the FNNEBP, MLR [27], and DFNN [15]. The detailed results of different algorithms are exhibited in Table Ⅲ. The testing RMSE and accuracy of the proposed DPAFNN model approach to 0.96 and 93.04%, respectively, which is remarkable in comparison with other results. The experimental results in Table Ⅲ demonstrate that the proposed DPAFNN has high generalization performance.
In this paper, the DPAFNN approach for designing the models of EC and EQ in the WWTP has been presented. The proposed method is technically developed with the usage of the DPbased clustering algorithm and the improved LM parameter learning algorithm. Based on the results obtained from the modeling of EC and EQ, the main contributions of this paper include: 1) Since the DPbased clustering algorithm can partition the input space well, the initial fuzzy rules of the proposed DPAFNN are suitable for all the training samples. The simulation results indicate that the DPbased fuzzy rules extracting method can also reduce the computational complexity and enhance generalization ability. 2) Since the parameters are optimized by using the improved LM algorithm, the convergence speed and learning ability of the proposed DPAFNN are superior to that of the FNNEBP. 3) The experimental results indicate that the proposed DPAFNN achieves better modeling accuracy than the mechanism models (i.e., BSMMBR and TCIM) and other FNNs (i.e., FNNEBP and DFNN). 4) The experimental results also demonstrate that the EC and EQ models in the WWTP can be developed with acceptable accuracy utilizing
This work, which has solved the modeling problems of EC and EQ, plays an extremely essential role in the subsequent study of the multiobjective optimal control in the WWTP. The following work of our team includes two aspects. One is to deal with the complicated multiobjective optimization problem with the usage of the multiobjective evolutionary algorithm. The traditional multiobjective evolutionary algorithm should be improved so that highquality Pareto optimal solutions can be found. The other is to design two tracking controllers for the optimal setpoints of
[1]  J. F. Wan, J. Gu, Q. Zhao, and Y. Liu, "Cod capture: a feasible option towards energy selfsufficient domestic wastewater treatment, " Sci. Rep. , vol. 6, pp. Article No. 25054, Apr. 2016. 
[2]  D. Mamais, C. Noutsopoulos, A. Dimopoulou, A. Stasinakis, and T. D. Lekkas, "Wastewater treatment process impact on energy savings and greenhouse gas emissions, " Water Sci. Technol. , vol. 71, no. 2, pp. 303308, Jan. 2015. http://www.ncbi.nlm.nih.gov/pubmed/25633956 
[3]  F. Hernández Sancho, M. MolinosSenante, and R. SalaGarrido, "Energy efficiency in Spanish wastewater treatment plants: a nonradial DEA approach, " Sci. Total Environ. , vol. 409, no. 14, pp. 26932699, Jun. 2011. http://europepmc.org/abstract/MED/21549411 
[4]  Y. Yang, J. K. Yang, J. L. Zuo, Y. Li, S. He, X. Yang, and K. Zhang, "Study on two operating conditions of a fullscale oxidation ditch for optimization of energy consumption and effluent quality by using CFD model, " Water Res. , vol. 45, no. 11, pp. 34393452, May 2011. http://www.ncbi.nlm.nih.gov/pubmed/21529877 
[5]  I. Vera, K. Sáez, and G. Vidal, "Performance of 14 fullscale sewage treatment plants: comparison between four aerobic technologies regarding effluent quality, sludge production and energy consumption, " Environ. Technol. , vol. 34, no. 15, pp. 22672275, Jan. 2013. http://europepmc.org/abstract/med/24350481 
[6]  D. J. Dürrenmatt and W. Gujer, "Datadriven modeling approaches to support wastewater treatment plant operation, " Environ. Model. Softw. , vol. 30, pp. 4756, Apr. 2012. http://europepmc.org/abstract/med/2673599 
[7]  R. Noori, H. D. Yeh, M. Abbasi, F. T. Kachoosangi, and S. Moazami, "Uncertainty analysis of support vector machine for online prediction of fiveday biochemical oxygen demand, " J. Hydrol. , vol. 527, pp. 833843, Aug. 2015. http://www.sciencedirect.com/science/article/pii/S0022169415003960 
[8]  J. F. Qiao, W. Li, and H. G. Han, "Soft computing of biochemical oxygen demand using an improved TS fuzzy neural network, " Chin. J. Chem. Eng. , vol. 22, no. 1112, pp. 12541259, Nov. 2014. http://www.sciencedirect.com/science/article/pii/S100495411400130X 
[9]  A. Kusiak, Y. H. Zeng, and Z. J. Zhang, "Modeling and analysis of pumps in a wastewater treatment plant: A datamining approach, " Eng. Appl. Artif. Intell. , vol. 26, no. 7, pp. 16431651, Aug. 2013. http://www.sciencedirect.com/science/article/pii/S0952197613000572 
[10]  H. G. Han, L. Zhang, and J. F. Qiao, "An energy consumption model of wastewater treatment process based on adaptive regressive kernel function, " CIESC J. , vol. 67, no. 3, pp. 947953, Mar. 2016. 
[11]  M. Pratama, M. J. Er, X. Li, R. J. Oentaryo, E. Lughofer, and I. Arifin, "Data driven modeling based on dynamic parsimonious fuzzy neural network, " Neurocomputing, vol. 110, pp. 1828, Jun. 2013. http://www.sciencedirect.com/science/article/pii/S0925231212008934 
[12]  R. J. Hathaway and Y. K. Hu, "Densityweighted fuzzy cmeans clustering, " IEEE Trans. Fuzzy Syst. , vol. 17, no. 1, pp. 243252, Feb. 2009. 
[13]  L. Teslic, B. Hartmann, O. Nelles, and I. Skrjanc, "Nonlinear system identification by gustafsonkessel fuzzy clustering and supervised local model network learning for the drug absorption spectra process, " IEEE Trans. Neural Netw. , vol. 22, no. 12, pp. 19411951, Dec. 2011. http://ieeexplore.ieee.org/document/6060917/ 
[14]  H. Malek, M. M. Ebadzadeh, and M. Rahmati, "Three new fuzzy neural networks learning algorithms based on clustering, training error and genetic algorithm, " Appl. Intell. , vol. 37, no. 2, pp. 280289, Sep. 2012. 
[15]  S. Q. Wu and M. J. Er, "Dynamic fuzzy neural networks a novel approach to function approximation, " IEEE Trans. Syst. Man Cybern. B (Cybern. ), vol. 30, no. 2, pp. 358364, Apr. 2000. 
[16]  H. G. Han, X. L. Wu, and J. F. Qiao, "Nonlinear systems modeling based on selforganizing fuzzyneuralnetwork with adaptive computation algorithm, " IEEE Trans. Cybern. , vol. 44, no. 4, pp. 554564, Apr. 2014. http://europepmc.org/abstract/med/23782841 
[17]  B. M. Wilamowski and H. Yu, "Neural network learning without backpropagation, " IEEE Trans. Neural Netw. , vol. 21, no. 11, pp. 17931803, Nov. 2010. 
[18]  M. Davanipoor, M. Zekri, and F. Sheikholeslam, "Fuzzy wavelet neural network with an accelerated hybrid learning algorithm, " IEEE Trans. Fuzzy Syst. , vol. 20, no. 3, pp. 463470, Jun. 2012. http://ieeexplore.ieee.org/document/6081924/ 
[19]  W. Q. Zhao, K. Li, and G. W. Irwin, "A new gradient descent approach for local learning of fuzzy neural models, " IEEE Trans. Fuzzy Syst. , vol. 21, no. 1, pp. 3044, Feb. 2013. http://ieeexplore.ieee.org/document/6203571/ 
[20]  G. Leng, T. M. McGinnity, and G. Prasad, "Design for selforganizing fuzzy neural networks based on genetic algorithms, " IEEE Trans. Fuzzy Syst. , vol. 14, no. 6, pp. 755766, Dec. 2006. http://ieeexplore.ieee.org/document/4016084/ 
[21]  A. Rodriguez and A. Laio, "Clustering by fast search and find of density peaks, " Science, vol. 344, no. 6191, pp. 14921496, Jun. 2014. http://www.ncbi.nlm.nih.gov/pubmed/24970081 
[22]  B. M. Wilamowski and H. Yu, "Improved computation for levenbergmarquardt training, " IEEE Trans. Neural Netw. , vol. 21, no. 6, pp. 930937, Jun. 2010. http://europepmc.org/abstract/MED/20409991 
[23]  H. G. Han, L. M. Ge, and J. F. Qiao, "An adaptive second order fuzzy neural network for nonlinear system modeling, " Neurocomputing, vol. 214, pp. 837847, Nov. 2016. http://www.sciencedirect.com/science/article/pii/S0925231216307329 
[24]  J. F. Qiao and W. Zhang, "Dynamic multiobjective optimization control for wastewater treatment process, " Neural Comput. Appl. , vol. 29, no. 11, pp. 12611271, Jun. 2018. 
[25]  T. Maere, B. Verrecht, S. Moerenhout, S. Judd, and I. Nopens, "BSMMBR: a benchmark simulation model to compare control and operational strategies for membrane bioreactors, " Water Res. , vol. 45, no. 6, pp. 21812190, Mar. 2011. http://www.ncbi.nlm.nih.gov/pubmed/21329957 
[26]  P. Vanrolleghem and S. Gillot, "Robustness and economic measures as control benchmark performance criteria, " Water Sci. Technol. , vol. 45, no. 45, pp. 117126, Feb. 2002. http://www.ncbi.nlm.nih.gov/pubmed/11936624 
[27]  E. Lee, S. Han, and H. Kim, "Development of software sensors for determining total phosphorus and total nitrogen in waters, " Int. J. Environ. Res. Public Health, vol. 10, no. 1, pp. 219236, Jan. 2013. http://europepmc.org/articles/PMC3564139/ 