Comparing the Fault Diagnosis Performances of Single Neural Networks and Two Ensemble Neural Networks Based on the Boosting Methods

Pooria Karimi; Hooshang Jazayeri-Rad

doi:10.12691/automation-2-1-4

Pooria Karimi¹, Hooshang Jazayeri-Rad¹

¹Department of Instrumentation and Automation, Petroleum University of Technology, Ahwaz, Iran

Abstract

This work exploits the distinctive features of artificial neural networks, in particular the feed-forward multilayer structure, to perform fault diagnosis in chemical plants. Artificial neural networks can store information automatically by learning from historical fault data, without using any qualitative or quantitative model of the system. However, a single neural network classifier has several limitations when used for fault diagnosis of complex, large-scale chemical processes. These limitations originate mostly from the low reliability and high generalization error that a single neural network classifier may exhibit in real applications. An attractive remedy is to use an ensemble of different neural network models instead of relying on a single one. In this work, boosting, a well-known general approach for improving the performance of learning algorithms by building ensemble models, is investigated. In particular, an efficient boosting method called Stage-wise Additive Modeling using a Multi-class Exponential Loss Function (SAMMELF) is utilized. This technique can effectively overcome the main limitations of the early boosting methods in fault diagnosis applications. The potential of the SAMMELF algorithm is demonstrated through its application to the challenging diagnosis problem of the Tennessee-Eastman Process (TEP), where the whole set of predefined TEP faults is considered in the classification problem.

Keywords: artificial neural network, process fault diagnosis, adaptive boosting, Tennessee-Eastman Process

Journal of Automation and Control, 2014 2 (1), pp 21-32.
DOI: 10.12691/automation-2-1-4

Received February 16, 2014; Revised February 25, 2014; Accepted February 28, 2014

Cite this article:

Karimi, Pooria, and Hooshang Jazayeri-Rad. "Comparing the Fault Diagnosis Performances of Single Neural Networks and Two Ensemble Neural Networks Based on the Boosting Methods." Journal of Automation and Control 2.1 (2014): 21-32.

Karimi, P. , & Jazayeri-Rad, H. (2014). Comparing the Fault Diagnosis Performances of Single Neural Networks and Two Ensemble Neural Networks Based on the Boosting Methods. Journal of Automation and Control, 2(1), 21-32.

Karimi, Pooria, and Hooshang Jazayeri-Rad. "Comparing the Fault Diagnosis Performances of Single Neural Networks and Two Ensemble Neural Networks Based on the Boosting Methods." Journal of Automation and Control 2, no. 1 (2014): 21-32.

1. Introduction

Changes in the physical conditions of process units, control systems or external conditions may lead to what are generally referred to as faults. In the broadest sense, faults include symptoms resulting from physical changes, such as deviations of temperature or pressure from their normal operating ranges, as well as the physical changes themselves, such as scaling, foaming, leaks and wear. Even changes in unmeasured process parameters such as heat or mass transfer coefficients can be considered faults. Generally speaking, a fault is defined as any departure of an observed variable or a calculated process parameter from an acceptable range [1]. Faults in process equipment can give rise to off-specification production, increased operating expenses, a higher likelihood of line shutdown and an increased risk of environmental damage. Hence, the growing demand for performance, efficiency, reliability and safety of industrial systems is stimulating interest in fault diagnosis in both industry and academia.

Process monitoring in support of fault management is essentially a four-stage procedure: fault detection, fault identification, fault diagnosis, and process recovery. (i) Fault detection determines, on-line, when abnormal process conditions occur. (ii) Fault identification identifies the observation variables most relevant to diagnosing the fault. Its purpose is to focus the attention of plant operators and engineers on the subsystems most pertinent to the diagnosis, so that the effect of the fault can be eliminated more efficiently. (iii) Fault diagnosis, which is the subject of this article, determines which fault has occurred; in other words, it determines the root cause of the observed out-of-control status. The fault diagnosis procedure is essential to counteracting or eliminating the fault. (iv) Process recovery, also called intervention, removes the effect of the fault [2].

Numerous techniques have been proposed for process fault detection and diagnosis in the past few decades. According to [3], these techniques can be broadly classified as model-based and process history-based approaches. Model-based approaches generally utilize results from control theory and rely on parameter estimation or state estimation [4]. They are based on the fact that a fault causes changes in certain physical parameters, which in turn lead to changes in some model parameters or states. Faults can then be detected and diagnosed by monitoring the estimated model parameters or states. This approach requires knowledge of the relationships between faults and model parameters or states, as well as quite accurate models. The main problem with these techniques is that obtaining an exact process model is not always possible or economical in industrial systems. In addition, errors in the model can be interpreted as faults, yielding false alarms, or can prevent faults from being detected when they occur, especially in nonlinear or uncertain systems. Hence, many quantitative model-based techniques are applicable only to linear systems, and few methods are available for nonlinear chemical processes [5].

Process history-based methods make use of the large amount of data recorded from the measured process variables during normal and abnormal operation. Owing to recent advances in sensing and data collection technology, which provide large amounts of data in many chemical industry processes, these methods are more popular than model-based approaches in real applications. Many multivariate statistical techniques have been developed for analyzing these massive datasets, including Principal Component Analysis (PCA), Partial Least Squares (PLS), and Fisher Discriminant Analysis (FDA) [6]. Although most of these techniques are well suited to fault detection, one of the most relevant techniques for diagnosis is supervised classification. In this technique, the different operating conditions, both normal and abnormal, are treated as patterns or classes. Given that the historical database contains many datasets, each related to a different abnormal condition (root cause), the aim is to allocate the on-line out-of-control measurements to the most closely related fault class. Classification tools are among the most widely used fault diagnosis systems in the literature. They have gained ground because they are easy to implement and require neither the experience of process operators nor first-principles process knowledge. In addition, they can handle the nonlinearity inherent in the variables of most chemical processes at no extra cost.

Numerous methods for supervised classification have been developed in the machine learning field. Techniques that have been applied successfully to fault diagnosis include the k-Nearest Neighbor (kNN) rule [7], FDA [8], Bayesian Networks (BNs) [9, 10] and, in more recent works, Support Vector Machines (SVMs) [11, 12, 13, 14]. Among the classification techniques, Artificial Neural Networks (ANNs) in particular have received enormous attention in past years. ANNs have many properties that are useful for process fault diagnosis. They can handle nonlinear and undetermined processes for which no process model is available, and they learn the diagnosis from the information contained in the training data. ANNs are very noise tolerant and work well with noisy measurements. Their ability to generalize knowledge and to adapt during use is among their most interesting properties. The use of ANNs in fault diagnosis is very straightforward [15]. Multi-Layer Perceptron (MLP) networks are the most popular ANNs and have been used extensively for fault diagnosis [16-21].

Despite the simplicity and versatility of MLP networks, their application to plant-wide process fault diagnosis is not without difficulties. One of the most important issues is over-fitting: the network performs very well on the training data but poorly on unseen data. Moreover, MLP neural networks do not necessarily converge to a unique solution during training, and they suffer from low robustness, especially when the data available for training are not abundant. Besides, the input-output relationship represented by a large set of fault data dispersed over a wide input feature space often cannot be captured adequately by a single network, so the resulting generalization performance may not be satisfactory. Hence, it is often difficult, if not impossible, to build a perfect single MLP network model for fault classification in real industrial problems involving many types of faults. To counter these issues, the use of an ensemble of neural networks has been proposed in the literature. An ensemble, or committee, of neural networks is obtained through information fusion: the outputs of different single neural networks are combined by some kind of voting scheme to give a single output [22]. Using an ensemble-based method may improve the performance of any learning system [23]. For neural networks in particular, if one assumes that the output of an individual network in the ensemble consists of the true output plus a random error component with zero mean, then combining the outputs of the individual networks averages out the random error components and hence reduces the estimation error. An ensemble may also reduce over-fitting by combining networks with different architectures. Moreover, the risk of an unfortunate selection of a poorly performing classifier is reduced.

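The error-averaging argument can be made explicit with a short derivation. This is only a sketch under an additional assumption that the text does not state: the error terms of the individual networks are mutually uncorrelated and share a common variance. The symbols f, ε_i, M and σ² are introduced here purely for illustration.

```latex
\begin{align}
  y_i(x) &= f(x) + \varepsilon_i(x), \qquad
  \mathbb{E}[\varepsilon_i] = 0, \quad
  \operatorname{Var}(\varepsilon_i) = \sigma^2, \quad i = 1, \dots, M \\
  \bar{y}(x) &= \frac{1}{M} \sum_{i=1}^{M} y_i(x)
             = f(x) + \frac{1}{M} \sum_{i=1}^{M} \varepsilon_i(x) \\
  \operatorname{Var}\!\left( \frac{1}{M} \sum_{i=1}^{M} \varepsilon_i \right)
             &= \frac{1}{M^2} \sum_{i=1}^{M} \operatorname{Var}(\varepsilon_i)
             = \frac{\sigma^2}{M}
\end{align}
```

Under these assumptions the averaged output keeps the true output f(x) while its error variance shrinks by a factor of M. When the errors of the individual networks are positively correlated, the reduction is smaller, which is one reason why diversity among the ensemble members matters.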
The use of ensembles of neural networks as a general way to improve generalization performance is quite common in the neural network literature [24, 25, 26]. Although the advantages of ensemble networks have been recognized by several researchers, it is believed that no attempt other than [27] has been made to apply an ensemble network to process fault diagnosis. In that work, to develop a diverse range of individual networks, each network was trained on a replication of the original training data generated through bootstrap resampling with replacement. The power of the fusion approach in increasing neural network robustness and reliability, leading to an improved on-line diagnosis system, was also demonstrated [27]. However, a quite simple chemical process comprising a Continuous Stirred Tank Reactor (CSTR) and a heat exchanger was employed to investigate the performances of the different diagnosis systems.
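Bootstrap resampling with replacement and the combination of individual outputs by voting can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of [27]: the function names, the toy training set and the choice of a simple majority vote are all assumptions made here.

```python
import random
from collections import Counter


def bootstrap_replicates(samples, n_replicates, seed=0):
    """Draw n_replicates training sets from samples.

    Each replicate has the same size as the original set and is
    sampled with replacement, so some patterns repeat and others
    are left out.
    """
    rng = random.Random(seed)
    n = len(samples)
    return [[samples[rng.randrange(n)] for _ in range(n)]
            for _ in range(n_replicates)]


def majority_vote(predictions):
    """Return the class label predicted by most ensemble members."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical labelled patterns: (measurement vector, operating class).
training_set = [([0.1, 0.9], "normal"), ([0.8, 0.2], "fault_1"),
                ([0.7, 0.3], "fault_1"), ([0.2, 0.8], "normal")]

# One replicate per individual network in the ensemble.
replicates = bootstrap_replicates(training_set, n_replicates=3)

# Hypothetical labels returned by three trained networks for one sample.
combined = majority_vote(["fault_1", "normal", "fault_1"])
```

Each replicate would be used to train one individual network; the diversity of the ensemble comes from the differences between the replicates.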

In this work, the possibility of improving the performance of MLP neural network classifiers by means of an ensemble architecture built with boosting techniques is investigated through application to the TEP fault diagnosis problem. In machine learning, boosting refers to a family of general techniques that attempt to improve the performance of learning methods. In particular, two powerful techniques for ensemble classification are examined: adaptive boosting, or AdaBoost for short, and the SAMMELF algorithm. The whole set of TEP faults is considered in order to provide a challenging diagnosis problem. The remainder of this paper is organized as follows. Section 2 discusses the principles of ensemble-based learning and briefly explains the AdaBoost, SAMMELF and MLP neural network classification methods. The TEP fault diagnosis problem is detailed in Section 3. The fault diagnosis results obtained for the TEP problem are presented and discussed in Section 4. Finally, Section 5 concludes the paper.

2. Methods

2.1. MLP Neural Network Classifier

ANNs were inspired by the study of the human brain and consist of many basic computational elements (neurons) that locally affect other neurons through connections. Neurons, or nodes, in ANNs are very simple computational elements derived from their biological equivalents. A neuron can be defined by a basic function that maps an n-dimensional space (the inputs received from other neurons) to a 1-dimensional space (the output value that it passes to other neurons). Each interconnection carries a weight, and these weights are updated during training by an appropriate training rule so that the inputs produce an output close to the actual value. Through the training process the network learns from examples and, in doing so, acquires some capability for generalization beyond the training data. The architecture of these models is specified by the node characteristics, the network topology and the learning algorithm.

The classification of process data can be carried out using information about the different classes. That is, certain measurement patterns are known to correspond to normal operation, while other patterns correspond to particular faulty operations. Training a neural network with this kind of information is called supervised training. The MLP network is the most prevalent ANN architecture that can be used for supervised classification. The network contains one input layer, one or more hidden layers and one output layer. Figure 1 depicts a simple MLP network with one hidden layer, where the circles represent the neurons arranged in the hidden and output layers. Each connection has a weight associated with it. In the input layer, each input value is forwarded straight to the hidden layer. The neurons of the hidden and output layers carry out two calculations: first a weighted sum of the inputs is formed, and then the output is calculated from this sum using a non-decreasing and differentiable transfer function, usually a sigmoidal function. The information propagates through the network in one direction only (feed-forward structure).
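The two calculations performed by each hidden and output neuron, a weighted sum followed by a sigmoidal transfer function, can be sketched as follows. This is an illustrative forward pass only: the network size, the weight values and the bias terms are assumptions introduced here (the text above does not mention biases), and no training is shown.

```python
import math


def sigmoid(z):
    """Non-decreasing, differentiable transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute the outputs of one layer of neurons.

    Each neuron forms the weighted sum of its inputs (plus an assumed
    bias term) and passes the sum through the sigmoid.
    """
    return [sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]


def mlp_forward(x, hidden_w, hidden_b, output_w, output_b):
    """Feed-forward pass: input layer -> hidden layer -> output layer."""
    hidden = layer_forward(x, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)


# Hypothetical network with 2 inputs, 2 hidden and 2 output neurons.
hidden_w = [[0.5, -0.4], [0.3, 0.8]]
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0], [-1.0, 1.0]]
output_b = [0.0, 0.0]

outputs = mlp_forward([1.0, 0.0], hidden_w, hidden_b, output_w, output_b)
```

The input layer does no computation in this sketch; the measurement vector is passed unchanged to the hidden layer, as described above.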

For fault diagnosis, the input pattern represents the important variables that are affected by the existing faults, and the output pattern represents the fault to be identified. The input pattern is fed directly to the network input layer, and each element of the output pattern corresponds to one node in the network output layer. The target of an output node is therefore either 0 (no fault) or 1 (a particular fault). Once the connection weights have been adjusted in the training phase, a new vector of sensor measurements can be presented to the input nodes of the network to be classified. Very little computation time is needed for this step. The degree of misclassification that occurs is a function of how well the knowledge is stored in the connections and weights.
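The 0/1 output targets described above can be illustrated with a small encoding and decoding sketch. Several choices here are assumptions rather than details given in the text: normal operation is encoded as an all-zero target, the diagnosed fault is taken as the most activated output node, and a threshold of 0.5 decides whether any fault is reported.

```python
def encode_target(fault_index, n_faults):
    """Build the target vector for one training pattern.

    fault_index is None for normal operation (all nodes 0), otherwise
    the 0-based index of the fault whose output node is set to 1.
    """
    target = [0] * n_faults
    if fault_index is not None:
        target[fault_index] = 1
    return target


def decode_output(outputs, threshold=0.5):
    """Map network outputs to a diagnosis.

    Returns the index of the most activated output node, or None
    (no fault) when no node exceeds the assumed threshold.
    """
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return best if outputs[best] > threshold else None


target = encode_target(2, n_faults=4)
diagnosis = decode_output([0.1, 0.8, 0.3])
```

An alternative design would add a dedicated output node for normal operation, so that every pattern has exactly one target node equal to 1.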

Figure 1. MLP neural network with one hidden layer

[1]	Himmelblau, D.M., Fault detection and diagnosis in chemical and petrochemical processes, Elsevier Scientific Publishing Company, Amsterdam, 1978.

[2]	Chiang, L.H., Braatz, R.D. and Russell, E.L., Fault detection and diagnosis in industrial systems, Springer, New York, 2001.

[3]	Venkatasubramanian, V., Rengaswamy, R., Yin, K. and Kavuri, S.N., “A review of process fault detection and diagnosis: Part I: Quantitative model-based methods,” Computers & chemical engineering, 27 (3). 293-311. 2003.

[4]	Isermann, R., “Process fault detection based on modeling and estimation methods-a survey,” Automatica, 20 (4). 387-404. 1984.

[5]	Dash, S. and Venkatasubramanian, V., “Challenges in the industrial applications of fault diagnostic systems,” Computers & chemical engineering, 24 (2). 785-791. 2000.

[6]	Kresta, J.V., Macgregor, J.F. and Marlin, T.E., “Multivariate statistical monitoring of process operating performance,” The Canadian journal of chemical engineering, 69 (1). 35-47. 1991.

[7]	He, Q.P. and Wang, J., “Fault detection using the k-nearest neighbor rule for semiconductor manufacturing processes,” Semiconductor manufacturing, IEEE transactions on, 20 (4). 345-354. 2007.

[8]	Chiang, L.H., Kotanchek, M.E. and Kordon, A.K., “Fault diagnosis based on Fisher discriminant analysis and support vector machines,” Computers & chemical engineering, 28 (8). 1389-1401. 2004.

[9]	Verron, S., Tiplica, T. and Kobi, A., “Bayesian networks and mutual information for fault diagnosis of industrial systems,” in Workshop on Advanced Control and Diagnosis (ACD’06), 2006.

[10]	Verron, S., Tiplica, T. and Kobi, A., “Fault diagnosis of industrial systems by conditional Gaussian network including a distance rejection criterion,” Engineering applications of artificial intelligence, 23 (7). 1229-1235. 2010.

[11]	Chen, K.Y., Chen, L.S., Chen, M.C. and Lee, C.L., “Using SVM based method for equipment fault detection in a thermal power plant,” Computers in industry, 62 (1). 42-50. 2011.

[12]	Yélamos, I., Escudero, G., Graells, M. and Puigjaner, L., “Performance assessment of a novel fault diagnosis system based on support vector machines,” Computers & chemical engineering, 33 (1). 244-255. 2009.

[13]	Kulkarni, A., Jayaraman, V.K. and Kulkarni, B.D., “Knowledge incorporated support vector machines to detect faults in Tennessee Eastman process,” Computers & chemical engineering, 29 (10). 2128-2133. 2005.

[14]	Xu, J., Zhao, J., Ma, B. and Hu, S., “Fault diagnosis of complex industrial process using KICA and sparse SVM,” Mathematical problems in engineering, 2013. 1-6. 2013.

[15]	Hussain, M. A., Hassan, C.R.C., Loh, K.S. and Mah, K.W., “Application of artificial intelligence technique in process fault diagnosis,” Journal of engineering science and technology, 2 (3). 260-270. 2007.

[16]	Watanabe, K., Matsuura, I., Abe, M., Kubota, M. and Himmelblau, D.M., “Incipient fault diagnosis of chemical processes via artificial neural networks,” AIChE journal, 35 (11). 1803-1812. 1989.

[17]	Koivo, H.N., “Artificial neural networks in fault diagnosis and control,” Control engineering practice, 2 (1). 89-101. 1994.

[18]	Himmelblau, D.M., “Applications of artificial neural networks in chemical engineering,” Korean journal of chemical engineering, 17 (4). 373-392. 2000.

[19]	Sharma, R., Singh, K., Singhal, D. and Ghosh, R., “Neural network applications for detecting process faults in packed towers,” Chemical engineering and processing: process intensification, 43 (7). 841-847. 2004.

[20]	Eslamloueyan, R., “Designing a hierarchical neural network based on fuzzy clustering for fault diagnosis of the Tennessee–Eastman process,” Applied soft computing, 11 (1). 1407-1415. 2011.

[21]	Behbahani, M.R., Jazayeri-Rad, H. and Hajmirzaee, S., “Fault detection and diagnosis in a sour gas absorption column using neural networks,” Chemical engineering & technology, 32 (5). 840-845. 2009.

[22]	Hansen, L.K. and Salamon, P., “Neural network ensembles,” Pattern analysis and machine intelligence, IEEE transactions on, 12 (10). 993-1001. 1990.

[23]	Polikar, R., “Ensemble based systems in decision making,” Circuits and systems magazine, IEEE, 6 (3). 21-45. 2006.

[24]	Baxt, W.G., “Improving the accuracy of an artificial neural network using multiple differently trained networks,” Neural computation, 4 (5). 772-780. 1992.

[25]	Ali, K.M. and Pazzani, M.J., “Error reduction through learning multiple descriptions,” Machine learning, 24 (3). 173-202. 1996.

[26]	Valdovinos, R.M. and Sanchez, J. S., “Ensembles of multilayer perceptron and modular neural networks for fast and accurate learning,” in Artificial Intelligence, MICAI’06. Fifth Mexican International Conference on, 2006, 229-236.

[27]	Zhang, J., “Improved on-line process fault diagnosis through information fusion in multiple neural networks,” Computers & chemical engineering, 30 (3). 558-571. 2006.

[28]	Bauer, E. and Kohavi, R., “An empirical comparison of voting classification algorithms: bagging, boosting, and variants,” Machine learning, 36 (1). 105-139. 1999.

[29]	Zio, E., Baraldi, P. and Gola, G., “Feature-based classifier ensembles for diagnosing multiple faults in rotating machinery,” Applied soft computing, 8 (4). 1365-1380. 2008.

[30]	Xu, L., Krzyzak, A. and Suen, C.Y., “Methods of combining multiple classifiers and their applications to handwriting recognition,” Systems, man and cybernetics, IEEE transactions on, 22 (3). 418-435. 1992.

[31]	Kuncheva, L.I., Bezdek, J.C. and Duin, R.P.W., “Decision templates for multiple classifier fusion: an experimental comparison,” Pattern recognition, 34 (2). 299-314. 2001.

[32]	Ruta, D. and Gabrys, B., “An overview of classifier fusion methods,” Computing and Information systems, 7 (1). 1-10. 2000.

[33]	Freund, Y. and Schapire, R.E., “Experiments with a new boosting algorithm,” in ICML, 1996, 148-156.

[34]	Freund, Y. and Schapire, R.E., “A decision-theoretic generalization of on-line learning and an application to boosting,” in Computational learning theory, 1995, 23-37.

[35]	Schwenk, H. and Bengio, Y., “Boosting neural networks,” Neural Computation, 12 (8). 1869-1887. 2000.

[36]	Schapire, R.E., Freund, Y., Bartlett, P. and Lee, W.S., “Boosting the margin: A new explanation for the effectiveness of voting methods,” The annals of statistics, 26 (5). 1651-1686. 1998.

[37]	Samanta, B., Bandopadhyay, S., Ganguli, R. and Dutta, S., “A comparative study of the performance of single neural network vs. adaboost algorithm based combination of multiple neural networks for mineral resource estimation,” Journal of South African institute of mining and metallurgy, 105 (4). 237-246. 2005.

[38]	Schwenk, H. and Bengio, Y., “Adaboosting neural networks: Application to on-line character recognition,” in Artificial Neural Networks-ICANN’97, Springer, 1997, 967-972.

[39]	Canty, M.J., “Boosting a fast neural network for supervised land cover classification,” Computers & geosciences, 35 (6). 1280-1295. 2009.

[40]	Zhu, J., Zou, H., Rosset, S. and Hastie, T., “Multi-class adaboost,” Statistics and Its Interface, 2. 349-360. 2009.

[41]	Downs, J.J. and Vogel, E.F., “A plant-wide industrial process control problem,” Computers & chemical engineering, 17 (3). 245-255. 1993.

[42]	Lyman, P.R. and Georgakis, C., “Plant-wide control of the Tennessee Eastman problem,” Computers & chemical engineering, 19 (3). 321-331. 1995.

[43]	Møller, M.F., “A scaled conjugate gradient algorithm for fast supervised learning,” Neural networks, 6 (4). 525-533. 1993.

[44]	Mahadevan, S. and Shah, S.L., “Fault detection and diagnosis in process data using one-class support vector machines,” Journal of process control, 19 (10). 1627-1639. 2009.
