Data Mining Approaches to Characterize Batch Process Operations


Rodolfo V. Tona V., Antonio Espuña and Luis Puigjaner *
Universitat Politècnica de Catalunya, Chemical Engineering Department. Diagonal 647, 08028 Barcelona, Spain.

Abstract

In this work, an approach to mining data from batch process operations is presented. The aim is to extract knowledge from data and to support the design or redesign of monitoring systems. Multivariate models at the recipe level and at lower levels are obtained by a multiscale PCA (MsPCA) approach. Then, fuzzy clustering is used to help identify the operating conditions of each product recipe. Cluster membership information is used to define effective rules that help characterise the operation of the plant for future productions. Special consideration is given to how to handle time-varying trajectories and how to capture their associated dynamics with the multiscale PCA. An example based on a real pilot plant is used for illustrative purposes.

Keywords: Data Mining, Batch Process, MsPCA, Fuzzy Clustering.

1. Introduction

Nowadays, large amounts of process operational data are recorded in chemical plants. It has been recognised that these data have great potential to provide insight into the process (Stockill, 2002). Advanced data analysis tools and methods are therefore required. In particular, there are calls for the adoption and application of data mining approaches that help extract useful knowledge from data (Stockill, 2002; Wang, 2001). Some recent proposals have been made to support the monitoring of continuous and batch processes. For batch processes, existing multivariate methods such as Multiway PCA (MPCA) and PLS (Nomikos et al., 1994) have been proposed to obtain reduced characterisations of productions. However, these methods do not treat or solve well issues such as time-varying operations, outliers, production by recipes and the transient dynamics of trend variables. Moreover, the multiscale nature of the data has only been considered separately.
In the case of MPCA, it is assumed that the operating times of all batches are equal. However, this is not true in many real applications. Extensions that address the time-varying problem have been proposed; they align (resample) the variables with respect to an indicator variable using dynamic time warping techniques (Kassidas et al., 1998). The disadvantage is that the multiscale features of the variables are not taken into account. Additionally, some variables may not be available at the corresponding resampling intervals. Chen et al. (2000) use orthonormal basis functions to represent the variable profiles and then build an MPCA model over the coefficients of the orthonormal functions. This last approach is more appropriate for dealing with the time-varying problem, but important issues such as the selection of the functions and the multiscale nature of the data are not considered.

Clustering has also been proposed for batch operation (Yuan et al., 2001). It is used together with PCA to support the identification of operating conditions and the design of the monitoring system. The issue of operation by recipes is also considered in the analysis. Nevertheless, time-varying operation, outliers and multiscale features are not considered there. As a consequence, the identification and the resulting monitoring model will be suboptimal.

In this work, an alternative approach is proposed to explore batch operational data and to assist in the design or redesign of monitoring systems. The integration of orthonormal basis functions with PCA is achieved by using wavelets. The resulting MsPCA is combined with fuzzy clustering. The identification with clustering allows the generation of operating rules that improve the knowledge of the process and, together with the MsPCA, serve as a monitoring system.

* To whom correspondence should be addressed. E-mail: luis.puigjaner@upc.es

2. Proposed Approach

2.1 Multiscale Modelling of Data

Multivariate statistical techniques have been extensively used for monitoring batch processes. MPCA is one of the best-known methods. To apply this method, it is assumed that the experimental data form a three-dimensional array, Xo, of size I x J x K, where J variables are measured at K times in each one of the I batches. Then, Xo can be unfolded into a large two-dimensional matrix, X, of size I x JK (figure 1), and PCA can be performed on this unfolded matrix. The method assumes an equal operation time (K) for all the batches, which prevents its application when batches differ in duration.

Figure 1. Unfolding of the three-way batch data set (batches I x variables J x observations K).

To overcome this problem we adopt an approach based on function approximation (Chen et al., 2001).
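In this approach, the matrix X is obtained as in MPCA, but the resulting matrix is ordered as follows:

$$
X_{I \times JK} =
\begin{bmatrix}
\overbrace{x_{1,1}(k{=}1) \;\; x_{1,1}(k{=}2) \;\cdots\; x_{1,1}(k{=}K)}^{\text{variable } 1} & \cdots & \overbrace{x_{1,J}(k{=}1) \;\; x_{1,J}(k{=}2) \;\cdots\; x_{1,J}(k{=}K)}^{\text{variable } J} \\
\vdots & & \vdots \\
x_{I,1}(k{=}1) \;\; x_{I,1}(k{=}2) \;\cdots\; x_{I,1}(k{=}K) & \cdots & x_{I,J}(k{=}1) \;\; x_{I,J}(k{=}2) \;\cdots\; x_{I,J}(k{=}K)
\end{bmatrix}
\qquad (1)
$$

where $x_{i,j}(k)$ denotes the value of variable $j$ at sampling time $k$ in batch $i$.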
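As a rough illustration of the ordering described above, the following NumPy sketch reshapes a three-way array of shape (I, J, K) into the I x JK layout, keeping the K samples of each variable contiguous within a row, and then computes PCA scores through an SVD. The array sizes, the random data, the autoscaling step and the function names (`unfold_batchwise`, `pca_scores`) are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def unfold_batchwise(x0):
    """Unfold an (I, J, K) array into an (I, J*K) matrix.

    Row i holds all K samples of variable 1, then all K samples of
    variable 2, and so on, for batch i.
    """
    n_batches, n_vars, n_times = x0.shape
    return x0.reshape(n_batches, n_vars * n_times)


def pca_scores(x, n_components):
    """PCA scores of the autoscaled columns of x, computed via SVD."""
    centred = x - x.mean(axis=0)
    std = centred.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # guard against constant columns
    scaled = centred / std
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Hypothetical data: I = 6 batches, J = 3 variables, K = 50 samples each.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(6, 3, 50))
x = unfold_batchwise(x0)
scores = pca_scores(x, n_components=2)
```

With this layout, element `x[i, j * K + k]` equals `x0[i, j, k]`, which is only possible when every batch has the same number of samples K.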

In the matrix of equation (1), each block $x_{i,j}(k=1), \ldots, x_{i,j}(k=K)$ is the profile of variable $x_j$ in batch run $i$. These profiles can be represented by functions $f_{i,j}(t)$. Chen (2001) proposed the use of approximation functions to obtain $f_{i,j}(t)$. Approximation functions constitute sets of orthonormal bases with very good properties for representing signals. They allow $f(t)$ to be represented as:

$$
f(t) \approx \sum_{n=0}^{N-1} c_n \, \phi_n(t) \qquad (2)
$$

where $C_N = \{c_n\}_{n=0,1,\ldots,N-1}$ is the set of coefficients and $\{\phi_n(t)\}$ is a set of square-integrable functions. Then, based on Lagrange polynomial functions, equation (1) becomes:

$$
X_{I \times JK} =
\begin{bmatrix}
f_{1,1}(t) & \cdots & f_{1,J}(t) \\
\vdots & & \vdots \\
f_{I,1}(t) & \cdots & f_{I,J}(t)
\end{bmatrix}
=
\begin{bmatrix}
[c]_{I \times N_1} & [c]_{I \times N_2} & \cdots & [c]_{I \times N_J}
\end{bmatrix}
\qquad (3)
$$

where $N_j$ is the number of basis functions required in $f_{i,j}(t)$ to approximate measurement $j$, and $[c]_{I \times N_j}$ is the trajectory coefficient matrix of measurement $j$, which is spanned by the $N_j$ bases. $N_j$ is always the same under normal operation. So, applying PCA to X gives a good representation of batches with different time durations.

The above solution strategy works on a single scale. However, the multiscale nature of chemical data has been widely recognised (Bakshi, 1998). Also, the selection of Lagrange polynomials is not an obvious choice. In this work, both issues are addressed by using wavelets. Wavelets are families of functions with very good properties as orthonormal functions. By combining approximations at different scales, they are able to capture fine details and trends with very good accuracy. So, the resulting approximation is expressed as:

$$
f(t) = \sum_{m=1} \sum_{k} d_{mk} \, \psi_{mk}(t) \qquad (4)
$$

where $d_{mk}$ represents the $c_n$ coefficients of equation (2) at the $m$-th scale and $\psi_{mk}$ is the $k$-th basis function at the $m$-th scale. Regarding the selection of the function, Daubechies wavelets are used because of their very good capability to represent polynomial behaviour. The extraction of functions with wavelets is achieved through de-noising (Nounou et al., 1999).
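The wavelet de-noising step can be sketched as follows. The text does not state the Daubechies order, the number of scales or the thresholding rule, so this sketch substitutes the Haar wavelet (the simplest member of the Daubechies family), soft thresholding, and a universal threshold with a median-based noise estimate; all of these, together with the synthetic step profile and the function names, are assumptions rather than the authors' settings.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_dwt(signal, levels):
    """Orthonormal multi-level Haar transform.

    Returns [approx_L, detail_L, ..., detail_1]. The signal length must
    be divisible by 2**levels.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / SQRT2)   # detail at this scale
        approx = (even + odd) / SQRT2          # coarser trend
    return [approx] + details[::-1]


def haar_idwt(coeffs):
    """Inverse of haar_dwt."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / SQRT2
        out[1::2] = (approx - detail) / SQRT2
        approx = out
    return approx


def denoise(signal, levels=3):
    """Soft-threshold the detail coefficients (assumed universal threshold)."""
    coeffs = haar_dwt(signal, levels)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # noise estimate from finest scale
    threshold = sigma * np.sqrt(2.0 * np.log(len(signal)))
    shrunk = [coeffs[0]] + [
        np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0) for d in coeffs[1:]
    ]
    return shrunk, haar_idwt(shrunk)


# Hypothetical noisy profile: a step change in one process variable.
rng = np.random.default_rng(1)
clean = np.concatenate([np.full(32, 20.0), np.full(32, 60.0)])
noisy = clean + rng.normal(scale=1.0, size=clean.size)
coeffs, smooth = denoise(noisy, levels=3)
row_block = np.concatenate(coeffs)   # de-noised coefficients of one profile
```

In the spirit of equation (3), `row_block` would fill the columns of one variable in one row of the coefficient matrix on which PCA is then applied.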
De-noising eliminates the effect of noise, with clear advantages for subsequent analysis. So, the matrix X is built from the de-noised coefficients $d_{mk}$ and PCA is applied to it. There is a clear difference from the multiscale PCA proposed by Bakshi (1998): here, PCA is built only over the complete matrix of wavelet coefficients, and de-noising is ensured before PCA.

2.2 Fuzzy Clustering

Clustering techniques attempt to assess the relationships among data patterns belonging to different groups. In this work, fuzzy clustering is adopted to identify operating region patterns and as a basis for rule generation. Fuzzy c-means is used as the clustering technique. It is an algorithm that can automatically identify the centre of each cluster and calculate the membership value of each data case in each cluster. It is based on the minimisation of the sum of squared Euclidean distances between the data ($x_k$, $k = 1, \ldots, n$) and the cluster centres ($v_i$, $i = 1, \ldots, c$):

$$
\min J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (\mu_{ik})^m \, \lVert x_k - v_i \rVert^2 \qquad (5)
$$

where $m \ge 1$ is the fuzziness index, $c$ is the number of clusters, and $\mu_{ik}$ denotes the elements of the fuzzy c-partition matrix $U$. The fuzzy c-partition is constrained as follows:

$$
\mu_{ik} \in [0, 1] \;\; \forall i, k; \qquad \sum_{i=1}^{c} \mu_{ik} = 1 \;\; \forall k; \qquad \sum_{k=1}^{n} \mu_{ik} < n \;\; \forall i \qquad (6)
$$

In other words, each $x_k$ can belong to more than one cluster, with each membership (µ) taking a fractional value between 0 and 1. The details of the algorithm are omitted for space reasons (see Bezdek, 1981). However, it should be noted that the algorithm depends on the number of clusters $c$. Also, because the objective function is based on Euclidean distances, the method tends to identify only spherical clusters. In this work, the Mahalanobis distance is used in equation (5) to extend the identification to both spherical and ellipsoidal forms. Additionally, a simple algorithm, the mountain method (Yager et al., 1994), is used as a pre-estimator of $c$ for cases in which it is not known a priori.

Rules generation

In the above methods, each cluster centre is in essence a prototypical data point that exemplifies a characteristic behaviour of the system. The membership information, µ, of each data point then allows it to be associated with a pattern of the system. So, simple rules can be generated by extracting the largest membership of each point and relating it to a pattern as follows:

If $\mu_{J1}$ is A then {$C_1$ = pattern$_1$} is produced.
If $\mu_{J2}$ is B and $\mu_{J2}$ is D then {C = pattern} is produced.

Here, A, B and D can represent values of variables such as temperature, and the patterns can express an associated operating condition for a recipe. As a consequence, more insight into the process can be obtained and used to design a monitoring system.

2.3 Global procedure. Data mining approaches.

The above methods are combined to extract knowledge from batch process data. The way in which the data are analysed is determined by two important issues: production by recipes and the development of the operation by stages.
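The fuzzy c-means iteration of Section 2.2, whose details the text omits, can be sketched with the standard alternating updates of centres and memberships (Bezdek, 1981). This sketch uses the plain Euclidean distance of equation (5), not the Mahalanobis variant adopted in this work, and it requires m > 1. The synthetic two-group data, the random initialisation and the function name `fuzzy_c_means` are illustrative assumptions. The last line applies the largest-membership assignment on which the rules are based.

```python
import numpy as np


def fuzzy_c_means(x, c, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Standard fuzzy c-means with Euclidean distance (requires m > 1).

    x has shape (n, p). Returns memberships u of shape (c, n) and
    centres v of shape (c, p).
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    u = rng.random((c, n))
    u /= u.sum(axis=0)                       # memberships of each point sum to 1
    for _ in range(n_iter):
        um = u ** m
        v = (um @ x) / um.sum(axis=1, keepdims=True)              # centre update
        d2 = ((x[None, :, :] - v[:, None, :]) ** 2).sum(axis=2)   # squared distances (c, n)
        d2 = np.maximum(d2, 1e-12)           # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=0)        # membership update
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return u, v


# Hypothetical score data: two well-separated groups of 10 batches each.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(5.0, 0.1, (10, 2))])
u, v = fuzzy_c_means(x, c=2)
labels = u.argmax(axis=0)   # largest membership decides the pattern of each batch
```

A batch whose largest membership is still low belongs clearly to no group, which is the kind of information a membership plot makes visible.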
Two levels of analysis are proposed: one at the recipe level (entire batch) and another at the stage level (sub-step). At the first level, the overall data from different batches are used. A matrix, $X_c$, with all process variables, is built together with a separate matrix of important quality variables, $X_{Qi}$. Wavelet coefficient matrices are obtained for each matrix ($X_c$, $X_{Qi}$). The resulting matrices are processed with PCA. Reduced representations (patterns) of the batches and the profile of each $X_{Qi}$ are obtained. Next, clustering is applied, yielding groups of patterns and the memberships µ of the patterns in those groups. Groups of one or two objects are rejected as outliers, together with their respective coefficients in $X_c$ and $X_{Qi}$, and the PCA models are obtained again. This rejection step eliminates possibly abnormal batches and selects the data of good batches. When no further rejections are registered, the groups of operations are defined. It can occur that some recipes are grouped in the same cluster, which suggests similar operating conditions and, possibly, a single model for these recipes.

In a second step, the data set obtained in the above step is used and the recipes identified as similar are analysed together. A pre-processing step with wavelets is applied to the variable profiles to identify stages. Then, MsPCA is applied to the individual $X_{pi}$ matrices and to groups of stages in the profiles. The information obtained, µ, is mapped onto the µ information at the recipe level. So, simple rules are derived about the conditions in each stage that can lead to a given product grade of a recipe. Finally, the data rejected at the recipe level are analysed to identify potentially abnormal operation. In this way, the knowledge about the process is expanded.

It should be noted that a similar analysis (by levels) has already been proposed by Yuan et al. (2001). They first apply the stage-level analysis to obtain a reduced representation of the variables with the most significant principal components (PCs). Then, PCA is applied over the PCs. However, PCs are not well suited to capturing the dynamic trend information of variables. Our proposed use of wavelets is much more appropriate for this task. Moreover, Yuan et al. do not check for abnormal data or consider the presence of outliers, which can mask the identification of groups. Additionally, their approach cannot be applied to time-varying processes.

3. Batch Pilot Plant Application

A real pilot plant at UPC has been selected as the scenario for testing purposes. It contains three reactors, heat exchangers and a highly flexible connectivity between them, achieved via a network of pipes, pumps and valves. It has been used to generate data for several product recipes. A total of 24 experiments with different operation times were generated. First, the analysis at the recipe level is carried out.

Figure 2a. µ values with a bad batch. Figure 2b.
µ values without bad batches.

Membership values (µ) of the batches in the different groups are obtained (figure 2a). The groups are verified to represent specific recipes. The effect of a bad batch, which has a low µ in recipe 3, can also be seen. When it is rejected, the definition of the groups improves (figure 2b). Subsequent analyses help to identify the different operating conditions associated with the existing recipes. Then, operating rules for each of the recipes are obtained. It should be noted that, in this application, the process variables were recorded once per minute, while the quality variables were recorded every 5-10 minutes. Because PCA is applied to the coefficients of the wavelet approximation of each signal, this small difference in sampling is not limiting. However, the method cannot be applied in cases with larger differences between sampling frequencies.

4. Conclusions

A new methodology to explore batch processes has been proposed. The methodology is capable of dealing with important issues such as time-varying trajectories and outliers. It is also appropriate for representing operations, by stages and by recipes, with rules. This last capability is useful for improving the understanding of the process and for the design or redesign of monitoring systems. Data from the pilot plant have served to illustrate the methodology. Nevertheless, additional applications to this and/or other real scenarios are needed to establish the generality of the method. Also, the problem of different sampling frequencies must be studied further. Finally, the approach appears potentially useful for building root cause analysis databases to support tasks such as equipment maintenance or scheduling. All this will be explored in future work.

Acknowledgment

Financial support from the Generalitat de Catalunya (FI research grant for Tona, R.V.) and from the European Community (projects VIPNET-GRD-CT-2000-00318 and CHEM-GRD-CT-2001-00466) is gratefully acknowledged.

References

Bakshi, B. R., 1998, AIChE Journal, 44, 7, 1596-1610.
Bezdek, J., 1981, Pattern recognition with fuzzy objective function algorithms, Plenum, N.Y.
Chen, J. and Liu, J., 2001, Chem. Eng. Sci., 56, 10, 3289-3304.
Kassidas, A., MacGregor, J.F. and Taylor, P., 1998, AIChE Journal, 44, 4, 864-875.
Nomikos, P. and MacGregor, J.F., 1995, Technometrics, 37, 1, 41-59.
Nounou, M. N. and Bakshi, B. R., 1999, AIChE Journal, 45, 5, 1041-1058.
Stockill, D., 2002, ESCAPE-12 (Eds. Grievink, J. and Schijndel, J.), Elsevier, 70-77, Amsterdam, Netherlands.
Wang, X.Z., 2001, Application of Neural Networks and other Learning Technologies in Process Engineering, Imperial College Press, London.
Yager, R.R. and Filev, D.P., 1994, IEEE Trans. on Syst., Man, & Cyb., 24, 8, 1279-1284.
Yuan, and Wang, X.Z., 2001, Chem. Eng. Comm., 185, 201-221.