Sensor Selection with Grey Correlation Analysis for Remaining Useful Life Evaluation

Similar documents
A Similarity-Based Prognostics Approach for Remaining Useful Life Estimation of Engineered Systems

Cluster Analysis of Electrical Behavior

A Binarization Algorithm specialized on Document Images and Photos

Classifier Selection Based on Data Complexity Measures *

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

X- Chart Using ANOM Approach

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap

Smoothing Spline ANOVA for variable screening

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc.

y and the total sum of

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

S1 Note. Basis functions.

Study of Data Stream Clustering Based on Bio-inspired Model

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Hermite Splines in Lie Groups as Products of Geodesics

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

An Image Fusion Approach Based on Segmentation Region

Query Clustering Using a Hybrid Query Similarity Measure

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Relevance Assignment and Fusion of Multiple Learning Methods Applied to Remote Sensing Image Analysis

TN348: Openlab Module - Colocalization

THE PATH PLANNING ALGORITHM AND SIMULATION FOR MOBILE ROBOT

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

Resource and Virtual Function Status Monitoring in Network Function Virtualization Environment

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Available online at Available online at Advanced in Control Engineering and Information Science

Maintaining temporal validity of real-time data on non-continuously executing resources

A fast algorithm for color image segmentation

Wishing you all a Total Quality New Year!

Analysis on the Workspace of Six-degrees-of-freedom Industrial Robot Based on AutoCAD

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

Optimizing Document Scoring for Query Retrieval

Backpropagation: In Search of Performance Parameters

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Feature Reduction and Selection

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Simulation Based Analysis of FAST TCP using OMNET++

The Research of Ellipse Parameter Fitting Algorithm of Ultrasonic Imaging Logging in the Casing Hole

User Authentication Based On Behavioral Mouse Dynamics Biometrics

An Entropy-Based Approach to Integrated Information Needs Assessment

CS 534: Computer Vision Model Fitting

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Fusion Performance Model for Distributed Tracking and Classification

Parallelism for Nested Loops with Non-uniform and Flow Dependences

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A Background Subtraction for a Vision-based User Interface *

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Application of Clustering Algorithm in Big Data Sample Set Optimization

The Research of Support Vector Machine in Agricultural Data Classification

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Reducing Frame Rate for Object Tracking

Adaptive Transfer Learning

Network Intrusion Detection Based on PSO-SVM

An Improved Image Segmentation Algorithm Based on the Otsu Method

A Load-balancing and Energy-aware Clustering Algorithm in Wireless Ad-hoc Networks

The Codesign Challenge

Machine Learning: Algorithms and Applications

SVM-based Learning for Multiple Model Estimation

The Man-hour Estimation Models & Its Comparison of Interim Products Assembly for Shipbuilding

Load Balancing for Hex-Cell Interconnection Network

Positive Semi-definite Programming Localization in Wireless Sensor Networks

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

Support Vector Machines

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Research and Application of Fingerprint Recognition Based on MATLAB

Life Tables (Times) Summary. Sample StatFolio: lifetable times.sgp

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

Design of Structure Optimization with APDL

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

Research on Categorization of Animation Effect Based on Data Mining

Finite Element Analysis of Rubber Sealing Ring Resilience Behavior Qu Jia 1,a, Chen Geng 1,b and Yang Yuwei 2,c

The Nottingham eprints service makes this work by researchers of the University of Nottingham available open access under the following conditions.

Meta-heuristics for Multidimensional Knapsack Problems

Performance Assessment and Fault Diagnosis for Hydraulic Pump Based on WPT and SOM

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

UB at GeoCLEF Department of Geography Abstract

An Optimal Algorithm for Prufer Codes *

A Semi-parametric Regression Model to Estimate Variability of NO 2

A Simple and Efficient Goal Programming Model for Computing of Fuzzy Linear Regression Parameters with Considering Outliers

Video Proxy System for a Large-scale VOD System (DINA)

Support Vector Machines

Learning-Based Top-N Selection Query Evaluation over Relational Databases

The Shortest Path of Touring Lines given in the Plane

High-Boost Mesh Filtering for 3-D Shape Enhancement

Decision Strategies for Rating Objects in Knowledge-Shared Research Networks

Solutions to Programming Assignment Five Interpolation and Numerical Differentiation

A CALCULATION METHOD OF DEEP WEB ENTITIES RECOGNITION

Evaluation of Parallel Processing Systems through Queuing Model

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Transcription:

Sensor Selecton wth Grey Correlaton Analyss for Remanng Useful Lfe valuaton Peng Yu, Xu Yong, Lu Datong, Peng Xyuan Automatc est Control Insttute, Harbn Insttute of echnology, Harbn, 5, Chna pengyu@ht.edu.cn xuyong.ht@gmal.com ABSRAC Sensor selecton n data modelng s an mportant research topc for prognostcs. he performance of predcton model may vary consderably under dfferent varable subset. Hence t s of great mportant to devse a systematc sensor selecton method that offers gudance on choosng the most representatve sensors for prognostcs. hs paper proposes a sensor selecton method based on the mproved grey correlaton analyss. From emprcal observaton, all the contnuous-value sensors wth a consstent monotonc trend are frstly selected for data fuson, a lnear regresson model s used to convert the mult-dmensonal sensor readngs nto one-dmensonal health factor (HF). he correlaton between HF each of the selected sensors s evaluated by calculatng the grey correlaton degree defned on two tme seres. he optmal sensor subset wth a relatvely large correlaton degree s selected to execute the fnal fuson. he effectveness of the proposed method was verfed expermentally on the turbofan engne smulaton data suppled by NASA Ames, usng nstance-based learnng methodology, the expermental results showed that predcton wth fewer sensor nputs can obtan a more accurate prognostcs performance than usng all sensors ntally consdered relevant.. INRODUCION Prognostcs health management (PHM) of complex engneered systems has ganed ncreasng attenton from the research communty worldwde. Prognostc bulds the foundaton of PHM, ts outcome drectly affects the other PHM components such as operatons plannng, tmely mantenance, logstcs, etc. Hence prognostcs can play an mportant part n reducng cost, ncreasng safety, accomplshng crtcal mssons (Heng et al., 8). Generally, prognostcs can be dvded nto the detecton of falure precursors the predcton of remanng useful lfe Peng Yu et al. hs s an open-access artcle dstrbuted under the terms of the Creatve Commons Attrbuton 3. Unted States Lcense, whch permts unrestrcted use, dstrbuton, reproducton n any medum, provded the orgnal author source are credted. (). Comparng to the detecton of falure precursors, predcton s mostly rrelevant to the applcaton. he methods for predcton are almost the same to all prognostcs applcatons. he ultmate am of most prognostcs systems s accurate estmaton the of ndvdual systems, the predcton accuracy reles not only on the predcton models used, but also on the types number of sensors selected (Cheng et al., ). As the performance of system degrades, the montored parameters tend to change accordngly. hese raw mult-dmensonal sensor data or features extracted from them may be used to track the degradaton behavor of system. ypcally, these degradaton data can be used as the nputs of data-drven prognostcs model to make estmaton. Degradaton data may consst of sensor readngs, such as temperature pressure, or nferred features, such as model resduals or physcs-based model predctons. Commonly, t s benefcal to fuse sensor readngs nferred features nto a sngle health factor, whch s consdered as a more robust nput to the prognostcs model (Coble, ). However, ncluson of rrelevant or redundant varables durng the fuson may lead to over-fttng or less senstvty of prognostcs model, whch s adverse to the predcton performance. Hence varable selecton s crtcal to make an accurate estmaton. ypcally, sensor selecton s left to expert knowledge, emprcal observaton of avalable data, ntmate knowledge of degradaton mechansms. hese methods are tme-consumng, scale wth the number of avalable sensors possble fault modes (Zhang, 5). For many real-world systems, t s almost mpossble to fully underst the degradaton behavor of the systems employ the frst-prncple models for prognostcs. Snce Instance-based learnng (IBL) approach develops prognostcs model based on a mass of hstorcal nstances, t becomes a preferable choce (Xue et al., 8). As the rapd development of communcaton sensor technology, abundant data collecton from complex systems, such as arcraft engnes, satellte power system, etc, becomes possble. hese massve lfe-cycle condton data collected

Annual Conference of Prognostcs Health Management Socety from varous nstances of the equpment further promotes the applcaton of IBL prognostcs approach. he current work focuses on mprovng the predcton accuracy of complex systems va sensor selecton. In real applcatons, complex system s operated under dynamc operatng condtons, k-means clusterng algorthm s employed to cluster the operatonal condtons nto a fnte number of operatng regmes. Accordng to the emprcal observaton, those sensors wth a consstent monotonc trend under dfferent operatng regmes are selected, a lnear regresson model s employed to convert the selected multvarate sensor readngs from ndvdual regmes nto a one-dmensonal HF, then all the HFs are merged to form a complete HF tme seres wth the orgnal tme stamps. he correlaton between HF each of the selected sensors s evaluated by calculatng the grey correlaton degree defned on two tme seres. he optmal sensor subset wth a relatvely large correlaton degree s selected to execute the fnal fuson. Fnally, the obtaned HF tme seres s ntegrated nto an IBL prognostcs archtecture consstng of model recognton, smlarty evaluaton predcton. Moreover, the performance of the sensor selecton strategy s verfed expermentally on the turbofan engne smulaton data suppled by NASA Ames, usng the ntegrated IBL prognostcs archtecture. he paper s organzed nto the followng sectons. In secton, the sensor selecton scheme based on mproved grey correlaton analyss s ntroduced. In secton 3, the IBL prognostcs archtecture s summarzed. he turbofan engne applcaton s elaborated n secton 4. he experment results dscussons are presented n secton 5. Concluson s drawn n secton 6.. SNSOR SLCION SCHM Sensor selecton s mostly relevant to the applcaton, t ams at reducng the unnecessary redundancy whle maxmzng the relevance n the sensor subset (Wang et al, 8). Accordng to the characterstcs of collected data, the sensor selecton scheme probably conssts of operaton condton dvson, emprcal observaton, data fuson grey correlaton analyss... Operaton Condton Dvson In real applcatons, the dynamc operaton condtons have a great mpact on the sensor readngs or nferred features from the system complcate the system degradaton behavors over tme. he sensor tme seres may show lttle trend. However, f the operaton condtons are clustered nto several operatng regmes by the use of certan clusterng algorthm, the sensor data collected from dfferent regmes may exhbt a promnent trend. he vector c represents the operaton condtons of the system at tme t. Suppose the operatonal condtons c can be concentrated nto a lmted number of operatng regmes Ο={ O, O,..., O P } usng k-means clusterng algorthm f. he output of f s defned as: where S= ( c )=(,,..., ) () f S S S P S p s membershp score when Op c. In case of dscrete operaton condtons, the output of f can be smplfed as:.. mprcal Observaton f( c )= arg max S k () k=,..., P he objectve of emprcal observaton s to elmnate the sensors whch are obvously not sutable for prognostcs modelng. he vsual nspecton procedures of sensor data under dfferent operatng regmes are as follows:. Some sensors wth one or multple dscrete values are frstly dscarded, from whch t s dffcult to track the degradaton trend of the system.. Some other sensors have contnuous values, but exhbt non-monotonc trend durng the lfe tme of the nstances, should also be dscarded. 3. All the remanng sensors wth contnuous values exhbt a monotonc trend, but some of them show nconsstent evoluton trend among the dfferent nstances, whch may represent varous fault modes of the system. As t s hard to quantze or dentfy the fault modes wthout system relevant nformaton, those sensors wth nconsstent trend are elmnated. 4. Only those sensors wth a consstent monotonc trend under dfferent operatng regmes are selected for data modelng or other processng..3. Data Fuson In ths paper, data fuson refers to convert the selected multvarate sensor data from ndvdual regmes nto a sngle HF wthn a normalzed range. herefore, the HFs obtaned from each regme can be merged to form a new one-dmenson tme seres, as descrbed n Fg.. For operatng regme created: O p, one local regresson model s ( p z = h (x ; θ ) ) (3) where x represents the multvarate sensor data, ( ) p θ denotes the local model parameters.

Annual Conference of Prognostcs Health Management Socety Local regresson model Sensor data collected under dynamc operaton condtons HF Operaton condton dvson Local regresson model HF tme seres K-means clusterng algorthm Regme Regme Regme P HF... Local regresson model P HF P Fgure. Data processng for modelng through mult-regme health assessment o make the HF comparable under dfferent operatng regme, the obtaned HF should be normalzed to a range, usually between, hence, the learnng procedures of a local regresson model can be followed: For those samples ( c, x ) collected at early stage, that s t <, assgn the matchng output z = ; For those samples ( c, x ) collected at mddle stage, that s < t <, wll not partcpate model tranng; For those samples ( c, x ) collected at late stage, that s t >, assgn the matchng output z =. he number parameters of local models vary wth the assgned thresholds, whch can greatly nfluence the model performance. Commonly, the parameters are chosen by the rule of thumb, e.g. = *% t = *9% t, where t represents the whole lfespan of the nstance. Above all, once the sensor data of tranng nstances x are provded, they wll be dvded nto fnte groups n accordance wth dfferent operatng regmes appled to learn dfferent local regresson models..4. Grey Correlaton Analyss As mentoned, the sensors are prelmnary selected by emprcal observaton, the sensor fuson method s provded. One major problem les n whether the sensor subset can be further optmzed. Due to large nose low senstvty, some sensors exhbt an unclear trend compared wth the others. Includng them n the data fuson may lower the predcton accuracy. Hence certan analyss method should be adopted to further select the sensors. Grey correlaton analyss s a prncple theory of grey system theory, whch can be appled n grey system analyss rom varables processng (Zhang & Zhang, 7). he correlaton between factors s represented by the smlarty level of geometry whch s called grey correlaton degree, the correlaton degree between reference sequences comparson sequences can be quanttatvely estmated. Grey correlaton degree descrbes the relatve change between dfferent factors n the process of system evoluton, the larger the correlaton degree s, the hgher the smlarty level s. hus, the correlaton degree can represent the mpact of dfferent sensors on the system degradaton behavors. By the calculaton of mproved correlaton degree between HF tme seres sensor tme seres, the sensors wth a relatvely large correlaton degree are selected snce they have a larger mpact on the HF. he calculaton steps of grey correlaton degree between HF sensors are as follows:. he HF tme seres whch can represent the system degradaton behavors s set as the reference sequences Z = { zk ( ) k=,,..., n}, the sensor tme seres whch can affect the system degradaton behavors are set as the comparson sequences X = X ( k) k=,,..., n, =,,..., m. { }. Due to the varous unts of measurements, the dmensons of sensor data are dfferent, whch may lead to a wrong correlaton analyss result. hus, the data should be converted nto dmensonless form. here are many dmensonless processng methods, such as equalzaton, ntalzaton, etc. In ths paper, ntalzaton method s appled, the whole data n the orgnal sequences are dvded by the ntal data, whch s shown as follows: X ( k) x ( k)=, k=,,..., n; =,,..., m (4) X () 3. he correlaton s substantally the fttng degree of geometry between curves, thus the dfference between curves s consdered as the performance ndcator of correlaton. Set ( k) = zk ( ) x( k), the grey correlaton coeffcent between zk ( ) x ( k) s: mn mn ( k) + ρ max max ( k) k k ξ ( k) = ( k) + ρ max max ( k) k (5) where ρ (, ), ts effect les n enhancng the sgnfcance of dfference between the correlaton coeffcents. he value of ρ s commonly set as.5. 4. he correlaton coeffcent represents the correlaton between reference sequences comparson sequences at varous ponts, there s a correspondng correlaton coeffcent at each pont. However, the decentralzed nformaton s nconvenent for holstc comparson. o solve the problem, the correlaton degree s proposed, whch can be represented by the mean of the correlaton coeffcents: 3

Annual Conference of Prognostcs Health Management Socety n r = ξ( k), k =,,..., n (6) n k = 5. he above-mentoned correlaton degree s calculated wthout consderng the dversty of correlaton coeffcents at dfferent pont. herefore, the stablty of correlaton coeffcent sequences s proposed: Sr ( ) ( ( k) r) n = ξ (7) n k = On the bass of the stablty, the computng model of grey correlaton degree s mproved: r * r () = (8) + Sr ( ) he object of calculatng grey correlaton degree les n comparng the mpact of dfferent sensors on system degradaton behavors. If r> r, the comparson sequence X has a greater mpact on Z than X. 3. IBL PROGNOSICS ARCHICUR From the above-mentoned sensor selecton scheme, an optmal sensor subset s selected, HF tme seres s formed by sensor data fuson, whch s ntegrated nto IBL prognostcs archtecture. In ths archtecture, a number of HF tme seres extracted from the hstorcal montorng data of the tranng nstances wth known falure tmes are used to form a lbrary of degradaton models. he smlarty between a test nstance each of the models are evaluated, each model can gve an ndvdual estmaton to the test nstance. hese estmatons can fuse nto a fnal predcton by the smlarty-weghted sum. he archtecture conssts of model recognton, smlarty evaluaton predcton. 3.. Model Recognton he HF tme seres extracted from one tranng nstance s avalable to establsh a model depctng the whole performance degradaton process of the nstance, the M can be constructed based on the multple HF tme seres extracted from tranng nstances. M s commonly a determnstc model that can gve a predcted output at a gven tme: model lbrary { } M : y = m ( t)+ ε, t (9) where ε s nosy term, s the lfetme of the tranng nstance used to establsh the model. he selecton of model type s applcaton dependent. For complex engneered systems, the man consderaton s commonly focused on the long term degradaton trend of the system, the fluctuatons n the degradaton process can be recognzed as nterference or nose. Hence, a model wth smoothng functon of the tme seres can be adopted. 3.. Smlarty valuaton he defnton of smlarty between dfferent nstances has a great mpact on the performance of IBL prognostcs method. In ths paper, grey correlaton degree ucldean dstant are adopted to respectvely represent the smlarty between the test nstance Z = { zk ( ) k=,,..., r} degradaton model M. he calculaton steps of grey correlaton degree between dfferent data sequences have already been ntroduced n secton.4 wll not be reterated here. he ucldean dstant between Z M s defned as: r = j= + D( τ, Z, M ) ( z( j) m ( τ r j)) / σ () where τ r+, τ represents the tme span that the tme seres Z s moved away from cycle zero of model M, σ s the predcton varance provded by M. he smaller the dstance s, the hgher the smlarty s. Moreover, the smlarty degree between defned as: τ Z = exp ( D( τ,, M) ) S(,, M ) 3.3. Predcton Z M s Z () Once the defnton of smlarty s defned, each model M n the lbrary can gve an ndvdual estmaton to the test nstance: = - r - arg mn D( τ, Z, M ) () τ All predctons correspondng smlarty degrees form a set {(, S ( τ, Z, M ) ) =,,..., I}, where I represents the number of models n the lbrary. he smlarty-weghted method s appled to fuse all the predctons n the set to get a fnal estmaton: 4. CAS SUDY r I = = S( τ, Z, M ) I = S( τ, Z, M ) (3) In ths secton, the performance of the sensor selecton scheme wll be valdated expermentally on the turbofan engne smulaton data avalable from NASA Ames Prognostc Data Repostory (Saxena & Goebel, 8), usng the ntegrated IBL prognostcs archtecture. 4

Annual Conference of Prognostcs Health Management Socety Set Set Set 3 Set 4 Fault modes Operaton condton 6 6 ranng unts 6 48 estng unts 59 49 able. xperment settngs of the data sets 4.. Data Descrpton Four data sets wth dfferent smulaton settngs such as the number of operaton condtons fault modes are provded by NASA, these data sets consst of multvarate tme seres from multple nstances of the turbofan engne. he collected data for each nstance conssts of a 4-dmensonal tme seres (3 operaton condtons sensor readngs for each flght cycle), these data can represent the condton of engne throughout ts flght hstory. he experment settngs of the data sets are descrbed n able. Furthermore, each data set s dvded nto tranng testng subsets. he nstances n the tranng subset have complete run-to-falure data, whch can be appled to develop prognostcs model, whle the nstances n the testng subset have up-to-date data correspondng falure tme data, whch can be appled to valdate the performance of prognostcs model. In ths paper, data set s selected to evaluate the effectveness of the proposed sensor selecton scheme. Nevertheless, only the tranng subset s appled for valdty assessment, for the testng nstances wth ncomplete run-to-falure data are not sutable for performance evaluaton metrcs based on successve estmatons throughout the whole lfe. Hence, the frst tranng nstances wll be appled for tranng, 3 out of the 6 remanng tranng nstances wll be selected romly for testng. 4.. Performance Metrcs In the context of prognostcs, the tradtonal accuracy-based or robustness-based performance metrcs are nadequate to farly assess the performance of predcton algorthms. Hence, four performance metrcs proposed by Saxena et al. () are adopted wth mnor modfcaton. hese metrcs are on the bass of successve predcton for each nstance. ) Predcton horzon Predcton horzon (PH) s defned as the estmaton that frstly satsfes the α -bound crtera: PH= t -t α (4) where t s the end-of-lfe tme stamp, t represents α the tme stamp of the estmaton that frstly satsfes the α -bound crtera. At each tme stamp t, the correspondng estmaton s r. t α s defned as: * * tα = mn{ tt [ ts, tf ], r - α t r r + α t} (5) where t s s the start tme of the estmaton, t f denotes the end tme of the estmaton. ) Rate of acceptable predctons Rate of acceptable predctons (AP) s defned as the rate of predctons that fall nto an acceptable cone-shape area when t th: AP=Mean({ δ t t t }) h f, f (- α) r r (- α) r δ =, otherwse * * (6) where t h s chose as PH calculated above. Obvously, AP s a strcter metrc than the predcton errors. 3) Relatve accuracy Relatve accuracy (RA) s defned as the mean absolute percentage errors for all t t h : r-r RA=-Mean({ t t t }) (7) * * r h f RA can gve a quanttatve metrc of the predcton accuracy wthn the specfed, comparng wth AR. 4) Convergence Convergence (CG) evaluate how fast the predcton performance mproves when more hstorcal data s avalable: f ( t = + - t ) s CG=- -t f ( t = +- t) s s tf -t s (8) * where performance metrc = r-r. he value of CG s between, CG>.5 ndcates convergence for the predcton. 5) Performance evaluaton o assess the predcton performance based on multple seres from K testng nstances, the medan of four performance metrcs s used: k PH=Medan({ PH} K ) k AP=Medan({ AP} K ) k RA=Medan({ RA} K ) k CG=Medan({ CG} ) K (9) 5

Annual Conference of Prognostcs Health Management Socety Wthn the four metrcs, PH represents the sze of tme nterval whle the others have a value between ( means perfect). 55.5 55 Regme 4.3. ranng Stage From vsual nspecton, the sensor data n the tranng nstances exhbt no promnent trend, as shown n Fg.. Hence, k-means clusterng algorthm s employed to cluster the 3 operaton condtons of all the tranng nstances, the operaton condtons are clustered n 6 dscrete operatng regmes, labeled by an ID from to 6. By ths way, the sensor data from each regme may exhbt a rsng trend, as shown n Fg.3 On the bass of the emprcal observaton mentoned n secton., the 9 sensors wth a consstent monotonc degradaton trend under all of the 6 operatng regmes are selected for further processng, namely #, #3, #4, #7, #, #, #5, # #. For nstance, the readngs of sensor # from all the tranng nstances under regme are llustrated n Fg.4. Sensor readngs 66 64 6 6 58 56 54 All regmes mxed together 5 5 5 5 3 Fgure. Raw data of sensor from one tranng nstance under all regmes Sensor readngs 55.5 55 549.5 549 548.5 548-4 -35-3 -5 - -5 - -5 Fgure 4. Raw data of sensor from all the tranng nstances under regme Snce the sensors have been prelmnary selected, HF tme seres can be obtaned through data fuson. It s remarkable that the selected sensors exhbt a consstent monotonc trend under dfferent operatng regmes, a lnear model can ft well for data wth consstent trend. Hence, a lnear regresson model s adopted to convert the multdmensonal sensor data nto HF: z = + β α 9 x = x + ε = α + β + ε () where x represents the selected 9-dmenson sensor data, z s the health factor, ε s the nose term. he sample set Ω ={ ( x, z) } s used to learn the lnear model: Ω = {( x, ) t > } {( x, ) t < } () In ths paper, the thresholds are set as =- t *9% =- t *%, where t represents the whole lfespan of the nstance. Fnally, 6 HF tme seres are extracted from all the tranng nstances, one of the tme seres s showed n Fg. 5. 69 Regme 4 only.4 All rgmes mxed together 68.5. Sensor readngs 68 67.5 Health factor.8.6.4 67. 66.5 5 5 5 3 Fgure 3. Raw data of sensor from one tranng nstance under regme 4 only -. -5 - -5 - -5 Fgure 5. HF tme seres extracted from one tranng nstance 6

Annual Conference of Prognostcs Health Management Socety Rankng #3 #5 #6 #9 () () () ().73636(6).74473(6).7649(8).78437(8) 3.758(8).7445(8).7639(6).7798(6) 4.733(3).73965(3).75887(3).7773() 5.7().7389().7567().777(3) 6.7695(4).73636(4).754(4).77345(4) 7.55575().5647().5553(9).6669(9) 8.5583(9).5597(9).55435().66538() 9.544(5).555(5).5457(5).6567(5).5437(7).55(7).54537(7).65556(7) able. Computng results of correlaton degrees n the selected 4 tranng nstances he HF tme seres correspondng 9-dmenson sensor tme seres are labeled by an ID from to. he grey correlaton degrees between 6 HF tme seres ther correspondng multvarate sensor tme seres are calculated, the results are sorted n descendng order. Among them, the computng results of correlaton degrees from 4 tranng nstances are shown n able. From able, the maxmal correlaton degree equals to, namely HF tme seres has a perfect smlarty wth tself. he correlaton degrees ranked from to 6 are rather close to each other, so are the correlaton degrees ranked from 7 to, but the correlaton degrees between these two subsets have an obvous dfference n value. It means that the frst 5 sensor tme seres have a greater mpact on the system degradaton behavors than the latter 4. By now, the frst 5 sensors, namely #, #3, #4, # #5, mght be selected to further optmze the sensor selecton. o verfy the generalty of grey correlaton analyss, the statstcal nformaton related wth correlaton degrees are gven n able 3. From able 3, the correlaton degrees of two sensors #5 #, are ranked from to 3 n most cases, the between HF each of the sensors n all tranng nstances correlaton degrees of the sensors # #3, are always ranked from 4 to 5, the correlaton degree of sensor #4 s ranked as 6 n most cases. Meanwhle, the correlaton degrees of the frst 5 sensors are rather close n value. Moreover, the correlaton degrees of the remanng 4 sensors have never featured n the top 6 rankngs, the sensors ranked n the top 6 have an obvously hgher coeffcent degree than the sensor ranked n 7. Hence, the smlarty between HF tme seres the sensor tme seres can be well represented by the mproved grey correlaton degree, the selected sensor subset ncludes 5 sensors, #, #3, #4, # #5. Furthermore, whether the selected optmal sensor subset can be further reduced remans to be dscussed. Rankng # #3 #4 #7 # # #5 # # 8 34 3 9 7 8 8 4 89 66 6 5 64 89 3 6 6 59 3 7 94 68 8 68 94 9 87 75 75 87 able 3. Statstcal nformaton of correlaton degrees As seen n Fg.5, HF tme seres demonstrate an exponental degradaton trend. hus, the exponental regresson models are adopted to descrbe the relatonshp between the HF z operatng tme t : z= a exp ( bt + c)+ d+ σ () where a, b, c, d are the model parameters to be learned from HF tme seres, σ s the nose term. he equaton z = ndcates the falure of the nstance, whch s equal to the constrant a exp ( bt + c)+ d=. After solvng the parameter d, the followng model form can be acheved: z= a (exp ( bt + c)-exp ( bt + c))+ σ (3) he model lbrary { M } can be constructed based on the 6 HF tme seres extracted from all the tranng nstances. 4.4. estng Stage he sensor data n the testng nstances should be converted nto HF tme seres. For each testng nstance, the selected sensor data wll be clustered by operatng regmes, transformed by the lnear regresson models obtaned durng the tranng stage, fused to obtan a HF tme seres. In ths paper, the smlarty between the test nstance each of the degradaton models are respectvely evaluated by grey correlaton degree ucldean dstance, usng q. (8) (). he fnal pont estmaton of s obtaned usng q. (3). 5. PRFORMANC VALUAION For each testng nstance, multple estmatons wll be made at dfferent tme stamps along the lfe of the nstance, each predcton s made based on the up-to-date data tll the correspondng tme. In ths applcaton, the start tme of predcton t s s set to 5; the end tme of predcton t s set to t - ; the tme nterval of predcton s set to 5. f 7

Annual Conference of Prognostcs Health Management Socety ) Impact of the sensor selecton scheme on predcton performance By emprcal observaton, the sensors wth a consstent monotonc degradaton trend under all of the 6 operatng regmes are selected, namely #, #3, #4, #7, #, #, #5, # #, whch s represented by sensor subset. Furthermore, based on the grey correlaton analyss, the optmal sensor subset wth a relatvely large correlaton degree s selected, the subset ncludes 5 sensors, labeled by #, #3, #4, # #5. In ths experment, ucldean dstance s adopted to evaluate the smlarty, the thresholds of lnear regresson models are set as =- t *9% =- t *%. 3 testng nstances are selected romly to valdate the mpact of dfferent sensor subsets on predcton performance, the comparson results are descrbed n able 4. As seen n able 4, ntegrate wth the IBL prognostcs algorthm, the proposed sensor selecton scheme mproves the predcton performance sgnfcantly. AP, RA CG are mproved by 5.9%, 4.3%.6% whle PH s nearly cycles larger than before. Hence, the sensor selecton scheme based on mproved grey correlaton analyss can effectvely mprove the predcton performance. he sensor subset wll be adopted for data fuson n the subsequent experments ) Impact of dfferent threshold settngs for lnear regresson models on predcton performance Not all the multvarate sensor data seres can be converted nto correspondng HF tme seres through lnear regresson. he number parameters of lnear regresson models wll vary wth the dfferent threshold settngs. In ths experment, ucldean dstance s adopted to evaluate the smlarty, the thresholds of lnear regresson models are set to =- t *9% =-4 =- =- t *%, =- t *95% =- t *5%, respectvely. 3 testng nstances selected here s dentcal to that of the prevous experment, whch are used to valdate the mpact of dfferent threshold settngs on predcton performance, the comparson results are descrbed n able 5. As seen n able 5, when =- t *9% =- t *%, AP, RA CG are much greater than those n the other two threshold settngs, except that PH s relatvely small. Moreover, t seems as f the settng of threshold parameters usng a certan percentage of the total lfe of the nstance can lead to a better predcton effect than the threshold parameters wth fxed value. If hard threshold s appled, All the tranng nstances whose lfetme are smaller than, cannot be used to tran the lnear regresson models, resultng n the decreasng of the model types n the model lbrary. Performance Metrc Subset Subset PH 34 44.5 AP.4955.6337 RA.78859.87 CG.6794.68985 able 4. Predcton performance of IBL prognostcs algorthm under dfferent sensor subsets Performance Metrc =- t *9% =- t *% =- t *95% =- t *5% =-4 =- PH 44.5 45 6.5 AP.6337.3743.987 RA.87.6669.58335 CG.68985.5643.533 able 5. Predcton performance of IBL prognostcs algorthm under varous thresholds 3) Impact of dfferent smlarty measurements on predcton performance he defnton of smlarty between dfferent nstances has a great mpact on the performance of IBL prognostcs method. he purpose of ths experment s to fgure out ether ucldean dstance or grey correlaton degree s a preferable smlarty measurement. In the experment, 3 testng nstances whch have been selected n the prevous experment are used to valdate the mpact of dfferent smlarty measures on predcton performance, the comparson results are descrbed n able 6. predctons of certan testng nstances usng dfferent smlarty measures are shown n Fg.6. As seen n able 6, combned n the IBL prognostcs method, ucldean dstance s a preferable smlarty measurement. AP, RA CG are respectvely mproved by 9.3%, 8.8% 3.%, but PH s relatvely smaller. Above all, when sensor selecton s executed based on grey correlaton analyss, the threshold parameters are set to =- t *9% =- t *%, ucldean dstance s chosen as the smlarty measurement, the IBL algorthm can acheve an optmal predcton performance. Performance Metrc ucldean Dstance PH 44.5 6.5 AP.6337.57 RA.87.75639 CG.68985.63457 Grey Correlaton Degree able 6. Predcton performance of IBL prognostcs algorthm under dfferent smlarty measurements 8

Annual Conference of Prognostcs Health Management Socety 5 estng nstance ID:8 ucldean Dstant 4 3 estng nstance ID:6 ucldean Dstant 5 5 5 5 5 5 5 5 3 35 5 Correlaton Degree 4 3 Correlaton Degree 5 5 5 5 5 5 5 (a) Instance ID: 8 estng nstance ID:35 4 6 8 4 6 8 5 5 4 6 8 4 6 8 5 5 (b) Instance ID: 35 estng nstance ID:47 (c) Instance ID: 47 ucldean Dstant Correlaton Degree 4 6 8 4 6 8 5 5 ucldean Dstant Correlaton Degree 4 6 8 4 6 8 5 5 5 3 35 (d) Instance ID: 6 Fgure 6. predctons for selected nstances usng dfferent smlarty measures From Fg.6, the selected nstances show a desrable performance, where the predctons converge to the true as tme ncreases. In the fnal stage, the predctons are almost equal to the correspondng true s, ndcatng that the IBL algorthm has an excellent convergence predcton effect n ths applcaton. 6. CONCLUSION A grey correlaton analyss method to selectng the most representatve sensors for the predcton of complex engneered systems s developed. he performance of sensor selecton scheme ntegrated wth IBL prognostcs algorthm was evaluated usng four performance metrcs desgned n the context of PHM. he addton of other sensors not selected by grey correlaton analyss to the nput sensor subset has led to a decreasng n the predcton performance, confrmng the effectveness of the sensor selecton scheme. he scheme presented s expected to gan a smlar performance for other complex engneered systems. RFRNCS Heng, A., Zhang, S., an, C.., & Mathew, J. (8). Rotatng machnery prognostcs: State of the art, challenges opportuntes. Mechancal Systems Sgnal Processng, vol 3, pp. 74-739. do:.6/j.ymssp.8.6.9 Cheng, S., Azaran, M. H., & Pecht, M. G. (). Sensor systems for prognostcs health management. Sensors, vol, pp 5774-5797, do:.339/s65774 Coble, J. B., (). Mergng Data Sources to Predct Remanng Useful Lfe An Automated Methods to Identfy Prognostc Parameters, Doctoral dssertaton. 9

Annual Conference of Prognostcs Health Management Socety Unversty of ennessee, Knoxvlle, USA. http://trace.tennessee.edu/utk_gradds/683 Zhang, G. F., (5). Optmal Sensor Localzaton/Selecton n A Dagnostc/Prognostc Archtecture, Doctoral dssertaton. Georga Insttute of echnology, Atlanta, USA. Xue, F., Bonssone, P., Varma, A., Yan, W. Z., & Goebel, K. (8). An Instance-based method for remanng useful lfe estmaton for arcraft engnes. Journal of Falure Analyss Preventon, vol 8, pp. 99-6. do:.7/s668-8-98-9 Wang,. Y., Yu, J. B., Segel, D., & Lee, J. (8). A smlarty-based prognostcs approach for remanng useful lfe estmaton of engneered systems. Internatonal Conference on Prognostcs Health Management. October 6-9, Denver, CO. do:.9/phm.8.474 Zhang, Y. J., & Zhang, X. (7). Grey correlaton analyss between strength of slag cement partcle fractons of slag powder. Cement Concrete Compostes, vol 9, pp. 498-54. do:.6/j.cemconcomp.7..4 Saxena, A., & Goebel, K. (8). C-MAPSS data set. NASA Ames Prognostcs Data Repostory.http://t.arc.nasa.gov/project/prognostcdata-repostory Saxena, A., Celaya, J., Saha, B., Saha, S., & Goebel, K. (). Metrcs for offlne evaluaton of prognostc performance. Internatonal Journal of Prognostcs Health Management, vol. BIOGRAPHIS Peng Yu was born n 973. He receved the Ph.D. n nstrument scence technology from Harbn Insttute of echnology. Snce 7, he has been a professor n the Department of Instrument scence technology, Harbn Insttute of echnology. Hs research nterests nclude pattern recognton, data mnng wreless sensor networks. Xu Yong was born n 985. He s currently workng toward the Ph.D. degree from Harbn Insttute of echnology. Hs research nterests nclude wreless sensor networks prognostcs health management of complex system.