A Similarity-Based Prognostics Approach for Remaining Useful Life Estimation of Engineered Systems


2008 INTERNATIONAL CONFERENCE ON PROGNOSTICS AND HEALTH MANAGEMENT

Tianyi Wang, Jianbo Yu, David Siegel, and Jay Lee

Abstract
This paper presents a similarity-based approach for estimating the Remaining Useful Life (RUL) in prognostics. The approach is especially suitable for situations in which abundant run-to-failure data for an engineered system are available. Data from multiple units of the same system are used to create a library of degradation patterns. When estimating the RUL of a test unit, the data from it will be matched to those patterns in the library and the actual life of those matched units will be used as the basis of estimation. This approach is used to tackle the data challenge problem defined by the 2008 PHM Data Challenge Competition, in which run-to-failure data of an unspecified engineered system are provided and the RUL of a set of test units will be estimated. Results show that the similarity-based approach is very effective in performing RUL estimation.

Index Terms
Health management, Performance assessment, Prognostics, Remaining useful life

Manuscript received on July 18, 2008. This work was conducted primarily for the Data Challenge Competition organized by the 2008 PHM conference. The work is supported by the U.S. National Science Foundation Industry/University Cooperative Research Center for Intelligent Maintenance Systems (NSF I/UCR Center for IMS) at the University of Cincinnati.
Tianyi Wang is with Department of Mechanical Engineering, University of Cincinnati, Cincinnati, OH 45221 USA (phone: 513-556-3412; fax: 513-556-3390; e-mail: wangt@email.uc.edu).
Jianbo Yu is with Shanghai Jiao Tong University, Shanghai, 200240 China. He is now a visiting student at University of Cincinnati, Cincinnati, OH 45220 USA (e-mail: yjb168@sjtu.edu.cn).
David Siegel is with Department of Mechanical Engineering, University of Cincinnati, Cincinnati, OH 45220 USA (e-mail: segeldn@email.uc.edu).
Jay Lee is with Department of Mechanical Engineering, University of Cincinnati, Cincinnati, OH 45220 USA (e-mail: jay.lee@uc.edu).

I. INTRODUCTION
Remaining Useful Life (RUL) estimation is the most common task in the research field of prognostics and health management. The data-driven approach for RUL estimation normally relies on the availability of run-to-failure data, based on which the RUL can be estimated, either directly through a multivariate pattern matching process, or indirectly through damage estimation followed by extrapolation to the damage progression [1]. In this paper, we present a novel data-driven approach for RUL estimation, which also starts from damage estimation (often referred to as performance assessment), but followed by a similarity-based matching method for RUL determination.

The approach was chosen based on the following assumptions:
i) Run-to-failure historical data from multiple units of a system/component are recorded; (The term unit refers to an instance of a system/component.)
ii) The historical data covers a representative set of units of the system/component;
iii) The history of each unit ends when it reaches a failure condition, or a preset threshold of undesirable conditions, after which no more runs will be possible or desirable. (The history can start, however, from a variable degrading condition.)

Then, a library of degradation patterns can be created from these units with complete run-to-failure data (called training units). A unit whose remaining life will be predicted (called a test unit) also has its historical data recorded continuously. Instead of fitting a curve for a test unit and extrapolating it, the data will be matched to a certain life period of certain training units with the best matching scores. Finally, the RUL of the test unit can be estimated by using the real life of the matched training units minus the current life position of the test unit (Fig. 1).

The detailed methodology of the approach employed to tackle the data challenge problem (defined by 2008 PHM Data Challenge Competition [2]) is introduced in Section II.
The experimental data used for the competition is described in Section III; the procedures taken to solve the data challenge problem are elaborated in Section IV; the results are further discussed in Section V. Finally, conclusions as well as the future work on the approach are discussed in Section VI.

II. METHODOLOGY
The approach consists of two essential procedures: performance assessment and RUL estimation. Feature extraction before performance assessment is optional, since the sensor readings themselves can be considered features in some cases. Sometimes, the data collected covers various operating conditions of the system. Those conditions will be considered separately with local performance assessment models; in this case, operating regime partitioning [3] is necessary before performance assessment is conducted.
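
The matching strategy outlined in Section I (slide the test unit's history along each training unit's run-to-failure history, then read off how many cycles the best match still had left) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `match_one` and `estimate_rul` are hypothetical, each history is a plain list holding a one-dimensional degradation signal, and an unweighted sum of squared differences stands in for the distance measure.

```python
from typing import List, Sequence, Tuple


def match_one(test_hist: Sequence[float],
              train_hist: Sequence[float]) -> Tuple[int, float]:
    """Slide the test history along one run-to-failure history.

    train_hist ends at the failure cycle. Returns (rul, distance), where rul
    is the number of cycles between the end of the best-matching window and
    the end of train_hist. (Hypothetical helper, for illustration only.)
    """
    r, life = len(test_hist), len(train_hist)
    if r > life:
        raise ValueError("test history is longer than the training history")
    best_rul, best_dist = 0, float("inf")
    for rul in range(life - r + 1):
        start = life - r - rul          # window covers start .. start + r - 1
        window = train_hist[start:start + r]
        dist = sum((y - f) ** 2 for y, f in zip(test_hist, window))
        if dist < best_dist:
            best_rul, best_dist = rul, dist
    return best_rul, best_dist


def estimate_rul(test_hist: Sequence[float],
                 library: List[Sequence[float]]) -> int:
    """Return the RUL suggested by the single closest training unit."""
    usable = [h for h in library if len(h) >= len(test_hist)]
    if not usable:
        raise ValueError("no training history is long enough")
    matches = [match_one(test_hist, h) for h in usable]
    rul, _ = min(matches, key=lambda m: m[1])
    return rul


if __name__ == "__main__":
    unit_a = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]   # slow degradation
    unit_b = [10, 8, 6, 4, 2, 0]                  # fast degradation
    print(estimate_rul([8, 7, 6], [unit_a, unit_b]))  # prints 6
```

In the toy example the test history `[8, 7, 6]` coincides with cycles 3 to 5 of `unit_a`, which ran for six more cycles, so the estimate is 6. Using only the single best match is a simplification; the paper combines the estimates from several matched units.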

Fig. 1. Overview of the RUL estimation strategy. The remaining life of a test unit is estimated based on the actual life of a training unit that has the most similar degradation pattern.

A. Performance assessment
The multi-dimensional sensor readings, as well as the features extracted from the raw sensor data, are first fused to produce a single Health Indicator (HI). The process to achieve this is called performance assessment. In [4], logistic regression is used to convert the multi-dimensional features into HIs, which can then be used to predict machine performance through ARMA models. However, as found from this study, logistic regression will distort the original degradation pattern (e.g. exponential pattern) of the system, so that curve extrapolation methods based on the output of logistic regression will become less accurate. Specifically, the logistic curve is flat when the value approaches 0 or 1; therefore the HIs produced by logistic regression are less sensitive near the early and end life of the system than in the middle life, which may lead to larger prediction error when extrapolating the HI curve. To preserve the original patterns in the signal/features, a linear regression model is used as performance assessment:

$$ y = \alpha + \boldsymbol{\beta}^{T}\mathbf{x} + \varepsilon = \alpha + \sum_{i=1}^{N} \beta_i x_i + \varepsilon \tag{1} $$

where $\mathbf{x} = (x_1, x_2, \ldots, x_N)$ is the $N$-dimensional feature vector, $y$ is the health indicator, $(\alpha, \boldsymbol{\beta}) = (\alpha, \beta_1, \beta_2, \ldots, \beta_N)$ are the $N+1$ model parameters, and $\varepsilon$ is the noise term. Note that the model is actually the exponent part of a logistic regression model and the training data set can be prepared in the same way as for a logistic regression model, i.e. taking data from healthy and near-failure conditions of the system and assigning the corresponding outputs with 1 and 0 respectively. Linear regression cannot guarantee the transformation to produce an HI within the range of 0 to 1 as logistic regression does; however, this does no harm to RUL estimation.

B. RUL estimation
An intuitive method of RUL estimation is to fit a curve of the available data of a testing unit using regression models and extrapolate the curve to certain criteria indicating system failure. However, the available history of a testing unit sometimes may be short; extrapolating the fitted curve may produce large errors whereas the available run-to-failure data are not fully utilized. The same problem exists with the prediction method that employs ARMA models built on the testing data. These methods are suitable when run-to-failure data are unavailable or insufficient, but are not the best choice for the type of problems under investigation in this paper. Prediction methods based on Neural Networks can take advantage of the run-to-failure data in the model training process. However, these methods lack a systematic approach to select the structure and parameters of the Neural Networks and lack intuition for people to continuously improve their performance.

Since a representative set of units are available (refer to the assumptions in Section I), it is reasonable to first derive multiple representative degradation models from those units, find the models with similar degradation patterns as the test unit and use them as the basis for RUL estimation. The simplest way is, of course, to have one model for each training unit. The HIs calculated using (1) from each cycle of the training unit form a one-dimensional time series, which can be used to build a model that depicts the pattern of performance degradation from normal to failure. A library of models $\{M_i\}$, each from one training unit, can be established. $M_i$ is usually a deterministic model (e.g. regression models, ARMA models, etc.) that can produce an estimated output $y$ at a given time $t$:

$$ M_i : y = f_i(t), \quad -T_i \le t \le 0 \tag{2} $$

where $T_i$ is the time limit associated with the model. Note that the functions are translated along $t$ so that $t = 0$ corresponds to the last cycle before failure and $t < 0$ to all other cycles in the history. In this paper we only consider discrete time units, or cycles, which have only integer values.
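
The performance-assessment step of Section II-A can be sketched as an ordinary least-squares fit of (1) on samples labelled 1 (healthy) and 0 (near failure), followed by applying the fitted parameters to any other cycle. The code below is an illustrative sketch under stated assumptions, not the authors' code: the helper names `solve_linear`, `fit_hi_model` and `health_indicator` are hypothetical, the fit uses the normal equations with plain Gaussian elimination, and the four two-feature samples are made up.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            factor = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= factor * m[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):                 # back substitution
        tail = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_hi_model(samples: Sequence[Sequence[float]],
                 labels: Sequence[float]) -> List[float]:
    """Least-squares fit of y = alpha + sum_i beta_i x_i, as in (1).

    Returns [alpha, beta_1, ..., beta_N].
    """
    rows = [[1.0] + list(x) for x in samples]        # leading 1 carries alpha
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(n)]
    return solve_linear(xtx, xty)


def health_indicator(params: Sequence[float], x: Sequence[float]) -> float:
    """Apply the fitted linear model to one feature vector."""
    return params[0] + sum(b * xi for b, xi in zip(params[1:], x))


if __name__ == "__main__":
    # Made-up samples: two healthy cycles (label 1), two near-failure (label 0).
    samples = [(0.0, 0.0), (0.2, -0.2), (1.0, 1.0), (1.2, 0.8)]
    labels = [1.0, 1.0, 0.0, 0.0]
    params = fit_hi_model(samples, labels)
    print(health_indicator(params, (0.5, 0.5)))      # mid-life cycle, close to 0.5
```

Because the output is not clipped, a fitted HI may fall slightly outside the range 0 to 1, which matches the remark after (1) that this does no harm to RUL estimation.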
Cases in which continuous time units are considered (with variable time intervals between adjacent observations) are very similar. The form of the function in (2) is application dependent. For example, if the data clearly show an exponential degradation pattern, an exponential model can be directly used to fit the data; if no apparent patterns can be observed, other models, like the ARMA models, can be used.

A certain distance measure between a model $M_i$ and $Y = y_1 y_2 \ldots y_r$, a test unit's HI sequence from $r$ consecutive observations, has to be defined:

$$ d(\tau, Y, M_i), \quad 0 \le \tau \le T_i - r + 1 \tag{3} $$

It is a function of $\tau$, the number of cycles that the sequence $Y$ is shifted away from cycle zero of model $M_i$. The distance measure tells the similarity level that a test unit behaves as model $M_i$ at its history $\tau$. Smaller distance means higher similarity. The distance function $d(\tau, Y, M_i)$ can be defined in various ways. A simple definition can be given by Euclidean distance:

$$ d(\tau, Y, M_i) = \sum_{j=1}^{r} \frac{\left(y_j - f_i(-\tau - r + j)\right)^2}{\sigma_i^2} \tag{4} $$

where $\sigma_i^2$ is the prediction variance given by model $M_i$. In fact, $M_i$ can also be a probabilistic model (e.g. Hidden Markov Models) that gives the probability of $y$ at time/cycle $t$:

$$ M_i : p = \Pr(y \mid t, M_i), \quad -T_i \le t \le 0 \tag{5} $$

In this case $d(\tau, Y, M_i)$ can be defined as the negative logarithm of the likelihood:

$$ d(\tau, Y, M_i) = -\sum_{j=1}^{r} \log\left(\Pr(y_j \mid -\tau - r + j, M_i)\right) \tag{6} $$

Once the distance measure is defined, each model $M_i$ in the library can produce one estimated RUL for the test unit:

$$ \mathrm{RUL}_i = \arg\min_{\tau} d(\tau, Y, M_i) \tag{7} $$

At the same time, a distance score can be given to the estimation:

$$ D_i = \min_{\tau} d(\tau, Y, M_i) \tag{8} $$

The final RUL of the test unit can be estimated through a weighted sum of the obtained RULs:

$$ \mathrm{RUL} = \sum_i w_i \, \mathrm{RUL}_i, \quad \sum_i w_i = 1 \tag{9} $$

The weights $w_i$ can be assigned based on the distance score $D_i$. For example, with the $k$-nearest-neighbor method, those $w_i$ for the $k$ smallest $D_i$ can be assigned with $1/k$, whereas all other $w_i$ are assigned with 0. In reality, however, the number of nearest neighbors, as well as the way that the weights are assigned to those neighbors, depends highly on the application.

III. EXPERIMENTAL DATA
The data set, provided by the 2008 PHM Data Challenge Competition, consists of multivariate time series that are collected from multiple units of an unspecified component. Each time series is from a different instance of the same complex engineered system, e.g., the data might be from a fleet of ships of the same type. There are three operational settings that have a substantial effect on unit performance. The data for each cycle of each unit include the unit ID, cycle index, 3 values for the operational settings and 21 values for 21 sensor measurements. The sensor data are contaminated with noise.

Each unit starts with different degrees of initial degradation and manufacturing variation which is unknown. This degradation and variation is considered normal. The unit is operating normally at the start of each time series, and develops a fault at some point during the series. The data set is further divided into training and testing subsets. In the training data set (218 units), the fault grows in magnitude until system failure, at which time one or more limits for safe operation have been reached, and the unit may not be used for another operational cycle. There is no hard failure in the data set; however, the remaining useful life of the last operational cycle of each unit in the training data is considered as zero. In the testing data set, the time series ends some time prior to system failure. The objective of the problem is to predict the number of remaining operational cycles before failure in the testing data set, i.e., the number of operational cycles after the last cycle that the unit will continue to operate. A portion of the testing data set (218 units) is provided first to assist algorithm development and the rest (435 units) is released towards the end of the competition as the validation data set to score the algorithm.

The score for one prediction is defined as the exponential penalty to the prediction error; and the score of an algorithm is defined as the total score $S$ from all the predictions for the $K$ units in the testing data set (defined by the Competition):

$$ d_i = \text{estimated } \mathrm{RUL}_i - \text{actual } \mathrm{RUL}_i, \qquad S_i = \begin{cases} e^{-d_i/13} - 1, & d_i \le 0 \\ e^{d_i/10} - 1, & d_i > 0 \end{cases} \quad i = 1, \ldots, K, \qquad S = \sum_{i=1}^{K} S_i \tag{10} $$

As we can see, the penalty function is asymmetric, with late predictions penalized more heavily than early predictions. Lower scores are better; a perfect algorithm would score zero.

IV. PROCEDURES
The methodology described in Section II is expanded in more detail when applied to the experimental data. Seven procedures are developed, and are divided into two stages, training (model development) and testing (RUL estimation), as shown in Fig. 2.

Fig. 2. Procedures of RUL estimation for the Data Challenge problem.

A. Training stage
The training stage includes four procedures applied to the training data set.

1) Operating regime partitioning
A quick observation of the sensor data shows that the data exhibit no prominent trend along the life of a unit if the operating settings, indicated by three variables, are not differentiated. The 3-D plot of the three variables shows that the data are concentrated in six clusters, indicating six discrete operating conditions, or regimes (Fig. 3). This observation voids the need of any sophisticated clustering techniques for operating regime partitioning, since the value of the first operational setting is actually enough to distinguish the six operating regimes. Now, each cycle of a unit can be labeled by a regime ID from 1 to 6, replacing the original three variables of operational settings.

Fig. 3. Operational settings of all units. The six dots are actually six highly concentrated clusters that contain thousands of sample points each. These clusters indicate six discrete operating conditions of the system.

2) Sensor selection
To be consistent with the models in the form of (2), the cycle indices $C$ of the data for each unit are rearranged: $C^{adj} = C - \text{Unit Life}$. The last cycle of a unit always has the index 0 whereas all the previous cycles have negative cycle indices. In this way, data from all units plotted on a single graph can now show the trend of system degradation (Fig. 4).

Fig. 4. Selected sensors in selected regimes. (a) A sensor with consistent degradation pattern among all units. (b) A sensor with different degradation patterns among the units.

Sensor selection starts from observation in each operating regime. A few sensors have single or multiple discrete values, from which it is hard to find any trace of system degradation. Most of the sensors with continuous values exhibit a monotonic trend during the lifetime of the units. However, some of them show inconsistent end-life trends among the different units in the training data set as shown in Fig. 4-b, which might indicate, for example, different failure modes of the system. It might be possible to first classify the units by failure modes based on these sensors and then process them using different prediction models; this strategy, however, will encounter two challenges. First, the end-life readings of these sensors spread out over a large range, which makes it hard to quantize the failure modes without extra information.
Second, the failure modes might not be unambiguously identifiable, if not completely indiscernible, at the early age of a unit, and thus might contribute little to RUL estimation when only early history of the unit is available. Therefore, only those continuous-value sensors with a consistent trend (Fig. 4-a) are selected for further processing. These sensors are indexed by 2, 3, 4, 7, 11, 12, 15, 20 and 21.

Although these nine sensors are selected from observation in this work, it is not hard to define criteria (e.g., using a significance test for regression analysis) for an algorithm to select them automatically. The major challenge, in fact, lies in whether all of the nine sensors selected help to improve the accuracy of RUL estimation, i.e. whether using a subset of those selected sensors can actually improve the accuracy. Some sensors do not show as clear a trend as others due to high noise or their low sensitivity to degradation. Including them in the analysis may lower the accuracy of prediction. However, up to the time the results were produced in this study, sensor subset selection had not been optimized. Only two combinations from the nine sensors are tested in the experiments: one with three sensors, 7, 12 and 15, which show the clearest trend (from observation) in all six regimes, and another one with seven sensors, 2, 3, 4, 7, 11, 12 and 15, leaving out two sensors that exhibit relatively larger variance and thus a less clear trend throughout the unit's life. Experiments on the test data sets showed that the choice of seven sensors produced better overall RUL estimation than the one with only three sensors. There is no doubt that the selection of sensors can be further optimized with regard to prediction accuracy. In the following procedures of this work, the seven sensors, 2, 3, 4, 7, 11, 12 and 15, are used.

Fig. 5. Curve fitting for training units (Health Indicator versus Cycle; legend: Observations, Fitted Curve). (a) One unit (Unit 1, all operating regimes combined). (b) Ten units (fitted curves of 10 units from the training set).
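
The data preparation described in steps 1) and 2) (regime labelling from the first operational setting, re-indexing of cycles so that the last cycle is 0, and keeping the seven selected sensors) can be sketched as below. The function names and the rounding tolerance are assumptions made for illustration, and the example setting values are hypothetical, not taken from the competition data.

```python
from typing import Dict, List, Sequence

# Sensor indices (1-based) used in the remaining procedures, as listed above.
SELECTED_SENSORS = (2, 3, 4, 7, 11, 12, 15)


def adjust_cycles(cycles: Sequence[int]) -> List[int]:
    """C_adj = C - unit life: the last cycle maps to 0, earlier ones are negative."""
    life = max(cycles)
    return [c - life for c in cycles]


def label_regimes(first_setting: Sequence[float], decimals: int = 0) -> List[int]:
    """Turn the first operational setting into regime IDs 1, 2, ...

    Assumes (hypothetically) that readings of one regime agree after rounding
    to `decimals` places; distinct rounded levels are numbered in ascending order.
    """
    rounded = [round(v, decimals) for v in first_setting]
    levels: Dict[float, int] = {
        level: k + 1 for k, level in enumerate(sorted(set(rounded)))
    }
    return [levels[v] for v in rounded]


def select_sensors(readings: Sequence[float]) -> List[float]:
    """Keep the seven selected sensors from one cycle's 21 sensor readings."""
    return [readings[i - 1] for i in SELECTED_SENSORS]


if __name__ == "__main__":
    print(adjust_cycles([1, 2, 3, 4]))                             # [-3, -2, -1, 0]
    print(label_regimes([0.002, 10.004, 20.001, 0.0007, 10.0]))    # [1, 2, 3, 1, 2]
    print(select_sensors(list(range(1, 22))))                      # [2, 3, 4, 7, 11, 12, 15]
```

With six tightly concentrated clusters, a lookup of this kind is enough; for continuous operating conditions a clustering step would be needed instead.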
3) Performance assessment
The selected seven sensors $\mathbf{x} = (x_1, x_2, \ldots, x_7)$ are used to build six linear models in the form of (1), one for each operating regime. The sensor values are used directly without extracting other features. Those models obtained can then transform the sensor data into the HI $y$; meanwhile, the signals in each regime are scaled to a similar range so that they can be merged again to form a time series for the unit's complete life history.

The sample set $\Omega = \{(\mathbf{x}, y)\}$ required to fit the models consists of selected samples $\mathbf{x}$ from the training data set with purposely assigned $y$ values. Those cycles near the end life of all units are selected and assigned with value 0; and those cycles in the early life of the units (which should have a long enough life) are selected and assigned with value 1:

$$ \Omega = \{(\mathbf{x}, 0) \mid C^{adj} > C_{max}\} \cup \{(\mathbf{x}, 1) \mid C^{adj} < C_{min}\} \tag{11} $$

The selection of $C_{max}$ and $C_{min}$ should create an appropriate size of sample set which contains adequate early-life and end-life samples to train the linear models, and which is also not too large to undermine the representativeness of early-life and end-life properties. In this application, $C_{max} = -5$ and $C_{min} = -300$ (very few units have history longer than 300 cycles) are selected considering that the samples in $\Omega$ have to be divided by the six operating regimes to fit six models. Once the model parameters in each regime are obtained, they will be used to transform the complete history of each unit into HI time series (Fig. 5-a), which in turn will be used for the following step.

4) Model identification
In this application, the exponential (nonlinear) regression models are used to describe the relationship between the adjusted cycle index $C^{adj}$ and the HI $y$:

$$ y = a\left(e^{b\,C^{adj} + c} - e^{c}\right) + \varepsilon \tag{12} $$

where $a$, $b$, and $c$ are the model parameters to be determined, and $\varepsilon$ is the noise term. The term $e^{c}$ is used to force the model to give $y = 0$ when $C^{adj} = 0$. The model for each training unit will have one set of parameters $(a_i, b_i, c_i)$ as well as the estimated noise variance $\sigma_i^2$ that is required by (4). The functions described in (2) can then be expressed as:

$$ M_i : y = f_i(t) = a_i\left(e^{b_i t + c_i} - e^{c_i}\right) \tag{13} $$

As found, the units have different life expectancy and wear out at different rates, which are shown in Fig. 5-b.

B. Testing stage
The testing stage includes three procedures applied to each unit in the testing data set.

1) Signal transformation
This procedure utilizes the parameters found from the first three procedures in the training stage. For each unit in the testing data set, the selected sensors' data will be classified by operating regimes, transformed by the linear models for performance assessment obtained during training, and merged to obtain an HI sequence $Y$. The corresponding cycle indices use the original non-adjusted ones, i.e., $1, 2, \ldots, r$.

2) Distance evaluation and RUL estimation
In this study, the HI sequence $Y$ is first filtered using the moving average method. Then the distances between a test unit and each of the models $\{M_i\}$ are evaluated using (4), (13) and (8); the RULs are obtained using (7) and form a RUL pool.

Fig. 6. RUL estimation from the best matched training units that have run-to-failure history. Each curve represents the degradation pattern of one unit. The final RUL of the test unit is estimated based on the RULs given by each matched training unit. (Plot title: Test unit 18: Age = 135; RUL = 76. Legend: test unit, best matches from training units, outlier. Horizontal axis: Cycle.)

3) RUL fusion
A single, final RUL will be calculated from the RUL pool in three steps: candidate selection, outlier removing, RUL determination.

i) Candidate selection
All the RULs are first sorted by the distance scores given by (8) in ascending order.
Top-ranking RULs are selected. A cutoff distance score is set to a 25% increase of the smallest score using the constraint $D_i \le 1.25\,D_1$. If too few RULs remain, a fixed number of top-ranking RULs will be selected.

ii) Outlier removing
RUL estimation for a unit with short history tends to produce great uncertainty or variance, which means unreasonably long or short estimations are likely to appear in this situation. Therefore, those exceptionally long RULs (e.g. larger than 190 cycles) will be removed; those short RULs that make a test unit's total life (RUL + the current life of the test unit) exceptionally short (e.g. 125 cycles) will be removed, too, due to the fact that a unit's life is less likely to go below a certain limit as shown by the statistics of the training units' true life. Those RULs that passed the above criteria are illustrated in Fig. 6. Each curve represents one unit and gives one RUL estimation. Then, more outliers in the RULs, if enough are left, will be removed using the constraint $Q_{0.5} - 3(Q_{0.5} - Q_{0.25}) < \mathrm{RUL}_i < Q_{0.5} + 2(Q_{0.75} - Q_{0.5})$, where $Q_{0.25}$, $Q_{0.5}$, and $Q_{0.75}$ are the first, second and third quartiles of the RULs left from the previous outlier-removing process. Note that the outlier condition for larger RULs is stricter than for smaller RULs.

iii) RUL determination
The final RUL is computed from a weighted sum of the RULs that pass the previous step. Considering the scoring method given in (10), averaging all RUL estimations is inappropriate. Instead, the following weighting method is used, which emphasizes the upper and lower boundary only (considering the exponential penalty to prediction errors).

$$ \mathrm{RUL} = (13/23)\,\min_i(\mathrm{RUL}_i) + (10/23)\,\max_i(\mathrm{RUL}_i) \tag{14} $$

V. RESULTS AND DISCUSSIONS
The RUL estimation method is tested and tuned using the first batch of testing data (218 units) provided by the PHM Competition. The actual remaining life of these units is kept unknown and only the total score of 218 estimated RULs is computed and returned to the user at each submission of results. Therefore, the average prediction error measured in cycles is unavailable.
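
Since only the score in (10) is available as feedback, it helps to see how that penalty and the fusion rule (14) interact. The sketch below implements both as written; the function names are hypothetical and the candidate values 60 and 83 are made up. One way to read the weights 13/23 and 10/23, offered here as an observation and not stated in this form in the paper, is that they make the penalty equal whether the true RUL sits at the smallest or the largest candidate.

```python
import math
from typing import Sequence


def penalty(d: float) -> float:
    """Score (10) for one prediction, with d = estimated RUL - actual RUL."""
    if d <= 0:
        return math.exp(-d / 13.0) - 1.0   # early (or exact) prediction
    return math.exp(d / 10.0) - 1.0        # late prediction, penalized harder


def total_score(estimated: Sequence[float], actual: Sequence[float]) -> float:
    """Total score S over all units; lower is better, 0 is perfect."""
    return sum(penalty(e - a) for e, a in zip(estimated, actual))


def fuse(candidate_ruls: Sequence[float]) -> float:
    """Fusion rule (14): weight only the smallest and largest candidate."""
    return (13.0 / 23.0) * min(candidate_ruls) + (10.0 / 23.0) * max(candidate_ruls)


if __name__ == "__main__":
    print(round(penalty(60), 2))    # 402.43, late by 60 cycles
    print(round(penalty(10), 2))    # 1.72, late by 10 cycles
    print(round(penalty(-10), 2))   # early by 10 cycles costs less than late by 10

    fused = fuse([60, 72, 83])      # made-up candidate pool
    # If the truth is the smallest candidate the estimate is late by 10 cycles;
    # if it is the largest, the estimate is early by 13 cycles. Both cost the same.
    print(round(fused, 6), round(penalty(fused - 60), 2), round(penalty(fused - 83), 2))
```

For any pool with minimum $m$ and maximum $M$, the two boundary penalties both equal $e^{(M-m)/23} - 1$, which is why a plain average (which would sit too high for the asymmetric penalty) is avoided.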
The average prediction error measured in scores defined in (10), however, is not intuitive, because several large errors among the many predictions will completely dominate the final score. For example, a late prediction of 60 cycles will give a penalty score of 402.43, which could happen when only a very short history (e.g. 25 cycles) is available. As comparison, a late prediction of 10 cycles gives only a 1.72 penalty score. As the scoring mechanism defined in (10) doesn't provide a penalty limit for a single prediction, it is rather risky to make a large prediction error for those units with short history. It is also found that large errors of both early prediction and late prediction are possible. The authors believe that, in order to mitigate the risk, a certain risk model (e.g. from Bayesian decision theory) can be derived from the distribution of the actual life of the training units to adjust the RUL estimations made by the algorithms. Until this work is reported, however, the method used for RUL adjustment is rule-based and the parameters are determined from experiments. For example, an RUL larger than 135 is adjusted to 135 directly to reduce risk.

The RUL adjustment rules, as well as the RUL fusion procedures discussed in a previous section, were developed based on the experiments on the first portion of the testing data set released along with the training data set. Near the end of the competition, the algorithm and rules were applied to the validation data set which contains 435 samples (the validation data set did not allow a trial-and-error type of submission). A total score of 5636.06 is achieved, which is the overall best in the competition.

VI. CONCLUSION AND FUTURE WORK
The approach presented in this paper is very effective for the data set provided by the competition, and is expected to demonstrate similar performance for applications under the same assumptions made in Section I. The approach, however, has great potential for improvement in, for example, automating parameter selection and generalization for other prognostics situations. The following future work could be pursued:
i) Automate operating regime partitioning. Clustering algorithms, although seemingly unnecessary due to discrete operating conditions, can be used in this application to automate the process; for problems with continuous operating conditions, however, they are a must.
ii) Automate sensor selection. The sensor candidates can be first filtered through unsupervised feature selection methods, e.g. test of significance for fitting desired regression models. Then the optimal sensor subset can be determined through supervised feature selection methods, using the overall prediction score as the selection criterion.
iii) Use analytical methods to determine the parameters for RUL fusion. Some of the parameters, such as the upper and lower limits for candidate RULs, can be determined by investigating the statistics of the training units' actual life. Others can be optimized, either using the overall prediction score as the target function (if possible), or using a risk model built on the penalty of large prediction errors. The risk model can be used to make the adjustment to the final RUL estimation, which is now largely determined through experiment.
iv) Test other models than an exponential model within the framework of the approach. Both deterministic models and probabilistic models can be used. Other data sets that do not exhibit clear exponential degradation patterns can be employed.

REFERENCES
[1] K. Goebel, B. Saha, and A. Saxena, A comparison of three data-driven techniques for prognostics, Failure Prevention for System Availability, 62nd meeting of the MFPT Society - 2008, pp. 119-131.
[2] 2008 PHM data challenge competition, [Online] http://www.phmconf.org/ocs/index.php/phm/2008/challenge
[3] T. Wang and J. Lee, The operating regime approach for precision health prognosis, Failure Prevention for System Availability, 62nd meeting of the MFPT Society - 2008, pp. 87-98.
[4] J. Yan, M. Koc, and J. Lee, A prognostic algorithm for machine performance assessment and its application, Production Planning & Control, Vol. 15, No. 8, 2004, pp. 796-801.