Non-Negative Matrix Factorization and Support Vector Data Description Based One Class Classification


IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 5, No 2, September 2012. ISSN (Online): 1694-0814. www.IJCSI.org

Liyong Ma 1, Naizhang Feng 2 and Qi Wang 3

1 School of Information & Electrical Engineering, Harbin Institute of Technology at Weihai, Weihai, 264209, China
2 School of Information & Electrical Engineering, Harbin Institute of Technology at Weihai, Weihai, 264209, China
3 Department of Automatic Measurement and Control, Harbin Institute of Technology, Harbin, 150001, China

Abstract

One-class classification is widely used in many applications. In one-class classification, only one target class is well characterized by instances in the training data; no instances are available for the other, non-target classes, or the few that are present cannot form a statistically representative sample of the negative concept. A two-step paradigm employing non-negative matrix factorization (NMF) and support vector data description (SVDD) is developed for one-class classification training on non-negative data. First, a projected-gradient NMF method is used to find the structure hidden in the training instances, and the training instances are projected into a new feature space. Second, SVDD is employed to perform one-class classification training on the projected feature data. Classification examples demonstrate that the proposed method is superior to the principal component analysis (PCA) based SVDD method and to other standard one-class classifiers.

Keywords: Non-Negative Matrix Factorization, Support Vector Data Description, One Class Classification.

1. Introduction

In recent years there has been considerable interest in one-class classifiers. One-class classification was originally proposed for object recognition applications [1]. In one-class classification problems, only one target class is well characterized by instances in the training data; no instances are available for the other, non-target classes, or the few that are present cannot form a statistically representative sample of the negative concept [2-4]. This is the case in many classification applications, such as fault diagnosis and object identification [1, 4]. In fault diagnosis it is easy to obtain instances of normal operation, but there are few or no instances with which to model the fault class. In object identification it is very challenging to collect non-object samples when training the machine to learn an object, because too many such samples exist and it is hard to represent the negative concept uniformly. One-class classifiers are more difficult to build than conventional multi-class or binary classifiers, because only the boundary or density of the target class can be estimated when negative sample data are absent or limited in their distribution.

One-class classifiers generally fall into three main types: density estimation, reconstruction, and boundary estimation approaches. Two classical methods for one-class classification are the density estimation method and the reconstruction method [2]. Gaussian data description, mixture-of-Gaussians data description, and Parzen data description are well-known density estimation approaches. Several reconstruction approaches have also been developed, such as principal component analysis (PCA) data description and auto-encoder neural network data description. Another important approach is to estimate a boundary around the target instances. Recently, the support vector data description (SVDD) approach has been developed to distinguish the target class from the others in the pattern space [2, 5].
SVDD computes the hypersphere of minimum radius around the target-class data in the pattern space, so as to encompass almost all the target instances and exclude the non-target ones. There is an extensive literature on the implementation and application of SVDD; however, little research has investigated data preprocessing for SVDD. As is well known, one-class classification requires a large number of instances for training [6]. In addition, it is

difficult to decide which feature set gives the best separation between the target class and the non-target classes. Dimension reduction and feature selection are important for one-class classification [7]. PCA preprocessing has been reported to improve one-class classifier performance [8]. In fact, in many real-life applications the feature data are non-negative. Non-negative matrix factorization (NMF) is superior to PCA for non-negative data because it employs a non-negativity constraint that accords with the practical meaning of the data [9, 10]. NMF has the ability to find hidden data structure and has been used successfully for feature extraction in several applications. In this paper an NMF- and SVDD-based one-class classifier is developed. Experimental results demonstrate that the performance of the proposed classifier is superior to the PCA-based SVDD classifier and to other one-class classifiers.

2. NMF & SVDD Based One-Class Classifier

We first introduce support vector data description in this section. Given an instance set $X = [x_{ld}] \in R^{L \times D}$, where $L$ is the number of samples and $D$ is the number of features, the $i$-th sample is denoted $x_i$. SVDD identifies a hypersphere of minimum volume containing all or most of the instance samples. The hypersphere is characterized by its center $c$ and radius $R$ in the new feature space. The objective of minimum volume is achieved by minimizing $R^2$; this constrained optimization problem can be formulated as

$$\min F(R, c) = R^2, \quad \text{s.t.}\ \|\varphi(x_i) - c\|^2 \le R^2,\ i = 1, \dots, L \qquad (1)$$

where $\varphi(\cdot)$ maps the feature data into a new feature space and $\|\cdot\|$ is the $L_2$-norm. To allow for outliers in the training set, slack variables are introduced:

$$\min F(R, c, \xi) = R^2 + C \sum_{i=1}^{L} \xi_i, \quad \text{s.t.}\ \|\varphi(x_i) - c\|^2 \le R^2 + \xi_i,\ \xi_i \ge 0,\ i = 1, \dots, L \qquad (2)$$

where $C$ is the penalty coefficient for outliers and $\xi_i$ is the distance between the $i$-th instance and the hypersphere. It is also possible to use a kernel $K(u, v)$ to represent the inner product. The Gaussian kernel $K(x, z) = \exp(-\|x - z\|^2 / s^2)$ is known to be an efficient kernel for SVDD [5], and it is used throughout this paper. The problem above can be solved by optimizing the following dual problem after introducing Lagrange multipliers:

$$\max \sum_{i=1}^{L} \alpha_i K(x_i, x_i) - \sum_{i=1}^{L} \sum_{j=1}^{L} \alpha_i \alpha_j K(x_i, x_j), \quad \text{s.t.}\ \sum_{i=1}^{L} \alpha_i = 1,\ \alpha_i \in [0, C],\ i = 1, \dots, L \qquad (3)$$

A quadratic programming algorithm can be employed to solve this problem. There are three types of training instances, depending on whether $\alpha_i = 0$, $0 < \alpha_i < C$, or $\alpha_i = C$. When $\alpha_i = 0$, the instance lies within the hypersphere. When $0 < \alpha_i < C$, the instance lies on the hypersphere boundary. When $\alpha_i = C$, the instance lies outside the hypersphere and has nonzero $\xi_i$. Instances with $\alpha_i \ne 0$ are called support vectors (SV). The hypersphere center is obtained as

$$c = \sum_{x_i \in SV} \alpha_i \varphi(x_i) \qquad (4)$$

The squared radius $R^2$ can be calculated as the squared distance between $c$ and any support vector $x$ on the ball boundary:

$$R^2 = K(x, x) - 2 \sum_{x_i \in SV} \alpha_i K(x_i, x) + \sum_{x_i \in SV} \sum_{x_j \in SV} \alpha_i \alpha_j K(x_i, x_j) \qquad (5)$$

After training, the resulting SVDD hypersphere can be used for one-class classification. During classification, the sign of the following function is used to judge whether an instance lies inside the SVDD hypersphere:

$$D(x) = \mathrm{sgn}\big(R^2 - \|\varphi(x) - c\|^2\big) = \mathrm{sgn}\Big(z + 2 \sum_{x_i \in SV} \alpha_i K(x_i, x) - K(x, x)\Big) \qquad (6)$$

where

$$z = R^2 - \sum_{x_i \in SV} \sum_{x_j \in SV} \alpha_i \alpha_j K(x_i, x_j) \qquad (7)$$

A positive sign implies that the tested instance lies within the SVDD hypersphere.
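To make the training and decision steps above concrete, the following is a minimal NumPy/SciPy sketch of Eqs. (3)-(7). It is illustrative only: a generic SLSQP solver stands in for a dedicated quadratic-programming routine, the values of C and the kernel width s are example choices, and the function names are ours, not the DDTools implementation used in the experiments.

```python
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(X, Z, s=2.0):
    """K(x, z) = exp(-||x - z||^2 / s^2), the kernel used in the paper."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / s ** 2)

def svdd_fit(X, C=0.5, s=2.0):
    """Solve the SVDD dual of Eq. (3) with a generic constrained solver."""
    L = X.shape[0]
    K = gaussian_kernel(X, X, s)
    kdiag = np.diag(K)
    # maximizing sum_i a_i K(x_i, x_i) - a' K a == minimizing a' K a - kdiag' a
    res = minimize(
        lambda a: a @ K @ a - kdiag @ a,
        np.full(L, 1.0 / L),
        jac=lambda a: 2.0 * (K @ a) - kdiag,
        bounds=[(0.0, C)] * L,
        constraints={"type": "eq", "fun": lambda a: a.sum() - 1.0},
    )
    return res.x

def svdd_predict(alpha, X, x_new, C=0.5, s=2.0, tol=1e-6):
    """Sign of Eq. (6): +1 means x_new falls inside the hypersphere."""
    sv = alpha > tol                      # support vectors: alpha_i != 0
    a, Xs = alpha[sv], X[sv]
    Kss = gaussian_kernel(Xs, Xs, s)
    # R^2 from Eq. (5), using a boundary support vector (0 < alpha_i < C);
    # this sketch assumes at least one such instance exists
    xb = X[(alpha > tol) & (alpha < C - tol)][0]
    R2 = (1.0 - 2.0 * a @ gaussian_kernel(Xs, xb[None, :], s)[:, 0]
          + a @ Kss @ a)                  # K(x, x) = 1 for the Gaussian kernel
    z = R2 - a @ Kss @ a                  # Eq. (7)
    k_new = gaussian_kernel(Xs, x_new[None, :], s)[:, 0]
    return np.sign(z + 2.0 * a @ k_new - 1.0)
```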
Next we give a brief summary of non-negative matrix factorization and then describe our one-class classification method. Non-negative matrix factorization was originally proposed by Paatero and Tapper [9].

Given the observation matrix $X = [x_{ld}] \in R^{L \times D}$ and a lower rank $J$, NMF finds non-negative factors $A = [a_{lj}] \in R^{L \times J}$ and $T = [t_{jd}] \in R^{J \times D}$ such that

$$X \approx AT \qquad (8)$$

Non-negativity constraints are applied to $A$ and $T$ during the decomposition, that is, $a_{lj} \ge 0$ and $t_{jd} \ge 0$. NMF became popular after the simple multiplicative update rule provided by Lee and Seung [10], and several other algorithms have been developed since [11, 12]. NMF has been used successfully in a variety of real-world applications, such as pattern recognition and data mining. NMF can be solved by minimizing the difference between $X$ and $AT$ in terms of the squared Euclidean distance:

$$f(A, T) = \|X - AT\|_F^2 / 2 \qquad (9)$$

where $\|\cdot\|_F$ denotes the Frobenius norm. Alternating non-negative least squares is an efficient algorithm for this problem, using block coordinate descent for bound-constrained optimization. After initializing non-negative $A^0$ and $T^0$, the following update rule is employed:

$$A^{k+1} \leftarrow \arg\min_A f(A, T^k), \qquad T^{k+1} \leftarrow \arg\min_T f(A^{k+1}, T) \qquad (10)$$

where $k = 0, 1, 2, \dots$. A projected gradient method for alternating non-negative least squares [12] can be used to keep all the elements non-negative. This method shows very fast convergence and is used in this paper.

In many real-world applications only non-negative data have practical physical meaning, and the underlying components found by a non-negative decomposition admit a physical interpretation; examples include sensor measurements of distance and volume, or housing prices in a market. Non-negative decomposition thus provides an efficient tool for extracting the relevant parts from the data. In this paper we employ NMF in a one-class classifier to find the local patterns hidden in the training data, and we expect the non-negativity constraints to give a natural representation of the training data in terms of decomposed components. It is also important that NMF is an additive model that does not allow subtraction; NMF therefore describes the entire entity with its decomposed parts, that is, NMF is a parts-based representation. In our one-class classification application a zero value represents the absence, and a positive value the presence, of a decomposed component. This additive nature of NMF is expected to yield a new basis for the data features.

This view of NMF leads to a two-stage one-class classification method. First, the projected gradient method is used to perform NMF on the training data $X$, yielding the basis $A$ and the projection $T$ with $X \approx AT$. $A$ is the new basis, which encodes the structure and component information hidden in the training data $X$; $T$ is the projection of the training data $X$ onto the basis $A$ and provides richer feature information for one-class classification. Second, SVDD is employed to perform one-class classification training on $T$ and the corresponding labels. After training, the trained SVDD can be used for one-class classification. When classifying test data $Q$, the test data features $P$ are obtained by projecting $Q$ onto the basis $A$ with $Q \approx AP$, and the final classification results are obtained by applying the trained SVDD to the test data features $P$.
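The two-stage procedure can be sketched with off-the-shelf components as follows; this is not the authors' code. scikit-learn's NMF solver stands in for the projected-gradient alternating non-negative least squares of [12], and OneClassSVM (the nu-SVM formulation, closely related to SVDD with a Gaussian kernel) stands in for SVDD. J = 6 follows the wine experiment, while nu is an illustrative value; following scikit-learn's convention, the per-sample non-negative encoding W plays the role of the projected features.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.svm import OneClassSVM

def fit_nmf_svdd(X_train, J=6, nu=0.1):
    """Stage 1: factor the non-negative training data as X ~ W H (Eq. 8);
    Stage 2: train a one-class model on the projected features."""
    nmf = NMF(n_components=J, init="nndsvda", max_iter=500)
    W = nmf.fit_transform(X_train)        # per-sample non-negative encodings
    occ = OneClassSVM(kernel="rbf", nu=nu, gamma="scale").fit(W)
    return nmf, occ

def classify(nmf, occ, Q):
    """Project test data onto the learned basis (a non-negative least-squares
    fit with the basis held fixed), then classify: +1 target, -1 outlier."""
    P = nmf.transform(Q)
    return occ.predict(P)

# illustrative call on random non-negative data
X = np.abs(np.random.default_rng(0).normal(size=(100, 13)))
model = fit_nmf_svdd(X)
labels = classify(*model, X[:5])
```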
3. Experiments and Discussion

We first introduce the error measures used to evaluate classification performance in this section. False positive rate (FPR) and false negative rate (FNR) are commonly employed as error measures for classification. FPR is the ratio of the number of non-target instances mistakenly classified as target to the total number of non-target instances. Similarly, FNR is the ratio of the number of target instances mistakenly classified as non-target to the total number of target instances. A good one-class classifier has both a small FPR and a small FNR. Recall (RC) is also widely used to measure classification accuracy; it is defined as the ratio of the number of target instances correctly predicted to the total number of target instances. A good one-class classifier has a large recall value. The receiver operating characteristic (ROC) curve, which plots the true positive rate against the false positive rate, is often used to compare the performance of classifiers, but comparing the curves of different classifiers is not easy. The area under the ROC curve (AUC), calculated from the ROC curve values, is therefore employed to compare performance in our experiments [13]. By this definition, the larger the AUC value, the better the performance of a one-class classifier. A minimal computation sketch of these measures appears at the end of this paragraph.

In our experiments the data description toolbox (DDTools 1.7.5) [14] is used with its default parameters. The tolerance for NMF is 10, and the maximum number of iterations is 5. Seventy percent of the instances are selected as training data, and the remaining instances are used to evaluate classification performance after training.
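As a minimal rendering of the measures defined above (assuming labels coded +1 for target and -1 for outlier, scores where higher means more target-like, and a helper name of our own):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def one_class_report(y_true, y_pred, scores):
    """FPR, FNR, RC, and AUC as defined in Section 3.
    y_true, y_pred: +1 target / -1 outlier; scores: higher = more target-like."""
    target, outlier = y_true == 1, y_true == -1
    fpr = np.mean(y_pred[outlier] == 1)    # outliers accepted as target
    fnr = np.mean(y_pred[target] == -1)    # targets rejected as non-target
    rc = np.mean(y_pred[target] == 1)      # recall, equal to 1 - FNR
    auc = roc_auc_score(y_true, scores)
    return fpr, fnr, rc, auc
```

With the pipeline sketched earlier, occ.decision_function(P) can serve as the score input.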

3.1 Comparison with Other One-Class Classifiers

Databases with non-negative data that are widely used for classification tests in the literature [16, 17] were employed to test the performance of the proposed algorithm. Results on the wine recognition database and the Boston housing database from the University of California Irvine machine learning repository [15] are reported here. The wine recognition database is used to determine the origin of wines from chemical analysis; class 1 is regarded as the target class. There are 59 target instances and 119 outlier instances with 13 features in the wine recognition database. The Boston housing database is used to predict housing prices in the suburbs of Boston; the class whose median price is less than 35000 dollars is regarded as the target class. There are 458 target instances and 48 outlier instances with 13 features in the Boston housing database. In our experiments, 70 percent of the target and outlier instances are randomly selected from the whole data for classifier training, and the other 30 percent are used for testing and evaluation.

Different one-class classifiers were evaluated on the wine database and the housing database: Gaussian (Gauss), mixture of Gaussians (MxGauss), PCA, SVDD, Parzen, auto-encoder neural network (AENN), and the proposed one-class classifier. Experimental results for the wine database and the housing database are listed in Table 1 and Table 2, respectively. The factorization size J in equation (8) of NMF is selected as 6 for the wine database and 5 for the housing database.

In Table 1, both the SVDD method and the Parzen method accurately classify all the target instances of the wine database, but both have a bad FPR value; these two methods classify most outlier instances as target. This wrong result is caused by insufficient information in the training set to correctly estimate the classifier parameters. The misclassification rates for target and outlier instances of Gauss, MxGauss, PCA, and AENN are higher than 11 percent, while all misclassification rates of the proposed method are less than 6 percent. The proposed method obtains the lowest total misclassification rate and the greatest recall value. The AUC value gives the overall performance evaluation; the proposed method obtains the greatest AUC value and thus has superior performance. Similar results can be found in Table 2 for the housing database. The proposed method obtained the best performance of all the methods compared in our experiments.

Table 1: Comparison of one-class classifiers for the wine database

  Classifier   FPR      FNR      RC       AUC
  Gauss        0.09     1.0000   0.9708   0.641
  MxGauss      0.0511   1.0000   0.9489   0.6554
  PCA          0.0438   0.986    0.956    0.6481
  SVDD         0.9781   0.0000   1.0000   0.5615
  Parzen       0.848    0.0000   0.175    0.5865
  AENN         0.0438   0.7857   0.956    0.5553
  Proposed     0.190    0.486    0.7810   0.8186

Table 2: Comparison of one-class classifiers for the housing database

  Classifier   FPR      FNR      RC       AUC
  Gauss        0.1765   0.86     0.835    0.9109
  MxGauss      0.941    0.1714   0.7059   0.8588
  PCA          0.1765   0.749    0.835    0.6084
  SVDD         1.0000   0.0000   0.0000   0.6151
  Parzen       0.7647   0.0000   0.353    0.7647
  AENN         0.1176   0.3143   0.884    0.8336
  Proposed     0.0588   0.086    0.941    0.9950

3.2 Comparison with the PCA-Based SVDD Classifier

PCA is often used for feature extraction before classification.
It captures the data variance in the squared-error sense and maps the data into an orthonormal subspace. In practical computation, eigenvalue decomposition is used to obtain the eigenvectors of the target covariance matrix. The eigenvectors corresponding to the largest eigenvalues are taken as the principal components; they are the principal axes in the directions of largest variance, and they form an orthonormal basis for data mapping.
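As a concrete rendering of the PCA-based baseline compared in this subsection, the following sketch selects the number of components to retain 90 percent of the total variance, as in the experiments below; the one-class model and its parameters are illustrative stand-ins of our own, not the DDTools configuration.

```python
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

def fit_pca_occ(X_train, variance=0.90, nu=0.1):
    """Keep enough eigenvectors of the covariance matrix to explain
    `variance` of the total variance, then train a one-class model."""
    pca = PCA(n_components=variance)   # fractional value = variance target
    Z = pca.fit_transform(X_train)
    occ = OneClassSVM(kernel="rbf", nu=nu, gamma="scale").fit(Z)
    return pca, occ
```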

The number of orthonormal basis vectors is chosen to represent the data variance. One-class classifiers combining PCA feature extraction with SVDD are also compared with the proposed method in our experiments. Feature selection based on classical PCA and on kernel PCA (KPCA) [8] is tested; the number of features was optimized automatically so that the retained total variance is 90 percent.

Experimental results for the wine database and the housing database are listed in Table 3 and Table 4, respectively. The overall performance evaluation provided by the AUC value shows that PCA- and kernel-PCA-based SVDD classifiers can improve classification performance; however, their improvements are small compared with the proposed method.

Table 3: Comparison with PCA-based SVDD for the wine database

  Classifier   FPR      FNR      RC       AUC
  SVDD         1.0000   0.0000   0.0000   0.6151
  PCA+SVDD     0.941    0.086    0.0588   0.6630
  KPCA+SVDD    1.0000   0.0000   0.0000   0.606
  Proposed     0.0588   0.086    0.941    0.9950

Table 4: Comparison with PCA-based SVDD for the housing database

  Classifier   FPR      FNR      RC       AUC
  SVDD         0.9781   0.0000   0.019    0.5615
  PCA+SVDD     0.5547   0.3571   0.4453   0.598
  KPCA+SVDD    1.0000   0.0000   0.0000   0.6019
  Proposed     0.190    0.486    0.7810   0.8186

The data feature distribution of the housing database is plotted in Fig. 1. The two principal features obtained by PCA analysis of the source data, the PCA-processed data, and the NMF-processed data are plotted in Fig. 1(a), 1(b) and 1(c), respectively.

[Fig. 1: Data feature plot of the housing database. (a) Source data. (b) Data after PCA processing. (c) Data after NMF processing.]

Some outlier instances of the source data lie in the high-density area of the target instances, as shown in Fig. 1(a); the situation is similar for the PCA-processed data, as shown in Fig. 1(b). For the NMF-processed data, however, most outlier instances are distributed in the low-density area of the target instances, as shown in

Fig. 1(c). This shows that the new feature space obtained with NMF-based processing is more suitable for one-class classification. PCA decomposition satisfies an orthogonality constraint and cannot guarantee that the decomposition results are non-negative. NMF satisfies the non-negativity constraint and is better suited to finding hidden structure in non-negative data. This is why NMF is superior to PCA in one-class classification of non-negative data.

3.3 Comparison of Different Factorization Sizes

Deciding the factorization size $J$ in equation (8) is important for NMF. It is known that when the size of $X$ is $L \times D$, $J$ must satisfy $J \le (L \cdot D)/(L + D)$; for example, for $L = 100$ training samples with $D = 13$ features, $J$ may not exceed $1300/113 \approx 11$. The proposed method was run with different values of $J$ on the wine database and the housing database, and the results are listed in Table 5 and Table 6, respectively. Compared with the AUC value 0.6151 of the SVDD method in Table 1 for the wine database, the AUC values for all values of $J$ in Table 5 are greater. This means the proposed NMF-based method effectively improves the performance of SVDD for one-class classification. The same can be verified in Table 6 for the housing database, where the AUC values for all values of $J$ are likewise greater than for the original SVDD method. The best choice of $J$ is application dependent, and parameter optimization can be employed to select $J$.

Table 5: Comparison of different factorization sizes for the wine database

  Size of J   FPR      FNR      RC       AUC
  J=5         0.1176   0.149    0.884    0.9664
  J=6         0.0588   0.088    0.941    0.9950
  J=7         0.1765   0.149    0.835    0.959
  J=8         0.1765   0.149    0.835    0.9513
  J=9         0.0588   0.4000   0.941    0.6840
  J=10        0.1765   0.5714   0.835    0.6975

Table 6: Comparison of different factorization sizes for the housing database

  Size of J   FPR      FNR      RC       AUC
  J=5         0.190    0.486    0.7810   0.8186
  J=6         0.3066   0.3571   0.6934   0.749
  J=7         0.044    0.486    0.7956   0.8175
  J=8         0.1606   0.649    0.8394   0.7044
  J=9         0.68     0.3571   0.737    0.8149
  J=10        0.1898   0.857    0.810    0.7753

4. Conclusions

A two-stage method for one-class classification of non-negative data employing NMF and SVDD is proposed. NMF is used to project sample instances into a new feature space before SVDD is employed for classification. This hybrid method has several advantages. First, NMF is more efficient than PCA at finding hidden structure in non-negative data, and the feature space produced by NMF is appropriate for SVDD. Second, the proposed method is superior to other classical one-class classifiers.

Acknowledgments

This work is partially supported by the China Postdoctoral Science Foundation (20100471057), the Shandong Provincial Natural Science Foundation (ZR2011FM005), the Promotive Research Fund for Excellent Young and Middle-Aged Scientists of Shandong Province (BS2010DX001), the Science and Technique Development Project of Weihai (2010-3-96), and the Key Lab Open Fund of Harbin Institute of Technology (HIT.KLOF.2012.078).

References

[1] M. Moya, M. Koch and L. Hostetler, "One-class classifier networks for target recognition applications", in Proc. World Congress on Neural Networks, 1993, pp. 791-801.
[2] D.M.J. Tax, "One Class Classification", Ph.D. thesis, Delft University of Technology, Delft, The Netherlands, 2001.
[3] P. Juszczak, D.M.J. Tax, E. Pekalska and R.P.W. Duin, "Minimum spanning tree based one-class classifier", Neurocomputing, 2009, Vol. 72, pp. 1859-1869.
[4] L. Jamal, M. Bazmara and S. Jarar, "Feature selection in imbalanced data sets", International Journal of Computer Science Issues, 2012, Vol. 9, No. 3, pp. 4-45.
[5] D.M.J. Tax and R.P.W. Duin, "Support vector domain description", Pattern Recognition Letters, 1999, Vol. 20, No. 11-13, pp. 1191-1199.
Yu, "Sngle-class classfcaton wth mappng convergence", Machne Learnng, 005, Vol. 61, No.1-3, pp. 49-69. [7] S. D. Vllalba and P. Cunnngham, "An evaluaton of dmenson reducton technques for one-class classfcaton", Artfcal Intellgence Revew, 007, Vol. 7, No. 4, pp. 73-94. [8] D.M.J. Tax and P. JuszczakA. "Kernel whtenng for oneclass classfcaton", Internatonal Journal of Pattern Recognton and Artfcal Intellgence, 003, Vol. 17, pp. 333-347. [9] U. Paatero and A. Tapper, "Postve matrx factorzaton: a nonnegatve factor model wth optmal utlzaton of error estmates of data values", Envrometrcs, 1994, Vol. 5, No., pp. 111-16. [10] D.D. Lee and H.S. Seung, "Learnng the parts of obects by nonnegatve matrx factorzaton", Nature, 1999, Vol.401, pp.788-791. Copyrght (c) 01 Internatonal Journal of Computer Scence Issues. All Rghts Reserved.

[11] C.-J. Lin, "Projected gradient methods for nonnegative matrix factorization", Neural Computation, 2007, Vol. 19, No. 10, pp. 2756-2779.
[12] A. Cichocki, R. Zdunek, A.H. Phan and S. Amari, Nonnegative Matrix and Tensor Factorizations, John Wiley, Singapore, 2009.
[13] A.P. Bradley, "The use of the area under the ROC curve in the evaluation of machine learning algorithms", Pattern Recognition, 1997, Vol. 30, No. 7, pp. 1145-1159.
[14] D.M.J. Tax, DDTools, the Data Description Toolbox for Matlab, http://prlab.tudelft.nl/david-tax/dd_tools.html, 2010.
[15] A. Frank and A. Asuncion, UCI Machine Learning Repository, http://archive.ics.uci.edu/ml, University of California, 2010.
[16] M. Breaban and H. Luchian, "A unifying criterion for unsupervised clustering and feature selection", Pattern Recognition, 2011, Vol. 44, No. 4, pp. 854-865.
[17] M. Kalakech, P. Biela, L. Macaire and D. Hamad, "Constraint scores for semi-supervised feature selection: A comparative study", Pattern Recognition Letters, 2011, Vol. 32, No. 5, pp. 656-665.

Liyong Ma received the B.Sc. degree from Harbin Institute of Technology, Harbin, China, in 1993, the M.Sc. degree from Harbin University of Science and Technology, Harbin, China, in 1996, and the Ph.D. degree from Harbin Institute of Technology in 2007. He is currently an Associate Professor at Harbin Institute of Technology at Weihai, Weihai, China. His main research areas include intelligent testing and information processing, biomedical imaging and image processing.

Naizhang Feng received the B.Sc. and Ph.D. degrees in control engineering from Harbin Institute of Technology, Harbin, China, in 1998 and 2005, respectively. From 2005 to 2007 he was a postdoctoral researcher at Fudan University, Shanghai, China. He is currently a Professor at Harbin Institute of Technology at Weihai. His research interests include signal detection and information processing and medical ultrasound imaging.

Qi Wang received the B.Sc. and M.Sc. degrees in electromagnetic measurement from Harbin Institute of Technology, Harbin, China, in 1967 and 1980, respectively. From 1985 to 1987 and from 1993 to 1995 he visited the University of Tsukuba and the Chiba Institute of Technology, Japan, as a researcher. He is currently a Professor at Harbin Institute of Technology. So far he has published more than 10 papers. His research interests include intelligent testing instrumentation, information processing, sensor fault diagnosis, and sensor fusion.