The Robust Classification for Large Data (Case: Classification of Jakarta Vegetation area by Using Remote Sensing Data )

Similar documents
Controlled Information Maximization for SOM Knowledge Induced Learning

Journal of World s Electrical Engineering and Technology J. World. Elect. Eng. Tech. 1(1): 12-16, 2012

Color Correction Using 3D Multiview Geometry

Detection and Recognition of Alert Traffic Signs

IP Network Design by Modified Branch Exchange Method

A modal estimation based multitype sensor placement method

An Unsupervised Segmentation Framework For Texture Image Queries

Segmentation of Casting Defects in X-Ray Images Based on Fractal Dimension

A New Finite Word-length Optimization Method Design for LDPC Decoder

Frequency Domain Approach for Face Recognition Using Optical Vanderlugt Filters

Multi-azimuth Prestack Time Migration for General Anisotropic, Weakly Heterogeneous Media - Field Data Examples

ADDING REALISM TO SOURCE CHARACTERIZATION USING A GENETIC ALGORITHM

Optical Flow for Large Motion Using Gradient Technique

Illumination methods for optical wear detection

Concomitants of Upper Record Statistics for Bivariate Pseudo Weibull Distribution

Lecture # 04. Image Enhancement in Spatial Domain

Strictly as per the compliance and regulations of:

A ROI Focusing Mechanism for Digital Cameras

A Two-stage and Parameter-free Binarization Method for Degraded Document Images

Image Enhancement in the Spatial Domain. Spatial Domain

A New and Efficient 2D Collision Detection Method Based on Contact Theory Xiaolong CHENG, Jun XIAO a, Ying WANG, Qinghai MIAO, Jian XUE

Point-Biserial Correlation Analysis of Fuzzy Attributes

On Error Estimation in Runge-Kutta Methods

Title. Author(s)NOMURA, K.; MOROOKA, S. Issue Date Doc URL. Type. Note. File Information

Satellite Image Analysis

Clustering Interval-valued Data Using an Overlapped Interval Divergence

SYSTEM LEVEL REUSE METRICS FOR OBJECT ORIENTED SOFTWARE : AN ALTERNATIVE APPROACH

Comparisons of Transient Analytical Methods for Determining Hydraulic Conductivity Using Disc Permeameters

Information Retrieval. CS630 Representing and Accessing Digital Information. IR Basics. User Task. Basic IR Processes

Topological Characteristic of Wireless Network

Transmission Lines Modeling Based on Vector Fitting Algorithm and RLC Active/Passive Filter Design

COLOR EDGE DETECTION IN RGB USING JOINTLY EUCLIDEAN DISTANCE AND VECTOR ANGLE

AUTOMATED LOCATION OF ICE REGIONS IN RADARSAT SAR IMAGERY

Positioning of a robot based on binocular vision for hand / foot fusion Long Han

A Novel Automatic White Balance Method For Digital Still Cameras

ANALYTIC PERFORMANCE MODELS FOR SINGLE CLASS AND MULTIPLE CLASS MULTITHREADED SOFTWARE SERVERS

A Minutiae-based Fingerprint Matching Algorithm Using Phase Correlation

Gravitational Shift for Beginners

Methods for history matching under geological constraints Jef Caers Stanford University, Petroleum Engineering, Stanford CA , USA

FINITE ELEMENT MODEL UPDATING OF AN EXPERIMENTAL VEHICLE MODEL USING MEASURED MODAL CHARACTERISTICS

Spiral Recognition Methodology and Its Application for Recognition of Chinese Bank Checks

A Texture Feature Extraction Based On Two Fractal Dimensions for Content_based Image Retrieval

Fast quality-guided flood-fill phase unwrapping algorithm for three-dimensional fringe pattern profilometry

European Journal of Operational Research

Towards Adaptive Information Merging Using Selected XML Fragments

Detection and tracking of ships using a stereo vision system

An Assessment of the Efficiency of Close-Range Photogrammetry for Developing a Photo-Based Scanning Systeminthe Shams Tabrizi Minaret in Khoy City

DISTRIBUTION MIXTURES

Drag Optimization on Rear Box of a Simplified Car Model by Robust Parameter Design

Desired Attitude Angles Design Based on Optimization for Side Window Detection of Kinetic Interceptor *

Hierarchical Region Mean-Based Image Segmentation

Computational and Theoretical Analysis of Null Space and Orthogonal Linear Discriminant Analysis

UCLA Papers. Title. Permalink. Authors. Publication Date. Localized Edge Detection in Sensor Fields.

Monte Carlo Techniques for Rendering

HISTOGRAMS are an important statistic reflecting the

Scaling Location-based Services with Dynamically Composed Location Index

A VECTOR PERTURBATION APPROACH TO THE GENERALIZED AIRCRAFT SPARE PARTS GROUPING PROBLEM

View Synthesis using Depth Map for 3D Video

IMAGERY TEXTURE ANALYSIS BASED ON MULTI-FEATURE FRACTAL DIMENSION

Experimental and numerical simulation of the flow over a spillway

RANDOM IRREGULAR BLOCK-HIERARCHICAL NETWORKS: ALGORITHMS FOR COMPUTATION OF MAIN PROPERTIES

MULTI-TEMPORAL AND MULTI-SENSOR IMAGE MATCHING BASED ON LOCAL FREQUENCY INFORMATION

arxiv: v2 [physics.soc-ph] 30 Nov 2016

Signal integrity analysis and physically based circuit extraction of a mounted

New Algorithms for Daylight Harvesting in a Private Office

Obstacle Avoidance of Autonomous Mobile Robot using Stereo Vision Sensor

COMPARISON OF CHIRP SCALING AND WAVENUMBER DOMAIN ALGORITHMS FOR AIRBORNE LOW FREQUENCY SAR DATA PROCESSING

Assessment of Track Sequence Optimization based on Recorded Field Operations

INFORMATION DISSEMINATION DELAY IN VEHICLE-TO-VEHICLE COMMUNICATION NETWORKS IN A TRAFFIC STREAM

Erasure-Coding Based Routing for Opportunistic Networks

Color Interpolation for Single CCD Color Camera

Extract Object Boundaries in Noisy Images using Level Set. Final Report

Optimal Adaptive Learning for Image Retrieval

Journal of Machine Engineering, Vol. 15, No. 4, 2015

A comparative study of multiple-objective metaheuristics on the bi-objective set covering problem and the Pareto memetic algorithm

Shortest Paths for a Two-Robot Rendez-Vous

A New Graphical Multivariate Outlier Detection Technique Using Singular Value Decomposition

Topic -3 Image Enhancement

3D inspection system for manufactured machine parts

DEVELOPMENT OF A PROCEDURE FOR VERTICAL STRUCTURE ANALYSIS AND 3D-SINGLE TREE EXTRACTION WITHIN FORESTS BASED ON LIDAR POINT CLOUD

Fifth Wheel Modelling and Testing

Modeling spatially-correlated data of sensor networks with irregular topologies

All lengths in meters. E = = 7800 kg/m 3

Prof. Feng Liu. Fall /17/2016

Ranking Visualizations of Correlation Using Weber s Law

Slotted Random Access Protocol with Dynamic Transmission Probability Control in CDMA System

WIRELESS sensor networks (WSNs), which are capable

A Recommender System for Online Personalization in the WUM Applications

Cardiac C-Arm CT. SNR Enhancement by Combining Multiple Retrospectively Motion Corrected FDK-Like Reconstructions

LOSSLESS audio coding is used in such applications as

Haptic Glove. Chan-Su Lee. Abstract. This is a final report for the DIMACS grant of student-initiated project. I implemented Boundary

STUDIES ON IMPROVING TEXTURE SEGMENTATION PERFORMANCE USING GENERALIZED GAUSSIAN MIXTURE MODEL INTEGRATING DCT AND LBP

The EigenRumor Algorithm for Ranking Blogs

Image Registration among UAV Image Sequence and Google Satellite Image Under Quality Mismatch

Linear Ensembles of Word Embedding Models

3D VISUALIZATION OF A CASE-BASED DISTANCE MODEL

3D Reconstruction from 360 x 360 Mosaics 1

Elliptic Generation Systems

SELECTION OF VARYING SPATIALLY ADAPTIVE REGULARIZATION PARAMETER FOR IMAGE DECONVOLUTION

Research Article. Regularization Rotational motion image Blur Restoration

Transcription:

Poceedings of the Wold Congess on Engineeing 0 Vol III WCE 0, July 6-8, 0, London, U.K. The Robust Classification fo Lage Data (Case: Classification of Jakata Vegetation aea by Using Remote Sensing Data ) Dyah E. Hewindiati, Sani M. Isa, and Desi Aisandi Abstact This pape discusses the obust classification fo lage data, in case classification of vegetation aea at Jakata with emote sensing. Remote Sensing is the pocess involving an inteaction between incident adiation and the tagets of inteest. The classification pocess is guided in two steps; taining and classification steps. The taining step is done to know the efeence spectal of vegetation aea, and the classification step is caied out to clasify the Jakata aea into the vegetation and the non vegetation aea. The hole pocess of classification is not simple. The main poblem is noise. The claud coveing aea is consideed as noise. The classification of lage data with noise needs the efficient and effective appoach. The aim of the pape is to popose a new obust appoach, the Modified MVV, to classify the vegetation aea of Jakata. The Modified MVV is the modified data subset having minimum of a squae of length of a paallelotope diagonal. The good popeties of Modified MVV ae the consistent estimato and the moe efficient computational time than is of MVV. Index Tems beakdown point, consistent estimato, outlie, emote sensing, minimum vecto vaiance T I. INTRODUCTION HE awaeness of anomalous obsevation o outlie was almost a hunded yeas ago, since Iwin [0] poposed a citeion fo the ejection of anomalous data based on deviation fom the mean. Thee ae vaious meanings of outlie, but it is well undestood that an outlie is an obsevation which seem to be clealy deviated among the othes, Gubbs [4]. Outlies can be aised fom the human eo, an instument eo, an uncontollable event o simply though natual deviations in populations. An outlie can influence the data analysis and even the esult of expeiment. To avoid the bad obsevation, it is ussually to discad an outlie. The one o moe outlies should not be discaded, the outlie accommodation shuld be chosen to minimize that effect. The obust method is one appoach of the outlie accomodation. Hampel et.al [6] stated that Robust Statistics, nontechnical sense, is concened with the fact that many assumptions commonly made in statistics (such as nomality, lineaity, independence) ae at most appoximations to eality. The Majo goal of obust Dyah E. Hewindiati is lectue at Taumanagaa Univesity, Jln Let. Jend. S Paman, Jakata 40, Indonesia. (e-mail: hewindiati@unta.ac.id). Sani M. Isa Autho is lectue at Taumanagaa Univesity, Jln Let. Jend. S Paman, Jakata 40, Indonesia. (e-mail: sani.fti.unta@gmail.com) Desi Aisandi is Lectue at Taumanagaa Univesity, Jln Let. Jend. S Paman, Jakata 40, Indonesia (e-mail: desi@fti.utaa.og) ISBN: 978-988-95-5- ISSN: 078-0958 (Pint); ISSN: 078-0966 (Online) statistics is to develop methods that ae obust against the possibility that one o seveal unannounced outlies may occu anywhee in the data. The obust Mahalanobis distance; the implementation of Mahalanobis distance fo handling an outlie, is chosen as a tool fo detecting outlies. The idea was poposed by many good scientists, such as Rousseeuw and van Zomeen [6], Hadi [5], Hawkins [7], Rousseeuw and van Diessen [4], Billo et.all [] and Hubet et.al [9]. They poposed the obust measue based on the same citeia that is to detemine the location and covaiance matix in a subset giving the minimum volume of ellipsoid o minimum covaiance deteminant (MCD). MCD is the famous measue of dispesions in the study of multivaiate analysis. The advantage of this measue ae that MCD is obust with high BP and it gives an affine equivaiant location estimato and a covaiance matix. But MCD comes with the high cost of computation. The efficiency of the algoithm deceases when the dimension of data inceases. On the aspect of algoithm efficiency, the MCD is not cheap, based on Cholesky decomposition fo lage value of vaiate p, the computation of CD is of ode 3 O( p ). Hewindiati et.all [8] poposed the Minimum Vecto Vaiance (MVV), the obust pocedue which is moe efficient than MCD pocedue. This algoithm, compaed with FMCD algoithm, has a lowe computational complexity; the computational complexity of MVV is of ode O( p ) This pape discusses the obust classification fo lage data, in case classification of vegetation aea of Jakata with emote sensing. Data of emote sensing is often known as multispectal data, that is the sets of data obtained simultaneously, but each set obtained by sensing a diffeent pat of the electomagnetic spectum. Remote sensing is defined as the science (and to some extent, at) of acquiing infomation about the eath s suface without actually being in contact with it, Natual Resouces Canada [] The classification pocess is guided in two steps, taining and classification steps. The taining step is done to know the efeence spectal of vegetation aea, and the classification step is caied out to clasify the Jakata aea into the vegetation and the non vegetation aea. The hole pocess of classification is not simple. The main poblem is noise. The cloud coveing aea is consideed as noise. The classification of lage data with noise needs the efficient and effective appoach. The aim of the pape is to popose a new obust appoach consideed to be the efficient and effective method. The WCE 0

Poceedings of the Wold Congess on Engineeing 0 Vol III WCE 0, July 6-8, 0, London, U.K. appoach is called as the Modified MVV. The Modified MVV is the modified data subset having minimum of a squae of length of a paallelotope diagonal to estimate the location and scate. The algoithm and good pefomance of Modified MVV ae shown in Section III and IV. Next, the esult of vegetation classification ae appeaed in the end of pape. II. REMOTE SENSING DATA Remote Sensing is the pocess involving an inteaction between incident adiation and the tagets of inteest. Multispectal data of Jakata comes fom Landsat 7 satelite. Data is aquied by 7 spectal band senso which coves visible, nea infaed, and mid infaed spectum. The spatial esolution of band - 5, and band 7 ae 30 m, the esolution of the sixth band is 60 m. Jakata aea is lage (appoximately 649.7 km ), that is moe than 700.000 pixels. An each pixel contains seven channels of multispectal data. It means that an each pixel has seven digital numbes. Figue B. The Multispectal Jakata Fomatted tiff Fo Channel 4 Figue C. The Multispectal Jakata Fomatted tiff Fo Channel 5 Figue. The Multispectal Jakata fomatted RGB Colo Space, on the yea 006 In this eseach, we take the Jakata multispectal data fom Landsat -7 satelit. Data is captued by senso having 7 bands involving the visible spectal, nea - IR, and mid IR. The spatial esolution of 6 bands ( band - 5, and band 7) ae 30 squae metes, the esolution of the sixth band is 60 squae metes. Multispectal data of Jakata is aea coveed by coodinate ( 5 9' " - 6 3' 54") South Latitude and (06 ' 4" - 06 58' 8") East Longitude. The following figues ae the examples of Jakata mulispectal on the yea 000. Figue A. The Multispectal Jakata fomatted tiff Fo Channel 3 III. THE MODIFIED MVV FOR CLASSIFICATION OF JAKARTA VEGETATION AREA Jakata; the capital of Indonesia; has a high population density. The population is almost 9 million on the yea 006. Spead ove an aea of aound 700 squae kilometes, the density population of Jakata is ecoded 3.756 people pe squae kilometes. Land use changes without the good planning. The quality of the envionment gets wose day by day. This pape investigates the change of vegetation aea Jakata on thee yeas; the yea 000, 00 and 006; by using the classification pocess. The multispectal data of Jakata is lage and it is also found the noise in seveal aea. That pocess is not simple, it needs the efficient and effective appoach to classify the aeas. The new obust appoach is chosen to have the eliable esult. A. The Robust Modified Minimum Vecto Vaiance The Modified MVV is the modified data subset having minimum of a squae of length of a paallelotope diagonal to estimate the location and scate. Hewindiati et.all [8] poposed the Minimum Vecto Vaiance (MVV) fo application in outlie labeling. The algoithm of MVV is not significantly diffeent with Fast Minimum Covaiance Deteminant (FMCD), which is poposed by Rousseeuw and van Diessen [4], except that the citeion used hee is not MCD but MVV. The FMCD algoithm is high beakdown point obust pocedue that is constucted based on the so-called C-step. The basic theoem and concept of C-Step was descibed by Rousseeuw and van Diessen [4]as follows, X, X, L, Xn of p-vaiate Conside a data set { } obsevations. Let H { n},, L, with H = h, and put ISBN: 978-988-95-5- ISSN: 078-0958 (Pint); ISSN: 078-0966 (Online) WCE 0

Poceedings of the Wold Congess on Engineeing 0 Vol III WCE 0, July 6-8, 0, London, U.K. = ( ) x and S i H : = ( ) ( x i T )( x i T ) i H : i T h If ( ) h. det S 0, define the elative distances () = ( i ) ( i ) fo i,,, n { d i, i H} :, ( d),,( d) d i x T S x T H such that () = L. Now take { } = L ae : n h: n odeed distances, and compute T and S based on H, then det ( S) det ( S), with equality if only if T = T and S = S. The algoitm is known as C-Step ( concentation step) because it concentates on h obsevations with smallest distances. The concept of C- step tells us that it takes many initial choices of H befoe doing C-Step. The detemining of H ; initial of data subset; is vey impotant fo estimato computation. In the computation of obust estimato, the subset of clean data has the impotant ole. We know that Rousseeuw and van Diessen [4] appoximated the MCD estimato by seaching among all subsets containing half of the data that is most tightly clusteed togethe; this subset has minimum genealized vaiance o minimum covaiance deteminant. Minimum Vecto Vaiance (MVV) is obust appoach method using the minimization of vecto vaiance (VV) citeia. The estimato MVV fo the pai ( μ, Σ) is the pai ( TVV, S VV ) giving minimum vecto vaiance. Let Σ be covaiance matix, if Rousseuw [3] poposed covaiance deteminant (CD), i.e., Σ as multivaiate dispesion measue, two decades late Djauhai [3] intoduces vecto vaiance o VV, i.e., T ( Σ ) as anothe measue. In geneal, the implementation of MVV algoithm, Hewindiati et.al [8], is to detemine the initial of data subset; to concentate the smallest distance using minimum vecto vaiance and to estimate the estimato. To compute multispectal data of Jakata, the fist step of MVV algoithm is not simple, we must choise the all possible data subset H containing h obsevations having the smallest VV fom outcome of a index pemutation. The computation becomes had if we must calculate the p + when n is pemutation of n distinct objects taken ( ) lage. The modified data subset of MVV pesents to educe the computational time fo the lage data. Suppose X, X, L, Xn ae andom samples of size n fom a p- vaiate distibution of location paamete μ and positive definite covaiance matix Σ, the algoithm of poposed method is descibed as:. Compute the nom vecto N( i ) = X i fo i =,, L, n. Next, sot fom the smallest to the geatest. This ode defines a pemutation π on the obsevation index. Suppose the soting is: N π N π L N π L N π ( ) ( ) ( ) ( ) k n ISBN: 978-988-95-5- ISSN: 078-0958 (Pint); ISSN: 078-0966 (Online) Let H { n} 0,, L, with H0. Detemine the median N ( π ) k = h and n+ p+ h =. Suppose T 0 and S0 ae a vecto mean of dimension p and matix covaiance sample of H 0. Compute d( i) = Xi T0 = X i T0 fo i =,, L, n. Next sot fom the smallest to the geatest. Conside d( π ) d( π ) d( π ) k L is the ode distance. Take a set H consists of h obsevations of index π (, ) π (, ) L, π ( h). H Can be assumed as the set of basic o the initial subset. B. The Algoithm of Taining and Classification Step The pocess of classification is done with two steps. The fist step is the taining step and the second one is the classification step. Both of steps ae composed by the Modified MVV. The taining step is done to know the efeence spectal of vegetation land. The algoitm is descibed as follows, pixel. Cop image of the vegetation aea in size ( 30 30) based on the RGB colo space of multispectal visual and Nomalized Difeence Vegetation Index (NDVI). Compute the sample mean and covaiance matix by using the Modified MVV. Fist, define the initial subset of Modified MVV (see III A). Second, apply the MVV algoitm fo estimation, see Hewindiati et.al [8] a. Let H old subset containing h data points. b. Compute the mean vecto and covaiance X H old matix H old and compute, t X X S X X dh old ( i ) = ( i H ) ( ) old Hold i Hold i =,,, n. ; fo all c. Sot these squaed distances in inceasing ode d. Define H new = { (), ( ),, X X L X π π π( )} h e. Calculate, d i. X Hnew S and ( ) H new H new f. If det ( S H new ) = 0, epeat steps 5. If det ( H new ) det ( S H old ), the pocess is stopped. Othewise, the pocess is continued until the k-th iteation if det det det S H ( S H k ) = ( S H k ). So we get ( ) + S... det ( S H k ) = det ( S H k ) + det ( H ) S = 3. Find the inteval of efeence spectal of vegetation aea though a 95% confidence inteval fo a distance of obust Modified MVV The classfication of vegetation aea is guided by the taining step, but the point 3 is diffeent. The classification is conducted by similaity distance of all multispectal data aea and efeence spectal of vegetation aea. The pixel is WCE 0

Poceedings of the Wold Congess on Engineeing 0 Vol III WCE 0, July 6-8, 0, London, U.K. called vegetation if the distance is in a 95% confidence inteval of the Modified MVV distance. IV. HOW IS THE PERFORMANCE OF MODIFIED MVV A. The Consistency of Modified MVV Estimato The estimato is a measuable function of the data that is used to infe the value of an unknown paamete. An estimato fo a paamete is consistent if the estimato conveges in pobability to the tue value of the paamete, Kendal and Stuat []. Conside an estimato t n, computed fom a sample of n values, will be said to be a consistent estimato if thee is some N such that the pobability that tn θ < ε () Is geate than ( η ) fo all n> N. In the notation of the pobability theoy, { n } P t θ < ε > η n > N () fo any positive ε and η howeve small The fomulate () means that the distibutions of the estimatos become moe and moe concentated nea the tue value of the paamete being estimated, so that the pobability of the estimato being abitaily close to θ conveges to one. To claify the statement, we do the expeiment with 00 eplication. The multivaiate nomal of N ( μ ) geneates (whee μ = 0 5, ; = I5 and n = 000 ). The contaminant data appea in a data set beginning % and gadually to be highe; i.e. %; 3%, 4% and so on till 0%. The consistency of two estimatos fom two diffeent apoaches; Modified MVV and Classics estimato; ae computed. TABLE I The Consistency of Modified MVV, fo p=5 Contaminant Consistency_Modified Consistency_Classics MVV 0 0.905607 0.65579 0 0.90808958 0.559903 30 0.900343 0.5030706 40 0.90449 0.5000385 50 0.9079066 0.50000083 60 0.9044967 0.50000000 70 0.900353 0.5 80 0.90398779 0.5 90 0.90334757 0.5 00 0.905804 0.5 Two desciptions above, Table I and Fig. 3, state that estimato of Modified MVV is a consistent estimato. B. The Computational Time of Modified MVV One majo of motivation outlie detection eseach is to efficiently identify outlie, Angiullia and Pizzuti [5]. The eseach focused on computational time is the impotant topics in obust statistics and detection outlie. The efficiency of two methods; the Modified MVV and the MVV; ae compaed in this Section. Regading with the aim, we geneate the multivaite nomal N5 ( 0, I5 ) with 5% contaminant data. To have the efficiency and effectiveness of those methods, the expeiments ae eplicated on 00 times and the sample size is gadually inceased ; that is begining n = 500 until n = 3000. All of the expeiments demonstate that the contaminant data ae identified well but the computational times ae diffeent. The compaison is listed in Table. TABLE II The Compaison of Computational Time Modified MVV and MVV No Expeiment Sample Size Modified MVV MVV 500 0.7 8.835 000 0.33739 57.4498 3 000 0.750 4.9694 4 4000.973 30.4307 5 8000 6.9484 463.0770 6 6000 6.8368 96.3965 7 3000 08.53.9E+03 Fig. 4 shows clealy the diffeence of unning time Modified MVV and MVV. Fo the lage sample size, we see the computational time of MVV is going to be inceased, but the time is elatively stable fo the Modified MVV. Figue 4. The Compaison of Computational Time (Second) fo Two Appoaches Figue 3. The Consistency of Modified MVV, fo p=5 C.The Beakdown Point of The Modified MVV Rousseeuw and Leoy [0] defined fom the context of a sample that beakdown point is the smallest faction of data which causes the value of estimato to be infinite when the value of all data in the faction ae changed to be infinite. The the beakdown point is one of measue obustness. Geneally, the beakdown point estimato is a measue of its esistance, the highe the beakdown point of an estimato, T X the moe obust it is. Suppose the estimato ( ) n ISBN: 978-988-95-5- ISSN: 078-0958 (Pint); ISSN: 078-0966 (Online) WCE 0

Poceedings of the Wold Congess on Engineeing 0 Vol III WCE 0, July 6-8, 0, London, U.K. * becomes n ( X ) T if the value of m data ae changed. The beakdown point of sample of size shows n as follows, ε m n ( T, X) = min sup T ( X ) T ( X) infinite ( 3) * * n n n * X To investigate the esistance of modified MVV estimato, we do simulation with 000 eplication of 30 N3 0, I, the contaminant n = multivaiate nomal ( ) n will be added, it is stated n = 0 till the estimato beaks (conside n = k ). The following figue tells us that the MVV estimato is obust and high beakdown point Figue 7. The Classification of Vegetation Jakata Aea on The Yea 000 Fig. 7 illustates the vegetation and the non vegetation of main Jakata with a ectangula shape. Based on the figue, the vegetation aea of Jakata is only 7.73%. The pecentage is quite fa to the Law detemining the pecentage of opened space (vegetation land and baen land) fo big cities in Indonesia; 30%. How is the geen o vegetation of Jakata on the yea 00? The Jakata Goveno made the effot to incease the pecentage of geen aea. He wanted to each the geen aea into 3% on the yea 00. Will the goal of Goveno be able to be ealized?. The following figues ae the esult of classification on the yea 00 and 006. Figue 5. The Beakdown Point of Modified MVV V. THE CLASSIFICATION OF JAKARTA VEGETATION AREA The classification pocess on the vegetation and no vegetation aea is discussed in this section. Thee ae two steps; the taining and testing steps fo classification. A. The Taining Step The goal of taing step is to have the spectal efeence of the vegetation aea. Regading of the goal, we cop the vegetation aea in size ( 30 30). The following figue is the example of cop of channel 3, 4 and 5 in actual size on the yea 000. Figue 8. The Classification of Vegetation Jakata Aea on The Yea 00 Figue 6. The example of copping vegetation aea in channel 3,4 and 5 The Modified MVV is used to define the digital numbe of spectal efeence of vegetation aea. The 95% confidence inteval of vegetation is 4.349084 < vegetation < 6.84484. The inteval is useful fo the vegetation efeence of classification. B. The Classification Step The classification of Jakata vegetation aea on the yea 000 is figued out in Figue 7. The figue is scaled aound :300 to the oiginal figue fom Landsat - 7. ISBN: 978-988-95-5- ISSN: 078-0958 (Pint); ISSN: 078-0966 (Online) Figue 9. The Classification of Vegetation Jakata Aea on The Yea 006 Related with the classification on the yea 00, we compute that the pecentage of vegetation is highe than the pecentage on the yea 000, that is 8. %. At fist, it seems that the deam that Jakata will be geen can be eached, but the eseah shows the opposite. WCE 0

Poceedings of the Wold Congess on Engineeing 0 Vol III WCE 0, July 6-8, 0, London, U.K. The statement is based on the esult of classifying vegetation on the yea 006. The classication calculates that the pecentage of geen is 6.%. That pecentage is smalle than the pecentage on the yea 00 and the yea 000. VI. REMARK The Modified MVV is the eliable method fo classification of lage data with noise. We evaluate in two conditions; the pocess computation and the esult of classification. The fist evaluation, the computational time of Modified MVV fo taining and classification is aound 70 minutes. The time is diffeent with the computational time of MVV. In the expeience of MVV computation, we need moe than hous to finish one classification. The second evaluation, the Modified MVV is ealible method fo classification. We evaluate fou changing aeas, the usage of aea is not diffeent to the esult classification. REFERENCES [] Angiulli, F. and Pizzuti, C.: Outlie Mining and Lage High- Dimensional Data Sets, IEEE Tansaction on Knowledge and Data Engineeing, Vol. 7, No, 03-5 (005) [] Billo, N., Hadi, A.S. and Velleman, P.F.: BACON: blocked adaptive computationally efficient outlie nominatos, Jounal of Computational Statistics and Data Analysis, 34, 79-98. (000) [3] Djauhai, M.A.: Impoved Monitoing of Multivaiate Pocess Vaiability, Jounal of Quality Technology, Vol. 37, No, 3-39 (005) [4] Gubbs, F.E.: Pocedues fo Detecting Outlying Obsevations Samples, Technometics,, - (969) [5] Hadi, A.S.: Identifying Multivaiate Outlie in Multivaiate Data, Jounal of Royal Statistical Society B, Vol. 53, No 3, 76-77 (99) [6] Hampel, F.R., Ronchetti, E. M., Rousseuw, P.J. and Stahel, W.A.: Robust Statistics, John Wiley, New Yok, (985) [7] Hawkins, D.M.: The Feasible Solution Algoithm fo the Minimum Covaiance Deteminant Estimato in Multivaiate Data, Jounal of Computational Statistics and Data Analysis, 7, 97-0 (994) [8] Hewindiati, D.E., Djauhai, M.A. and Mashui, M.:Robust Multivaiate Outlie Labeling, Jounal of Communication in Statistics Simulation And Computation, Vol. 36, No 6 (007) [9] Hubet, M., Rousseeuw, P.J. and vanden Banden, K.: ROBPCA: a New Appoach to Robust Pincipal Component Analysis, Jounal of Technometics, 47, 64-79, (005) [0] Iwin, J.O.: On a Citeion fo the Rejection of Outlying Obsevations, Jounal of Biometics, Volume 7, No (3/4), 38-50 (95) [] Kendall, S.M. and Stuat, A.: The Advanced Theoy of Statistics, Chales Giffin & Co Ltd, Vol., Fouth Edition, London (979) [] Natual Resouces Canada: Fundamental of Remote Sensing, 8 Januay 00, Available : http://www.ccs.ncan.gc.ca/index_e.php [3] Rousseeuw, P.J.: Multivaiate Estimation with High Beakdown Point, Pape appeed in Gossman W., Pflug G., Vincze I. dan Wetz W., editos, Mathematical Statistics and Applications, B, 83-97. D. Reidel Publishing Company (985) [4] Rousseeuw, P.J. and van Diessen, K.: A Fast Algoithm fo The Minimum Covaiance Deteminant Estimato, Jounal of Technometics, 4, -3 (999) [5] Rousseeuw P.J. and Leoy, A.M.: Robust Regession and Outlie Detection, John Wiley, New Yok (987) [6] Rousseeuw, P.J. and van Zomeen, B.C.: Unmasking Multivaiate Outlies and Leveage Points, Jounal of the Ameican Statistical Association, 8 Volume 85, No 4, 633-639 (990) ISBN: 978-988-95-5- ISSN: 078-0958 (Pint); ISSN: 078-0966 (Online) WCE 0