Relevance Assignment and Fusion of Multiple Learning Methods Applied to Remote Sensing Image Analysis


Peter Bajcsy, Wei-Wen Feng and Praveen Kumar
National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
{pbajcsy, fengww}@ncsa.uiuc.edu, kumar@uiuc.edu
POC: Peter Bajcsy, pbajcsy@ncsa.uiuc.edu

Abstract

With the advances in remote sensing, various machine learning techniques can be applied to study variable relationships. Although prediction models obtained using machine learning techniques have proven suitable for prediction, they do not explicitly provide a means for determining input-output variable relevance. We investigated the issue of relevance assignment for multiple machine learning models applied to remote sensing variables in the context of terrestrial hydrology. Relevance is defined as the influence of an input variable with respect to predicting the output result. We introduce a methodology for assigning relevance using various machine learning methods. The learning methods we use include Regression Tree, Support Vector Machine, and K-Nearest Neighbors. We derive the relevance computation scheme for each learning method and propose methods for fusing relevance assignment results from multiple learning techniques by averaging and by a voting mechanism. All methods are evaluated in terms of relevance estimation accuracy with synthetic and measured data.

1. Introduction

The problem of understanding relationships and dependencies among geographic variables (features) is of high interest to scientists during many data-driven, discovery-type analyses. Various machine learning methods have been developed for data-driven analyses to build prediction models that represent input-output variable relationships. However, prediction models obtained using machine learning techniques vary in their underlying model representations and frequently do not provide a clear interpretation of input-output variable relationships. Thus, the goal of data-driven modeling is not only accurate prediction but also interpretation of the input-output relationships.

In this paper, we address the problem of data-driven model interpretation to establish the relevance of input variables with respect to predicted output variables. First, we review previous work in Section 2 and formulate an interpretation of data-driven models by assigning relevance to input variables in Section 3. Relevance assignments are derived at the sample (local) or model (global) level based on the co-linearity of input variable basis vectors with the normals of regression hyper-planes formed over model-defined partitions of data samples. Second, we propose algorithms for combining relevance assignments obtained from multiple data-driven models in Section 4. Finally, we evaluate the accuracy of relevance assignment using (a) three types of synthetic data and one set of measured data, (b) three machine learning algorithms, namely Regression Tree (RT), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), and (c) two relevance assignment fusion methods, as documented in Section 5, and summarize our results in Section 6. The novelty of our work lies in developing a methodology for relevance assignment over a set of machine learning models, proposing relevance fusion methods, and demonstrating the accuracy of multiple relevance assignment methods with multiple synthetic and experimental data sets.

2. Previous Work

Interpretation of data-driven models to establish input-output variable relationships is also part of variable (feature) subset selection.

Feature subset selection is one of the classical research areas in machine learning [3][5][14]. Its goal is to pre-select the most relevant features for learning concepts and to improve the accuracy of learning [9]. In the work of Pudil et al. [11], the authors use a sequential search through all possible combinations of feature subsets and find an optimal feature subset that yields the best result. In a similar vein, Perner and his coworkers [10] compare the influence of different feature selection methods on the learning performance of decision trees. However, an exhaustive search over all possible combinations of feature subsets is not always computationally feasible for a large number of features, for instance, in the case of hyperspectral band selection. To address the feasibility aspect, Bajcsy and Groves [8][15] proposed to use a combination of unsupervised and supervised machine learning techniques. Other approaches in spectral classification were proposed by De Backer et al. [12].

Another challenge of relevance assignment is related to the multitude of relevance definitions. For example, the relevance of an input variable could be defined by traversing a regression tree model and summing its weighted occurrence in the tree structure, as proposed by White and Kumar [2]. In the survey by Blum et al. [1], the authors give a conceptual definition of a relevant input variable as a variable that will affect the prediction results if it is modified. In our work, while adhering to the conceptual definition of Blum et al. [1], we extend the definition of relevance assignment by numerically quantifying the relative importance of input variables. Our relevance assignment is related to the work of Heiler et al. [6], in which the authors used the co-linearity of basis vectors with the normal of a separating hyper-plane obtained from the Support Vector Machine (SVM) method as the metric for relevance assignment. We use the co-linearity property in our relevance assignments derived from a set of regression hyper-planes formed over model-defined partitions of data samples.

3. Relevance Assignments

To analyze input-output relationships in practice, one is presented with discrete data that can be described as a set of M examples (rows of a table) with N variables (columns of a table). Examples represent instances of multiple ground and remotely sensed measurements. The measurements are the variables (features) that might have relationships among themselves, and our goal is to obtain a better understanding of those relationships.

In this work, we start with a mathematical definition of variable relevance that is consistent with the conceptual understanding of input-output relationships. We define the relevance of an input variable v_i, where v = (v_1, v_2, ..., v_N), as the partial derivative of an output (predicted) function f(v_1, v_2, ..., v_N) with respect to the input variable v_i. Equation (1) shows the vector of relevance values for all input variables:

R = [ \partial f(v_1, v_2, \dots, v_N) / \partial v_1 ; \dots ; \partial f(v_1, v_2, \dots, v_N) / \partial v_N ]   (1)

This definition assumes that the input and output variables are continuous and that the output function f is C^1 continuous (its first derivative exists). In following this mathematical definition in practice, challenges arise that are associated with (1) processing discrete samples, such as defining the neighborhood proximity of a sample in the manifold of input variables, (2) representing a data-driven model, such as deriving the analytical form of a predicted function f, (3) scaling input and output variables with different dynamic ranges of measurements, (4) removing dependencies on algorithmic parameters of machine learning techniques, (5) understanding the variability of relevance as a function of data quality parameters (e.g., sensor noise, cloud coverage during remote sensing), and (6) treating a mix of continuous and categorical variables, just to name a few.

Our approach to the above challenges is (a) to use a mathematically well-defined (analytically described) model, like multi-variate regression, on sample partitions obtained using a machine learning technique, (b) to scale all variables to the same dynamic range of [0, 1], (c) to investigate dependencies on algorithmic parameters of machine learning techniques and on data quality parameters, and (d) to propose the fusion of multi-method relevance results to increase our confidence in the relevance results. Having an analytical description of a function f allows us to derive relevance according to the definition. We have currently constrained our work to processing only continuous variables and foresee the inclusion of categorical variables in our future work.

Based on the above considerations, we define sample and model relevance assignments for a set of discrete samples with measurements of continuous input and output variables. Example relevance R_{ij} is the local relevance of each input variable v_i computed at the sample s_j. Its computation is defined for the three machine learning techniques in the next sub-sections.
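As a concrete illustration of the definition in Equation (1), the relevance vector of a prediction function can be approximated numerically by central finite differences when no closed-form derivative is available. This is a hypothetical sketch, not code from the paper; the demo function and its coefficients are illustrative only.

```python
# Hypothetical sketch: approximating the relevance vector of Equation (1)
# by central finite differences, for the common case where the prediction
# function f has no closed-form derivative.

def relevance(f, v, h=1e-5):
    """Approximate [df/dv_1, ..., df/dv_N] at the point v (Equation (1))."""
    R = []
    for i in range(len(v)):
        up, dn = list(v), list(v)
        up[i] += h
        dn[i] -= h
        R.append((f(up) - f(dn)) / (2.0 * h))
    return R

# An illustrative linear function with two relevant inputs (v1, v2) and two
# irrelevant ones (v3, v4); the coefficients are illustrative only.
f_demo = lambda v: 4 * v[0] + v[1]
print(relevance(f_demo, [0.5, 0.5, 0.5, 0.5]))  # roughly [4.0, 1.0, 0.0, 0.0]
```

For a data-driven model, f would be the model's prediction function rather than an analytic expression, which is exactly why the paper falls back on local regression hyper-planes in the following sub-sections.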

Model relevance R_i is the global relevance of each input variable v_i over all examples in the entire data-driven model, computed by summing all example relevances. To obtain comparable example and model relevances from multiple data-driven models, we normalize the relevance values by the sum of all model relevances over all input variables (see Equation (2)). The normalized relevance values are denoted with a tilde:

\tilde{R}_{ij} = R_{ij} / \sum_i R_i ;  \tilde{R}_i = R_i / \sum_i R_i , where R_i = \sum_j R_{ij}   (2)

In the next sub-sections, we introduce relevance assignment for Regression Tree (RT), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). The main reason for choosing these three methods comes from our requirement, for remote sensing data analysis, to process continuous input and output variables. Furthermore, the methods represent a set of machine learning techniques with different underpinning principles for data-driven modeling. The KNN method builds models using only the close neighbors of a given sample. The SVM method builds models based on all input samples. The RT method builds its model by hierarchically subdividing all input samples and fitting a local regression model to the samples in each divided cell (leaf). Thus, these methods represent a spectrum of supervised machine learning techniques to be evaluated in terms of relevance assignment accuracy.

3.1. Regression Tree

The process of building a regression-tree-based model can be described as splitting input examples into sub-groups (denoted as cells or tree leaves) based on a sequence of criteria imposed on individual variables [14]. The splitting criterion, e.g., information gain or data variance, might depend on the problem type and becomes one of the algorithmic parameters. In this work, we choose variance as our splitting criterion. Once all examples are partitioned into leaves, a multi-variate linear regression model is used to predict output values for the examples that fall into each leaf. For any example e_j, the example relevance R_{ij} of the variable v_i is computed from the linear regression model associated with the regression tree leaf.

The regression model at each regression tree leaf approximates the prediction function f locally with a linear model written as:

f(v) = \beta_0 + w_1 v_1 + w_2 v_2 + \dots + w_N v_N   (3)

The example (local) relevance assignment R_{ij} is then computed as the dot product between the normal w of the regression model hyper-plane and the unit vector u_i of the input variable v_i:

R_{ij} = | w \cdot u_i |   (4)

where |w \cdot u_i| denotes the absolute value of the dot product of the vectors w and u_i.

3.2. K-Nearest Neighbors

K-Nearest Neighbors is a machine learning method that predicts output values based on the K examples closest to any chosen one, measured in the space of the input variables v. Predicted values are formed as a weighted sum of those K nearest examples [13][14]. For any example e_j, the example relevance R_{ij} of the variable v_i is computed from the linear regression model obtained from the K nearest neighbors of the example e_j. The linear regression model approximates the prediction function f locally. The example relevance assignment is performed according to Equations (3) and (4).

3.3. Support Vector Machine

The Support Vector Machine (SVM) is a machine learning method that can build a model for separating examples (classification problem) or for fitting examples (prediction problem). We use SVM as a prediction technique to model the input data. The SVM model can be linear or non-linear depending on the SVM kernel. Non-linear models are obtained by mapping the input data to a higher-dimensional feature space, which is conveniently achieved by kernel mappings [7]. Thus, the prediction function f can be obtained from the mathematical descriptions of linear and non-linear kernels. In this paper, we focus only on a linear model in order to simplify the mathematics. For an SVM method with the linear kernel, the mathematical description of f becomes a hyper-plane as described in Equation (3).
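The relevance computation of Equations (3) and (4) can be sketched as follows: fit a linear model f(v) = b0 + w·v to a partition of samples by ordinary least squares, then take R_i = |w · u_i| = |w_i|, since u_i is a unit basis vector. This is a minimal stand-alone sketch with hypothetical helper names; in the paper the sample subsets would come from the RT leaves or the KNN neighborhoods.

```python
# Minimal sketch of Equations (3)-(4): least-squares hyper-plane fit via the
# normal equations, then per-variable relevance as |w . u_i| = |w_i|.
# Helper names are illustrative, not from the paper's implementation.

def fit_hyperplane(X, y):
    """Solve (A^T A) p = A^T y for p = [b0, w_1, ..., w_N]."""
    A = [[1.0] + list(x) for x in X]          # prepend the intercept column
    m, n = len(A), len(A[0])
    M = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    b = [sum(A[k][i] * y[k] for k in range(m)) for i in range(n)]
    for col in range(n):                      # Gaussian elimination, pivoting
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            fac = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= fac * M[col][c]
            b[r] -= fac * b[col]
    p = [0.0] * n                             # back substitution
    for i in reversed(range(n)):
        p[i] = (b[i] - sum(M[i][j] * p[j] for j in range(i + 1, n))) / M[i][i]
    return p[0], p[1:]                        # intercept b0, normal vector w

def example_relevance(X, y):
    """Equation (4): R_i = |w . u_i| for the fitted hyper-plane normal w."""
    _, w = fit_hyperplane(X, y)
    return [abs(wi) for wi in w]
```

The same routine serves all three methods: RT applies it per leaf, KNN per K-neighborhood, and linear SVM effectively yields one such hyper-plane for the whole data set.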

The difference between the RT, KNN, and SVM methods lies in the fact that SVM uses all examples to estimate the parameters of the hyper-plane, leading to only one hyper-plane for the entire set of examples. The example relevance assignment for SVM follows Equation (4). In the case of SVM, the example relevance and the model relevance are identical.

4. Relevance Fusion

The goal of relevance fusion is to achieve a more robust relevance assignment in the presence of noise and variable data quality (e.g., clouds during remote sensing), as well as to remove any bias introduced by a single machine learning method. The latter motivation is also supported by the no-free-lunch (NFL) theorem [16], which states that no single supervised method is superior over all problem domains; methods can only be superior for particular data sets. In this paper, we propose two different schemes for combining the results of relevance assignment: average fusion and rank fusion. Average fusion is based on taking the numerical average of the relevance values and using it as the combined relevance value. Rank fusion, on the other hand, uses a voting scheme to combine the relative ranks of each input variable as determined by each machine learning method.

4.1. Average Fusion

Average fusion is executed by taking the normalized relevance results from multiple machine learning methods and computing their average. Given \tilde{R}_{ij}^{(k)} as the normalized example relevance of the i-th variable at the example e_j obtained from the k-th machine learning method, the example relevance assignment R_{ij}^{avg} for the example e_j established using average fusion is defined as:

R_{ij}^{avg} = (1/L) \sum_{k=1}^{L} \tilde{R}_{ij}^{(k)}   (5)

where L is the number of machine learning methods used. The model relevance assignment R_i^{avg} using average fusion for an input variable v_i is the sum of all example relevances, as defined in Equation (2).

4.2. Rank Fusion

Rank fusion is executed in a similar manner to average fusion. The difference lies in assigning a relevance rank to each variable v_i based on its relevance ranking by the k-th machine learning model.

The absolute magnitude of the relevance assignment is lost in rank fusion, since the meaning of magnitude is converted into a relative ranking during the process. The rank fusion approach is expected to be more robust than the average fusion approach when some of the machine learning methods create very incorrect models, for various reasons, and skew the correct results of the other machine learning methods. The example relevance assignment using rank fusion is described as follows: for each example e_j and its normalized example relevances \tilde{R}_{ij}^{(k)}, we define the rank of each variable v_i as its index in the list of relevances sorted from the smallest to the largest, rank{\tilde{R}_{ij}^{(k)}} \in {1, 2, ..., N}. The rank-fusion-based relevance assignment for the variable v_i is then computed as:

R_{ij}^{rank} = (1/(L N)) \sum_{k=1}^{L} ( N - rank\{ \tilde{R}_{ij}^{(k)} \} )   (6)

The model relevance assignment R_i^{rank} using rank fusion for an input variable v_i is the sum of all example relevances R_{ij}^{rank} over all examples, as defined in Equation (2).

5. Evaluation Setup

The evaluation of relevance assignments was performed using the GeoLearn software that was developed by NCSA and CEE UIUC. GeoLearn allows a user to model input-output variable relationships from multi-variate NASA remote sensing images over a set of boundaries. The machine learning algorithms in the GeoLearn system leverage five software packages: Im2Learn (remote sensing image processing), ArcGIS (georeferencing), D2K software (RT implementation), LibSVM [4] (SVM implementation), and KDTree [13] (KNN implementation). The KNN and SVM methods are sensitive to the scale of the input variables and will favor variables with a wider scale of values.
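Stepping back to Section 4, the two fusion schemes of Equations (5) and (6) can be sketched as follows for a single example. This is a hypothetical sketch assuming each of the L methods has already produced a normalized relevance vector; it also assumes ranks are assigned from the largest relevance down (rank 1 = most relevant), so that more relevant variables receive larger fused scores under the (N - rank) term.

```python
# Sketch of Equations (5) and (6) for one example. Each row of
# rel_by_method is one method's normalized relevance vector over N variables.
# Assumption: rank 1 is given to the LARGEST relevance, so that the
# (N - rank) score in Equation (6) grows with relevance.

def average_fusion(rel_by_method):
    """Equation (5): element-wise mean over the L methods."""
    L, N = len(rel_by_method), len(rel_by_method[0])
    return [sum(r[i] for r in rel_by_method) / L for i in range(N)]

def rank_fusion(rel_by_method):
    """Equation (6): average of (N - rank) votes, scaled by 1/(L*N)."""
    L, N = len(rel_by_method), len(rel_by_method[0])
    fused = [0.0] * N
    for r in rel_by_method:
        order = sorted(range(N), key=lambda i: -r[i])  # rank 1 = largest
        for pos, i in enumerate(order):
            fused[i] += N - (pos + 1)                  # N - rank
    return [v / (L * N) for v in fused]
```

Rank fusion discards magnitudes by design, which is what makes it resistant to one badly calibrated method dominating the average.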

To avoid this type of bias, the dynamic range of all variables is always normalized to the range [0, 1] according to the formula below:

NormalizedValue = (Value - MinValue) / (MaxValue - MinValue)   (7)

Evaluations were performed with both synthetic and measured data. Synthetic data allow us to simulate three categories of input-output relationships with known ground truth in order to understand relevance assignment accuracy and relevance dependencies. Measured data were used to demonstrate the application of relevance assignment to studying vegetation changes, and the results were verified based on our limited understanding of the phenomena. The next sub-sections describe the synthetic data simulations, the model building setup, and the evaluation metrics used to assess relevance assignment accuracy.

5.1. Synthetic Data Simulations

Three sets of input-output relationships were simulated to represent (1) linear additive, (2) non-linear additive, and (3) non-linear multiplicative categories of relationships. To introduce irrelevant input variables into the problem, we simulated the output using only two input variables v_1, v_2 (the relevant variables) while modeling relationships with four variables, where the additional two input variables v_3, v_4 have values drawn from a uniform distribution over [0, 1] (the irrelevant variables). The specific analytical forms for generating the three data sets are provided in Equations (8) (linear additive), (9) (non-linear additive), and (10) (non-linear multiplicative):

f(v_1, v_2, v_3, v_4) = 4 v_1 + v_2   (8)

f(v_1, v_2, v_3, v_4) = \sin(\pi v_1) + \cos((\pi/2) v_2)   (9)

f(v_1, v_2, v_3, v_4) = v_1 v_2   (10)

In addition to simulating multiple input-output relationships and relevant-irrelevant variables, we added noise to the generated output values to test the noise robustness of the relevance assignments. Noise is simulated as an additive variable following a 1-D Gaussian distribution with zero mean \mu and standard deviation \sigma; N(\mu = 0, \sigma). The standard deviation was parameterized as \sigma = \alpha d, where \alpha is a percentage of the dynamic range d of the output variable. In our experiments, we used two values of \alpha (the larger being \alpha = 0.3) to generate a total of nine synthetic data sets (3 without noise and 3 at each of the two additive noise levels).

5.2. Model Building Setup

The model building setup is concerned with the optimization of algorithmic parameters and with cross validation. First, we set the algorithmic parameters to the following values: (1) RT - variance error as the splitting criterion, the minimum number of examples per leaf set to eight, and a capped maximum tree depth; (2) KNN - K = N + 3, where N is the dimension of the input variables; the reason for setting K slightly larger than the input variable dimension is to meet the least-squares fitting requirements for estimating a hyper-plane from K examples; (3) SVM - linear kernel, cost factor C = 1.0, and termination criterion Epsilon = 0.001. The optimization of KNN's parameter K and of RT's maximum tree depth was investigated experimentally. Second, we omitted cross validation of the models in our experiments and instead computed input variable relevance based on all available examples. In the future, we will investigate the accuracy of input variable relevance assignment from examples selected by cross validation versus all available examples.

5.3. Evaluation Metrics

To evaluate the accuracy of input variable relevance assignment using multiple machine learning methods, we introduced two metrics: percentage of correctness and error distance. The evaluations are conducted only for the synthetic data, against the ground truth values of the normalized example relevance \tilde{R}_{ij}^{GT} and the normalized model relevance \tilde{R}_i^{GT}. The ground truth values are obtained by computing the partial derivatives of Equations (8), (9) and (10) according to Equation (1). The percentage of correctness metric is defined in Equation (11) as:

PC = (1/M) \sum_{j=1}^{M} \delta_j   (11)

where \delta_j is 1 if the input variable with the maximum estimated relevance \tilde{R}_{ij} matches the variable with the maximum ground truth relevance \tilde{R}_{ij}^{GT}, and is 0 otherwise. The error distance metric is defined in Equation (12) as the Euclidean distance between the true model relevance derived from the partial derivatives and the relevance estimated by our methods.
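The synthetic data simulations of Section 5.1 can be sketched as below. The helper name make_dataset is illustrative, and the exact coefficients in the three target functions are assumptions consistent with the reported ground-truth relevance ratios (the transcription of Equations (8) and (9) is partly ambiguous).

```python
# Sketch of Section 5.1: two relevant inputs v1, v2 and two irrelevant
# inputs v3, v4 drawn uniformly from [0, 1], plus optional Gaussian output
# noise with sigma = alpha * (dynamic range of the output). The target
# functions below follow Equations (8)-(10) as reconstructed; the exact
# coefficients are assumptions, not verbatim from the paper.
import math
import random

def make_dataset(f, m=1000, alpha=0.0, seed=0):
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(4)] for _ in range(m)]
    y = [f(v) for v in X]
    if alpha > 0.0:
        d = max(y) - min(y)                  # dynamic range d of the output
        y = [yi + rng.gauss(0.0, alpha * d) for yi in y]
    return X, y

f_linear = lambda v: 4 * v[0] + v[1]                                   # Eq. (8)
f_nonlin_add = lambda v: math.sin(math.pi * v[0]) + math.cos(math.pi * v[1] / 2)  # Eq. (9)
f_nonlin_mult = lambda v: v[0] * v[1]                                  # Eq. (10)
```

Calling make_dataset once per target function, at alpha = 0 and at the two noise levels, reproduces the nine-data-set layout described above.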

The error distance metric does not apply to the relevance results obtained using rank fusion, since those results are categorical.

ErrorDist = \sqrt{ \sum_{i=1}^{N} ( \tilde{R}_i - \tilde{R}_i^{GT} )^2 }   (12)

6. Experiment Results

In this section, we present the evaluations with synthetic and measured data in two forms. First, we report a relevance image that shows the color of the input variable with the maximum relevance value at each pixel location. The color coding scheme maps red, green, blue and yellow to (v_1, v_2, v_3, v_4). Second, we provide a relevance table with the model relevance value for each input variable.

6.1. Synthetic Data

The relevance assignment results using the RT, KNN and SVM methods on the synthetic data without noise are summarized in Figure 1 and Table 1. The results obtained using the fusion methods on the same data are summarized in Figure 2 and Table 2. We also evaluated the methods on synthetic data with noise; those results are omitted for brevity.

Table 1: Relevance table obtained from the synthetic data without noise by the RT, KNN and SVM methods (per-variable model relevance, % Correct, and Distance Error for the linear, non-linear additive, and non-linear multiplicative data sets).

Figure 1. Relevance images obtained from the synthetic data without noise by the RT, KNN and SVM methods. Rows correspond to the categories of relationships (top - linear additive, middle - non-linear multiplicative, bottom - non-linear additive). Columns correspond to methods (left - ground truth, second left - RT, second right - KNN, right - SVM).

6.2. Measured Data

We processed measured remotely sensed data from NASA acquired in 2003, at a spatial resolution of 1000 m per pixel and at the location (latitude x longitude) = ([35.34, 36.35] x [-91.54, -93.3]). We modeled the output variable Fpar (fraction of photosynthetically active radiation) as a function of input variables consisting of LST (Land Surface Temperature), LAI (Leaf Area Index), Latitude, and Longitude. For this geo-spatial location, we anticipated LAI to be more relevant to Fpar than LST, and both Latitude and Longitude to be almost irrelevant. The relevance results are summarized in Figure 3 and Table 3.

7. Discussion

The results presented in the previous section are discussed by comparing the individual machine learning methods and the fusion methods.
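The two metrics of Section 5.3 (Equations (11) and (12)) that underlie these comparisons can be sketched as follows; the function names are illustrative.

```python
# Sketch of the evaluation metrics: percentage of correctness (Equation (11))
# checks, per example, whether the estimate and the ground truth agree on the
# most relevant variable; error distance (Equation (12)) is the Euclidean
# distance between model (global) relevance vectors.
import math

def percent_correct(est, gt):
    """est, gt: per-example lists of normalized relevance vectors."""
    hits = sum(1 for e, g in zip(est, gt)
               if max(range(len(e)), key=e.__getitem__) ==
                  max(range(len(g)), key=g.__getitem__))
    return 100.0 * hits / len(est)

def error_distance(est_model, gt_model):
    """est_model, gt_model: model relevance vectors over the N variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(est_model, gt_model)))
```

As noted above, error_distance is meaningful only for magnitude-preserving estimates, so it is not reported for rank fusion.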

Table 2: Relevance table obtained from the synthetic data without noise by the average and rank fusion methods (per-variable model relevance, % Correct, and Distance Error for the linear, non-linear additive, and non-linear multiplicative data sets; the Distance Error metric does not apply to rank fusion).

7.1. Comparison of Individual Methods

From the experimental results, we observe that RT is a reliable hybrid method that usually gives accurate relevance estimates. KNN is flexible across data types and always gives good relevance estimates in terms of correctness; however, it is very sensitive to noise, and we observe significant performance drops when noise is present. SVM usually yields more robust results under noise, but it is restricted in its expressivity since we use only the linear kernel here. There is no single best method for every data set. The correctness of relevance assignment strongly depends on the type of data and on the learning method used. The fusion methods are proposed based on this motivation.

Figure 3. Relevance images for the NASA data obtained using RT (left top), KNN (left middle), SVM (left bottom), average fusion (right top) and rank fusion (right middle). The most relevant input variable at each pixel is encoded according to the color scheme in the legend (right bottom): LAI, LST, Latitude, Longitude. The majority of pixels is labeled red (LAI).

7.2. Comparison of Fusion Methods

Figure 2. Relevance images obtained from the synthetic data without noise by the average and rank fusion methods. Rows correspond to the categories of relationships (top - linear additive, middle - non-linear multiplicative, bottom - non-linear additive). Columns correspond to methods (left - ground truth, middle - average fusion, right - rank fusion).

The fusion methods usually outperform any single method in terms of relevance assignment correctness (% Correct) for the synthetic data sets without noise. Under this setting, the difference between average fusion and rank fusion is not obvious. Although we have not included the results for the synthetic data with noise, for brevity reasons, we comment on the results obtained: for the synthetic data sets with noise, the correctness of relevance assignment for the fusion methods is still more stable than for a single method (RT, KNN or SVM). However, the fusion methods do not give the optimal estimate compared to the best single method.

8. Conclusion and Future Work

In this paper, we proposed a framework for computing input variable relevance with respect to a

predicted output variable from multiple machine learning methods. Following the conceptual definition of relevance in the literature, we defined partial derivatives of input-output dependencies as our relevance assignment approach. The estimation of the two types of relevance, example and model relevances, was implemented for the Regression Tree, K-Nearest Neighbors, and Support Vector Machine methods. In addition, fusion schemes for combining the relevance results from multiple methods were evaluated, together with the single methods, using synthetic and measured data and two metrics. Based on three categories of synthetic input-output simulations, including linear additive, non-linear additive and non-linear multiplicative relationships, with and without added noise, we concluded that relevance assignment using the fusion approaches demonstrates more robust performance than assignment using a single machine learning method.

Table 3: Relevance assignment results for the remote sensing data set from NASA (per-variable model relevance of LAI, LST, Latitude, and Longitude for RT, KNN, SVM, average fusion, and rank fusion).

In the future, we would like to extend the fusion methods to include the results from other learning methods and to understand the dependencies of relevance assignment on model building setups.

References

[1] Blum A.L. and P. Langley, 1997. Selection of Relevant Features and Examples in Machine Learning. Artificial Intelligence, Vol. 97, 1997, Elsevier Science.
[2] White A. and P. Kumar, 2005. Dominant Influences on Vegetation Greenness Over the Blue Ridge Ecoregion. Submitted to Ecosystems.
[3] Jain A. and D. Zongker, 1997. Feature selection: evaluation, application, and small sample performance. IEEE Trans. Pattern Anal. Machine Intell., 19:153-158, 1997.
[4] Lin C.J., 2006. LIBSVM -- A Library for Support Vector Machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
[5] Kudo M. and J. Sklansky, 2000. Comparison of algorithms that select features for pattern classifiers. Pattern Recognition, 33:25-41, 2000.
[6] Heiler M., D. Cremers, and C. Schnörr, 2001. Efficient Feature Subset Selection for Support Vector Machines. Technical Report 21/2001, Dept. of Mathematics and Computer Science, University of Mannheim.
[7] Cristianini N. and J. Shawe-Taylor, 2000. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, March 2000.
[8] Bajcsy P. and P. Groves, 2004. Methodology for Hyperspectral Band Selection. Photogrammetric Engineering and Remote Sensing, Vol. 70, No. 7, July 2004, pp. 793-802.
[9] Perner P. and C. Apte, 2000. Empirical Evaluation of Feature Subset Selection Based on a Real World Data Set. Principles of Data Mining and Knowledge Discovery, Springer Verlag, 2000, pp. 575-580.
[10] Perner P., 2001. Improving the Accuracy of Decision Tree Induction by Feature Pre-Selection. Applied Artificial Intelligence, 2001, Vol. 15, No. 8, pp. 747-760.
[11] Pudil P., J. Novovicova, and J. Kittler, 1994. Floating search methods in feature selection. Pattern Recognition Letters, 15 (1994), 1119-1125.
[12] De Backer S., P. Kempeneers, W. Debruyn and P. Scheunders, 2005. "A Band Selection Technique for Spectral Classification." IEEE Geoscience and Remote Sensing Letters, Vol. 2, No. 3, pp. 319-323, July 2005.
[13] Levy S. D., 2006. A Java class for KD-tree search. http://www.cs.wlu.edu/~levy/
[14] Han J. and M. Kamber, 2001. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, San Francisco, CA.
[15] Groves P. and P. Bajcsy, 2003. Methodology for Hyperspectral Band and Classification Model Selection. IEEE Workshop on Advances in Techniques for Analysis of Remotely Sensed Data, Washington DC, October 27, 2003.
[16] Duda R., P. Hart and D. Stork, 2001. Pattern Classification, Second Edition. Wiley-Interscience.