VISUALIZATION TECHNIQUES UTILIZING THE SENSITIVITY ANALYSIS OF MODELS

Ivo Kondapaneni, Pavel Kordík, Pavel Slavík
Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
Presenting author: Pavel Kordík (kordikp@fel.cvut.cz)

Overview
- Motivation
- Data mining models
- Visualization based on sensitivity analysis: regression problems, classification problems
- Definition of interesting plots
- Genetic search for 2D and 3D plots

Motivation
- Data mining: extracting new, potentially useful information from data
- DM models are generated automatically
- Are the models always credible? Are they comprehensible?
- How can we extract information from models? Visualization

Data mining models
Often black-box models generated from data, e.g. neural networks. Diagram: input variables feed a data-mining black-box model ("What is inside?") that produces the output variable(s).

Inductive model
- Estimates the output from the inputs
- Generated automatically, evolved by a niching GA
- Grows from a minimal form, contains hybrid units
- Several training methods, ensemble of models
Diagram: input variables feed units (P, L, C, G) in the first and second layers (3 inputs max), a third layer with interlayer connections (4 inputs max), and an output layer producing the output variable.

Example: Housing data
Input variables: CRIM, ZN, INDUS, NOX, RM, AGE, DIS, RAD, TAX, PTRATIO, B, LSTAT
- CRIM: per capita crime rate by town
- AGE: proportion of owner-occupied units built prior to 1940
- DIS: weighted distances to five Boston employment centres
Output variable: MEDV, median value of owner-occupied homes in $1000's

Housing data records (input variables and the MEDV output for two records):

CRIM     ZN  INDUS  NOX   RM     AGE   DIS     RAD  TAX  PTRATIO  B      LSTAT  MEDV
0.00632  18  2.31   53.8  6.575  65.2  4.09    1    296  15.3     396.9  4.98   24
0.02731  0   7.07   46.9  6.421  78.9  4.9671  2    242  17.8     396.9  9.14   21.6

Housing data inductive model
A niching genetic algorithm evolves units in the first layer:
- sigmoid unit, error 0.13: MEDV = 1/(1-exp(-5.724*CRIM + 1.126))
- sigmoid unit, error 0.21: MEDV = 1/(1-exp(-5.861*AGE + 2.111))

Housing data inductive model
The niching genetic algorithm then evolves units in the second layer. First-layer units: sigmoid (error 0.13), sigmoid (0.21), sigmoid (0.24), linear (0.26). The best second-layer polynomial unit (error 0.10):
MEDV = 0.747*(1/(1-exp(-5.724*CRIM + 1.126))) + 0.582*(1/(1-exp(-5.861*AGE + 2.111)))^2 + 0.016
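The layered construction above can be sketched in code. This is a minimal, hypothetical evaluation of the printed units, taking the slide's coefficients at face value; note that the slides print the unit denominator as 1 - exp(...), while the sketch below assumes the standard bounded logistic form 1 + exp(...) so the unit outputs stay in [0, 1].

```python
import math

def unit(arg):
    # Logistic unit; the slides print 1/(1 - exp(arg)), but we assume
    # the standard bounded form 1/(1 + exp(arg)) here.
    return 1.0 / (1.0 + math.exp(arg))

def medv_second_layer(crim, age):
    # First-layer sigmoid units (coefficients copied from the slide).
    u_crim = unit(-5.724 * crim + 1.126)   # error 0.13
    u_age = unit(-5.861 * age + 2.111)     # error 0.21
    # Second-layer polynomial unit combining them (error 0.10).
    return 0.747 * u_crim + 0.582 * u_age ** 2 + 0.016

# Evaluate on the first Housing record (CRIM=0.00632, AGE=65.2).
y = medv_second_layer(0.00632, 65.2)
print(round(y, 3))  # → 0.786 (normalized output scale)
```

The output lies on a normalized scale; mapping it back to dollars would require the scaling used during training, which the slides do not give.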

Housing data inductive model
Further layers add polynomial, exponential and linear units. The constructed model has a very low validation error: 0.08.

Housing data inductive model
The full model (error 0.08):
MEDV = exp((0.038*3.451*(1/(1-exp(-5.724*CRIM+1.126)))*(1/(1-exp(2.413*DIS-2.581)))*(1/(1-exp(2.413*DIS-2.581)))
       + 0.429*(1/(1-exp(-5.861*AGE+2.111)))
       + 0.024*(1/(1-exp(2.413*DIS-2.581))) + 0.036
       + 0.038*0.350*(1/(1-exp(-3.613*RAD-0.088)))
       + 0.999*(0.747*(1/(1-exp(-5.724*CRIM+1.126))) + 0.582*(1/(1-exp(-5.861*AGE+2.111)))*(1/(1-exp(-5.861*AGE+2.111))) + 0.016)
       - 0.046*(1/(1-exp(-5.724*CRIM+1.126))) - 0.079
       + 0.002*INDUS - 0.001*LSTAT + 0.150)*0.860)*13.072 - 14.874
The math equation is not comprehensible any more; we have to treat MEDV as a black-box model!

Visualization based on sensitivity analysis
One input (e.g. x_2) is varied from its minimum to its maximum while the remaining inputs (x_1, x_3 = const.) are held constant, and the response y_k of the model (ModGMDH / GAME) is plotted against it. Varying two inputs (x_1 and x_2) at once with x_3 = const. yields a 3D response surface y_m.
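The sweep described above (one input moving, the rest clamped) can be sketched as follows. The model here is a toy stand-in, since the real GAME models are black boxes; function names and constants are illustrative assumptions.

```python
import numpy as np

def sensitivity_curve(model, x_fixed, i, lo, hi, steps=100):
    """Vary input i from lo to hi while all other inputs keep the
    constant values given in x_fixed; return the sweep and responses."""
    xs = np.linspace(lo, hi, steps)
    rows = np.tile(np.asarray(x_fixed, dtype=float), (steps, 1))
    rows[:, i] = xs                      # overwrite the moving column
    ys = np.array([model(r) for r in rows])
    return xs, ys

# Toy stand-in for a trained model (hypothetical, illustration only).
def toy_model(x):
    return 1.0 / (1.0 + np.exp(-(2.0 * x[1] - 1.0)))

xs, ys = sensitivity_curve(toy_model, x_fixed=[0.5, 0.0, 0.3],
                           i=1, lo=-3.0, hi=3.0)
```

Plotting `ys` against `xs` gives exactly the kind of 2D sensitivity plot shown on the slide; sweeping two indices on a grid would give the 3D surface.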

Sensitivity analysis of an inductive model of MEDV
Plots for house no. 189 and house no. 164. Is the output credible? What will happen to the value of a house when criminality in the area decreases or increases?

Ensemble of inductive models
- Models (ModGMDH / GAME) are randomly initialized and developed on the same training set
- Training affects just the well-defined areas of the input space
- Each model has a unique architecture, but similar complexity and similar transfer functions
- Similar behavior in well-defined areas, different behavior in under-defined areas

Credibility of models: artificial data set
Advantages:
- No need for the training data set
- Success of the modeling method is taken into account
- Importance of the inputs is taken into account
Credibility: the criterion is the dispersion of the models' responses.
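The dispersion criterion can be sketched with a toy ensemble (all models and numbers below are hypothetical stand-ins for GAME models): models trained on the same data agree near the well-covered region and diverge elsewhere, so the standard deviation of their responses serves as the credibility measure.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Toy ensemble: ten linear models with slightly different slopes,
# standing in for GAME models with unique architectures.
def make_model(slope):
    return lambda x: slope * x

ensemble = [make_model(1.0 + 0.2 * rng.standard_normal())
            for _ in range(10)]

def credibility(models, x):
    """Mean response and the dispersion (std) of the ensemble;
    low dispersion marks a credible region of the input space."""
    ys = np.array([m(x) for m in models])
    return ys.mean(), ys.std()

_, disp_near = credibility(ensemble, 0.1)   # near the origin: agreement
_, disp_far = credibility(ensemble, 10.0)   # far away: disagreement
```

For these linear toys the dispersion grows with distance from the origin, mimicking how the GAME models diverge in under-defined areas of the input space.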

Example: Models of hot water consumption

Cold water consumption, increasing humidity

Models on Housing data
Single model vs. ensemble of 10 models (before / after).

Classification problems
Data: Iris Setosa, Virginica and Versicolor classes. The blue GAME model (Iris Setosa class) outputs 1 on one side of the decision boundary and 0 on the other; axes: petal length vs. petal width.

Credibility of classifiers
For each class (Iris Setosa, Iris Virginica, Iris Versicolor), the outputs of GAME models 1, 2 and 3 are multiplied together (1*2*3).
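The per-class combination on this slide (outputs of models 1, 2 and 3 multiplied together) can be sketched as follows; the three toy classifiers are hypothetical stand-ins for GAME models, each producing a class membership in [0, 1].

```python
import math

def combined_membership(models, x):
    """Multiply the [0, 1] class-membership outputs of several models;
    the product stays high only where all models agree."""
    p = 1.0
    for m in models:
        p *= m(x)
    return p

# Three toy classifiers for one class, each a logistic curve with a
# slightly different decision boundary (illustrative only).
models = [
    lambda x: 1.0 / (1.0 + math.exp(-(x - 0.9))),
    lambda x: 1.0 / (1.0 + math.exp(-(x - 1.0))),
    lambda x: 1.0 / (1.0 + math.exp(-(x - 1.1))),
]

inside = combined_membership(models, 3.0)    # all agree: high product
outside = combined_membership(models, -1.0)  # all reject: product near 0
```

Because the product can never exceed the smallest individual output, regions where any one model disagrees are suppressed, which is exactly the filtering effect shown on the slide.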

Overlapping models vs. ensemble

Random behavior filtered out (before / after).

Problem: how to find information in n-dimensional space?
In the multidimensional space of input variables, what are we looking for?
- Interesting relationships of input/output variables
- Regions of high sensitivity
- Credible models (compromise response)
Can we automate the search?

When is a plot interesting for us?
(The plot is parameterized by the interval start x_istart and size x_isize along the input x_i.)

Definition of an interesting plot
- Minimal volume of the envelope: p -> min
- Maximal sensitivity of the output to the change of the x_i input variable: y_size -> max
- Maximal size of the area: x_isize -> max

Multiobjective optimization
Interestingness is optimized over the unknown variables x_1, x_2, ..., x_{i-1}, x_{i+1}, ..., x_n and x_istart, x_isize. We will use a niching genetic algorithm with the chromosome:
x_1 x_2 ... x_{i-1} x_{i+1} ... x_n x_istart x_isize

Niching GA on simple data
Training data; chromosome: x_start, x_size; fitness = x_size * (1/p) * y_size, evaluated on the GAME ensemble. A very simple problem: the search space (x_start vs. x_size) is 2D and can be visualized.
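The fitness from this slide can be sketched for a one-input problem. Everything below is a toy setup: p is taken as the mean dispersion of the ensemble responses over the interval (standing in for the envelope volume), and y_size as the range of the mean response (the output sensitivity).

```python
import numpy as np

def fitness(ensemble, x_start, x_size, steps=50):
    """fitness = x_size * (1/p) * y_size for a one-input problem:
    p      - mean dispersion of ensemble responses (envelope measure)
    y_size - range of the mean response (output sensitivity)."""
    xs = np.linspace(x_start, x_start + x_size, steps)
    Y = np.array([[m(x) for x in xs] for m in ensemble])
    mean = Y.mean(axis=0)
    p = Y.std(axis=0).mean() + 1e-9      # avoid division by zero
    y_size = mean.max() - mean.min()
    return x_size * (1.0 / p) * y_size

# Toy ensemble: models agree on a steep step near x = 0 and drift apart
# further out, so intervals around the step score highest.
ensemble = [lambda x, s=s: np.tanh(5.0 * x) + s * 0.1 * x
            for s in (-1.0, 0.0, 1.0)]

good = fitness(ensemble, x_start=-0.5, x_size=1.0)  # steep, low dispersion
poor = fitness(ensemble, x_start=3.0, x_size=1.0)   # flat, high dispersion
```

A niching GA would evolve `x_start` and `x_size` (plus the clamped values of the other inputs, in the multidimensional case) to maximize this fitness while preserving local optima in separate niches.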

The niching GA also locates local optima
Three subpopulations (niches) of individuals survived.

Automated retrieval of plots showing interesting behavior
A genetic algorithm with a special fitness function is used to adjust all other inputs (dimensions). Best-so-far individual found (generations 0-17).

Housing data: interesting plot retrieved
Low fitness vs. high fitness (before / after).

Conclusion
- Credible regression
- Credible classification
- Automated retrieval

Future work I: automated knowledge extraction from data
Diagram: INPUT DATA -> AUTOMATED DATA PREPROCESSING -> AUTOMATED DATA MINING (GAME ENGINE) -> KNOWLEDGE EXTRACTION and INFORMATION VISUALIZATION -> KNOWLEDGE, all behind the FAKE INTERFACE.

Future work II: FAKE GAME framework
Diagram: PROBLEM IDENTIFICATION, DATA COLLECTION, DATA INTEGRATION, DATA CLEANING, DATA WAREHOUSING and DATA INSPECTION feed INPUT DATA into AUTOMATED DATA PREPROCESSING and the GAME ENGINE (an ensemble of models), behind the FAKE INTERFACE. Outputs: classification, prediction, identification and regression; class boundaries and relationships of variables; credibility estimation; math equations; interesting behaviour; feature ranking.

Future work III
Just released as an open-source project:
- Automated data preprocessing
- Automated model building and validation
- Optimization methods
- Visualization
See and join us: http://www.sourceforge.net/projects/fakegame