Recitation Supplement: Creating a Neural Network for Classification SAS EM December 2, 2002

Introduction

Neural networks are flexible nonlinear models that can be used for regression and classification tasks, in the same way that linear/logistic regression and tree models can. Accordingly, to explore neural network modeling in SAS EM, we will use the same data set myraw.xls that was used in the earlier recitations introducing linear regression and tree models. Recall that this is a data set extracted from a larger data set maintained by a national veteran's organization on people who have made charitable donations to them. We will build a standard neural network model to predict whether someone would contribute to a new fundraising campaign (by modeling the probability of response). Neural networks can also be used for regression tasks (e.g. predicting the amount contributed by a donor), but the extension is straightforward, so we will not discuss it here.

Procedure

The first few steps will be essentially the same as in the October 7 and October 21 recitations, providing a brief review of target profiling and imputation for missing data. The later steps will focus in more detail on the use of the Neural Network node.

1. Download and save myraw.xls from the course website (http://www.orie.cornell.edu/~davidr/or474, under Course Data Sets, labeled as Donations data). Import the data into SAS, start Enterprise Miner, and create a new project. Maximize the EM window; this will make it easier to view the charts and diagrams that we will create.

2. Drag an Input Data Source node onto the diagram, open it, and input the data set. (Also, change the metadata sample to be the full data set.) Make the usual changes to the default assignments under the Variables tab:

(a) Set the Model Role of TARGET_B to target. Set the Model Role of TARGET_D to rejected.

(b) Set the Model Role of the variables PETS and PCOWNERS to input and their Measurement type to binary.

3. Go to the target profile window (right-click on TARGET_B and choose Edit target profile...), and create a new profile (Edit > Create New Profile). Make the new profile active (right-click on its cell in the Use column and select Set to use).

After making sure that the new profile is highlighted, go to the Assessment Information tab. Create a new profit matrix (right-click in the left-hand list and select Add) and make it active (highlight it, right-click on it, and choose Set to use). We will use the same profit/cost information as before: enter 12.32 in column 1, row 1, for the net profit when we correctly predict a response, and -0.68 in column 1, row 0, for the mailing cost incurred when we incorrectly predict a response. Put 0s in both rows of column 0.

Select the Prior tab. Create a new prior vector (right-click in the left-hand list and choose Add) and make it active (highlight it, right-click on it, and choose Set to use). Enter the true population proportions in the right-hand table (0.05 for responders, 0.95 for non-responders). Close the target profiles window and close the node (saving changes in both).

4. Connect a Data Partition node after the Input Data Source node and open it. Under the Partition tab, reset the Percentages for the data partitioning so that the Train data set gets 70%, the Validation set gets 30%, and the Test set gets 0%. (This choice is somewhat arbitrary, but it provides more of the training data needed by a flexible model like a neural network, while still leaving enough data for validation.) Close this node (choosing to save changes).

Just as transforming variables can sometimes improve a standard logistic regression, it can be a useful precursor to neural network modeling. (Recall that logistic regression can, in fact, be viewed as a very simple neural network.) Transformation is not quite as important for more complex neural network models, however, since these are already capable of approximating certain types of nonlinear relationships. Transformations will not be used in this analysis, but if they were desired, a Transform Variables node could be added at this point.
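The profit matrix and prior vector together determine how EM assesses decisions. As a sanity check outside SAS EM, here is a minimal Python sketch (not EM code; the function names are made up) of the decision rule this profit matrix implies:

```python
# Decision rule implied by the profit matrix above (illustrative, not SAS EM code).

PROFIT_RESPOND = 12.32   # net profit: we mail, and the person responds
COST_MAIL = -0.68        # mailing cost: we mail, but there is no response
# Not mailing earns 0 either way (the zeros in column 0).

def expected_profit_of_mailing(p_response):
    """Expected profit of mailing someone with response probability p_response."""
    return p_response * PROFIT_RESPOND + (1.0 - p_response) * COST_MAIL

def should_mail(p_response):
    """Mail whenever expected profit beats the 0 profit of not mailing."""
    return expected_profit_of_mailing(p_response) > 0.0

# Break-even probability: p * 12.32 - (1 - p) * 0.68 = 0  =>  p = 0.68 / 13.00
BREAK_EVEN = 0.68 / 13.00

print(round(BREAK_EVEN, 4))   # 0.0523
print(should_mail(0.06))      # True
print(should_mail(0.04))      # False
```

Note that the break-even probability (about 0.052) is close to the population response rate of 0.05, so the profitable mailing decisions hinge on modest differences in predicted probability.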
(If this analysis were a regression problem rather than a classification problem, transformation of the target variable might be necessary for a good model fit if the error variance depends on the target value, especially when using the usual squared-error loss.)

5. Neural network models, like linear/logistic regression models (and unlike tree models), generally require imputation of missing predictor variables. Connect a Replacement node after the Data Partition node and open it. Look at the Imputation Methods subtab (under the Defaults tab), and change the default imputation method for both the interval and class variables to tree imputation. Now make the usual modifications for the three ambiguous variables: under the Class Variables tab, right-click on the cell in the HOMEOWNR row and Imputation Method column. Choose Select Method... > set value... on the pop-up menus, enter U in the box, and click OK. Do the same for PETS and PCOWNERS. Close this node (choosing to save changes).

6. Connect a Neural Network node (from the EM toolbar) after the Replacement node, and open it. You will see the same Data, Variables, Output, and Notes tabs that you have seen before when opening a Regression or Tree node. The General tab and Basic (or Advanced) tab allow you to specify particular options for building neural networks.

Go to the General tab. By default, the criterion of Profit / Loss will be used to select the final model, which is suitable for the current analysis. The Advanced user interface box, if checked, makes the Advanced tab available instead of the Basic tab. The Advanced tab allows more control over the configuration of the neural network and may therefore be preferred by experienced users. The current analysis will be a simple one, so this box should be left unchecked. The remaining two options may be used to analyze the training process (the fitting of the neural network to the data). We will also leave these at their defaults. (See the help files for more information.)

Go to the Basic tab, which provides simple options for configuring the network. Click the drop button beside the Network architecture text box; this will open a new window.

The text box labeled Hidden neurons controls how many units are in the hidden layer. You can specify this qualitatively (High, Moderate, or Low noise, or Noiseless data), in which case the number of hidden units is determined from the number of inputs and the sample size and indicated in the greyed-out box on the right side. Alternatively, you can specify the number of hidden units directly with the Set number... option. Choose this option, and set the number of hidden neurons to 3.

The Direct connections text box lets you specify whether to include direct connections between input and output units, i.e. model terms that are linear in the input variables (before application of the output activation function). Leave this at the default setting of No to fit the usual type of neural network.

The Network architecture text box allows some limited options for defining the structure of the network.
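To make these configuration choices concrete, here is a minimal Python sketch of the network being set up: three tanh hidden units feeding one logistic output unit, with "direct connections" appearing as optional extra linear terms in the inputs. The weights below are illustrative placeholders, not values fitted by EM.

```python
import math

def logistic(x):
    """Logistic (sigmoid) output activation."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, W_hidden, b_hidden, w_out, b_out, W_direct=None):
    """One forward pass: inputs -> 3 tanh hidden units -> response probability."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(W_hidden, b_hidden)
    ]
    net = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    if W_direct is not None:
        # "Direct connections": terms linear in the inputs, added before
        # the output activation is applied.
        net += sum(w * x for w, x in zip(W_direct, inputs))
    return logistic(net)

# Two inputs, three hidden units (placeholder weights and biases):
W_hidden = [[0.5, -0.2], [1.0, 0.3], [-0.7, 0.8]]
b_hidden = [0.1, -0.1, 0.0]
w_out, b_out = [0.6, -0.4, 0.9], -0.2

p = forward([1.0, 2.0], W_hidden, b_hidden, w_out, b_out)
print(0.0 < p < 1.0)   # True: the logistic output is always a valid probability
```

Training, discussed below, amounts to choosing the weight and bias values that optimize an objective function over the training data.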
By default, this is Multilayer perceptron, the standard type of neural network. With this option, there will by default be a single hidden layer with tanh activation functions. (Because we have one binary target variable, there will naturally be one output unit, with a logistic (softmax) activation function by default.) Click and hold the drop menu button to see the other options, which include a generalized linear model (no hidden layer) and various types of radial basis function networks. Leave this option at its default (Multilayer perceptron). Click OK to return to the main level of the Basic tab.

Training a neural network involves optimizing some type of objective function (often the likelihood function) starting from a given set of initial weights (the parameters of a neural network model). The Preliminary runs text box lets you specify the number of preliminary optimizations that are performed (usually on a subset of the data, to speed up training) to find good starting values for the network weights, and thereby possibly avoid convergence to a suboptimal solution. The Training technique text box lets you choose which optimization method to use. Because training can take excessive amounts of time for large data sets with many variables, a maximum training time can be set in the Runtime limit text box. Leave all of these at their default settings for now. Close the Neural Network node, choosing to save changes and giving the model an appropriate name.

7. Run the Neural Network node. A Process Monitor window may momentarily appear, illustrating the progress of the training by showing the objective function value at each stage for the training and validation data. (If the program seems to freeze when this window appears, try clicking the Continue button.) When the training finishes, choose to view the results.

The Tables tab is open by default. This simply gives tabular representations of several output data sets generated by the Neural Network node. You can access the various tables using the drop menu at the top of the tab; the Fit statistics table is displayed by default.

Click on the Model tab. Its General subtab gives the usual information. The Network subtab gives more detailed information about the model options chosen. This can be useful if you wish to know which options were chosen by default, and the resulting configuration of the network, but interpreting the information requires some understanding of what the options mean.

Click on the Weights tab. The table displayed here (Table subtab) gives the fitted parameter values for the neural network (one for each connection, plus appropriate bias terms). These parameters are identified by the connections with which they are associated, as defined by the node labels in the From and To columns. Input nodes are identified by the corresponding variable name (e.g. AGE), hidden nodes are labeled with a layer number and an index (e.g. H12 for the second node in the first hidden layer), and the output node is labeled with the output variable name. Bias terms are simply labeled as BIAS in the From column and can be thought of as corresponding to a constant input of 1. Note that categorical variables are converted to numerical indicator variables with appropriately modified names, just as in linear/logistic regression. Click on the Graph subtab to see a graphical representation of this information.
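The conversion of categorical inputs into numerical indicator variables can be sketched as follows. The variable levels used here are made up for illustration (only the level U, used for imputation in step 5, comes from this analysis):

```python
# Sketch of indicator (one-per-level dummy) coding for a class variable,
# analogous to the modified variable names shown in the Weights table.
# The levels below are hypothetical examples.

def indicator_encode(value, levels):
    """Map one categorical value to a list of 0/1 indicators, one per level."""
    return [1 if value == level else 0 for level in levels]

homeownr_levels = ["H", "U"]   # e.g. homeowner vs. unknown (illustrative)
print(indicator_encode("U", homeownr_levels))   # [0, 1]

# A bias term behaves like one more input that is always equal to 1:
row = [1] + indicator_encode("H", homeownr_levels)
print(row)   # [1, 1, 0]
```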
(Note that you may need to scroll to see all of the variables: choose the scroll tool, or right-click on the graph.)

Click on the Plot tab. The graphs here show the performance of the model on the training and validation data sets as a function of training (optimization) iteration number. The criterion being graphed appears to be Average Error by default; you can change this by right-clicking on the graph and selecting a different option from the pop-up menu. The vertical line indicates which iteration was chosen to produce the optimal weights. Interestingly, the weights chosen are not those corresponding to the final result of the optimization, but rather those corresponding to the iteration for which the model selection criterion is best on the validation data. (To verify this in the current case, right-click on the graph and choose Profit to see graphs of the average profit as a function of iteration.) This is an example of model selection by early stopping; see Sec. 11.5.2 in the text. (By default, weight decay is not performed; it can be chosen only manually, through an appropriate selection in the node's Advanced tab.)

The remaining tabs (Code, Log, Output, and Notes) work as usual and are less important in casual use. (There is no apparent way to view the connection diagram from the results browser, but such a diagram can be created in the Advanced tab when the node is opened, to allow interactive training.)

Try It Yourself

1. Add an Assessment node to the diagram, connecting it after the Neural Network node. View the different assessment charts for the model. Then add a Regression node (with stepwise model selection based on profit for the validation data) in parallel with the Neural Network node. How do the two models compare on the basis of the assessment charts over the validation data?

2. Add a few more Neural Network nodes (in parallel with the original node) with different numbers of hidden units. Can you find a neural network model that outperforms logistic regression on this data set?

Created by Trevor Park on December 1, 2002