CS 4510/9010: Applied Machine Learning
Function Algorithms: Linear Regression, Logistic Regression
Paula Matuszek
Fall, 2016
Some of these slides originated from Andrew Moore's tutorials, at http://www.cs.cmu.edu/~awm/tutorials.html
Linear models for classification
- Any regression technique can be used for classification.
- Binary classification: a line separates the two classes.
- Decision boundary: defines where the decision changes from one class value to the other.
- Prediction is made by plugging the observed attribute values into the expression.
- Predict one class if output >= cutoff, and the other class if output < cutoff.
Data Mining: Practical Machine Learning Tools and Techniques (Chapter 3)
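The cutoff rule above can be sketched in plain Python (a toy illustration, not Weka; the one-attribute closed-form fit and the data are made up for the example):

```python
# Minimal sketch: linear regression as a binary classifier with a 0.5 cutoff.
# Classes are coded 0 and 1; the regression output is thresholded.

def fit_simple_linear(xs, ys):
    """Least-squares fit of y = w0 + w1*x (closed form, one attribute)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
         sum((x - mean_x) ** 2 for x in xs)
    w0 = mean_y - w1 * mean_x
    return w0, w1

def classify(w0, w1, x, cutoff=0.5):
    """Predict class 1 if the regression output reaches the cutoff, else 0."""
    return 1 if w0 + w1 * x >= cutoff else 0

# Toy data: small attribute values belong to class 0, large ones to class 1.
xs = [1.0, 1.2, 1.4, 4.5, 4.7, 5.0]
ys = [0, 0, 0, 1, 1, 1]
w0, w1 = fit_simple_linear(xs, ys)
print(classify(w0, w1, 1.1), classify(w0, w1, 4.8))  # -> 0 1
```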
Weka with a Binary Class
- LinearRegression can only be run if the class is numeric.
- NominalToNumeric: unsupervised attribute filter. Filters don't run on the class attribute.
- irispetal.arff: resulting predictions are close to 0 or 1.
- There is a filter to add classifier results to the attributes: AddClassification, a supervised attribute filter. Does exactly the same thing as the classifier.
- Can then get the predicted nominal class, using predictedClassification to predict the original class with something as simple as OneR.
Classification
- For multiple classes, a one-vs-many comparison:
- Training: perform a regression for each class, setting the output to 1 for training instances that belong to the class, and 0 for those that don't.
- Prediction: predict the class corresponding to the model with the largest output value (membership value).
- Sometimes called multi-response linear regression.
Data Mining: Practical Machine Learning Tools and Techniques (Chapter 4)
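The one-vs-many scheme above can be sketched as follows (a plain-Python toy with one attribute and invented data, not the Weka implementation): one regression per class with 0/1 targets, and the largest output wins.

```python
# Sketch of multi-response linear regression: train one least-squares
# model per class (target 1 for members, 0 for everyone else), then
# predict the class whose model gives the largest membership value.

def fit_simple_linear(xs, ys):
    """Least-squares fit of y = w0 + w1*x (one attribute)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
         sum((x - mx) ** 2 for x in xs)
    return my - w1 * mx, w1

def train_multiresponse(xs, labels, classes):
    # One model per class, with the 0/1 membership targets from the slide.
    return {c: fit_simple_linear(xs, [1 if l == c else 0 for l in labels])
            for c in classes}

def predict(models, x):
    # Largest output value (membership value) wins.
    return max(models, key=lambda c: models[c][0] + models[c][1] * x)

xs = [1.0, 1.2, 3.0, 3.2, 5.0, 5.2]
labels = ["a", "a", "b", "b", "c", "c"]
models = train_multiresponse(xs, labels, ["a", "b", "c"])
print(predict(models, 1.1), predict(models, 5.1))  # -> a c
```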
Logistic Regression
- Linear regression methods predict a numeric attribute; we can then use the prediction as a classifier by setting a cutoff score. Complicated!
- These numeric predictions are not probabilities: they can be negative, and can be > 1.
- If our outcome attribute is nominal rather than numeric, we more typically use a logistic regression model instead.
- Predict the probability that an instance falls into each class.
- Sometimes we would prefer to know the relative probabilities. Will it rain? Does this person have diabetes?
Logistic Regression
- Still uses a linear sum of weighted attributes to predict the class, but it is transformed using a logit transformation:
  P(class = 1 | a1, a2, ..., ak) = 1 / (1 + e^-(w0 + w1*a1 + w2*a2 + ... + wk*ak))
- Results in a value between 0 and 1.
- Weights are chosen to maximize the log-likelihood.
https://www.ling.upenn.edu/~joseff/rstudy/week7.html
Logistic Regression in Weka
- Expects a nominal class, not a numeric one.
- Get the usual evaluation statistics for a nominal class: accuracy, confusion matrix, precision, recall, etc.
- May be especially interesting to look at the instances.
- vote.arff: democrat or republican. Binary, so a single model: probability of democrat.
- Predictions on test data give P(class) and errors; most are near 1. Values lower than 1 are often errors.
Logistic Regression, Multi-class
- glass.arff: seven possible class values, so six models.
- Each model asks: does this instance belong to this class?
- If more than one model says yes, take the highest probability.
- If none of them says yes, the seventh class (headlamps).
- We have a model for vehicle wind non-float, but no instances in the training data and no predicted instances, so the model is not meaningful.
- Usual evaluation statistics.
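The decision rule on this slide can be sketched as follows (a hedged plain-Python illustration; the model weights and class names other than "headlamps" are invented, and real Weka models would be learned from glass.arff):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(models, attrs, default_class):
    """One-vs-rest rule: if more than one model says yes (p >= 0.5),
    take the highest probability; if none says yes, fall back to the
    remaining class, as with headlamps in glass.arff."""
    probs = {c: sigmoid(w[0] + sum(wi * a for wi, a in zip(w[1:], attrs)))
             for c, w in models.items()}
    yes = {c: p for c, p in probs.items() if p >= 0.5}
    if yes:
        return max(yes, key=yes.get)  # highest membership probability wins
    return default_class              # no model said yes

# Two invented models (weight vectors [w0, w1]) over one attribute:
models = {"build_float": [2.0, -1.0], "tableware": [-3.0, 0.5]}
print(predict_class(models, [1.0], "headlamps"))   # -> build_float
print(predict_class(models, [10.0], "headlamps"))  # -> tableware
print(predict_class(models, [3.0], "headlamps"))   # -> headlamps
```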
Logistic Regression
- Most typically used when the class is nominal and the other attributes are numeric.
- Like linear regression, not sensitive to irrelevant attributes.
- Still normally assumes linearity.