Universität Freiburg Lehrstuhl für Maschinelles Lernen und natürlichsprachliche Systeme. Machine Learning (SS2012)



Prof. Dr. M. Riedmiller, Manuel Blum

Exercise Sheet 5

Exercise 5.1: Spam Detection

Suppose you are working on a spam detection system. You formulated the problem as a classification task where Spam is the positive class and not Spam is the negative class. Your training set contains $m$ emails; 99% of these are non-spam and 1% are spam.

(a) What accuracy does a classifier achieve that always predicts not Spam?

(b) The fraction of spam emails that are correctly classified is measured by the recall value. What is the recall of the always-non-spam classifier?

$$\text{recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$$
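As a quick numerical check for parts (a) and (b), the following minimal sketch computes accuracy and recall from confusion-matrix counts for a classifier that always predicts not Spam. Only the 99%/1% class split comes from the exercise; the training-set size `m = 1000` and the helper names are assumptions made for illustration.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all instances that are classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    """Fraction of actual positives that are found (0.0 if there are none)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


m = 1000                 # hypothetical training-set size (assumption)
n_spam = m // 100        # 1% of the emails are spam
n_not_spam = m - n_spam  # 99% are non-spam

# A classifier that always answers "not Spam" never produces a positive.
tp, fp = 0, 0
fn, tn = n_spam, n_not_spam

baseline_accuracy = accuracy(tp, fp, fn, tn)
baseline_recall = recall(tp, fn)
print(baseline_accuracy, baseline_recall)
```

The gap between the two numbers illustrates why accuracy alone says little on a class distribution this skewed.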

(c) Suppose you trained a classifier using an MLP with one output neuron. If the activation of the output neuron is larger than a threshold, the instance will be classified as positive. The result on the training set using a threshold of 0.5 is summarized in Table 1. In total, there are 24 instances classified as Spam. The percentage of those that really are spam is called precision:

$$\text{precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$$

Calculate precision and recall for the MLP classifier. What happens as you vary the threshold?

                           Predicted class
                           Spam    not Spam
Actual class   Spam        8       2
               not Spam

Table 1: Performance of the MLP spam detection system on the training set.

(d) Finally, you evaluate the performance of your classifier on independent test data and observe that the results are substantially worse than on the training set. Which of the following measures are likely to improve the performance of your spam detection system?

□ using additional features
□ increasing the weight decay parameter
□ increasing the number of gradient descent iterations
□ reducing the number of gradient descent iterations
□ reducing the number of hidden neurons in the MLP
□ obtaining more training data
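For part (c), precision and recall follow directly from the counts given in the exercise: 8 true positives and 2 false negatives in Table 1, and 24 instances classified as Spam in total. The sketch below computes both values and then varies the threshold on a small set of activations. The `scores` and `labels` lists are invented for illustration and are not data from the sheet.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall); a ratio with an empty denominator is 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Counts stated in the exercise text and Table 1.
tp, fn = 8, 2
predicted_spam = 24
fp = predicted_spam - tp  # every other predicted Spam is a false positive
table_precision, table_recall = precision_recall(tp, fp, fn)
print(table_precision, table_recall)

# Hypothetical output activations and true labels (1 = Spam), assumption only.
scores = [0.95, 0.80, 0.65, 0.60, 0.40, 0.35, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]


def sweep(threshold):
    """Precision and recall when activations above `threshold` count as Spam."""
    pairs = list(zip(scores, labels))
    tp_t = sum(1 for s, y in pairs if s > threshold and y == 1)
    fp_t = sum(1 for s, y in pairs if s > threshold and y == 0)
    fn_t = sum(1 for s, y in pairs if s <= threshold and y == 1)
    return precision_recall(tp_t, fp_t, fn_t)


for threshold in (0.3, 0.5, 0.7):
    print(threshold, sweep(threshold))
```

On this toy data a higher threshold trades recall for precision. Recall can never increase as the threshold rises, whereas precision usually, but not necessarily, improves.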

Exercise 5.2: Boosting with Decision Stumps

Apply the AdaBoost algorithm to train a classifier for the dataset specified in Table 2. Consider four decision stumps $S_N$, $S_C$, $S_R$, and $S_F$, one for each attribute, that classify different instances as positive and negative. So, for example, the decision stump belonging to the coughing attribute classifies the patterns as $S_C(d_i) = \text{true}$ for $i \in \{1, 2, 6\}$ and $S_C(d_i) = \text{false}$ for $i \in \{3, 4, 5\}$.

(a) Apply $T = 4$ iterations of the AdaBoost algorithm to the training patterns provided. In each iteration, select the decision stump that yields the lowest error on the reweighted pattern distribution.

(b) Verify whether your final classifier $H_{\text{final}}$ correctly classifies all training patterns.

Training example   N (running nose)   C (coughing)   R (reddened skin)   F (fever)   Classification
$d_1$   positive (ill)
$d_2$   positive (ill)
$d_3$   positive (ill)
$d_4$   +   negative (healthy)
$d_5$   negative (healthy)
$d_6$   negative (healthy)

Table 2: List of training instances.
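The procedure in part (a) can be sketched as discrete AdaBoost over a fixed pool of stumps. This uses one common formulation (stump weight $\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$, exponential reweighting, renormalisation), and the lecture's variant may differ in details. The labels and the predictions of $S_C$ are taken from the exercise text; the predictions of `N`, `R` and `F` below are hypothetical stand-ins, because the attribute values of Table 2 are not reproduced here.

```python
import math


def adaboost(stump_predictions, labels, rounds):
    """Discrete AdaBoost over a fixed pool of stumps.

    stump_predictions maps a stump name to its +1/-1 prediction per pattern.
    Returns the list of (stump name, alpha) pairs and the final weights.
    """
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Weighted error of each stump under the current distribution.
        errors = {
            name: sum(w for w, p, y in zip(weights, preds, labels) if p != y)
            for name, preds in stump_predictions.items()
        }
        best = min(errors, key=errors.get)
        # Clamp to avoid log(0) or division by zero for a perfect stump.
        eps = min(max(errors[best], 1e-12), 1.0 - 1e-12)
        alpha = 0.5 * math.log((1.0 - eps) / eps)
        ensemble.append((best, alpha))
        # Increase the weight of misclassified patterns, then renormalise.
        preds = stump_predictions[best]
        weights = [w * math.exp(-alpha * y * p)
                   for w, p, y in zip(weights, preds, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble, weights


def classify(ensemble, stump_predictions, i):
    """Sign of the alpha-weighted vote for pattern index i."""
    score = sum(alpha * stump_predictions[name][i] for name, alpha in ensemble)
    return 1 if score > 0 else -1


LABELS = [+1, +1, +1, -1, -1, -1]       # d_1..d_3 ill, d_4..d_6 healthy
STUMPS = {
    "N": [+1, +1, +1, +1, -1, -1],      # hypothetical (assumption)
    "C": [+1, +1, -1, -1, -1, +1],      # as stated in the exercise
    "R": [+1, -1, +1, -1, -1, -1],      # hypothetical (assumption)
    "F": [-1, +1, +1, -1, +1, -1],      # hypothetical (assumption)
}

ensemble, final_weights = adaboost(STUMPS, LABELS, rounds=4)
predictions = [classify(ensemble, STUMPS, i) for i in range(len(LABELS))]
print(ensemble)
print(predictions == LABELS)
```

For part (b), comparing `predictions` with `LABELS` is exactly the check of whether the final classifier gets every training pattern right. With the real attribute values from Table 2, the selected stumps and their weights may of course differ from this toy run.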

Exercise 5.3: Probabilities

We consider a medical diagnosis task. We know that 0.8% of the entire population have cancer. There exists a (binary) laboratory test that is an imperfect indicator of this disease. The test returns a correct positive result in 98% of the cases in which the disease is present, and a correct negative result in 97% of the cases in which the disease is not present.

(a) Suppose we observe a patient for whom the laboratory test returns a positive result. Calculate the a posteriori probability that this patient truly suffers from cancer.

(b) Knowing that the lab test is imperfect, a second test (which is assumed to be independent of the first one) is conducted. Calculate the a posteriori probabilities for cancer and ¬cancer given that the second test has returned a positive result as well.
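Both parts are applications of Bayes' rule, which the minimal sketch below evaluates with the numbers from the exercise. Two assumptions are made here: the second test has the same error rates as the first, and "independent" is read as conditionally independent given the true disease status, so that the first posterior can serve as the prior for the second update. The function and variable names are illustrative.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) by Bayes' rule."""
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


p_cancer = 0.008     # 0.8% of the population have cancer
sensitivity = 0.98   # P(positive | cancer)
specificity = 0.97   # P(negative | no cancer)

# (a) one positive test
after_first = posterior(p_cancer, sensitivity, specificity)

# (b) a second positive test; assumes identical error rates and
# conditional independence given the disease status
after_second = posterior(after_first, sensitivity, specificity)

print("P(cancer | +)     =", round(after_first, 4))
print("P(no cancer | +)  =", round(1.0 - after_first, 4))
print("P(cancer | ++)    =", round(after_second, 4))
print("P(no cancer | ++) =", round(1.0 - after_second, 4))
```

Under these assumptions the probability of cancer is roughly 0.21 after one positive test and roughly 0.90 after two, which shows how strongly the low prior dominates a single imperfect test.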

We turn to politics. For the upcoming mayoral election, people are allowed to vote either for candidate A or candidate B. There are 1000 registered voters who have already voted by postal vote.

(c) Assume that all postal voters have voted for candidate A. Moreover, we assume that all remaining voters decide by flipping a fair (non-manipulated) coin. What is the probability that candidate A wins the election?
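One way to approach part (c) is to treat the number of coin-flip votes for A as a binomial random variable and sum the probabilities of all outcomes in which A ends up ahead. The sketch below does this exactly with rational arithmetic. The 1000 postal votes come from the exercise; the number of remaining voters (`2000` in the example call) is a hypothetical value, and the sketch further assumes that every remaining voter actually votes and that A wins only with strictly more votes than B.

```python
from fractions import Fraction
from math import comb


def prob_a_wins(postal_votes_for_a, remaining_voters):
    """Exact P(A gets strictly more votes than B).

    Each remaining voter independently votes A or B with probability 1/2,
    so the number of their votes for A is Binomial(remaining_voters, 1/2).
    """
    n = remaining_voters
    favourable = sum(
        comb(n, k)                       # ways to get k coin-flip votes for A
        for k in range(n + 1)
        if postal_votes_for_a + k > n - k
    )
    return Fraction(favourable, 2 ** n)


# Hypothetical number of remaining voters (assumption, not from the sheet).
example = prob_a_wins(postal_votes_for_a=1000, remaining_voters=2000)
print(float(example))
```

For an electorate too large for exact summation to be practical, a normal approximation of the binomial would be the usual substitute.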
