Climbing the Kaggle Leaderboard by Exploiting the Log-Loss Oracle

1 Climbing the Kaggle Leaderboard by Exploiting the Log-Loss Oracle Jacob Whitehill Worcester Polytechnic Institute

2 Machine learning competitions Data-mining competitions (Kaggle, KDDCup, DrivenData, etc.) have become a mainstay of machine learning practice. They help ensure comparability and reproducibility by providing a common test set and common rules.

3 Machine learning competitions They can help incentivize machine learning innovations in specific application domains: cervical cancer diagnosis from images, passenger security screening in airports, restaurant visitor forecasting, dog breed identification. Ancillary benefit: they provide credibility to new data scientists when searching for a job.

4 Machine learning competitions 1. The competition organizer assembles training and testing data. [Diagram: the organizer holds training images with labels and testing images with ground-truth labels y_1, ..., y_n.]

5 Machine learning competitions 2. The contestant obtains the training examples with their labels, and the testing examples without labels.

6 Machine learning competitions 3. The contestant uses machine learning to produce guesses ŷ_1, ..., ŷ_n for the test labels.

7 Machine learning competitions The contestant submits the guesses ŷ_1, ..., ŷ_n to the organizer, who records their accuracy c against the ground truth y_1, ..., y_n.

8 Machine learning competitions When the competition is over, the organizer reports which contestant won the contest. [Diagram: contestants A, B, and C, each with a recorded accuracy c.]

9 Machine learning competitions The contestant sends guesses ŷ_1, ..., ŷ_n to an oracle maintained by the organizer.

10 Machine learning competitions The oracle reports the accuracy c during the competition.

11 Machine learning competitions Oracle feedback can help contestants identify more and less promising ML approaches.

12 Machine learning competitions Question: can the oracle be exploited to deduce the true labels illicitly?

13 Log-loss One of the most common error metrics for classification problems is the log-loss. If $Y_n$ and $\hat{Y}_n$ are the (n x c) ground-truth and guess matrices, respectively (each element in [0,1]), then the log-loss is:

$\ell_n \doteq f(Y_n, \hat{Y}_n) = -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} y_{ij} \log \hat{y}_{ij}$
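
The following is a minimal Python sketch of this metric, not code from the talk; the clipping constant eps and the row renormalization mirror how Kaggle-style platforms typically guard against log(0), and are assumptions here:

    import numpy as np

    def log_loss(Y, Y_hat, eps=1e-15):
        """Multi-class log-loss: Y (one-hot) and Y_hat are (n x c) arrays."""
        Y_hat = np.clip(Y_hat, eps, 1 - eps)              # keep probabilities away from 0 and 1
        Y_hat = Y_hat / Y_hat.sum(axis=1, keepdims=True)  # renormalize each row to sum to 1
        return -np.sum(Y * np.log(Y_hat)) / Y.shape[0]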

14 Example Suppose there are 2 examples and 3 classes, and the contestant submits the following guesses to the oracle:

$\hat{Y}_2 = \begin{bmatrix} e^{-2} & e^{-1} & 1 - e^{-2} - e^{-1} \\ e^{-8} & e^{-4} & 1 - e^{-8} - e^{-4} \end{bmatrix}$

Suppose the oracle reports that:

$f(Y_2, \hat{Y}_2) = -\frac{1}{2} \sum_{i=1}^{2} \sum_{j=1}^{3} y_{ij} \log \hat{y}_{ij} = 3$

By iterating over all 3^2 possible ground-truths, we can easily determine that:

$Y_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$
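
A quick way to check this example is to enumerate the 3^2 candidate ground-truths directly; the standalone NumPy sketch below recovers the unique labeling whose loss equals the reported value of 3:

    import itertools
    import numpy as np

    Y_hat = np.array([[np.exp(-2), np.exp(-1), 1 - np.exp(-2) - np.exp(-1)],
                      [np.exp(-8), np.exp(-4), 1 - np.exp(-8) - np.exp(-4)]])

    for classes in itertools.product(range(3), repeat=2):   # all 3^2 possible ground-truths
        Y = np.eye(3)[list(classes)]                         # one-hot rows
        loss = -np.sum(Y * np.log(Y_hat)) / 2
        if abs(loss - 3.0) < 1e-6:
            print(classes)   # -> (0, 1): example 1 is class 1, example 2 is class 2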

15 Exploiting the log-loss In general, there are c^n possible ground-truths for n examples, which is far too many to be tractable. However, because the log-loss decomposes across examples, we can iteratively apply brute-force search over small batches of size m ≪ n. Over roughly ⌈n/m⌉ such batches, we can deduce the ground-truth labels of the entire test set.

16 Exploiting the log-loss

$\ell_n = -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} y_{ij} \log \hat{y}_{ij} = -\frac{1}{n} \left[ \sum_{i=1}^{k} \sum_{j} y_{ij} \log \hat{y}_{ij} + \sum_{i=k+1}^{k+m} \sum_{j} y_{ij} \log \hat{y}_{ij} + \sum_{i=k+m+1}^{n} \sum_{j} y_{ij} \log \hat{y}_{ij} \right]$

17 Exploiting the log-loss The three terms correspond to the already inferred examples (i = 1, ..., k), the probed examples (i = k+1, ..., k+m), and the unprobed examples (i = k+m+1, ..., n):

$\ell_n = -\frac{1}{n} \Big[ \underbrace{\sum_{i=1}^{k} \sum_{j} y_{ij} \log \hat{y}_{ij}}_{\text{already inferred}} + \underbrace{\sum_{i=k+1}^{k+m} \sum_{j} y_{ij} \log \hat{y}_{ij}}_{\text{probed}} + \underbrace{\sum_{i=k+m+1}^{n} \sum_{j} y_{ij} \log \hat{y}_{ij}}_{\text{unprobed}} \Big]$

18 Exploiting the log-loss For the already inferred examples, we submit their (already known) labels as near-one-hot guesses, so their term of the log-loss is approximately 0:

$\sum_{i=1}^{k} \sum_{j} y_{ij} \log \hat{y}_{ij} \approx 0$

*It is not exactly 0, because the guesses are clipped to $\hat{y}_{ij} \in [\epsilon,\, 1 - (c-1)\epsilon]$.

19 Exploiting the log-loss For each unprobed example i, if we set $\hat{y}_{ij} = 1/c$ for all j, then the log-loss contributed by example i is just $\log c$, regardless of its true label:

$-\sum_{j} y_{ij} \log \hat{y}_{ij} = -\log(1/c) = \log c$

so the unprobed examples together contribute $(n - m - k) \log c$ to $n \ell_n$.

20 Exploiting the log-loss Hence, the log-loss $\ell_m$ due to just the m probed examples is

$\ell_m \doteq -\frac{1}{m} \sum_{i=k+1}^{k+m} \sum_{j} y_{ij} \log \hat{y}_{ij} = \frac{1}{m} \big( n \ell_n - (n - m - k) \log c \big)$

where $\ell_n$ is the log-loss over all n examples, as returned by the oracle.
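
In code, recovering the batch loss from the oracle's reply is a one-liner; the variable names below are mine, chosen to match the notation above:

    import math

    def batch_loss(ell_n, n, m, k, c):
        """Log-loss on the m probed examples, given the oracle's overall loss ell_n.
        k examples are already inferred (guessed as near-one-hot), and the
        n - m - k unprobed examples are guessed uniformly as 1/c per class."""
        return (n * ell_n - (n - m - k) * math.log(c)) / m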

21 Exploiting the log-loss Now that we know the log-loss on just the batch of probed examples, we can apply brute-force optimization over $Y_m$. For any candidate ground-truth $Y_m$ of the probed examples, define the estimation error

$\epsilon \doteq \big| \ell_m - f(Y_m, \hat{Y}_m) \big|$

and select the $Y_m$ that minimizes $\epsilon$.
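
A sketch of this decoding step (illustrative, not the talk's implementation): scan all c^m candidate labelings of the probed batch and keep the one whose loss best matches the recovered batch loss.

    import itertools
    import numpy as np

    def decode_batch(ell_m, Y_hat_m, c):
        """Return the one-hot labeling of the probed batch whose log-loss under
        the probe guesses Y_hat_m is closest to the recovered batch loss ell_m."""
        m = Y_hat_m.shape[0]
        best, best_err = None, float("inf")
        for classes in itertools.product(range(c), repeat=m):   # all c^m labelings
            Y_m = np.eye(c)[list(classes)]
            loss = -np.sum(Y_m * np.log(Y_hat_m)) / m
            err = abs(ell_m - loss)                              # estimation error
            if err < best_err:
                best, best_err = Y_m, err
        return best

For m = 6 and c = 3 this is only 729 candidates per batch, so the search is essentially instantaneous.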

22 Choosing the guesses Ŷm But how should we choose the guesses Ŷ_m? If the oracle's floating-point precision were infinite, then we could just choose a random real-valued matrix Ŷ_m. With probability 1, the log-loss ℓ_m would be different for every possible ground-truth, and we could recover Y_m unambiguously.

23 Choosing the guesses Ŷm In practice, the oracle's floating-point precision is finite (e.g., 5 decimal places), so collisions can occur: different Y_m can yield very similar log-loss values ℓ_m.

24 Choosing the guesses Ŷm Consider a set of guesses in which no two values are closer than some minimum gap. Even so, two possible ground-truth matrices can yield log-loss values that are very close to each other (< 10^-4 apart).

25 Choosing the guesses Ŷm To avoid collisions, we want to choose Ŷ_m so that no two distinct ground-truths Y_m and Y_m' result in a similar loss:

$Q(\hat{Y}_m) \doteq \min_{Y_m \neq Y'_m} \big| f(Y_m, \hat{Y}_m) - f(Y'_m, \hat{Y}_m) \big|$

We want to maximize Q(Ŷ_m):

$\hat{Y}_m \doteq \arg\max_{\hat{Y}_m} Q(\hat{Y}_m)$

This is a constrained minimax (maximin) optimization problem.

26 Choosing the guesses Ŷm Special optimization algorithms do exist for constrained minimax problems. However, in our case they are impractical because the number of constraints grows exponentially with m. In practice, we employed an ad-hoc heuristic to maximize Q w.r.t. Ŷm.
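
The talk does not spell out the heuristic; one simple stand-in, shown here purely as an illustrative sketch, is random search: sample many candidate probe matrices and keep the one with the largest collision margin Q.

    import itertools
    import numpy as np

    def collision_margin(Y_hat_m, c):
        """Q(Y_hat_m): the smallest gap in log-loss between two distinct labelings."""
        m = Y_hat_m.shape[0]
        losses = sorted(-np.sum(np.eye(c)[list(cls)] * np.log(Y_hat_m)) / m
                        for cls in itertools.product(range(c), repeat=m))
        return min(b - a for a, b in zip(losses, losses[1:]))

    def random_probe(m, c, tries=200, seed=0):
        """Keep the random probe matrix with the largest margin Q (an ad-hoc heuristic)."""
        rng = np.random.default_rng(seed)
        candidates = rng.dirichlet(np.ones(c), size=(tries, m))  # each row sums to 1
        return max(candidates, key=lambda Y: collision_margin(Y, c))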

27 Upper bound on quality Note that m cannot be too large because the quality Q of the best Ŷm decreases exponentially with m.

28 Upper bound on quality For m=6, we found a Ŷm that worked well in practice.

29 Intel-MobileODT Kaggle competition We applied this algorithm to climb the Kaggle leaderboard for the Intel-MobileODT Kaggle (2017) competition. Topic: diagnosis of cervical cancer from medical images. Competition structure: 1st phase: test set of 512 examples 2nd phase: test set of 4018 examples (including original 512)

30 Intel-MobileODT Kaggle competition Competition structure: the 1st phase (test set of 512 examples) was mostly for informational purposes; the 2nd phase (test set of 4018 examples, including the original 512) was used to decide who wins the $100K prize.

31 Example submission We submit these guesses to the oracle; the rows are grouped into already inferred, probed, and unprobed examples:

image_name,type_1,type_2,type_3
0.jpg,0.,1.,0.                              <- already inferred
1.jpg,...e-01,...e-01,...e-02               <- probed
2.jpg,...e-01,...e-01,...e-04               <- probed
3.jpg,...e-01,...e-01,...e-02               <- probed
4.jpg,...e-01,...e-01,...e-01               <- probed
5.jpg,0.333...,0.333...,0.333...            <- unprobed (uniform 1/c)
6.jpg,0.333...,0.333...,0.333...            <- unprobed
7.jpg,0.333...,0.333...,0.333...            <- unprobed
8.jpg,0.333...,0.333...,0.333...            <- unprobed
9.jpg,0.333...,0.333...,0.333...            <- unprobed

32 Example submission 1. We receive the log-loss ℓ_n from the oracle. 2. We calculate the loss ℓ_m on just the probed examples. 3. We conduct brute-force optimization to identify Y_m:

1.jpg: 0, 1, 0
2.jpg: 1, 0, 0
3.jpg: 0, 0, 1
4.jpg: 0, 0, 1

33 Kaggle MobileODT competition (1st stage) We repeat this process, batch by batch, until we have inferred the labels of all 512 test examples.

39 Kaggle MobileODT competition (1st stage) Eventually, we achieved a log-loss of 0 and climbed to rank #4 on the leaderboard without doing any real machine learning.

40 Kaggle MobileODT competition To be clear: the 2nd stage of the competition (which we did not win) was the basis for awarding the $100K prize. Our 2nd-stage rank was 225 out of 884 contestants (top 30%), since 512 out of 4018 examples were the same as in 1st stage.

41 Kaggle MobileODT competition Some other data-mining competition sites, such as DrivenData, host competitions that have no 2nd stage. In light of the log-loss attack, this seems dangerous. There might be some ancillary value in performing well even without winning actual prize money: bragging rights, useful for getting a job interview?

42 Inferring the subset of examples In the previous example (1st stage), the oracle's accuracy was reported on the entire test set. What if the oracle reports the log-loss on a subset of the test examples but doesn't say which ones?

43 Inferring the subset of examples This can also be done, as long as the evaluated subset E is fixed throughout the competition. High-level algorithm: 1. Find a single example that is a member of E. 2. Using this example, infer the size s = |E|. 3. When inferring Y_m of each batch, also consider whether or not i ∈ E for each i = 1, ..., m.

44 Inferring the subset of examples In a simulation in which we varied the floating-point precision p of the oracle, we found that even without knowing E the attacker can infer Yn with high accuracy. Baseline guess rate: 33.33%

45 Foiling the log-loss attack

46 Adaptive data analysis The previous log-loss attack is an example of (malicious) adaptive data analysis: we use the performance of a previous analysis to inform the next one. Other forms of adaptive data analysis also exist, e.g., overfitting to the test data by tuning hyperparameters. Recent research on privacy-preserving machine learning and complexity theory has sought to find remedies.

47 Ladder algorithm (Blum & Hardt 2015) Problem (for ML community) with log-loss attack: The classifier does very well on the test set, but is useless on the true data distribution.

48 Ladder algorithm (Blum & Hardt 2015) The goal of the Ladder algorithm is to ensure that leaderboard accuracies reflect each classifier's true loss w.r.t. the entire data distribution, not just the empirical (test-set) loss. In essence: accept a new submission only if its accuracy is significantly better than the previous one.

49 Ladder algorithm (Blum & Hardt 2015) Algorithm: Let R_0 := ∞. For each classifier submission t ∈ {1, ..., k}: if loss(f_t) < R_{t-1} - η, then R_t := round(loss(f_t), η); else R_t := R_{t-1}. R_1, ..., R_k are then reported as the leaderboard accuracies.
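
A compact Python sketch of this mechanism (parameter and variable names are mine):

    def ladder(losses, eta=0.01):
        """Report a leaderboard score for each submitted loss, per Blum & Hardt (2015):
        release a new score only if it beats the best so far by more than eta."""
        reported, best = [], float("inf")        # R_0 := infinity
        for loss in losses:                      # loss(f_t) for t = 1, ..., k
            if loss < best - eta:
                best = round(loss / eta) * eta   # round to the nearest multiple of eta
            reported.append(best)
        return reported

For example, ladder([0.70, 0.695, 0.50]) reports roughly [0.70, 0.70, 0.50]: the tiny improvement to 0.695 is suppressed, so a probing attacker learns nothing from it.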

50 Ladder algorithm (Blum & Hardt 2015) Indeed, in simulation we can verify that the Ladder mechanism foils our log-loss attack: The sequence of accuracies produced by our log-loss attack is not monotonic. Since the oracle returns Rt-1 when we do not improve enough, we receive no new information from the oracle.

51 Ladder algorithm (Blum & Hardt 2015) In practice, Ladder seems not (yet?) to be used widely (or at all?). It might be unpopular with contestants, since a small improvement in the true loss might be rejected. In any case, Ladder is designed to prevent sequential probing, but what about one-shot attacks?

52 Exploiting an Oracle That Reports AUC Scores in Machine Learning Contests Whitehill (2016), AAAI.

53 AUC metric One of the most widely used accuracy metrics for binary classification problems is the Area Under the receiver operating characteristic Curve (AUC).

54 AUC metric The AUC metric has two equivalent definitions: 1. The area under the TPR vs. FPR curve.

55 AUC metric The AUC metric has two equivalent definitions: 2. The probability of a correct response in a 2-alternative forced-choice task (one + example, one - example).
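
Definition 2 translates directly into code. The sketch below counts, over all (positive, negative) pairs, how often the positive example receives the higher score; treating ties as half-credit is a common convention and an assumption here:

    def auc_by_pairs(y, y_hat):
        """AUC as the fraction of (+,-) pairs ranked correctly by the guesses y_hat."""
        pos = [s for yi, s in zip(y, y_hat) if yi == 1]
        neg = [s for yi, s in zip(y, y_hat) if yi == 0]
        wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
        return wins / (len(pos) * len(neg))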

56 AUC attacks Since the AUC is a fraction of pairs, it is a rational number. Let the AUC be c = p/q (in lowest terms). If the contestant knows p/q exactly, what can he or she infer about the ground-truth labels?

57 Attack 1: Infer knowledge of n0, n1 Based on the AUC c = p/q and the test set size n, we can infer the set S of possible values for (n_0, n_1): q must divide n_0 n_1, and n_0 + n_1 must equal n.

58 Example Suppose n = 100 and c = 0.985 = p/q = 197/200. Then, since n_0 + n_1 = n and since 200 must divide n_0 n_1, we know S = {(20, 80), (40, 60), (60, 40), (80, 20)}.
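
A sketch of Attack 1 in Python, using Fraction to recover p/q in lowest terms; for the example above it prints exactly the four pairs in S:

    from fractions import Fraction

    def possible_class_counts(c, n):
        """All (n0, n1) with n0 + n1 = n such that q divides n0 * n1, where c = p/q in lowest terms."""
        q = Fraction(c).limit_denominator(n * n).denominator
        return [(n0, n - n0) for n0 in range(1, n) if (n0 * (n - n0)) % q == 0]

    print(possible_class_counts(0.985, 100))   # [(20, 80), (40, 60), (60, 40), (80, 20)]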

59 Attack 2 Suppose the contestant knows that (n_0, n_1) ∈ S, and that her guesses ŷ_1, ..., ŷ_n obtain AUC c. Then, if c exceeds a certain threshold (a function of k, n_0, and n_1) for every (n_0, n_1) ∈ S, the first k examples according to the rank order of ŷ_1, ..., ŷ_n must be negatively labeled.

60 Attack 2 Intuition: these are the guesses on which the classifier is most confident of a negative label. If the classifier were wrong about any of them, the AUC would be much lower.

61 Attack 2 An analogous result holds for the last few examples (in rank order) being positively labeled.

62 Example Suppose n = 100, c = 0.99, and the contestant knows that between 25% and 75% of the examples are positive. Then the first* 5 examples must be negative, and the last* 5 examples must be positive. (*according to the rank order of ŷ_1, ..., ŷ_n)

63 Example The contestant can thus deduce the labels of 10% of the test examples.

64 Attack 2: Implications Knowing a few test labels is useful because: 1. Since you know them definitively, you can add them to the training set. 2. They might be re-used in subsequent contests. 3. You could collude with other contestants who have deduced a few labels.

67 Attack 3 Search over all possible ground-truths y_1, ..., y_n for which the AUC of the guesses equals some fixed value c.

68 Example Consider a tiny test set of just 4 examples. Suppose your guesses are ŷ_1 = 0.5, ŷ_2 = 0.6, ŷ_3 = 0.9, ŷ_4 = 0.4, and suppose the oracle says the accuracy (AUC) for these guesses is c = 0.75.

69 Example For the guesses ŷ_1 = 0.5, ŷ_2 = 0.6, ŷ_3 = 0.9, ŷ_4 = 0.4, we can tabulate the AUC for every possible labeling (y_1, y_2, y_3, y_4). [Table: AUC for each candidate labeling; the individual values were shown on the slide.] Only one labeling is consistent with the reported AUC, so the true labels must be y_1 = 1, y_2 = 0, y_3 = 1, y_4 = 0. The contestant can now re-submit and obtain a perfect score in one shot.
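
The brute-force search behind this example fits in a few lines; taking the oracle's report to be c = 0.75 (the value consistent with the labels above), the sketch below scores every labeling that has at least one positive and one negative example:

    import itertools

    guesses = [0.5, 0.6, 0.9, 0.4]

    for y in itertools.product([0, 1], repeat=4):
        pos = [g for yi, g in zip(y, guesses) if yi == 1]
        neg = [g for yi, g in zip(y, guesses) if yi == 0]
        if pos and neg:
            auc = sum(p > q for p in pos for q in neg) / (len(pos) * len(neg))
            flag = "  <-- matches the oracle" if abs(auc - 0.75) < 1e-9 else ""
            print(y, round(auc, 3), flag)
    # Only y = (1, 0, 1, 0) attains AUC 0.75, so the labels are fully determined.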

72 Attack 3 How many different ground-truth vectors are there such that the AUC of the guesses is some fixed number c?

73 Number of satisfying labelings grows exponentially in n for every AUC $c \in (0, 1)$ For every fixed AUC $c = p/q \in (0, 1)$ on a test set of size $n = 4q$, the number of different labelings $y_1, \ldots, y_n$ such that $f(y_{1:n}, \hat{y}_{1:n}) = c$ is at least

$\big(2 - 2\,|c - 0.5|\big)^{n/4}$

74 Number of satisfying labelings grows exponentially in n for every AUC $c \in (0, 1)$ What about for $n \neq 4q$? Open question: might there be some pathological combination of p, q, n_0, n_1 (for non-trivial n) such that the number of satisfying labelings is small?

75 Conclusions Given that Ladder is rarely implemented, ML practitioners (and job recruiters) should be aware of the danger of log-loss attacks in data-mining competitions. The AUC admits fundamentally different attacks from the log-loss: the log-loss decomposes across single examples, whereas the AUC decomposes across pairs (one +, one -) of examples. The greater goal of this work is to raise awareness of the potential for cheating in machine learning contests.

76 Thank you
