Chapter 2 Learning Basics and Linear Models


1 Chapter 2: Learning Basics and Linear Models
Presenter: M1 Nakayama Sahoko (SP), 2017/7/7

2 Contents
Chapter 2: Learning Basics and Linear Models
  2.1 Supervised Learning and Parameterized Functions
  2.2 Train, Test, and Validation Sets
  2.3 Linear Models
      Binary Classification
      Log-Linear Binary Classification
      Multi-class Classification
  2.4 Representations
  2.5 One-Hot and Dense Vector Representations
  2.6 Log-linear Multi-class Classification

4 Overview
This chapter provides:
  Supervised machine learning terminology and practices
  Linear and log-linear models for binary and multi-class classification

6 Supervised Machine Learning
The creation of mechanisms that can look at examples and produce generalizations.
[Diagram: an input message passes through F(x), which outputs either "spam" or "not-spam".]

7 Parameterized Functions
Searching over the set of all possible functions is very hard.
Instead, restrict the search to specific hypothesis classes (families of functions); this injects the learner with an inductive bias, and the search is then over the space of parameters.
One common hypothesis class is the linear model:
  f(x) = x · W + b,  with input x ∈ R^{d_in} and parameters W ∈ R^{d_in × d_out}, b ∈ R^{d_out}
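
A minimal numpy sketch of this hypothesis class; the dimensions and the random parameter values are illustrative, not from the slides:

```python
import numpy as np

d_in, d_out = 3, 2                    # illustrative dimensions
rng = np.random.default_rng(0)

W = rng.normal(size=(d_in, d_out))    # parameters: weight matrix
b = rng.normal(size=d_out)            # parameters: bias vector

def f(x):
    """Linear model f(x) = x . W + b."""
    return x @ W + b

x = np.array([1.0, 0.5, -0.2])        # an input in R^{d_in}
print(f(x))                           # an output in R^{d_out}
```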

9 How to Know the Function Is Good
Our goal is to produce a function f(x) that correctly maps inputs x to outputs ŷ.
How do we know that the produced function f() is indeed a good one?

10 Leave-One-Out Cross-Validation
Train k functions f_1, ..., f_k:
  1. each time leaving out a different input example x_i
  2. evaluating the resulting function f_i() on its ability to predict x_i
Then train another function f() on the entire training set x_1, ..., x_k.
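
A sketch of the procedure, where `train` and `predict` are hypothetical stand-ins for any learning algorithm:

```python
import numpy as np

def loo_accuracy(X, y, train, predict):
    """Leave-one-out: train k models, each leaving out one example,
    and score each model on the example it did not see."""
    k = len(X)
    correct = 0
    for i in range(k):
        mask = np.arange(k) != i           # all examples except x_i
        model = train(X[mask], y[mask])    # f_i, trained without x_i
        correct += predict(model, X[i]) == y[i]
    return correct / k

# After estimating accuracy, train the final f() on the entire set:
# final_model = train(X, y)
```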

11 Leave-One-Out: Pros and Cons
Good: a good approximation of the accuracy on new inputs.
Bad: very costly in computation time, so it is used only in cases where k is very small.

12 Held-Out Set
1. Randomly split all data into two subsets (say, 80%/20%): a training set and a held-out set.
2. Train a model on the training set.
3. Test its accuracy on the held-out set.

13 A Three-Way Split
To compare several models and select the best one, split the data three ways into a train, a validation (development), and a test set:
  Training set: used to fit the models.
  Validation set: tweaks, error analysis, and model selection.
  Test set: held out for a single run of the final model.
A sketch of such a split appears below.
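
A sketch of the random split; the exact proportions (here 80/10/10) are an illustrative choice:

```python
import numpy as np

def three_way_split(X, y, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle once, then carve off validation and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])
```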

15 Binary Classification
Here d_out = 1, so w is a vector and b a scalar:
  f(x) = x · w + b
  ŷ = sign(f(x)) = sign(x · w + b)
The positive class is mapped to +1, the negative class to -1.

16 Binary Classification
  ŷ = sign(f(x)) = sign(x · w + b) = sign(size · w_1 + price · w_2 + b)
[Figure: apartments plotted by size and price; blue circles are Dupont Circle, green crosses are Fairfax.]

17 Binary Classification
  ŷ = sign(f(x)) = sign(x · w + b) = sign(size · w_1 + price · w_2 + b)
If ŷ ≥ 0, predict Fairfax; else, Dupont Circle.
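
A sketch of this decision rule; the weights and bias are made-up values, not trained parameters:

```python
import numpy as np

w = np.array([0.4, -0.3])      # illustrative weights for (size, price)
b = -1.0                       # illustrative bias

def predict_neighborhood(size, price):
    y_hat = np.sign(np.array([size, price]) @ w + b)
    return "Fairfax" if y_hat >= 0 else "Dupont Circle"

print(predict_neighborhood(size=90.0, price=10.0))
```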

18 More Than Two Features
Use counts of letter bigrams as features. For the bigram "ab":
  x_ab = #ab / |D|
where #ab is the number of times the bigram "ab" appears in the document and |D| is the total number of bigrams in the document (the document's length).
With a 28-letter alphabet, x ∈ R^{28×28} = R^{784}.
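
A sketch of the feature extraction; the exact 28-symbol alphabet is an assumption here (lowercase letters plus space and a catch-all underscore):

```python
from collections import Counter
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz _"   # 28 symbols (assumed set)
IDX = {c: i for i, c in enumerate(ALPHABET)}

def bigram_features(doc):
    """x_ab = #ab / |D| for every letter bigram ab."""
    doc = "".join(c if c in IDX else "_" for c in doc.lower())
    bigrams = [doc[i:i + 2] for i in range(len(doc) - 1)]
    x = np.zeros(len(ALPHABET) ** 2)        # 28*28 = 784 features
    for (a, b), n in Counter(bigrams).items():
        x[IDX[a] * len(ALPHABET) + IDX[b]] += n
    return x / max(len(bigrams), 1)

x = bigram_features("the cat sat on the mat")
print(x.shape, x.sum())                     # (784,) 1.0
```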

19 More Than Two Features
[Figure: bigram histograms for several German and English texts.]

20 More Than Two Features
Given a new document, will it be assigned to the German group or to the English one?
  ŷ = sign(f(x)) = sign(x · w + b) = sign(x_aa · w_aa + x_ab · w_ab + x_ac · w_ac + ... + b)
The document is considered English if f(x) ≥ 0 and German otherwise.

21 Log-Linear Binary Classification
To report the confidence of the decision, i.e., the probability that the classifier assigns to the class, push the output through a squashing function such as the sigmoid:
  σ(x) = 1 / (1 + e^{-x})
  ŷ = σ(f(x)) = 1 / (1 + e^{-(x · w + b)})
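
A sketch of the log-linear prediction, reusing the made-up weights from the binary example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(x, w, b):
    """y_hat = sigmoid(x . w + b): confidence in the positive class."""
    return sigmoid(x @ w + b)

w, b = np.array([0.4, -0.3]), -1.0                    # illustrative parameters
print(predict_proba(np.array([90.0, 10.0]), w, b))    # close to 1.0
```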

22 Multi-Class Classification
Assign an example to one of k different classes, e.g., classify a document into one of six possible languages: English, French, German, Italian, Spanish, Other.
  ŷ = f(x) = argmax_{L ∈ {En, Fr, Gr, It, Sp, O}} (x · w_L + b_L)
Re-written in matrix form, with each w_L ∈ R^{784} a column of W ∈ R^{784×6} and the b_L stacked into a vector b ∈ R^6:
  ŷ = f(x) = x · W + b,  prediction = argmax_i ŷ_[i]   (2.7)
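
A sketch of Eq. (2.7) in numpy, with random matrices standing in for trained parameters:

```python
import numpy as np

LANGS = ["En", "Fr", "Gr", "It", "Sp", "Other"]
rng = np.random.default_rng(0)
W = rng.normal(size=(784, 6))    # one 784-dim weight column per language
b = rng.normal(size=6)

def predict_language(x):
    y_hat = x @ W + b            # scores, one per language
    return LANGS[int(np.argmax(y_hat))]

print(predict_language(rng.normal(size=784)))
```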

23 Multi-Class Classification
  ŷ = f(x) = argmax_{L ∈ {En, Fr, Gr, It, Sp, O}} (x · w_L + b_L)
[Diagram: the 784-dimensional bigram vector x = (#aa/|D|, #ab/|D|, #ac/|D|, ..., #zy/|D|, #zz/|D|) is multiplied by w_L and added to b_L to give score_L.]

24 Multi-Class Classification
  ŷ = f(x) = x · W + b,  prediction = argmax_i ŷ_[i]
[Diagram: the bigram vector x (entries #aa/|D|, ..., #zz/|D|) times the 784×6 matrix W, whose columns correspond to En, Fr, Gr, It, Sp, O, plus b, gives the score vector ŷ.]

26 Representations
In ŷ = f(x) = x · W + b, the vector x is a representation of the document.

28 One-Hot Vector
x^{D_[i]} ∈ R^{784} is a one-hot vector, where i is a particular position in the document and D_[i] is the bigram at that position.
All entries are zero except the single entry corresponding to the letter bigram D_[i], which is 1.
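
A sketch of constructing the one-hot vector; `bigram_index` is a hypothetical mapping of the bigram D_[i] to its position in 0..783:

```python
import numpy as np

def one_hot(bigram_index, dim=784):
    """All zeros except a 1 at the entry for bigram D_[i]."""
    x = np.zeros(dim)
    x[bigram_index] = 1.0
    return x

x = one_hot(bigram_index=42)
print(x.sum(), x[42])            # 1.0 1.0
```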

29 Bag of Words
  x = (1/|D|) Σ_{i=1}^{|D|} x^{D_[i]},   x ∈ R^{784}
The resulting vector x is commonly referred to as an averaged bag of bigrams (an averaged bag of words, or just bag of words).
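
A sketch of the averaging, where `bigram_indices` is the (hypothetical) list of bigram indices for positions 1..|D|:

```python
import numpy as np

def bag_of_bigrams(bigram_indices, dim=784):
    """x = (1/|D|) * sum_i one_hot(D_[i])."""
    x = np.zeros(dim)
    for j in bigram_indices:
        x[j] += 1.0
    return x / len(bigram_indices)

print(bag_of_bigrams([3, 3, 7]).max())    # 2/3 for the repeated bigram
```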

30 Continuous Bag of Words
  ŷ = (1/|D|) Σ_{i=1}^{|D|} W^{D_[i]}
This representation is called a continuous bag of words (CBOW), as it is composed of a sum of word representations, the rows W^{D_[i]} of W:
  y = x · W = ((1/|D|) Σ_{i=1}^{|D|} x^{D_[i]}) · W = (1/|D|) Σ_{i=1}^{|D|} (x^{D_[i]} · W) = (1/|D|) Σ_{i=1}^{|D|} W^{D_[i]}
[Diagram: the rows W^{D_[i]} are summed to produce y.]

31 Continuous Bag of Words
  y = x · W = ((1/|D|) Σ_{i=1}^{|D|} x^{D_[i]}) · W = (1/|D|) Σ_{i=1}^{|D|} W^{D_[i]}
[Diagram: the one-hot vector x^{D_[i]} times W selects the row W^{D_[i]}; the six columns of W correspond to En, Fr, Gr, It, Sp, O.]
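
A numerical sketch of this identity: multiplying the averaged one-hot vector by W gives the same result as averaging the corresponding rows of W (random W and arbitrary indices, for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(784, 6))
indices = [3, 3, 7, 100]                  # the bigrams D_[1..|D|]

x = np.zeros(784)                          # averaged bag of bigrams
for j in indices:
    x[j] += 1.0 / len(indices)

lhs = x @ W                                # x . W
rhs = W[indices].mean(axis=0)              # (1/|D|) * sum_i W^{D_[i]}
print(np.allclose(lhs, rhs))               # True
```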

33 Log-Linear Multi-Class Classification
For the binary case we used the sigmoid, resulting in a log-linear model.
For the multi-class case we use the softmax function:
  softmax(x)_[i] = e^{x_[i]} / Σ_j e^{x_[j]}
resulting in
  ŷ = softmax(x · W + b),   ŷ_[i] = e^{(x·W+b)_[i]} / Σ_j e^{(x·W+b)_[j]}
The softmax forces the values in ŷ to be positive and sum to 1, making them interpretable as a probability distribution.
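
A sketch of the softmax prediction (the max subtraction is a standard numerical-stability trick, not on the slide):

```python
import numpy as np

def softmax(z):
    """softmax(z)_[i] = e^{z_[i]} / sum_j e^{z_[j]}."""
    e = np.exp(z - z.max())        # subtracting the max avoids overflow
    return e / e.sum()

rng = np.random.default_rng(0)
W, b = rng.normal(size=(784, 6)), rng.normal(size=6)
x = rng.normal(size=784)

y_hat = softmax(x @ W + b)         # positive, sums to 1
print(y_hat.sum(), y_hat.argmax())
```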
