Chapter 2 Learning Basics and Linear Models
1 Chapter 2: Learning Basics and Linear Models
Presenter: M1 Nakayama Sahoko (SP), 2017/7/7
2 Contents
2 Learning Basics and Linear Models
  2.1 Supervised Learning and Parameterized Functions
  2.2 Train, Test, and Validation Sets
  2.3 Linear Models
      Binary Classification
      Log-Linear Binary Classification
      Multi-class Classification
  2.4 Representations
  2.5 One-Hot and Dense Vector Representations
  2.6 Log-linear Multi-class Classification
4 Overview
This chapter provides:
- supervised machine learning terminology and practice
- linear and log-linear models for binary and multi-class classification
6 Supervised Machine Learning
The creation of mechanisms that can look at examples and produce generalizations.
[Figure: an input email is fed to a learned function F(x), which outputs "spam" or "not-spam".]
7 Parameterized Functions
Searching over the set of all possible functions is very hard. Instead, we restrict the search to specific hypothesis classes (families of functions), injecting the learner with inductive bias, and search over the space of parameters. One common hypothesis class is the linear model:
$f(x) = x \cdot W + b$, with input $x \in \mathbb{R}^{d_{in}}$ and parameters $W \in \mathbb{R}^{d_{in} \times d_{out}}$, $b \in \mathbb{R}^{d_{out}}$
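A minimal NumPy sketch of this hypothesis class; the dimensions and the random parameter values are illustrative placeholders, not values from the slides:

```python
import numpy as np

d_in, d_out = 784, 6                  # illustrative dimensions
rng = np.random.default_rng(0)

W = rng.normal(size=(d_in, d_out))    # parameters to be learned
b = rng.normal(size=d_out)

def f(x):
    """The linear model f(x) = x.W + b."""
    return x @ W + b

x = rng.normal(size=d_in)             # one input vector
print(f(x).shape)                     # (6,) -- one score per output
```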
9 How to Know the Function Is Good
Our goal is to produce a function $f(x)$ that correctly maps inputs $x$ to outputs $\hat{y}$. How do we know that the produced function $f()$ is indeed a good one?
10 Leave-One-Out Cross-Validation
Train $k$ functions $f_{1:k}$, each time:
1. leaving out a different input example $x_i$;
2. evaluating the resulting function $f_i()$ on its ability to predict $x_i$.
Then train another function $f()$ on the entire training set $x_{1:k}$.
11 Leave-One-Out: Pros and Cons
Good: a good approximation of the accuracy on new inputs.
Bad: very costly in computation time; used only in cases where $k$ is very small.
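A sketch of the leave-one-out loop, assuming a hypothetical train(X, y) function that returns a predictor; the toy majority-vote learner exists only to make the example runnable:

```python
import numpy as np

def loo_accuracy(X, y, train):
    """Train k models, each leaving out one example x_i; test f_i on x_i."""
    k = len(X)
    correct = 0
    for i in range(k):
        mask = np.arange(k) != i
        predict = train(X[mask], y[mask])    # f_i, trained without x_i
        correct += predict(X[i]) == y[i]     # evaluate on the left-out x_i
    return correct / k

def majority_train(X, y):
    """Toy learner (hypothetical stand-in): always predict the majority label."""
    label = np.sign(y.sum()) or 1.0
    return lambda x: label

X = np.random.default_rng(1).normal(size=(10, 3))
y = np.array([1., 1., 1., -1., 1., -1., 1., 1., -1., 1.])
print(loo_accuracy(X, y, majority_train))
```

After the $k$ evaluations, one final model would be trained on the entire training set, as the slide notes.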
12 Held-Out Set
1. Randomly split all data into two subsets (say 80%/20%): a training set and a held-out set.
2. Train a model on the training set.
3. Test its accuracy on the held-out set.
13 A Three-Way Split
To compare several models and select the best one, use a three-way split of the data into a training set, a validation (development) set, and a test set:
- training set: used to train the models
- validation set: tweaks, error analysis, and model selection
- test (held-out) set: a single run of the final model
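A sketch of the random split, assuming 80/10/10 proportions (a common choice; the slides fix only the two-way 80%/20% case):

```python
import numpy as np

def three_way_split(X, y, seed=0, train_frac=0.8, val_frac=0.1):
    """Shuffle, then split into train / validation / test subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(train_frac * len(X))
    n_val = int(val_frac * len(X))
    tr, va, te = np.split(idx, [n_train, n_train + n_val])
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

X = np.arange(100).reshape(50, 2)
y = np.arange(50)
train, val, test = three_way_split(X, y)
print(len(train[0]), len(val[0]), len(test[0]))   # 40 5 5
```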
15 Binary Classification
$f(x) = x \cdot w + b$ with $d_{out} = 1$: $w$ is a vector, $b$ a scalar.
$\hat{y} = \mathrm{sign}(f(x)) = \mathrm{sign}(x \cdot w + b)$
The positive class: $+1$. The negative class: $-1$.
16 Binary Classification
$\hat{y} = \mathrm{sign}(f(x)) = \mathrm{sign}(x \cdot w + b) = \mathrm{sign}(\mathrm{size} \cdot w_1 + \mathrm{price} \cdot w_2 + b)$
[Figure: apartments plotted by size and price; blue circles = Dupont Circle, green crosses = Fairfax.]
17 Binary Classification
$\hat{y} = \mathrm{sign}(f(x)) = \mathrm{sign}(x \cdot w + b) = \mathrm{sign}(\mathrm{size} \cdot w_1 + \mathrm{price} \cdot w_2 + b)$
If $\hat{y} \geq 0$, predict Fairfax; otherwise, Dupont Circle.
18 More Than Two Features
Counts of letter bigrams such as "ab":
$x_{ab} = \frac{\#ab}{|D|}$
where $\#ab$ is the number of times the bigram "ab" appears in the document and $|D|$ is the total number of bigrams in the document (the document's length). $x \in \mathbb{R}^{784}$ (an alphabet of 28 letters yields $28 \times 28 = 784$ possible bigrams).
19 More Than Two Features
[Figure: bigram histograms for several German and English texts.]
20 More Than Two Features
Given a new document, will it be grouped with the German texts or with the English ones?
$\hat{y} = \mathrm{sign}(f(x)) = \mathrm{sign}(x \cdot w + b) = \mathrm{sign}(x_{aa} w_{aa} + x_{ab} w_{ab} + x_{ac} w_{ac} + \dots + b)$
The document is considered English if $f(x) \geq 0$ and German otherwise.
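A sketch of the bigram feature extraction, assuming the 28-symbol alphabet is a-z plus space plus an "other" bucket (the slides state only that the alphabet has 28 letters):

```python
import numpy as np
import string

ALPHABET = string.ascii_lowercase + " _"          # 28 symbols; '_' = other
IDX = {c: i for i, c in enumerate(ALPHABET)}

def bigram_features(text):
    """x_ab = #ab / |D| for all 28*28 = 784 letter bigrams."""
    chars = [c if c in IDX else "_" for c in text.lower()]
    x = np.zeros((28, 28))
    for a, b in zip(chars, chars[1:]):            # consecutive letter pairs
        x[IDX[a], IDX[b]] += 1
    return x.ravel() / max(len(chars) - 1, 1)     # normalize by |D|

x = bigram_features("the quick brown fox")
print(x.shape, round(x.sum(), 6))                 # (784,) 1.0
```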
21 Log-Linear Binary Classification
To obtain the confidence of the decision, i.e. the probability that the classifier assigns to the class, push the output through a squashing function such as the sigmoid:
$\sigma(x) = \frac{1}{1 + e^{-x}}$
$\hat{y} = \sigma(f(x)) = \frac{1}{1 + e^{-(x \cdot w + b)}}$
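The squashing step in code, a direct transcription of the two formulas above (the example weights are arbitrary):

```python
import numpy as np

def sigmoid(z):
    """sigma(z) = 1 / (1 + exp(-z)), mapping scores to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(x, w, b):
    """Log-linear binary classifier: P(positive | x) = sigmoid(x.w + b)."""
    return sigmoid(x @ w + b)

w = np.array([0.5, -0.25])
print(predict_proba(np.array([2.0, 1.0]), w, b=0.1))   # approx 0.70
```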
22 Multi-class Classification
Assign an example to one of $k$ different classes, e.g., classify a document into one of six possible languages: English, French, German, Italian, Spanish, Other.
$\hat{y} = f(x) = \mathrm{argmax}_{L \in \{En, Fr, Gr, It, Sp, O\}}\; x \cdot w^L + b^L$
Collecting the vectors $w^L \in \mathbb{R}^{784}$ as the columns of a matrix $W \in \mathbb{R}^{784 \times 6}$ and the biases $b^L$ as a vector $b \in \mathbb{R}^{6}$, this can be rewritten as:
$\hat{y} = f(x) = x \cdot W + b$, prediction $= \mathrm{argmax}_i\; \hat{y}_{[i]}$  (2.7)
23 Multi-class Classification
$\hat{y} = f(x) = \mathrm{argmax}_{L \in \{En, Fr, Gr, It, Sp, O\}}\; x \cdot w^L + b^L$
[Figure: the 784-dimensional bigram vector $(\frac{\#aa}{|D|}, \frac{\#ab}{|D|}, \frac{\#ac}{|D|}, \dots, \frac{\#zy}{|D|}, \frac{\#zz}{|D|})$ is scored against each language $L$: $x \cdot w^L + b^L = \mathrm{score}^L$.]
24 Multi-class Classification
$\hat{y} = f(x) = x \cdot W + b$, prediction $= \mathrm{argmax}_i\; \hat{y}_{[i]}$
[Figure: the bigram vector $(\frac{\#aa}{|D|}, \frac{\#ab}{|D|}, \dots, \frac{\#zz}{|D|})$ multiplied by $W \in \mathbb{R}^{784 \times 6}$, plus $b$, yields one score per language: En, Fr, Gr, It, Sp, O.]
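The matrix form as a sketch; the language order follows the slides, while W and b are random placeholders standing in for trained parameters:

```python
import numpy as np

LANGS = ["En", "Fr", "Gr", "It", "Sp", "O"]
rng = np.random.default_rng(0)
W = rng.normal(size=(784, 6))     # one weight column w^L per language
b = rng.normal(size=6)            # one bias b^L per language

def predict_language(x):
    """y_hat = x.W + b; prediction = argmax_i y_hat[i] (eq. 2.7)."""
    y_hat = x @ W + b
    return LANGS[int(np.argmax(y_hat))]

print(predict_language(rng.normal(size=784)))
```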
26 Representations
In $\hat{y} = f(x) = x \cdot W + b$, the vector $x$ is a representation of the document.
28 One-Hot Vector
$x^{D_{[i]}} \in \mathbb{R}^{784}$: a one-hot vector, where $i$ is a particular document position and $D_{[i]}$ is the bigram at that position. All entries are zero except the single entry corresponding to the letter bigram $D_{[i]}$, which is 1.
29 Bag of Words
$x = \frac{1}{|D|} \sum_{i=1}^{|D|} x^{D_{[i]}}$
The resulting vector $x \in \mathbb{R}^{784}$ is commonly referred to as an averaged bag of bigrams (more generally, an averaged bag of words, or just bag of words).
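A sketch of the averaged bag of bigrams built from one-hot vectors; the bigram indices are hypothetical stand-ins for positions in the 784-entry vocabulary:

```python
import numpy as np

def one_hot(index, dim=784):
    """x^{D_[i]}: all zeros except a 1 at the bigram's index."""
    v = np.zeros(dim)
    v[index] = 1.0
    return v

def bag_of_bigrams(bigram_indices):
    """x = (1/|D|) * sum_i x^{D_[i]}."""
    return sum(one_hot(i) for i in bigram_indices) / len(bigram_indices)

x = bag_of_bigrams([3, 3, 17, 42])    # a toy 4-bigram document
print(x.sum(), x[3])                  # 1.0 0.5
```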
30 Continuous Bag of Words
$\hat{y} = \frac{1}{|D|} \sum_{i=1}^{|D|} W^{D_{[i]}}$
This representation is called a continuous bag of words (CBOW), as it is composed of a sum of word representations:
$y = x \cdot W = \left( \frac{1}{|D|} \sum_{i=1}^{|D|} x^{D_{[i]}} \right) \cdot W = \frac{1}{|D|} \sum_{i=1}^{|D|} \left( x^{D_{[i]}} \cdot W \right) = \frac{1}{|D|} \sum_{i=1}^{|D|} W^{D_{[i]}}$
[Figure: the rows $W^{D_{[i]}}$ are summed to give $y$.]
31 Continuous Bag of Words
$y = x \cdot W = \left( \frac{1}{|D|} \sum_{i=1}^{|D|} x^{D_{[i]}} \right) \cdot W = \frac{1}{|D|} \sum_{i=1}^{|D|} W^{D_{[i]}}$
[Figure: each one-hot vector $x^{D_{[i]}}$ multiplied by $W$ selects the row $W^{D_{[i]}}$; the selected rows are summed into the per-language scores En, Fr, Gr, It, Sp, O.]
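The identity on this slide can be checked numerically: multiplying the averaged one-hot vector by W equals averaging the corresponding rows of W. Here W is random and the bigram indices are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(784, 6))
D = [3, 3, 17, 42]                 # bigram indices of a toy document

x = np.zeros(784)                  # averaged bag of bigrams
for i in D:
    x[i] += 1.0 / len(D)

via_matmul = x @ W                 # ((1/|D|) sum_i x^{D_[i]}) . W
via_rows = W[D].mean(axis=0)       # (1/|D|) sum_i W^{D_[i]}
print(np.allclose(via_matmul, via_rows))   # True
```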
33 Log-Linear Multi-class Classification
For the binary case, the sigmoid function gave a log-linear model. For the multi-class case, use the softmax function:
$\mathrm{softmax}(x)_{[i]} = \frac{e^{x_{[i]}}}{\sum_j e^{x_{[j]}}}$
resulting in
$\hat{y} = \mathrm{softmax}(x W + b)$, $\quad \hat{y}_{[i]} = \frac{e^{(x W + b)_{[i]}}}{\sum_j e^{(x W + b)_{[j]}}}$
The softmax forces the values in $\hat{y}$ to be positive and sum to 1, making them interpretable as a probability distribution.
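A softmax sketch; subtracting the maximum before exponentiating is a standard numerical-stability trick, not something stated on the slide, and it leaves the result unchanged:

```python
import numpy as np

def softmax(z):
    """softmax(z)[i] = exp(z[i]) / sum_j exp(z[j])."""
    z = z - z.max()                # stability shift; cancels in the ratio
    e = np.exp(z)
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1, -0.5, 0.0, 1.5])   # x.W + b for 6 classes
p = softmax(scores)
print(round(p.sum(), 6), p.argmax())                 # 1.0 0 -- a distribution
```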
More information