# Learning and Generalization in Single Layer Perceptrons

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Learning and Generalization in Single Layer Perceptrons Neural Computation : Lecture 4 John A. Bullinaria, What Can Perceptrons do? 2. Decision Boundaries The Two Dimensional Case 3. Decision Boundaries for AND, OR and XOR 4. Decision Hyperplanes and Linear Separability 5. Learning and Generalization 6. Training Networks by Iterative Weight Updates 7. Convergence of the Perceptron Learning Rule

2 What Can Perceptrons Do? How do we know if a simple Perceptron is powerful enough to solve a given problem? If it can t even do XOR, why should we expect it to be able to deal with the aeroplane classification example, or real world tasks that will be even more complex than that? 0.8 Bomber Fighter 0.6 Speed Mass 3 4 In this lecture, we shall look at the limitations of Perceptrons, and how we can find their connection weights without having to compute and solve large numbers of inequalities. L4-2

3 Decision Boundaries in Two Dimensions For simple logic gate problems, it is easy to visualise what the neural network is doing. It is forming decision boundaries between classes. Remember, the network output is: θ w 1 w 2 out = step( w 1 in 1 + w 2 in 2 θ) in 1 in 2 The decision boundary (between out = 0 and out = 1) is at w 1 in 1 + w 2 in 2 θ = 0 i.e. along the straight line: in 2 = w 1 in w 1 + θ 2 w 2 So, in two dimensions the decision boundaries are always straight lines. L4-3

4 Decision Boundaries for AND and OR We can easily plot the decision boundaries we found by inspection last lecture: AND OR w 1 = 1, w 2 = 1, θ = 1.5 w 1 = 1, w 2 = 1, θ = 0.5 in 2 in 2 in 1 in 1 The extent to which we can change the weights and thresholds without changing the output decisions is now clear. L4-4

5 Decision Boundaries for XOR The difficulty in dealing with XOR is beginning to look obvious. We need two straight lines to separate the different outputs/decisions: in 2 out e.g. in in 1 There are two obvious remedies: either change the transfer function so that it has more than one decision boundary, or use a more complex network that is able to generate more complex decision boundaries. L4-5

6 Decision Hyperplanes and Linear Separability If we have two inputs, then the weights define a decision boundary that is a one dimensional straight line in the two dimensional input space of possible input values. If we have n inputs, the weights define a decision boundary that is an n 1 dimensional hyperplane in the n dimensional input space: w 1 in 1 + w 2 in w n in n θ = 0 This hyperplane is clearly still linear (i.e., straight or flat or non-curved) and can still only divide the space into two regions. We still need more complex transfer functions, or more complex networks, to deal with XOR type problems. Problems with input patterns that can be classified using a single hyperplane are said to be linearly separable. Problems (such as XOR) which cannot be classified in this way are said to be non-linearly separable. L4-6

7 General Decision Boundaries Generally, we will want to deal with input patterns that are not binary, and expect our neural networks to form complex decision boundaries, e.g. in 2?? in 1 We often also wish to classify inputs into many classes (such as the three shown here). L4-7

8 Learning and Generalization A network will also produce outputs for input patterns that it was not originally set up to classify (shown with question marks), though those classifications may be incorrect. There are two important aspects of the network s operation to consider: Learning : The network must learn decision boundaries from a set of training patterns so that these training patterns are classified correctly. Generalization : After training, the network must also be able to generalize, i.e. correctly classify test patterns it has never seen before. Usually we want the neural network to learn in a way that produces good generalization. Sometimes, the training data may contain errors (e.g., noise in the experimental determination of the input values, or incorrect classifications). In this case, learning the training data perfectly may make the generalization worse. There is an important tradeoff between learning and generalization that arises quite generally. L4-8

9 Generalization in Classification Suppose the task of the neural network is to learn a classification decision boundary: in 2 in 2 in 1 in 1 The aim is to get the network to generalize to classify new inputs appropriately. If we know that the training data contains noise, we don t necessarily want the training data to be classified totally accurately, as that is likely to reduce the generalization ability. L4-9

10 Generalization in Function Approximation Suppose we wish to recover a function for which we only have noisy data samples: out in We can expect the neural network output to give a better representation of the underlying function if its output curve does not pass through all the data points. Again, allowing a larger error on the training data is likely to lead to better generalization. L4-10

11 Training a Neural Network Whether the neural network is a simple Perceptron, or a much more complicated multilayer network with special activation functions, we need to develop a systematic procedure for determining appropriate connection weights. The general procedure is to have the network learn the appropriate weights from a representative set of training data. For all but the simplest cases, however, direct computation of the weights is intractable. Instead, a good all-purpose process is to start off with random initial weights and adjust them in small steps until the required outputs are produced. We shall first look at a brute force derivation of such an iterative learning algorithm for simple Perceptrons. Then, in later lectures, we shall see how more powerful and general techniques can easily lead to learning algorithms which will work for neural networks of any specification we could possibly dream up. L4-11

12 Perceptron Learning For simple Perceptrons performing classification, we have seen that the decision boundaries are hyperplanes, and we can think of learning as the process of shifting around the hyperplanes until each training pattern is classified correctly. Somehow, we need to formalise that process of shifting around into a systematic algorithm that can easily be implemented on a computer. The shifting around can conveniently be split up into a number of small steps. If the network weights at time t are w ij (t), then the shifting process corresponds to moving them by a small amount Δw ij (t) so that at time t+1 we have weights w ij (t +1) = w ij (t) + Δw ij (t) It is convenient to treat the thresholds as weights, as discussed previously, so we don t need separate equations for them. L4-12

13 Formulating the Weight Changes Suppose the target output of unit j is targ j and the actual output is out j = step( in i w ij ), where in i are the activations of the previous layer of neurons (i.e. the network inputs for a Perceptron). Then we can just go through all the possibilities to work out an appropriate set of small weight changes, and put them into a common form: If out j = targ j do nothing Note targ j out j = 0 so w ij w ij If out j = 1 and targ j = 0 Note targ j out j = 1 then in i w ij is too large first when in i = 1 decrease w ij so w ij w ij η = w ij η in i and when in i = 0 w ij doesn t matter so so w ij w ij 0 = w ij η in i w ij w ij η in i L4-13

14 If out j = 0 and targ j = 1 Note targ j out j = 1 then in i w ij is too small first when in i = 1 increase w ij so w ij w ij + η = w ij + η in i and when in i = 0 w ij doesn t matter so w ij w ij 0 = w ij + η in i so w ij w ij + η in i It has become clear that each case can be written in the form: w ij w ij + η (targ j out j ) in i Δw ij = η (targ j out j ) in i This weight update equation is called the Perceptron Learning Rule. The positive parameter η is called the learning rate or step size it determines how smoothly we shift the decision boundaries. L4-14

15 Convergence of Perceptron Learning The weight changes Δw ij need to be applied repeatedly for each weight w ij in the network, and for each training pattern in the training set. One pass through all the weights for the whole training set is called one epoch of training. Eventually, usually after many epochs, when all the network outputs match the targets for all the training patterns, all the Δw ij will be zero and the process of training will cease. We then say that the training process has converged to a solution. It is possible to prove that if there does exist a possible set of weights for a Perceptron which solves the given problem correctly, then the Perceptron Learning Rule will find them in a finite number of iterations. Moreover, it can be shown that if a problem is linearly separable, then the Perceptron Learning Rule will find a set of weights in a finite number of iterations that solves the problem correctly. L4-15

16 Overview and Reading 1. Neural network classifiers learn decision boundaries from training data. 2. Simple Perceptrons can only cope with linearly separable problems. 3. Trained networks are expected to generalize, i.e. deal appropriately with input data they were not trained on. 4. One can train networks by iteratively updating their weights. 5. The Perceptron Learning Rule will find weights for linearly separable problems in a finite number of iterations. Reading 1. Haykin-2009: Sections 1.1, 1.2, Gurney: Sections 3.3, 3.4, 3.5, 4.1, 4.2, 4.3, Beale & Jackson: Sections 3.3, 3.4, 3.5, 3.6, Callan: Sections 1.2, 1.3, 2.2, 2.3 L4-16

### COMPUTATIONAL INTELLIGENCE

COMPUTATIONAL INTELLIGENCE Fundamentals Adrian Horzyk Preface Before we can proceed to discuss specific complex methods we have to introduce basic concepts, principles, and models of computational intelligence

### LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS

LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Neural Networks Classifier Introduction INPUT: classification data, i.e. it contains an classification (class) attribute. WE also say that the class

### Digital Image Processing. Prof. P.K. Biswas. Department of Electronics & Electrical Communication Engineering

Digital Image Processing Prof. P.K. Biswas Department of Electronics & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Image Segmentation - III Lecture - 31 Hello, welcome

### In this assignment, we investigated the use of neural networks for supervised classification

Paul Couchman Fabien Imbault Ronan Tigreat Gorka Urchegui Tellechea Classification assignment (group 6) Image processing MSc Embedded Systems March 2003 Classification includes a broad range of decision-theoric

### Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Neural Computation : Lecture 14 John A. Bullinaria, 2015 1. The RBF Mapping 2. The RBF Network Architecture 3. Computational Power of RBF Networks 4. Training

### Lecture #11: The Perceptron

Lecture #11: The Perceptron Mat Kallada STAT2450 - Introduction to Data Mining Outline for Today Welcome back! Assignment 3 The Perceptron Learning Method Perceptron Learning Rule Assignment 3 Will be

### Lecture 15: Segmentation (Edge Based, Hough Transform)

Lecture 15: Segmentation (Edge Based, Hough Transform) c Bryan S. Morse, Brigham Young University, 1998 000 Last modified on February 3, 000 at :00 PM Contents 15.1 Introduction..............................................

### A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems

A modified and fast Perceptron learning rule and its use for Tag Recommendations in Social Bookmarking Systems Anestis Gkanogiannis and Theodore Kalamboukis Department of Informatics Athens University

### The Simplex Algorithm

The Simplex Algorithm Uri Feige November 2011 1 The simplex algorithm The simplex algorithm was designed by Danzig in 1947. This write-up presents the main ideas involved. It is a slight update (mostly

### Dimension Reduction of Image Manifolds

Dimension Reduction of Image Manifolds Arian Maleki Department of Electrical Engineering Stanford University Stanford, CA, 9435, USA E-mail: arianm@stanford.edu I. INTRODUCTION Dimension reduction of datasets

### Machine Learning Classifiers and Boosting

Machine Learning Classifiers and Boosting Reading Ch 18.6-18.12, 20.1-20.3.2 Outline Different types of learning problems Different types of learning algorithms Supervised learning Decision trees Naïve

### Lecture 1 Notes. Outline. Machine Learning. What is it? Instructors: Parth Shah, Riju Pahwa

Instructors: Parth Shah, Riju Pahwa Lecture 1 Notes Outline 1. Machine Learning What is it? Classification vs. Regression Error Training Error vs. Test Error 2. Linear Classifiers Goals and Motivations

### PV211: Introduction to Information Retrieval

PV211: Introduction to Information Retrieval http://www.fi.muni.cz/~sojka/pv211 IIR 15-1: Support Vector Machines Handout version Petr Sojka, Hinrich Schütze et al. Faculty of Informatics, Masaryk University,

### Chapter 3: The Efficiency of Algorithms Invitation to Computer Science,

Chapter 3: The Efficiency of Algorithms Invitation to Computer Science, Objectives In this chapter, you will learn about Attributes of algorithms Measuring efficiency Analysis of algorithms When things

### An Introductory SIGMA/W Example

1 Introduction An Introductory SIGMA/W Example This is a fairly simple introductory example. The primary purpose is to demonstrate to new SIGMA/W users how to get started, to introduce the usual type of

### 1 Machine Learning System Design

Machine Learning System Design Prioritizing what to work on: Spam classification example Say you want to build a spam classifier Spam messages often have misspelled words We ll have a labeled training

### LECTURE 13: SOLUTION METHODS FOR CONSTRAINED OPTIMIZATION. 1. Primal approach 2. Penalty and barrier methods 3. Dual approach 4. Primal-dual approach

LECTURE 13: SOLUTION METHODS FOR CONSTRAINED OPTIMIZATION 1. Primal approach 2. Penalty and barrier methods 3. Dual approach 4. Primal-dual approach Basic approaches I. Primal Approach - Feasible Direction

### Computer Vision. Image Segmentation. 10. Segmentation. Computer Engineering, Sejong University. Dongil Han

Computer Vision 10. Segmentation Computer Engineering, Sejong University Dongil Han Image Segmentation Image segmentation Subdivides an image into its constituent regions or objects - After an image has

### 5.5 Complex Fractions

5.5 Complex Fractions At this point, right after we cover all the basic operations, we would usually turn our attention to solving equations. However, there is one other type of rational expression that

### Convexity. 1 X i is convex. = b is a hyperplane in R n, and is denoted H(p, b) i.e.,

Convexity We ll assume throughout, without always saying so, that we re in the finite-dimensional Euclidean vector space R n, although sometimes, for statements that hold in any vector space, we ll say

### Research in Computational Differential Geomet

Research in Computational Differential Geometry November 5, 2014 Approximations Often we have a series of approximations which we think are getting close to looking like some shape. Approximations Often

### Feature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule.

CS 188: Artificial Intelligence Fall 2007 Lecture 26: Kernels 11/29/2007 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit your

### Introduction to Algorithms

Lecture 1 Introduction to Algorithms 1.1 Overview The purpose of this lecture is to give a brief overview of the topic of Algorithms and the kind of thinking it involves: why we focus on the subjects that

### More on Classification: Support Vector Machine

More on Classification: Support Vector Machine The Support Vector Machine (SVM) is a classification method approach developed in the computer science field in the 1990s. It has shown good performance in

### CSCI 4620/8626. Computer Graphics Clipping Algorithms (Chapter 8-5 )

CSCI 4620/8626 Computer Graphics Clipping Algorithms (Chapter 8-5 ) Last update: 2016-03-15 Clipping Algorithms A clipping algorithm is any procedure that eliminates those portions of a picture outside

### Lecture 12 Level Sets & Parametric Transforms. sec & ch. 11 of Machine Vision by Wesley E. Snyder & Hairong Qi

Lecture 12 Level Sets & Parametric Transforms sec. 8.5.2 & ch. 11 of Machine Vision by Wesley E. Snyder & Hairong Qi Spring 2017 16-725 (CMU RI) : BioE 2630 (Pitt) Dr. John Galeotti The content of these

### Support Vector Machines

Support Vector Machines About the Name... A Support Vector A training sample used to define classification boundaries in SVMs located near class boundaries Support Vector Machines Binary classifiers whose

### THE preceding chapters were all devoted to the analysis of images and signals which

Chapter 5 Segmentation of Color, Texture, and Orientation Images THE preceding chapters were all devoted to the analysis of images and signals which take values in IR. It is often necessary, however, to

### PATTERN CLASSIFICATION BY DISTANCE FUNCTIONS

PATTERN CLASSIFICATION BY DISTANCE FUNCTIONS Dr. K.Vijayarekha Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur-613 401 Joint Initiative of IITs and IISc Funded

Yuki Osada Andrew Cannon 1 Humans are an intelligent species One feature is the ability to learn The ability to learn comes down to the brain The brain learns from experience Research shows that the brain

### Discrete Optimization. Lecture Notes 2

Discrete Optimization. Lecture Notes 2 Disjunctive Constraints Defining variables and formulating linear constraints can be straightforward or more sophisticated, depending on the problem structure. The

### Algorithms in Systems Engineering IE172. Midterm Review. Dr. Ted Ralphs

Algorithms in Systems Engineering IE172 Midterm Review Dr. Ted Ralphs IE172 Midterm Review 1 Textbook Sections Covered on Midterm Chapters 1-5 IE172 Review: Algorithms and Programming 2 Introduction to

### Interpolation, extrapolation, etc.

Interpolation, extrapolation, etc. Prof. Ramin Zabih http://cs100r.cs.cornell.edu Administrivia Assignment 5 is out, due Wednesday A6 is likely to involve dogs again Monday sections are now optional Will

### 15-451/651: Design & Analysis of Algorithms November 4, 2015 Lecture #18 last changed: November 22, 2015

15-451/651: Design & Analysis of Algorithms November 4, 2015 Lecture #18 last changed: November 22, 2015 While we have good algorithms for many optimization problems, the previous lecture showed that many

### Vertex Cover Approximations

CS124 Lecture 20 Heuristics can be useful in practice, but sometimes we would like to have guarantees. Approximation algorithms give guarantees. It is worth keeping in mind that sometimes approximation

### Move-to-front algorithm

Up to now, we have looked at codes for a set of symbols in an alphabet. We have also looked at the specific case that the alphabet is a set of integers. We will now study a few compression techniques in

### DESIGN AND ANALYSIS OF ALGORITHMS. Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES

DESIGN AND ANALYSIS OF ALGORITHMS Unit 1 Chapter 4 ITERATIVE ALGORITHM DESIGN ISSUES http://milanvachhani.blogspot.in USE OF LOOPS As we break down algorithm into sub-algorithms, sooner or later we shall

### DECISION-TREE-BASED MULTICLASS SUPPORT VECTOR MACHINES. Fumitake Takahashi, Shigeo Abe

DECISION-TREE-BASED MULTICLASS SUPPORT VECTOR MACHINES Fumitake Takahashi, Shigeo Abe Graduate School of Science and Technology, Kobe University, Kobe, Japan (E-mail: abe@eedept.kobe-u.ac.jp) ABSTRACT

### Centrality Book. cohesion.

Cohesion The graph-theoretic terms discussed in the previous chapter have very specific and concrete meanings which are highly shared across the field of graph theory and other fields like social network

### Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

Glossary of data mining terms: Accuracy Accuracy is an important factor in assessing the success of data mining. When applied to data, accuracy refers to the rate of correct values in the data. When applied

### Neural Networks (Overview) Prof. Richard Zanibbi

Neural Networks (Overview) Prof. Richard Zanibbi Inspired by Biology Introduction But as used in pattern recognition research, have little relation with real neural systems (studied in neurology and neuroscience)

### Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers A. Salhi, B. Minaoui, M. Fakir, H. Chakib, H. Grimech Faculty of science and Technology Sultan Moulay Slimane

### An Interesting Way to Combine Numbers

An Interesting Way to Combine Numbers Joshua Zucker and Tom Davis October 12, 2016 Abstract This exercise can be used for middle school students and older. The original problem seems almost impossibly

### Robotics. Lecture 5: Monte Carlo Localisation. See course website for up to date information.

Robotics Lecture 5: Monte Carlo Localisation See course website http://www.doc.ic.ac.uk/~ajd/robotics/ for up to date information. Andrew Davison Department of Computing Imperial College London Review:

### A Formal Approach to Score Normalization for Meta-search

A Formal Approach to Score Normalization for Meta-search R. Manmatha and H. Sever Center for Intelligent Information Retrieval Computer Science Department University of Massachusetts Amherst, MA 01003

### Bisecting Floating Point Numbers

Squishy Thinking by Jason Merrill jason@squishythinking.com Bisecting Floating Point Numbers 22 Feb 2014 Bisection is about the simplest algorithm there is for isolating a root of a continuous function:

### Neural Networks Laboratory EE 329 A

Neural Networks Laboratory EE 329 A Introduction: Artificial Neural Networks (ANN) are widely used to approximate complex systems that are difficult to model using conventional modeling techniques such

### Applied Lagrange Duality for Constrained Optimization

Applied Lagrange Duality for Constrained Optimization Robert M. Freund February 10, 2004 c 2004 Massachusetts Institute of Technology. 1 1 Overview The Practical Importance of Duality Review of Convexity

### Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi.

Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi Lecture 18 Tries Today we are going to be talking about another data

### Lecture 5 Sorting Arrays

Lecture 5 Sorting Arrays 15-122: Principles of Imperative Computation (Spring 2018) Frank Pfenning, Rob Simmons We begin this lecture by discussing how to compare running times of functions in an abstract,

### CS231A Course Project Final Report Sign Language Recognition with Unsupervised Feature Learning

CS231A Course Project Final Report Sign Language Recognition with Unsupervised Feature Learning Justin Chen Stanford University justinkchen@stanford.edu Abstract This paper focuses on experimenting with

### Functions of Several Variables

Chapter 3 Functions of Several Variables 3.1 Definitions and Examples of Functions of two or More Variables In this section, we extend the definition of a function of one variable to functions of two or

### Global Optimization. Lecture Outline. Global flow analysis. Global constant propagation. Liveness analysis. Local Optimization. Global Optimization

Lecture Outline Global Optimization Global flow analysis Global constant propagation Liveness analysis Compiler Design I (2011) 2 Local Optimization Recall the simple basic-block optimizations Constant

### An interesting related problem is Buffon s Needle which was first proposed in the mid-1700 s.

Using Monte Carlo to Estimate π using Buffon s Needle Problem An interesting related problem is Buffon s Needle which was first proposed in the mid-1700 s. Here s the problem (in a simplified form). Suppose

### 2 T. x + 2 T. , T( x, y = 0) = T 1

LAB 2: Conduction with Finite Difference Method Objective: The objective of this laboratory is to introduce the basic steps needed to numerically solve a steady state two-dimensional conduction problem

### / Approximation Algorithms Lecturer: Michael Dinitz Topic: Linear Programming Date: 2/24/15 Scribe: Runze Tang

600.469 / 600.669 Approximation Algorithms Lecturer: Michael Dinitz Topic: Linear Programming Date: 2/24/15 Scribe: Runze Tang 9.1 Linear Programming Suppose we are trying to approximate a minimization

### Introduction to Functions of Several Variables

Introduction to Functions of Several Variables Philippe B. Laval KSU Today Philippe B. Laval (KSU) Functions of Several Variables Today 1 / 20 Introduction In this section, we extend the definition of

### Multi-Class Logistic Regression and Perceptron

Multi-Class Logistic Regression and Perceptron Instructor: Wei Xu Some slides adapted from Dan Jurfasky, Brendan O Connor and Marine Carpuat MultiClass Classification Q: what if we have more than 2 categories?

### Lecture Transcript While and Do While Statements in C++

Lecture Transcript While and Do While Statements in C++ Hello and welcome back. In this lecture we are going to look at the while and do...while iteration statements in C++. Here is a quick recap of some

### 2. BOOLEAN ALGEBRA 2.1 INTRODUCTION

2. BOOLEAN ALGEBRA 2.1 INTRODUCTION In the previous chapter, we introduced binary numbers and binary arithmetic. As you saw in binary arithmetic and in the handling of floating-point numbers, there is

### Cardinality of Sets MAT231. Fall Transition to Higher Mathematics. MAT231 (Transition to Higher Math) Cardinality of Sets Fall / 15

Cardinality of Sets MAT Transition to Higher Mathematics Fall 0 MAT (Transition to Higher Math) Cardinality of Sets Fall 0 / Outline Sets with Equal Cardinality Countable and Uncountable Sets MAT (Transition

Radial Basis Function Networks As we have seen, one of the most common types of neural network is the multi-layer perceptron It does, however, have various disadvantages, including the slow speed in learning

### COMP combinational logic 1 Jan. 18, 2016

In lectures 1 and 2, we looked at representations of numbers. For the case of integers, we saw that we could perform addition of two numbers using a binary representation and using the same algorithm that

### IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 10, NO. 6, NOVEMBER Inverting Feedforward Neural Networks Using Linear and Nonlinear Programming

IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 10, NO. 6, NOVEMBER 1999 1271 Inverting Feedforward Neural Networks Using Linear and Nonlinear Programming Bao-Liang Lu, Member, IEEE, Hajime Kita, and Yoshikazu

### SAAM II Version 2.1 Basic Tutorials. Working with Parameters Basic

SAAM II Version 2.1 Basic Tutorials Basic Introduction Parameters 2 Part 1a. Work with the Parameters dialog box Parameters 3 Part 1b. Hand-fit a model to data Parameters 12 Changing Parameters Manually

### Statistical Methods for NLP

Statistical Methods for NLP Information Extraction, Hidden Markov Models Sameer Maskey * Most of the slides provided by Bhuvana Ramabhadran, Stanley Chen, Michael Picheny Speech Recognition Lecture 4:

### Performance Analysis of Data Mining Classification Techniques

Performance Analysis of Data Mining Classification Techniques Tejas Mehta 1, Dr. Dhaval Kathiriya 2 Ph.D. Student, School of Computer Science, Dr. Babasaheb Ambedkar Open University, Gujarat, India 1 Principal

### Conflict Graphs for Parallel Stochastic Gradient Descent

Conflict Graphs for Parallel Stochastic Gradient Descent Darshan Thaker*, Guneet Singh Dhillon* Abstract We present various methods for inducing a conflict graph in order to effectively parallelize Pegasos.

### PARALLEL MULTI-DELAY SIMULATION

PARALLEL MULTI-DELAY SIMULATION Yun Sik Lee Peter M. Maurer Department of Computer Science and Engineering University of South Florida Tampa, FL 33620 CATEGORY: 7 - Discrete Simulation PARALLEL MULTI-DELAY

### LDPC Codes a brief Tutorial

LDPC Codes a brief Tutorial Bernhard M.J. Leiner, Stud.ID.: 53418L bleiner@gmail.com April 8, 2005 1 Introduction Low-density parity-check (LDPC) codes are a class of linear block LDPC codes. The name

### Module 1: Introduction to Finite Difference Method and Fundamentals of CFD Lecture 6:

file:///d:/chitra/nptel_phase2/mechanical/cfd/lecture6/6_1.htm 1 of 1 6/20/2012 12:24 PM The Lecture deals with: ADI Method file:///d:/chitra/nptel_phase2/mechanical/cfd/lecture6/6_2.htm 1 of 2 6/20/2012

### A Short SVM (Support Vector Machine) Tutorial

A Short SVM (Support Vector Machine) Tutorial j.p.lewis CGIT Lab / IMSC U. Southern California version 0.zz dec 004 This tutorial assumes you are familiar with linear algebra and equality-constrained optimization/lagrange

### Wood Grain Image Textures

Do you ever look at the LightWorks help files when trying to figure out how to wrap a wood image to an object in TurboCAD and wonder why it has to be so complex? Uses Arbitrary Plane Wrapping Uses Local

### Introduction to Machine Learning. Xiaojin Zhu

Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006

### Let be a function. We say, is a plane curve given by the. Let a curve be given by function where is differentiable with continuous.

Module 8 : Applications of Integration - II Lecture 22 : Arc Length of a Plane Curve [Section 221] Objectives In this section you will learn the following : How to find the length of a plane curve 221

### DEGENERACY AND THE FUNDAMENTAL THEOREM

DEGENERACY AND THE FUNDAMENTAL THEOREM The Standard Simplex Method in Matrix Notation: we start with the standard form of the linear program in matrix notation: (SLP) m n we assume (SLP) is feasible, and

### Sentiment Classification of Food Reviews

Sentiment Classification of Food Reviews Hua Feng Department of Electrical Engineering Stanford University Stanford, CA 94305 fengh15@stanford.edu Ruixi Lin Department of Electrical Engineering Stanford

### CS 125 Section #4 RAMs and TMs 9/27/16

CS 125 Section #4 RAMs and TMs 9/27/16 1 RAM A word-ram consists of: A fixed set of instructions P 1,..., P q. Allowed instructions are: Modular arithmetic and integer division on registers; the standard

### Chapter 3. The Efficiency of Algorithms INVITATION TO. Computer Science

Chapter 3 The Efficiency of Algorithms INVITATION TO 1 Computer Science Objectives After studying this chapter, students will be able to: Describe algorithm attributes and why they are important Explain

### Unit VIII. Chapter 9. Link Analysis

Unit VIII Link Analysis: Page Ranking in web search engines, Efficient Computation of Page Rank using Map-Reduce and other approaches, Topic-Sensitive Page Rank, Link Spam, Hubs and Authorities (Text Book:2

### Density estimation. In density estimation problems, we are given a random from an unknown density. Our objective is to estimate

Density estimation In density estimation problems, we are given a random sample from an unknown density Our objective is to estimate? Applications Classification If we estimate the density for each class,

### Design Example: 4-bit Multiplier

Design Example: 4-bit Multiplier Consider how we normally multiply numbers: 123 x 264 492 7380 24600 32472 Binary multiplication is similar. (Note that the product of two 4-bit numbers is potentially an

### FMA901F: Machine Learning Lecture 6: Graphical Models. Cristian Sminchisescu

FMA901F: Machine Learning Lecture 6: Graphical Models Cristian Sminchisescu Graphical Models Provide a simple way to visualize the structure of a probabilistic model and can be used to design and motivate

### MATH 890 HOMEWORK 2 DAVID MEREDITH

MATH 890 HOMEWORK 2 DAVID MEREDITH (1) Suppose P and Q are polyhedra. Then P Q is a polyhedron. Moreover if P and Q are polytopes then P Q is a polytope. The facets of P Q are either F Q where F is a facet

### CPSC 467b: Cryptography and Computer Security

Outline ZKIP Other IP CPSC 467b: Cryptography and Computer Security Lecture 19 Michael J. Fischer Department of Computer Science Yale University March 31, 2010 Michael J. Fischer CPSC 467b, Lecture 19

### Introduction to Homogeneous coordinates

Last class we considered smooth translations and rotations of the camera coordinate system and the resulting motions of points in the image projection plane. These two transformations were expressed mathematically

### A Systematic Overview of Data Mining Algorithms. Sargur Srihari University at Buffalo The State University of New York

A Systematic Overview of Data Mining Algorithms Sargur Srihari University at Buffalo The State University of New York 1 Topics Data Mining Algorithm Definition Example of CART Classification Iris, Wine

### Divisibility Rules and Their Explanations

Divisibility Rules and Their Explanations Increase Your Number Sense These divisibility rules apply to determining the divisibility of a positive integer (1, 2, 3, ) by another positive integer or 0 (although

### A Rule Chaining Architecture Using a Correlation Matrix Memory. James Austin, Stephen Hobson, Nathan Burles, and Simon O Keefe

A Rule Chaining Architecture Using a Correlation Matrix Memory James Austin, Stephen Hobson, Nathan Burles, and Simon O Keefe Advanced Computer Architectures Group, Department of Computer Science, University

### Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network

Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 11, November 2014,

### Data Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A.

Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Output: Knowledge representation Tables Linear models Trees Rules

### Lecture 5: The Halting Problem. Michael Beeson

Lecture 5: The Halting Problem Michael Beeson Historical situation in 1930 The diagonal method appears to offer a way to extend just about any definition of computable. It appeared in the 1920s that it

### EULER S FORMULA AND THE FIVE COLOR THEOREM

EULER S FORMULA AND THE FIVE COLOR THEOREM MIN JAE SONG Abstract. In this paper, we will define the necessary concepts to formulate map coloring problems. Then, we will prove Euler s formula and apply

### Learning to Learn: additional notes

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.034 Artificial Intelligence, Fall 2008 Recitation October 23 Learning to Learn: additional notes Bob Berwick

### Final Exam DATA MINING I - 1DL360

Uppsala University Department of Information Technology Kjell Orsborn Final Exam 2012-10-17 DATA MINING I - 1DL360 Date... Wednesday, October 17, 2012 Time... 08:00-13:00 Teacher on duty... Kjell Orsborn,

### Introduction to Support Vector Machines

Introduction to Support Vector Machines CS 536: Machine Learning Littman (Wu, TA) Administration Slides borrowed from Martin Law (from the web). 1 Outline History of support vector machines (SVM) Two classes,

### A Neural Network Based Bank Cheque Recognition system for Malaysian Cheques

A Neural Network Based Bank Cheque Recognition system for Malaysian Cheques Ahmad Ridhwan Wahap 1 Marzuki Khalid 1 Abd. Rahim Ahmad 3 Rubiyah Yusof 1 1 Centre for Artificial Intelligence and Robotics,

### 1 Achieving IND-CPA security

ISA 562: Information Security, Theory and Practice Lecture 2 1 Achieving IND-CPA security 1.1 Pseudorandom numbers, and stateful encryption As we saw last time, the OTP is perfectly secure, but it forces