Lecture 4: Backpropagation and Automatic Differentiation. CSE599W: Spring 2018

Size: px
Start display at page:

Download "Lecture 4: Backpropagation and Automatic Differentiation. CSE599W: Spring 2018"

Transcription

1 Lecture 4: Backpropagation and Automatic Differentiation CSE599W: Spring 208

2 Announcement Assignment is out today, due in 2 weeks (Apr 9 th, 5pm)

3 Model Training Overview layer extractor layer2 extractor predictor Objective Training

4 Symbolic Differentiation Input formulae is a symbolic expression tree (computation graph). Implement differentiation rules, e.g., sum rule, product rule, chain rule!(# %)!' For complicated functions, the resultant expression can be exponentially large. Wasteful to keep around intermediate symbolic expressions if we only need a numeric value of the gradient in the end Prone to error =!#!'!%!'!(#%)!' =!#!' % #!%!'! h '!' =!# % '!' *!%(') '

5 Numerical Differentiation We can approximate the gradient using!" # " # h/ 0 "(#) lim!$ % *, h " 4, $ = 4 $ <

6 Numerical Differentiation We can approximate the gradient using!" # " # h/ 0 "(#) lim!$ % *, h " 4, $ = 4 $ = " 4, $ = 4 $ 0.8 ; 0.3 =

7 Numerical Differentiation We can approximate the gradient using!" # " # h/ 0 "(#) lim!$ % *, h Reduce the truncation error by using center difference!" # " # h/ 0 "(# h/ 0 ) lim!$ % *, 2h Bad: rounding error, and slow to compute üa powerful tool to check the correctness of implementation, usually use h = e-6.

8 Backpropagation! Operator % # = %(!, ") "

9 Backpropagation " )* )" = )* )$ )$ )" Compute gradient becomes local computation Operator! $ =!(", #) )* )$ # )* )# = )* )$ )$ )#

10 Backpropagation simple example! = % &(( )*(, *( -, - )

11 Backpropagation simple example " #, = *.( ) % # " & % & *% /% " '

12 Backpropagation simple example " # % # " & % & , = *.( ) *% /% " ' 2.0

13 Backpropagation simple example " # % # " & % & " ' , = *.( ) *% /%.0

14 Backpropagation simple example " # % # " & % & " ' , = *.( ) *% /% , % = /% à 89 = 84 /%& :; :% = :; :, :, :% = /%&

15 Backpropagation simple example " # % # " & % & " ' , = *.( ) *% /% , % = % à =

16 Backpropagation simple example " # % # " & % & " ' , = *.( ) *% /% , % = * 4 à 89 :; :% = :; :, 84 = *4 :, :% = :; :, *4

17 Backpropagation simple example " # % # " & % & " ' , = *.( ) *% /% , %, " = %" à 9: 94 = ", 9: 90 = %

18 Backpropagation simple example " # % # " & % & " ' , = *.( ) *% /%

19 Any problem? Can we do better?

20 Problems of backpropagation You always need to keep intermediate data in the memory during the forward pass in case it will be used in the backpropagation. Lack of flexibility, e.g., compute the gradient of gradient.

21 Automatic Differentiation (autodiff) Create computation graph for gradient computation

22 Automatic Differentiation (autodiff) Create computation graph for gradient computation " # % #, = *.( ) " & % & *% /% " '

23 Automatic Differentiation (autodiff) Create computation graph for gradient computation " # % # " & - = * /( ) % & *% /% " ' % & - % = /% à = /%&

24 Automatic Differentiation (autodiff) Create computation graph for gradient computation " # % # " & - = * /( ) % & *% /% " ' % & - % = % à =

25 Automatic Differentiation (autodiff) Create computation graph for gradient computation " # % # " & - = * /( ) % & *% /% " ' % & - % = * 5 à = *5

26 Automatic Differentiation (autodiff) Create computation graph for gradient computation " # % # " & - = * /( ) % & *% /% " ' 89 - %, " = %" à 8; = % % &

27 Automatic Differentiation (autodiff) Create computation graph for gradient computation " # % # " & - = * /( ) % & *% /% " ' % &

28 AutoDiff Algorithm W x y matmult softmax log mul mean cross_en tropy y_ W_grad matmulttranspose log-grad softmax-grad mul / batch_size

29 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad! " exp! %

30 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad node_to_grad: :! " exp! %

31 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad node_to_grad:! %! " :! " exp! %

32 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad node_to_grad:! %! " :! " exp! % "! %

33 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad node_to_grad: :! % :! % : "! %! "! " exp! % "! %

34 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad node_to_grad: :! % :! % : "! %! "! " exp! % "! %

35 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad! " exp! % $ "! % id node_to_grad: :! % :! % : "! %! "

36 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad! " exp! % $ "! % id node_to_grad: :! % :! % : ", $! %! "

37 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad! " exp! % $ "! % id node_to_grad: :! % :! % : ", $! %! "

38 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad node_to_grad: :! % :! % : ", $! %! "! " exp! %! " "! % $ id

39 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad node_to_grad: :! % :! % : ", $! " :! "! %! "! " exp! %! " $ id "! %

40 AutoDiff Algorithm def gradient(out): node_to_grad[out] = nodes = get_node_list(out) for node in reverse_topo_order(nodes): grad ß sum partial adjoints from output edges input_grads ß node.op.gradient(input, grad) for input in node.inputs add input_grads to node_to_grad return node_to_grad node_to_grad: :! % :! % : ", $! " :! "! %! "! " exp! %! " $ id "! %

41 Backpropagation vs AutoDiff Backpropagation AutoDiff! "! " exp! #! )! (! " exp! #! (! #! #! # id! (! )! )

42 Recap Numerical differentiation Tool to check the correctness of implementation Backpropagation Easy to understand and implement Bad for memory use and schedule optimization Automatic differentiation Generate gradient computation to entire computation graph Better for system optimization

43 References Automatic differentiation in machine learning: a survey CS23n backpropagation:

Lecture 3: Overview of Deep Learning System. CSE599W: Spring 2018

Lecture 3: Overview of Deep Learning System. CSE599W: Spring 2018 Lecture 3: Overview of Deep Learning System CSE599W: Spring 2018 The Deep Learning Systems Juggle We won t focus on a specific one, but will discuss the common and useful elements of these systems Typical

More information

Administrative. Assignment 1 due Wednesday April 18, 11:59pm

Administrative. Assignment 1 due Wednesday April 18, 11:59pm Lecture 4-1 Administrative Assignment 1 due Wednesday April 18, 11:59pm Lecture 4-2 Administrative All office hours this week will use queuestatus Lecture 4-3 Where we are... scores function SVM loss data

More information

CSC321 Lecture 10: Automatic Differentiation

CSC321 Lecture 10: Automatic Differentiation CSC321 Lecture 10: Automatic Differentiation Roger Grosse Roger Grosse CSC321 Lecture 10: Automatic Differentiation 1 / 23 Overview Implementing backprop by hand is like programming in assembly language.

More information

autograd tutorial Paul Vicol, Slides Based on Ryan Adams January 30, 2017 CSC 321, University of Toronto

autograd tutorial Paul Vicol, Slides Based on Ryan Adams January 30, 2017 CSC 321, University of Toronto autograd tutorial Paul Vicol, Slides Based on Ryan Adams January 30, 2017 CSC 321, University of Toronto 1 tutorial outline 1. Automatic Differentiation 2. Introduction to Autograd 3. IPython Notebook

More information

Backpropagation and Neural Networks. Lecture 4-1

Backpropagation and Neural Networks. Lecture 4-1 Lecture 4: Backpropagation and Neural Networks Lecture 4-1 Administrative Assignment 1 due Thursday April 20, 11:59pm on Canvas Lecture 4-2 Administrative Project: TA specialities and some project ideas

More information

Lecture 11: Distributed Training and Communication Protocols. CSE599W: Spring 2018

Lecture 11: Distributed Training and Communication Protocols. CSE599W: Spring 2018 Lecture 11: Distributed Training and Communication Protocols CSE599W: Spring 2018 Where are we High level Packages User API Programming API Gradient Calculation (Differentiation API) System Components

More information

UNIT 2B An Introduction to Programming. Announcements

UNIT 2B An Introduction to Programming. Announcements UNIT 2B An Introduction to Programming 1 Announcements Tutoring help on Mondays 8:30 11:00 pm in the Mudge Reading Room Extra help session Fridays 12:00 2:00 pm in GHC 4122 Academic integrity forms Always

More information

Autodiff generates your exponential family inference code + Autograd s implementation

Autodiff generates your exponential family inference code + Autograd s implementation Autodiff generates your exponential family inference code + Autograd s implementation p(x ) =exp{ t(x) log Z( )} p(x ) =exp{ t(x) log Z( )} Bernoulli: t(x) = x = log p p(x ) =exp{ t(x) log Z( )} Gaussian:

More information

Lecture : Neural net: initialization, activations, normalizations and other practical details Anne Solberg March 10, 2017

Lecture : Neural net: initialization, activations, normalizations and other practical details Anne Solberg March 10, 2017 INF 5860 Machine learning for image classification Lecture : Neural net: initialization, activations, normalizations and other practical details Anne Solberg March 0, 207 Mandatory exercise Available tonight,

More information

Lecture : Training a neural net part I Initialization, activations, normalizations and other practical details Anne Solberg February 28, 2018

Lecture : Training a neural net part I Initialization, activations, normalizations and other practical details Anne Solberg February 28, 2018 INF 5860 Machine learning for image classification Lecture : Training a neural net part I Initialization, activations, normalizations and other practical details Anne Solberg February 28, 2018 Reading

More information

Backpropagation + Deep Learning

Backpropagation + Deep Learning 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Backpropagation + Deep Learning Matt Gormley Lecture 13 Mar 1, 2018 1 Reminders

More information

CSE 20. Lecture 4: Number System and Boolean Function. CSE 20: Lecture2

CSE 20. Lecture 4: Number System and Boolean Function. CSE 20: Lecture2 CSE 20 Lecture 4: Number System and Boolean Function Next Weeks Next week we will do Unit:NT, Section 1. There will be an assignment set posted today. It is just for practice. Boolean Functions and Number

More information

CSC 261/461 Database Systems Lecture 21 and 22. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

CSC 261/461 Database Systems Lecture 21 and 22. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 CSC 261/461 Database Systems Lecture 21 and 22 Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 Announcements Project 3 (MongoDB): Due on: 04/12 Work on Term Project and Project 1 The last (mini)

More information

Parallel Deep Network Training

Parallel Deep Network Training Lecture 26: Parallel Deep Network Training Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2016 Tunes Speech Debelle Finish This Album (Speech Therapy) Eat your veggies and study

More information

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer

More information

Lecture 20: Neural Networks for NLP. Zubin Pahuja

Lecture 20: Neural Networks for NLP. Zubin Pahuja Lecture 20: Neural Networks for NLP Zubin Pahuja zpahuja2@illinois.edu courses.engr.illinois.edu/cs447 CS447: Natural Language Processing 1 Today s Lecture Feed-forward neural networks as classifiers simple

More information

COMP 551 Applied Machine Learning Lecture 14: Neural Networks

COMP 551 Applied Machine Learning Lecture 14: Neural Networks COMP 551 Applied Machine Learning Lecture 14: Neural Networks Instructor: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551 Unless otherwise noted, all material posted for this course

More information

Natural Language Processing with Deep Learning CS224N/Ling284. Christopher Manning Lecture 4: Backpropagation and computation graphs

Natural Language Processing with Deep Learning CS224N/Ling284. Christopher Manning Lecture 4: Backpropagation and computation graphs Natural Language Processing with Deep Learning CS4N/Ling84 Christopher Manning Lecture 4: Backpropagation and computation graphs Lecture Plan Lecture 4: Backpropagation and computation graphs 1. Matrix

More information

VSCode: Open Project -> View Terminal -> npm run pull -> npm start. Lecture 16

VSCode: Open Project -> View Terminal -> npm run pull -> npm start. Lecture 16 VSCode: Open Project -> View Terminal -> npm run pull -> npm start for Loops Lecture 16 Don t Stop Believin I would feel excited and hype because this song is a classic 101 student Announcements Quiz 2

More information

Inf2C - Computer Systems Lecture 2 Data Representation

Inf2C - Computer Systems Lecture 2 Data Representation Inf2C - Computer Systems Lecture 2 Data Representation Boris Grot School of Informatics University of Edinburgh Last lecture Moore s law Types of computer systems Computer components Computer system stack

More information

ImageNet Classification with Deep Convolutional Neural Networks

ImageNet Classification with Deep Convolutional Neural Networks ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky Ilya Sutskever Geoffrey Hinton University of Toronto Canada Paper with same name to appear in NIPS 2012 Main idea Architecture

More information

RECURSION (CONTINUED)

RECURSION (CONTINUED) RECURSION (CONTINUED) Lecture 9 CS2110 Spring 2018 Prelim two weeks from today: 13 March. 1. Visit Exams page of course website, check what time your prelim is, complete assignment P1Conflict ONLY if necessary.

More information

npm run pull npm start

npm run pull npm start 1. Open Visual Studio Code 2. At the top click on View->Integrated Terminal (if not already open) 3. In the terminal, first run: npm run pull 4. After this finishes run: npm start For Loops Lecture 14

More information

Replacing Neural Networks with Black-Box ODE Solvers

Replacing Neural Networks with Black-Box ODE Solvers Replacing Neural Networks with Black-Box ODE Solvers Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud University of Toronto, Vector Institute Resnets are Euler integrators Middle layers

More information

Lecture #3: Environments

Lecture #3: Environments Lecture #3: Environments Substitution is not as simple as it might seem. For example: def f(x): def g(x): return x + 10 return g(5) f(3) When we call f(3), we should not substitute 3 for the xs in g! And

More information

Lecture 8: Conditionals & Control Flow (Sections ) CS 1110 Introduction to Computing Using Python

Lecture 8: Conditionals & Control Flow (Sections ) CS 1110 Introduction to Computing Using Python http://www.cs.cornell.edu/courses/cs1110/2018sp Lecture 8: Conditionals & Control Flow (Sections 5.1-5.7) CS 1110 Introduction to Computing Using Python [E. Andersen, A. Bracy, D. Gries, L. Lee, S. Marschner,

More information

Engineering Mathematics (4)

Engineering Mathematics (4) Engineering Mathematics (4) Zhang, Xinyu Department of Computer Science and Engineering, Ewha Womans University, Seoul, Korea zhangxy@ewha.ac.kr Example With respect to parameter: s (arc length) r( t)

More information

Gated Recurrent Models. Stephan Gouws & Richard Klein

Gated Recurrent Models. Stephan Gouws & Richard Klein Gated Recurrent Models Stephan Gouws & Richard Klein Outline Part 1: Intuition, Inference and Training Building intuitions: From Feedforward to Recurrent Models Inference in RNNs: Fprop Training in RNNs:

More information

ECE521 Lecture 18 Graphical Models Hidden Markov Models

ECE521 Lecture 18 Graphical Models Hidden Markov Models ECE521 Lecture 18 Graphical Models Hidden Markov Models Outline Graphical models Conditional independence Conditional independence after marginalization Sequence models hidden Markov models 2 Graphical

More information

Back Propagation and Other Differentiation Algorithms. Sargur N. Srihari

Back Propagation and Other Differentiation Algorithms. Sargur N. Srihari Back Propagation and Other Differentiation Algorithms Sargur N. srihari@cedar.buffalo.edu 1 Topics (Deep Feedforward Networks) Overview 1. Example: Learning XOR 2. Gradient-Based Learning 3. Hidden Units

More information

CS281 Section 3: Practical Optimization

CS281 Section 3: Practical Optimization CS281 Section 3: Practical Optimization David Duvenaud and Dougal Maclaurin Most parameter estimation problems in machine learning cannot be solved in closed form, so we often have to resort to numerical

More information

Euler s Methods (a family of Runge- Ku9a methods)

Euler s Methods (a family of Runge- Ku9a methods) Euler s Methods (a family of Runge- Ku9a methods) ODE IVP An Ordinary Differential Equation (ODE) is an equation that contains a function having one independent variable: The equation is coupled with an

More information

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech Convolutional Neural Networks Computer Vision Jia-Bin Huang, Virginia Tech Today s class Overview Convolutional Neural Network (CNN) Training CNN Understanding and Visualizing CNN Image Categorization:

More information

Better Living with Automatic Differentiation

Better Living with Automatic Differentiation Better Living with Automatic Differentiation clab lecture: September 2, 2014 Chris Dyer Empirical Risk Minimization Define a model that predicts something given some inputs and modulated by parameters

More information

Knowledge Discovery and Data Mining. Neural Nets. A simple NN as a Mathematical Formula. Notes. Lecture 13 - Neural Nets. Tom Kelsey.

Knowledge Discovery and Data Mining. Neural Nets. A simple NN as a Mathematical Formula. Notes. Lecture 13 - Neural Nets. Tom Kelsey. Knowledge Discovery and Data Mining Lecture 13 - Neural Nets Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-13-NN

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Lecture 13 - Neural Nets Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-13-NN

More information

CS1 Recitation. Week 2

CS1 Recitation. Week 2 CS1 Recitation Week 2 Sum of Squares Write a function that takes an integer n n must be at least 0 Function returns the sum of the square of each value between 0 and n, inclusive Code: (define (square

More information

Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis - WS 07/08 Modern Methods of Data Analysis Lecture XV (04.02.08) Contents: Function Minimization (see E. Lohrmann & V. Blobel) Optimization Problem Set of n independent variables Sometimes in addition some constraints

More information

Memory Bandwidth and Low Precision Computation. CS6787 Lecture 9 Fall 2017

Memory Bandwidth and Low Precision Computation. CS6787 Lecture 9 Fall 2017 Memory Bandwidth and Low Precision Computation CS6787 Lecture 9 Fall 2017 Memory as a Bottleneck So far, we ve just been talking about compute e.g. techniques to decrease the amount of compute by decreasing

More information

COS433/Math 473: Cryptography. Mark Zhandry Princeton University Spring 2017

COS433/Math 473: Cryptography. Mark Zhandry Princeton University Spring 2017 COS433/Math 473: Cryptography Mark Zhandry Princeton University Spring 2017 Previously on COS 433 Pseudorandom Permutations unctions that look like random permutations Syntax: Key space K (usually {0,1}

More information

Neural Network Neurons

Neural Network Neurons Neural Networks Neural Network Neurons 1 Receives n inputs (plus a bias term) Multiplies each input by its weight Applies activation function to the sum of results Outputs result Activation Functions Given

More information

Lab 7 1 Due Thu., 6 Apr. 2017

Lab 7 1 Due Thu., 6 Apr. 2017 Lab 7 1 Due Thu., 6 Apr. 2017 CMPSC 112 Introduction to Computer Science II (Spring 2017) Prof. John Wenskovitch http://cs.allegheny.edu/~jwenskovitch/teaching/cmpsc112 Lab 7 - Using Stacks to Create a

More information

The Anatomy of Deep Learning Frameworks* *Everything you wanted to know about DL Frameworks but were afraid to ask

The Anatomy of Deep Learning Frameworks* *Everything you wanted to know about DL Frameworks but were afraid to ask The Anatomy of Deep Learning Frameworks* *Everything you wanted to know about DL Frameworks but were afraid to ask $whoami Master s Student in CS @ ETH Zurich, bachelors from BITS Pilani Contributor to

More information

Math 104, Spring 2010 Course Log

Math 104, Spring 2010 Course Log Math 104, Spring 2010 Course Log Date: 1/11 Sections: 1.3, 1.4 Log: Lines in the plane. The point-slope and slope-intercept formulas. Functions. Domain and range. Compositions of functions. Inverse functions.

More information

ODE IVP. An Ordinary Differential Equation (ODE) is an equation that contains a function having one independent variable:

ODE IVP. An Ordinary Differential Equation (ODE) is an equation that contains a function having one independent variable: Euler s Methods ODE IVP An Ordinary Differential Equation (ODE) is an equation that contains a function having one independent variable: The equation is coupled with an initial value/condition (i.e., value

More information

Lecture 9: Scheduling. CSE599G1: Spring 2017

Lecture 9: Scheduling. CSE599G1: Spring 2017 Lecture 9: Scheduling CSE599G1: Spring 2017 Next Week Two Joint Sessions with Computer Architecture Class Different date, time and location, detail to be announced Wed: ASICs for deep learning Friday:

More information

Announcements. Edges. Last Lecture. Gradients: Numerical Derivatives f(x) Edge Detection, Lines. Intro Computer Vision. CSE 152 Lecture 10

Announcements. Edges. Last Lecture. Gradients: Numerical Derivatives f(x) Edge Detection, Lines. Intro Computer Vision. CSE 152 Lecture 10 Announcements Assignment 2 due Tuesday, May 4. Edge Detection, Lines Midterm: Thursday, May 6. Introduction to Computer Vision CSE 152 Lecture 10 Edges Last Lecture 1. Object boundaries 2. Surface normal

More information

CS5670: Computer Vision

CS5670: Computer Vision CS5670: Computer Vision Noah Snavely Lecture 4: Harris corner detection Szeliski: 4.1 Reading Announcements Project 1 (Hybrid Images) code due next Wednesday, Feb 14, by 11:59pm Artifacts due Friday, Feb

More information

CSC 261/461 Database Systems Lecture 19

CSC 261/461 Database Systems Lecture 19 CSC 261/461 Database Systems Lecture 19 Fall 2017 Announcements CIRC: CIRC is down!!! MongoDB and Spark (mini) projects are at stake. L Project 1 Milestone 4 is out Due date: Last date of class We will

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Probabilistic Graphical Models Raquel Urtasun and Tamir Hazan TTI Chicago April 22, 2011 Raquel Urtasun and Tamir Hazan (TTI-C) Graphical Models April 22, 2011 1 / 22 If the graph is non-chordal, then

More information

Sequence Modeling: Recurrent and Recursive Nets. By Pyry Takala 14 Oct 2015

Sequence Modeling: Recurrent and Recursive Nets. By Pyry Takala 14 Oct 2015 Sequence Modeling: Recurrent and Recursive Nets By Pyry Takala 14 Oct 2015 Agenda Why Recurrent neural networks? Anatomy and basic training of an RNN (10.2, 10.2.1) Properties of RNNs (10.2.2, 8.2.6) Using

More information

A Brief Overview of Optimization Problems. Steven G. Johnson MIT course , Fall 2008

A Brief Overview of Optimization Problems. Steven G. Johnson MIT course , Fall 2008 A Brief Overview of Optimization Problems Steven G. Johnson MIT course 18.335, Fall 2008 Why optimization? In some sense, all engineering design is optimization: choosing design parameters to improve some

More information

Models of Computation for Massive Data

Models of Computation for Massive Data Models of Computation for Massive Data Jeff M. Phillips August 28, 2013 Outline Sequential: External Memory / (I/O)-Efficient Streaming Parallel: P and BSP MapReduce GP-GPU Distributed Computing runtime

More information

CS1 Lecture 5 Jan. 25, 2019

CS1 Lecture 5 Jan. 25, 2019 CS1 Lecture 5 Jan. 25, 2019 HW1 due Monday, 9:00am. Notes: Do not write all the code at once before starting to test. Take tiny steps. Write a few lines test... add a line or two test... add another line

More information

Math 20A lecture 10 The Gradient Vector

Math 20A lecture 10 The Gradient Vector Math 20A lecture 10 p. 1/12 Math 20A lecture 10 The Gradient Vector T.J. Barnet-Lamb tbl@brandeis.edu Brandeis University Math 20A lecture 10 p. 2/12 Announcements Homework five posted, due this Friday

More information

CS1 Lecture 5 Jan. 26, 2018

CS1 Lecture 5 Jan. 26, 2018 CS1 Lecture 5 Jan. 26, 2018 HW1 due Monday, 9:00am. Notes: Do not write all the code at once (for Q1 and 2) before starting to test. Take tiny steps. Write a few lines test... add a line or two test...

More information

Homework 3. Assigned on 02/15 Due time: midnight on 02/21 (1 WEEK only!) B.2 B.11 B.14 (hint: use multiplexors) CSCI 402: Computer Architectures

Homework 3. Assigned on 02/15 Due time: midnight on 02/21 (1 WEEK only!) B.2 B.11 B.14 (hint: use multiplexors) CSCI 402: Computer Architectures Homework 3 Assigned on 02/15 Due time: midnight on 02/21 (1 WEEK only!) B.2 B.11 B.14 (hint: use multiplexors) 1 CSCI 402: Computer Architectures Arithmetic for Computers (2) Fengguang Song Department

More information

Reals 1. Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method.

Reals 1. Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method. Reals 1 13 Reals Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method. 13.1 Floating-point numbers Real numbers, those declared to be

More information

CNN Visualizations. Seoul AI Meetup Martin Kersner, 2018/01/06

CNN Visualizations. Seoul AI Meetup Martin Kersner, 2018/01/06 CNN Visualizations Seoul AI Meetup Martin Kersner, 2018/01/06 Content 1. Visualization of convolutional weights from the first layer 2. Visualization of patterns learned by higher layers 3. Weakly Supervised

More information

Introduction to Data Management. Lecture 15 (More About Indexing)

Introduction to Data Management. Lecture 15 (More About Indexing) Introduction to Data Management Lecture 15 (More About Indexing) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v HW s and quizzes:

More information

Metropolis Light Transport

Metropolis Light Transport Metropolis Light Transport CS295, Spring 2017 Shuang Zhao Computer Science Department University of California, Irvine CS295, Spring 2017 Shuang Zhao 1 Announcements Final presentation June 13 (Tuesday)

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part V Lecture 13, March 10, 2014 Mohammad Hammoud Today Welcome Back from Spring Break! Today Last Session: DBMS Internals- Part IV Tree-based (i.e., B+

More information

Lecture 9: (Semi-)bandits and experts with linear costs (part I)

Lecture 9: (Semi-)bandits and experts with linear costs (part I) CMSC 858G: Bandits, Experts and Games 11/03/16 Lecture 9: (Semi-)bandits and experts with linear costs (part I) Instructor: Alex Slivkins Scribed by: Amr Sharaf In this lecture, we will study bandit problems

More information

CS1 Lecture 15 Feb. 18, 2019

CS1 Lecture 15 Feb. 18, 2019 CS1 Lecture 15 Feb. 18, 2019 HW4 due Wed. 2/20, 5pm Q2 and Q3: it is fine to use a loop as long as the function is also recursive. Exam 1, Thursday evening, 2/21, 6:30-8:00pm, W290 CB You must bring ID

More information

Networking Sensors, II

Networking Sensors, II Networking Sensors, II Sensing Networking Leonidas Guibas Stanford University Computation CS321 ZG Book, Ch. 3 1 Class Administration Paper presentation preferences due today, by class time Project info

More information

x 6 + λ 2 x 6 = for the curve y = 1 2 x3 gives f(1, 1 2 ) = λ actually has another solution besides λ = 1 2 = However, the equation λ

x 6 + λ 2 x 6 = for the curve y = 1 2 x3 gives f(1, 1 2 ) = λ actually has another solution besides λ = 1 2 = However, the equation λ Math 0 Prelim I Solutions Spring 010 1. Let f(x, y) = x3 y for (x, y) (0, 0). x 6 + y (4 pts) (a) Show that the cubic curves y = x 3 are level curves of the function f. Solution. Substituting y = x 3 in

More information

Lecture 13. Lecture 13: B+ Tree

Lecture 13. Lecture 13: B+ Tree Lecture 13 Lecture 13: B+ Tree Lecture 13 Announcements 1. Project Part 2 extension till Friday 2. Project Part 3: B+ Tree coming out Friday 3. Poll for Nov 22nd 4. Exam Pickup: If you have questions,

More information

CSC 261/461 Database Systems Lecture 24

CSC 261/461 Database Systems Lecture 24 CSC 261/461 Database Systems Lecture 24 Fall 2017 TRANSACTIONS Announcement Poster: You should have sent us the poster by yesterday. If you have not done so, please send us asap. Make sure to send it for

More information

A Brief Overview of Optimization Problems. Steven G. Johnson MIT course , Fall 2008

A Brief Overview of Optimization Problems. Steven G. Johnson MIT course , Fall 2008 A Brief Overview of Optimization Problems Steven G. Johnson MIT course 18.335, Fall 2008 Why optimization? In some sense, all engineering design is optimization: choosing design parameters to improve some

More information

Automatic Differentiation for R

Automatic Differentiation for R Automatic Differentiation for R Radford M. Neal, University of Toronto Dept. of Statistical Sciences and Dept. of Computer Science Vector Institute Affiliate http://www.cs.utoronto.ca/ radford http://radfordneal.wordpress.com

More information

Ambiguity. Lecture 8. CS 536 Spring

Ambiguity. Lecture 8. CS 536 Spring Ambiguity Lecture 8 CS 536 Spring 2001 1 Announcement Reading Assignment Context-Free Grammars (Sections 4.1, 4.2) Programming Assignment 2 due Friday! Homework 1 due in a week (Wed Feb 21) not Feb 25!

More information

Lecture 2 Notes. Outline. Neural Networks. The Big Idea. Architecture. Instructors: Parth Shah, Riju Pahwa

Lecture 2 Notes. Outline. Neural Networks. The Big Idea. Architecture. Instructors: Parth Shah, Riju Pahwa Instructors: Parth Shah, Riju Pahwa Lecture 2 Notes Outline 1. Neural Networks The Big Idea Architecture SGD and Backpropagation 2. Convolutional Neural Networks Intuition Architecture 3. Recurrent Neural

More information

What you will learn today

What you will learn today What you will learn today Tangent Planes and Linear Approximation and the Gradient Vector Vector Functions 1/21 Recall in one-variable calculus, as we zoom in toward a point on a curve, the graph becomes

More information

CSE 332: Data Structures. Spring 2016 Richard Anderson Lecture 1

CSE 332: Data Structures. Spring 2016 Richard Anderson Lecture 1 CSE 332: Data Structures Spring 2016 Richard Anderson Lecture 1 CSE 332 Team Instructors: Richard Anderson anderson at cs TAs: Hunter Zahn, Andrew Li hzahn93 at cs lia4 at cs 2 Today s Outline Introductions

More information

CSC148 Week 7. Larry Zhang

CSC148 Week 7. Larry Zhang CSC148 Week 7 Larry Zhang 1 Announcements Test 1 can be picked up in DH-3008 A1 due this Saturday Next week is reading week no lecture, no labs no office hours 2 Recap Last week, learned about binary trees

More information

DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla

DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla DEEP LEARNING REVIEW Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature 2015 -Presented by Divya Chitimalla What is deep learning Deep learning allows computational models that are composed of multiple

More information

61A Lecture 6. Monday, February 2

61A Lecture 6. Monday, February 2 61A Lecture 6 Monday, February 2 Announcements Homework 2 due Monday 2/2 @ 11:59pm Project 1 due Thursday 2/5 @ 11:59pm Project party on Tuesday 2/3 5pm-6:30pm in 2050 VLSB Partner party on Wednesday 2/4

More information

Deep neural networks II

Deep neural networks II Deep neural networks II May 31 st, 2018 Yong Jae Lee UC Davis Many slides from Rob Fergus, Svetlana Lazebnik, Jia-Bin Huang, Derek Hoiem, Adriana Kovashka, Why (convolutional) neural networks? State of

More information

Introduction to Computer Science

Introduction to Computer Science Introduction to Computer Science CSCI 109 Readings St. Amant, Ch. 4, Ch. 8 China Tianhe-2 Andrew Goodney Spring 2018 An algorithm (pronounced AL-go-rithum) is a procedure or formula for solving a problem.

More information

CPSC 340: Machine Learning and Data Mining. Logistic Regression Fall 2016

CPSC 340: Machine Learning and Data Mining. Logistic Regression Fall 2016 CPSC 340: Machine Learning and Data Mining Logistic Regression Fall 2016 Admin Assignment 1: Marks visible on UBC Connect. Assignment 2: Solution posted after class. Assignment 3: Due Wednesday (at any

More information

Neural Networks for Machine Learning. Lecture 15a From Principal Components Analysis to Autoencoders

Neural Networks for Machine Learning. Lecture 15a From Principal Components Analysis to Autoencoders Neural Networks for Machine Learning Lecture 15a From Principal Components Analysis to Autoencoders Geoffrey Hinton Nitish Srivastava, Kevin Swersky Tijmen Tieleman Abdel-rahman Mohamed Principal Components

More information

A Usability Evaluation of Google Calendar

A Usability Evaluation of Google Calendar A Usability Evaluation of Google Calendar Executive summary Purpose This study is to conduct a usability evaluation of web-based Google Calendar. Four highlevel questions about the usability of this product

More information

ME/CS 132: Introduction to Vision-based Robot Navigation! Low-level Image Processing" Larry Matthies"

ME/CS 132: Introduction to Vision-based Robot Navigation! Low-level Image Processing Larry Matthies ME/CS 132: Introduction to Vision-based Robot Navigation! Low-level Image Processing" Larry Matthies" lhm@jpl.nasa.gov, 818-354-3722" Announcements" First homework grading is done! Second homework is due

More information

Computer Vision I. Announcements. Fourier Tansform. Efficient Implementation. Edge and Corner Detection. CSE252A Lecture 13.

Computer Vision I. Announcements. Fourier Tansform. Efficient Implementation. Edge and Corner Detection. CSE252A Lecture 13. Announcements Edge and Corner Detection HW3 assigned CSE252A Lecture 13 Efficient Implementation Both, the Box filter and the Gaussian filter are separable: First convolve each row of input image I with

More information

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2017

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2017 CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2017 Assignment 3: 2 late days to hand in tonight. Admin Assignment 4: Due Friday of next week. Last Time: MAP Estimation MAP

More information

EECS 496 Statistical Language Models. Winter 2018

EECS 496 Statistical Language Models. Winter 2018 EECS 496 Statistical Language Models Winter 2018 Introductions Professor: Doug Downey Course web site: www.cs.northwestern.edu/~ddowney/courses/496_winter2018 (linked off prof. home page) Logistics Grading

More information

Computer Vision I. Announcement. Corners. Edges. Numerical Derivatives f(x) Edge and Corner Detection. CSE252A Lecture 11

Computer Vision I. Announcement. Corners. Edges. Numerical Derivatives f(x) Edge and Corner Detection. CSE252A Lecture 11 Announcement Edge and Corner Detection Slides are posted HW due Friday CSE5A Lecture 11 Edges Corners Edge is Where Change Occurs: 1-D Change is measured by derivative in 1D Numerical Derivatives f(x)

More information

Lecture 12. Lecture 12: The IO Model & External Sorting

Lecture 12. Lecture 12: The IO Model & External Sorting Lecture 12 Lecture 12: The IO Model & External Sorting Announcements Announcements 1. Thank you for the great feedback (post coming soon)! 2. Educational goals: 1. Tech changes, principles change more

More information

CSCI 136 Data Structures & Advanced Programming. Lecture 7 Spring 2018 Bill and Jon

CSCI 136 Data Structures & Advanced Programming. Lecture 7 Spring 2018 Bill and Jon CSCI 136 Data Structures & Advanced Programming Lecture 7 Spring 2018 Bill and Jon Administrative Details Lab 3 Wednesday! You may work with a partner Fill out Lab 3 Partners Google form either way! Come

More information

Lecture 21. Lecture 21: Concurrency & Locking

Lecture 21. Lecture 21: Concurrency & Locking Lecture 21 Lecture 21: Concurrency & Locking Lecture 21 Today s Lecture 1. Concurrency, scheduling & anomalies 2. Locking: 2PL, conflict serializability, deadlock detection 2 Lecture 21 > Section 1 1.

More information

CSE 417 Dynamic Programming (pt 4) Sub-problems on Trees

CSE 417 Dynamic Programming (pt 4) Sub-problems on Trees CSE 417 Dynamic Programming (pt 4) Sub-problems on Trees Reminders > HW4 is due today > HW5 will be posted shortly Dynamic Programming Review > Apply the steps... 1. Describe solution in terms of solution

More information

Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling)

Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) 18-447 Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 2/13/2015 Agenda for Today & Next Few Lectures

More information

System Programming CISC 360. Floating Point September 16, 2008

System Programming CISC 360. Floating Point September 16, 2008 System Programming CISC 360 Floating Point September 16, 2008 Topics IEEE Floating Point Standard Rounding Floating Point Operations Mathematical properties Powerpoint Lecture Notes for Computer Systems:

More information

Some material taken from: Yuri Boykov, Western Ontario

Some material taken from: Yuri Boykov, Western Ontario CS664 Lecture #22: Distance transforms, Hausdorff matching, flexible models Some material taken from: Yuri Boykov, Western Ontario Announcements The SIFT demo toolkit is available from http://www.evolution.com/product/oem/d

More information

Manipulating Integers

Manipulating Integers Manipulating Integers Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong (jinkyu@skku.edu)

More information

DEEP LEARNING IN PYTHON. The need for optimization

DEEP LEARNING IN PYTHON. The need for optimization DEEP LEARNING IN PYTHON The need for optimization A baseline neural network Input 2 Hidden Layer 5 2 Output - 9-3 Actual Value of Target: 3 Error: Actual - Predicted = 4 A baseline neural network Input

More information

Lecture 8 Index (B+-Tree and Hash)

Lecture 8 Index (B+-Tree and Hash) CompSci 516 Data Intensive Computing Systems Lecture 8 Index (B+-Tree and Hash) Instructor: Sudeepa Roy Duke CS, Fall 2017 CompSci 516: Database Systems 1 HW1 due tomorrow: Announcements Due on 09/21 (Thurs),

More information

Training Deep Neural Networks (in parallel)

Training Deep Neural Networks (in parallel) Lecture 9: Training Deep Neural Networks (in parallel) Visual Computing Systems How would you describe this professor? Easy? Mean? Boring? Nerdy? Professor classification task Classifies professors as

More information

Floating Point Numbers

Floating Point Numbers Floating Point Numbers Computer Systems Organization (Spring 2016) CSCI-UA 201, Section 2 Instructor: Joanna Klukowska Slides adapted from Randal E. Bryant and David R. O Hallaron (CMU) Mohamed Zahran

More information

Floating Point Numbers

Floating Point Numbers Floating Point Numbers Computer Systems Organization (Spring 2016) CSCI-UA 201, Section 2 Fractions in Binary Instructor: Joanna Klukowska Slides adapted from Randal E. Bryant and David R. O Hallaron (CMU)

More information