Neural Networks. Aarti Singh. Machine Learning Nov 3, Slides Courtesy: Tom Mitchell


1 Neural Networks. Aarti Singh, Machine Learning, Nov 3, 2011. Slides Courtesy: Tom Mitchell

2 Logistic Regression. Assumes the following functional form for P(Y|X): the logistic function applied to a linear function of the data. Logistic function (or sigmoid): σ(z) = 1 / (1 + exp(-z)). Features can be discrete or continuous! [Figure: plot of the logistic function against z]
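As a minimal sketch of the functional form on this slide: the code below applies the logistic function to a linear function of the features. It assumes the convention P(Y=1|X) = σ(w_0 + Σ w_i X_i); the slide's own formula was not preserved, so which class is labelled 1 is an assumption, and the names `sigmoid` and `predict_proba` are illustrative.

```python
import math


def sigmoid(z):
    """Logistic (sigmoid) function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return ez / (1.0 + ez)


def predict_proba(w0, w, x):
    """P(Y=1 | X=x) under the assumed convention sigmoid(w0 + sum_i w_i * x_i)."""
    z = w0 + sum(wi * xi for wi, xi in zip(w, x))
    return sigmoid(z)
```

Because the features only enter through the weighted sum, they can be discrete (for example 0/1 indicators) or continuous without changing the code.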

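3 Logistic Regression is a Linear Classifier! Assumes the same logistic functional form for P(Y|X). Decision boundary: the points where the two class probabilities are equal, which is where the linear function of the data equals zero (Linear Decision Boundary).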
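The linearity of the boundary follows in two steps. The derivation below is a sketch under the assumed convention P(Y=1|X) = σ(w_0 + Σ w_i X_i), with d features; the opposite sign convention gives the same hyperplane with the two sides swapped.

```latex
\begin{align*}
P(Y=1\mid X) \ge P(Y=0\mid X)
&\iff \frac{1}{1+\exp\!\big(-(w_0+\sum_{i=1}^{d} w_i X_i)\big)} \ge \frac{1}{2} \\
&\iff w_0 + \sum_{i=1}^{d} w_i X_i \ge 0 .
\end{align*}
```

The set where equality holds is a hyperplane in feature space, which is why logistic regression is a linear classifier even though P(Y|X) itself is non-linear in X.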

4 Training Logistic Regression. How to learn the parameters w_0, w_1, ..., w_d? Training data. Maximum (Conditional) Likelihood Estimates. Discriminative philosophy: don't waste effort learning P(X), focus on P(Y|X); that's all that matters for classification!

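5 Optimizing a convex function. Max conditional log-likelihood = min negative conditional log-likelihood. The negative conditional log-likelihood is a convex function, so it can be minimized by gradient descent: compute the gradient of the objective with respect to the weights and apply the update rule with learning rate η > 0.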
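A minimal sketch of gradient descent on the negative conditional log-likelihood. It assumes labels in {0, 1} and the convention P(Y=1|x) = σ(w_0 + Σ w_i x_i); the slide's gradient and update formulas were not preserved, so the function names (`nll`, `gradient_step`) and the batch update shown here are illustrative rather than the lecture's exact notation.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w0, w, x):
    """Assumed convention: P(Y=1 | x) = sigmoid(w0 + sum_i w_i * x_i)."""
    return sigmoid(w0 + sum(wi * xi for wi, xi in zip(w, x)))


def nll(w0, w, X, y):
    """Negative conditional log-likelihood, sum_j [log(1 + exp(z_j)) - y_j * z_j]."""
    total = 0.0
    for xj, yj in zip(X, y):
        z = w0 + sum(wi * xi for wi, xi in zip(w, xj))
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))  # log(1 + exp(z))
        total += softplus - yj * z
    return total


def gradient_step(w0, w, X, y, eta):
    """One batch update w <- w - eta * gradient of the negative log-likelihood."""
    g0 = 0.0
    g = [0.0] * len(w)
    for xj, yj in zip(X, y):
        err = predict_proba(w0, w, xj) - yj  # d(nll)/dz for this example
        g0 += err
        for i, xi in enumerate(xj):
            g[i] += err * xi
    return w0 - eta * g0, [wi - eta * gi for wi, gi in zip(w, g)]
```

Because the objective is convex, a small enough learning rate decreases it at every step and there are no spurious local minima; the learning rate still has to be chosen by the user.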

6 Logistic function as a Graph [Figure: sigmoid unit]

7 Neural Networks to learn f: X → Y. f can be a non-linear function. X: (vector of) continuous and/or discrete variables. Y: (vector of) continuous and/or discrete variables. Neural networks represent f by a network of logistic/sigmoid units. [Figure: sigmoid units arranged in an input layer X, a hidden layer H, and an output layer Y]

8 Neural network trained to distinguish vowel sounds using 2 formants (features). Two layers of logistic units give a highly non-linear decision surface. [Figure: input layer, hidden layer, output layer]

9 Neural network trained to drive a car! [Figure panels: weights to output units from the hidden unit; weights of each pixel for one hidden unit]

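11 Forward Propagation for prediction. Prediction: given a neural network (hidden units and weights), use it to predict the label of a test point. Forward propagation: start from the input layer; for each subsequent layer, compute the output of each sigmoid unit. Sigmoid unit: the output o is the sigmoid of a weighted sum of the unit's inputs. 1-hidden-layer, 1-output NN: the hidden-unit outputs o_h are the inputs to the output unit.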
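Forward propagation for the 1-hidden-layer, 1-output network can be sketched as follows. This is a minimal illustration, assuming every unit computes σ(bias + weighted sum of inputs); the argument names (`W_hidden`, `b_hidden`, `w_out`, `b_out`) are hypothetical and not taken from the slides.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def forward(x, W_hidden, b_hidden, w_out, b_out):
    """Return (hidden outputs o_h, network output o) for one input vector x.

    W_hidden[h] holds the input weights of hidden unit h, b_hidden[h] its bias;
    w_out[h] is the weight from hidden unit h to the single output unit.
    """
    # Layer 1: each hidden sigmoid unit sees the raw inputs.
    o_h = [
        sigmoid(b + sum(wi * xi for wi, xi in zip(w_row, x)))
        for w_row, b in zip(W_hidden, b_hidden)
    ]
    # Layer 2: the output sigmoid unit sees the hidden outputs.
    o = sigmoid(b_out + sum(wh * oh for wh, oh in zip(w_out, o_h)))
    return o_h, o
```

For a deeper network the same pattern repeats: the outputs of one layer become the inputs of the next.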

12 Training Neural Networks. The sigmoid unit is differentiable. [Figure: sigmoid unit]

13 M(C)LE Training for Neural Networks. Consider the regression problem f: X → Y, for scalar Y: y = f(x) + ε, where f is deterministic and the noise ε ~ N(0, σ_ε) is iid. Let's maximize the conditional data likelihood. Learned neural network: train the weights of all units to minimize the sum of squared errors of predicted network outputs.
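The step from maximum conditional likelihood to squared error can be written out as below. This is a sketch under notational assumptions that are not in the transcription: training examples are indexed by l, the network output for weights W is written \hat f_W(x), and σ_ε is read as the noise standard deviation.

```latex
\begin{align*}
W_{\mathrm{MLE}}
&= \arg\max_{W} \; \ln \prod_{l} P\big(y^{l} \mid x^{l}, W\big) \\
&= \arg\max_{W} \; \sum_{l} \left[ -\tfrac{1}{2}\ln\!\big(2\pi\sigma_\varepsilon^{2}\big)
   - \frac{\big(y^{l} - \hat f_W(x^{l})\big)^{2}}{2\sigma_\varepsilon^{2}} \right] \\
&= \arg\min_{W} \; \sum_{l} \big(y^{l} - \hat f_W(x^{l})\big)^{2} .
\end{align*}
```

The first term inside the sum does not depend on W, and the positive factor 1/(2σ_ε²) does not change the minimizer, so only the sum of squared errors remains.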

14 MAP Training for Neural Networks. Consider the regression problem f: X → Y, for scalar Y: y = f(x) + ε, with deterministic f, noise ε ~ N(0, σ_ε), and Gaussian prior P(W) = N(0, σI), so that ln P(W) ↔ c Σ_i w_i^2. Train the weights of all units to minimize the sum of squared errors of predicted network outputs plus the weight magnitudes.
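Adding the Gaussian prior turns the objective into squared error plus a weight penalty. The derivation below is a sketch under notational assumptions not present in the transcription: examples are indexed by l, the network output is \hat f_W(x), and σ and σ_ε are read as standard deviations of the prior and the noise.

```latex
\begin{align*}
W_{\mathrm{MAP}}
&= \arg\max_{W} \; \Big[ \ln P(W) + \sum_{l} \ln P\big(y^{l}\mid x^{l}, W\big) \Big] \\
&= \arg\max_{W} \; \Big[ -\frac{1}{2\sigma^{2}} \sum_{i} w_i^{2}
   - \frac{1}{2\sigma_\varepsilon^{2}} \sum_{l} \big(y^{l} - \hat f_W(x^{l})\big)^{2} \Big] \\
&= \arg\min_{W} \; \Big[ c \sum_{i} w_i^{2} + \sum_{l} \big(y^{l} - \hat f_W(x^{l})\big)^{2} \Big],
\qquad c = \frac{\sigma_\varepsilon^{2}}{\sigma^{2}} .
\end{align*}
```

Under these assumptions a tighter prior (smaller σ) gives a larger c and therefore a stronger pull of the weights toward zero.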

15 E: Mean Square Error. For neural networks, E[w] is no longer convex in w.

16 Error Gradient for a Sigmoid Unit [Figure: sigmoid unit]

17 (MLE) Using forward propagation. y_k = target output (label); o_k, o_h = unit output (obtained by forward propagation); w_ij = weight from i to j. Note: if i is an input variable, o_i = x_i.
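The quantities defined on this slide are the ingredients of a backpropagation update. The sketch below assumes the standard rules for sigmoid units with squared error E = ½ Σ_k (y_k − o_k)², namely δ_k = o_k(1 − o_k)(y_k − o_k), δ_h = o_h(1 − o_h) Σ_k w_hk δ_k, and w_ij ← w_ij + η δ_j o_i. The slide's own formulas were not preserved, so treat this as an illustration; the weight layout (bias stored first in each row) and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W_h, W_o):
    """Each row of W_h / W_o is [bias, w_1, w_2, ...] for one unit."""
    o_h = [sigmoid(r[0] + sum(w * xi for w, xi in zip(r[1:], x))) for r in W_h]
    o_k = [sigmoid(r[0] + sum(w * oh for w, oh in zip(r[1:], o_h))) for r in W_o]
    return o_h, o_k


def squared_error(x, y, W_h, W_o):
    """E = 1/2 * sum_k (y_k - o_k)^2 for a single training example."""
    _, o_k = forward(x, W_h, W_o)
    return 0.5 * sum((yk - ok) ** 2 for yk, ok in zip(y, o_k))


def backprop_step(x, y, W_h, W_o, eta):
    """One stochastic gradient step on a single example; returns new weights."""
    o_h, o_k = forward(x, W_h, W_o)
    # Output units: delta_k = o_k (1 - o_k) (y_k - o_k)
    delta_k = [ok * (1 - ok) * (yk - ok) for ok, yk in zip(o_k, y)]
    # Hidden units: delta_h = o_h (1 - o_h) sum_k w_hk delta_k
    delta_h = [
        oh * (1 - oh) * sum(W_o[k][h + 1] * delta_k[k] for k in range(len(W_o)))
        for h, oh in enumerate(o_h)
    ]
    # w_ij <- w_ij + eta * delta_j * o_i (the bias has constant input 1)
    new_W_o = [
        [r[0] + eta * dk] + [w + eta * dk * oh for w, oh in zip(r[1:], o_h)]
        for r, dk in zip(W_o, delta_k)
    ]
    new_W_h = [
        [r[0] + eta * dh] + [w + eta * dh * xi for w, xi in zip(r[1:], x)]
        for r, dh in zip(W_h, delta_h)
    ]
    return new_W_h, new_W_o
```

Each weight change equals −η times the partial derivative of E with respect to that weight, so the step is plain gradient descent; the δ terms are just a way of reusing intermediate results from the output layer when updating the hidden layer.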

18 Using all training data D

19 Objective/Error no longer convex in weights

20 Dealing with Overfitting. Our learning algorithm involves a parameter n = number of gradient descent iterations. How do we choose n to optimize future error? (Note: similar issue for logistic regression, decision trees, ...) E.g. the n that minimizes the error rate of the neural net over future data.

21 Dealing with Overfitting. Our learning algorithm involves a parameter n = number of gradient descent iterations. How do we choose n to optimize future error? Separate available data into training and validation sets. Use the training set to perform gradient descent. n ← number of iterations that optimizes validation set error.
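Choosing n on a validation set can be sketched as follows. This is a minimal illustration, assuming the caller supplies a `step` function (one gradient descent iteration on the training set) and a `val_error` function (error on the validation set); both names and the function `choose_num_iterations` are hypothetical.

```python
def choose_num_iterations(init_params, step, val_error, max_iters):
    """Run gradient descent and remember the iteration with the lowest validation error.

    step(params)      -> parameters after one more training iteration
    val_error(params) -> error of those parameters on the held-out validation set
    Returns (best_n, best_params, best_error).
    """
    params = init_params
    best_n, best_params, best_err = 0, params, val_error(params)
    for n in range(1, max_iters + 1):
        params = step(params)
        err = val_error(params)
        if err < best_err:  # keep the earliest n that achieves the minimum
            best_n, best_params, best_err = n, params, err
    return best_n, best_params, best_err


# Toy usage: training loss (w - 1)^2, validation loss (w - 0.5)^2.
# Training keeps pushing w toward 1, but validation error is lowest earlier.
best_n, best_w, best_err = choose_num_iterations(
    0.0,
    step=lambda w: w - 0.25 * 2.0 * (w - 1.0),
    val_error=lambda w: (w - 0.5) ** 2,
    max_iters=10,
)
```

The validation examples must not be used inside `step`; otherwise the chosen n no longer estimates performance on future data.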

22 K-fold Cross-validation. Idea: train multiple times, leaving out a disjoint subset of data each time for test. Average the test set accuracies.
Partition data into K disjoint subsets
For k = 1 to K
    testdata = kth subset
    h ← classifier trained* on all data except for testdata
    accuracy(k) = accuracy of h on testdata
end
FinalAccuracy = mean of the K recorded testset accuracies
* might withhold some of this to choose number of gradient descent steps
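The K-fold procedure on this slide can be sketched in Python as below. It is a minimal illustration, assuming the caller provides a `train` function (returns a classifier h from a list of examples) and an `accuracy` function (scores h on a list of examples); those names and the strided partition are choices made here, not part of the slides.

```python
def k_fold_accuracy(data, K, train, accuracy):
    """Mean test accuracy over K disjoint folds.

    train(examples)       -> classifier h
    accuracy(h, examples) -> accuracy of h on those examples, in [0, 1]
    """
    if not 2 <= K <= len(data):
        raise ValueError("K must be between 2 and len(data)")
    # Partition into K disjoint subsets: fold k takes items k, k+K, k+2K, ...
    folds = [data[k::K] for k in range(K)]
    scores = []
    for k in range(K):
        test_data = folds[k]
        train_data = [ex for j, fold in enumerate(folds) if j != k for ex in fold]
        h = train(train_data)  # never sees test_data
        scores.append(accuracy(h, test_data))
    return sum(scores) / K
```

Setting `K = len(data)` gives leave-one-out cross-validation. If the data are ordered (for example sorted by label), shuffle them before calling the function so that each fold is representative.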

23 Leave-one-out Cross-validation. This is just k-fold cross-validation leaving out one example each iteration.
Partition data into K disjoint subsets, each containing one example
For k = 1 to K
    testdata = kth subset
    h ← classifier trained* on all data except for testdata
    accuracy(k) = accuracy of h on testdata
end
FinalAccuracy = mean of the K recorded testset accuracies
* might withhold some of this to choose number of gradient descent steps

24 Dealing with Overfitting. Cross-validation. Regularization: small weights imply the NN is linear (low VC dimension). Control the number of hidden units: low complexity. [Figure: logistic output as a function of Σ w_i x_i]
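One way to see why small weights make the network nearly linear is the Taylor expansion of the sigmoid around zero, shown here as a supporting sketch rather than as part of the original slide:

```latex
\sigma(z) = \frac{1}{1+e^{-z}} = \frac{1}{2} + \frac{z}{4} - \frac{z^{3}}{48} + O(z^{5}),
\qquad z = \sum_{i} w_i x_i .
```

When the weights are small, z stays close to zero for bounded inputs, the cubic and higher terms are negligible, and each unit behaves like the linear map ½ + z/4. A composition of such units is then approximately linear in x.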

29 [Figure: network with output units left, straight, right, up; bias weight w_0]

30 Semantic Memory Model Based on ANNs [McClelland & Rogers, Nature 2003]. No hierarchy given. Train with assertions, e.g., Can(Canary, Fly).

31 Humans act as though they have a hierarchical memory organization. 1. Victims of Semantic Dementia progressively lose knowledge of objects. But they lose specific details first, general properties later, suggesting hierarchical memory organization. 2. Children appear to learn general categories and properties first, following the same hierarchy, top down*. [Figure: hierarchy with nodes Thing, Living, NonLiving, Plant, Animal, Bird, Fish, Canary] Question: What learning mechanism could produce this emergent hierarchy? * Some debate remains on this.

32 Memory deterioration follows semantic hierarchy [McClelland & Rogers, Nature 2003]

34 Training Networks on Time Series. Suppose we want to predict the next state of the world, and it depends on a history of unknown length, e.g., a robot with forward-facing sensors trying to predict the next sensor reading as it moves and turns.

35 Training Networks on Time Series. Suppose we want to predict the next state of the world, and it depends on a history of unknown length, e.g., a robot with forward-facing sensors trying to predict the next sensor reading as it moves and turns. Idea: use the hidden layer in the network to capture state history.
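One common way to let a hidden layer capture state history is to feed the previous hidden outputs back in as extra inputs at the next time step. The sketch below assumes that simple recurrent design with a linear output unit; the slides do not specify the architecture, so the structure and the names (`W_in`, `W_rec`, `w_out`, `h0`) are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def recurrent_predict(inputs, W_in, W_rec, w_out, h0):
    """Run a simple recurrent network over a sequence of input vectors.

    W_in[j][i]  : weight from input i to hidden unit j
    W_rec[j][k] : weight from previous hidden unit k to hidden unit j
    w_out[j]    : weight from hidden unit j to the (linear) output
    h0          : initial hidden state
    Returns (list of predictions, final hidden state).
    """
    h = list(h0)
    predictions = []
    for x in inputs:
        # New hidden state depends on the current input AND the previous state.
        h = [
            sigmoid(
                sum(wi * xi for wi, xi in zip(W_in[j], x))
                + sum(wr * hk for wr, hk in zip(W_rec[j], h))
            )
            for j in range(len(W_in))
        ]
        predictions.append(sum(wo * hj for wo, hj in zip(w_out, h)))
    return predictions, h
```

With non-zero recurrent weights, the same input can yield different predictions at different time steps, because the hidden state summarizes what came before.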

36 Training Networks on Time Series. How can we train a recurrent net?

37 Artificial Neural Networks: Summary. Actively used to model distributed computation in the brain. Highly non-linear regression/classification. Vector-valued inputs and outputs. Potentially millions of parameters to estimate: overfitting. Hidden layers learn intermediate representations: how many to use? Prediction: forward propagation. Training: gradient descent (back-propagation), local minima problems. Mostly obsolete, kernel tricks are more popular, but coming back in new form as deep belief networks (probabilistic interpretation).

Neural Networks. Aarti Singh & Barnabas Poczos. Machine Learning / Apr 24, Slides Courtesy: Tom Mitchell

Neural Networks. Aarti Singh & Barnabas Poczos. Machine Learning / Apr 24, Slides Courtesy: Tom Mitchell Neura Networks Aarti Singh & Barnabas Poczos Machine Learning 10-701/15-781 Apr 24, 2014 Sides Courtesy: Tom Mitche 1 Logis0c Regression Assumes the foowing func1ona form for P(Y X): Logis1c func1on appied

More information

Logis&c Regression. Aar$ Singh & Barnabas Poczos. Machine Learning / Jan 28, 2014

Logis&c Regression. Aar$ Singh & Barnabas Poczos. Machine Learning / Jan 28, 2014 Logis&c Regression Aar$ Singh & Barnabas Poczos Machine Learning 10-701/15-781 Jan 28, 2014 Linear Regression & Linear Classifica&on Weight Height Linear fit Linear decision boundary 2 Naïve Bayes Recap

More information

Nearest Neighbor Learning

Nearest Neighbor Learning Nearest Neighbor Learning Cassify based on oca simiarity Ranges from simpe nearest neighbor to case-based and anaogica reasoning Use oca information near the current query instance to decide the cassification

More information

Mobile App Recommendation: Maximize the Total App Downloads

Mobile App Recommendation: Maximize the Total App Downloads Mobie App Recommendation: Maximize the Tota App Downoads Zhuohua Chen Schoo of Economics and Management Tsinghua University chenzhh3.12@sem.tsinghua.edu.cn Yinghui (Catherine) Yang Graduate Schoo of Management

More information

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining Data Mining Cassification: Basic Concepts, Decision Trees, and Mode Evauation Lecture Notes for Chapter 4 Part III Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,

More information

Space-Time Trade-offs.

Space-Time Trade-offs. Space-Time Trade-offs. Chethan Kamath 03.07.2017 1 Motivation An important question in the study of computation is how to best use the registers in a CPU. In most cases, the amount of registers avaiabe

More information

Research of Classification based on Deep Neural Network

Research of  Classification based on Deep Neural Network 2018 Internationa Conference on Sensor Network and Computer Engineering (ICSNCE 2018) Research of Emai Cassification based on Deep Neura Network Wang Yawen Schoo of Computer Science and Engineering Xi

More information

Layer-Specific Adaptive Learning Rates for Deep Networks

Layer-Specific Adaptive Learning Rates for Deep Networks Layer-Specific Adaptive Learning Rates for Deep Networks arxiv:1510.04609v1 [cs.cv] 15 Oct 2015 Bharat Singh, Soham De, Yangmuzi Zhang, Thomas Godstein, and Gavin Tayor Department of Computer Science Department

More information

Machine Learning / Jan 27, 2010

Machine Learning / Jan 27, 2010 Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,

More information

Neural Networks and Deep Learning

Neural Networks and Deep Learning Neural Networks and Deep Learning Example Learning Problem Example Learning Problem Celebrity Faces in the Wild Machine Learning Pipeline Raw data Feature extract. Feature computation Inference: prediction,

More information

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Patrice Y. Simard, 1 Yann A. Le Cun, 2 John S. Denker, 2 Bernard Victorri 3 1 Microsoft Research, 1 Microsoft Way, Redmond,

More information

CMPT 882 Week 3 Summary

CMPT 882 Week 3 Summary CMPT 882 Week 3 Summary! Artificial Neural Networks (ANNs) are networks of interconnected simple units that are based on a greatly simplified model of the brain. ANNs are useful learning tools by being

More information

More Relation Model: Functional Dependencies

More Relation Model: Functional Dependencies More Reation Mode: Functiona Dependencies Lecture #7 Autumn, 2001 Fa, 2001, LRX #07 More Reation Mode: Functiona Dependencies HUST,Wuhan,China 152 Functiona Dependencies X -> A = assertion about a reation

More information

Distance Weighted Discrimination and Second Order Cone Programming

Distance Weighted Discrimination and Second Order Cone Programming Distance Weighted Discrimination and Second Order Cone Programming Hanwen Huang, Xiaosun Lu, Yufeng Liu, J. S. Marron, Perry Haaand Apri 3, 2012 1 Introduction This vignette demonstrates the utiity and

More information

Simple Model Selection Cross Validation Regularization Neural Networks

Simple Model Selection Cross Validation Regularization Neural Networks Neural Nets: Many possible refs e.g., Mitchell Chapter 4 Simple Model Selection Cross Validation Regularization Neural Networks Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University February

More information

Fastest-Path Computation

Fastest-Path Computation Fastest-Path Computation DONGHUI ZHANG Coege of Computer & Information Science Northeastern University Synonyms fastest route; driving direction Definition In the United states, ony 9.% of the househods

More information

Linear Regression & Gradient Descent

Linear Regression & Gradient Descent Linear Regression & Gradient Descent These slides were assembled by Byron Boots, with grateful acknowledgement to Eric Eaton and the many others who made their course materials freely available online.

More information

Perceptron as a graph

Perceptron as a graph Neural Networks Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University October 10 th, 2007 2005-2007 Carlos Guestrin 1 Perceptron as a graph 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0-6 -4-2

More information

Gradient Descent. Michail Michailidis & Patrick Maiden

Gradient Descent. Michail Michailidis & Patrick Maiden Gradient Descent Michail Michailidis & Patrick Maiden Outline Mo4va4on Gradient Descent Algorithm Issues & Alterna4ves Stochas4c Gradient Descent Parallel Gradient Descent HOGWILD! Mo4va4on It is good

More information

MACHINE learning techniques can, automatically,

MACHINE learning techniques can, automatically, Proceedings of Internationa Joint Conference on Neura Networks, Daas, Texas, USA, August 4-9, 203 High Leve Data Cassification Based on Network Entropy Fiipe Aves Neto and Liang Zhao Abstract Traditiona

More information

Data Mining. Neural Networks

Data Mining. Neural Networks Data Mining Neural Networks Goals for this Unit Basic understanding of Neural Networks and how they work Ability to use Neural Networks to solve real problems Understand when neural networks may be most

More information

CS839: Probabilistic Graphical Models. Lecture 10: Learning with Partially Observed Data. Theo Rekatsinas

CS839: Probabilistic Graphical Models. Lecture 10: Learning with Partially Observed Data. Theo Rekatsinas CS839: Probabilistic Graphical Models Lecture 10: Learning with Partially Observed Data Theo Rekatsinas 1 Partially Observed GMs Speech recognition 2 Partially Observed GMs Evolution 3 Partially Observed

More information

Agent architectures. Francesco Amigoni

Agent architectures. Francesco Amigoni Francesco Amigoni Designing inteigent agents An agent is defined by its agent function f() that maps a sequence of perceptions to an action p(0), p(1),..., p(t) f() a(t) AGENT perceptions actions ENVIRONMENT

More information

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University Arithmetic Coding Prof. Ja-Ling Wu Department of Computer Science and Information Engineering Nationa Taiwan University F(X) Shannon-Fano-Eias Coding W..o.g. we can take X={,,,m}. Assume p()>0 for a. The

More information

Chapter Multidimensional Direct Search Method

Chapter Multidimensional Direct Search Method Chapter 09.03 Mutidimensiona Direct Search Method After reading this chapter, you shoud be abe to:. Understand the fundamentas of the mutidimensiona direct search methods. Understand how the coordinate

More information

Forgot to compute the new centroids (-1); error in centroid computations (-1); incorrect clustering results (-2 points); more than 2 errors: 0 points.

Forgot to compute the new centroids (-1); error in centroid computations (-1); incorrect clustering results (-2 points); more than 2 errors: 0 points. Probem 1 a. K means is ony capabe of discovering shapes that are convex poygons [1] Cannot discover X shape because X is not convex. [1] DBSCAN can discover X shape. [1] b. K-means is prototype based and

More information

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu Natural Language Processing CS 6320 Lecture 6 Neural Language Models Instructor: Sanda Harabagiu In this lecture We shall cover: Deep Neural Models for Natural Language Processing Introduce Feed Forward

More information

Automatic Grouping for Social Networks CS229 Project Report

Automatic Grouping for Social Networks CS229 Project Report Automatic Grouping for Socia Networks CS229 Project Report Xiaoying Tian Ya Le Yangru Fang Abstract Socia networking sites aow users to manuay categorize their friends, but it is aborious to construct

More information

A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irregular LDPC Codes in the IEEE 802.

A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irregular LDPC Codes in the IEEE 802. A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irreguar LDPC Codes in the IEEE 82.16e Standards Yeong-Luh Ueng and Chung-Chao Cheng Dept. of Eectrica Engineering,

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Clustering and EM Barnabás Póczos & Aarti Singh Contents Clustering K-means Mixture of Gaussians Expectation Maximization Variational Methods 2 Clustering 3 K-

More information

COMPRESSIVE sensing (CS), which aims at recovering

COMPRESSIVE sensing (CS), which aims at recovering D-Net: Deep Learning pproach for Compressive Sensing RI Yan Yang, Jian Sun, Huibin Li, and Zongben u ariv:705.06869v [cs.cv] 9 ay 07 bstract Compressive sensing (CS) is an effective approach for fast agnetic

More information

Sensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space

Sensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space Sensitivity Anaysis of Hopfied Neura Network in Cassifying Natura RGB Coor Space Department of Computer Science University of Sharjah UAE rsammouda@sharjah.ac.ae Abstract: - This paper presents a study

More information

ML4Bio Lecture #1: Introduc3on. February 24 th, 2016 Quaid Morris

ML4Bio Lecture #1: Introduc3on. February 24 th, 2016 Quaid Morris ML4Bio Lecture #1: Introduc3on February 24 th, 216 Quaid Morris Course goals Prac3cal introduc3on to ML Having a basic grounding in the terminology and important concepts in ML; to permit self- study,

More information

An Introduction to Design Patterns

An Introduction to Design Patterns An Introduction to Design Patterns 1 Definitions A pattern is a recurring soution to a standard probem, in a context. Christopher Aexander, a professor of architecture Why woud what a prof of architecture

More information

Multiple Plane Phase Retrieval Based On Inverse Regularized Imaging and Discrete Diffraction Transform

Multiple Plane Phase Retrieval Based On Inverse Regularized Imaging and Discrete Diffraction Transform Mutipe Pane Phase Retrieva Based On Inverse Reguaried Imaging and Discrete Diffraction Transform Artem Migukin, Vadimir Katkovnik, and Jaakko Astoa Department of Signa Processing, Tampere University of

More information

Quaternion Support Vector Classifier

Quaternion Support Vector Classifier Quaternion Support Vector Cassifier G. López-Gonzáez, Nancy Arana-Danie, and Eduardo Bayro-Corrochano CINVESTAV - Unidad Guadaajara, Av. de Bosque 1145, Coonia e Bajo, Zapopan, Jaisco, México {geopez,edb}@gd.cinvestav.mx

More information

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition Pattern Recognition Kjell Elenius Speech, Music and Hearing KTH March 29, 2007 Speech recognition 2007 1 Ch 4. Pattern Recognition 1(3) Bayes Decision Theory Minimum-Error-Rate Decision Rules Discriminant

More information

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS A C Finch K J Mackenzie G J Basdon G Symonds Raca-Redac Ltd Newtown Tewkesbury Gos Engand ABSTRACT The introduction of fine-ine technoogies to printed

More information

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani

Neural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer

More information

Solutions to the Final Exam

Solutions to the Final Exam CS/Math 24: Intro to Discrete Math 5//2 Instructor: Dieter van Mekebeek Soutions to the Fina Exam Probem Let D be the set of a peope. From the definition of R we see that (x, y) R if and ony if x is a

More information

Stages of (Batch) Machine Learning

Stages of (Batch) Machine Learning Evalua&on Stages of (Batch) Machine Learning Given: labeled training data X, Y = {hx i,y i i} n i=1 Assumes each x i D(X ) with y i = f target (x i ) Train the model: model ß classifier.train(x, Y ) x

More information

Machine Learning

Machine Learning Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University October 9, 2012 Today: Graphical models Bayes Nets: Inference Learning Readings: Required: Bishop chapter

More information

Support Vector Machines.

Support Vector Machines. Support Vector Machines srihari@buffalo.edu SVM Discussion Overview 1. Overview of SVMs 2. Margin Geometry 3. SVM Optimization 4. Overlapping Distributions 5. Relationship to Logistic Regression 6. Dealing

More information

Sequential Learning of Layered Models from Video

Sequential Learning of Layered Models from Video Sequentia Learning of Layered Modes from Video Michais K. Titsias and Christopher K. I. Wiiams Schoo of Informatics, University of Edinburgh, Edinburgh EH1 2QL, UK M.Titsias@sms.ed.ac.uk, c.k.i.wiiams@ed.ac.uk

More information

COMP 551 Applied Machine Learning Lecture 14: Neural Networks

COMP 551 Applied Machine Learning Lecture 14: Neural Networks COMP 551 Applied Machine Learning Lecture 14: Neural Networks Instructor: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551 Unless otherwise noted, all material posted for this course

More information

Clustering. Image segmentation, document clustering, protein class discovery, compression

Clustering. Image segmentation, document clustering, protein class discovery, compression Clustering CS 444 Some material on these is slides borrowed from Andrew Moore's machine learning tutorials located at: Clustering The problem of grouping unlabeled data on the basis of similarity. A key

More information

5940 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 11, NOVEMBER 2014

5940 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 11, NOVEMBER 2014 5940 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 11, NOVEMBER 014 Topoogy-Transparent Scheduing in Mobie Ad Hoc Networks With Mutipe Packet Reception Capabiity Yiming Liu, Member, IEEE,

More information

Naïve Bayes, Gaussian Distributions, Practical Applications

Naïve Bayes, Gaussian Distributions, Practical Applications Naïve Bayes, Gaussian Distributions, Practical Applications Required reading: Mitchell draft chapter, sections 1 and 2. (available on class website) Machine Learning 10-601 Tom M. Mitchell Machine Learning

More information

Machine Learning Classifiers and Boosting

Machine Learning Classifiers and Boosting Machine Learning Classifiers and Boosting Reading Ch 18.6-18.12, 20.1-20.3.2 Outline Different types of learning problems Different types of learning algorithms Supervised learning Decision trees Naïve

More information

TerraSwarm. A Machine Learning and Op0miza0on Toolkit for the Swarm. Ilge Akkaya, Shuhei Emoto, Edward A. Lee. University of California, Berkeley

TerraSwarm. A Machine Learning and Op0miza0on Toolkit for the Swarm. Ilge Akkaya, Shuhei Emoto, Edward A. Lee. University of California, Berkeley TerraSwarm A Machine Learning and Op0miza0on Toolkit for the Swarm Ilge Akkaya, Shuhei Emoto, Edward A. Lee University of California, Berkeley TerraSwarm Tools Telecon 17 November 2014 Sponsored by the

More information

Unit V. Neural Fuzzy System

Unit V. Neural Fuzzy System Unit V Neural Fuzzy System 1 Fuzzy Set In the classical set, its characteristic function assigns a value of either 1 or 0 to each individual in the universal set, There by discriminating between members

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! h0p://www.cs.toronto.edu/~rsalakhu/ Lecture 3 Parametric Distribu>ons We want model the probability

More information

Machine Learning in Telecommunications

Machine Learning in Telecommunications Machine Learning in Telecommunications Paulos Charonyktakis & Maria Plakia Department of Computer Science, University of Crete Institute of Computer Science, FORTH Roadmap Motivation Supervised Learning

More information

Backing-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism

Backing-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism Backing-up Fuzzy Contro of a Truck-traier Equipped with a Kingpin Siding Mechanism G. Siamantas and S. Manesis Eectrica & Computer Engineering Dept., University of Patras, Patras, Greece gsiama@upatras.gr;stam.manesis@ece.upatras.gr

More information

GPU Implementation of Parallel SVM as Applied to Intrusion Detection System

GPU Implementation of Parallel SVM as Applied to Intrusion Detection System GPU Impementation of Parae SVM as Appied to Intrusion Detection System Sudarshan Hiray Research Schoar, Department of Computer Engineering, Vishwakarma Institute of Technoogy, Pune, India sdhiray7@gmai.com

More information

Unsupervised Learning: Clustering

Unsupervised Learning: Clustering Unsupervised Learning: Clustering Vibhav Gogate The University of Texas at Dallas Slides adapted from Carlos Guestrin, Dan Klein & Luke Zettlemoyer Machine Learning Supervised Learning Unsupervised Learning

More information

Complex Human Activity Searching in a Video Employing Negative Space Analysis

Complex Human Activity Searching in a Video Employing Negative Space Analysis Compex Human Activity Searching in a Video Empoying Negative Space Anaysis Shah Atiqur Rahman, Siu-Yeung Cho, M.K.H. Leung 3, Schoo of Computer Engineering, Nanyang Technoogica University, Singapore 639798

More information

Binarized support vector machines

Binarized support vector machines Universidad Caros III de Madrid Repositorio instituciona e-archivo Departamento de Estadística http://e-archivo.uc3m.es DES - Working Papers. Statistics and Econometrics. WS 2007-11 Binarized support vector

More information

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code Further Optimization of the Decoding Method for Shortened Binary Cycic Fire Code Ch. Nanda Kishore Heosoft (India) Private Limited 8-2-703, Road No-12 Banjara His, Hyderabad, INDIA Phone: +91-040-3378222

More information

Sequential Learning of Layered Models from Video

Sequential Learning of Layered Models from Video Sequentia Learning of Layered Modes from Video Michais K. Titsias and Christopher K. I. Wiiams Schoo of Informatics, University of Edinburgh, Edinburgh EH1 2QL, UK M.Titsias@sms.ed.ac.uk, c.k.i.wiiams@ed.ac.uk

More information

Lecture 20: Neural Networks for NLP. Zubin Pahuja

Lecture 20: Neural Networks for NLP. Zubin Pahuja Lecture 20: Neural Networks for NLP Zubin Pahuja zpahuja2@illinois.edu courses.engr.illinois.edu/cs447 CS447: Natural Language Processing 1 Today s Lecture Feed-forward neural networks as classifiers simple

More information

WHILE estimating the depth of a scene from a single image

WHILE estimating the depth of a scene from a single image JOURNAL OF L A T E X CLASS FILES, VOL. 4, NO. 8, AUGUST 05 Monocuar Depth Estimation using Muti-Scae Continuous CRFs as Sequentia Deep Networks Dan Xu, Student Member, IEEE, Eisa Ricci, Member, IEEE, Wani

More information

A probabilistic fuzzy method for emitter identification based on genetic algorithm

A probabilistic fuzzy method for emitter identification based on genetic algorithm A probabitic fuzzy method for emitter identification based on genetic agorithm Xia Chen, Weidong Hu, Hongwen Yang, Min Tang ATR Key Lab, Coege of Eectronic Science and Engineering Nationa University of

More information

Logistic Regression. Abstract

Logistic Regression. Abstract Logistic Regression Tsung-Yi Lin, Chen-Yu Lee Department of Electrical and Computer Engineering University of California, San Diego {tsl008, chl60}@ucsd.edu January 4, 013 Abstract Logistic regression

More information

Neural Network Enhancement of the Los Alamos Force Deployment Estimator

Neural Network Enhancement of the Los Alamos Force Deployment Estimator Missouri University of Science and Technoogy Schoars' Mine Eectrica and Computer Engineering Facuty Research & Creative Works Eectrica and Computer Engineering 1-1-1994 Neura Network Enhancement of the

More information

Pattern Classification Algorithms for Face Recognition

Pattern Classification Algorithms for Face Recognition Chapter 7 Pattern Classification Algorithms for Face Recognition 7.1 Introduction The best pattern recognizers in most instances are human beings. Yet we do not completely understand how the brain recognize

More information

SEMANTIC COMPUTING. Lecture 8: Introduction to Deep Learning. TU Dresden, 7 December Dagmar Gromann International Center For Computational Logic

SEMANTIC COMPUTING. Lecture 8: Introduction to Deep Learning. TU Dresden, 7 December Dagmar Gromann International Center For Computational Logic SEMANTIC COMPUTING Lecture 8: Introduction to Deep Learning Dagmar Gromann International Center For Computational Logic TU Dresden, 7 December 2018 Overview Introduction Deep Learning General Neural Networks

More information

Opening the Black Box Data Driven Visualizaion of Neural N
