Some fast and compact neural network solutions for artificial intelligence applications


1 Some fast and compact neural network solutions for artificial intelligence applications. Radu Dogaru, University Politehnica of Bucharest, ETTI, Dept. of Applied Electronics and Information Engineering, Natural Computing Laboratory, Bucharest, Romania

2 Artificial Intelligence Today. Architectures: Deep Learning (multiple-layer perceptrons) achieves very good accuracies and is useful in many big-data problems (image recognition, speech, etc.). Shallow networks are faster learners but less accurate; they can be used as sub-modules in deep classifiers. Challenges: hardware-oriented applications (e.g. intelligent sensors) need compact, low-complexity, yet accurate solutions.

3 LRF-based Deep Classifier (a faster approach to deep learning). In the convolutional layer we may use simplicial cells instead (morphological processors: nonlinear, low complexity, well suited to dedicated hardware). Shallow classifier (e.g. ELM or SFSVC, with fast, data-driven training). Nonlinear preprocessing unit adapted to the specific problem (includes convolutional and pooling layers).

4 A unique architecture (kernel network) used for all our compact solutions. All learning is done in a linear adaptive layer (ADALINE), giving fast and convergent learning (e.g. LMS, linear SVM, or Moore-Penrose pseudo-inverse, as in the Extreme Learning Machine). Functional capability is achieved by a proper nonlinear expansion (kernel modules), which can be optimized for specialized HW/SW implementations. No tuning of the hidden layer (only 1-2 generic parameters).
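
To make "all learning in the linear layer" concrete, here is a minimal NumPy sketch (an illustration under my own assumptions, not the author's code) of LMS training of the Adaline readout; the matrix `O` is assumed to come from some fixed nonlinear expansion of the inputs:

```python
import numpy as np

def lms_train(O, d, mu=0.01, epochs=20):
    """LMS (Widrow-Hoff) training of the Adaline readout.
    O: (N, m) nonlinearly expanded inputs; d: (N,) desired outputs."""
    w = np.zeros(O.shape[1])
    for _ in range(epochs):
        for o_k, d_k in zip(O, d):
            e = d_k - o_k @ w          # instantaneous output error
            w += mu * e * o_k          # Widrow-Hoff gradient step
    return w
```

The Moore-Penrose alternative mentioned above is a one-liner on the same data: `w = np.linalg.pinv(O) @ d`.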

5 Kernel networks are Adalines operating in a nonlinearly-expanded input space. The input $\mathbf{x} \in \mathbb{R}^n$ is fed to a nonlinear expander, which must conserve the input information while expanding it into a higher-dimensional space $\mathbf{o} \in \mathbb{R}^m$, $m > n$; an Adaline with weights $w_1, \ldots, w_m$ then produces the output $y$, trained against the desired output $d$. A theorem of Cover (1965): if $m \gg n$, a problem that is not linearly separable in the input space $\mathbf{x}$ may become linearly separable (thus learnable by an Adaline) in the expanded space $\mathbf{o}$. It only requires a proper choice of the kernel functions $\varphi_1(\mathbf{x}), \ldots, \varphi_j(\mathbf{x}), \ldots, \varphi_m(\mathbf{x})$.
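
A tiny illustration of Cover's theorem (my own example, not from the slides): XOR is not linearly separable in its 2-D input space, but after a simple nonlinear expansion a pseudo-inverse-trained Adaline solves it exactly:

```python
import numpy as np

# XOR targets: not linearly separable in the 2-D input space x.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([-1.0, 1.0, 1.0, -1.0])

# Nonlinear expansion o = [1, x1, x2, x1*x2]: m = 4 > n = 2.
O = np.column_stack([np.ones(4), X, X[:, 0] * X[:, 1]])

# Adaline trained in the expanded space by the pseudo-inverse.
w = np.linalg.pinv(O) @ d
print(np.sign(O @ w))        # [-1.  1.  1. -1.] -- matches d
```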

6 27 years old

7 (figure-only slide)

8 The Cover theorem as a basis for building fast learners. Why the need for speed? Because the (deep) learner has many additional parameters (e.g. hidden nodes, pooling size, etc.), one must try various possibilities, training again each time. Compact and technology-adapted: use specific kernels, designed so that no backprop tuning is necessary (the main trick to make it fast). The idea already exists and is implemented in widely known architectures (though not emphasized as such): SVM (support vector machine) - the gamma parameter; ELM (extreme learning machine) - the number of hidden units; Simplicial - the expanding factor of the input space; (S)FSVC - the radius of the RBF units centered on a selection from the training set.

9 Simplicial neural nets: what are they? What are they good for? A bit of history first.

10 A bit of history: CNNs and the need for universal cells. The convolutional layer is linear. In 1998 Chua asked: can you design a universal CNN cell, capable of representing arbitrary nonlinear local functions?

11 With (linear) convolution

12 One of the first CNN chips with simplicial cells, with nonlinear operators replacing convolution.

13 Steps towards universal CNN cells. 1999: the universal CNN cell (multi-nested PWL cell): universal, very compact, adaptive, but only for Boolean local functions (black/white images). We needed to generalize to gray-scale / color processing. Pedro Julián had previous work on simplicial decomposition; in 2000 we started to discuss the possibility of implementing this theory as a hardware (analogic) cell. Results: [1][2].

14 Simplicial cell theory. The trainable function $f$ and its parameters $c_j$ (the weights of an Adaline): $f(\mathbf{x}) = \sum_{j} c_j \varphi_j(\mathbf{x})$, where the $\varphi_j$ are the kernels. A particular problem is learned (via simple LMS) in the set of $2^n$ coefficients $c_j$. With a nested nonlinear expansion at the input, the above formula is a universal approximator. The simplex is selected by the input vector: the input vector can be decomposed as a linear (weighted) combination of the vertices (only $n+1$ vectors) of the simplex.
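
As a rough software sketch of the evaluation (my reconstruction of standard simplicial interpolation on the unit hypercube, not the chip's exact algorithm): sorting the input coordinates selects the simplex, and the resulting $n+1$ barycentric weights multiply the corresponding coefficients $c_j$:

```python
import numpy as np

def simplicial_eval(x, c):
    """f(x) = sum_k lambda_k * c[vertex_k] for x in [0,1]^n.
    c holds 2^n coefficients, one per hypercube vertex; bit i of the
    index is coordinate i of the vertex. Only n+1 of the 2^n
    coefficients are touched per evaluation."""
    n = len(x)
    order = np.argsort(-np.asarray(x))    # coordinates, decreasing
    xs = np.asarray(x)[order]
    lam = np.empty(n + 1)                 # barycentric weights
    lam[0] = 1.0 - xs[0]
    lam[1:n] = xs[:-1] - xs[1:]
    lam[n] = xs[-1]
    idx, y = 0, lam[0] * c[0]             # start at the zero vertex
    for k in range(n):
        idx |= 1 << order[k]              # next vertex of the simplex
        y += lam[k + 1] * c[idx]
    return y
```

For $n = 1$ this reduces to ordinary linear interpolation, $f(x) = (1-x)\,c_0 + x\,c_1$.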

15 It is a kernel network whose kernels can be computed very conveniently in a mixed-signal circuit. The circuit implementation of the existing algorithm was my idea together with P. Julián; we then investigated the performance of the S-cell for use either as a new kind of neural network or as a universal gray-level CNN cell.

16 Program (image processing demo)

17 (figure-only slide)

18 Can it be implemented in a digital system? Actually this is the case with all reported chips (simplicity comes at the price of sequential processing). With 6 bits/pixel, 2^6 = 64 processing cycles are needed; often 6 bits is enough (at 1 ns/cycle this gives 64 ns per frame, enough for most applications). From [6].

19 Learning image processing (feature extractors) with simplicial cells (demo program).

20 Applications (median and order-statistic filtering): a regression problem. For comparison, the same function requires 85 combinational logic blocks in a digital FPGA implementation.

21 Edge-detection training samples (shown at left). The binary gene can be realized with a simple multi-nested cell instead of the RAM (shown at right).

22 Can it learn other problems? Yes. Learning a sinusoidal (sin) function (a 1-input, 1-output regression problem), shown for m=3 and m=5. Nonlinear preprocessing of the input space is done using the multi-nested recursion on $u_k$, $k = 2, \ldots, m$ (each $u_k$ computed from $u_{k-1}$).

23 Simplicial cells are COMPACT, well suited to multiple-core implementations (visual microprocessors). Two memory organizations: distributed synaptic memories, or all synapses in a single memory module.

24 2018: the latest CNN chips with simplicial cells (only 0.81 TOPS/W in 2014); binary information processing is used to reduce power consumption.

25 Simplicial cell essentials. An apparently sophisticated theory (simplicial decomposition) fits quite well into a compact circuit. A kernel network can be implemented to approximate any arbitrary local function (mostly useful in image processing). This solution was already adopted in optical sensors with integrated processing capabilities (a first step towards an intelligent, fully integrated optical sensor). Best suited to implementing image feature detectors in a fully parallel mode (local, i.e. similar to convolutional layers).

26 Simplicial cells do local image processing; we still need a classifier in the output stage, with large input vectors (e.g. MNIST: 28x28 pixels, n=784). Usually SVM (support vector machine) or ELM (extreme learning machine) is used, but they are not fast enough. For this and other reasons we propose: the Fast Support Vector Classifier (FSVC), introduced in 1996 as RBF-M (a HW-oriented model) [9], and SFSVC (Super FSVC), with no Adaline training at all (2016) [10].

27 (Same kernel-network diagram as on slide 5: input $\mathbf{x} \in \mathbb{R}^n$, nonlinear expander to $\mathbf{o} \in \mathbb{R}^m$ with $m > n$, Adaline output $y$ trained against $d$.) ELM (Extreme Learning Machine) is widely popularized as the fastest, but training the Adaline layer with the Moore-Penrose algorithm is quadratic in $m$, so training times become excessive for large numbers of neurons. SVM shows similar behavior of the training algorithm, and in addition no hardware-oriented kernels are possible (usually Gaussian).
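
For contrast, a minimal ELM sketch (a common textbook formulation, given here under my own naming): the hidden layer is random and fixed, so the entire training cost sits in the pseudo-inverse of the hidden-activation matrix, which grows quickly with the number of hidden units $m$:

```python
import numpy as np

def elm_train(X, D, m, seed=0):
    """ELM: random fixed hidden layer + pseudo-inverse linear readout.
    X: (N, n) inputs; D: (N, c) targets; m: number of hidden neurons."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(X.shape[1], m))   # random input-to-hidden weights
    b = rng.normal(size=m)                 # random biases
    H = np.tanh(X @ A + b)                 # (N, m) hidden activations
    W = np.linalg.pinv(H) @ D              # the expensive step for large m
    return A, b, W
```

Because of the random draw, several seeds are usually tried to reach the best accuracy, which is exactly the hidden cost noted on slide 33.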

28 Fast support vector classifiers

29 Challenges: Big Data, i.e. collections of annotated samples (e.g. hyper-spectral imagery, handwritten text, images of different objects), are becoming widely available. There is a need for classifiers capable of learning rapidly (and generalizing well) from such large amounts of data. Somewhat conflicting issues: speed (training time); classification performance; low complexity, for convenient integration into high-performance computing platforms (GPU, FPGA, specialized hardware embedded into sensors).

30 Actual solutions. DEEP LEARNING (multiple-layer neural architectures): very good accuracies, but relatively slow training and costly computational platforms; yet it is a solution widely adopted now by academia and industry alike (several chips implementing deep-learning solutions are available on the market). SHALLOW architectures (only a single nonlinear hidden layer): offer fast training (when using techniques other than back-propagation), but on image classification problems accuracies are generally lower than in deep learning. Still, using adequate additional layers (e.g. based on local receptive fields), the global accuracy can be improved while maintaining the speed advantage. Our approach, SFSVC, is a shallow architecture providing very fast training and accuracy comparable to any of the typical shallow architectures. Why is it fast? No output-layer tuning (can that work? Yes), and fast selection of support vectors using a supervised, novelty-based selection algorithm (no parameter adjusting is involved).

31 Shallow (kernel-based) neural networks. The general defining formula for a kernel neuron is $y = \sum_{k=0}^{m} w_k o_k$ (an Adaline output layer). It can be trained using LMS or the pseudo-inverse method; in our approach the weights $w_k$ are directly assigned values of +1 or 0 (no tuning). Here $o_k = \varphi_k(\mathbf{x})$ is a properly chosen nonlinear (basis) function, also called a kernel; it may also be called a hidden neuron, and there are many choices.

32 Kernel networks are Adalines operating in a nonlinearly-expanded input space (the same diagram and Cover-theorem statement as on slide 5): the input $\mathbf{x} \in \mathbb{R}^n$ is expanded into $\mathbf{o} \in \mathbb{R}^m$, $m \gg n$, where a problem that is not linearly separable may become linearly separable (thus learnable by an Adaline), given a proper choice of the kernel functions $\varphi_1(\mathbf{x}), \ldots, \varphi_m(\mathbf{x})$.

33 Shallow architectures. SVM selects support vectors for kernels from the training samples using an optimization algorithm minimizing the risk; training is relatively slow (although the widely used LIBSVM implementation is relatively well optimized). ELM (extreme learning machine), considered by its authors the fastest neural paradigm, uses the trick of randomly generating the weights (parameters) of the hidden layer, which leaves most of the training time to the pseudo-inverse training of the output linear layer. Note that for big data the pseudo-inverse requires large training times, proportional to $M^2$, where $M$ is the number of hidden nodes (also large when databases are large). Another problem with ELM: because of the randomness, several trials (often ignored by authors when reporting training speed) are necessary to reach maximum performance with the same architecture. NoProp, proposed recently by Widrow, is essentially an ELM where output-layer training is done faster via LMS instead of the pseudo-inverse. In our approach the hidden layer has no random elements (only support vectors selected, not computed, from the training dataset); when tuning is used, it uses LMS.

34 Datasets and classifier. A training set TR is used for learning; in addition, a test set TS is considered to evaluate the generalization performance. The SFSVC classifier takes the input feature vector $\mathbf{x}$, the list TIX, the radius, and the type of basis function, and outputs the predicted class (1,..,M). TIX is a list of integers (1,..,N): the indexes $k$ of the selected vectors in TR; it is the result of the training phase. Each class is assigned an output Adaline (the one with the highest activation indicates the predicted class).

35 (S)FSVC architecture and equations (RBF-M [9]). Simple LMS training (only for SFSVC-T), or weights directly assigned the $d_k$ values (1 or 0) in SFSVC. The centroids $\mathbf{c}_j$ are the support vectors selected from TR (according to TIX). The RBF formulation allows a wider variety of kernels, which need not satisfy Mercer's condition. Center selection is unsupervised in FSVC and supervised in SFSVC (which accelerates the algorithm and improves performance); for each class, a search for centers (support vectors) is done.
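
A hedged sketch of the resulting decision stage (my reading of the slides; the Manhattan/triangular choices and all names are assumptions): with the output weights fixed at 1 or 0, each class's Adaline simply sums the activations of that class's own centroids:

```python
import numpy as np

def sfsvc_predict(X, C, cls, r):
    """X: (N, n) inputs; C: (K, n) centroids selected from the training
    set; cls: (K,) class label of each centroid; r: common RBF radius."""
    d = np.abs(X[:, None, :] - C[None, :, :]).sum(-1)  # Manhattan distances
    act = np.maximum(0.0, 1.0 - d / r)                 # triangular RBF
    M = int(cls.max()) + 1
    scores = np.stack([act[:, cls == m].sum(1) for m in range(M)], axis=1)
    return scores.argmax(1)                # highest class Adaline wins
```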

36 Various distances and RBF functions. Manhattan distance: $d_m(\mathbf{x}, \mathbf{c}_j) = \sum_{i=1}^{n} |x_i - c_{ji}|$ (best for hardware-oriented applications). Euclidean distance: $d_e(\mathbf{x}, \mathbf{c}_j) = \sqrt{\sum_{i=1}^{n} (x_i - c_{ji})^2}$. Triangular kernel: $\varphi(d, r) = 1 - d/r$ if $d \le r$, else $0$. Gaussian kernel: $\varphi_g(d, r) = \exp\!\left(-\frac{d^2}{2 r^2}\right)$. Such choices do not significantly influence classification performance.
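
Written out as a short sketch (function names are mine), each choice is a couple of NumPy operations; the Manhattan/triangular pair avoids squares and exponentials, which is what makes it hardware-friendly:

```python
import numpy as np

def manhattan(x, c):   return np.abs(x - c).sum(-1)          # adds only
def euclidean(x, c):   return np.sqrt(((x - c) ** 2).sum(-1))

def triangular(d, r):  return np.maximum(0.0, 1.0 - d / r)   # PWL kernel
def gaussian(d, r):    return np.exp(-d ** 2 / (2 * r ** 2))
```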

37 Without tuning in the Adaline layer, the only limiting factor for training speed is the novelty-based support-vector finding algorithm (almost linear in the number of units).

38 Time (support-vector search) for the MNIST problem (60000 samples) and its dependence on the number of RBF units. To get the best performance, only the radius needs to be varied (in SVM one has gamma and C; in ELM, the number of hidden neurons).

39 For large datasets (like MNIST), a huge number of neurons gives the best accuracy. With proper additional layers (an LRF deep structure, for example) it can be dramatically improved (over 99%).

40 But in SVM and ELM, the dependence of training time on the number of hidden units (on the same PC) is much worse: ELM training runs to 900 seconds, and too many ELM neurons give an error; SVM, for any gamma and C, lasts hundreds of seconds (difficult to tune the model!).

41 The support-vector selection algorithm. If a new input sample has little overlap with the existing coverage, it becomes centroid number m+1; if it has much overlap with the existing coverage, it is not added. The overlap threshold is a measure of the degree of overlap between two RBF units: a small degree (e.g. 0.1) implies that an input vector placed in the middle, between the two RBF centers, will generate an overall RBF output close to 0, making it very difficult to discriminate such vectors. If the overlap becomes large (e.g. 10), the two RBF units are redundant.

42 Training algorithm = determining TIX. The steps are: compute the hidden-layer activity, then decide whether to select a new support vector, as sketched below. The overlap threshold matters at training (in SFSVC it can be as small as 1/128; for the tunable version SFSVC-T it is usually taken as 1). FSVC uses unsupervised selection (i.e. the class assignment of the input pattern does not matter); the supervised approach gives faster speed and better performance.
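
A hedged sketch of this one-pass selection (my reading of the slides; in particular the `overlap` measure below is an assumption, not the paper's exact formula):

```python
import numpy as np

def overlap(x, c, r):
    # Assumed overlap measure: radius relative to half the distance
    # between two centers; large when the two RBF units are redundant.
    return 2.0 * r / max(np.abs(x - c).sum(), 1e-12)

def select_centroids(X, y, r, ov):
    """Supervised novelty-based selection: sample i becomes a new
    centroid of its class only if its overlap with every previously
    selected centroid of that class stays below the threshold ov."""
    tix = []                               # the resulting TIX list
    for i in range(len(X)):
        same = (overlap(X[i], X[j], r) for j in tix if y[j] == y[i])
        if all(o < ov for o in same):
            tix.append(i)                  # novel enough: keep it
    return tix
```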

43 Results. Timings: $t_1$ = generating TIX, $t_2$ = computing the hidden and output layers, $t_3$ = Adaline training. Tuning adds significant time to compute the hidden/output layers and adjust the Adaline weights; the pseudo-inverse learning of ELM adds further training time. (The benchmark is a reduced-size MNIST.) From [10].

44 Comparison / speed-ups. Generally the accuracies are quite similar.

45 Newer results, to be published soon (MNIST). In [13] an investigation was done on solving the MNIST problem with SVM. Just as in SFSVC, where $r$ and the overlap threshold should be optimized for the best performance, in SVM one needs to optimize the regularization parameter C and the gamma parameter (related to the radius $r$). Using Python and likely a CPU similar to ours, they report training 48 different SVM models to reach the best accuracy of 98.5% (around 4608 seconds per model). On the other hand, 22 SFSVC models were tried (22 different radius values) in a total time of only 1504 seconds (around 68 seconds per model), achieving 97.78% accuracy. This is an important speed-up of about 67 times on a similar computational platform (Python with SCIKIT-LEARN), indicating the validity and efficiency of the SFSVC model.

46 SFSVC work in progress: we still want to improve accuracy without sacrificing too much training speed; other issues are also on the agenda. Encouraging preliminary results: Adaline tuning, but on a reduced subset of the training data. SFSVC is faster than the usual shallow classifiers given a proper implementation platform (we found Python with NUMPY & SCIPY very convenient); when hardware-oriented RBFs are used it can also be conveniently implemented on other platforms: FPGA, GPU, etc. (Gaussian kernels replaced with PWL ones).

47 Thank You! More questions?
