Artificial Neural Networks. Introduction to Computational Neuroscience Ardi Tampuu

Artificial Neural Networks. Introduction to Computational Neuroscience. Ardi Tampuu, 7.10.2016

Artificial neural network NB! Inspired by biology, not based on biology!

Applications Automatic speech recognition Automatic image classification and tagging Natural language modeling

Learning objectives How do artificial neural networks work? What types of artificial neural networks are used for what tasks? What are the state-of-the-art results achieved with artificial neural networks?

Part 1: How DO neural networks work?

Frank Rosenblatt (1957) added a learning rule to the McCulloch-Pitts neuron.

Perceptron Prediction: y = 1 if x1·w1 + x2·w2 + b > 0, y = 0 otherwise. (Diagram: inputs x1, x2 are multiplied by weights w1, w2, summed together with the bias b, and thresholded to give y.)

Perceptron Prediction: y = 1 if x1·w1 + x2·w2 + b > 0, y = 0 otherwise. Learning: wi ← wi + (t − y)·xi, b ← b + (t − y). If prediction == target, do nothing. If prediction < target, increase the weights of positive inputs and decrease the weights of negative inputs. If prediction > target, vice versa.

Let's try it out! Learn X OR Y with a perceptron. Truth table: (X=0, Y=0) → 0, (0, 1) → 1, (1, 0) → 1, (1, 1) → 1. Weights: A for X, B for Y, bias C. Initialize A, B, C = 0, so the output is 0. Go over the examples in the table (learning rule: wi ← wi + (t − y)·xi, b ← b + (t − y)): 1. t = y, so no changes. 2. y = 0, t = 1 → A = 0, B = 1, C = 1. 3. y = t = 1. 4. y = t = 1. 5. y = 1, t = 0 → A = 0, B = 1, C = 0. 6. y = t = 1. 7. y = 0, t = 1 → A = 1, B = 1, C = 1. After a few more passes the weights converge to A = 1, B = 1, C = 0, which classifies OR correctly.
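A minimal sketch of this walk-through in plain Python (variable names A, B, C as on the slide; the prediction and update rules are the ones given above):

```python
# OR truth table: inputs (X, Y) and target t
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

A, B, C = 0, 0, 0  # weight for X, weight for Y, bias

for epoch in range(10):
    changed = False
    for (x, y_in), t in data:
        y = 1 if A * x + B * y_in + C > 0 else 0   # prediction
        if y != t:
            A += (t - y) * x        # w_i <- w_i + (t - y) * x_i
            B += (t - y) * y_in
            C += (t - y)            # b <- b + (t - y)
            changed = True
    if not changed:                 # converged: every example classified correctly
        break

print(A, B, C)
```

Run as-is, the loop converges after a few passes to A = 1, B = 1, C = 0, which separates OR correctly.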

Perceptron limitations The perceptron learning algorithm converges only for linearly separable problems (because the perceptron has only one layer). Minsky, Papert, Perceptrons (1969)

Multi-layer perceptrons Add non-linear activation functions. Add hidden layer(s). Universal approximation theorem (important!): any continuous function (on a bounded domain) can be approximated arbitrarily well by a finite feed-forward neural network with one hidden layer.

Forward propagation y1 = σ(b1 + x1·w11 + x2·w21), y2 = σ(b2 + x1·w12 + x2·w22), z = b3 + y1·v1 + y2·v2 (no nonlinearity on the output). (Diagram: inputs x1, x2 feed two hidden units y1, y2, which feed a single linear output unit z; each unit sums its weighted inputs plus a bias.)

Loss function Function approximation (regression): L = ½·(t − z)². Binary classification (0 < z < 1): L = −log(z) if t = 1, L = −log(1 − z) if t = 0. Multi-class classification: L = −Σj tj·log(zj).
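As a sketch, these three losses written out in NumPy (z is the network output and t the target; for the multi-class case t is a one-hot vector and z a vector of predicted probabilities):

```python
import numpy as np

def squared_error(t, z):
    """Function approximation (regression): L = 1/2 * (t - z)^2"""
    return 0.5 * (t - z) ** 2

def binary_cross_entropy(t, z):
    """Binary classification, 0 < z < 1: L = -log(z) if t = 1, -log(1 - z) if t = 0"""
    return -(t * np.log(z) + (1 - t) * np.log(1 - z))

def cross_entropy(t, z):
    """Multi-class classification: L = -sum_j t_j * log(z_j)"""
    return -np.sum(t * np.log(z))

print(squared_error(1.0, 0.8))        # 0.02
print(binary_cross_entropy(1, 0.8))   # ~0.223
print(cross_entropy(np.array([0, 1, 0]), np.array([0.1, 0.7, 0.2])))  # ~0.357
```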

Backpropagation ez = dL/dz = z − t. Output layer gradients: Δv1 = ez·y1, Δv2 = ez·y2, Δb3 = ez. Hidden layer errors: ey1 = ez·v1·σ'(b1 + x1·w11 + x2·w21), ey2 = ez·v2·σ'(b2 + x1·w12 + x2·w22). Hidden layer gradients: Δw11 = ey1·x1, Δw21 = ey1·x2, Δb1 = ey1, Δw12 = ey2·x1, Δw22 = ey2·x2, Δb2 = ey2. Derivative of the sigmoid: σ'(a) = σ(a)·(1 − σ(a)).
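A NumPy sketch of one forward and backward pass through the 2-2-1 network above, followed by a gradient-descent weight update (the variable names mirror the reconstructed notation on these slides; the inputs, target and learning rate are arbitrary illustration values):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

x1, x2, t = 0.5, -1.0, 1.0            # inputs and target

rng = np.random.default_rng(0)        # small random initial weights and biases
w11, w21, b1 = rng.normal(size=3) * 0.1
w12, w22, b2 = rng.normal(size=3) * 0.1
v1, v2, b3 = rng.normal(size=3) * 0.1

# forward propagation
y1 = sigmoid(b1 + x1 * w11 + x2 * w21)
y2 = sigmoid(b2 + x1 * w12 + x2 * w22)
z = b3 + y1 * v1 + y2 * v2            # linear output, no nonlinearity
loss = 0.5 * (t - z) ** 2

# backpropagation
ez = z - t                            # dL/dz
ey1 = ez * v1 * y1 * (1 - y1)         # sigma'(a) = sigma(a) * (1 - sigma(a))
ey2 = ez * v2 * y2 * (1 - y2)

# gradient descent step
lr = 0.1
v1 -= lr * ez * y1;   v2 -= lr * ez * y2;   b3 -= lr * ez
w11 -= lr * ey1 * x1; w21 -= lr * ey1 * x2; b1 -= lr * ey1
w12 -= lr * ey2 * x1; w22 -= lr * ey2 * x2; b2 -= lr * ey2
```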

Gradient Descent Gradient descent finds weight values that result in a small loss. Gradient descent is guaranteed to find only a local minimum. But there are plenty of them and they are often good enough!

Walking around in the energy (loss) landscape using only local gradient information.
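A toy illustration of the idea: repeatedly stepping against the local gradient of a one-dimensional loss (here f(w) = (w − 3)², which is convex, so the single minimum is found; real loss landscapes have many local minima):

```python
def loss(w):
    return (w - 3.0) ** 2

def grad(w):               # derivative of (w - 3)^2
    return 2.0 * (w - 3.0)

w = -5.0                   # arbitrary starting point
lr = 0.1                   # learning rate (step size)
for step in range(100):
    w -= lr * grad(w)      # move against the local gradient

print(w, loss(w))          # w is close to 3, loss close to 0
```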

Things to remember... The perceptron was the first artificial neuron model, invented in the late 1950s. A perceptron can learn only linearly separable classification problems. Feed-forward networks with non-linear activation functions and hidden layers can overcome the limitations of perceptrons. Multi-layer artificial neural networks are trained using backpropagation and gradient descent.

Part 2: Neural network taxonomy

Simple feed-forward networks Architecture: each node is connected to all nodes of the previous layer; information moves in one direction only. Used for: function approximation, simple classification problems, problems with not too many inputs (~100). (Diagram: input layer → hidden layer → output layer.)

Convolutional neural networks

Hubel & Wiesel (1959) Performed experiments with an anesthetized cat. Discovered topographical mapping, sensitivity to orientation and hierarchical processing. Simple cells → convolution, complex cells → pooling.

Convolution in neural nets Recommending music on Spotify

Convolutional neural networks Architecture: convolutional layer (local connections + weight sharing), pooling layer (translation invariance). Used for: images, and any other data with a locality property, e.g. adjacent characters make up a word. (Diagram: input layer → convolutional layer with shared weights → max-pooling layer.)

Convolution Convolution searches for the same pattern over the entire image and calculates a score for each match. (Diagram: a small filter of 0s and 1s slid across an example image.)

Convolution Now try this filter... and this one. Convolution searches for the same pattern over the entire image and calculates a score for each match. (Diagram: two further example filters, with −1 and 1 entries, applied to the same image.)

What do these filters do? For example: 0 1 0 / 1 −4 1 / 0 1 0.


Pooling Pooling achieves translation invariance by taking the maximum of adjacent convolution scores.
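A minimal NumPy sketch of these two operations: sliding a small filter over an image to get a score map (strictly a cross-correlation, as in most deep-learning libraries), then max-pooling the scores in 2×2 blocks. The filter is the one shown a few slides earlier:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image and compute a match score at every position."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(scores, size=2):
    """Take the maximum over non-overlapping size x size blocks (translation invariance)."""
    h, w = scores.shape[0] // size, scores.shape[1] // size
    return scores[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(8, 8)
kernel = np.array([[0, 1, 0],
                   [1, -4, 1],
                   [0, 1, 0]])         # the filter from the previous slides
scores = convolve2d(image, kernel)     # 6 x 6 score map
pooled = max_pool(scores)              # 3 x 3 after 2 x 2 max-pooling
```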

Example: handwritten digit recognition. Y. LeCun et al., Handwritten digit recognition: Applications of neural net chips and automatic learning, 1989.

Recurrent neural networks Architecture: hidden-layer nodes are connected to each other, which allows the network to retain an internal state (memory). Used for: speech recognition, handwriting recognition, any time series (brain activity, DNA reads). (Diagram: input layer → recurrent hidden layer → output layer.)

Backpropagation through time (Diagram: the recurrent network unrolled over time. Inputs I1...I4 feed hidden states H1...H4, starting from H0; the hidden states produce outputs O1...O4 compared against targets T1...T4, and the same weights W are shared across all time steps.)
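A NumPy sketch of this unrolled computation: the same weight matrices are used at every time step, and the hidden state carries information forward (the names W_xh, W_hh, W_hy and the tanh nonlinearity are illustration assumptions, not taken from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 3, 5, 2, 4   # input size, hidden size, output size, time steps

W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # hidden -> hidden (recurrent)
W_hy = rng.normal(scale=0.1, size=(n_out, n_hidden))     # hidden -> output

inputs = [rng.normal(size=n_in) for _ in range(T)]       # I1 ... I4
h = np.zeros(n_hidden)                                    # H0

outputs = []
for x in inputs:                       # unrolled over time, same weights at every step
    h = np.tanh(W_xh @ x + W_hh @ h)   # H_t depends on the input and the previous state
    outputs.append(W_hy @ h)           # O_t
```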

Auto-encoders Architecture: input and output are the same! The hidden layer functions as a bottleneck; the network is trained to reconstruct the input from the hidden-layer activations. Used for: image search, dimensionality reduction. (Diagram: input layer → hidden bottleneck layer → output layer = input layer.)
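A NumPy sketch of the idea: an encoder compresses the input into a small bottleneck code, a decoder reconstructs it, and training would minimize the reconstruction error (the weights here are random and untrained, just to show the shapes; the 784 → 32 sizes are an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_code = 784, 32                      # e.g. a 28x28 image down to a 32-dim code

W_enc = rng.normal(scale=0.01, size=(n_code, n_in))   # encoder weights
W_dec = rng.normal(scale=0.01, size=(n_in, n_code))   # decoder weights

x = rng.random(n_in)                        # an input vector
code = np.tanh(W_enc @ x)                   # bottleneck activations (the compressed code)
x_hat = W_dec @ code                        # reconstruction of the input
reconstruction_loss = np.mean((x - x_hat) ** 2)   # training minimizes this
```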

We didn't talk about... Restricted Boltzmann Machines (RBMs), Long Short-Term Memory networks (LSTMs), Echo State Networks / Liquid State Machines, Hopfield networks, self-organizing maps (SOMs), radial basis function networks (RBFs). But we covered the most important ones!

Things to remember... Simple feed-forward networks are usually used for function approximation, e.g. predicting energy consumption. Convolutional neural networks are mostly used for images. Recurrent neural networks are used for speech recognition and language modeling. Autoencoders are used for dimensionality reduction.

Part 3: State-of-the-art results

Deep Learning Artificial neural networks and backpropagation have been around since the 1980s. What's all this fuss about deep learning? What has changed: we have much bigger datasets, we have much faster computers (think GPUs), and we have learned a few tricks for training networks with very, very many (150) layers.

GoogLeNet ImageNet 2014 winner. 27 layers, 5M weights. Szegedy et al., Going Deeper with Convolutions (2014).

ImageNet classification current best: 4.9% top-5 error (human error 5.1%). Try it yourself: http://www.clarifai.com/#demo Wu et al., Deep Image: Scaling up Image Recognition (2015). Ioffe, Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015).

Automatic image descriptions Karpathy, Fei-Fei, Deep Visual-Semantic Alignments for Generating Image Descriptions (2014)

Reinforcement learning The network learns to map the game screen and score to actions. Atari games: Pong, Seaquest, Breakout, Beam Rider, Space Invaders, Enduro. https://github.com/tambetm/simple_dqn Mnih et al., Human-level control through deep reinforcement learning (2015)

Multiagent reinforcement learning Videos on YouTube about competitive mode and collaborative mode. Tampuu, Matiisen et al., Multiagent Cooperation and Competition with Deep Reinforcement Learning (2015)

Program execution Curriculum learning (learning simple expressions first and then more complex ones) proved to be essential. Zaremba, Sutskever, Learning to Execute (2015).

The future of AI? Neural Turing Machines, Memory Networks: writing to and reading from external memory (infinite memory). For example: Hybrid computing using a neural network with dynamic external memory (Graves, Hassabis et al. 2016).

Thank you!