Structured Memory for Neural Turing Machines
|
|
- Jasmin Ball
- 5 years ago
- Views:
Transcription
1 Structured Memory for Neural Turing Machines Wei Zhang, Bowen Zhou, Yang Yu {zhangwei, zhou,
2 MoCvaCon NTM memory tries to simulate human working memory using linearly structured memory. It has been proved to be successful for learning simple algorithms. Q: Is Linear memory adequate for more complex algorithms, such as those for quescon answering? Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural Turing Machines." arxiv preprint arxiv: (2014).
3 NTM s neurological intuicon NTM is trained end-to-end: Both controller and memory are jointly learned by interaccng with environment. How and where to store in memory the response to the input is learnt automaccally, instead of specified explicitly.
4 Empirical observacons Memory plays a key role in NTM. For larger memories, the interaccon between parameter and memory content could be out of control at early stage of training, although careful inicalizacon of the memory content is applied 1. t-1 t C w r C w r controller Memory controller Memory Unroll NTM through Cme 1. h`ps://github.com/kaishengtai/torch-ntm
5 Empirical observacons SomeCmes NTM does not converge, or not as quickly as expected. Copy and recall. NTM struggles to learn the tasks that are intuicvely learnable. E.g. on babi data 1, compared to a`encon mechanism used in a`encve reader 2 1. Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer and Tomas Mikolov. Towards AI Complete Ques9on Answering: A Set of Prerequisite Toy Tasks. arxiv: Hermann, Karl Moritz, et al. "Teaching machines to read and comprehend." NIPS 2015.
6 Are those issues possibly related to the limitacon of NTM s linear memory structure? Can we learn from human working memory? h`p://thebrain.mcgill.ca/flash/d/d_07/d_07_cr/d_07_cr_tra/d_07_cr_tra.html
7 Imagine a tree. As a tree grows, so do the roots and branches. Given the right condicons (nutrients, water and sunlight), at the end of these branches leaves will grow. Much like a tree neurons have a complex network of branches, and given the right condi9ons, new synapses will grow and strength helping in the forma9on of memories. Dr. Daniel Jones BSc PhD Director of Research and Development Revive AcCve A Neuron tree as memory structure h`p://
8 Is it possible to improve NTM by using structured memory that mimics human memory in certain ways? and Can we mimic tree-structured human working memory in NTM?
9 Yes and Yes
10 Upper layer memory Accumulate response Lower layer memory Neurons closer to the root send signals to write to the other connected branch neurons. This is more like accumula9ng responses in memory.
11 Inspired by the tree structure for human memory, we propose Structured Memory for NTM
12 Related work David Krueger & Roland. REGULARIZING RNNS BY STABILIZING ACTIVATIONS. (2016) Memisevic h`p://arxiv.org/pdf/ v2.pdf Meng et al A DEEP MEMORY-BASED ARCHITECTURE FOR SEQUENCE- TO-SEQUENCE LEARNING h`p://arxiv.org/pdf/ v3.pdf
13 How do we create structured memory in NTM? Different ways. Just list 3.
14 Three proposed models Memory visability (w.r.t. write heads): memory is Controlled if it is modified by controller outputs through write heads directly, or Hidden if not. 1. Hidden-memory 2. Double-controlled 3. Tightly-coupled input output input output input output
15 Vanilla NTM input: output: input: output: Output c(t-1) Output c(t) LSTM LSTM c(t) c(t-1) ADDRESSING READ ADDRESSING M (t-1) M(t)
16 Our proposal #1: Hidden-Memory NTM input: output: input: output: Output Output c(t-1) c(t) LSTM LSTM c(t) c(t-1) ADDRESSING READ ADDRESSING Mc (t-1) acc Mh (t-1) Mc (t) acc Mh (t)
17 Mc(t) the controlled memory, is also the short-term memory that remembers immediate input from the environment. Mh(t) the hidden memory, works as a longer short-term memory that accumulates the contents from the controlled memory.
18 Our proposal #2: Double-Controlled NTM input: output: input: output: Output c(t-1) Output c(t) LSTM LSTM c(t) c(t-1) ADDRE SSING ADDRE SSING READ ADDRE SSING ADDRE SSING M1(t-1) M1(t) acc acc M2(t-1) M2(t)
19
20 Our proposal #3 Tightly-Coupled NTM NTM output: NTM input: Output MulC-layer LSTM L1 c(t) L2 L3 ADDR1 ADDR2 ADDR3 M1 M2 M3
21 Experiments
22 ImplementaCon Details Torch 6 RMSProp All modules are recurrent Train on CPU Memory size: Reserve big enough to hold all inputs.
23 Experiments copy, recall Copy: randomly generate input vector sequence to copy. Input sequence is only copied once. Recall: randomly generate input vector sequence, randomly choose a generated vector in sequence as query, and output the immediate next vector. sigmoid+binary cross entropy babi data for simulacng QuesCon Answering Word prediccon LogSoxMax + NegaCve LogLikelihood
24 Comparison on copy task where did they read from/write to memory? #Iter Vanilla NTM Hidden-Memory NTM Write [PosiCon, weights] Read [posicon, weights] Write [PosiCon, weights] Read [posicon, weights]
25 Comparison on copy task where did they read from/write to memory? #Iter Vanilla NTM Hidden-Memory NTM Write Read Write Read [PosiCon, weights] [posicon, weights] [PosiCon, weights] [posicon, weights]
26 Comparison on copy task where did they read from/write to memory? #Iter Vanilla NTM Hidden-Memory NTM Write Read Write Read [PosiCon, weights] [posicon, weights] [PosiCon, weights] [posicon, weights]
27 Comparison on copy task where did they read from/write to memory? #Iter Vanilla NTM Hidden-Memory NTM Write Read Write Read [PosiCon, weights] [posicon, weights] [PosiCon, weights] [posicon, weights]
28 Comparison on copy task where did they read from/write to memory? #Iter Vanilla NTM Hidden-Memory NTM Write Read Write Read [PosiCon, weights] [posicon, weights] [PosiCon, weights] [posicon, weights]
29 Comparison on copy task where did they read from/write to memory? #Iter Vanilla NTM Hidden-Memory NTM Write weights Read weights Write eights Read weights [PosiCon, weights] [posicon, weights] [PosiCon, weights] [posicon, weights]
30 20 Vanilla NTM -copy Copy Loss per item (x-axis is # of iteracons, y-axis is loss/vector, sampled every 25 iters) 8 Hidden-Memory NTM- copy r 2w 1r 1w Double-controlled NTM-copy Tightly-coupled NTM-copy
31 Copy task is a task that could lead to a nice converged model. Recall task is harder in that it requires random jump back into specific locacon in input.
32 AssociaCve Recall (x-axis is # of iteracons, y-axis is log of NLL loss/vector, sampled 10 Vanilla NTM every 25 iters) 10 Hidden-Memory Double-controlled Tightly-Coupled
33 AssociaCve Recall (x-axis is # of iteracons, y-axis is log of NLL loss/vector, sampled 10 Vanilla NTM every 25 iters) 10 Hidden-Memory Double-controlled Tightly-Coupled
34 QuesCon answering simulacon Test on babi data
35 Learning example: babi task 1 (1K) supporcng fact only F: Mary went to the Kitchen. Q: Where is marry? A: Kitchen.
36 babi task 1 (1K) supporcng fact only x #epochs, y acc NTM-1w1r NTM-2w2r Hidden-Memory Double-controlled Tightly-coupled
37 An example babi task 2 (1K) supporcng fact only F: John got the football there. F: John went to the hallway. Q: Where is the football? A: Hallway
38 babi task 2 (1k train/test) supporcng fact only x #epochs, y acc R1W 2R2W Hidden-Memory Double-controlled Tightly-coupled
39 An example babi task 3 (1K) supporcng fact only F: Mary journeyed to the office. F: Mary journeyed to the bathroom. F: Mary dropped the football. Q: Where was the football before the bathroom? A: office
40 Task 3(1K) Accuracy x #epochs, y acc r1w 2w2r Hidden-memory Double-controlled Tightly-coupled
41 Task 3(1K) Training loss dynamics (x: one sample /25 iteracon. Y: training loss) 1w1r 2w2r Double-controlled Tightly-coupled Hidden-memory
42 Discussion
43 Why structured memory works: The learning perspeccve Stabilizing memory, proved to be successful for training recurrent neural networks. David Krueger & Roland. REGULARIZING RNNS BY STABILIZING ACTIVATIONS. (2016) Memisevic h`p://arxiv.org/pdf/ v2.pdf
44 Why structured memory works: The learning perspeccve Smoothing. ExpectaCon stabilizes accvacon (memory), be`er parameter fi}ng. The items wri`en earlier in the history is forgo`en (might be because of erase and add operacons). AccumulaCon of memory brings the old memory back
45 Future and ongoing work Memory with more complicated structure. Tensor as weights for more complex ways of mixing memory contents? Real quescon answering dataset Hermann, Karl Moritz, Tomáš Kočiský, Edward Grefenste`e, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. "Teaching machines to read and comprehend." NIPS (2015).
46 Thanks for your reasoning, a`encon and memory during my talk! Q&A
Structured Attention Networks
Structured Attention Networks Yoon Kim Carl Denton Luong Hoang Alexander M. Rush HarvardNLP 1 Deep Neural Networks for Text Processing and Generation 2 Attention Networks 3 Structured Attention Networks
More informationEmpirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling Authors: Junyoung Chung, Caglar Gulcehre, KyungHyun Cho and Yoshua Bengio Presenter: Yu-Wei Lin Background: Recurrent Neural
More informationNeural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision
Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision Anonymized for review Abstract Extending the success of deep neural networks to high level tasks like natural language
More informationDifferentiable Data Structures (and POMDPs)
Differentiable Data Structures (and POMDPs) Yarin Gal & Rowan McAllister February 11, 2016 Many thanks to Edward Grefenstette for graphics material; other sources include Wikimedia licensed under CC BY-SA
More informationarxiv: v1 [cs.ne] 10 Jun 2016
Memory-Efficient Backpropagation Through Time Audrūnas Gruslys Google DeepMind audrunas@google.com Remi Munos Google DeepMind munos@google.com Ivo Danihelka Google DeepMind danihelka@google.com arxiv:1606.03401v1
More informationFastText. Jon Koss, Abhishek Jindal
FastText Jon Koss, Abhishek Jindal FastText FastText is on par with state-of-the-art deep learning classifiers in terms of accuracy But it is way faster: FastText can train on more than one billion words
More informationShow, Discriminate, and Tell: A Discriminatory Image Captioning Model with Deep Neural Networks
Show, Discriminate, and Tell: A Discriminatory Image Captioning Model with Deep Neural Networks Zelun Luo Department of Computer Science Stanford University zelunluo@stanford.edu Te-Lin Wu Department of
More informationRecurrent Neural Networks. Nand Kishore, Audrey Huang, Rohan Batra
Recurrent Neural Networks Nand Kishore, Audrey Huang, Rohan Batra Roadmap Issues Motivation 1 Application 1: Sequence Level Training 2 Basic Structure 3 4 Variations 5 Application 3: Image Classification
More informationContext Encoding LSTM CS224N Course Project
Context Encoding LSTM CS224N Course Project Abhinav Rastogi arastogi@stanford.edu Supervised by - Samuel R. Bowman December 7, 2015 Abstract This project uses ideas from greedy transition based parsing
More informationLSTM: An Image Classification Model Based on Fashion-MNIST Dataset
LSTM: An Image Classification Model Based on Fashion-MNIST Dataset Kexin Zhang, Research School of Computer Science, Australian National University Kexin Zhang, U6342657@anu.edu.au Abstract. The application
More informationarxiv: v3 [cs.lg] 30 Dec 2016
Video Ladder Networks Francesco Cricri Nokia Technologies francesco.cricri@nokia.com Xingyang Ni Tampere University of Technology xingyang.ni@tut.fi arxiv:1612.01756v3 [cs.lg] 30 Dec 2016 Mikko Honkala
More informationArtificial Neuron Modelling Based on Wave Shape
Artificial Neuron Modelling Based on Wave Shape Kieran Greer, Distributed Computing Systems, Belfast, UK. http://distributedcomputingsystems.co.uk Version 1.2 Abstract This paper describes a new model
More informationApplication of Deep Learning Techniques in Satellite Telemetry Analysis.
Application of Deep Learning Techniques in Satellite Telemetry Analysis. Greg Adamski, Member of Technical Staff L3 Technologies Telemetry and RF Products Julian Spencer Jones, Spacecraft Engineer Telenor
More informationEfficient Deep Learning Optimization Methods
11-785/ Spring 2019/ Recitation 3 Efficient Deep Learning Optimization Methods Josh Moavenzadeh, Kai Hu, and Cody Smith Outline 1 Review of optimization 2 Optimization practice 3 Training tips in PyTorch
More informationRecurrent Neural Network and its Various Architecture Types
Recurrent Neural Network and its Various Architecture Types Trupti Katte Assistant Professor, Computer Engineering Department, Army Institute of Technology, Pune, Maharashtra, India Abstract----Recurrent
More informationFloodless in SEATTLE A Scalable Ethernet Architecture for Large Enterprises By Changhoon Kim, Ma/hew Caesar, and Jennifer Rexford
Floodless in SEATTLE A Scalable Ethernet Architecture for Large Enterprises By Changhoon Kim, Ma/hew Caesar, and Jennifer Rexford Presented by: Charndeep Grewal Department of Electrical Engineering MoCvaCon
More informationRecurrent Neural Network (RNN) Industrial AI Lab.
Recurrent Neural Network (RNN) Industrial AI Lab. For example (Deterministic) Time Series Data Closed- form Linear difference equation (LDE) and initial condition High order LDEs 2 (Stochastic) Time Series
More informationInstructor: Jessica Wu Harvey Mudd College
The Perceptron Instructor: Jessica Wu Harvey Mudd College The instructor gratefully acknowledges Andrew Ng (Stanford), Eric Eaton (UPenn), David Kauchak (Pomona), and the many others who made their course
More informationRecurrent Neural Nets II
Recurrent Neural Nets II Steven Spielberg Pon Kumar, Tingke (Kevin) Shen Machine Learning Reading Group, Fall 2016 9 November, 2016 Outline 1 Introduction 2 Problem Formulations with RNNs 3 LSTM for Optimization
More informationImproving the way neural networks learn Srikumar Ramalingam School of Computing University of Utah
Improving the way neural networks learn Srikumar Ramalingam School of Computing University of Utah Reference Most of the slides are taken from the third chapter of the online book by Michael Nielson: neuralnetworksanddeeplearning.com
More informationarxiv: v2 [cs.ne] 10 Nov 2018
Number Sequence Prediction Problems for Evaluating Computational Powers of Neural Networks Hyoungwook Nam College of Liberal Studies Seoul National University Seoul, Korea hwnam8@snu.ac.kr Segwang Kim
More informationCS 5150 So(ware Engineering 12. System Architecture
Cornell University Compu1ng and Informa1on Science CS 5150 So(ware Engineering 12. System Architecture William Y. Arms Design The requirements describe the funccon of a system as seen by the client. For
More informationResidual Networks And Attention Models. cs273b Recitation 11/11/2016. Anna Shcherbina
Residual Networks And Attention Models cs273b Recitation 11/11/2016 Anna Shcherbina Introduction to ResNets Introduced in 2015 by Microsoft Research Deep Residual Learning for Image Recognition (He, Zhang,
More informationarxiv: v2 [cs.lg] 23 Feb 2016
Marcin Andrychowicz MARCINA@GOOGLE.COM Google DeepMind Karol Kurach KKURACH@GOOGLE.COM Google / University of Warsaw 1 equal contribution arxiv:1602.03218v2 [cs.lg] 23 Feb 2016 Abstract In this paper,
More informationRecurrent neural networks for polyphonic sound event detection in real life recordings
Recurrent neural networks for polyphonic sound event detection in real life recordings Giambattista Parascandolo, Heikki Huttunen, Tuomas Virtanen Tampere University of Technology giambattista.parascandolo@tut.fi
More informationAdding Depth to Dashboards
Copyright 2015 Splunk Inc. Adding Depth to Dashboards Pierre Brunel Splunk Disclaimer During the course of this presentacon, we may make forward looking statements regarding future events or the expected
More informationExploring Boosted Neural Nets for Rubik s Cube Solving
Exploring Boosted Neural Nets for Rubik s Cube Solving Alexander Irpan Department of Computer Science University of California, Berkeley alexirpan@berkeley.edu Abstract We explore whether neural nets can
More informationNeural Programming by Example
Neural Programming by Example Chengxun Shu Beihang University Beijing 100191, China shuchengxun@163.com Hongyu Zhang The University of Newcastle Callaghan, NSW 2308, Australia hongyu.zhang@newcastle.edu.au
More informationNeural Program Search: Solving Programming Tasks from Description and Examples
Neural Program Search: Solving Programming Tasks from Description and Examples Anonymous authors Paper under double-blind review Abstract We present a Neural Program Search, an algorithm to generate programs
More informationGate-Variants of Gated Recurrent Unit (GRU) Neural Networks
Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks Rahul Dey and Fathi M. Salem Circuits, Systems, and Neural Networks (CSANN) LAB Department of Electrical and Computer Engineering Michigan State
More informationShow, Discriminate, and Tell: A Discriminatory Image Captioning Model with Deep Neural Networks
Show, Discriminate, and Tell: A Discriminatory Image Captioning Model with Deep Neural Networks Boya Peng Department of Computer Science Stanford University boya@stanford.edu Zelun Luo Department of Computer
More informationDeep Neural Networks Optimization
Deep Neural Networks Optimization Creative Commons (cc) by Akritasa http://arxiv.org/pdf/1406.2572.pdf Slides from Geoffrey Hinton CSC411/2515: Machine Learning and Data Mining, Winter 2018 Michael Guerzhoy
More informationEND-TO-END CHINESE TEXT RECOGNITION
END-TO-END CHINESE TEXT RECOGNITION Jie Hu 1, Tszhang Guo 1, Ji Cao 2, Changshui Zhang 1 1 Department of Automation, Tsinghua University 2 Beijing SinoVoice Technology November 15, 2017 Presentation at
More informationarxiv: v1 [cs.cv] 17 Nov 2016
Multimodal Memory Modelling for Video Captioning arxiv:1611.05592v1 [cs.cv] 17 Nov 2016 Junbo Wang 1 Wei Wang 1 Yan Huang 1 Liang Wang 1,2 Tieniu Tan 1,2 1 Center for Research on Intelligent Perception
More informationLSTM with Working Memory
LSTM with Working Memory Andrew Pulver Department of Computer Science University at Albany Email: apulver@albany.edu Siwei Lyu Department of Computer Science University at Albany Email: slyu@albany.edu
More informationarxiv: v1 [cs.lg] 5 May 2015
Reinforced Decision Trees Reinforced Decision Trees arxiv:1505.00908v1 [cs.lg] 5 May 2015 Aurélia Léon aurelia.leon@lip6.fr Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6, F-75005, Paris, France
More informationSequence Modeling: Recurrent and Recursive Nets. By Pyry Takala 14 Oct 2015
Sequence Modeling: Recurrent and Recursive Nets By Pyry Takala 14 Oct 2015 Agenda Why Recurrent neural networks? Anatomy and basic training of an RNN (10.2, 10.2.1) Properties of RNNs (10.2.2, 8.2.6) Using
More informationTime Series prediction with Feed-Forward Neural Networks -A Beginners Guide and Tutorial for Neuroph. Laura E. Carter-Greaves
http://neuroph.sourceforge.net 1 Introduction Time Series prediction with Feed-Forward Neural Networks -A Beginners Guide and Tutorial for Neuroph Laura E. Carter-Greaves Neural networks have been applied
More informationNeural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani
Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer
More informationImage Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction
Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction by Noh, Hyeonwoo, Paul Hongsuck Seo, and Bohyung Han.[1] Presented : Badri Patro 1 1 Computer Vision Reading
More informationRecurrent Neural Networks
Recurrent Neural Networks Javier Béjar Deep Learning 2018/2019 Fall Master in Artificial Intelligence (FIB-UPC) Introduction Sequential data Many problems are described by sequences Time series Video/audio
More informationarxiv: v6 [stat.ml] 15 Jun 2015
VARIATIONAL RECURRENT AUTO-ENCODERS Otto Fabius & Joost R. van Amersfoort Machine Learning Group University of Amsterdam {ottofabius,joost.van.amersfoort}@gmail.com ABSTRACT arxiv:1412.6581v6 [stat.ml]
More informationAll You Want To Know About CNNs. Yukun Zhu
All You Want To Know About CNNs Yukun Zhu Deep Learning Deep Learning Image from http://imgur.com/ Deep Learning Image from http://imgur.com/ Deep Learning Image from http://imgur.com/ Deep Learning Image
More informationREVISITING DISTRIBUTED SYNCHRONOUS SGD
REVISITING DISTRIBUTED SYNCHRONOUS SGD Jianmin Chen, Rajat Monga, Samy Bengio & Rafal Jozefowicz Google Brain Mountain View, CA, USA {jmchen,rajatmonga,bengio,rafalj}@google.com 1 THE NEED FOR A LARGE
More informationNeural Machine Translation In Linear Time
Neural Machine Translation In Linear Time Authors: Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu Presenter: SunMao sm4206 YuZheng yz2978 OVERVIEW
More informationGraph Neural Network. learning algorithm and applications. Shujia Zhang
Graph Neural Network learning algorithm and applications Shujia Zhang What is Deep Learning? Enable computers to mimic human behaviour Artificial Intelligence Machine Learning Subset of ML algorithms using
More informationSentiment Classification of Food Reviews
Sentiment Classification of Food Reviews Hua Feng Department of Electrical Engineering Stanford University Stanford, CA 94305 fengh15@stanford.edu Ruixi Lin Department of Electrical Engineering Stanford
More informationTutorial on Keras CAP ADVANCED COMPUTER VISION SPRING 2018 KISHAN S ATHREY
Tutorial on Keras CAP 6412 - ADVANCED COMPUTER VISION SPRING 2018 KISHAN S ATHREY Deep learning packages TensorFlow Google PyTorch Facebook AI research Keras Francois Chollet (now at Google) Chainer Company
More informationSlides credited from Dr. David Silver & Hung-Yi Lee
Slides credited from Dr. David Silver & Hung-Yi Lee Review Reinforcement Learning 2 Reinforcement Learning RL is a general purpose framework for decision making RL is for an agent with the capacity to
More informationPartial Knowledge in Embeddings
Partial Knowledge in Embeddings R.V.Guha... Abstract. Representing domain knowledge is crucial for any task. There has been a wide range of techniques developed to represent this knowledge, from older
More informationNotes on Multilayer, Feedforward Neural Networks
Notes on Multilayer, Feedforward Neural Networks CS425/528: Machine Learning Fall 2012 Prepared by: Lynne E. Parker [Material in these notes was gleaned from various sources, including E. Alpaydin s book
More informationNeural Network Models for Text Classification. Hongwei Wang 18/11/2016
Neural Network Models for Text Classification Hongwei Wang 18/11/2016 Deep Learning in NLP Feedforward Neural Network The most basic form of NN Convolutional Neural Network (CNN) Quite successful in computer
More informationarxiv: v1 [cs.cl] 7 Mar 2017
Linguistic Knowledge as Memory for Recurrent Neural Networks Bhuwan Dhingra 1 Zhilin Yang 1 William W. Cohen 1 Ruslan Salakhutdinov 1 arxiv:1703.02620v1 [cs.cl] 7 Mar 2017 Abstract Training recurrent neural
More informationarxiv: v2 [cs.ai] 28 May 2018
Recurrent Relational Networks Rasmus Berg Palm Technical University of Denmark Tradeshift rapal@dtu.dk Ulrich Paquet DeepMind upaq@google.com Ole Winther Technical University of Denmark olwi@dtu.dk arxiv:.00v
More informationDeep Learning With Noise
Deep Learning With Noise Yixin Luo Computer Science Department Carnegie Mellon University yixinluo@cs.cmu.edu Fan Yang Department of Mathematical Sciences Carnegie Mellon University fanyang1@andrew.cmu.edu
More informationMachine Learning for Natural Language Processing. Alice Oh January 17, 2018
Machine Learning for Natural Language Processing Alice Oh January 17, 2018 Overview Distributed representation Temporal neural networks RNN LSTM GRU Sequence-to-sequence models Machine translation Response
More informationDistributional Neural Networks for. Automatic resolution of Crossword Puzzles
Distributional Neural Networks for Automatic Resolution of Crossword Puzzles Aliaksei Severyn, Massimo Nicosia, Gianni Barlacchi, Alessandro Moschitti DISI - University of Trento, Italy Qatar Computing
More informationArtificial Neural Networks. Introduction to Computational Neuroscience Ardi Tampuu
Artificial Neural Networks Introduction to Computational Neuroscience Ardi Tampuu 7.0.206 Artificial neural network NB! Inspired by biology, not based on biology! Applications Automatic speech recognition
More informationNEURAL FUNCTIONAL PROGRAMMING
NEURAL FUNCTIONAL PROGRAMMING John K. Feser Massachusetts Institute of Technology feser@csail.mit.edu Marc Brockschmidt, Alexander L. Gaunt, Daniel Tarlow Microsoft Research {mabrocks,t-algaun,dtarlow}@microsoft.com
More informationDecentralized and Distributed Machine Learning Model Training with Actors
Decentralized and Distributed Machine Learning Model Training with Actors Travis Addair Stanford University taddair@stanford.edu Abstract Training a machine learning model with terabytes to petabytes of
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:
More informationCUED-RNNLM An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models
CUED-RNNLM An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models Xie Chen, Xunying Liu, Yanmin Qian, Mark Gales and Phil Woodland April 1, 2016 Overview
More informationLSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University
LSTM and its variants for visual recognition Xiaodan Liang xdliang328@gmail.com Sun Yat-sen University Outline Context Modelling with CNN LSTM and its Variants LSTM Architecture Variants Application in
More informationLessons in Building a Distributed Query Planner. Ozgun Erdogan PGCon 2016
Lessons in Building a Distributed Query Planner Ozgun Erdogan PGCon 2016 Talk Outline 1. IntroducCon 2. Key insight in distributed planning 3. Distributed logical plans 4. Distributed physical plans 5.
More informationDeep Learning in Visual Recognition. Thanks Da Zhang for the slides
Deep Learning in Visual Recognition Thanks Da Zhang for the slides Deep Learning is Everywhere 2 Roadmap Introduction Convolutional Neural Network Application Image Classification Object Detection Object
More informationOutline. Morning program Preliminaries Semantic matching Learning to rank Entities
112 Outline Morning program Preliminaries Semantic matching Learning to rank Afternoon program Modeling user behavior Generating responses Recommender systems Industry insights Q&A 113 are polysemic Finding
More informationConvolutional Sequence to Sequence Learning. Denis Yarats with Jonas Gehring, Michael Auli, David Grangier, Yann Dauphin Facebook AI Research
Convolutional Sequence to Sequence Learning Denis Yarats with Jonas Gehring, Michael Auli, David Grangier, Yann Dauphin Facebook AI Research Sequence generation Need to model a conditional distribution
More informationGradient Descent Optimization Algorithms for Deep Learning Batch gradient descent Stochastic gradient descent Mini-batch gradient descent
Gradient Descent Optimization Algorithms for Deep Learning Batch gradient descent Stochastic gradient descent Mini-batch gradient descent Slide credit: http://sebastianruder.com/optimizing-gradient-descent/index.html#batchgradientdescent
More informationDeep Learning for Vision
Deep Learning for Vision Xiaoming Liu Slides adapted from Adam Coates What do we want ML to do? Given image, predict complex high- level pa>erns: Cat Object recognicon DetecCon SegmentaCon [MarCn et al.,
More informationLogical Rhythm - Class 3. August 27, 2018
Logical Rhythm - Class 3 August 27, 2018 In this Class Neural Networks (Intro To Deep Learning) Decision Trees Ensemble Methods(Random Forest) Hyperparameter Optimisation and Bias Variance Tradeoff Biological
More informationEnsemble methods in machine learning. Example. Neural networks. Neural networks
Ensemble methods in machine learning Bootstrap aggregating (bagging) train an ensemble of models based on randomly resampled versions of the training set, then take a majority vote Example What if you
More informationPixel-level Generative Model
Pixel-level Generative Model Generative Image Modeling Using Spatial LSTMs (2015NIPS) L. Theis and M. Bethge University of Tübingen, Germany Pixel Recurrent Neural Networks (2016ICML) A. van den Oord,
More informationarxiv: v3 [cs.lg] 29 Oct 2018
COMPOSITIONAL CODING CAPSULE NETWORK WITH K-MEANS ROUTING FOR TEXT CLASSIFICATION Hao Ren, Hong Lu School of Computer Science, Fudan University Shanghai, China arxiv:1810.09177v3 [cs.lg] 29 Oct 2018 ABSTRACT
More informationDeep Temporal Models (Benchmarks and Applica6ons Analysis)
Deep Temporal Models (Benchmarks and Applica6ons Analysis) Sek Chai SRI Interna6onal Presented at: NICE 2016, March 7, 2016 2016 SRI International Project Summary Goals Analyze Deep Temporal Models (DTMs).
More informationAutoencoders, denoising autoencoders, and learning deep networks
4 th CiFAR Summer School on Learning and Vision in Biology and Engineering Toronto, August 5-9 2008 Autoencoders, denoising autoencoders, and learning deep networks Part II joint work with Hugo Larochelle,
More informationEnd-to-end Training of Differentiable Pipelines Across Machine Learning Frameworks
End-to-end Training of Differentiable Pipelines Across Machine Learning Frameworks Mitar Milutinovic Computer Science Division University of California, Berkeley mitar@cs.berkeley.edu Robert Zinkov zinkov@robots.ox.ac.uk
More informationCOMP 551 Applied Machine Learning Lecture 14: Neural Networks
COMP 551 Applied Machine Learning Lecture 14: Neural Networks Instructor: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551 Unless otherwise noted, all material posted for this course
More informationAn Online Sequence-to-Sequence Model Using Partial Conditioning
An Online Sequence-to-Sequence Model Using Partial Conditioning Navdeep Jaitly Google Brain ndjaitly@google.com David Sussillo Google Brain sussillo@google.com Quoc V. Le Google Brain qvl@google.com Oriol
More informationarxiv: v1 [cs.ne] 28 Nov 2017
Block Neural Network Avoids Catastrophic Forgetting When Learning Multiple Task arxiv:1711.10204v1 [cs.ne] 28 Nov 2017 Guglielmo Montone montone.guglielmo@gmail.com Alexander V. Terekhov avterekhov@gmail.com
More informationOn the Efficiency of Recurrent Neural Network Optimization Algorithms
On the Efficiency of Recurrent Neural Network Optimization Algorithms Ben Krause, Liang Lu, Iain Murray, Steve Renals University of Edinburgh Department of Informatics s17005@sms.ed.ac.uk, llu@staffmail.ed.ac.uk,
More informationStudy of Residual Networks for Image Recognition
Study of Residual Networks for Image Recognition Mohammad Sadegh Ebrahimi Stanford University sadegh@stanford.edu Hossein Karkeh Abadi Stanford University hosseink@stanford.edu Abstract Deep neural networks
More informationCAP 6412 Advanced Computer Vision
CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong Feb 04, 2016 Today Administrivia Attention Modeling in Image Captioning, by Karan Neural networks & Backpropagation
More informationEnd-To-End Spam Classification With Neural Networks
End-To-End Spam Classification With Neural Networks Christopher Lennan, Bastian Naber, Jan Reher, Leon Weber 1 Introduction A few years ago, the majority of the internet s network traffic was due to spam
More informationXES Tensorflow Process Prediction using the Tensorflow Deep-Learning Framework
XES Tensorflow Process Prediction using the Tensorflow Deep-Learning Framework Demo Paper Joerg Evermann 1, Jana-Rebecca Rehse 2,3, and Peter Fettke 2,3 1 Memorial University of Newfoundland 2 German Research
More informationEECS 496 Statistical Language Models. Winter 2018
EECS 496 Statistical Language Models Winter 2018 Introductions Professor: Doug Downey Course web site: www.cs.northwestern.edu/~ddowney/courses/496_winter2018 (linked off prof. home page) Logistics Grading
More informationUsable while performant: the challenges building. Soumith Chintala
Usable while performant: the challenges building Soumith Chintala Problem Statement Deep Learning Workloads Problem Statement Deep Learning Workloads for epoch in range(max_epochs): for data, target in
More informationNatural Language Processing with Deep Learning CS224N/Ling284
Natural Language Processing with Deep Learning CS224N/Ling284 Lecture 8: Recurrent Neural Networks Christopher Manning and Richard Socher Organization Extra project office hour today after lecture Overview
More informationPython & Web Mining. Lecture Old Dominion University. Department of Computer Science CS 495 Fall 2012
Python & Web Mining Lecture 6 10-10-12 Old Dominion University Department of Computer Science CS 495 Fall 2012 Hany SalahEldeen Khalil hany@cs.odu.edu Scenario So what did Professor X do when he wanted
More informationCS489/698: Intro to ML
CS489/698: Intro to ML Lecture 14: Training of Deep NNs Instructor: Sun Sun 1 Outline Activation functions Regularization Gradient-based optimization 2 Examples of activation functions 3 5/28/18 Sun Sun
More informationNeural Network Optimization and Tuning / Spring 2018 / Recitation 3
Neural Network Optimization and Tuning 11-785 / Spring 2018 / Recitation 3 1 Logistics You will work through a Jupyter notebook that contains sample and starter code with explanations and comments throughout.
More informationMemory Bandwidth and Low Precision Computation. CS6787 Lecture 10 Fall 2018
Memory Bandwidth and Low Precision Computation CS6787 Lecture 10 Fall 2018 Memory as a Bottleneck So far, we ve just been talking about compute e.g. techniques to decrease the amount of compute by decreasing
More informationChannel Locality Block: A Variant of Squeeze-and-Excitation
Channel Locality Block: A Variant of Squeeze-and-Excitation 1 st Huayu Li Northern Arizona University Flagstaff, United State Northern Arizona University hl459@nau.edu arxiv:1901.01493v1 [cs.lg] 6 Jan
More information11. Neural Network Regularization
11. Neural Network Regularization CS 519 Deep Learning, Winter 2016 Fuxin Li With materials from Andrej Karpathy, Zsolt Kira Preventing overfitting Approach 1: Get more data! Always best if possible! If
More informationGlobal Optimality in Neural Network Training
Global Optimality in Neural Network Training Benjamin D. Haeffele and René Vidal Johns Hopkins University, Center for Imaging Science. Baltimore, USA Questions in Deep Learning Architecture Design Optimization
More informationCHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION
75 CHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION 6.1 INTRODUCTION Counter propagation network (CPN) was developed by Robert Hecht-Nielsen as a means to combine an unsupervised Kohonen
More informationarxiv: v1 [cs.cl] 28 Nov 2018
A Deep Cascade Model for Multi-Document Reading Comprehension Ming Yan, Jiangnan Xia, Chen Wu, Bin Bi Zhongzhou Zhao, Ji Zhang, Luo Si, Rui Wang, Wei Wang, Haiqing Chen Alibaba Group {ym119608, jiangnan.xjn,
More informationCS 4510/9010 Applied Machine Learning. Neural Nets. Paula Matuszek Fall copyright Paula Matuszek 2016
CS 4510/9010 Applied Machine Learning 1 Neural Nets Paula Matuszek Fall 2016 Neural Nets, the very short version 2 A neural net consists of layers of nodes, or neurons, each of which has an activation
More informationCSC 2515 Introduction to Machine Learning Assignment 2
CSC 2515 Introduction to Machine Learning Assignment 2 Zhongtian Qiu(1002274530) Problem 1 See attached scan files for question 1. 2. Neural Network 2.1 Examine the statistics and plots of training error
More informationCode Mania Artificial Intelligence: a. Module - 1: Introduction to Artificial intelligence and Python:
Code Mania 2019 Artificial Intelligence: a. Module - 1: Introduction to Artificial intelligence and Python: 1. Introduction to Artificial Intelligence 2. Introduction to python programming and Environment
More informationNeural Networks (Overview) Prof. Richard Zanibbi
Neural Networks (Overview) Prof. Richard Zanibbi Inspired by Biology Introduction But as used in pattern recognition research, have little relation with real neural systems (studied in neurology and neuroscience)
More information