Variational Autoencoders. Sargur N. Srihari

Variational Autoencoders
Sargur N. Srihari
srihari@cedar.buffalo.edu

Topics
1. Generative Model
2. Standard Autoencoder
3. Variational Autoencoders (VAE)

Generative Model
A variational autoencoder (VAE) is a generative model, i.e., it is able to generate fake samples that look like samples from the training data.
With MNIST data, these fake samples would be synthetic images of handwritten digits.
A VAE provides us with a space, the latent space, from which we can sample points.
Any of these points can be decoded into a reasonable image of a handwritten digit.

Standard Autoencoder
A standard autoencoder trained on MNIST digits may not produce a reasonable output when a 'V' image is given as input.
http://ijdykeman.github.io/ml/2016/12/21/cvae.html

Normal Distribution of MNIST
A standard normal distribution: this is how we would like the points corresponding to MNIST digit images to be distributed in the latent space.

Decoder of a VAE
3s are in the first quadrant, 6s are in the third quadrant.

Encoder of a VAE

MNIST Variational Autoencoder

Structure of Latent Space
The decoder expects the latent space to be normally distributed.
Whether the sum of the distributions produced by the encoder approximates a standard normal distribution is measured by the KL divergence.
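This KL divergence has a closed form when the encoder outputs a diagonal Gaussian. A minimal PyTorch sketch (the tensor values are illustrative, not from the slides):

```python
import torch

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions
    return -0.5 * torch.sum(1.0 + log_var - mu.pow(2) - log_var.exp())

# Illustrative 2-D code predicted by the encoder for one image (made-up values)
mu = torch.tensor([0.3, -1.2])
log_var = torch.tensor([-0.5, 0.1])
print(kl_to_standard_normal(mu, log_var))
```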

VAE Training
Because of the random variable between input and output, a VAE cannot be trained with backprop directly.
Instead, backprop proceeds through the parameters of the latent distribution.
This is called the reparameterization trick: a sample from N(μ, Σ) is written as μ + Σ^(1/2) ε with ε ~ N(0, I), where the covariance matrix Σ is diagonal.
Because of the randomness involved, training is called Stochastic Gradient Variational Bayes (SGVB).
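A minimal sketch of the reparameterized sampling step, assuming the encoder predicts a mean and a log-variance for a diagonal Gaussian (the values below are illustrative):

```python
import torch

def reparameterize(mu, log_var):
    # z = mu + sigma * eps with eps ~ N(0, I); the sample stays differentiable in mu and sigma
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * log_var) * eps

z = reparameterize(torch.tensor([0.3, -1.2]), torch.tensor([-0.5, 0.1]))
print(z)
```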

Conditional VAE
The digit (number) information is fed to the network as a one-hot vector.
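A minimal sketch of this conditioning step, assuming a 2-D latent space and a hypothetical conditional decoder that takes the concatenated vector as input:

```python
import torch
import torch.nn.functional as F

# Append the one-hot digit label to the latent point before decoding, so the latent
# point itself only has to carry style information (stroke width, slant, ...)
z = torch.zeros(2)                                        # a 2-D latent point
y = F.one_hot(torch.tensor(7), num_classes=10).float()    # the label "7" as a one-hot vector
decoder_input = torch.cat([z, y])                         # input to the conditional decoder
print(decoder_input.shape)                                # torch.Size([12])
```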

Generating Images from a VAE
Feed a random point in the latent space together with the desired number to the decoder.
Even if the same latent point is used for two different numbers, the process will work correctly, since the latent space only encodes features such as stroke width or angle.

Samples Generated from a VAE
Images produced by fixing the number input to the decoder and sampling from the latent space.
The numbers vary in style, but the images in a single row are clearly of the same number.
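A sketch of how such a grid could be produced, one row per digit, using an untrained stand-in for the conditional decoder (the network, names, and dimensions are assumptions, not from the slides):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Untrained stand-in for a conditional decoder p(x | z, y); purely illustrative
dec = nn.Sequential(nn.Linear(2 + 10, 256), nn.ReLU(), nn.Linear(256, 784), nn.Sigmoid())

rows = []
for digit in range(10):                                   # one row per fixed digit label
    y = F.one_hot(torch.tensor([digit] * 8), num_classes=10).float()
    z = torch.randn(8, 2)                                  # 8 random latent points -> 8 styles
    rows.append(dec(torch.cat([z, y], dim=1)).view(8, 28, 28))
grid = torch.stack(rows)                                   # shape (10, 8, 28, 28)
```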

VAE for Radiology
Combines two types of models, discriminative and generative, into a single framework.
Right: a generative PGM with inputs:
1. class label y (diseases)
2. nuisance variables s (hospital identifiers)
3. latent variables z (size, shape, other brain properties)
It provides the causality of the observation.
Left: a discriminative deep neural network model.
Input: the observed variables.
It generates posterior distributions over the latent variables and possibly (if unobserved) the class labels.
It performs the inference of latent variables necessary to perform the variational updates.
The models are trained jointly using the variational EM framework.

Variational Autoencoder (VAE)
The VAE is a directed model that uses learned approximate inference and can be trained purely with gradient-based methods.
To generate a sample from the model, first draw a sample z from the code distribution p_model(z).
The sample is then run through a differentiable generator network g(z).
Finally, x is sampled from the distribution p_model(x; g(z)) = p_model(x | z).
During training, however, the approximate inference network (or encoder) q(z | x) is used to obtain z, and p_model(x | z) is then viewed as a decoder network.
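A minimal sketch of this ancestral sampling procedure, using an untrained stand-in for the generator network g (the architecture and dimensions are assumptions; in practice g is the trained decoder):

```python
import torch
import torch.nn as nn

latent_dim, image_dim = 2, 784
g = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, image_dim), nn.Sigmoid(),   # outputs Bernoulli means, one per pixel
)

z = torch.randn(1, latent_dim)   # draw z from the code distribution p_model(z) = N(0, I)
probs = g(z)                     # run z through the differentiable generator network g(z)
x = torch.bernoulli(probs)       # sample x from p_model(x | z)
```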

The VAE Model
A method for modeling a data distribution using a collection of independent latent variables.
Generative model: p(x, z) = p(x | z) p(z)
x is a random variable representing the observed data.
z is a collection of latent variables.
p(x | z) is parameterized by a deep neural network (the decoder).
The components of z are independent Bernoulli or Gaussian variables.
Learned approximate inference is trained using gradient descent: q(z | x) = N(μ, σ²I), whose parameters are given by another deep network (the encoder).
Thus we have z ~ Enc(x) = q(z | x) and y ~ Dec(z) = p(x | z).
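A minimal PyTorch sketch of the two networks for MNIST-sized inputs (layer sizes are assumptions, not from the slides):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    # q(z | x) = N(mu(x), sigma^2(x) I); mu and log sigma^2 come from one small MLP
    def __init__(self, x_dim=784, h_dim=256, z_dim=2):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.log_var = nn.Linear(h_dim, z_dim)

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.log_var(h)

class Decoder(nn.Module):
    # p(x | z): per-pixel Bernoulli means produced by another small MLP
    def __init__(self, z_dim=2, h_dim=256, x_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, x_dim), nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z)
```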

Key Insight of VAE
VAEs can be trained by maximizing the variational lower bound L(q) associated with data point x:
L(q) = E_{z ~ q(z|x)} [ log p_model(z, x) ] + H(q(z | x))
where E_{z ~ q(z|x)} log p_model(z, x) is the joint log-likelihood of the visible and hidden variables under the approximate posterior over the latent variables, and H(q(z | x)) is the entropy of the approximate posterior.
When q is chosen to be a Gaussian with noise added to a predicted mean, maximizing this entropy term encourages increasing σ.
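Since log p_model(z, x) = log p_model(x | z) + log p_model(z), combining the joint-likelihood term with the entropy term rewrites L(q) as the expected reconstruction log-likelihood minus KL(q(z | x) || p(z)). A sketch of the resulting training loss (the negative of L(q)), assuming a Bernoulli decoder and a diagonal-Gaussian encoder as above:

```python
import torch
import torch.nn.functional as F

def negative_elbo(x, x_recon, mu, log_var):
    # -L(q) = reconstruction negative log-likelihood + KL( q(z|x) || N(0, I) )
    recon_nll = F.binary_cross_entropy(x_recon, x, reduction='sum')
    kl = -0.5 * torch.sum(1.0 + log_var - mu.pow(2) - log_var.exp())
    return recon_nll + kl
```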

VAE: 2-D coordinate systems learned for high-dimensional manifolds

Disentangling Factors of Variation (FoVs)
During training, the only supervision is class labels.
Specified FoVs: images captured from different viewpoints. Strong supervision: pairs of images with two different objects at the same viewing angle.
Unspecified FoVs: labels unavailable.
A disentanglement method: combine a variational autoencoder with adversarial training.