Perception as an Inference Problem

1 Perception as an Inference Problem. Bruno A. Olshausen, Helen Wills Neuroscience Institute, School of Optometry, and Redwood Center for Theoretical Neuroscience, UC Berkeley

2 What are the principles governing information processing in this system?

3 Wallisch & Movshon (2008): Gabor filters ... ? ... objects, faces

4 Two views of visual system function
Deduction: feature extraction, classification (Hubel & Wiesel; Fukushima; deep learning)
Inference: generative models, recurrent computation (Helmholtz; Nakayama; Kersten & Yuille; Geman; Lee & Mumford)

5 Hubel & Wiesel (1962, 1965): hierarchy of simple, complex, and hypercomplex cells

6 Input data (x) → output (y): $y = f(x; w)$

7 Is this the goal of vision?

8 Visual Navigation in Box Jellyfish (Figure 1: rhopalia and the upper lens eye, which stay directed straight upward irrespective of body orientation); jumping spider; sand wasp

9 State (s), sensory data (x), actuator movement (a): $\dot{s} + s = g(s, x, a; w)$, $a = f(s)$

10 Vision as inference: World → lens → Image → Model

12 Separation of shape and reflectance: reflectance vs. shading (Adelson, 2000)

13 Possible neural circuits for inferential computation in V1
1. Sparse coding
2. Separating form and motion from time-varying images

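14 Sparse coding: the external world produces an image; the internal model is an image model (Olshausen & Field, 1996; Chen, Donoho & Saunders, 1995)
$$I(\vec{x}) = \sum_{i=1}^{M} a_i\,\phi_i(\vec{x}) + \epsilon(\vec{x})$$
where $I(\vec{x})$ is the image, $a_i$ are the neural activities (sparse), $\phi_i(\vec{x})$ are the features, and $\epsilon(\vec{x})$ is "other stuff".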

15 Energy function: one term to preserve information, one term to be sparse

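16 Energy function: $E = -\log P(I \mid a)\,P(a) = -\log P(I \mid a) - \log P(a)$, where $-\log P(I \mid a)$ is the "preserve information" term and $-\log P(a)$ is the "be sparse" term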
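A minimal sketch of how such an energy could be evaluated. The slide only gives the form -log P(I|a)P(a); the Gaussian noise model (squared error) and Laplacian prior (L1 penalty) used here are assumptions, as are the names `sparse_coding_energy`, `Phi`, and `lam`.

```python
import numpy as np

def sparse_coding_energy(image, Phi, a, lam=0.1):
    """E(a) = 0.5*||I - Phi a||^2 + lam*sum|a_i|.

    Assumes Gaussian noise (first term) and a Laplacian prior (second term);
    the slide itself does not fix these choices.
    """
    residual = image - Phi @ a                  # "preserve information"
    sparsity = np.sum(np.abs(a))                # "be sparse"
    return 0.5 * np.sum(residual ** 2) + lam * sparsity

# Hypothetical toy data: 64-pixel image, 128 unit-norm random features.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 128))
Phi /= np.linalg.norm(Phi, axis=0)
a_true = np.zeros(128)
a_true[[3, 40, 99]] = [1.0, -0.5, 2.0]
image = Phi @ a_true

energy_true = sparse_coding_energy(image, Phi, a_true)
energy_zero = sparse_coding_energy(image, Phi, np.zeros(128))
```

In this toy setting the sparse code that generated the image has a much lower energy than the all-zero code, because it explains the image exactly while paying only a small sparsity cost.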

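17 Coefficients $a_i$ may be computed via thresholding and lateral inhibition (LCA: Rozell, Johnson, Baraniuk & Olshausen, 2008), where each unit applies a threshold function $g$ and
$$b_i = \sum_{\vec{x}} \phi_i(\vec{x})\, I(\vec{x}), \qquad G_{ij} = \sum_{\vec{x}} \phi_i(\vec{x})\, \phi_j(\vec{x})$$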
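The thresholding-and-inhibition scheme can be sketched as follows. This is an illustrative Euler integration of LCA-style dynamics, assuming a soft-threshold function and hypothetical parameter values (`lam`, `tau`, `n_steps`); it is not the authors' code.

```python
import numpy as np

def soft_threshold(u, lam):
    """Threshold function: shrink toward zero, exactly zero below lam (assumed form)."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def lca(image, Phi, lam=0.1, tau=10.0, n_steps=500):
    """Euler integration of tau * du/dt = b - u - (G - I) a, with a = threshold(u)."""
    b = Phi.T @ image                        # b_i = sum_x phi_i(x) I(x)
    G = Phi.T @ Phi                          # G_ij = sum_x phi_i(x) phi_j(x)
    inhibition = G - np.eye(G.shape[0])      # lateral inhibition, no self-term
    u = np.zeros_like(b)                     # internal (membrane) state
    for _ in range(n_steps):
        a = soft_threshold(u, lam)
        u += (b - u - inhibition @ a) / tau
    return soft_threshold(u, lam)

# Hypothetical toy problem: recover a 3-sparse code from a random dictionary.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 128))
Phi /= np.linalg.norm(Phi, axis=0)           # unit-norm features
a_true = np.zeros(128)
a_true[[3, 40, 99]] = [1.0, -0.5, 2.0]
image = Phi @ a_true
a_hat = lca(image, Phi)
```

Units whose drive `b` exceeds the threshold become active and suppress units with overlapping features through `G`, so only a small number of coefficients stay nonzero.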

18 Figure panels: 1.25x, 2.5x, 5x, 10x

19 Two examples
1. Sparse coding
2. Separating form and motion from time-varying images

20 Visual perception requires separation of form and motion from time-varying retinal images (eye movement data from Austin Roorda, UC Berkeley)

21 Simple averaging is not sufficient

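22 The problem
$$I(\vec{x}, t) = S(\vec{x} - \Delta\vec{x}(t)) + \epsilon(\vec{x}, t)$$
$$\widehat{\Delta\vec{x}}(t) = \arg\min_{\Delta\vec{x}(t)} \left\| I(\vec{x}, t) - S(\vec{x} - \Delta\vec{x}(t)) \right\|^2$$
$$\hat{S}(\vec{x}) = \int I(\vec{x} + \Delta\vec{x}(t))\, dt$$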
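A small 1-D illustration of these three equations, under assumptions not in the slides: integer circular shifts, Gaussian noise, and a box-shaped pattern. Note that the shift estimate below uses the true pattern S, which is exactly what is unavailable in practice; the sketch only shows why shift-corrected averaging beats simple averaging once the shifts are known.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_frames = 64, 200

# Hypothetical 1-D pattern S(x) and eye-position offsets Delta x(t).
S = np.zeros(n_pixels)
S[20:28] = 1.0
true_shifts = rng.integers(-5, 6, size=n_frames)

# I(x, t) = S(x - Delta x(t)) + noise; np.roll(S, d)[x] equals S[x - d].
frames = np.stack([np.roll(S, d) for d in true_shifts])
frames += 0.1 * rng.standard_normal(frames.shape)

def estimate_shift(frame, pattern, candidates):
    """arg min over candidate shifts of ||I(x, t) - S(x - shift)||^2."""
    errors = [np.sum((frame - np.roll(pattern, d)) ** 2) for d in candidates]
    return candidates[int(np.argmin(errors))]

candidates = np.arange(-5, 6)
est_shifts = np.array([estimate_shift(f, S, candidates) for f in frames])

# S_hat(x) = average over t of I(x + Delta x(t), t): undo each shift, then average.
S_hat = np.mean([np.roll(f, -d) for f, d in zip(frames, est_shifts)], axis=0)

# Simple averaging ignores the shifts and blurs the pattern.
naive = frames.mean(axis=0)

err_aligned = np.sum((S_hat - S) ** 2)
err_naive = np.sum((naive - S) ** 2)
```

Comparing `err_aligned` with `err_naive` shows the blur introduced by averaging without motion compensation.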

23 Traditional models compute motion and form independently
Motion: time-varying image → motion energy and pooling → optic flow
Form: time-varying image → feature extraction and pooling → invariant pattern recognition

25 Motion and form must be estimated simultaneously
Motion alone: time-varying image → estimate motion → optic flow, with regularization (smoothness)
Joint: time-varying image → estimate motion and estimate pattern form together → motion and pattern form, each with a natural scene statistics prior

26 Graphical model for separating form and motion (Alex Anderson, Ph.D. thesis)
$X_0, X_1, X_2$: eye position; $R_0, R_1, R_2$: spikes (from LGN afferents); $S$: pattern
$$\hat{S} = \arg\max_S \log P(R \mid S)$$

27 Given current estimate of position (X), update S. Panels: Retina; Internal Position Estimate (X); Internal Form Estimate (S)

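28 Given current estimate of form (S), update X
$$P(X_t \mid R_{0:t}) \;\rightarrow\; P(X_{t+1} \mid R_{0:t}) \;\rightarrow\; P(X_{t+1} \mid R_{0:t+1})$$
The last step incorporates the new spikes $R_{t+1}$ through the likelihood $P(R_{t+1} \mid X_{t+1}, S = S_t)$.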
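The predict-then-update recursion for X can be sketched as a discrete Bayesian filter. Everything specific here is an assumption for illustration: 1-D circular positions, a random-walk motion model, Poisson spiking with rate given by the shifted pattern, and the names `predict`, `update`, and `kernel`.

```python
import numpy as np

def predict(belief, kernel):
    """P(X_{t+1} | R_{0:t}): spread the previous posterior with the motion model."""
    out = np.zeros_like(belief)
    for shift, prob in kernel.items():
        out += prob * np.roll(belief, shift)
    return out

def update(predicted, spikes, pattern):
    """P(X_{t+1} | R_{0:t+1}): weight by P(R_{t+1} | X_{t+1}, S = S_t), renormalize."""
    n = predicted.size
    log_like = np.empty(n)
    for x in range(n):
        rate = np.roll(pattern, x)            # expected spike counts at position x
        # Poisson log-likelihood, dropping the term that does not depend on x.
        log_like[x] = np.sum(spikes * np.log(rate) - rate)
    log_post = np.log(predicted + 1e-300) + log_like
    log_post -= log_post.max()                # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical simulation: the pattern estimate S_t is held fixed while X is tracked.
rng = np.random.default_rng(1)
n = 32
pattern = np.full(n, 0.5)                     # baseline firing rate (assumed)
pattern[10:14] = 20.0                         # bright region of the pattern (assumed)
kernel = {-1: 0.25, 0: 0.5, 1: 0.25}          # random-walk eye motion (assumed)
belief = np.full(n, 1.0 / n)                  # flat prior over position
x_true = 0
for _ in range(30):
    x_true = (x_true + rng.choice([-1, 0, 1], p=[0.25, 0.5, 0.25])) % n
    spikes = rng.poisson(np.roll(pattern, x_true))
    belief = update(predict(belief, kernel), spikes, pattern)
x_hat = int(np.argmax(belief))
```

Alternating this step with an update of S given the current position estimate gives one way to realize the joint estimation of form and position.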

29 Joint estimation of form (S) and position (X)

30 Including a prior over form (S)
$X_0, X_1, X_2$: eye position; $R_0, R_1, R_2$: spikes (from LGN afferents); $S$: pattern; $D$: dictionary; $A$: sparse representation
$$\hat{A} = \arg\max_A \; \log P(R \mid A) + \log P(A)$$
where $\log P(A)$ is the sparse prior.

31 Learned dictionary D

32 Prior over form (S) improves inference

33 Form prior improves inference

34 Main points
Perception seems better described as an inference problem that attempts to disentangle underlying causes from image data.
Inference involves bidirectional information flow both within and between levels of representation.
This moves us away from thinking of receptive fields and instead toward how populations of neurons interact to perform collective computations.

35 Papers
Olshausen BA (2014). Perception as an Inference Problem. In: The Cognitive Neurosciences V (M. Gazzaniga, R. Mangun, Eds.). MIT Press.
Rozell CJ, Johnson DH, Baraniuk RG, Olshausen BA (2008). Sparse Coding via Thresholding and Local Competition in Neural Circuits. Neural Computation, 20,
Olshausen BA (2013). Highly overcomplete sparse coding. In: SPIE Proceedings vol. 8651: Human Vision and Electronic Imaging XVIII (B.E. Rogowitz, T.N. Pappas, H. de Ridder, Eds.), Feb. 4-7, 2013, San Francisco, California.
