Announcements I Introduction to Computer Vision CSE 152 Lecture 18 Assignment 4: Due Toda Assignment 5: Posted toda Read: Trucco & Verri, Chapter 10 on recognition Final Eam: Wed, 6/9/04, 11:30-2:30, WLH 2207 (here) Optical Flow: Where do piels move to? Virtual Cinematograph: Making 'The Matri' Sequels George Borshukov VFX Technolog Supervisor, ESC Entertainment Frida, June 4, 2004 1:00 p.m. to 2:30 p.m. [Pizza lunch will precede the event from noon to 1 p.m.] Main Auditorium, San Diego Supercomputer Center The presentation will cover the ke technologies that had to be developed and deploed to create the snthetic human sequences in the Matri sequels including Universal Capture - image-based facial animation, realistic human face rendering, and use of measured BRDF in film production. It will also feature a breakdown of The Superpunch shot (pictured above) from "The Matri Revolutions" (the bullet time punch that Neo delivers to Agent Smith during the film's last face-off). This difficult, important, epensive, and challenging shot was entirel computer generated and showcased the technological developments of 3.5+ ears at their best b showing a full-frame close-up of a known human actor. Mathematical formulation [Note change of notation: image coordinates now (,), not (u,v)] I (,,t) brightness at image point (,) at time t Consider scene (or camera) to be moving, so (t), (t) Brightness constanc assumption: d d I ( + δt, + δt, t + δt) I(,, t) Optical flow constraint equation : di d + d di + t Measurements I, I, I t t Flow vector d u, d v
Two was to get flow de( u, v) du de( u, v) dv 2I 2I ( I u + I v + I ) ( I u + I v + I ) t t Ω 1. Think globall, and regularize over image 2. Look over window and assume constant motion in the window Given a database of objects and an image determine what, if an of the objects are present in the image. Challenges Within-class variabilit Different objects within the class have different shapes or different material characteristics Deformable Articulated Compositional Pose variabilit: 2-D Image transformation (translation, rotation, scale) 3-D Pose Variabilit (perspective, orthographic projection) Lighting Direction (multiple sources & tpe) Color Shadows Occlusion partial Clutter in background -> false positives Object Issues: How general is the problem? 2D vs. 3D range of viewing conditions available contet segmentation cues What sort of data is best suited to the problem? Whole images Local 2D features (color, teture, 3D (range) data What information do we have in the database? Collection of images? 3-D models? Learned representation? Learned classifiers? How man objects are involved? small: brute force search large:??
A Rough Spectrum Appearance-Based (Eigenface, Fisherface) Shape Contets Geometric Invariants Image Abstractions/ Volumetric Primitives Appearance-Based Vision: A Pattern Classification Viewpoint Local Features + Spatial Relations Aspect Graphs Increasing Generalit 3-D Model-Based Function 1. Feature Space + Nearest Neighbor 2. Dimensionalit Reduction 3. Baesian Classification 4. Appearance Manifolds Image (window) Sketch of a Pattern Architecture Feature Etraction Feature Vector Classification Object Identit Eample: Face Detection Scan window over image. Classif window as either: Face Non-face Window Classifier Face Non-face The Problem of Given an image I and a database of k objects and a representation R j for object j in the database, recognition can be epressed as: Image as a Feature Vector 2 i arg min c( R j [1, L, k ] j, I) 1 3 where c(r j,,i) is a function which gives the compatibilit or consistenc of representation R j with the image. Consider an n-piel image (window) to be a point in an n-dimensional space, R n. Each piel value is a coordinate of.
Simplest Scheme R j is an image. c(r j, I) is Euclidean distance. R 1 I 1 2 3 Nearest Neighbor Classifier R 2 Comments Sometimes called Template Matching Variations on distance function (e.g. L 1, robust distances) Multiple templates per class- perhaps man training images per class. Epensive to compute k distances, especiall when each image is big (N dimensional). Ma not generalize well to unseen eamples of class. Some solutions: Dimensionalit reduction Baesian classification Eigenfaces: Linear Projection Eigenfaces: Principal Component Analsis (PCA) An n-piel image R n can be projected to a low-dimensional feature space R m b W where W is an m b n matri. is performed using nearest neighbor in R m. How do we choose a good W? Some details: Use Singular value decomposition, trick described in appendi of tet to compute basis when n<<d Mean First Principal Component Direction of Maimum Variance Eigenfaces Modeling 1. Given a collection of n labeled training images, 2. Compute mean image and covariance matri. 3. Compute k Eigenvectors (note that these are images) of covariance matri corresponding to k largest Eigenvalues. (Or perform using SVD!!) 4. Project the training images to the k-dimensional Eigenspace. 1. Given a test image, project to Eigenspace. 2. Perform classification to the projected training images.
Eigenfaces: Training Images Eigenfaces [ Turk, Pentland 91 Mean Image Basis Images Basis Images for Variable Lighting Projection, and reconstruction An n-piel image R n can be projected to a low-dimensional feature space R m b W From R m, the reconstruction of the point is W T The error of the reconstruction is: -W T W Reconstruction using Eigenfaces Given image on left, project to Eigenspace, then reconstruct an image (right). Face detection using distance to face space Scan a window ω across the image, and classif the window as face/not face as follows: Project window to subspace, and reconstruct as described earlier. Compute distance between ω and reconstruction. Local minima of distance over all image locations less than some treshold are taken as locations of faces. Repeat at different scales. Possibl normalize windows intensit so that ω 1.