Causal and compositional generative models in online perception
|
|
- Norma Coleen Baker
- 6 years ago
- Views:
Transcription
1 Causal and compositional generative models in online perception Michael Janner with Ilker Yildirim, Mario Belledonne, Chris8an Wallraven, Winrich Freiwald, Joshua Tenenbaum MIT July 28, 2017 London, UK
2
3 ResNet-18 predic/ons zebra dalma8on park bench... ResNet: He et al. CVPR %.2%.1%
4 Outline 1. Occluded face perception with causal and compositional models
5 Outline 1. Occluded face perception with causal and compositional models 2. Automatic training-free vision-to-touch crossmodal transfer
6 Outline 1. Occluded face perception with causal and compositional models 2. Automatic training-free vision-to-touch crossmodal transfer 3. Learning visual causal models for generic object categories Composite Texture Shape Lighting
7
8 The generative model Intrinsic Shape Texture Shape S; Texture T Face Morphable Model: Blanz, Vetter. SIGGRAPH 1999.
9 The generative model Intrinsic Extrinsic Shape Texture Lights Pose Shape S; Texture T Lights L; Pose P Face Morphable Model: Blanz, Vetter. SIGGRAPH 1999.
10 The generative model Intrinsic Extrinsic Shape Texture Lights Pose Shape S; Texture T Lights L; Pose P Rendering Engine Face Face Face Morphable Model: Blanz, Vetter. SIGGRAPH 1999.
11 The generative model Intrinsic Extrinsic Shape Texture Lights Pose Shape S; Texture T Lights L; Pose P Rendering Engine Face Occluder Face Occluder ShapeNet: Angel X. Chang, et al. arxiv:
12 The generative model Intrinsic Extrinsic Shape Texture Lights Pose Shape S; Texture T Lights L; Pose P Rendering Engine Face Occluder Face Occluder Image Image
13 Samples from the generative model Intrinsic Shape S; Texture T Extrinsic Lights L; Pose P Rendering Engine Face Occluder Image
14 Test occluders Bars Fence Curtain Door Blinds Low Medium High - Occlude face in stripes, mesh patterns, and large patches - Three levels of occlusion, ranging from 15-55% of face covered
15 Occluded face perception task
16 Occluded face perception task
17 Occluded face perception task Occluded à Unoccluded Same
18 Occluded face perception task
19 Occluded face perception task
20 Occluded face perception task Unoccluded à Occluded Different
21 Behavioral results Accuracy Occlusion Level Low Medium High Occluded à Unoccluded Unoccluded à Occluded Chance:.5
22 Naïve model: Learn to ignore the occluder CNN Shape S; Texture T Predict intrinsic and extrinsic face parameters directly. Can model become invariant to all types of occluders?
23 Naïve model: Learn to ignore the occluder Observation Reconstruction Observation Reconstruction Predict intrinsic and extrinsic face parameters directly. Can model become invariant to all types of occluders? No. Test occluders heavily influence face predictions.
24 Inverse graphics: Inverting the generative model Shape S; Texture T Lights L; Pose P Shape S; Texture T; Pose P Rendering Engine Efficient Inverse Graphics Network (EIG) Face Occluder Layer Reconstruction Network (LR) Image [Also see: Egger, et al. Occlusion-aware 3D Morphable Face Models. BMVC 2016.]
25 Inverse graphics: Inverting the generative model Shape S; Texture T Lights L; Pose P Shape S; Texture T; Pose P Rendering Engine Efficient Inverse Graphics Network (EIG) Face Occluder Layer Reconstruction Network (LR) Image [Also see: Egger, et al. Occlusion-aware 3D Morphable Face Models. BMVC 2016.]
26 Observed Layer Reconstructions Rendered EIG Face Occluder Ground truth Modeling causes allows for: (1) face predictions that are invariant to occluders (2) better generalization to unseen occluders
27 Prediction human judgments Occluded à Unoccluded Low Medium High Unoccluded à Occluded Low Medium High Spearman rank correlation Image-space comparison Latent-space baseline VGG
28 1. Occluded face perception with causal and compositional models 2. Automatic training-free vision-to-touch crossmodal transfer 3. Learning visual causal models for generic object categories Composite Texture Shape Lighting
29 Vision-to-touch transfer Study Test Shape is explicitly represented Immediate transfer to haptic domain Can make judgments based on grasping joint angles
30 Shape, S; Texture, T Vision-to-touch transfer Approximate human performance [3] Grasping Engine Joint Angles [3] Dopjans et al, IEEE Transac*ons on Hap*cs, 2000
31 1. Occluded face perception with causal and compositional models 2. Automatic training-free vision-to-touch crossmodal transfer 3. Learning visual causal models for generic object categories Composite Texture Shape Lighting
32 Causal models of generic objects Causal intermediate stage visual representations of reflectance, shape, and lighting
33 c Inputs Shader Outputs Decomposing generic objects Causal intermediate stage visual representations of reflectance, shape, and lighting Learn both the recognition model and the rendering function
34 Decomposing generic objects Train shapes Test shapes Causal intermediate stage visual representations of reflectance, shape, and lighting Learn both the recognition model and the rendering function Representation learning driven by reconstruction error
35 Decomposing generic objects Observation Initial prediction Unsupervised improvement Causal intermediate stage visual representations of reflectance, shape, and lighting Learn both the recognition model and the rendering function Representation learning driven by reconstruction error
36 Summary - Causal and compositional models better predict human judgments - Modeling causes allows for training-free crossmodal transfer - Causal models can improve internal representations without supervision
37 Thank you Ilker Yildirim Mario Belledonne Christian Wallraven Winrich Freiwald Joshua Tenenbaum
38 Comparison Pipeline Predict latents of both faces Latents Shape, S; Texture, T Pose; P Latent-space comparison Approximate Renderer Shape, S; Texture, T Pose; P Latents Render study face with pose of test face Occlude rendering and test image with mask Compare in image (or latent) space Study Image-space comparison Test
Causal and compositional generative models in online perception
Causal and compositional generative models in online perception Ilker Yildirim*1 (ilkery@mit.edu), Michael Janner*1 (janner@mit.edu), Mario Belledonne1 (belledon@mit.edu), Christian Wallraven2 (christian.wallraven@gmail.com),
More informationDeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material
DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material Yi Li 1, Gu Wang 1, Xiangyang Ji 1, Yu Xiang 2, and Dieter Fox 2 1 Tsinghua University, BNRist 2 University of Washington
More information3D Face Modelling Under Unconstrained Pose & Illumination
David Bryan Ottawa-Carleton Institute for Biomedical Engineering Department of Systems and Computer Engineering Carleton University January 12, 2009 Agenda Problem Overview 3D Morphable Model Fitting Model
More informationSelf-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250 Hz Supplemental Material
Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250 Hz Supplemental Material Ayush Tewari 1,2 Michael Zollhöfer 1,2,3 Pablo Garrido 1,2 Florian Bernard 1,2 Hyeongwoo
More informationJoint design of data analysis algorithms and user interface for video applications
Joint design of data analysis algorithms and user interface for video applications Nebojsa Jojic Microsoft Research Sumit Basu Microsoft Research Nemanja Petrovic University of Illinois Brendan Frey University
More informationDeep Convolutional Inverse Graphics Network
Deep Convolutional Inverse Graphics Network Tejas D. Kulkarni* 1, William F. Whitney* 2, Pushmeet Kohli 3, Joshua B. Tenenbaum 4 1,2,4 Massachusetts Institute of Technology, Cambridge, USA 3 Microsoft
More informationLEARNING TO GENERATE CHAIRS WITH CONVOLUTIONAL NEURAL NETWORKS
LEARNING TO GENERATE CHAIRS WITH CONVOLUTIONAL NEURAL NETWORKS Alexey Dosovitskiy, Jost Tobias Springenberg and Thomas Brox University of Freiburg Presented by: Shreyansh Daftry Visual Learning and Recognition
More informationDeep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing
Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material Introduction In this supplementary material, Section 2 details the 3D annotation for CAD models and real
More informationFundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision
Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision What Happened Last Time? Human 3D perception (3D cinema) Computational stereo Intuitive explanation of what is meant by disparity Stereo matching
More informationFlow-Based Video Recognition
Flow-Based Video Recognition Jifeng Dai Visual Computing Group, Microsoft Research Asia Joint work with Xizhou Zhu*, Yuwen Xiong*, Yujie Wang*, Lu Yuan and Yichen Wei (* interns) Talk pipeline Introduction
More informationFace Alignment Across Large Poses: A 3D Solution
Face Alignment Across Large Poses: A 3D Solution Outline Face Alignment Related Works 3D Morphable Model Projected Normalized Coordinate Code Network Structure 3D Image Rotation Performance on Datasets
More informationComponent-based Face Recognition with 3D Morphable Models
Component-based Face Recognition with 3D Morphable Models Jennifer Huang 1, Bernd Heisele 1,2, and Volker Blanz 3 1 Center for Biological and Computational Learning, M.I.T., Cambridge, MA, USA 2 Honda
More informationOvercoming Occlusion with Inverse Graphics.
Overcoming Occlusion with Inverse Graphics. Pol Moreno 1, Christopher K.I. Williams 1, Charlie Nash 1, Pushmeet Kohli 2 1 School of Informatics, University of Edinburgh p.moreno-comellas@sms.ed.ac.uk,ckiw@inf.ed.ac.uk,
More informationLight Field Appearance Manifolds
Light Field Appearance Manifolds Chris Mario Christoudias, Louis-Philippe Morency, and Trevor Darrell Computer Science and Artificial Intelligence Laboratory Massachussetts Institute of Technology Cambridge,
More informationThe Isometric Self-Organizing Map for 3D Hand Pose Estimation
The Isometric Self-Organizing Map for 3D Hand Pose Estimation Haiying Guan, Rogerio S. Feris, and Matthew Turk Computer Science Department University of California, Santa Barbara {haiying,rferis,mturk}@cs.ucsb.edu
More informationDeep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material
Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material Chi Li, M. Zeeshan Zia 2, Quoc-Huy Tran 2, Xiang Yu 2, Gregory D. Hager, and Manmohan Chandraker 2 Johns
More informationLecture 24: More on Reflectance CAP 5415
Lecture 24: More on Reflectance CAP 5415 Recovering Shape We ve talked about photometric stereo, where we assumed that a surface was diffuse Could calculate surface normals and albedo What if the surface
More informationSupplementary Material for Synthesizing Normalized Faces from Facial Identity Features
Supplementary Material for Synthesizing Normalized Faces from Facial Identity Features Forrester Cole 1 David Belanger 1,2 Dilip Krishnan 1 Aaron Sarna 1 Inbar Mosseri 1 William T. Freeman 1,3 1 Google,
More information3D Active Appearance Model for Aligning Faces in 2D Images
3D Active Appearance Model for Aligning Faces in 2D Images Chun-Wei Chen and Chieh-Chih Wang Abstract Perceiving human faces is one of the most important functions for human robot interaction. The active
More informationDetecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA)
Detecting and Parsing of Visual Objects: Humans and Animals Alan Yuille (UCLA) Summary This talk describes recent work on detection and parsing visual objects. The methods represent objects in terms of
More informationA Morphable Model for the Synthesis of 3D Faces
A Morphable Model for the Synthesis of 3D Faces Marco Nef Volker Blanz, Thomas Vetter SIGGRAPH 99, Los Angeles Presentation overview Motivation Introduction Database Morphable 3D Face Model Matching a
More informationPassive 3D Photography
SIGGRAPH 2000 Course on 3D Photography Passive 3D Photography Steve Seitz Carnegie Mellon University University of Washington http://www.cs cs.cmu.edu/~ /~seitz Visual Cues Shading Merle Norman Cosmetics,
More information3D FACE RECONSTRUCTION BASED ON EPIPOLAR GEOMETRY
IJDW Volume 4 Number January-June 202 pp. 45-50 3D FACE RECONSRUCION BASED ON EPIPOLAR GEOMERY aher Khadhraoui, Faouzi Benzarti 2 and Hamid Amiri 3,2,3 Signal, Image Processing and Patterns Recognition
More informationComponent-based Face Recognition with 3D Morphable Models
Component-based Face Recognition with 3D Morphable Models B. Weyrauch J. Huang benjamin.weyrauch@vitronic.com jenniferhuang@alum.mit.edu Center for Biological and Center for Biological and Computational
More informationThe Hilbert Problems of Computer Vision. Jitendra Malik UC Berkeley & Google, Inc.
The Hilbert Problems of Computer Vision Jitendra Malik UC Berkeley & Google, Inc. This talk The computational power of the human brain Research is the art of the soluble Hilbert problems, circa 2004 Hilbert
More informationDataset Augmentation for Pose and Lighting Invariant Face Recognition
Dataset Augmentation for Pose and Lighting Invariant Face Recognition Daniel Crispell, Octavian Biris, Nate Crosswhite, Jeffrey Byrne, Joseph L. Mundy Vision Systems, Inc. Systems and Technology Research
More informationDEEP BLIND IMAGE QUALITY ASSESSMENT
DEEP BLIND IMAGE QUALITY ASSESSMENT BY LEARNING SENSITIVITY MAP Jongyoo Kim, Woojae Kim and Sanghoon Lee ICASSP 2018 Deep Learning and Convolutional Neural Networks (CNNs) SOTA in computer vision & image
More informationCPSC 532E Week 6: Lecture. Surface Perception; Completion
Week 6: Lecture Surface Perception; Completion Reflectance functions Shape from shading; shape from texture Visual Completion Figure/Ground ACM Transactions on Applied Perception - Call for papers Numerous
More informationSeeing the unseen. Data-driven 3D Understanding from Single Images. Hao Su
Seeing the unseen Data-driven 3D Understanding from Single Images Hao Su Image world Shape world 3D perception from a single image Monocular vision a typical prey a typical predator Cited from https://en.wikipedia.org/wiki/binocular_vision
More informationVideo Google faces. Josef Sivic, Mark Everingham, Andrew Zisserman. Visual Geometry Group University of Oxford
Video Google faces Josef Sivic, Mark Everingham, Andrew Zisserman Visual Geometry Group University of Oxford The objective Retrieve all shots in a video, e.g. a feature length film, containing a particular
More informationLight Field Occlusion Removal
Light Field Occlusion Removal Shannon Kao Stanford University kaos@stanford.edu Figure 1: Occlusion removal pipeline. The input image (left) is part of a focal stack representing a light field. Each image
More informationMulti-View AAM Fitting and Camera Calibration
To appear in the IEEE International Conference on Computer Vision Multi-View AAM Fitting and Camera Calibration Seth Koterba, Simon Baker, Iain Matthews, Changbo Hu, Jing Xiao, Jeffrey Cohn, and Takeo
More informationThe correspondence problem. A classic problem. A classic problem. Deformation-Drive Shape Correspondence. Fundamental to geometry processing
The correspondence problem Deformation-Drive Shape Correspondence Hao (Richard) Zhang 1, Alla Sheffer 2, Daniel Cohen-Or 3, Qingnan Zhou 2, Oliver van Kaick 1, and Andrea Tagliasacchi 1 July 3, 2008 1
More informationPerceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington
Perceiving the 3D World from Images and Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World 3 Understand
More informationLearning to Estimate 3D Human Pose and Shape from a Single Color Image Supplementary material
Learning to Estimate 3D Human Pose and Shape from a Single Color Image Supplementary material Georgios Pavlakos 1, Luyang Zhu 2, Xiaowei Zhou 3, Kostas Daniilidis 1 1 University of Pennsylvania 2 Peking
More informationHuman Face Shape Analysis under Spherical Harmonics Illumination Considering Self Occlusion
Human Face Shape Analysis under Spherical Harmonics Illumination Considering Self Occlusion Jasenko Zivanov Andreas Forster Sandro Schönborn Thomas Vetter jasenko.zivanov@unibas.ch forster.andreas@gmail.com
More informationAn Overview of Matchmoving using Structure from Motion Methods
An Overview of Matchmoving using Structure from Motion Methods Kamyar Haji Allahverdi Pour Department of Computer Engineering Sharif University of Technology Tehran, Iran Email: allahverdi@ce.sharif.edu
More informationDeformable 3D Reconstruction with an Object Database
Pablo F. Alcantarilla and Adrien Bartoli BMVC 2012 1 of 22 Pablo F. Alcantarilla and Adrien Bartoli ISIT-UMR 6284 CNRS, Université d Auvergne, Clermont Ferrand, France BMVC 2012, Guildford, UK Pablo F.
More informationStereo. 11/02/2012 CS129, Brown James Hays. Slides by Kristen Grauman
Stereo 11/02/2012 CS129, Brown James Hays Slides by Kristen Grauman Multiple views Multi-view geometry, matching, invariant features, stereo vision Lowe Hartley and Zisserman Why multiple views? Structure
More informationBeyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Adding spatial information Forming vocabularies from pairs of nearby features doublets
More information3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis
3D Shape Analysis with Multi-view Convolutional Networks Evangelos Kalogerakis 3D model repositories [3D Warehouse - video] 3D geometry acquisition [KinectFusion - video] 3D shapes come in various flavors
More informationECE 6554:Advanced Computer Vision Pose Estimation
ECE 6554:Advanced Computer Vision Pose Estimation Sujay Yadawadkar, Virginia Tech, Agenda: Pose Estimation: Part Based Models for Pose Estimation Pose Estimation with Convolutional Neural Networks (Deep
More informationECCV Presented by: Boris Ivanovic and Yolanda Wang CS 331B - November 16, 2016
ECCV 2016 Presented by: Boris Ivanovic and Yolanda Wang CS 331B - November 16, 2016 Fundamental Question What is a good vector representation of an object? Something that can be easily predicted from 2D
More informationPix2Face: Direct 3D Face Model Estimation
Pix2Face: Direct 3D Face Model Estimation Daniel Crispell Maxim Bazik Vision Systems, Inc Providence, RI USA danielcrispell, maximbazik@visionsystemsinccom Abstract An efficient, fully automatic method
More informationSupplemental Material for Face Alignment Across Large Poses: A 3D Solution
Supplemental Material for Face Alignment Across Large Poses: A 3D Solution Xiangyu Zhu 1 Zhen Lei 1 Xiaoming Liu 2 Hailin Shi 1 Stan Z. Li 1 1 Institute of Automation, Chinese Academy of Sciences 2 Department
More informationRealtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields Authors: Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh Presented by: Suraj Kesavan, Priscilla Jennifer ECS 289G: Visual Recognition
More informationDiscrete Optimization of Ray Potentials for Semantic 3D Reconstruction
Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction Marc Pollefeys Joined work with Nikolay Savinov, Christian Haene, Lubor Ladicky 2 Comparison to Volumetric Fusion Higher-order ray
More informationReal-time Object Detection CS 229 Course Project
Real-time Object Detection CS 229 Course Project Zibo Gong 1, Tianchang He 1, and Ziyi Yang 1 1 Department of Electrical Engineering, Stanford University December 17, 2016 Abstract Objection detection
More informationSurfNet: Generating 3D shape surfaces using deep residual networks-supplementary Material
SurfNet: Generating 3D shape surfaces using deep residual networks-supplementary Material Ayan Sinha MIT Asim Unmesh IIT Kanpur Qixing Huang UT Austin Karthik Ramani Purdue sinhayan@mit.edu a.unmesh@gmail.com
More informationSupplementary Materials for A Lighting Robust Fitting Approach of 3D Morphable Model For Face Reconstruction
Noname manuscript No. (will be inserted by the editor) Supplementary Materials for A Lighting Robust Fitting Approach of 3D Morphable Model For Face Reconstruction Mingyang Ma Silong Peng Xiyuan Hu Received:
More informationOptimizing Monocular Cues for Depth Estimation from Indoor Images
Optimizing Monocular Cues for Depth Estimation from Indoor Images Aditya Venkatraman 1, Sheetal Mahadik 2 1, 2 Department of Electronics and Telecommunication, ST Francis Institute of Technology, Mumbai,
More information3D Shape Segmentation with Projective Convolutional Networks
3D Shape Segmentation with Projective Convolutional Networks Evangelos Kalogerakis 1 Melinos Averkiou 2 Subhransu Maji 1 Siddhartha Chaudhuri 3 1 University of Massachusetts Amherst 2 University of Cyprus
More informationBilevel Sparse Coding
Adobe Research 345 Park Ave, San Jose, CA Mar 15, 2013 Outline 1 2 The learning model The learning algorithm 3 4 Sparse Modeling Many types of sensory data, e.g., images and audio, are in high-dimensional
More informationMulti-view stereo. Many slides adapted from S. Seitz
Multi-view stereo Many slides adapted from S. Seitz Beyond two-view stereo The third eye can be used for verification Multiple-baseline stereo Pick a reference image, and slide the corresponding window
More informationLecture 14: Computer Vision
CS/b: Artificial Intelligence II Prof. Olga Veksler Lecture : Computer Vision D shape from Images Stereo Reconstruction Many Slides are from Steve Seitz (UW), S. Narasimhan Outline Cues for D shape perception
More informationReal-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor Supplemental Document
Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor Supplemental Document Franziska Mueller 1,2 Dushyant Mehta 1,2 Oleksandr Sotnychenko 1 Srinath Sridhar 1 Dan Casas 3 Christian Theobalt
More informationAbstract We present a system which automatically generates a 3D face model from a single frontal image of a face. Our system consists of two component
A Fully Automatic System To Model Faces From a Single Image Zicheng Liu Microsoft Research August 2003 Technical Report MSR-TR-2003-55 Microsoft Research Microsoft Corporation One Microsoft Way Redmond,
More informationStructured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov
Structured Light II Johannes Köhler Johannes.koehler@dfki.de Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov Introduction Previous lecture: Structured Light I Active Scanning Camera/emitter
More informationFACIAL ANIMATION FROM SEVERAL IMAGES
International Archives of Photogrammetry and Remote Sensing. Vol. XXXII, Part 5. Hakodate 1998 FACIAL ANIMATION FROM SEVERAL IMAGES Yasuhiro MUKAIGAWAt Yuichi NAKAMURA+ Yuichi OHTA+ t Department of Information
More informationApplying Synthetic Images to Learning Grasping Orientation from Single Monocular Images
Applying Synthetic Images to Learning Grasping Orientation from Single Monocular Images 1 Introduction - Steve Chuang and Eric Shan - Determining object orientation in images is a well-established topic
More informationSupplementary Material Estimating Correspondences of Deformable Objects In-the-wild
Supplementary Material Estimating Correspondences of Deformable Objects In-the-wild Yuxiang Zhou Epameinondas Antonakos Joan Alabort-i-Medina Anastasios Roussos Stefanos Zafeiriou, Department of Computing,
More informationLandmark Detection and 3D Face Reconstruction using Modern C++
Landmark Detection and 3D Face Reconstruction using Modern C++ Patrik Huber Centre for Vision, Speech and Signal Processing University of Surrey, UK p.huber@surrey.ac.uk BMVA technical meeting: The Computational
More informationarxiv: v2 [cs.cv] 24 Apr 2018
Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction Shubham Tulsiani, Alexei A. Efros, Jitendra Malik University of California, Berkeley {shubhtuls, efros, malik}@eecs.berkeley.edu
More informationEstimation of Face Depth Maps from Color Textures using Canonical Correlation Analysis
Computer Vision Winter Workshop 006, Ondřej Chum, Vojtěch Franc (eds.) Telč, Czech Republic, February 6 8 Czech Pattern Recognition Society Estimation of Face Depth Maps from Color Textures using Canonical
More information3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington
3D Object Recognition and Scene Understanding from RGB-D Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World
More informationProf. Trevor Darrell Lecture 18: Multiview and Photometric Stereo
C280, Computer Vision Prof. Trevor Darrell trevor@eecs.berkeley.edu Lecture 18: Multiview and Photometric Stereo Today Multiview stereo revisited Shape from large image collections Voxel Coloring Digital
More informationCS 231A Computer Vision (Winter 2014) Problem Set 3
CS 231A Computer Vision (Winter 2014) Problem Set 3 Due: Feb. 18 th, 2015 (11:59pm) 1 Single Object Recognition Via SIFT (45 points) In his 2004 SIFT paper, David Lowe demonstrates impressive object recognition
More informationFace View Synthesis Across Large Angles
Face View Synthesis Across Large Angles Jiang Ni and Henry Schneiderman Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 1513, USA Abstract. Pose variations, especially large out-of-plane
More informationBOP: Benchmark for 6D Object Pose Estimation
BOP: Benchmark for 6D Object Pose Estimation Hodan, Michel, Brachmann, Kehl, Buch, Kraft, Drost, Vidal, Ihrke, Zabulis, Sahin, Manhardt, Tombari, Kim, Matas, Rother 4th International Workshop on Recovering
More informationNonrigid Surface Modelling. and Fast Recovery. Department of Computer Science and Engineering. Committee: Prof. Leo J. Jia and Prof. K. H.
Nonrigid Surface Modelling and Fast Recovery Zhu Jianke Supervisor: Prof. Michael R. Lyu Committee: Prof. Leo J. Jia and Prof. K. H. Wong Department of Computer Science and Engineering May 11, 2007 1 2
More informationSingle Image Super-resolution. Slides from Libin Geoffrey Sun and James Hays
Single Image Super-resolution Slides from Libin Geoffrey Sun and James Hays Cs129 Computational Photography James Hays, Brown, fall 2012 Types of Super-resolution Multi-image (sub-pixel registration) Single-image
More informationELL 788 Computational Perception & Cognition July November 2015
ELL 788 Computational Perception & Cognition July November 2015 Module 6 Role of context in object detection Objects and cognition Ambiguous objects Unfavorable viewing condition Context helps in object
More informationRecovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform. Xintao Wang Ke Yu Chao Dong Chen Change Loy
Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform Xintao Wang Ke Yu Chao Dong Chen Change Loy Problem enlarge 4 times Low-resolution image High-resolution image Previous
More informationMoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction
MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction Ayush Tewari Michael Zollhofer Hyeongwoo Kim Pablo Garrido Florian Bernard Patrick Perez Christian Theobalt
More informationImage Processing. Filtering. Slide 1
Image Processing Filtering Slide 1 Preliminary Image generation Original Noise Image restoration Result Slide 2 Preliminary Classic application: denoising However: Denoising is much more than a simple
More informationREAL-TIME FACE SWAPPING IN VIDEO SEQUENCES: MAGIC MIRROR
REAL-TIME FACE SWAPPING IN VIDEO SEQUENCES: MAGIC MIRROR Nuri Murat Arar1, Fatma Gu ney1, Nasuh Kaan Bekmezci1, Hua Gao2 and Hazım Kemal Ekenel1,2,3 1 Department of Computer Engineering, Bogazici University,
More informationFaster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun Presented by Tushar Bansal Objective 1. Get bounding box for all objects
More informationSupplementary: Cross-modal Deep Variational Hand Pose Estimation
Supplementary: Cross-modal Deep Variational Hand Pose Estimation Adrian Spurr, Jie Song, Seonwook Park, Otmar Hilliges ETH Zurich {spurra,jsong,spark,otmarh}@inf.ethz.ch Encoder/Decoder Linear(512) Table
More informationarxiv: v1 [cs.lg] 5 Jan 2016
Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis Jimei Yang 1,3 Scott Reed 2 Ming-Hsuan Yang 1 Honglak Lee 2 arxiv:1601.00706v1 [cs.lg] 5 Jan 2016 1 University of California,
More informationArticulated Pose Estimation with Flexible Mixtures-of-Parts
Articulated Pose Estimation with Flexible Mixtures-of-Parts PRESENTATION: JESSE DAVIS CS 3710 VISUAL RECOGNITION Outline Modeling Special Cases Inferences Learning Experiments Problem and Relevance Problem:
More information3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.
3D Computer Vision Structured Light II Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Introduction
More informationMultiple View Geometry
Multiple View Geometry Martin Quinn with a lot of slides stolen from Steve Seitz and Jianbo Shi 15-463: Computational Photography Alexei Efros, CMU, Fall 2007 Our Goal The Plenoptic Function P(θ,φ,λ,t,V
More informationInverse Rendering with a Morphable Model: A Multilinear Approach
ALDRIAN, SMITH: INVERSE RENDERING OF FACES WITH A MORPHABLE MODEL 1 Inverse Rendering with a Morphable Model: A Multilinear Approach Oswald Aldrian oswald@cs.york.ac.uk William A. P. Smith wsmith@cs.york.ac.uk
More informationSynscapes A photorealistic syntehtic dataset for street scene parsing Jonas Unger Department of Science and Technology Linköpings Universitet.
Synscapes A photorealistic syntehtic dataset for street scene parsing Jonas Unger Department of Science and Technology Linköpings Universitet 7D Labs VINNOVA https://7dlabs.com Photo-realistic image synthesis
More informationarxiv: v1 [cs.cv] 30 Sep 2018
3D-PSRNet: Part Segmented 3D Point Cloud Reconstruction From a Single Image Priyanka Mandikal, Navaneet K L, and R. Venkatesh Babu arxiv:1810.00461v1 [cs.cv] 30 Sep 2018 Indian Institute of Science, Bangalore,
More informationOccluded Facial Expression Tracking
Occluded Facial Expression Tracking Hugo Mercier 1, Julien Peyras 2, and Patrice Dalle 1 1 Institut de Recherche en Informatique de Toulouse 118, route de Narbonne, F-31062 Toulouse Cedex 9 2 Dipartimento
More informationObject Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR
Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization
More informationFinding Tiny Faces Supplementary Materials
Finding Tiny Faces Supplementary Materials Peiyun Hu, Deva Ramanan Robotics Institute Carnegie Mellon University {peiyunh,deva}@cs.cmu.edu 1. Error analysis Quantitative analysis We plot the distribution
More informationObject Recognition. Lecture 11, April 21 st, Lexing Xie. EE4830 Digital Image Processing
Object Recognition Lecture 11, April 21 st, 2008 Lexing Xie EE4830 Digital Image Processing http://www.ee.columbia.edu/~xlx/ee4830/ 1 Announcements 2 HW#5 due today HW#6 last HW of the semester Due May
More informationLEARNING TO INFER GRAPHICS PROGRAMS FROM HAND DRAWN IMAGES
LEARNING TO INFER GRAPHICS PROGRAMS FROM HAND DRAWN IMAGES Kevin Ellis - MIT, Daniel Ritchie - Brown University, Armando Solar-Lezama - MIT, Joshua b. Tenenbaum - MIT Presented by : Maliha Arif Advanced
More informationComputer Vision: Making machines see
Computer Vision: Making machines see Roberto Cipolla Department of Engineering http://www.eng.cam.ac.uk/~cipolla/people.html http://www.toshiba.eu/eu/cambridge-research- Laboratory/ Vision: what is where
More informationVideo Google: A Text Retrieval Approach to Object Matching in Videos
Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic, Frederik Schaffalitzky, Andrew Zisserman Visual Geometry Group University of Oxford The vision Enable video, e.g. a feature
More informationVision-as-Inverse-Graphics: Obtaining a Rich 3D Explanation of a Scene from a Single Image
Vision-as-Inverse-Graphics: Obtaining a Rich 3D Explanation of a Scene from a Single Image Lukasz Romaszko 1 Christopher K.I. Williams 1,2 Pol Moreno 1 Pushmeet Kohli 3,* 1 School of Informatics, University
More informationR-FCN: Object Detection with Really - Friggin Convolutional Networks
R-FCN: Object Detection with Really - Friggin Convolutional Networks Jifeng Dai Microsoft Research Li Yi Tsinghua Univ. Kaiming He FAIR Jian Sun Microsoft Research NIPS, 2016 Or Region-based Fully Convolutional
More informationGAN Related Works. CVPR 2018 & Selective Works in ICML and NIPS. Zhifei Zhang
GAN Related Works CVPR 2018 & Selective Works in ICML and NIPS Zhifei Zhang Generative Adversarial Networks (GANs) 9/12/2018 2 Generative Adversarial Networks (GANs) Feedforward Backpropagation Real? z
More informationDeep Learning in Image Processing
Deep Learning in Image Processing Roland Memisevic University of Montreal & TwentyBN ICISP 2016 Roland Memisevic Deep Learning in Image Processing ICISP 2016 f 2? cathedral high-rise f 1 It s the features,
More informationarxiv: v1 [cs.cv] 16 Nov 2015
Coarse-to-fine Face Alignment with Multi-Scale Local Patch Regression Zhiao Huang hza@megvii.com Erjin Zhou zej@megvii.com Zhimin Cao czm@megvii.com arxiv:1511.04901v1 [cs.cv] 16 Nov 2015 Abstract Facial
More informationPose Normalization via Learned 2D Warping for Fully Automatic Face Recognition
A ASTHANA ET AL: POSE NORMALIZATION VIA LEARNED D WARPING 1 Pose Normalization via Learned D Warping for Fully Automatic Face Recognition Akshay Asthana 1, aasthana@rsiseanueduau Michael J Jones 1 and
More informationMulti-view 3D Models from Single Images with a Convolutional Network
Multi-view 3D Models from Single Images with a Convolutional Network Maxim Tatarchenko University of Freiburg Skoltech - 2nd Christmas Colloquium on Computer Vision Humans have prior knowledge about 3D
More informationObject Purpose Based Grasping
Object Purpose Based Grasping Song Cao, Jijie Zhao Abstract Objects often have multiple purposes, and the way humans grasp a certain object may vary based on the different intended purposes. To enable
More information