EECS 442 Computer vision. 3D Object Recognition and. Scene Understanding

Size: px
Start display at page:

Download "EECS 442 Computer vision. 3D Object Recognition and. Scene Understanding"

Transcription

1 EECS 442 Computer vision 3D Object Recognition and Scene Understanding

2 Interpreting the visual world Object: Building 8-10 meters away Object: Traffic light Object: Car, ¾ view 2-3 meters away

3 How can we achieve all of this? 3D modeling no semantic Semantic reasoning no 3D geometry Joint 3D modeling and semantic reasoning Chen & Medioni, 92 Debevec et al 96 Pollefeys et al 02 Nister 04 Hartley & Zisserman, 00 Levoy et al., 00 Brown & Lowe, 04 Schindler et al 08 Snavely et al 08 Agarwal et al 09 Etc

4 How can we achieve all of this? 3D modeling no semantic Semantic no 3D geometry Weber et al. 00 Felzenszwalb & Huttenlocher, 00 Ullman et al. 02 Fergus et al. 03 Torralba et al. 03 Leibe & Schiele, 04 Kumar & Hebert 04 Fei-Fei & Perona, 05 Sivic et al. 05 Shotton et al 05 Grauman et al. 05 Lazebnik et al, 06 Maji & Malik, 07 Vedaldi & Soatto 08 Zhu et al 08 Etc

5 Courtesy of Adan et al, 2011 How can we achieve all of this? 3D modeling no semantic Semantic no 3D geometry Semantic from range data disjoint 3D modeling and recognition Huber 01 Rusu et al. 08 Brostow et al. 08 Son & Kim 10 Tang et al. 10 Adan et al. 11 etc

6 How can we achieve all of this? 3D modeling no semantic Semantic no 3D geometry Semantic from range data disjoint 3D modeling and recognition Joint 3D modeling and semantic reasoning Hoiem et al Gould et al. 09 Hedau et al. 09 Gupta et al, 10 Ladick y et al, 10 Bao, Sun, Savarese 10 Sun, Bao, Savarese 10 Bao & Savarese, 11

7 Joint 3D modeling and recognition Objects and their geometrical attributes provide constraints for estimating the scene layout Given the scene the layout, objects can be detected more robustly

8 In this lecture. 3D Object detectors Robust to view point transformation Allow to estimate pose, scale and 3D shape Methods for coherent object detection and scene layout estimation single view multi-view videos

9 3D Object Detectors Azimuth, Zenith Viewing sphere Detect objects under generic view points Estimate object pose General and work for any object category

10 3D Object Detectors Detect objects under generic view points Estimate object pose General and work for any object category

11 3D Object Categorization

12 Single view object categorization Leung et al 99 Weber et al. 00 Ullman et al. 02 Fergus et al. 03 Torralba et al. 03 Felzenszwalb & Huttenlocher 03 Fei-Fei et al. 04 Leibe et al. 04 Kumar & Hebert 04 Sivic et al. 05 Shotton et al 05 Grauman et al. 05 Sudderth et al 05 Torralba et al. 05 Lazebnik et al. 06 Todorovic et al. 06 Bosh et al 07 Vedaldi & Soatto 08

13 Single 3D object recognition Ballard, 81 Grimson & L.-Perez, 87 Lowe, 87 Edelman et al. 91 Ullman & Barsi, 91 Rothwell 92 Linderberg, 94 Murase & Nayar 94 Zhang et al 95 Schmid & Mohr, 96 Schiele & Crowley, 96 Lowe, 99 Jacob & Barsi, 99 Rothganger et al., 04 Ferrari et al, 05 Brown & Lowe 05 Snavely et al 06 Yin & Collins, 07

14 3D Object Categorization

15 3D Object Categorization Mixture of 2D single view models Weber et al. 00 Schneiderman et al. 01 Bart et al. 04 Gu & Ren, 10 3D models - Explicit 3d models - Implicit 3d models Chiu et al. 07 Hoiem, et al., 07 Yan, et al. 07 Thomas et al. 06 Kushal, et al., 07 Savarese et al, 07, 08

16 Mixture of 2D models Weber et al. 00 Schneiderman et al. 01 Ullman et al. 02 Fergus et al. 03 Torralba et al. 03 Felzenszwalb & Huttenlocher 03 Leibe et al. 04 Shotton et al. 05 Grauman et al. 05 Savarese et al, 06 Todorovic et al. 06 Vedaldi & Soatto 08 Zhu et al 08 Gu & Ren, 10 Single view model Single view model 3D Category model CONS: Single view models are independent Non scalable to large number of categories/view-points Just b. boxes Cannot estimate 3D pose or 3D layout

17 3D Object Categorization Mixture of 2D single view models Weber et al. 00 Schneiderman et al. 01 Bart et al. 04 Gu & Ren, 10 3D models - Implicit 3d models - Explicit 3d models Chiu et al. 07 Hoiem, et al., 07 Yan, et al. 07. Xiang & Savarese 12 Thomas et al. 06 Kushal, et al., 07 Savarese et al, 07, 08 Sun et al. 09

18 Implicit 3D models 3D Category model Sparse set of interest points or parts of the objects are linked across views by implicit 3D transformations (H, F)

19 Linking features or parts across views: Perspective or affine transformation constraints x x x = H x

20 Linking features or parts across views: Epipolar Transformation Constraints x l = F T x x l = F T x x l

21 Implicit 3D models by ISM representations Thomas et al. 06 Leibe et al. 04 Multi-view model Sparse set of interest points or parts of the objects are linked across views.

22 Implicit 3D models by ISM representations Region tracks [Ferrari et al. 04, 06] Courtesy of Thomas et al. 06 Set of region-tracks connecting model views Each track is composed of image regions of a single physical surface patch along the model views in which it is visible.

23 Results Implicit 3D models by ISM representations

24 Implicit 3D models by graph-based representations Savarese, Fei-Fei, ICCV 07 Savarese, Fei-Fei, ECCV 08 Sun, et al, CVPR 2009, ICCV 09 Canonical parts captures view invariant diagnostic appearance information 2d ½ structure linking parts via weak geometry Parts and relationship are modeled in a probabilistic fashion Parameters are learnt so as to maximize detection accuracy

25 Parameterization on view-sphere S T Model the object as collection of parts for any T and S on the viewing sphere

26 Multi-view generative part-based model T, S Yn=Codeword Xn=Location Yn=Codeword Xn=Location Image

27 Multi-view generative part-based model = Part Prop. Prior T, S K V = Part Appearance = Part Location/shape A Yn=Codeword Xn=Location Image

28 Multi-view generative part-based model Learning: estimate the latent variables and relevant parameters, given the observations Variational EM can be used Blei, ICML = Part Prop. Prior T, S K V = Part Appearance = Part Location/shape A Yn=Codeword Xn=Location Image

29 Incorporating geometrical constraints T i m M i j Within triangle constraints: M i j m i m j m j Encoded as a penalty term in variational EM

30 Incorporating geometrical constraints = Center T = Shape View morphing constraints: Seitz &Dyer SIGGRAPH 96 Xiao & Shah CVIU 04 Encoded as a penalty term in variational EM

31 Incorporating geometrical constraints S. M. Seitz and C. R. Dyer, Proc. SIGGRAPH 96, 1996, 21-30

32 h h h h h O P P P I,,, : k k k k k O P P P I,,, : Sequential ransac J-linkage [toldo et al 07] Defining initial parts and part correspondences Initializing the model

33 Semi-supervised Class label Object bounding box No part labels No pose labels [unlike Sun CVPR 09] No need to observe same object instance from multiple views [unlike Savarese & Fei-Fei, 07, 08]

34 Incremental learning T Enable unorganized and on-line collection training images Increase efficiency in learning (no need large storage space)

35 Incremental learning T Assign new training image to a triangle of the view sphere Evidence of training image is used to update model parameters Re-estimate sufficient statistics in a iterative fashion

36 Evolution of learnt parts

37 Examples of learnt part-based models Car 38

38 Examples of learnt part-based models Travel iron 39

39 Experimental results Object detection from any viewing angles Accurate estimation of the object pose PASCAL 2006 dataset 3D Object Dataset

40 Car 41

41 Travel Iron 42

42 Detection - 3D Object Dataset Car Bicycle ROC ROC Our model Sun et al, CVPR 09 Savarese & Fei-Fei ICCV 07

43 Viewpoint Classification 3D object dataset 0º 45º 90º V1 V2 V3 135º V4 180º V5 225º V6 270º V7 315º V8 Classification Accuracy Our model Savarese ICCV 07

44 Predicting object appearance from novel views Viewing sphere?

45 Predicting object appearance [For natural scenes, see Hoiem et al 07; Saxena et al 07] from novel views Thomas et al 08 Cremer et al 09

46

47 Predicting object appearance from novel views Affine transformation Our model

48 3D Object Categorization Mixture of 2D single view models Weber et al. 00 Schneiderman et al. 01 Bart et al. 04 Gu & Ren, 10 3D models - explicit 3d models - Implict 3d models Chiu et al. 07 Hoiem, et al., 07 Yan, et al. 07 Thomas et al. 06 Kushal, et al., 07 Savarese et al, 07, 08

49 Explicit 3D Models Chiu et al. 07 Hoiem, et al., 07 Yan, et al. 07 Xiang & Svarese, 12 H ij 3D Category model Part configuration is modeled as a conditional random fields with maximal margin parameter estimation Enable 6DOF object pose estimation 3D layout estimation of object parts

50 Explicit 3D Models Xiang & Savarese, CVPR 12 Mixture of 2D single view models Weber et al. 00 Schneiderman et al. 01 Bart et al. 04 Gu & Ren, 10 3D models - Explicit 3D models - Implicit 3D models Chiu et al. 07 Hoiem, et al., 07 Yan, et al. 07 Thomas et al. 06 Kushal, et al., 07 Savarese et al, 07, 08 [3D object dataset, 07]

51 In this lecture. 3D Object detectors Robust to view point transformation Allow to estimate pose, scale and 3D shape Methods for coherent object detection and scene layout estimation single view multi-view videos

52 3D scene understanding from a single image Bao, Sun, Savarese, CVPR 2010; BMVC 2010; IJCV 2012 Coherent probabilistic model captures relationship between objects and supporting planes No assumptions on cameras Work both in indoors and outdoors Hoiem et al Gould et al. 09 Hedau et al. 09 Lee et al. 09, 10 Gupta et al, 10, 11 Tsai et al. 11

53 3D scene understanding from a single image Bao, Sun, Savarese, CVPR 2010; BMVC 2010; IJCV 2012 Coherent probabilistic model captures relationship between objects and supporting planes No assumptions on cameras Work both in indoors and outdoors Hoiem et al Gould et al. 09 Hedau et al. 09 Gupta et al, 10, 11

54 In this lecture. 3D Object detectors Robust to view point transformation Allow to estimate pose, scale and 3D shape Methods for coherent object detection and scene layout estimation single view multi-view videos

55 3D scene understanding from multiple images Bao & S. Savarese, CVPR 2011 Bao, Bagra, Savarese. CORP ICCV 2011 Bao, Bagra, Chao, Savarese, CVPR 2012 Bao, Xiang, Savarese, ECCV 2012 Semantic Structure from Motion (SSFM) Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T Huber 01 Rusu et al. 08 Brostow et al. 08 Son & Kim 10 Tang et al. 10 Adan et al. 11 etc

56 Semantic Structure from Motion (SSFM) Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

57 Factor graph Semantic Structure from Motion (SSFM) Y CO Y CQ Y CB Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

58 SSFM: point-level compatibility Y CQ Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

59 SSFM: point-level compatibility observation projection Point re-projection error Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T Tomasi & Kanade 92 Triggs et al 99 Soatto & Perona 99 Hartley & Zisserman 00 Dellaert et al. 00 Pollefeys & V. Gool 02 Nister 04 Brown & Lowe 07 Snavely et al. 08

60 SSFM: Object-level compatibility Y CO Object re-projection error Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

61 SSFM: Object-level compatibility Camera 1 Camera 2 Agreement with measurements is computed using position, pose and scale

62 SSFM: Object-level compatibility Class = car scale=1 pose= back A 3D object detector returns the confidence value (probability) that an object class c with scale s and pose p is found at x,y Savarese, Fei-Fei, ICCV 07 Savarese, Fei-Fei, ECCV 08 Su et al, ICCV 2009 Sun, et al, CVPR 2009 Sun et al, ECCV 2010 Yu & Savarese, CVPR 2012

63 SSFM: Object-level compatibility Class = car scale=3 pose= 3/4 A 3D object detector returns the confidence value (probability) that an object class c with scale s and pose p is found at x,y Savarese, Fei-Fei, ICCV 07 Savarese, Fei-Fei, ECCV 08 Su et al, ICCV 2009 Sun, et al, CVPR 2009 Sun et al, ECCV 2010 Yu & Savarese, CVPR 2012

64 SSFM: Object-level compatibility Class = car scale=1 pose= back Class = car scale=1 pose= 3/4 Camera 1 Camera 2 Efficiently implemented using a parallel computing architecture

65 SSFM: Region-level compatibility Y CB Region re-projection error Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

66 SSFM with interactions Bao, Bagra, Chao, Savarese CVPR 2012 Y QO Y CO Y OB Y CQ Y CB Y QB Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

67 SSFM with interactions Bao, Bagra, Chao, Savarese CVPR 2012 Object-Point Interactions: x x Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) x Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

68 SSFM with interactions Point-Region Interactions: x x x Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

69 SSFM with interactions Object-Region Interactions: Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

70 SSFM with interactions Object-Region Interactions: Measurements I Points (x,y,scale) Objects (x,y, scale, pose) Regions (x,y, pose) Model Parameters: Q = 3D points O = 3D objects B = 3D regions = cam. prm. K, R, T

71 Solving the SSFM problem Modified Markov Chain Monte Carlo (MCMC) sampling algorithm F. Dellaert, S. Seitz, S. Thrun, and C. Thorpe. Feature correspondence: A markov chain monte carlo approach. In NIPS, 2000 Initialization of the cameras, objects, and points are critical for the sampling Initialize configuration of cameras using: SFM consistency of object/region properties across views

72 Results Public Ford Campus Vision and LiDAR Dataset [Pandey et al, International Journal of Robotics Research, 2011] Object categories: Cars Ground truth depth provided by LiDAR In-house Office dataset Object categories: mugs, mice, keyboards Ground truth depth provided by Kinect In-house Street dataset Object categories: humans No ground truth depth available

73 Segmentation View N Detections View 1 Results Observations Joint reconstruction & recognition

74

75 Segmentation View N Detections View 1 Results Observations Joint reconstruction & recognition

76

77 View N Detections View 1 Results Observations Joint reconstruction & recognition SSFM Source code available!

78 FORD CAMPUS (cars) Object detection results Average precision in detecting objects (cars) in the 2D image DPM [1] SSFM (2011) with 2 views SSFM (2012) with 2 views SSFM (2012) with 4 views 54.5% 61.3% 62.8% 66.5% Accuracy in localizing objects in the 3D space (AP) Hoiem [2] SSFM [2011] SSFM [2012] FORD CAMPUS cars 21.4% 32.7% 43.1% OFFICE keyboards, mice, monitors 15.5% 20.2% 21.6% [1] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. TPAMI, 2009 [2] D. Hoiem, A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008.

79 Camera estimation results Camera parameter reconstruction errors Camera translation error SFM [1] SSFM (2011) SSFM (2012) FORD CAMPUS OFFICE STREET Camera rotation error SFM [1] SSFM (2011) <1 <1 <1 SSFM (2012) [1] N. Snavely, S. M. Seitz, and R. S. Szeliski. Modeling the world from internet photo collections. IJCV, (2), Nov. 2008

80 Results

81 e T Results

82 Source code available!

83 In this lecture. 3D Object detectors Robust to view point transformation Allow to estimate pose, scale and 3D shape Methods for coherent object detection and scene layout estimation single view multi-view videos

84 Joint 3D modeling and recognition from videos Monocular cameras Un-calibrated cameras Arbitrary motion Highly cluttered scenes Occlusion Background clutter Moving targets Wu et al 07, Breitenstein et al 09, Zhao et al 04 Ess et al 09 Choi & Shahid & Savarese, WMC 2010 Choi & Savarese, ECCV 2010

85 Joint tracking and camera estimation Interest Points in 3D Tracked Interest Points Camera Parameters Pedestrian Detections Target Location in 3D Easily add additional evidence 3d depth IMU, etc 5 frames/second! Code available on line soon! Ω : set of state variables Χ : set of observations

86

87 Safe Driving Applications

88 Autonomous navigation

89 Conclusions Intelligent vision requires joint reconstruction - recognition Geometry provides critical contextual cues for robust recognition High level semantics help establish robust geometrical constraints for reconstruction Within a single view Across views High level semantics help scalability in reconstruction problems Fewer images are needed with wider baseline

90 EECS 442 Computer vision Hope you have enjoyed this class! Good luck with your projects & presentations!

91

Lecture 15 Visual recognition

Lecture 15 Visual recognition Lecture 15 Visual recognition 3D object detection Introduction Single instance 3D object detectors Generic 3D object detectors Silvio Savarese Lecture 15-26-Feb-14 3D object detection Object: Building

More information

Methods for Representing and Recognizing 3D objects

Methods for Representing and Recognizing 3D objects Methods for Representing and Recognizing 3D objects part 1 Silvio Savarese University of Michigan at Ann Arbor Object: Building, 45º pose, 8-10 meters away Object: Person, back; 1-2 meters away Object:

More information

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Session 20 Object Recognition 3 Mani Golparvar-Fard Department of Civil and Environmental Engineering Department of Computer Science 3129D,

More information

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Session 20 Object Recognition 3 Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering Lab

More information

3dRR: Representation and Recognition

3dRR: Representation and Recognition CS331B (3 units) 3dRR: Representation and Recognition Instructor: Silvio Savarese Email: ssilvio@stanford.edu Office: Gates, room: 228 Office hour: by appointment TA: Scott Chung scottc52@gmail.com Class

More information

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Session 18 Object Recognition I Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering Lab

More information

Object Detection by 3D Aspectlets and Occlusion Reasoning

Object Detection by 3D Aspectlets and Occlusion Reasoning Object Detection by 3D Aspectlets and Occlusion Reasoning Yu Xiang University of Michigan Silvio Savarese Stanford University In the 4th International IEEE Workshop on 3D Representation and Recognition

More information

EE290T : 3D Reconstruction and Recognition

EE290T : 3D Reconstruction and Recognition EE290T : 3D Reconstruction and Recognition Acknowledgement Courtesy of Prof. Silvio Savarese. Introduction There was a table set out under a tree in front of the house, and the March Hare and the Hatter

More information

Revisiting 3D Geometric Models for Accurate Object Shape and Pose

Revisiting 3D Geometric Models for Accurate Object Shape and Pose Revisiting 3D Geometric Models for Accurate Object Shape and Pose M. 1 Michael Stark 2,3 Bernt Schiele 3 Konrad Schindler 1 1 Photogrammetry and Remote Sensing Laboratory Swiss Federal Institute of Technology

More information

Object Recognition. Computer Vision. Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce

Object Recognition. Computer Vision. Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce Object Recognition Computer Vision Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce How many visual object categories are there? Biederman 1987 ANIMALS PLANTS OBJECTS

More information

Methods for Representing and Recognizing 3D objects

Methods for Representing and Recognizing 3D objects Metods for Representing and Recognizing 3D objects part 2 Silvio Savarese University of Micigan at Ann Arbor Part- based Multi-view models Savarese, Fei-Fei, ICCV 07 Savarese, Fei-Fei, ECCV 08 Sun, Su,

More information

Object Detection by 3D Aspectlets and Occlusion Reasoning

Object Detection by 3D Aspectlets and Occlusion Reasoning Object Detection by 3D Aspectlets and Occlusion Reasoning Yu Xiang University of Michigan yuxiang@umich.edu Silvio Savarese Stanford University ssilvio@stanford.edu Abstract We propose a novel framework

More information

Supervised learning. y = f(x) function

Supervised learning. y = f(x) function Supervised learning y = f(x) output prediction function Image feature Training: given a training set of labeled examples {(x 1,y 1 ),, (x N,y N )}, estimate the prediction function f by minimizing the

More information

Deformable Part Models Revisited: A Performance Evaluation for Object Category Pose Estimation

Deformable Part Models Revisited: A Performance Evaluation for Object Category Pose Estimation Deformable Part Models Revisited: A Performance Evaluation for Object Category Pose Estimation Roberto J. Lo pez-sastre Tinne Tuytelaars2 Silvio Savarese3 GRAM, Dept. Signal Theory and Communications,

More information

Course Administration

Course Administration Course Administration Project 2 results are online Project 3 is out today The first quiz is a week from today (don t panic!) Covers all material up to the quiz Emphasizes lecture material NOT project topics

More information

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Adding spatial information Forming vocabularies from pairs of nearby features doublets

More information

Seeing 3D chairs: Exemplar part-based 2D-3D alignment using a large dataset of CAD models

Seeing 3D chairs: Exemplar part-based 2D-3D alignment using a large dataset of CAD models Seeing 3D chairs: Exemplar part-based 2D-3D alignment using a large dataset of CAD models Mathieu Aubry (INRIA) Daniel Maturana (CMU) Alexei Efros (UC Berkeley) Bryan Russell (Intel) Josef Sivic (INRIA)

More information

Instance-level recognition part 2

Instance-level recognition part 2 Visual Recognition and Machine Learning Summer School Paris 2011 Instance-level recognition part 2 Josef Sivic http://www.di.ens.fr/~josef INRIA, WILLOW, ENS/INRIA/CNRS UMR 8548 Laboratoire d Informatique,

More information

Instance-level recognition II.

Instance-level recognition II. Reconnaissance d objets et vision artificielle 2010 Instance-level recognition II. Josef Sivic http://www.di.ens.fr/~josef INRIA, WILLOW, ENS/INRIA/CNRS UMR 8548 Laboratoire d Informatique, Ecole Normale

More information

Part-based models. Lecture 10

Part-based models. Lecture 10 Part-based models Lecture 10 Overview Representation Location Appearance Generative interpretation Learning Distance transforms Other approaches using parts Felzenszwalb, Girshick, McAllester, Ramanan

More information

Semantic Structure from Motion

Semantic Structure from Motion Semantic Structure from Motion Sid Yingze Bao and Silvio Savarese Department of Electrical and Computer Engineering, University of Michigan at Ann Arbor {yingze,silvio}@eecs.umich.edu Abstract Conventional

More information

Part-Based Models for Object Class Recognition Part 3

Part-Based Models for Object Class Recognition Part 3 High Level Computer Vision! Part-Based Models for Object Class Recognition Part 3 Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de! http://www.d2.mpi-inf.mpg.de/cv ! State-of-the-Art

More information

Modeling 3D viewpoint for part-based object recognition of rigid objects

Modeling 3D viewpoint for part-based object recognition of rigid objects Modeling 3D viewpoint for part-based object recognition of rigid objects Joshua Schwartz Department of Computer Science Cornell University jdvs@cs.cornell.edu Abstract Part-based object models based on

More information

Lecture 10: Multi view geometry

Lecture 10: Multi view geometry Lecture 10: Multi view geometry Professor Fei Fei Li Stanford Vision Lab 1 What we will learn today? Stereo vision Correspondence problem (Problem Set 2 (Q3)) Active stereo vision systems Structure from

More information

Lecture 10: Multi-view geometry

Lecture 10: Multi-view geometry Lecture 10: Multi-view geometry Professor Stanford Vision Lab 1 What we will learn today? Review for stereo vision Correspondence problem (Problem Set 2 (Q3)) Active stereo vision systems Structure from

More information

Supervised learning. y = f(x) function

Supervised learning. y = f(x) function Supervised learning y = f(x) output prediction function Image feature Training: given a training set of labeled examples {(x 1,y 1 ),, (x N,y N )}, estimate the prediction function f by minimizing the

More information

CS231M Mobile Computer Vision Structure from motion

CS231M Mobile Computer Vision Structure from motion CS231M Mobile Computer Vision Structure from motion - Cameras - Epipolar geometry - Structure from motion Pinhole camera Pinhole perspective projection f o f = focal length o = center of the camera z y

More information

3D Spatial Layout Propagation in a Video Sequence

3D Spatial Layout Propagation in a Video Sequence 3D Spatial Layout Propagation in a Video Sequence Alejandro Rituerto 1, Roberto Manduchi 2, Ana C. Murillo 1 and J. J. Guerrero 1 arituerto@unizar.es, manduchi@soe.ucsc.edu, acm@unizar.es, and josechu.guerrero@unizar.es

More information

Deformable Part Models

Deformable Part Models CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones

More information

Lecture 16: Object recognition: Part-based generative models

Lecture 16: Object recognition: Part-based generative models Lecture 16: Object recognition: Part-based generative models Professor Stanford Vision Lab 1 What we will learn today? Introduction Constellation model Weakly supervised training One-shot learning (Problem

More information

Virtual Training for Multi-View Object Class Recognition

Virtual Training for Multi-View Object Class Recognition Virtual Training for Multi-View Object Class Recognition Han-Pang Chiu Leslie Pack Kaelbling Tomás Lozano-Pérez MIT Computer Science and Artificial Intelligence Laboratory Cambridge, MA 02139, USA {chiu,lpk,tlp}@csail.mit.edu

More information

Part based models for recognition. Kristen Grauman

Part based models for recognition. Kristen Grauman Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily

More information

High Level Computer Vision

High Level Computer Vision High Level Computer Vision Part-Based Models for Object Class Recognition Part 2 Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de http://www.d2.mpi-inf.mpg.de/cv Please Note No

More information

CS 558: Computer Vision 13 th Set of Notes

CS 558: Computer Vision 13 th Set of Notes CS 558: Computer Vision 13 th Set of Notes Instructor: Philippos Mordohai Webpage: www.cs.stevens.edu/~mordohai E-mail: Philippos.Mordohai@stevens.edu Office: Lieb 215 Overview Context and Spatial Layout

More information

Probabilistic Parameter Selection for Learning Scene Structure from Video

Probabilistic Parameter Selection for Learning Scene Structure from Video Probabilistic Parameter Selection for Learning Scene Structure from Video M. D. Breitenstein 1 E. Sommerlade 2 B. Leibe 1 L. Van Gool 1 I. Reid 2 1 ETH Zurich, Switzerland 2 University of Oxford, UK Abstract

More information

EECS 442 Computer Vision fall 2011

EECS 442 Computer Vision fall 2011 EECS 442 Computer Vision fall 2011 Instructor Silvio Savarese silvio@eecs.umich.edu Office: ECE Building, room: 4435 Office hour: Tues 4:30-5:30pm or under appoint. (after conversation hour) GSIs: Mohit

More information

Segmenting Objects in Weakly Labeled Videos

Segmenting Objects in Weakly Labeled Videos Segmenting Objects in Weakly Labeled Videos Mrigank Rochan, Shafin Rahman, Neil D.B. Bruce, Yang Wang Department of Computer Science University of Manitoba Winnipeg, Canada {mrochan, shafin12, bruce, ywang}@cs.umanitoba.ca

More information

Constructing Implicit 3D Shape Models for Pose Estimation

Constructing Implicit 3D Shape Models for Pose Estimation Constructing Implicit 3D Shape Models for Pose Estimation Mica Arie-Nachimson Ronen Basri Dept. of Computer Science and Applied Math. Weizmann Institute of Science Rehovot 76100, Israel Abstract We present

More information

BIL Computer Vision Apr 16, 2014

BIL Computer Vision Apr 16, 2014 BIL 719 - Computer Vision Apr 16, 2014 Binocular Stereo (cont d.), Structure from Motion Aykut Erdem Dept. of Computer Engineering Hacettepe University Slide credit: S. Lazebnik Basic stereo matching algorithm

More information

Segmentation. Bottom up Segmentation Semantic Segmentation

Segmentation. Bottom up Segmentation Semantic Segmentation Segmentation Bottom up Segmentation Semantic Segmentation Semantic Labeling of Street Scenes Ground Truth Labels 11 classes, almost all occur simultaneously, large changes in viewpoint, scale sky, road,

More information

Lecture 6 Stereo Systems Multi-view geometry

Lecture 6 Stereo Systems Multi-view geometry Lecture 6 Stereo Systems Multi-view geometry Professor Silvio Savarese Computational Vision and Geometry Lab Silvio Savarese Lecture 6-5-Feb-4 Lecture 6 Stereo Systems Multi-view geometry Stereo systems

More information

3D Object Representations for Recognition. Yu Xiang Computational Vision and Geometry Lab Stanford University

3D Object Representations for Recognition. Yu Xiang Computational Vision and Geometry Lab Stanford University 3D Object Representations for Recognition Yu Xiang Computational Vision and Geometry Lab Stanford University 1 2D Object Recognition Ren et al. NIPS15 Ordonez et al. ICCV13 Image classification/tagging/annotation

More information

Visual Object Recognition

Visual Object Recognition Perceptual and Sensory Augmented Computing Visual Object Recognition Tutorial Visual Object Recognition Bastian Leibe Computer Vision Laboratory ETH Zurich Chicago, 14.07.2008 & Kristen Grauman Department

More information

Local Features based Object Categories and Object Instances Recognition

Local Features based Object Categories and Object Instances Recognition Local Features based Object Categories and Object Instances Recognition Eric Nowak Ph.D. thesis defense 17th of March, 2008 1 Thesis in Computer Vision Computer vision is the science and technology of

More information

Viewpoint Invariant Features from Single Images Using 3D Geometry

Viewpoint Invariant Features from Single Images Using 3D Geometry Viewpoint Invariant Features from Single Images Using 3D Geometry Yanpeng Cao and John McDonald Department of Computer Science National University of Ireland, Maynooth, Ireland {y.cao,johnmcd}@cs.nuim.ie

More information

View Synthesis for Recognizing Unseen Poses of Object Classes

View Synthesis for Recognizing Unseen Poses of Object Classes View Synthesis for Recognizing Unseen Poses of Object Classes Silvio Savarese 1 and Li Fei-Fei 2 1 Department of Electrical Engineering, University of Michigan at Ann Arbor 2 Department of Computer Science,

More information

Perceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington

Perceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington Perceiving the 3D World from Images and Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World 3 Understand

More information

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011 Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition

More information

Contexts and 3D Scenes

Contexts and 3D Scenes Contexts and 3D Scenes Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem Administrative stuffs Final project presentation Nov 30 th 3:30 PM 4:45 PM Grading Three senior graders (30%)

More information

Toward Coherent Object Detection And Scene Layout Understanding

Toward Coherent Object Detection And Scene Layout Understanding Toward Coherent Object Detection And Scene Layout Understanding Sid Ying-Ze Bao Min Sun Silvio Savarese Dept. of Electrical and Computer Engineering, University of Michigan at Ann Arbor, USA {yingze,sunmin,silvio}@eecs.umich.edu

More information

3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington

3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington 3D Object Recognition and Scene Understanding from RGB-D Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World

More information

Patch Descriptors. EE/CSE 576 Linda Shapiro

Patch Descriptors. EE/CSE 576 Linda Shapiro Patch Descriptors EE/CSE 576 Linda Shapiro 1 How can we find corresponding points? How can we find correspondences? How do we describe an image patch? How do we describe an image patch? Patches with similar

More information

Can Similar Scenes help Surface Layout Estimation?

Can Similar Scenes help Surface Layout Estimation? Can Similar Scenes help Surface Layout Estimation? Santosh K. Divvala, Alexei A. Efros, Martial Hebert Robotics Institute, Carnegie Mellon University. {santosh,efros,hebert}@cs.cmu.edu Abstract We describe

More information

Sparse Point Cloud Densification by Using Redundant Semantic Information

Sparse Point Cloud Densification by Using Redundant Semantic Information Sparse Point Cloud Densification by Using Redundant Semantic Information Michael Hödlmoser CVL, Vienna University of Technology ken@caa.tuwien.ac.at Branislav Micusik AIT Austrian Institute of Technology

More information

Detecting Object Instances Without Discriminative Features

Detecting Object Instances Without Discriminative Features Detecting Object Instances Without Discriminative Features Edward Hsiao June 19, 2013 Thesis Committee: Martial Hebert, Chair Alexei Efros Takeo Kanade Andrew Zisserman, University of Oxford 1 Object Instance

More information

What are we trying to achieve? Why are we doing this? What do we learn from past history? What will we talk about today?

What are we trying to achieve? Why are we doing this? What do we learn from past history? What will we talk about today? Introduction What are we trying to achieve? Why are we doing this? What do we learn from past history? What will we talk about today? What are we trying to achieve? Example from Scott Satkin 3D interpretation

More information

CV as making bank. Intel buys Mobileye! $15 billion. Mobileye:

CV as making bank. Intel buys Mobileye! $15 billion. Mobileye: CV as making bank Intel buys Mobileye! $15 billion Mobileye: Spin-off from Hebrew University, Israel 450 engineers 15 million cars installed 313 car models June 2016 - Tesla left Mobileye Fatal crash car

More information

EECS 442 Computer Vision fall 2012

EECS 442 Computer Vision fall 2012 EECS 442 Computer Vision fall 2012 Instructor Silvio Savarese silvio@eecs.umich.edu Office: ECE Building, room: 4435 Office hour: Tues 4:30-5:30pm or under appoint. GSI: Johnny Chao (ywchao125@gmail.com)

More information

Detection III: Analyzing and Debugging Detection Methods

Detection III: Analyzing and Debugging Detection Methods CS 1699: Intro to Computer Vision Detection III: Analyzing and Debugging Detection Methods Prof. Adriana Kovashka University of Pittsburgh November 17, 2015 Today Review: Deformable part models How can

More information

EECS 442 Computer vision. Stereo systems. Stereo vision Rectification Correspondence problem Active stereo vision systems

EECS 442 Computer vision. Stereo systems. Stereo vision Rectification Correspondence problem Active stereo vision systems EECS 442 Computer vision Stereo systems Stereo vision Rectification Correspondence problem Active stereo vision systems Reading: [HZ] Chapter: 11 [FP] Chapter: 11 Stereo vision P p p O 1 O 2 Goal: estimate

More information

Optical flow and tracking

Optical flow and tracking EECS 442 Computer vision Optical flow and tracking Intro Optical flow and feature tracking Lucas-Kanade algorithm Motion segmentation Segments of this lectures are courtesy of Profs S. Lazebnik S. Seitz,

More information

arxiv: v1 [cs.cv] 28 Sep 2018

arxiv: v1 [cs.cv] 28 Sep 2018 Camera Pose Estimation from Sequence of Calibrated Images arxiv:1809.11066v1 [cs.cv] 28 Sep 2018 Jacek Komorowski 1 and Przemyslaw Rokita 2 1 Maria Curie-Sklodowska University, Institute of Computer Science,

More information

Multi-view stereo. Many slides adapted from S. Seitz

Multi-view stereo. Many slides adapted from S. Seitz Multi-view stereo Many slides adapted from S. Seitz Beyond two-view stereo The third eye can be used for verification Multiple-baseline stereo Pick a reference image, and slide the corresponding window

More information

CS231M Mobile Computer Vision Spring 2014

CS231M Mobile Computer Vision Spring 2014 CS231M Mobile Computer Vision Spring 2014 Instructors: - Prof Silvio Savarese (Stanford, CVGLab) - Dr Kari Pulli (Nvidia Research) CAs: - Saumitro Dasgupta saumitro@stanford.edu - Francois Chaubard fchaubar@stanford.edu

More information

Determining Patch Saliency Using Low-Level Context

Determining Patch Saliency Using Low-Level Context Determining Patch Saliency Using Low-Level Context Devi Parikh 1, C. Lawrence Zitnick 2, and Tsuhan Chen 1 1 Carnegie Mellon University, Pittsburgh, PA, USA 2 Microsoft Research, Redmond, WA, USA Abstract.

More information

Local features: detection and description. Local invariant features

Local features: detection and description. Local invariant features Local features: detection and description Local invariant features Detection of interest points Harris corner detection Scale invariant blob detection: LoG Description of local patches SIFT : Histograms

More information

Backprojection Revisited: Scalable Multi-view Object Detection and Similarity Metrics for Detections

Backprojection Revisited: Scalable Multi-view Object Detection and Similarity Metrics for Detections Backprojection Revisited: Scalable Multi-view Object Detection and Similarity Metrics for Detections Nima Razavi 1, Juergen Gall 1, and Luc Van Gool 1,2 1 Computer Vision Laboratory, ETH Zurich 2 ESAT-PSI/IBBT,

More information

Image correspondences and structure from motion

Image correspondences and structure from motion Image correspondences and structure from motion http://graphics.cs.cmu.edu/courses/15-463 15-463, 15-663, 15-862 Computational Photography Fall 2017, Lecture 20 Course announcements Homework 5 posted.

More information

Photo Tourism: Exploring Photo Collections in 3D

Photo Tourism: Exploring Photo Collections in 3D Photo Tourism: Exploring Photo Collections in 3D Noah Snavely Steven M. Seitz University of Washington Richard Szeliski Microsoft Research 15,464 37,383 76,389 2006 Noah Snavely 15,464 37,383 76,389 Reproduced

More information

Computer Vision Lecture 17

Computer Vision Lecture 17 Computer Vision Lecture 17 Epipolar Geometry & Stereo Basics 13.01.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar in the summer semester

More information

Object Category Detection. Slides mostly from Derek Hoiem

Object Category Detection. Slides mostly from Derek Hoiem Object Category Detection Slides mostly from Derek Hoiem Today s class: Object Category Detection Overview of object category detection Statistical template matching with sliding window Part-based Models

More information

Local features and image matching. Prof. Xin Yang HUST

Local features and image matching. Prof. Xin Yang HUST Local features and image matching Prof. Xin Yang HUST Last time RANSAC for robust geometric transformation estimation Translation, Affine, Homography Image warping Given a 2D transformation T and a source

More information

Structure from Motion CSC 767

Structure from Motion CSC 767 Structure from Motion CSC 767 Structure from motion Given a set of corresponding points in two or more images, compute the camera parameters and the 3D point coordinates?? R,t R 2,t 2 R 3,t 3 Camera??

More information

Local Features and Bag of Words Models

Local Features and Bag of Words Models 10/14/11 Local Features and Bag of Words Models Computer Vision CS 143, Brown James Hays Slides from Svetlana Lazebnik, Derek Hoiem, Antonio Torralba, David Lowe, Fei Fei Li and others Computer Engineering

More information

Computer Vision Lecture 17

Computer Vision Lecture 17 Announcements Computer Vision Lecture 17 Epipolar Geometry & Stereo Basics Seminar in the summer semester Current Topics in Computer Vision and Machine Learning Block seminar, presentations in 1 st week

More information

Context. CS 554 Computer Vision Pinar Duygulu Bilkent University. (Source:Antonio Torralba, James Hays)

Context. CS 554 Computer Vision Pinar Duygulu Bilkent University. (Source:Antonio Torralba, James Hays) Context CS 554 Computer Vision Pinar Duygulu Bilkent University (Source:Antonio Torralba, James Hays) A computer vision goal Recognize many different objects under many viewing conditions in unconstrained

More information

Object and Class Recognition I:

Object and Class Recognition I: Object and Class Recognition I: Object Recognition Lectures 10 Sources ICCV 2005 short courses Li Fei-Fei (UIUC), Rob Fergus (Oxford-MIT), Antonio Torralba (MIT) http://people.csail.mit.edu/torralba/iccv2005

More information

Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories

Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories Hao Su,4* 2 Min Sun2* Li Fei-Fei3 Silvio Savarese2 Dept. of Computer Science, Princeton

More information

Machine Learning Crash Course

Machine Learning Crash Course Machine Learning Crash Course Photo: CMU Machine Learning Department protests G20 Computer Vision James Hays Slides: Isabelle Guyon, Erik Sudderth, Mark Johnson, Derek Hoiem The machine learning framework

More information

Structure from motion

Structure from motion Structure from motion Structure from motion Given a set of corresponding points in two or more images, compute the camera parameters and the 3D point coordinates?? R 1,t 1 R 2,t 2 R 3,t 3 Camera 1 Camera

More information

Structure from motion

Structure from motion Structure from motion Structure from motion Given a set of corresponding points in two or more images, compute the camera parameters and the 3D point coordinates?? R 1,t 1 R 2,t R 2 3,t 3 Camera 1 Camera

More information

Scale and Rotation Invariant Color Features for Weakly-Supervised Object Learning in 3D Space

Scale and Rotation Invariant Color Features for Weakly-Supervised Object Learning in 3D Space Scale and Rotation Invariant Color Features for Weakly-Supervised Object Learning in 3D Space Asako Kanezaki Tatsuya Harada Yasuo Kuniyoshi Graduate School of Information Science and Technology, The University

More information

Human Upper Body Pose Estimation in Static Images

Human Upper Body Pose Estimation in Static Images 1. Research Team Human Upper Body Pose Estimation in Static Images Project Leader: Graduate Students: Prof. Isaac Cohen, Computer Science Mun Wai Lee 2. Statement of Project Goals This goal of this project

More information

Part-based and local feature models for generic object recognition

Part-based and local feature models for generic object recognition Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza

More information

Photo Tourism: Exploring Photo Collections in 3D

Photo Tourism: Exploring Photo Collections in 3D Photo Tourism: Exploring Photo Collections in 3D SIGGRAPH 2006 Noah Snavely Steven M. Seitz University of Washington Richard Szeliski Microsoft Research 2006 2006 Noah Snavely Noah Snavely Reproduced with

More information

Announcements. Recognition (Part 3) Model-Based Vision. A Rough Recognition Spectrum. Pose consistency. Recognition by Hypothesize and Test

Announcements. Recognition (Part 3) Model-Based Vision. A Rough Recognition Spectrum. Pose consistency. Recognition by Hypothesize and Test Announcements (Part 3) CSE 152 Lecture 16 Homework 3 is due today, 11:59 PM Homework 4 will be assigned today Due Sat, Jun 4, 11:59 PM Reading: Chapter 15: Learning to Classify Chapter 16: Classifying

More information

Is 2D Information Enough For Viewpoint Estimation? Amir Ghodrati, Marco Pedersoli, Tinne Tuytelaars BMVC 2014

Is 2D Information Enough For Viewpoint Estimation? Amir Ghodrati, Marco Pedersoli, Tinne Tuytelaars BMVC 2014 Is 2D Information Enough For Viewpoint Estimation? Amir Ghodrati, Marco Pedersoli, Tinne Tuytelaars BMVC 2014 Problem Definition Viewpoint estimation: Given an image, predicting viewpoint for object of

More information

arxiv: v1 [cs.cv] 14 Jul 2015

arxiv: v1 [cs.cv] 14 Jul 2015 Lifting GIS Maps into Strong Geometric Context Raúl Díaz, Minhaeng Lee, Jochen Schubert, Charless C. Fowlkes Computer Science Department, University of California, Irvine {rdiazgar,minhaenl,j.schubert,fowlkes}@uci.edu

More information

Local Image Features

Local Image Features Local Image Features Ali Borji UWM Many slides from James Hayes, Derek Hoiem and Grauman&Leibe 2008 AAAI Tutorial Overview of Keypoint Matching 1. Find a set of distinctive key- points A 1 A 2 A 3 B 3

More information

Recognition (Part 4) Introduction to Computer Vision CSE 152 Lecture 17

Recognition (Part 4) Introduction to Computer Vision CSE 152 Lecture 17 Recognition (Part 4) CSE 152 Lecture 17 Announcements Homework 5 is due June 9, 11:59 PM Reading: Chapter 15: Learning to Classify Chapter 16: Classifying Images Chapter 17: Detecting Objects in Images

More information

EECS 442 Computer vision. Object Recognition

EECS 442 Computer vision. Object Recognition EECS 442 Computer vision Object Recognition Intro Recognition of 3D objects Recognition of object categories: Bag of world models Part based models 3D object categorization Challenges: intraclass variation

More information

Window based detectors

Window based detectors Window based detectors CS 554 Computer Vision Pinar Duygulu Bilkent University (Source: James Hays, Brown) Today Window-based generic object detection basic pipeline boosting classifiers face detection

More information

Closing the Loop in Scene Interpretation

Closing the Loop in Scene Interpretation Closing the Loop in Scene Interpretation Derek Hoiem Beckman Institute University of Illinois dhoiem@uiuc.edu Alexei A. Efros Robotics Institute Carnegie Mellon University efros@cs.cmu.edu Martial Hebert

More information

Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model

Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model Johnson Hsieh (johnsonhsieh@gmail.com), Alexander Chia (alexchia@stanford.edu) Abstract -- Object occlusion presents a major

More information

Part-Based Models for Object Class Recognition Part 2

Part-Based Models for Object Class Recognition Part 2 High Level Computer Vision Part-Based Models for Object Class Recognition Part 2 Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de https://www.mpi-inf.mpg.de/hlcv Class of Object

More information

Part-Based Models for Object Class Recognition Part 2

Part-Based Models for Object Class Recognition Part 2 High Level Computer Vision Part-Based Models for Object Class Recognition Part 2 Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de https://www.mpi-inf.mpg.de/hlcv Class of Object

More information

Contexts and 3D Scenes

Contexts and 3D Scenes Contexts and 3D Scenes Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem Administrative stuffs Final project presentation Dec 1 st 3:30 PM 4:45 PM Goodwin Hall Atrium Grading Three

More information

Fusing shape and appearance information for object category detection

Fusing shape and appearance information for object category detection 1 Fusing shape and appearance information for object category detection Andreas Opelt, Axel Pinz Graz University of Technology, Austria Andrew Zisserman Dept. of Engineering Science, University of Oxford,

More information

CS5670: Intro to Computer Vision

CS5670: Intro to Computer Vision CS5670: Intro to Computer Vision Noah Snavely Introduction to Recognition mountain tree banner building street lamp people vendor Announcements Final exam, in-class, last day of lecture (5/9/2018, 12:30

More information

Real-time Indoor Scene Understanding using Bayesian Filtering with Motion Cues

Real-time Indoor Scene Understanding using Bayesian Filtering with Motion Cues Real-time Indoor Scene Understanding using Bayesian Filtering with Motion Cues Grace Tsai :, Changhai Xu ;, Jingen Liu :, Benjamin Kuipers : : Dept. of Electrical Engineering and Computer Science, University

More information