Three-Dimensional Object Detection and Layout Prediction using Clouds of Oriented Gradients
1 Three-Dimensional Object Detection and Layout Prediction using Clouds of Oriented Gradients
Authors: Zhile Ren, Erik B. Sudderth
Presented by: Shannon Kao, Max Wang
October 19, 2016
2 Introduction
Given an image of a realistic indoor scene, how do you classify the objects in it?
3 Goals
- Indoor scene understanding: develop new representations and algorithms for 3D object detection and spatial layout prediction in cluttered indoor scenes
- Main challenge: images of indoor (home or office) environments are typically highly cluttered and have substantial occlusions
4 Previous Work: CAD
- Detection: CAD models are used to learn object shapes and alternative viewpoints
- Limitations:
  - CAD models are in short supply
  - Models do not cover different classes of the same object type
  - Computationally inefficient
References:
- M. Aubry, D. Maturana, A. Efros, B. Russell, and J. Sivic. Seeing 3D chairs: Exemplar part-based 2D-3D alignment using a large dataset of CAD models. In CVPR.
- J. J. Lim, H. Pirsiavash, and A. Torralba. Parsing IKEA objects: Fine pose estimation. In ICCV, 2013.
- J. J. Lim, A. Khosla, and A. Torralba. FPM: Fine pose parts-based model with 3D CAD models. In ECCV. Springer, 2014.
5 Previous Work
- Layout proposal: Manhattan structure is used to infer 2D projections of the 3D structure
- Integral representation to explore exponentially many layout proposals
- Limitations:
  - Previous work focused on restricted environments and does not generalize to cluttered scenes
  - Manual heuristics are needed to reduce scene-parsing false positives
  - Scalability issues
References:
- J. M. Coughlan and A. L. Yuille. Manhattan world: Compass direction from a single image by Bayesian inference. In ICCV, volume 2. IEEE, 1999.
- Z. Wu, S. Song, A. Khosla, X. Tang, and J. Xiao. 3D ShapeNets for 2.5D object recognition and next-best-view prediction. arXiv preprint, 2014.
- S. Song and J. Xiao. Sliding shapes for 3D object detection in depth images. In ECCV. Springer, 2014.
6 Proposed Solution
- Representation
  - Geometric features
  - Clouds of oriented gradients (COG)
  - Novel Manhattan voxel structure (layout)
- Training
  - Structured SVM (on cuboid + layout)
  - Cascaded classification framework
8 RGB-D to Voxel Features
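9 Geometric Features
- Point cloud density: a 3D density feature for each voxel ℓ of cuboid i, $\phi^a_{i\ell} = N_{i\ell} / A_{i\ell}$, where $N_{i\ell}$ is the number of points in the voxel, normalized by the area $A_{i\ell}$ of the voxel's projection in the image
- 3D normal orientations: find the normal orientation for each 3D point using a plane fit to its 15 nearest neighbors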
10 Clouds of Oriented Gradients (COG)
- Computes gradients on the RGB channels of the 2D image by applying gradient filters
- The maximum responses across color channels are the gradients (dx, dy) in the x and y directions, with magnitude $\sqrt{dx^2 + dy^2}$
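The gradient step above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes a centered `[-1, 0, 1]` difference filter (the usual HOG choice; the slide does not preserve the filters), and it reads "maximum response across color channels" as keeping, per pixel, the channel whose gradient has the largest magnitude. The function name `max_channel_gradients` is hypothetical.

```python
import math

def max_channel_gradients(image):
    """image: H x W nested list of (r, g, b) tuples.

    Returns three H x W nested lists: dx, dy and gradient magnitude.
    Assumption: centered [-1, 0, 1] differences, clamped at the border.
    """
    h, w = len(image), len(image[0])
    dx = [[0.0] * w for _ in range(h)]
    dy = [[0.0] * w for _ in range(h)]
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbor indices so the difference stays in bounds.
            xl, xr = max(x - 1, 0), min(x + 1, w - 1)
            yu, yd = max(y - 1, 0), min(y + 1, h - 1)
            best = (0.0, 0.0, 0.0)  # (dx, dy, magnitude)
            for c in range(3):
                gx = image[y][xr][c] - image[y][xl][c]
                gy = image[yd][x][c] - image[yu][x][c]
                m = math.hypot(gx, gy)  # sqrt(gx^2 + gy^2)
                if m > best[2]:
                    best = (gx, gy, m)
            dx[y][x], dy[y][x], mag[y][x] = best
    return dx, dy, mag
```

For a small image with a vertical edge in the red channel, the center pixel gets a purely horizontal gradient taken from that channel.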
11 Clouds of Oriented Gradients (COG)
- Uses nine 3D orientation bins from 0 to 180 degrees
- Uses perspective projection to find the corresponding 2D bin boundaries
12 COG Normalization and Aliasing
- Bilinearly interpolates gradient magnitudes between neighboring orientation bins
- Normalizes the resulting histograms, using a small ϵ > 0
- Dimension of COG: $6^3 \times 9 = 1944$
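A simplified sketch of the binning and normalization described above. Two assumptions are made here that the slides do not confirm: gradients are binned directly by their 2D angle (the actual method projects 3D bin boundaries into the image), and the normalization is an L2 norm with ϵ added under the square root. `orientation_histogram` and `COG_DIM` are illustrative names.

```python
import math

NUM_BINS = 9
BIN_WIDTH = 180.0 / NUM_BINS      # 20 degrees per bin
COG_DIM = 6 ** 3 * NUM_BINS       # 216 voxels x 9 bins

def orientation_histogram(gradients, eps=1e-6):
    """gradients: iterable of (dx, dy) pairs for one voxel.

    Each gradient magnitude is split linearly between the two
    orientation bins whose centers are nearest to its angle.
    """
    hist = [0.0] * NUM_BINS
    for dx, dy in gradients:
        mag = math.hypot(dx, dy)
        # Unsigned orientation in [0, 180).
        angle = math.degrees(math.atan2(dy, dx)) % 180.0
        pos = angle / BIN_WIDTH - 0.5   # position relative to bin centers
        lo = math.floor(pos)
        frac = pos - lo
        hist[lo % NUM_BINS] += mag * (1.0 - frac)   # wraps around at 0/180
        hist[(lo + 1) % NUM_BINS] += mag * frac
    # Assumed normalization: L2 norm stabilized by a small eps.
    norm = math.sqrt(sum(v * v for v in hist) + eps)
    return [v / norm for v in hist]
```

A gradient pointing exactly at a bin center lands entirely in that bin; one on a bin boundary is shared equally by the two neighbors, which is what reduces aliasing.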
13 Room Layout Geometry: Manhattan Voxels
- Room layout prediction: floor, ceiling, walls
- Discretize the vertical space between floor and ceiling into 6 equal bins
- Use a threshold of 0.15 m to separate points near the walls from those inside or outside the hypothesized layout
- Use diagonal lines to split bins at room corners, creating 12 x 6 = 72 bins
14 Manhattan Voxels, cont.
Regions:
- 1-4: Scene interior, where objects could be placed anywhere (point cloud distribution varies widely)
- 5-8: Model points near the assumed Manhattan wall structure. Here, regions 5 and 6 contain orthogonal planes.
- 9-12: Points outside of the predicted layout
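The 12 x 6 discretization can be illustrated with a small indexing sketch. It assumes the horizontal region (1 to 12) has already been determined from the layout geometry, and that a signed wall distance (positive inside the room) is available. Both conventions, and all function names, are assumptions made for illustration.

```python
NUM_REGIONS = 12       # 1-4 interior, 5-8 near walls, 9-12 outside
NUM_HEIGHT_BINS = 6    # equal bins between floor and ceiling
WALL_THRESHOLD = 0.15  # meters

def wall_band(signed_dist):
    """Classify a point by signed distance to the nearest wall
    (assumed positive inside the hypothesized layout)."""
    if abs(signed_dist) <= WALL_THRESHOLD:
        return "near_wall"
    return "interior" if signed_dist > 0 else "outside"

def height_bin(z, floor_z, ceiling_z):
    """Index (0..5) of the vertical bin containing height z."""
    t = (z - floor_z) / (ceiling_z - floor_z)
    return min(max(int(t * NUM_HEIGHT_BINS), 0), NUM_HEIGHT_BINS - 1)

def voxel_index(region, z, floor_z, ceiling_z):
    """Flat index (0..71) of the Manhattan voxel for a point."""
    if not 1 <= region <= NUM_REGIONS:
        raise ValueError("region must be in 1..12")
    return (region - 1) * NUM_HEIGHT_BINS + height_bin(z, floor_z, ceiling_z)
```

With a 2.4 m ceiling, a point at the ceiling in region 12 maps to the last of the 72 voxels, and a point on the floor in region 1 maps to the first.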
15 Proposed Solution
- Representation
  - Geometric features
  - Clouds of oriented gradients (COG)
  - Novel Manhattan voxel structure
- Training
  - Structured SVMs (on cuboid + layout)
  - Cascaded classification framework
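16 Cuboid Detection
- Features for cuboid i: $\phi(I_i, B_i) = \{\phi^a_{i\ell}, \phi^b_{i\ell}, \phi^c_{i\ell}\}_{\ell=1}^{216}$
  - $\phi^a_{i\ell}$: point cloud density feature
  - $\phi^b_{i\ell}$: 25 surface normal histogram features
  - $\phi^c_{i\ell}$: 9 COG features
- Find a prediction function $h_c : \mathcal{I} \to \mathcal{B}$, with $B = (L, \theta, S)$
  - $L$: center of cuboid in 3D
  - $\theta$: cuboid orientation
  - $S$: physical size of cuboid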
17 Cuboid Detection
Training: n-slack formulation of the structural SVM
- $n$: number of categories
- $I_i$: input image for cuboid i
- $B_i$: bounding box for cuboid i
- $C$: constant
18 Cuboid Detection
Loss function:
- $B$: 3D bounding box
- $\bar{B}$: ground truth bounding box
- $\theta$: orientation with respect to ground
- $\bar{\theta}$: ground truth orientation
19 Cuboid Hypothesis
- Cuboid hypotheses are calculated using a sliding window
- Width quantiles: {0.1, 0.3, 0.5, 0.7, 0.9}
- Depth quantiles: {0.25, 0.5, 0.75}
- Height quantiles: {0.3, 0.5, 0.8}
- All combinations of voxel size, 3D location, and orientation (from 16 candidate orientations) are evaluated
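To get a feel for the size of this search space, the enumeration can be sketched with `itertools.product`. The quantile sets and the count of 16 orientations come from the slide; spacing the orientations evenly over a full turn and the dictionary layout of each hypothesis are assumptions made for illustration.

```python
import itertools
import math

WIDTH_Q = (0.1, 0.3, 0.5, 0.7, 0.9)
DEPTH_Q = (0.25, 0.5, 0.75)
HEIGHT_Q = (0.3, 0.5, 0.8)
NUM_ORIENTATIONS = 16

def cuboid_hypotheses(locations):
    """Yield one hypothesis per (location, orientation, size) combination.

    locations: iterable of candidate 3D centers (x, y, z).
    Assumption: orientations are evenly spaced over [0, 2*pi).
    """
    orientations = [2 * math.pi * k / NUM_ORIENTATIONS
                    for k in range(NUM_ORIENTATIONS)]
    sizes = list(itertools.product(WIDTH_Q, DEPTH_Q, HEIGHT_Q))
    for loc, theta, size_q in itertools.product(locations, orientations, sizes):
        yield {"center": loc, "theta": theta, "size_quantiles": size_q}
```

Each candidate location contributes 5 x 3 x 3 x 16 = 720 hypotheses, which is why an efficient feature representation matters for the sliding-window search.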
20 Layout Detection
- $M = (L, \theta, S)$
- Trained using the same SSVM method, with the free-space definition of IOU as the loss; the ground truth is the hypothesis with the largest free-space IOU
21 Layout Hypothesis
- Layout hypotheses must capture 80% of candidate points
- Floors and ceilings are predicted at a low and a high quantile of the 3D points (along the gravity direction)
- 5,000 to 20,000 hypotheses for a typical scene
22 Learning Spatial Context Problem: Portion of large object detected as smaller object
23 Learning Spatial Context Problem: Portion of large object detected as smaller object Solution: Cascaded classification
24 Evaluation
25 Experiment Setup
- Dataset: SUN RGB-D
- Parameters
- Compared with: sliding shapes, baseline layout, HOG
- 10 object categories
- Performance metrics
  - Cuboid performance evaluated using IOU with ground truth cuboids
  - Layout performance evaluated using free-space IOU with human annotations
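For reference, the intersection-over-union metric can be sketched for the simplest case. This example assumes axis-aligned boxes given as `(xmin, ymin, zmin, xmax, ymax, zmax)`. The evaluation in the talk uses oriented cuboids, which additionally require intersecting rotated footprints, so treat this as an illustration of the metric only.

```python
def box_volume(box):
    """Volume of an axis-aligned box (xmin, ymin, zmin, xmax, ymax, zmax)."""
    return (max(box[3] - box[0], 0.0)
            * max(box[4] - box[1], 0.0)
            * max(box[5] - box[2], 0.0))

def iou_3d(a, b):
    """Intersection over union of two axis-aligned 3D boxes."""
    # The overlap box takes the larger minimum and the smaller maximum
    # along each axis; box_volume clamps negative extents to zero.
    inter_box = (max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
                 min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]))
    inter = box_volume(inter_box)
    union = box_volume(a) + box_volume(b) - inter
    return inter / union if union > 0 else 0.0
```

Two 2 x 2 x 2 boxes offset by 1 along x overlap in a volume of 4 out of a union of 12, giving an IOU of 1/3.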
26 Experiment Results
27 Experiment Results Precision scores for 10 object categories
28 Experiment Results
29 Summary
- Novel representations
  - Clouds of oriented gradients (COG) for cuboids
  - Manhattan voxels for layouts
  - Uses RGB-D data, does not rely on CAD model information
- Learning
  - Objects classified using SSVM
  - Cascaded learning framework applied to remove false positives
30 Q&A
31 Backup
32 Cascaded Classification
- First-stage detections become input features to second-stage classifiers that estimate confidence
- Essentially a directed graphical model with hidden variables; marginalizing the first-stage variables recovers a standard, fully-connected undirected graph
- More efficient:
  - Training decomposes into independent learning problems for each node (object category)
  - Optimal test classification is possible via a rapid sequence of local decisions
33 Cascaded Classification
- First stage: outputs the layout and a set of {bounding box, confidence score, object category}
- Second stage: adds contextual features
  - Object-object overlap
  - Object-layout context: distance and angle to nearest wall
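The object-layout feature can be sketched as follows, under simplifying assumptions that are not from the slides: the room footprint is an axis-aligned rectangle in a floor-plane frame already aligned with the Manhattan directions, the object is summarized by its center and a heading angle, and "angle to the wall" means the unsigned angle between the heading and the wall's direction. `wall_context` is a hypothetical helper name.

```python
import math

def wall_context(center, theta, room):
    """Distance and angle from an object to its nearest wall.

    center: (x, y) of the object in the floor plane.
    theta:  object heading in radians.
    room:   (x_min, y_min, x_max, y_max) footprint of the layout.
    Returns (distance, angle) with angle folded into [0, pi/2].
    """
    x, y = center
    x_min, y_min, x_max, y_max = room
    # Each wall is stored as (distance to it, direction the wall runs in).
    walls = [
        (abs(x - x_min), math.pi / 2),  # left wall runs along y
        (abs(x_max - x), math.pi / 2),  # right wall runs along y
        (abs(y - y_min), 0.0),          # bottom wall runs along x
        (abs(y_max - y), 0.0),          # top wall runs along x
    ]
    dist, wall_dir = min(walls)
    diff = abs(theta - wall_dir) % math.pi
    angle = min(diff, math.pi - diff)
    return dist, angle
```

An object 1 m from the left wall with heading 0 is perpendicular to that wall, so the feature is (1.0, π/2). Features like this, together with object-object overlap, would be the inputs to the second-stage classifier.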
34 Learning Spatial Context
- Training
  - Standard SVM with radial basis function (RBF) kernel
  - Binary classification: true or false positive
- Prediction
  - Second-stage classifier outputs a new contextual confidence
  - Overall confidence is the sum of the first and second stages