ECE 6554:Advanced Computer Vision Pose Estimation

Size: px
Start display at page:

Download "ECE 6554:Advanced Computer Vision Pose Estimation"

Transcription

1 ECE 6554:Advanced Computer Vision Pose Estimation Sujay Yadawadkar, Virginia Tech,

2 Agenda: Pose Estimation: Part Based Models for Pose Estimation Pose Estimation with Convolutional Neural Networks (Deep pose) Pose Estimation with Sequential Prediction (Pose Machines)

3 Estimating Articulated Poses Localizing Body Joints from Monocular Images 3

4

5 Estimating Articulated Poses from Monocular Images Why it is Hard? large variance occlusion L/R ambiguity 5

6 Estimating Articulated Poses from Monocular Images Direct Mapping 6

7 Part-based Models Recognizing Local Appearance Part detector for wrist Confidence maps

8 Non-parametric Uncertainty on Confidence Maps right wrist 8

9 Extracting Features from Confidence Maps Loses Uncertainty Hand-crafted Context feature Peak Candidates for Graphical Models Regress to Displacement [Ramakrishna14] [Chen14] [Tompson15] [Pishchulin16] [Carriera16] 9

10 Pose Estimation with CNN Consider Pose Estimation as a regression problem. Loss function: L2 distance between ground truth of the pose vector and estimated pose vector.

11 Convolutional Pose Machines 1. Capture local appearance with CNNs 2. CNNs on confidence maps to capture long-range part dependencies (preserve uncertainty) 3. Iteratively refine confidence maps with global cues 11

12 Convolutional Pose Machines (CPMs) Stage 1 Stage 2 Stage T P P P 12

13 Convolutional Pose Machines Capturing Local Appearance by FCNN Stage 1 Stage 2 Stage T P P P Stage 1 P 13

14 Convolutional Pose Machines Learning Image-dependent Spatial Model Stage 1 Stage 2 Stage T P P P Stage 1 Stage 2 P P 14

15 Convolutional Pose Machines Large Receptive Field Stage 1 Stage 2 P P Stage T P Stage 1 Stage 2 P P 15

16 Convolutional Pose Machines Overall Architecture Stage 1 Stage 2 Stage T P P P Stage 1 Stage 2 Stage T 16

17 Iteratively Refined Confidence Maps right elbow right wrist Input Image 1st stage 2nd stage 3rd stage 18

18 Iteratively Refined Confidence Maps Recover from False Negative 1st stage 2nd stage 3rd stage 19

19 Iteratively Refined Confidence Maps Recover from False Negative 1st stage output 2nd stage 3rd stage 20

20 Iteratively Refined Confidence Maps 21

21 Training CPMs Ideal Confidence Maps for Intermediate Supervisions Stage 1 Stage 2 Stage T overall loss 23

22 Training CPMs Joint training with Intermediate Supervisions Stage 1 Stage 2 Stage T 24

23 Training CPMs Intermediate Supervisions Resolves Gradient Vanishing Stage 1 Stage 2 Stage 3 Stage 1 Stage 2 Stage 3 Without Intermediate Supervision With Intermediate Supervision 25

24 Convolutional Pose Machines Overall Architecture with Shared Image Features Stage 1 Stage 2 Stage T Stage 1 Stage 2 Stage T 26

25 Training CPMs Data Prepare and Augmentation 27

26 Analysis and Results 28

27 Benchmark Datasets FLIC LSP MPII size 3987 training 1016 testing training 1000 testing training testing type annotation movie scenes upper body sports full body diverse full body w/ truncation 29

28 Number of Stages PCK 0.2 (error tolerance) 30

29 Training Methods PCK 0.2 (error tolerance) 31

30 Quantitatively Results FLIC Upper Body with Observer Centric (OC) Annotations PCK 0.2 PCK

31 34 Quantitatively Results LSP Dataset with Person Centric (PC) Annotations PCK %

32 35 Quantitatively Results MPII Dataset with PC Annotations PCKh 0.5 LSP 90.1 %

33 Quantitatively Results MPII Dataset: Viewpoints 36

34 37

35 Failure Cases L/R confusion rare viewpoint rare pose severe occlusion right wrist 38

36 Summary Monocular human pose estimation are becoming reliable. CPMs capture complex long-range part dependencies by iteratively refining confidence maps with preserved uncertainty. CPMs naturally avoid the problem of vanishing gradient by intermediate supervisions. 39

37 What s Next? 40

38 From Single to Multi-Person Challenge: Identifying number of people and part-person association 41

39 Multi-person Human Pose Estimations Naive Two-phase Method CPM with P = 1 (person detector) 42 Credit: Zhe Cao

40 Pose Estimations in Finer Scales Faces 43 Source: Tomas Simon

41 Pose Estimations in Finer Scales Hands 44

42 CMU Panoptic Studio 500 Synced Cameras 45

43 Multiple Views for 3D Recon Right Wrist 46

44 Multiple Views for 3D Reconstruction 47 Source: Hanbyul Joo

45 Multiple Views for 3D Recon Projected Full Poses 48

46 Live Demo! 49

47 Real-time CPMs 50

48 Real-time CPMs: Confidence Map of Right Wrist 51

49 Future Directions - Analysis on failure cases and data distribution - One-shot multi-person pose estimation - Direct 3D reasoning - Temporal CPMs 52

50 Thank you Questions? Check our Paper, Github, and Youtube Channel! 53

Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields Authors: Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh Presented by: Suraj Kesavan, Priscilla Jennifer ECS 289G: Visual Recognition

More information

DeepPose & Convolutional Pose Machines

DeepPose & Convolutional Pose Machines DeepPose & Convolutional Pose Machines Main Concepts 1. 2. 3. 4. CNN with regressor head. Object Localization. Bigger to smaller or Smaller to bigger view. Deep Supervised learning to prevent Vanishing

More information

Human Pose Estimation using Global and Local Normalization. Ke Sun, Cuiling Lan, Junliang Xing, Wenjun Zeng, Dong Liu, Jingdong Wang

Human Pose Estimation using Global and Local Normalization. Ke Sun, Cuiling Lan, Junliang Xing, Wenjun Zeng, Dong Liu, Jingdong Wang Human Pose Estimation using Global and Local Normalization Ke Sun, Cuiling Lan, Junliang Xing, Wenjun Zeng, Dong Liu, Jingdong Wang Overview of the supplementary material In this supplementary material,

More information

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach Xingyi Zhou, Qixing Huang, Xiao Sun, Xiangyang Xue, Yichen Wei UT Austin & MSRA & Fudan Human Pose Estimation Pose representation

More information

An Efficient Convolutional Network for Human Pose Estimation

An Efficient Convolutional Network for Human Pose Estimation RAFI, KOSTRIKOV, GALL, LEIBE: EFFICIENT CNN FOR HUMAN POSE ESTIMATION 1 An Efficient Convolutional Network for Human Pose Estimation Umer Rafi 1 rafi@vision.rwth-aachen.de Ilya Kostrikov 1 ilya.kostrikov@rwth-aachen.de

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha

More information

Pose estimation using a variety of techniques

Pose estimation using a variety of techniques Pose estimation using a variety of techniques Keegan Go Stanford University keegango@stanford.edu Abstract Vision is an integral part robotic systems a component that is needed for robots to interact robustly

More information

Detecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA)

Detecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA) Detecting and Parsing of Visual Objects: Humans and Animals Alan Yuille (UCLA) Summary This talk describes recent work on detection and parsing visual objects. The methods represent objects in terms of

More information

Multi-Scale Structure-Aware Network for Human Pose Estimation

Multi-Scale Structure-Aware Network for Human Pose Estimation Multi-Scale Structure-Aware Network for Human Pose Estimation Lipeng Ke 1, Ming-Ching Chang 2, Honggang Qi 1, and Siwei Lyu 2 1 University of Chinese Academy of Sciences, Beijing, China 2 University at

More information

Deep Learning for Virtual Shopping. Dr. Jürgen Sturm Group Leader RGB-D

Deep Learning for Virtual Shopping. Dr. Jürgen Sturm Group Leader RGB-D Deep Learning for Virtual Shopping Dr. Jürgen Sturm Group Leader RGB-D metaio GmbH Augmented Reality with the Metaio SDK: IKEA Catalogue App Metaio: Augmented Reality Metaio SDK for ios, Android and Windows

More information

Multi-Scale Structure-Aware Network for Human Pose Estimation

Multi-Scale Structure-Aware Network for Human Pose Estimation Multi-Scale Structure-Aware Network for Human Pose Estimation Lipeng Ke 1, Ming-Ching Chang 2, Honggang Qi 1, Siwei Lyu 2 1 University of Chinese Academy of Sciences, Beijing, China 2 University at Albany,

More information

LOCALIZATION OF HUMANS IN IMAGES USING CONVOLUTIONAL NETWORKS

LOCALIZATION OF HUMANS IN IMAGES USING CONVOLUTIONAL NETWORKS LOCALIZATION OF HUMANS IN IMAGES USING CONVOLUTIONAL NETWORKS JONATHAN TOMPSON ADVISOR: CHRISTOPH BREGLER OVERVIEW Thesis goal: State-of-the-art human pose recognition from depth (hand tracking) Domain

More information

Stereo Human Keypoint Estimation

Stereo Human Keypoint Estimation Stereo Human Keypoint Estimation Kyle Brown Stanford University Stanford Intelligent Systems Laboratory kjbrown7@stanford.edu Abstract The goal of this project is to accurately estimate human keypoint

More information

Deformable Part Models

Deformable Part Models CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones

More information

An Efficient Convolutional Network for Human Pose Estimation

An Efficient Convolutional Network for Human Pose Estimation RAFI, LEIBE, GALL: 1 An Efficient Convolutional Network for Human Pose Estimation Umer Rafi 1 rafi@vision.rwth-aachen.de Juergen Gall 2 gall@informatik.uni-bonn.de Bastian Leibe 1 leibe@vision.rwth-aachen.de

More information

CS231N Section. Video Understanding 6/1/2018

CS231N Section. Video Understanding 6/1/2018 CS231N Section Video Understanding 6/1/2018 Outline Background / Motivation / History Video Datasets Models Pre-deep learning CNN + RNN 3D convolution Two-stream What we ve seen in class so far... Image

More information

Human Upper Body Pose Estimation in Static Images

Human Upper Body Pose Estimation in Static Images 1. Research Team Human Upper Body Pose Estimation in Static Images Project Leader: Graduate Students: Prof. Isaac Cohen, Computer Science Mun Wai Lee 2. Statement of Project Goals This goal of this project

More information

arxiv: v1 [cs.cv] 20 Nov 2015

arxiv: v1 [cs.cv] 20 Nov 2015 eepcut: Joint Subset Partition and Labeling for Multi Person Pose Estimation Leonid Pishchulin, Eldar Insafutdinov, Siyu Tang, Bjoern Andres, Mykhaylo Andriluka,, Peter Gehler, and Bernt Schiele arxiv:.0v

More information

arxiv: v3 [cs.cv] 9 Jun 2015

arxiv: v3 [cs.cv] 9 Jun 2015 Efficient Object Localization Using Convolutional Networks Jonathan Tompson, Ross Goroshin, Arjun Jain, Yann LeCun, Christoph Bregler New York University tompson/goroshin/ajain/lecun/bregler@cims.nyu.edu

More information

arxiv: v2 [cs.cv] 26 Apr 2016

arxiv: v2 [cs.cv] 26 Apr 2016 eepcut: Joint Subset Partition and Labeling for Multi Person Pose Estimation Leonid Pishchulin, Eldar Insafutdinov, Siyu Tang, Bjoern Andres, Mykhaylo Andriluka,, Peter Gehler, and Bernt Schiele arxiv:.0v

More information

Human Pose Estimation with Deep Learning. Wei Yang

Human Pose Estimation with Deep Learning. Wei Yang Human Pose Estimation with Deep Learning Wei Yang Applications Understand Activities Family Robots American Heist (2014) - The Bank Robbery Scene 2 What do we need to know to recognize a crime scene? 3

More information

Vision based autonomous driving - A survey of recent methods. -Tejus Gupta

Vision based autonomous driving - A survey of recent methods. -Tejus Gupta Vision based autonomous driving - A survey of recent methods -Tejus Gupta Presently, there are three major paradigms for vision based autonomous driving: Directly map input image to driving action using

More information

Ensemble Convolutional Neural Networks for Pose Estimation

Ensemble Convolutional Neural Networks for Pose Estimation Ensemble Convolutional Neural Networks for Pose Estimation Yuki Kawana b, Norimichi Ukita a,, Jia-Bin Huang c, Ming-Hsuan Yang d a Toyota Technological Institute, Japan b Nara Institute of Science and

More information

Convolutional Pose Machines

Convolutional Pose Machines onvolutional ose Machines Shih-En Wei shihenw@cmu.edu Varun Ramakrishna vramakri@cs.cmu.edu Takeo Kanade Takeo.Kanade@cs.cmu.edu The Robotics Institute arnegie Mellon University Yaser Sheikh yaser@cs.cmu.edu

More information

Depth from Stereo. Dominic Cheng February 7, 2018

Depth from Stereo. Dominic Cheng February 7, 2018 Depth from Stereo Dominic Cheng February 7, 2018 Agenda 1. Introduction to stereo 2. Efficient Deep Learning for Stereo Matching (W. Luo, A. Schwing, and R. Urtasun. In CVPR 2016.) 3. Cascade Residual

More information

Contexts and 3D Scenes

Contexts and 3D Scenes Contexts and 3D Scenes Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem Administrative stuffs Final project presentation Nov 30 th 3:30 PM 4:45 PM Grading Three senior graders (30%)

More information

Local-Level 3D Deep Learning. Andy Zeng

Local-Level 3D Deep Learning. Andy Zeng Local-Level 3D Deep Learning Andy Zeng 1 Matching 3D Data Image Credits: Song and Xiao, Tevs et al. 2 Matching 3D Data Reconstruction Image Credits: Song and Xiao, Tevs et al. 3 Matching 3D Data Reconstruction

More information

Efficient Segmentation-Aided Text Detection For Intelligent Robots

Efficient Segmentation-Aided Text Detection For Intelligent Robots Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related

More information

Multi-scale Adaptive Structure Network for Human Pose Estimation from Color Images

Multi-scale Adaptive Structure Network for Human Pose Estimation from Color Images Multi-scale Adaptive Structure Network for Human Pose Estimation from Color Images Wenlin Zhuang 1, Cong Peng 2, Siyu Xia 1, and Yangang, Wang 1 1 School of Automation, Southeast University, Nanjing, China

More information

THE task of human pose estimation is to determine the. Knowledge-Guided Deep Fractal Neural Networks for Human Pose Estimation

THE task of human pose estimation is to determine the. Knowledge-Guided Deep Fractal Neural Networks for Human Pose Estimation 1 Knowledge-Guided Deep Fractal Neural Networks for Human Pose Estimation Guanghan Ning, Student Member, IEEE, Zhi Zhang, Student Member, IEEE, and Zhihai He, Fellow, IEEE arxiv:1705.02407v2 [cs.cv] 8

More information

VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera

VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera By Dushyant Mehta, Srinath Sridhar, Oleksandr Sotnychenko, Helge Rhodin, Mohammad Shafiei, Hans-Peter Seidel, Weipeng Xu, Dan Casas, Christian

More information

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality

More information

Deep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia

Deep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia Deep learning for dense per-pixel prediction Chunhua Shen The University of Adelaide, Australia Image understanding Classification error Convolution Neural Networks 0.3 0.2 0.1 Image Classification [Krizhevsky

More information

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin

More information

3D Face and Hand Tracking for American Sign Language Recognition

3D Face and Hand Tracking for American Sign Language Recognition 3D Face and Hand Tracking for American Sign Language Recognition NSF-ITR (2004-2008) D. Metaxas, A. Elgammal, V. Pavlovic (Rutgers Univ.) C. Neidle (Boston Univ.) C. Vogler (Gallaudet) The need for automated

More information

Deep Learning for Vision

Deep Learning for Vision Deep Learning for Vision Presented by Kevin Matzen Quick Intro - DNN Feed-forward Sparse connectivity (layer to layer) Different layer types Recently popularized for vision [Krizhevsky, et. al. NIPS 2012]

More information

arxiv: v3 [cs.cv] 28 Mar 2016 Abstract

arxiv: v3 [cs.cv] 28 Mar 2016 Abstract onvolutional ose Machines Shih-En Wei shihenw@cmu.edu Varun Ramakrishna varunnr@cs.cmu.edu Takeo Kanade Takeo.Kanade@cs.cmu.edu The Robotics Institute arnegie Mellon University Yaser Sheikh yaser@cs.cmu.edu

More information

Deep Incremental Scene Understanding. Federico Tombari & Christian Rupprecht Technical University of Munich, Germany

Deep Incremental Scene Understanding. Federico Tombari & Christian Rupprecht Technical University of Munich, Germany Deep Incremental Scene Understanding Federico Tombari & Christian Rupprecht Technical University of Munich, Germany C. Couprie et al. "Toward Real-time Indoor Semantic Segmentation Using Depth Information"

More information

A Cascaded Inception of Inception Network with Attention Modulated Feature Fusion for Human Pose Estimation

A Cascaded Inception of Inception Network with Attention Modulated Feature Fusion for Human Pose Estimation A Cascaded Inception of Inception Network with Attention Modulated Feature Fusion for Human Pose Estimation Submission ID: 2065 Abstract Accurate keypoint localization of human pose needs diversified features:

More information

Contexts and 3D Scenes

Contexts and 3D Scenes Contexts and 3D Scenes Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem Administrative stuffs Final project presentation Dec 1 st 3:30 PM 4:45 PM Goodwin Hall Atrium Grading Three

More information

A Cascaded Inception of Inception Network with Attention Modulated Feature Fusion for Human Pose Estimation

A Cascaded Inception of Inception Network with Attention Modulated Feature Fusion for Human Pose Estimation The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18) A Cascaded Inception of Inception Network with Attention Modulated Feature Fusion for Human Pose Estimation Wentao Liu, 1,2 Jie Chen,

More information

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,

More information

Pedestrian and Part Position Detection using a Regression-based Multiple Task Deep Convolutional Neural Network

Pedestrian and Part Position Detection using a Regression-based Multiple Task Deep Convolutional Neural Network Pedestrian and Part Position Detection using a Regression-based Multiple Tas Deep Convolutional Neural Networ Taayoshi Yamashita Computer Science Department yamashita@cs.chubu.ac.jp Hiroshi Fuui Computer

More information

arxiv: v1 [cs.cv] 6 Oct 2017

arxiv: v1 [cs.cv] 6 Oct 2017 Human Pose Regression by Combining Indirect Part Detection and Contextual Information arxiv:1710.02322v1 [cs.cv] 6 Oct 2017 Diogo C. Luvizon Abstract In this paper, we propose an end-to-end trainable regression

More information

arxiv: v5 [cs.cv] 4 Oct 2017

arxiv: v5 [cs.cv] 4 Oct 2017 Monocular 3D Human Pose Estimation In The Wild Using Improved CNN Supervision Dushyant Mehta 1, Helge Rhodin 2, Dan Casas 3, Pascal Fua 2, Oleksandr Sotnychenko 1, Weipeng Xu 1, and Christian Theobalt

More information

Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor Supplemental Document

Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor Supplemental Document Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor Supplemental Document Franziska Mueller 1,2 Dushyant Mehta 1,2 Oleksandr Sotnychenko 1 Srinath Sridhar 1 Dan Casas 3 Christian Theobalt

More information

arxiv: v2 [cs.cv] 15 Aug 2017 Abstract

arxiv: v2 [cs.cv] 15 Aug 2017 Abstract Self Adversarial Training for Human Pose Estimation Chia-Jung Chou, Jui-Ting Chien, and Hwann-Tzong Chen Department of Computer Science National Tsing Hua University, Taiwan {jessie33321, ydnaandy123}@gmail.com

More information

Progress in Computer Vision in the Last Decade & Open Problems: People Detection & Human Pose Estimation

Progress in Computer Vision in the Last Decade & Open Problems: People Detection & Human Pose Estimation Progress in Computer Vision in the Last Decade & Open Problems: People Detection & Human Pose Estimation Bernt Schiele Max Planck Institute for Informatics & Saarland University, Saarland Informatics Campus

More information

Finding Tiny Faces Supplementary Materials

Finding Tiny Faces Supplementary Materials Finding Tiny Faces Supplementary Materials Peiyun Hu, Deva Ramanan Robotics Institute Carnegie Mellon University {peiyunh,deva}@cs.cmu.edu 1. Error analysis Quantitative analysis We plot the distribution

More information

3D Pose Estimation using Synthetic Data over Monocular Depth Images

3D Pose Estimation using Synthetic Data over Monocular Depth Images 3D Pose Estimation using Synthetic Data over Monocular Depth Images Wei Chen cwind@stanford.edu Xiaoshi Wang xiaoshiw@stanford.edu Abstract We proposed an approach for human pose estimation over monocular

More information

arxiv: v2 [cs.cv] 26 Jul 2016

arxiv: v2 [cs.cv] 26 Jul 2016 arxiv:1603.06937v2 [cs.cv] 26 Jul 2016 Stacked Hourglass Networks for Human Pose Estimation Alejandro Newell, Kaiyu Yang, and Jia Deng University of Michigan, Ann Arbor {alnewell,yangky,jiadeng}@umich.edu

More information

Deeply Learned Compositional Models for Human Pose Estimation

Deeply Learned Compositional Models for Human Pose Estimation Deeply Learned Compositional Models for Human Pose Estimation Wei Tang, Pei Yu and Ying Wu Northwestern University 2145 Sheridan Road, Evanston, IL 60208 {wtt450, pyi980, yingwu}@eecs.northwestern.edu

More information

Combining PGMs and Discriminative Models for Upper Body Pose Detection

Combining PGMs and Discriminative Models for Upper Body Pose Detection Combining PGMs and Discriminative Models for Upper Body Pose Detection Gedas Bertasius May 30, 2014 1 Introduction In this project, I utilized probabilistic graphical models together with discriminative

More information

Detecting Parts for Action Localization

Detecting Parts for Action Localization CHESNEAU ET AL.: DETECTING PARTS FOR ACTION LOCALIZATION 1 Detecting Parts for Action Localization Nicolas Chesneau nicolas.chesneau@inria.fr Grégory Rogez gregory.rogez@inria.fr Karteek Alahari karteek.alahari@inria.fr

More information

MSCOCO Keypoints Challenge Megvii (Face++)

MSCOCO Keypoints Challenge Megvii (Face++) MSCOCO Keypoints Challenge 2017 Megvii (Face++) Team members(keypoints & Detection): Yilun Chen* Zhicheng Wang* Xiangyu Peng Zhiqiang Zhang Gang Yu Chao Peng Tete Xiao Zeming Li Xiangyu Zhang Yuning Jiang

More information

arxiv: v3 [cs.cv] 28 Aug 2018

arxiv: v3 [cs.cv] 28 Aug 2018 Single-Shot Multi-Person 3D Pose Estimation From Monocular RGB Dushyant Mehta [1,2], Oleksandr Sotnychenko [1,2], Franziska Mueller [1,2], Weipeng Xu [1,2], Srinath Sridhar [3], Gerard Pons-Moll [1,2],

More information

arxiv: v1 [cs.cv] 24 Nov 2016

arxiv: v1 [cs.cv] 24 Nov 2016 Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields Zhe ao Tomas Simon Shih-En Wei Yaser Sheikh The Robotics Institute, arnegie Mellon University arxiv:6.85v [cs.v] 24 Nov 26 {zhecao,shihenw}@cmu.edu

More information

Multi-Person Pose Estimation with Local Joint-to-Person Associations

Multi-Person Pose Estimation with Local Joint-to-Person Associations Multi-Person Pose Estimation with Local Joint-to-Person Associations Umar Iqbal and Juergen Gall omputer Vision Group University of Bonn, Germany {uiqbal,gall}@iai.uni-bonn.de Abstract. Despite of the

More information

Learning visual odometry with a convolutional network

Learning visual odometry with a convolutional network Learning visual odometry with a convolutional network Kishore Konda 1, Roland Memisevic 2 1 Goethe University Frankfurt 2 University of Montreal konda.kishorereddy@gmail.com, roland.memisevic@gmail.com

More information

Deep globally constrained MRFs for Human Pose Estimation

Deep globally constrained MRFs for Human Pose Estimation Deep globally constrained MRFs for Human Pose Estimation Ioannis Marras, Petar Palasek and Ioannis Patras School of Electronic Engineering and Computer Science, Queen Mary University of London, United

More information

From 3D descriptors to monocular 6D pose: what have we learned?

From 3D descriptors to monocular 6D pose: what have we learned? ECCV Workshop on Recovering 6D Object Pose From 3D descriptors to monocular 6D pose: what have we learned? Federico Tombari CAMP - TUM Dynamic occlusion Low latency High accuracy, low jitter No expensive

More information

Joint Object Detection and Viewpoint Estimation using CNN features

Joint Object Detection and Viewpoint Estimation using CNN features Joint Object Detection and Viewpoint Estimation using CNN features Carlos Guindel, David Martín and José M. Armingol cguindel@ing.uc3m.es Intelligent Systems Laboratory Universidad Carlos III de Madrid

More information

Yiqi Yan. May 10, 2017

Yiqi Yan. May 10, 2017 Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field

More information

Recognizing people. Deva Ramanan

Recognizing people. Deva Ramanan Recognizing people Deva Ramanan The goal Why focus on people? How many person-pixels are in a video? 35% 34% Movies TV 40% YouTube Let s start our discussion with a loaded question: why is visual recognition

More information

Learning-based Localization

Learning-based Localization Learning-based Localization Eric Brachmann ECCV 2018 Tutorial on Visual Localization - Feature-based vs. Learned Approaches Torsten Sattler, Eric Brachmann Roadmap Machine Learning Basics [10min] Convolutional

More information

SurfNet: Generating 3D shape surfaces using deep residual networks-supplementary Material

SurfNet: Generating 3D shape surfaces using deep residual networks-supplementary Material SurfNet: Generating 3D shape surfaces using deep residual networks-supplementary Material Ayan Sinha MIT Asim Unmesh IIT Kanpur Qixing Huang UT Austin Karthik Ramani Purdue sinhayan@mit.edu a.unmesh@gmail.com

More information

Video Google faces. Josef Sivic, Mark Everingham, Andrew Zisserman. Visual Geometry Group University of Oxford

Video Google faces. Josef Sivic, Mark Everingham, Andrew Zisserman. Visual Geometry Group University of Oxford Video Google faces Josef Sivic, Mark Everingham, Andrew Zisserman Visual Geometry Group University of Oxford The objective Retrieve all shots in a video, e.g. a feature length film, containing a particular

More information

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009 Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer

More information

Machine learning based automatic extrinsic calibration of an onboard monocular camera for driving assistance applications on smart mobile devices

Machine learning based automatic extrinsic calibration of an onboard monocular camera for driving assistance applications on smart mobile devices Technical University of Cluj-Napoca Image Processing and Pattern Recognition Research Center www.cv.utcluj.ro Machine learning based automatic extrinsic calibration of an onboard monocular camera for driving

More information

A Dual-Source Approach for 3D Human Pose Estimation from Single Images

A Dual-Source Approach for 3D Human Pose Estimation from Single Images 1 Computer Vision and Image Understanding journal homepage: www.elsevier.com A Dual-Source Approach for 3D Human Pose Estimation from Single Images Umar Iqbal a,, Andreas Doering a, Hashim Yasin b, Björn

More information

Human Motion Reconstruction from Action Video Data Using a 3-Layer-LSTM

Human Motion Reconstruction from Action Video Data Using a 3-Layer-LSTM Human Motion Reconstruction from Action Video Data Using a 3-Layer-LSTM Jihee Hwang Stanford Computer Science jiheeh@stanford.edu Danish Shabbir Stanford Electrical Engineering danishs@stanford.edu Abstract

More information

Spatial Localization and Detection. Lecture 8-1

Spatial Localization and Detection. Lecture 8-1 Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday

More information

Motion Estimation for Video Coding Standards

Motion Estimation for Video Coding Standards Motion Estimation for Video Coding Standards Prof. Ja-Ling Wu Department of Computer Science and Information Engineering National Taiwan University Introduction of Motion Estimation The goal of video compression

More information

RTSR: Enhancing Real-time H.264 Video Streaming using Deep Learning based Video Super Resolution Spring 2017 CS570 Project Presentation June 8, 2017

RTSR: Enhancing Real-time H.264 Video Streaming using Deep Learning based Video Super Resolution Spring 2017 CS570 Project Presentation June 8, 2017 RTSR: Enhancing Real-time H.264 Video Streaming using Deep Learning based Video Super Resolution Spring 2017 CS570 Project Presentation June 8, 2017 Team 16 Soomin Kim Leslie Tiong Youngki Kwon Insu Jang

More information

Feature Tracking and Optical Flow

Feature Tracking and Optical Flow Feature Tracking and Optical Flow Prof. D. Stricker Doz. G. Bleser Many slides adapted from James Hays, Derek Hoeim, Lana Lazebnik, Silvio Saverse, who 1 in turn adapted slides from Steve Seitz, Rick Szeliski,

More information

CMU Facilities. Motion Capture Lab. Panoptic Studio

CMU Facilities. Motion Capture Lab. Panoptic Studio CMU Facilities Motion Capture Lab The 1700 square foot Motion Capture Lab provides a resource for behavior capture of humans as well as measuring and controlling robot behavior in real time. It includes

More information

Depth Estimation from a Single Image Using a Deep Neural Network Milestone Report

Depth Estimation from a Single Image Using a Deep Neural Network Milestone Report Figure 1: The architecture of the convolutional network. Input: a single view image; Output: a depth map. 3 Related Work In [4] they used depth maps of indoor scenes produced by a Microsoft Kinect to successfully

More information

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm Instructions This is an individual assignment. Individual means each student must hand in their

More information

Estimating Human Pose in Images. Navraj Singh December 11, 2009

Estimating Human Pose in Images. Navraj Singh December 11, 2009 Estimating Human Pose in Images Navraj Singh December 11, 2009 Introduction This project attempts to improve the performance of an existing method of estimating the pose of humans in still images. Tasks

More information

Action recognition in videos

Action recognition in videos Action recognition in videos Cordelia Schmid INRIA Grenoble Joint work with V. Ferrari, A. Gaidon, Z. Harchaoui, A. Klaeser, A. Prest, H. Wang Action recognition - goal Short actions, i.e. drinking, sit

More information

arxiv: v2 [cs.cv] 11 Apr 2017

arxiv: v2 [cs.cv] 11 Apr 2017 3D Human Pose Estimation = 2D Pose Estimation + Matching Ching-Hang Chen Carnegie Mellon University chinghac@andrew.cmu.edu Deva Ramanan Carnegie Mellon University deva@cs.cmu.edu arxiv:1612.06524v2 [cs.cv]

More information

arxiv: v4 [cs.cv] 2 Sep 2017

arxiv: v4 [cs.cv] 2 Sep 2017 RMPE: Regional Multi-Person Pose Estimation Hao-Shu Fang 1, Shuqin Xie 1, Yu-Wing Tai 2, Cewu Lu 1 1 Shanghai Jiao Tong University, China 2 Tencent YouTu fhaoshu@gmail.com qweasdshu@sjtu.edu.cn yuwingtai@tencent.com

More information

ECE 5470 Classification, Machine Learning, and Neural Network Review

ECE 5470 Classification, Machine Learning, and Neural Network Review ECE 5470 Classification, Machine Learning, and Neural Network Review Due December 1. Solution set Instructions: These questions are to be answered on this document which should be submitted to blackboard

More information

arxiv: v1 [cs.cv] 31 Mar 2016

arxiv: v1 [cs.cv] 31 Mar 2016 Object Boundary Guided Semantic Segmentation Qin Huang, Chunyang Xia, Wenchao Zheng, Yuhang Song, Hao Xu and C.-C. Jay Kuo arxiv:1603.09742v1 [cs.cv] 31 Mar 2016 University of Southern California Abstract.

More information

Motion and Optical Flow. Slides from Ce Liu, Steve Seitz, Larry Zitnick, Ali Farhadi

Motion and Optical Flow. Slides from Ce Liu, Steve Seitz, Larry Zitnick, Ali Farhadi Motion and Optical Flow Slides from Ce Liu, Steve Seitz, Larry Zitnick, Ali Farhadi We live in a moving world Perceiving, understanding and predicting motion is an important part of our daily lives Motion

More information

Tri-modal Human Body Segmentation

Tri-modal Human Body Segmentation Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4

More information

Pose Machines: Articulated Pose Estimation via Inference Machines

Pose Machines: Articulated Pose Estimation via Inference Machines Pose Machines: Articulated Pose Estimation via Inference Machines Varun Ramakrishna, Daniel Munoz, Martial Hebert, James Andrew Bagnell, and Yaser Sheikh The Robotics Institute, Carnegie Mellon University,

More information

3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington

3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington 3D Object Recognition and Scene Understanding from RGB-D Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World

More information

arxiv: v4 [cs.cv] 12 Sep 2018

arxiv: v4 [cs.cv] 12 Sep 2018 arxiv:1711.08585v4 [cs.cv] 12 Sep 2018 Exploiting temporal information for 3D human pose estimation Mir Rayat Imtiaz Hossain, James J. Little Department of Computer Science, University of British Columbia

More information

Exploiting temporal information for 3D human pose estimation

Exploiting temporal information for 3D human pose estimation Exploiting temporal information for 3D human pose estimation Mir Rayat Imtiaz Hossain, James J. Little Department of Computer Science, University of British Columbia {rayat137,little}@cs.ubc.ca Abstract.

More information

Bidirectional Recurrent Convolutional Networks for Video Super-Resolution

Bidirectional Recurrent Convolutional Networks for Video Super-Resolution Bidirectional Recurrent Convolutional Networks for Video Super-Resolution Qi Zhang & Yan Huang Center for Research on Intelligent Perception and Computing (CRIPAC) National Laboratory of Pattern Recognition

More information

Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations

Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations Xianjie Chen University of California, Los Angeles Los Angeles, CA 90024 cxj@ucla.edu Alan Yuille University of

More information

Deep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon

Deep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon Deep Learning For Video Classification Presented by Natalie Carlebach & Gil Sharon Overview Of Presentation Motivation Challenges of video classification Common datasets 4 different methods presented in

More information

Using temporal seeding to constrain the disparity search range in stereo matching

Using temporal seeding to constrain the disparity search range in stereo matching Using temporal seeding to constrain the disparity search range in stereo matching Thulani Ndhlovu Mobile Intelligent Autonomous Systems CSIR South Africa Email: tndhlovu@csir.co.za Fred Nicolls Department

More information

Joint Vanishing Point Extraction and Tracking (Supplementary Material)

Joint Vanishing Point Extraction and Tracking (Supplementary Material) Joint Vanishing Point Extraction and Tracking (Supplementary Material) Till Kroeger1 1 Dengxin Dai1 Luc Van Gool1,2 Computer Vision Laboratory, D-ITET, ETH Zurich 2 VISICS, ESAT/PSI, KU Leuven {kroegert,

More information

In-Bed Pose Estimation: Deep Learning with Shallow Dataset

In-Bed Pose Estimation: Deep Learning with Shallow Dataset JOURNAL OF, VOL., NO., MONTH YEAR 1 In-Bed Pose Estimation: Deep Learning with Shallow Dataset Shuangjun Liu, Yu Yin, and Sarah Ostadabbas arxiv:1711.15v3 [cs.cv] 7 Jul 218 Abstract This paper presents

More information

CS 558: Computer Vision 13 th Set of Notes

CS 558: Computer Vision 13 th Set of Notes CS 558: Computer Vision 13 th Set of Notes Instructor: Philippos Mordohai Webpage: www.cs.stevens.edu/~mordohai E-mail: Philippos.Mordohai@stevens.edu Office: Lieb 215 Overview Context and Spatial Layout

More information

Photo-realistic Renderings for Machines Seong-heum Kim

Photo-realistic Renderings for Machines Seong-heum Kim Photo-realistic Renderings for Machines 20105034 Seong-heum Kim CS580 Student Presentations 2016.04.28 Photo-realistic Renderings for Machines Scene radiances Model descriptions (Light, Shape, Material,

More information

2. Related Work. show the improvement by using an architecture with multi-scale supervision and fusion for joint detection.

2. Related Work. show the improvement by using an architecture with multi-scale supervision and fusion for joint detection. Human Pose Estimation using Global and Local Normalization Ke Sun 1, Cuiling Lan 2, Junliang Xing 3, Wenjun Zeng 2, Dong Liu 1, Jingdong Wang 2 1 3 Universit of Science and Technolog of China, Anhui, China

More information

arxiv: v1 [cs.cv] 3 Apr 2018

arxiv: v1 [cs.cv] 3 Apr 2018 International Journal of Computer Vision https://doi.org/10.1007/s11263-018-1074-6 3D Interpreter Networks for Viewer-Centered Wireframe Modeling Jiajun Wu 1 Tianfan Xue 2 Joseph J. Lim 3 Yuandong Tian

More information