Cascaded Pyramid Network for Multi-Person Pose Estimation
1 Cascaded Pyramid Network for Multi-Person Pose Estimation. Gang Yu, Megvii (Face++)
2 Team members: Yilun Chen*, Zhicheng Wang*, Xiangyu Peng, Zhiqiang Zhang, Gang Yu, Jian Sun (Code: ). Megvii (Face++)
3 Results COCO 17 Keypoints (test_challenge)
4 Overview: Top-down Pipeline; Network Design (Motivation: How do humans locate keypoints?; Our Network Architecture); Techniques & Experiments; Conclusion
5 Overview: Top-down Pipeline
6-8 Top-Down pipeline: Det, then crop, then Single Person Pose Estimation Network
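The three steps on this slide (detect, crop, single-person pose estimation) can be sketched as below. This is a minimal illustration, not the presenters' code: `detect_people` and `estimate_pose` are hypothetical callables standing in for the person detector and the pose network, and the image is a plain list of pixel rows.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x0, y0, x1, y1), x1 and y1 exclusive
Keypoint = Tuple[float, float]       # (x, y)
Image = Sequence[Sequence[float]]    # rows of pixel values


def crop(image: Image, box: Box) -> List[List[float]]:
    """Cut one detected person out of the full image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def top_down_pose(
    image: Image,
    detect_people: Callable[[Image], List[Box]],          # hypothetical detector
    estimate_pose: Callable[[Image], List[Keypoint]],     # hypothetical pose net
) -> List[List[Keypoint]]:
    """Run detection, then single-person pose estimation on each crop."""
    poses = []
    for box in detect_people(image):
        patch = crop(image, box)
        x0, y0 = box[0], box[1]
        # The pose network predicts in crop coordinates;
        # shift each keypoint back into full-image coordinates.
        poses.append([(x + x0, y + y0) for x, y in estimate_pose(patch)])
    return poses
```

Passing the detector and pose network in as arguments keeps the sketch independent of any particular model; a real pipeline would also resize each crop to the network's input size.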
9-10 Overview: Top-down Pipeline; Network Design (Motivation: How do humans locate keypoints?)
11-14 Motivation: How do humans locate keypoints? Visible easy keypoints (nose, left elbow, right hand): recognised from easy visible parts. Visible hard keypoints (left knee, right knee, left hip): hard to distinguish, so enlarge the view and use context. Invisible part (right shoulder): use context.
15 Network's Design Goal: from input image to output image, locate easy parts first and then hard parts, as the receptive view gets larger and takes in more context.
16 Overview: Top-down Pipeline; Network Design (Motivation: How do humans locate keypoints?; Our Network Architecture)
17 Network Architecture. Network Design Principles: inspired by the process by which humans locate keypoints, adapted to a CNN: locate easy parts => locate hard parts. Two stages: GlobalNet locates the easy parts (vanilla L2 loss); RefineNet locates the hard parts (deep layers) with online hard keypoint mining (hard mining loss).
18 Network Architecture. Heatmap view: the green dots mark the ground-truth locations of the keypoints. Easy parts like the left eye are successfully detected by GlobalNet, while hard parts like the left hip are missed. Hard parts like the left hip are successfully detected in the RefineNet stage.
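As a rough illustration of what "detected in the heatmap" means, a keypoint location can be read off a heatmap by taking its strongest response. The sketch below assumes a heatmap stored as a list of rows; the function name `decode_heatmap` is hypothetical, and the slides do not say how the presenters decode their heatmaps.

```python
def decode_heatmap(heatmap):
    """Return (x, y, score) of the strongest response in a 2-D heatmap.

    `heatmap` is a non-empty list of equally long rows of floats.
    """
    best = (0, 0, heatmap[0][0])
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best[2]:
                best = (x, y, value)
    return best
```

A low peak score is one simple signal that a part, such as the left hip in the GlobalNet output, has not been located confidently.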
19 Overview: Top-down Pipeline; Network Design (Motivation: How do humans locate keypoints?; Our Network Architecture); Techniques & Experiments
20-21 Techniques & Experiments: Person Detector. Non-Maximum Suppression (NMS) strategies: Soft NMS vs. Hard NMS
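The difference between the two NMS strategies can be sketched as follows. This is a generic illustration rather than the detector used in the talk: hard NMS discards any box that overlaps a higher-scoring kept box too much, while soft NMS (here with the Gaussian decay variant, an assumption) only lowers the score of overlapping boxes. The thresholds and `sigma` are illustrative defaults.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, iou_thresh=0.5):
    """Keep a box only if it does not overlap an already kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Decay the scores of overlapping boxes instead of removing them."""
    scores = list(scores)
    remaining = list(range(len(boxes)))
    keep = []  # (index, final score) pairs, in selection order
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thresh:
            break
        keep.append((best, scores[best]))
        for i in remaining:
            # Gaussian penalty: the larger the overlap, the stronger the decay.
            scores[i] *= math.exp(-iou(boxes[best], boxes[i]) ** 2 / sigma)
    return keep
```

In crowded scenes two people can overlap heavily, so a strategy that keeps the second box with a reduced score may preserve a person that hard NMS would drop.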
22-27 Techniques & Experiments: Person Detector. Detection Performance [table: Keypoint mAP vs. Det mAP; only the value 68.8 survives]
28-29 Techniques & Experiments: Cascaded Pyramid Network. Online Hard Keypoints Mining: of the N keypoints output by CPN, only the M hard keypoints contribute to the loss; the remaining N-M keypoints are not propagated (loss = 0).
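Online hard keypoint mining as described on this slide can be sketched as below: compute one loss per keypoint, keep the M largest, and ignore the other N-M. This is a minimal sketch under assumptions: each heatmap is a flat list of floats, the per-keypoint loss is a mean squared error, and the selected losses are averaged. The function name `ohkm_loss` is hypothetical.

```python
def ohkm_loss(pred, target, top_m):
    """Online hard keypoint mining over N per-keypoint heatmaps.

    pred, target: lists of N heatmaps, each a flat list of floats.
    Returns (loss over the top_m hardest keypoints, all per-keypoint losses).
    """
    per_keypoint = [
        sum((p - t) ** 2 for p, t in zip(hp, ht)) / len(hp)
        for hp, ht in zip(pred, target)
    ]
    # Only the M keypoints with the largest loss are kept; the remaining
    # N - M keypoints contribute zero loss and therefore no gradient.
    hardest = sorted(per_keypoint, reverse=True)[:top_m]
    return sum(hardest) / top_m, per_keypoint
```

Because the selection is redone for every sample, the set of "hard" keypoints changes as training progresses, which is what makes the mining online.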
30-39 Techniques & Experiments: Cascaded Pyramid Network. Design Choices of RefineNet
40 Techniques & Experiments: Data Pre-processing
41-43 Techniques & Experiments: Data Augmentation (+0.4 AP): crop augmentation, random scales (0.7~1.35), rotation (-45°~45°). Large Batch (+0.4~0.7 AP). Ensemble (+1.1~1.5 AP on minival): heatmap merge. [Table: AP% of our network with all techniques on COCO minival, COCO test_challenge, and COCO test_dev (single model)]
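The scale and rotation ranges on this slide can be turned into a small sampling routine. The sketch below is illustrative only: the uniform sampling, the function names, and the choice to rotate about a given center are assumptions, and the transform must be applied to the image as well as to the keypoint annotations.

```python
import math
import random

SCALE_RANGE = (0.7, 1.35)        # random scale range from the slide
ROTATION_RANGE = (-45.0, 45.0)   # rotation range in degrees, from the slide


def sample_augmentation(rng):
    """Draw one (scale, angle_in_degrees) pair; uniform sampling is assumed."""
    scale = rng.uniform(*SCALE_RANGE)
    angle = rng.uniform(*ROTATION_RANGE)
    return scale, angle


def transform_keypoint(point, center, scale, angle_deg):
    """Rotate a keypoint about `center` by `angle_deg`, then scale it."""
    theta = math.radians(angle_deg)
    dx, dy = point[0] - center[0], point[1] - center[1]
    x = center[0] + scale * (dx * math.cos(theta) - dy * math.sin(theta))
    y = center[1] + scale * (dx * math.sin(theta) + dy * math.cos(theta))
    return x, y
```

Passing in a seeded `random.Random` instance, for example `sample_augmentation(random.Random(0))`, keeps the augmentation reproducible across runs.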
44-46 Results on MS COCO
47 Results on PoseTrack
Method      AP
Ours        75.5
AlphaPose   66.7
ML_Lab      70.3
Leaderboard:
48-49 Illustrative results of our method
50 Conclusion: The two-stage network design is crucial. GlobalNet learns the overall keypoints and mainly locates the easy ones. RefineNet explicitly learns the hard keypoints with online hard keypoint mining. Intermediate supervision is important to the utility of ResNet in human pose estimation. The large-batch technique is applicable not only to object detection but also to keypoint estimation.
51 Thanks & Questions