ECE 6554:Advanced Computer Vision Pose Estimation

Size: px

Start display at page:

Download "ECE 6554:Advanced Computer Vision Pose Estimation"

Bertha Jackson
5 years ago
Views:

1 ECE 6554:Advanced Computer Vision Pose Estimation Sujay Yadawadkar, Virginia Tech,

2 Agenda: Pose Estimation: Part Based Models for Pose Estimation Pose Estimation with Convolutional Neural Networks (Deep pose) Pose Estimation with Sequential Prediction (Pose Machines)

3 Estimating Articulated Poses Localizing Body Joints from Monocular Images 3

5 Estimating Articulated Poses from Monocular Images Why it is Hard? large variance occlusion L/R ambiguity 5

6 Estimating Articulated Poses from Monocular Images Direct Mapping 6

7 Part-based Models Recognizing Local Appearance Part detector for wrist Confidence maps

8 Non-parametric Uncertainty on Confidence Maps right wrist 8

9 Extracting Features from Confidence Maps Loses Uncertainty Hand-crafted Context feature Peak Candidates for Graphical Models Regress to Displacement [Ramakrishna14] [Chen14] [Tompson15] [Pishchulin16] [Carriera16] 9

10 Pose Estimation with CNN Consider Pose Estimation as a regression problem. Loss function: L2 distance between ground truth of the pose vector and estimated pose vector.

11 Convolutional Pose Machines 1. Capture local appearance with CNNs 2. CNNs on confidence maps to capture long-range part dependencies (preserve uncertainty) 3. Iteratively refine confidence maps with global cues 11

12 Convolutional Pose Machines (CPMs) Stage 1 Stage 2 Stage T P P P 12

13 Convolutional Pose Machines Capturing Local Appearance by FCNN Stage 1 Stage 2 Stage T P P P Stage 1 P 13

14 Convolutional Pose Machines Learning Image-dependent Spatial Model Stage 1 Stage 2 Stage T P P P Stage 1 Stage 2 P P 14

15 Convolutional Pose Machines Large Receptive Field Stage 1 Stage 2 P P Stage T P Stage 1 Stage 2 P P 15

16 Convolutional Pose Machines Overall Architecture Stage 1 Stage 2 Stage T P P P Stage 1 Stage 2 Stage T 16

17 Iteratively Refined Confidence Maps right elbow right wrist Input Image 1st stage 2nd stage 3rd stage 18

18 Iteratively Refined Confidence Maps Recover from False Negative 1st stage 2nd stage 3rd stage 19

19 Iteratively Refined Confidence Maps Recover from False Negative 1st stage output 2nd stage 3rd stage 20

20 Iteratively Refined Confidence Maps 21

21 Training CPMs Ideal Confidence Maps for Intermediate Supervisions Stage 1 Stage 2 Stage T overall loss 23

22 Training CPMs Joint training with Intermediate Supervisions Stage 1 Stage 2 Stage T 24

23 Training CPMs Intermediate Supervisions Resolves Gradient Vanishing Stage 1 Stage 2 Stage 3 Stage 1 Stage 2 Stage 3 Without Intermediate Supervision With Intermediate Supervision 25

24 Convolutional Pose Machines Overall Architecture with Shared Image Features Stage 1 Stage 2 Stage T Stage 1 Stage 2 Stage T 26

25 Training CPMs Data Prepare and Augmentation 27

26 Analysis and Results 28

27 Benchmark Datasets FLIC LSP MPII size 3987 training 1016 testing training 1000 testing training testing type annotation movie scenes upper body sports full body diverse full body w/ truncation 29

28 Number of Stages PCK 0.2 (error tolerance) 30

29 Training Methods PCK 0.2 (error tolerance) 31

30 Quantitatively Results FLIC Upper Body with Observer Centric (OC) Annotations PCK 0.2 PCK

31 34 Quantitatively Results LSP Dataset with Person Centric (PC) Annotations PCK %

32 35 Quantitatively Results MPII Dataset with PC Annotations PCKh 0.5 LSP 90.1 %

33 Quantitatively Results MPII Dataset: Viewpoints 36

34 37

35 Failure Cases L/R confusion rare viewpoint rare pose severe occlusion right wrist 38

36 Summary Monocular human pose estimation are becoming reliable. CPMs capture complex long-range part dependencies by iteratively refining confidence maps with preserved uncertainty. CPMs naturally avoid the problem of vanishing gradient by intermediate supervisions. 39

37 What s Next? 40

38 From Single to Multi-Person Challenge: Identifying number of people and part-person association 41

39 Multi-person Human Pose Estimations Naive Two-phase Method CPM with P = 1 (person detector) 42 Credit: Zhe Cao

40 Pose Estimations in Finer Scales Faces 43 Source: Tomas Simon

41 Pose Estimations in Finer Scales Hands 44

42 CMU Panoptic Studio 500 Synced Cameras 45

43 Multiple Views for 3D Recon Right Wrist 46

44 Multiple Views for 3D Reconstruction 47 Source: Hanbyul Joo

45 Multiple Views for 3D Recon Projected Full Poses 48

46 Live Demo! 49

47 Real-time CPMs 50

48 Real-time CPMs: Confidence Map of Right Wrist 51

49 Future Directions - Analysis on failure cases and data distribution - One-shot multi-person pose estimation - Direct 3D reasoning - Temporal CPMs 52

50 Thank you Questions? Check our Paper, Github, and Youtube Channel! 53

Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields Authors: Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh Presented by: Suraj Kesavan, Priscilla Jennifer ECS 289G: Visual Recognition