Cascaded Pyramid Network for Multi-Person Pose Estimation

1 Cascaded Pyramid Network for Multi-Person Pose Estimation Gang YU Megvii (Face++)

2 Team members: Yilun Chen* Zhicheng Wang* Xiangyu Peng Zhiqiang Zhang Gang Yu Jian Sun ( Code: Megvii (Face++)

3 Results COCO 17 Keypoints (test_challenge)

4 Overview Top-down Pipeline Network Design Motivation: How do humans locate keypoints? Our Network Architecture Techniques & Experiments Conclusion

5 Overview Top-down Pipeline

6 Top-Down pipeline: Det

7 Top-Down pipeline: Det, then crop

8 Top-Down pipeline: Det, then crop, then Single Person Pose Estimation Network
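The three steps on this slide (detect, crop, estimate pose per person) can be sketched as below. This is a minimal illustration, not the team's code: `detect_people` and `estimate_pose` are hypothetical callables standing in for the person detector and the single-person pose network, and the image is a plain nested list so the sketch needs no libraries.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels
Point = Tuple[float, float]              # (x, y)


def crop(image: Sequence[Sequence[float]], box: Box) -> List[List[float]]:
    """Cut the box out of a row-major image, clamping the box to the image bounds."""
    height, width = len(image), len(image[0])
    x1, y1 = max(0, int(box[0])), max(0, int(box[1]))
    x2, y2 = min(width, int(box[2])), min(height, int(box[3]))
    return [list(row[x1:x2]) for row in image[y1:y2]]


def top_down_pose(
    image: Sequence[Sequence[float]],
    detect_people: Callable[[Sequence[Sequence[float]]], List[Box]],  # hypothetical detector
    estimate_pose: Callable[[List[List[float]]], List[Point]],        # hypothetical pose network
) -> List[List[Point]]:
    """Detect people, crop each one, estimate keypoints in the crop, map them back."""
    people = []
    for box in detect_people(image):
        patch = crop(image, box)
        keypoints = estimate_pose(patch)  # coordinates relative to the crop
        x0, y0 = max(0, int(box[0])), max(0, int(box[1]))
        # Shift crop coordinates back into full-image coordinates.
        people.append([(x + x0, y + y0) for x, y in keypoints])
    return people
```

Because the pose network only ever sees one person per crop, the quality of the detected boxes directly limits the keypoint result, which is why the later slides study the person detector.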

9 Overview Top-down Pipeline Network Design

10 Overview Top-down Pipeline Network Design Motivation: How do humans locate keypoints?

11 Motivation: How do humans locate keypoints?

12 Motivation: How do humans locate keypoints? Visible easy keypoints (nose, left elbow, right hand): located from the easy visible parts.

13 Motivation: How do humans locate keypoints? Visible easy keypoints (nose, left elbow, right hand): located from the easy visible parts. Visible hard keypoints (left knee, right knee, left hip): hard to distinguish, so enlarge the view and use context.

14 Motivation: How do humans locate keypoints? Visible easy keypoints (nose, left elbow, right hand): located from the easy visible parts. Visible hard keypoints (left knee, right knee, left hip): hard to distinguish, so enlarge the view and use context. Invisible keypoints (right shoulder): enlarge the view and infer them from context.

15 Network's Design Goal: between input image and output image, locate the easy parts first and the hard parts later, as the receptive view gets larger and takes in more context.

16 Overview Top-down Pipeline Network Design Motivation: How do humans locate keypoints? Our Network Architecture

17 Network Architecture. Network design principles: inspired by how humans locate keypoints and adapted to a CNN: locate the easy parts, then locate the hard parts. Two stages. GlobalNet: locates the easy parts (vanilla L2 loss). RefineNet: locates the hard parts (deep layers) with online hard keypoint mining (hard mining loss).

18 Network Architecture. Heatmap view: the green dots mark the ground-truth keypoint locations. GlobalNet successfully detects easy parts such as the left eye but fails on hard parts such as the left hip. The RefineNet stage successfully detects hard parts such as the left hip.

19 Overview Top-down Pipeline Network Design Motivation: How do humans locate keypoints? Our Network Architecture Techniques & Experiments

20 Techniques & Experiments Person Detector Non-Maximum Suppression (NMS) strategies: Soft NMS vs. Hard NMS

21 Techniques & Experiments Person Detector Non-Maximum Suppression (NMS) strategies
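To make the Soft NMS vs. Hard NMS comparison concrete, here is a small sketch of both strategies. It assumes the commonly used Gaussian score decay for Soft NMS; the slides do not say which variant or which thresholds were used, so `iou_thresh`, `sigma`, and `score_thresh` are illustrative values only.

```python
import math


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, iou_thresh=0.5):
    """Keep a box only if it does not overlap an already kept, higher-scoring box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Decay the scores of overlapping boxes instead of discarding them outright."""
    scores = list(scores)  # work on a copy; scores are rewritten as boxes are picked
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thresh:
            break  # every box still remaining scores lower than this one
        keep.append((best, scores[best]))
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)  # Gaussian penalty
    return keep
```

With two heavily overlapping boxes, Hard NMS drops the lower-scoring one, while Soft NMS keeps it with a reduced score. In crowded scenes that second box may be a different person, which is the usual argument for the soft variant.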

22 Techniques & Experiments Person Detector Detection Performance Keypoint mAP 68.8 Det mAP

23-26 Techniques & Experiments Person Detector Detection Performance Keypoint mAP Det mAP

27 Techniques & Experiments Person Detector Detection Performance

28 Techniques & Experiments Cascaded Pyramid Network Online Hard Keypoints Mining: of the N keypoints predicted by CPN, only the M hard keypoints contribute to the loss; the other N-M keypoints are not back-propagated (loss = 0).

29 Techniques & Experiments Cascaded Pyramid Network Online Hard Keypoints Mining
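The online hard keypoint mining rule (keep the loss of the M hard keypoints, zero out the other N-M) can be sketched as follows. This is an illustrative version under stated assumptions: heatmaps are nested lists of shape [N][H][W], difficulty is measured by each keypoint's mean squared error, and the selected losses are averaged over M. The slides do not specify the normalization, and `per_keypoint_l2` and `ohkm_loss` are names made up for this sketch.

```python
def per_keypoint_l2(pred, target):
    """Mean squared error of each keypoint heatmap; inputs are [N][H][W] nested lists."""
    losses = []
    for pred_map, target_map in zip(pred, target):
        squared = [
            (p - t) ** 2
            for pred_row, target_row in zip(pred_map, target_map)
            for p, t in zip(pred_row, target_row)
        ]
        losses.append(sum(squared) / len(squared))
    return losses


def l2_loss(pred, target):
    """Vanilla L2 loss: every keypoint contributes (the GlobalNet-style loss)."""
    losses = per_keypoint_l2(pred, target)
    return sum(losses) / len(losses)


def ohkm_loss(pred, target, top_m):
    """Hard mining loss: only the top_m largest per-keypoint losses contribute."""
    losses = per_keypoint_l2(pred, target)
    hardest = sorted(losses, reverse=True)[:top_m]  # the other N-M keypoints get loss = 0
    return sum(hardest) / top_m
```

The selection is redone for every sample at every training step, which is what makes the mining "online": a keypoint that is hard in one image can be ignored in the next.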

30-39 Techniques & Experiments Cascaded Pyramid Network Design Choices of RefineNet

40 Techniques & Experiments Data Pre-processing

41 Techniques & Experiments Data Augmentation (+0.4 AP): crop augmentation, random scales (0.7~1.35), rotation (-45°~45°)
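The scale and rotation ranges on this slide can be sampled and applied to keypoint coordinates as sketched below. The ranges come from the slide; everything else is an assumption for illustration: uniform sampling, rotation about the crop center, and the function names. In a real pipeline the same transform must also be applied to the image.

```python
import math
import random


def sample_augmentation(rng):
    """Draw a random scale in [0.7, 1.35] and a rotation in [-45, 45] degrees."""
    scale = rng.uniform(0.7, 1.35)
    angle_deg = rng.uniform(-45.0, 45.0)
    return scale, angle_deg


def transform_keypoints(keypoints, center, scale, angle_deg):
    """Rotate and scale (x, y) keypoints about the given center point."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    transformed = []
    for x, y in keypoints:
        dx, dy = x - cx, y - cy
        transformed.append((
            cx + scale * (cos_t * dx - sin_t * dy),
            cy + scale * (sin_t * dx + cos_t * dy),
        ))
    return transformed


# Usage: one random draw applied to two keypoints of a 256x192 crop.
rng = random.Random(0)
scale, angle = sample_augmentation(rng)
augmented = transform_keypoints([(100.0, 50.0), (120.0, 80.0)], (96.0, 128.0), scale, angle)
```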

42 Techniques & Experiments Data Augmentation (+0.4 AP): crop augmentation, random scales (0.7~1.35), rotation (-45°~45°). Large Batch (+0.4~0.7 AP)

43 Techniques & Experiments Data Augmentation (+0.4 AP): crop augmentation, random scales (0.7~1.35), rotation (-45°~45°). Large Batch (+0.4~0.7 AP). Ensemble (+1.1~1.5 AP on minival): heatmap merge. Table: our network with all techniques, AP% (COCO minival), AP% (COCO test_challenge), AP% (COCO test_dev, single model)
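One simple way to read "heatmap merge" is to average the heatmaps that several models predict for the same keypoint and take the peak of the average. The slides do not say how the merge was actually done, so plain averaging is an assumption here, and `merge_heatmaps` and `peak_location` are illustrative names.

```python
def merge_heatmaps(heatmaps_per_model):
    """Element-wise mean of same-sized [H][W] heatmaps, one per model."""
    count = len(heatmaps_per_model)
    rows = len(heatmaps_per_model[0])
    cols = len(heatmaps_per_model[0][0])
    return [
        [sum(h[r][c] for h in heatmaps_per_model) / count for c in range(cols)]
        for r in range(rows)
    ]


def peak_location(heatmap):
    """Return (x, y) of the largest value; the first one wins on ties."""
    best_value, best_xy = None, None
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy
```

Merging at the heatmap level, instead of averaging the final coordinates, lets a confident model outvote an uncertain one at each pixel before the peak is chosen.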

44-46 Results on MS COCO

47 Results on PoseTrack (AP by method): Ours 75.5, AlphaPose 66.7, ML_Lab 70.3. Leaderboard:

48-49 Illustrative results of our method

50 Conclusion. The two-stage network design is crucial. GlobalNet learns all the keypoints and mainly locates the easy ones. RefineNet explicitly learns the hard keypoints with online hard keypoint mining. Intermediate supervision is important for making ResNet useful in human pose estimation. The large-batch technique applies not only to object detection but also to keypoint estimation.

51 Thanks & Questions

MSCOCO Keypoints Challenge 2017 Megvii (Face++) Team members (keypoints & detection): Yilun Chen* Zhicheng Wang* Xiangyu Peng Zhiqiang Zhang Gang Yu Chao Peng Tete Xiao Zeming Li Xiangyu Zhang Yuning Jiang