LEARNING RIGIDITY IN DYNAMIC SCENES FOR SCENE FLOW ESTIMATION

1 LEARNING RIGIDITY IN DYNAMIC SCENES FOR SCENE FLOW ESTIMATION
Kihwan Kim, Senior Research Scientist
Zhaoyang Lv, Kihwan Kim, Alejandro Troccoli, Deqing Sun, James M. Rehg, Jan Kautz

2 CORRESPONDENCES IN COMPUTER VISION
Image courtesy Roy Shilkrot

3 OPTICAL FLOW Fan et al Brox and Malik 2011 Castro M

4 OPTICAL FLOW AND 3D SCENE FLOW Fan et al Brox and Malik 2011 Castro M Letouzey et al

5 APPLICATIONS OF 3D MOTION
3D reconstruction of dynamic scenes; AR and telepresence
[DynamicFusion, R. Newcombe, CVPR 2015] [Holoportation, Microsoft 2016]

6 APPLICATIONS OF 3D MOTION
3D scene understanding for autonomous driving; robotics interaction
[KITTI Dataset, A. Geiger, PAMI 2014] [SE3-Net, A. Byravan, ICRA 2017]

7 2D OPTICAL FLOW VS 3D SCENE FLOW
Why is 3D motion estimation challenging?

8 STATIC SCENE - MOVING CAMERA
[Figure labels: $\Omega_0$, $x_0$, $u_0$, $I_0$, $I_1$]

9 STATIC SCENE - MOVING CAMERA
$\delta u^{cm}_{0\to1}$: optical flow from camera motion
[Figure labels: $\Omega_0$, $x_0$, $u_0$, $\delta u^{sf0}_{0\to1}$, $\delta u^{cm}_{0\to1}$, $I_0$, $I_1$]

10 STATIC SCENE - MOVING CAMERA
$\delta u^{cm}_{0\to1}$: optical flow from camera motion
Structure (3D) from (camera) Motion
[Figure labels: $\Omega_0$, $x_0$, $u_0$, $\delta u^{sf0}_{0\to1}$, $\delta u^{cm}_{0\to1}$, $I_0$, $I_1$]
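Note: for a static scene, the optical flow from camera motion can be computed from depth and camera pose alone. The sketch below is one minimal way to do that with a pinhole camera. The function name `camera_motion_flow`, the intrinsics matrix `K`, and the convention that `(R, t)` maps frame-0 camera coordinates to frame-1 camera coordinates are assumptions made for illustration, not details taken from the slides.

```python
import numpy as np

def camera_motion_flow(depth0, K, R, t):
    """Flow induced at every pixel of I_0 by camera motion alone.

    depth0: (H, W) depth map of frame 0.
    K:      (3, 3) pinhole intrinsics (assumed known).
    R, t:   rotation (3, 3) and translation (3,) taking frame-0 camera
            coordinates to frame-1 camera coordinates (assumed convention).
    Returns an (H, W, 2) array of pixel displacements.
    """
    h, w = depth0.shape
    u, v = np.meshgrid(np.arange(w, dtype=float), np.arange(h, dtype=float))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)   # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                    # back-projected rays
    x0 = rays * depth0[..., None]                      # 3D points in frame 0
    x1 = x0 @ R.T + t                                  # same points in frame 1
    proj = x1 @ K.T                                    # project into I_1
    u1 = proj[..., :2] / proj[..., 2:3]
    return u1 - pix[..., :2]
```

With an identity pose the returned flow is zero everywhere. A pure sideways translation over a constant-depth plane gives a constant horizontal flow.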

11 DYNAMIC SCENE - FIXED CAMERA
[Figure labels: $\Omega_0$, $x_0$, $u_0$, $I_0$]

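12 DYNAMIC SCENE - FIXED CAMERA
$\delta x_{0\to1}$: scene flow
$\delta u^{sf1}_{0\to1}$: projected scene flow in $I_1$
[Figure labels: $\Omega_0$, $\Omega_1$, $x_0$, $x_1$, $\delta x_{0\to1}$, $u_0$, $u_1$, $\delta u^{sf0}_{0\to1}$, $I_0$]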

13 DYNAMIC SCENE - FIXED CAMERA
$\delta x_{0\to1}$: scene flow
[Figure labels: $\Omega_0$, $\Omega_1$, $x_0$, $x_1$, $\delta x_{0\to1}$, $u_0$, $u_1$, $I_0$]

14 COMMON VIDEOS NOWADAYS
Giphy.com #gopro, #drone, Sondra.T.

15 DYNAMIC SCENE - MOVING CAMERA
[Figure labels: $\Omega_0$, $x_0$, $u_0$, $I_0$]

16 DYNAMIC SCENE - MOVING CAMERA
[Figure labels: $\Omega_0$, $\Omega_1$, $x_0$, $x_1$, $\delta x_{0\to1}$, $u_0$, $u_1$, $I_0$, $I_1$]

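17 DYNAMIC SCENE - MOVING CAMERA
$\delta u^{sf1}_{0\to1}$: projected scene flow in $I_1$
$\delta u^{of}_{0\to1}$: optical flow
$\delta u^{cm}_{0\to1}$: optical flow from camera motion
[Figure labels: $\Omega_0$, $\Omega_1$, $x_0$, $x_1$, $\delta x_{0\to1}$, $u_0$, $u_1$, $\delta u^{sf0}_{0\to1}$, $\delta u^{sf1}_{0\to1}$, $\delta u^{cm}_{0\to1}$, $\delta u^{of}_{0\to1}$, $I_0$, $I_1$]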

18 DYNAMIC SCENE - MOVING CAMERA
[Figure panels: input sequence; optical flow; camera pose (transform); camera ego-motion flow; (projected) scene flow, or 3D motion field]
RIGIDITY
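Note: with a moving camera, the observed optical flow mixes camera ego-motion flow with projected scene flow, so one way to read the panels above is as a per-pixel subtraction. The sketch below illustrates that reading. Both function names are hypothetical, and the thresholded mask is only a sanity check for illustration; it is not the learned rigidity estimate this talk proposes.

```python
import numpy as np

def projected_scene_flow(optical_flow, ego_flow):
    """Residual 2D motion left after removing camera ego-motion flow.

    Both inputs are (H, W, 2) arrays of pixel displacements.
    """
    return np.asarray(optical_flow, dtype=float) - np.asarray(ego_flow, dtype=float)

def residual_rigidity_check(optical_flow, ego_flow, tol=0.5):
    """Hypothetical check: pixels whose residual flow is below `tol`
    pixels are consistent with a rigid, static background."""
    residual = projected_scene_flow(optical_flow, ego_flow)
    return np.linalg.norm(residual, axis=-1) < tol
```

The check only works when the camera pose and depth are already accurate, which is exactly what is hard to obtain in dynamic scenes.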

19 HOW DO OTHER WORKS SOLVE THIS?
Non-rigid or rigid local motions treated as outliers
Yang et al., ICRA 2011; Menze and Geiger, CVPR

20 HOW DO OTHER FLOW ALGORITHMS SOLVE THIS?
Vogel et al., ICCV 2013; Quiroga et al., ECCV 2014; Jaimez et al., 3DV 2015; Jaimez et al., ICRA 2017; Wulff et al., CVPR

21 OUR PROPOSAL
Learn which parts of the scene are (likely) rigid or non-rigid

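22 PIPELINE
[Diagram components: inputs ($I_0$, $D_0$) and ($I_1$, $D_1$); Rigidity Transform Network (RTN), giving a rigidity mask and $[R \mid t]$; flow network (PWC-Net), giving optical flow; refinement, giving refined $[R \mid t]$; warping, giving ego-motion flow; subtraction in 3D, giving the estimated projected scene flow]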

23 RIGIDITY TRANSFORM NETWORK (RTN)
[Diagram components: inputs ($I_0$, $D_0$) and ($I_1$, $D_1$); conv1-6; deconv1-5, giving the rigidity attention mask; pose regressor, giving $R$, $t$]

24 RIGIDITY TRANSFORM NETWORK (RTN)
[Diagram components: inputs ($I_0$, $D_0$) and ($I_1$, $D_1$); conv1 to conv6; deconv1-5, giving the rigidity attention mask (binary cross-entropy loss); global average pooling, then conv-t (translation $t$) and conv-r (rotation $R$) (Huber loss)]
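Note: the slide names two training losses, binary cross-entropy on the rigidity mask and Huber loss on translation and rotation. The sketch below shows what such a combined loss could look like in plain NumPy. The weight `w_mask`, the Huber threshold `delta`, the function names, and the choice of a plain vector for rotation are all assumptions; the slides do not specify them.

```python
import numpy as np

def binary_cross_entropy(prob, target, eps=1e-7):
    """Mean BCE between predicted rigidity probabilities and a 0/1 mask."""
    p = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    y = np.asarray(target, dtype=float)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))

def huber(residual, delta=1.0):
    """Mean Huber loss: quadratic near zero, linear for large residuals."""
    r = np.abs(np.asarray(residual, dtype=float))
    return float(np.mean(np.where(r <= delta,
                                  0.5 * r ** 2,
                                  delta * (r - 0.5 * delta))))

def rtn_loss(mask_prob, mask_gt, t_pred, t_gt, r_pred, r_gt, w_mask=1.0):
    """Hypothetical combined loss: weighted mask BCE plus pose Huber terms."""
    pose = (huber(np.asarray(t_pred) - np.asarray(t_gt))
            + huber(np.asarray(r_pred) - np.asarray(r_gt)))
    return w_mask * binary_cross_entropy(mask_prob, mask_gt) + pose
```

Huber loss is a common choice for pose regression because a few badly wrong predictions do not dominate the gradient the way they would under a squared error.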

25 2D OPTICAL FLOW: PWC-NET
PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
Sun et al., CVPR

26 POSE REFINEMENT AND FLOW
$$[R \mid t] = \arg\min_{R,\,t} \sum_{\substack{(u,v) \in \Omega \\ B(u,v)=1,\; O(u,v)=0}} L\big(R\,V_0(u+\delta u,\, v+\delta v) + t - V_1(u,v)\big)$$
$B$: rigidity mask; $O$: occlusion mask; $(\delta u, \delta v)$: flow correspondences
We solve this objective with an off-the-shelf Gauss-Newton solver from GTSAM.
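Note: the refinement keeps only correspondences that are rigid ($B = 1$) and not occluded ($O = 0$), then fits a single rigid transform to them. The sketch below illustrates the masking and the fit, but it swaps in a closed-form least-squares alignment (the Kabsch/SVD method) for the Gauss-Newton solve with GTSAM used in the talk, and a plain squared error for the loss $L$. The function name `refine_pose` and the flat `(N, 3)` point layout are assumptions.

```python
import numpy as np

def refine_pose(src, dst, rigid, occluded):
    """Fit R, t minimizing sum ||R @ src_i + t - dst_i||^2 over the
    correspondences that are rigid and not occluded.

    src, dst: (N, 3) matched 3D points (matches come from optical flow).
    rigid:    (N,) boolean, True where the rigidity mask is 1.
    occluded: (N,) boolean, True where the occlusion mask is 1.
    """
    keep = np.asarray(rigid, dtype=bool) & ~np.asarray(occluded, dtype=bool)
    p, q = np.asarray(src, dtype=float)[keep], np.asarray(dst, dtype=float)[keep]
    pc, qc = p.mean(axis=0), q.mean(axis=0)
    H = (p - pc).T @ (q - qc)                # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = qc - R @ pc
    return R, t
```

A closed-form fit like this has no robust loss, so it is more sensitive to errors in the masks than an iterative solve with a robust $L$ would be.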

27 SUPERVISION NEEDED
Scene-net, RGB-D SLAM benchmark, SINTEL, FlyingThings 3D, Monkaa
RGB-D dataset  Lay-out number  Total images  Scenes   Pose (GT)  Optical flow (GT)  Segmentation (GT)  Photo realistic  Depth realistic
Scene-net      [?]             [?]M          static   Yes        Yes (from pose)    Yes                No               Yes
RGB-D SLAM     [?]             230K          static   Yes        No                 No                 Yes              Yes
SINTEL         23              1018          dynamic  Yes        Yes                Yes                No               Yes
FlyingThings   -               25K           dynamic  Yes        Yes                Yes                No               No
Monkaa         -               10K           dynamic  Yes        Yes                Yes                No               Yes

28 SEMI-SYNTHETIC DYNAMIC SCENE DATASET

29 REFRESH DATASET

32 SINTEL EVALUATION
Trained on our data, tested on SINTEL data

33 SINTEL EVALUATION (POSE)

34 REAL WORLD DATA EVALUATION

36 CONCLUSION
- Proposed a learning-based approach to estimate the rigid regions in dynamic scenes observed by a moving camera
- Robust per-pixel rigidity estimation for dynamic scenes
- Camera pose refined jointly with 2D optical flow and rigidity/occlusion masks
- Novel semi-synthetic dynamic scene dataset, REFRESH
- Our method outperforms the state of the art on SINTEL
FUTURE WORK
- End-to-end framework that learns rigidity as well as correspondences
- Richer content in dynamic scene data to encourage better generalization
