Imagining the Unseen: Stability-based Cuboid Arrangements for Scene Understanding

Size: px

Start display at page:

Download "Imagining the Unseen: Stability-based Cuboid Arrangements for Scene Understanding"

Alicia Mason
5 years ago
Views:

1 : Stability-based Cuboid Arrangements for Scene Understanding Tianjia Shao* Aron Monszpart Youyi Zheng Bongjin Koo Weiwei Xu Kun Zhou * Niloy J. Mitra *

scene occlusion Limiting the understanding of

2 Background A fundamental problem for single view data acquisition Missing observations due to scene occlusion Limiting the understanding of indoor scenes From the capture view From another view

3 How do We Imagine Unseen Data?

4 How do We Imagine Unseen Data?

5 Our Motivation Mimic the process of human s imagination based on physical stability

6 Our Motivation Physical stability can help to reason about the scene structure Relationship among parts touching or fixed

7 Related Works: Proxy-based Scene Understanding Structured output of primitives [Li et al. 2011] [Lafarge et al. 2013] Creating abstracted geometry [Arikan et al. 2013] Studying spatial layout [Gupta et al. 2010] [Lee et al. 2010] [Hartley et al. 2012] Image manipulation [Zheng et al. 2012]

8 Related Works: Cuboid Detection Statistical deformable cuboid model [Fidler et al. 2012] Cuboid corner point model [Xiao et al. 2012] Image space contrast-based features [Hedau et al. 2012]

9 Related Works: Physical Validity Constraints Penetration free [Hedau et al. 2010] [Jiang and Xiao 2013] Improve segmentation [Jia et al. 2013] Voxel-based scene parsing [Zheng et al. 2013] [Kim et al. 2013]

10 Our Goal Recover the underlying structure of indoor scenes Abstracting indoor scenes as collections of cuboids Hallucinate geometry in the occluded regions Identifying part parameters (size and orientation) Identifying part relations (touching or fixed)

11 Algorithm Overview

12 Algorithm Overview Input scan Initial cuboids Optimized cuboids + recovered structure

13 Algorithm Overview Input scan Initial cuboids Optimized cuboids + recovered structure

14 A B C D E G Inferring Geometry and Relations Key observation: relations are coupled with geometry G supports A & E; E supports B & C & D G supports E; E supports A & C & D; C supports B Optimized geometry Optimized geometry

graph G V, E V: cuboids; E: contact types (and

15 Inferring Geometry and Relations Encode the discrete interaction among the cuboids as a multiconnection graph G V, E V: cuboids; E: contact types (and corresponding cuboid extensions) a b e ab 2 a e ab 1 e ab 0 b e ab 3

16 Inferring Geometry and Relations Optimization formulation Edge selection problem x ij k = 1: edge type e ij k selected; x ij k = 0: not selected. Goal: Physically stable arrangement of cuboids with minimal number of fixed contacts min xij k #(e ij k = fixed joint) Constraints: k x ij k = 1 e ab 2 a e ab 1 e ab 0 b e ab 3

17 Inferring Geometry and Relations Pruning the solution space Pre-pruning Penetration check Visibility check

18 Inferring Geometry and Relations Pruning the solution space Pre-pruning Penetration check Visibility check Branch-and-bound Most stable

19 Inferring Geometry and Relations Assessing physical stability Local reasoning is not enough

20 Inferring Geometry and Relations Measuring global stability through static equilibrium The net force and torque should be zero Decompose the contact forces to compression forces and friction forces E s A min f Df + w 2 s. t. f i n 0 and f i u, fi v < μfi n f 1 f n 2 f v 2 f 2 w f u 2 f 3

21 Overview Ground truth Metrics Robustness Validity Applications

22 Overview Ground truth Metrics Robustness Validity Applications

23 Evaluation Ground Truth

24 Evaluation Ground Truth 20 scenes High occlusion 3 dimensions measured by hand

25 Evaluation Ground Truth Fixed Touching 20 scenes High occlusion 3 dimensions measured by hand 700 scenes (NYU2) Medium occlusion Cuboids, floors, walls Support graph Initialization

26 Evaluation Ground Truth Fixed Touching 20 scenes High occlusion 3 dimensions measured by hand 700 scenes (NYU2) Medium occlusion Online: 720 scenes, annotator tool Cuboids, floors, walls Support graph Initialization

27 Overview Ground truth Metrics Robustness Validity Applications

28 Evaluation Metrics Approximation error L 1 -norm

29 Evaluation Metrics Fixed Touching Approximation error L 1 -norm Correctness of structure graph F 1 score

30 Evaluation Approximation error Initial

31 Evaluation Approximation error Initial Ground truth {err i init }

32 Evaluation Approximation error Initial Ground truth Optimized {err i init } {err i optimized }

33 Evaluation Approximation error Initial Ground truth Optimized {err i init } {err i optimized } err i init err i optimized >? 5%

34 Evaluation Approximation error Initial Ground truth Optimized {err i init } {err i optimized } err i init err i optimized >? 5% 35% 3%

35 RGB(-D) Evaluation F 1 score Ground Truth Fixed Touching

36 RGB(-D) Evaluation F 1 score Ground Truth Output Fixed Touching

37 RGB(-D) Evaluation F 1 score Ground Truth Output Fixed Touching

38 RGB(-D) Evaluation F 1 score Precision Recall TP TP + FP TP TP + FN F Precision Recall Precision + Recall Ground Truth Output Fixed Touching

Evaluation F 1 score RGB(-D) #Scenes Initial Optimized Synthetic 12 32.2 87.

39 Evaluation F 1 score RGB(-D) #Scenes Initial Optimized Synthetic NYU Own Kinfu Ground Truth Output Fixed Touching

40 Evaluation F 1 score RGB(-D) Initialization Ground Truth Fixed Touching

41 Evaluation F 1 score RGB(-D) Initialization Ground Truth Optimized Fixed Touching

42 Evaluation F 1 score RGB(-D) Initialization Ground Truth Optimized Fixed Touching

43 Overview Ground truth Metrics Robustness Validity Applications

44 Robustness to viewpoint 1 Viewpoint 1 Fixed Touching

45 Robustness to viewpoint 1 Viewpoint 1 Viewpoint 2 Fixed Touching 2

46 Robustness to scene complexity Input scene

47 Robustness to scene complexity Input cloud (back view)

48 Robustness to scene complexity Initialization

49 Robustness to scene complexity Fixed Touching Initialization Optimized Structure graph

50 Robustness to scene complexity Fixed Touching Initialization Optimized Structure graph

51 Overview Ground truth Metrics Robustness Validity Applications

52 Evaluation Validity RGB-D input Initial (side view)

53 Evaluation Validity RGB-D input Initial (side view) Optimized (side view)

54 Evaluation Validity Less occluded scan

55 Evaluation Validity Less occluded scan + Optimized

56 Evaluation Validity Less occluded scan + Optimized Extra proxies removed Front view

57 Overview Ground truth Metrics Robustness Validity Applications

58 Applications #1 Model retrieval

59 Applications #1 Model retrieval

60 Applications #1 Model retrieval

61 Applications #2 Re-arrangement

62 Applications #2 Re-arrangement

63 Applications #3 - Completion RGB-D Input

64 Applications #3 - Completion RGB-D Input Physically stable approximation

65 Applications #3 - Completion RGB-D Input Completed models

66 Applications #3 - Completion RGB-D Input Completed models

67 Limitations heterogeneous [ [GravityGlue.com]

68 Limitations Fixed Touching [ [GravityGlue.com]

69 Code + Data:

70 Robustness to initialization No perturbation 5 10

Imagining the Unseen: Stability-based Cuboid Arrangements for Scene Understanding

Imagining the Unseen: Stability-based Cuboid Arrangements for Scene Understanding Tianjia Shao Zhejiang University Aron Monszpart University College London Youyi Zheng Yale University Bongjin Koo University