Automatic Dense Semantic Mapping From Visual Street-level Imagery

1 Automatic Dense Semantic Mapping From Visual Street-level Imagery
Sunando Sengupta [1], Paul Sturgess [1], Lubor Ladicky [2], Philip H.S. Torr [1]
[1] Oxford Brookes University
[2] Visual Geometry Group, Oxford University

2 Dense Semantic Map
Generate an overhead view of an urban region. Every pixel in the map view is associated with an object class label.
Object class labels: Pavement, Car, Pedestrian, Bollard, Shop Sign, Sky, Post, Vegetation, Tree, Signage, Road, Building, Fence.

3 Dense Semantic Map
Street images are captured inexpensively from a vehicle with multiple mounted cameras [1].
[1] Yotta DCL, "Yotta DCL case studies".

4 Semantic Mapping Framework
Street-level image acquisition.
The semantic mapping framework comprises two stages.

5 Semantic Mapping Framework
Street-level image acquisition, then image segmentation.
The semantic mapping framework comprises two stages. The first is semantic image segmentation at street level.

6 Semantic Mapping Framework
Street-level image acquisition, then image segmentation, then ground plane labelling.
The semantic mapping framework comprises two stages: semantic image segmentation at street level, and ground plane labelling at a global level.
One of the first attempts at dense overhead mapping from street-level images.

7 Semantic Image Segmentation
Label every pixel in the image with an object class.
Input: raw image. Output: labelled image, produced by an automatic labeller.
Object class labels: Pavement, Car, Pedestrian, Bollard, Shop Sign, Sky, Post, Vegetation, Tree, Signage, Road, Building, Fence.

8 Semantic Image Segmentation
We use a Conditional Random Field (CRF) framework.
CRF construction (figure): input image, X, final segmentation.
Each pixel is a node in a grid graph G = (V, E). Each node is a random variable x_i taking a label from the label set.

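9 Semantic Image Segmentation - CRF
Total energy:
$$E(\mathbf{x}) = \underbrace{\sum_{i \in V} \psi_i(x_i)}_{E_{pix}} + \underbrace{\sum_{i \in V,\, j \in N_i} \psi_{ij}(x_i, x_j)}_{E_{pair}} + \underbrace{\sum_{c \in C} \psi_c(\mathbf{x}_c)}_{E_{region}}$$
Optimal labelling given as:
$$\mathbf{x}^* = \arg\min_{\mathbf{x}} E(\mathbf{x})$$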

10 Semantic Image Segmentation - CRF
Total energy: E = E_pix + E_pair + E_region
E_pix models an individual pixel's cost of taking a label. It is computed via the dense boosting approach, a multi-feature variant of TextonBoost [1].
Figure example for pixel x: Car 0.2, Road 0.3.
[1] L. Ladicky, C. Russell, P. Kohli, and P. H. Torr, "Associative hierarchical CRFs for object class image segmentation," in ICCV.

11 Semantic Image Segmentation - CRF
Total energy: E = E_pix + E_pair + E_region
E_pair models each pixel's neighbourhood interactions. It encourages label consistency in adjacent pixels and is sensitive to edges in the image.
Contrast-sensitive Potts model: for neighbouring pixels i and j, the cost is 0 if x_i = x_j (for example Car and Car) and g(i,j) otherwise (for example Car and Road).
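The pairwise term can be sketched in a few lines of Python. The slide only states that the cost is 0 for equal labels and g(i,j) otherwise; the exponential form of g and the values of theta_p, theta_v and theta_beta below are assumptions chosen for illustration, not values taken from the talk.

```python
import math

def contrast_weight(colour_i, colour_j, theta_p=1.0, theta_v=4.0, theta_beta=0.05):
    """g(i, j): penalty for giving neighbours i and j different labels.

    Assumed form: a constant plus a term that decays with colour contrast,
    so cutting across a strong image edge is cheaper than cutting a flat area.
    """
    sq_diff = sum((a - b) ** 2 for a, b in zip(colour_i, colour_j))
    return theta_p + theta_v * math.exp(-theta_beta * sq_diff)

def pairwise_cost(label_i, label_j, colour_i, colour_j):
    """Potts cost: 0 when the labels agree, g(i, j) when they differ."""
    if label_i == label_j:
        return 0.0
    return contrast_weight(colour_i, colour_j)

# Two neighbours labelled differently: flat region versus strong edge.
flat = pairwise_cost("Car", "Road", (10, 10, 10), (10, 10, 10))
edge = pairwise_cost("Car", "Road", (0, 0, 0), (255, 255, 255))
```

With these assumed parameters a label change across the strong edge costs less than a label change inside the flat region, which is what makes the model sensitive to edges.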

12 Semantic Image Segmentation - CRF
Total energy: E = E_pix + E_pair + E_region
E_region models the behaviour of a group of pixels by classifying a region. It encourages all the pixels in a region to take the same label. The groups of pixels are given by multiple mean-shift segmentations.
Figure example for region c: Car 0.3, Road 0.1.
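To make the three terms concrete, here is a minimal sketch that evaluates E = E_pix + E_pair + E_region for a given labelling. The data layout, the function name total_energy and the penalty gamma_max are hypothetical. The region term is a deliberate simplification: a region pays its classifier cost when all of its pixels agree and a fixed penalty otherwise, which is an assumption, since the slides do not give the exact form.

```python
def total_energy(labels, unary, edges, regions, gamma_max=1.0):
    """Evaluate E = E_pix + E_pair + E_region for one labelling.

    labels  : list with one label per pixel
    unary   : list of dicts, unary[i][label] = cost of pixel i taking label
    edges   : list of (i, j, weight); weight plays the role of g(i, j)
    regions : list of (pixel_indices, region_cost_dict)
    """
    e_pix = sum(unary[i][lab] for i, lab in enumerate(labels))
    # Potts pairwise term: pay the edge weight only when labels disagree.
    e_pair = sum(w for i, j, w in edges if labels[i] != labels[j])
    e_region = 0.0
    for pixels, region_cost in regions:
        labels_in_region = {labels[i] for i in pixels}
        if len(labels_in_region) == 1:
            e_region += region_cost[labels_in_region.pop()]
        else:
            e_region += gamma_max  # assumed penalty for an inconsistent region
    return e_pix + e_pair + e_region

# Three pixels in a chain, all belonging to one region.
unary = [{"Car": 0.2, "Road": 0.3}, {"Car": 0.6, "Road": 0.1}, {"Car": 0.7, "Road": 0.2}]
edges = [(0, 1, 0.5), (1, 2, 0.5)]
regions = [([0, 1, 2], {"Car": 0.3, "Road": 0.1})]

consistent = total_energy(["Road", "Road", "Road"], unary, edges, regions)
mixed = total_energy(["Car", "Road", "Road"], unary, edges, regions)
```

In this toy example the first pixel slightly prefers Car on its own, but the pairwise and region terms make the all-Road labelling cheaper overall.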

13 Semantic Image Segmentation
Solved using the alpha-expansion algorithm [1].
Figure: input image and the labelling after the Road expansion.
[1] Y. Boykov et al., "Fast Approximate Energy Minimization via Graph Cuts," ICCV 99.

14 Semantic Image Segmentation
Alpha-expansion, Building expansion: input image and resulting labelling.

15 Semantic Image Segmentation
Alpha-expansion, Sky expansion: input image and resulting labelling.

16 Semantic Image Segmentation
Alpha-expansion, Pavement expansion: input image and resulting labelling.

17 Semantic Image Segmentation
Alpha-expansion, final solution: input image and final labelling.
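The expansion steps shown on the preceding slides can be illustrated on a toy problem. In the sketch below, each move lets any subset of pixels switch to the label alpha and keeps the cheapest result. The real algorithm finds the best move with a graph cut; here an exhaustive search stands in for the graph cut, which is only feasible for a handful of pixels. The energy uses only unary and Potts pairwise terms, and all names and numbers are hypothetical.

```python
from itertools import product

def energy(labels, unary, edges):
    """Unary cost plus Potts pairwise cost of a labelling."""
    e_pix = sum(unary[i][lab] for i, lab in enumerate(labels))
    e_pair = sum(w for i, j, w in edges if labels[i] != labels[j])
    return e_pix + e_pair

def best_expansion_move(labels, alpha, unary, edges):
    """Cheapest labelling reachable by switching any subset of pixels to alpha.

    Brute force over all subsets replaces the graph cut of the real method.
    """
    best, best_e = list(labels), energy(labels, unary, edges)
    for switch in product([False, True], repeat=len(labels)):
        cand = [alpha if s else lab for s, lab in zip(switch, labels)]
        cand_e = energy(cand, unary, edges)
        if cand_e < best_e:
            best, best_e = cand, cand_e
    return best, best_e

def alpha_expansion(labels, label_set, unary, edges):
    """Cycle over labels, applying expansion moves until none lowers the energy."""
    labels = list(labels)
    current = energy(labels, unary, edges)
    improved = True
    while improved:
        improved = False
        for alpha in label_set:
            cand, cand_e = best_expansion_move(labels, alpha, unary, edges)
            if cand_e < current:
                labels, current, improved = cand, cand_e, True
    return labels, current

# Four pixels in a chain; pixel 1 has a noisy unary that slightly prefers Car.
unary = [
    {"Road": 0.1, "Car": 0.9},
    {"Road": 0.6, "Car": 0.4},
    {"Road": 0.2, "Car": 0.8},
    {"Road": 0.9, "Car": 0.1},
]
edges = [(0, 1, 0.5), (1, 2, 0.5), (2, 3, 0.5)]
result_labels, result_energy = alpha_expansion(["Car"] * 4, ["Road", "Car"], unary, edges)
```

Starting from an all-Car labelling, the Road expansion relabels the first three pixels, including the noisy one, and the following Car expansion changes nothing, so the loop stops.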

18 Ground Plane Labelling
Combine many labellings from street-level imagery.
Input: street-level labellings. Output: labelled ground plane, produced by an automatic labeller.

19 Ground Plane CRF
A CRF is defined over the ground plane. Each ground plane pixel z_i is a random variable taking a label from the label set.
The energy for the ground plane CRF is:
$$E^g(\mathbf{Z}) = E^g_{pix} + E^g_{pair}$$

20 Ground Plane Pixel Cost
We assume a flat world.
Figure labels: K, X, Z.

21 Ground Plane Pixel Cost
A ground plane region is estimated.
Figure labels: K, X, Z, homography; Road, Pavement, Post/Pole.

22 Ground Plane Pixel Cost
Each point in the image projects to a unique point on the ground plane, which defines a homography.
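As a rough illustration of this projection step, the sketch below applies a 3x3 homography to image pixels and collects the label of each pixel at the ground plane cell it lands in. The matrix H_example, the cell size and the function names are hypothetical; in practice the homography would come from the camera calibration under the flat-world assumption, and only pixels inside the estimated ground plane region would be projected.

```python
def apply_homography(H, u, v):
    """Map image pixel (u, v) to ground plane coordinates using homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    return x / w, y / w

def project_labels(label_image, H, cell_size=1.0):
    """Return (ground_cell, label) pairs, one per labelled image pixel."""
    votes = []
    for v, row in enumerate(label_image):
        for u, label in enumerate(row):
            gx, gy = apply_homography(H, u, v)
            cell = (int(gx // cell_size), int(gy // cell_size))
            votes.append((cell, label))
    return votes

# Hypothetical homography: scale by 0.5 and shift by (10, 20).
H_example = [[0.5, 0.0, 10.0],
             [0.0, 0.5, 20.0],
             [0.0, 0.0, 1.0]]
votes = project_labels([["Road", "Pavement"]], H_example)
```

Several image pixels can land in the same ground plane cell, which is why the result is a list of votes rather than a single label per cell.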

23 Ground Plane Pixel Cost
The image labelling is mapped to the ground plane via the homography.
Figure: ground plane pixel histograms.

24 Ground Plane Pixel Cost
Labels projected from many views are combined in a histogram. The normalised histogram gives the naïve probability of the ground plane pixel taking a label.
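A minimal sketch of the histogram step follows. Each view contributes (ground cell, label) votes, the votes are counted per cell, and the counts are normalised into naïve probabilities. Turning a probability into a pixel cost with a negative logarithm is an assumption made here for illustration; the slides do not state the conversion. The function names and the toy votes are hypothetical.

```python
import math
from collections import Counter, defaultdict

def accumulate_votes(projected_views):
    """Build one label histogram per ground plane cell from many views."""
    histograms = defaultdict(Counter)
    for view in projected_views:
        for cell, label in view:
            histograms[cell][label] += 1
    return histograms

def normalise(hist):
    """Naive probability of each label: its share of the votes in the cell."""
    total = sum(hist.values())
    return {label: count / total for label, count in hist.items()}

def pixel_cost(hist, label, eps=1e-6):
    """Assumed unary cost: negative log of the naive probability."""
    p = normalise(hist).get(label, 0.0)
    return -math.log(max(p, eps))

# Three views vote on two ground plane cells; one view mislabels cell (0, 0).
views = [
    [((0, 0), "Road"), ((1, 0), "Pavement")],
    [((0, 0), "Road"), ((1, 0), "Road")],
    [((0, 0), "Post"), ((1, 0), "Pavement")],
]
histograms = accumulate_votes(views)
probs = normalise(histograms[(0, 0)])
```

In the toy data a single view votes Post for cell (0, 0), but the two Road votes outweigh it, so Road ends up with the lower cost.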

26 Ground Plane Labelling
A histogram is built for every ground plane pixel, giving E^g_pix.
A pairwise cost (E^g_pair) is added to induce smoothness, using a contrast-sensitive Potts model.

27 Ground Plane Labelling
Final CRF solution obtained using alpha-expansion. Figure: Void.

28 Ground Plane Labelling
Figure: Road expansion.

29 Ground Plane Labelling
Figure: Building expansion.

30 Ground Plane Labelling
Figure: Pavement expansion.

31 Ground Plane Labelling
Figure: Car expansion.

32 Ground Plane Labelling
Figure: final solution.

33 Dataset
Subset of the images captured by the van: 14.8 km of track, 8000 images from each camera.
Pixel-level labelled ground truth images. Dataset available [1].
13 object categories: Pavement, Car, Pedestrian, Bollard, Shop Sign, Sky, Post, Vegetation, Tree, Signage, Road, Building, Fence.
Training: 44 images. Testing: 42 images.
[1]

34 SIS Results
Input images, output of our image-level CRF, and ground truths.
Used the Automatic Labelling Environment [1].
[1] L. Ladicky and P. H. S. Torr, The Automatic Labelling Environment. Code available

35 Semantic Map Results
Semantic map of Pembroke city.

36 Ground Plane Map Evaluation
We back-project the ground plane map into the image domain and evaluate the results against the ground truth. Global pixel accuracy: 82.9%.
Figure: street images, back-projected map results, ground truth.
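For reference, global pixel accuracy is the fraction of evaluated pixels whose predicted label matches the ground truth. The sketch below computes it for two label images of the same size. Skipping ground truth pixels marked "Void" is an assumption about the evaluation protocol, and the function name and toy images are hypothetical.

```python
def global_pixel_accuracy(predicted, ground_truth, void_label="Void"):
    """Fraction of non-void ground truth pixels that are labelled correctly."""
    correct = 0
    total = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for pred, truth in zip(pred_row, gt_row):
            if truth == void_label:
                continue  # assumed: unlabelled pixels are not evaluated
            total += 1
            if pred == truth:
                correct += 1
    if total == 0:
        raise ValueError("no labelled ground truth pixels to evaluate")
    return correct / total

# Toy 2x2 example: one wrong pixel, one void pixel.
predicted = [["Road", "Road"], ["Car", "Building"]]
ground_truth = [["Road", "Pavement"], ["Car", "Void"]]
accuracy = global_pixel_accuracy(predicted, ground_truth)
```

Here two of the three evaluated pixels are correct, so the accuracy is two thirds.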

37 Results

38 Conclusions
Presented a method to generate an overhead-view semantic map.
Experiments on large tracks (~15 km), which can be scaled up to country-wide mapping.
Dataset available [1].
[1]

39 Future Work
Perform 3D street-level semantic mapping and reconstruction.
Add detailed street-level information such as information boards and traffic boards.
Thank you!
Oxford Brookes Vision Group, Oxford Brookes University

44 Ground Plane Pixel Cost
Using a single view creates a shadow effect for objects that violate the flat-world assumption, and gives a wrong label estimate.
Figure: multi-view versus single-view result, with labels K, X, Z, homography; Road, Pavement, Post/Pole.
