Real-time Stereo and Flow-based Video Segmentation with Superpixels


Michael Van den Bergh (ETH Zurich, Zurich, Switzerland, vamichae@vision.ee.ethz.ch)
Luc Van Gool (ETH Zurich; KU Leuven, Leuven, Belgium, vangool@esat.kuleuven.be)

Abstract

The use of depth is becoming increasingly popular in real-time computer vision applications. However, when using real-time stereo for depth, the quality of the disparity image is usually insufficient for reliable segmentation. The aim of this paper is to obtain a more accurate and at the same time faster segmentation by incorporating color, depth and optical flow. A novel real-time superpixel segmentation algorithm is presented which uses real-time stereo and real-time optical flow. The presented system provides superpixels which represent suggested object boundaries based on color, depth and motion. Each output superpixel has a 3D location and a motion vector, and thus allows for straightforward segmentation of objects by 3D position and by motion direction. In particular, it enables reliable segmentation of persons, and of moving hands or arms. We show that our method is competitive with the state of the art while approaching real-time performance.

Figure 1. The goal in this paper: efficiently segmenting foreground objects/persons and moving parts.
Figure 2. Bad examples of color-based superpixel segmentation.

1. Introduction

Segmentation is usually slow and unreliable. Object classes have such a variety of appearance that image-based segmentation requires complicated and slow features, yet struggles to deliver consistent results. Especially in human-machine interaction there is a need for a fast and reliable segmentation of persons and body parts. This goal is illustrated in Figure 1, where the person and the moving hand are segmented. Figure 2 shows examples of standard color-based superpixel segmentation failing: shadowed parts get separated from highlighted parts, and similar colors get confused with the background.

One solution is to use infrared (IR)-based depth sensors such as Time-of-Flight (ToF) cameras, or structured light devices such as the PrimeSense sensor. These sensors provide reliable point clouds which can be used to segment objects in 3D space. However, there are several reasons not to rely on IR. IR-based systems are limited to a strict working volume defined by the sensor manufacturer, and to indoor use: the presence of sunlight saturates the IR sensor, resulting in an over-exposed IR image. Stereo camera setups have a wider range of application, as they can work in sunlight, and the working volume can be chosen by use of different cameras, different lenses, or a different baseline. Furthermore, stereo camera pairs are easier to integrate into mobile devices like phones and laptops, and on most robotics platforms stereo and optical flow are already available.

The main contribution of this paper is the combination of color (RGB), depth, and motion information (optical flow) for superpixel segmentation. We introduce a color-spatio-temporal distance measure that incorporates color, spatial and temporal information. Furthermore we introduce temporal simple linear iterative clustering (temporal SLIC), which exploits the temporal aspect of video in order to minimize the number of iterations per frame and achieve real-time clustering. We present a real-time stereo system that provides object boundaries based on color, depth and motion, and thus makes it possible to detect foreground objects and moving objects at the same time.

2. Background

In the field of object segmentation, many approaches make use of superpixel-based low-level partitioning of the image [6, 8, 5, 7, 12, 3]. This allows groups of pixels (superpixels) to be classified, and thus an accurate segmentation of the whole image to be achieved. This partitioning usually relies on clustering pixels based on colors or local texture features. Achanta et al. [1] present a fast approach for color-based clustering of the image into superpixels, based on simple linear iterative clustering (SLIC). This approach achieves state-of-the-art superpixel clustering in O(N) complexity, where N is the number of pixels in the image; previous approaches only achieved O(N^2) or O(N log N) complexity. The SLIC approach paves the way for real-time superpixel segmentation. However, the approach is iterative and the clustering must be run several times on each input frame.

Real-time depth information can be acquired using IR-based sensors such as Time-of-Flight cameras (e.g. SwissRanger) or structured light cameras (e.g. PrimeSense). As these sensors provide relatively accurate depth information, we will show that their use can result in dramatic improvements in segmentation accuracy. However, we are more interested in the performance based on real-time stereo. Stereo has a broader application range, and the noisy depth from stereo is also an ideal candidate to be improved by incorporating additional cues (color, optical flow) into the segmentation. Bleyer et al. [2] present a real-time stereo algorithm based on GPU-trees. This approach is fast, but the output is not ideal for segmentation, as many oscillations appear in the disparity image; for segmentation it is important that each object surface is represented as smoothly as possible. Geiger et al. [9] present a CPU-based real-time stereo algorithm that provides disparity images more useful to our application. The approach first looks for a selection of good feature points in both the left and right camera images, and creates a triangular mesh based on those feature points. The remaining pixels are expected to lie close to this triangular mesh, so that matches need only be sought in a local region, reducing the computation time. This also results in smooth surfaces, which are beneficial for our object segmentation.

Besides depth, motion is also an important cue for object detection in videos. Typical objects of interest are moving cars or persons, and for interaction we are interested in moving arms or hands. There are several ways to approach the detection of motion. Basic approaches include foreground-background segmentation [13] and pixel-level differences between frames. These approaches can be improved with smoothing techniques such as conditional random fields (CRF) [10], median filtering and HMMs [11]. Even though these approaches are low in computational cost, they have a number of significant disadvantages: they lack a motion direction or magnitude, they cannot distinguish between different moving objects, and they cannot deal with camera movement or moving elements in the background. Therefore it is interesting to look into the computationally more expensive optical flow. Brox and Malik [4] show that very high accuracy can be obtained by segmenting objects based on optical flow; however, the largest bottleneck is the computational cost of the flow itself. Werlberger et al. [15, 14] present a GPU-based dense real-time optical flow algorithm, which lends itself perfectly to our application.
The remainder of this paper is structured as follows: in Section 3 we give an overview of the system, and describe the real-time stereo and real-time optical flow algorithms used in this paper. The superpixel clustering is explained in Section 4, where we introduce the color-spatio-temporal distance measure and temporal simple linear iterative clustering (temporal SLIC). Then, in Section 5, we evaluate the proposed system. First we evaluate it on the Berkeley motion segmentation dataset. Then we evaluate the benefit of using depth and of using optical flow, and compare the performance between using an IR-based sensor and using real-time stereo. Finally, we show an experiment where the system is used outdoors on a mobile autonomous urban robot.

3. Depth and Motion Calculation

3.1. System Overview

An overview of the system presented in this paper is shown in Figure 3. A stereo camera set-up is used. A real-time stereo component is implemented based on LibElas, and a real-time optical flow component is implemented based on LibFlow. The stereo algorithm runs on the CPU, while the optical flow algorithm runs on the GPU. From the results a 6-channel image is produced: 3 color channels, a depth channel, and a horizontal and a vertical flow channel. The superpixel segmentation is then run on this 6D image. It uses a color-spatio-temporal distance measure and temporal SLIC in order to produce meaningful object boundaries.

Figure 3. System overview: the left and right camera images feed the stereo and flow components, whose output is clustered into superpixels.
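To make the data layout concrete, the following is a minimal sketch (in Python/NumPy, not the authors' implementation) of how such a 6-channel image could be assembled from the three component outputs. The function name and array shapes are our assumptions.

import numpy as np

def build_six_channel_image(color, disparity, flow):
    """Stack the per-pixel cues into one array for the clustering.

    Assumed shapes: color (H, W, 3), disparity (H, W),
    flow (H, W, 2) holding the horizontal and vertical components.
    """
    h, w, _ = color.shape
    image6 = np.empty((h, w, 6), dtype=np.float32)
    image6[..., 0:3] = color        # 3 color channels
    image6[..., 3] = disparity      # 1 depth channel
    image6[..., 4:6] = flow         # horizontal and vertical flow channels
    return image6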

The resulting superpixels aim to fit object boundaries, especially towards objects or persons in the foreground of the scene, and towards moving objects, persons or body parts. The system produces a labeled image, and for each superpixel an average depth and an average motion vector. To give an example, in combination with a face detector, this labeled image can be used to segment a person in the foreground, together with moving body parts such as a waving hand or a gesturing arm. In such a setting the system lends itself well to real-time hand gesture interaction.

3.2. Real-time Stereo

Depth is incorporated into the system in order to distinguish foreground objects (or persons) from objects behind them or from the background. IR-based depth sensors such as the PrimeSense sensor and the SwissRanger Time-of-Flight camera provide reliable depth data in real-time. However, in order to deal with outdoor scenarios we make use of a stereo set-up and of the real-time stereo algorithm presented by Geiger et al. [9]. The benefit of using stereo is that the system can work in sunlight, and any combination of lenses and baselines can be chosen in order to accommodate different working volumes. However, the resulting disparity images are inaccurate and noisy, as shown in Figure 4, and too noisy to be used as the sole input, as would be possible with for example the PrimeSense sensor.

Figure 4. The difference in quality between the disparity image from an IR-based sensor (a: PrimeSense) and the output of a real-time stereo algorithm (b).

Even though the depth image is noisy, it is a good cue for the segmentation, and we will show that, combined with the color and motion information, a cleaner and more accurate depth image can be produced based on the resulting superpixels.

3.3. Real-time Optical Flow

The motivation for using optical flow is that we want to detect and distinguish moving objects, persons or body parts. Especially in traffic scenes and in gesture interaction, we are interested in the moving objects or body parts. In this paper we use the optical flow algorithm presented by Werlberger et al. [15, 14]. This approach deals well with poorly textured regions and small-scale image structures, and a GPU-accelerated version of the algorithm is available. For more details about this optical flow algorithm we refer to [15, 14]. We scale the input image down by a factor of 3 in order to speed up the optical flow computation. The algorithm produces a (ẋ, ẏ) flow vector for each pixel.

4. Superpixel Clustering

Superpixels are a useful primitive for image segmentation. Given good superpixels, the object segmentation is reduced to connecting the superpixels that belong together based on simple criteria or features. The superpixel clustering algorithm presented in this paper is an extension of the work presented by Achanta et al. [1]. Our method differs from that method by introducing a new distance measure and a temporal iteration scheme. In this section we first introduce a new distance measure which incorporates the color, the position, the depth and the motion of each pixel. Furthermore, a temporal iterative clustering approach is introduced, which allows for a significant speed-up in the clustering.

The superpixel algorithm takes as input the desired number of superpixels K. At the beginning of the algorithm, K cluster centers are chosen at regular grid intervals S (step size). In [1] the pixels are clustered in a 5D space: L, a, b, x and y, where (L, a, b) is the color in the CIELAB color space, and (x, y) is the 2D position of the pixel.
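The grid initialization could look as follows; a hedged sketch that assumes the 6-channel image from Section 3.1 and follows the standard SLIC convention S ≈ sqrt(N/K), with `init_cluster_centers` being a hypothetical helper name.

import numpy as np

def init_cluster_centers(image6, K):
    """Choose roughly K cluster centers on a regular grid with step
    S = sqrt(N / K), N being the number of pixels. Each center keeps
    its (x, y) position and the feature vector of its starting pixel."""
    h, w = image6.shape[:2]
    S = int(np.sqrt(h * w / K))                # grid step size
    centers = [(x, y, image6[y, x].astype(float))
               for y in range(S // 2, h, S)
               for x in range(S // 2, w, S)]
    return centers, S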
Rather than using the Euclidean distance, a distance measure is presented in [1] which weighs the color distance against the spatial distance of each pixel. This approach works well for small distances (small superpixels); for larger distances, however, the color similarity within one object is not guaranteed. Therefore we introduce a new distance measure based on color, 3D position, and motion.

4.1. Color-Spatio-Temporal Distance Measure

We introduce a distance measure $D_s$ defined as follows:

$$D_s = d_{lab} + m\,d_{xyz} + n\,d_{flow} \qquad (1)$$

where $m$ is a parameter that controls the spatial compactness of the superpixels, $n$ is a parameter that controls the motion compactness of the superpixels, and

$$d_{lab} = \frac{1}{C}\sqrt{(a_k - a_i)^2 + (b_k - b_i)^2 + w_l\,(l_k - l_i)^2}$$

$$d_{xyz} = \frac{1}{S}\sqrt{(x_k - x_i)^2 + (y_k - y_i)^2 + w_z\,(z_k - z_i)^2}$$

$$d_{flow} = \frac{1}{T\,S}\sqrt{(\dot{x}_k - \dot{x}_i)^2 + (\dot{y}_k - \dot{y}_i)^2}$$

where $C$ is the depth of each color channel ($C = 256$ in our case), $S$ is the step size between superpixels, $T$ is the time delta between the previous and the current frame, $w_l$ is a weight for the intensity, $w_z$ is a weight for the depth, and $(\dot{x}, \dot{y})$ is the optical flow vector.
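In code, the distance of Eq. (1) could be computed as below; a sketch using the constants stated in the text (C = 256 and the weights chosen in the next paragraph), with the dictionary layout of the two points being our assumption.

import numpy as np

def distance_Ds(pk, pi, S, T, C=256.0, w_l=0.5, w_z=10.0, m=1.0, n=10.0):
    """D_s between cluster center pk and pixel pi, following Eq. (1).
    pk and pi are dicts with keys 'lab' = (l, a, b), 'xyz' = (x, y, z)
    and 'flow' = (dx, dy); this layout is illustrative only."""
    lk, ak, bk = pk['lab']; li, ai, bi = pi['lab']
    d_lab = np.sqrt((ak - ai)**2 + (bk - bi)**2
                    + w_l * (lk - li)**2) / C
    xk, yk, zk = pk['xyz']; xi, yi, zi = pi['xyz']
    d_xyz = np.sqrt((xk - xi)**2 + (yk - yi)**2
                    + w_z * (zk - zi)**2) / S
    uk, vk = pk['flow']; ui, vi = pi['flow']
    d_flow = np.sqrt((uk - ui)**2 + (vk - vi)**2) / (T * S)
    return d_lab + m * d_xyz + n * d_flow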

The color distance and the spatial distance are normalized with $C$ and $S$ respectively, while the motion distance is normalized with the step size $S$ and the time $T$ between the two frames used for optical flow. We introduce the weight $w_l$ on the color intensity component in order to lower the influence of shadows and highlights within one object; an object should not be cut in half because of a shadow, for example. We also introduce the weight $w_z$ on the depth component. In our experiments we have empirically chosen $w_l = 0.5$ and $w_z = 10$: we lower the influence of direct light vs. shadows, while we amplify the influence of the depth registration. $m$ and $n$ control spatial and motion compactness. We have empirically found $m = 1$ and $n = 10$ to work well. This corresponds to a neutral spatial compactness and an amplified motion compactness, as we are particularly interested in detecting moving objects.

4.2. Temporal SLIC

The superpixel centers are initialized at regular grid steps S. Then, according to the distance measure D_s, the best matching pixels from a 2S × 2S neighborhood around the superpixel center are assigned to the superpixel. This is repeated for each superpixel. The new cluster centers are computed, and the process is iterated until convergence or until a fixed number of iterations have passed. This process is based on the SLIC algorithm described in [1]. However, instead of iterating the linear clustering N times on each frame, the superpixel positions from the previous frame are taken, and the clustering then iterates only once on each new frame. This temporal approach yields superpixels similar to those of the iterative approach, as illustrated in Figure 5.

Algorithm 1 Temporal SLIC
1: Initialize cluster centers at grid interval S.
2: repeat
3:    Read next video frame.
4:    for each cluster center do
5:       Assign best matching pixels from 2S × 2S neighborhood.
6:       Compute new cluster centers.
7:    end for
8: until end of video

Figure 5. Iterative vs. temporal clustering: (a) shows the result on a still image after 10 iterations; (b) shows the result after 10 video frames with one iteration per frame.

The temporal clustering makes the assumption that during each video sequence we observe a fixed set of objects, for example a user interacting. If the set of objects changes, a reinitialization of the superpixels would be required, or a more intelligent handling of the superpixels (for example inserting new superpixels or removing superpixels online). However, this is beyond the scope of this paper and we assume a reinitialization when the scene changes significantly.
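A self-contained sketch of one clustering pass follows (Python/NumPy). A simplified Euclidean feature distance stands in for the full D_s of Eq. (1), and all names are ours, not the authors'.

import numpy as np

def slic_pass(feat, centers, S, m=1.0):
    """One linear clustering iteration: every center claims the best
    matching pixels in its 2S x 2S neighborhood, then the centers are
    re-averaged over their assigned pixels."""
    h, w = feat.shape[:2]
    labels = -np.ones((h, w), dtype=int)
    best = np.full((h, w), np.inf)
    for k, (cx, cy, cf) in enumerate(centers):
        y0, y1 = max(0, int(cy) - S), min(h, int(cy) + S)
        x0, x1 = max(0, int(cx) - S), min(w, int(cx) + S)
        yy, xx = np.mgrid[y0:y1, x0:x1]
        d = (np.linalg.norm(feat[y0:y1, x0:x1] - cf, axis=-1)
             + m * np.hypot(xx - cx, yy - cy) / S)
        better = d < best[y0:y1, x0:x1]
        best[y0:y1, x0:x1][better] = d[better]
        labels[y0:y1, x0:x1][better] = k
    new_centers = []
    for k, (cx, cy, cf) in enumerate(centers):
        ys, xs = np.nonzero(labels == k)
        if len(xs) > 0:
            new_centers.append((xs.mean(), ys.mean(),
                                feat[ys, xs].mean(axis=0)))
        else:
            new_centers.append((cx, cy, cf))   # keep empty clusters alive
    return labels, new_centers

# Temporal SLIC (Algorithm 1): iterate the pass several times on the
# first frame only, then run a single pass per new frame, starting from
# the centers of the previous frame.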
4.3. Computation Time

We have measured the computation time of the presented system on a Core i7 system with a GeForce GTX 260 GPU. The running times of the different components (per frame) are shown in Table 1. Keep in mind that the optical flow and stereo algorithms run in parallel, one on the GPU and one on the CPU. The resulting system is able to run at approximately 2 frames per second on our modest test system.

Table 1. Computation times.
component    | processor | computation time
Flow         | GPU       | 300 ms
Stereo       | CPU       | 200 ms
Superpixels  | CPU       | 270 ms
System       |           | 570 ms

5. Evaluation

First, we evaluate the general performance of the presented system by running it on the Berkeley motion segmentation dataset and comparing the results with a state-of-the-art optical-flow based segmentation approach. Then, we illustrate the usefulness of incorporating depth and motion into the segmentation with some examples. Subsequently, we compare the performance of the system using clean depth information (from a PrimeSense sensor) to the performance using real-time stereo depth information. We show examples to illustrate that the stereo-based approach performs competitively despite the noisy depth data. Finally, we show some examples of the system running on an outdoor robot in traffic and interaction scenarios, illustrating the benefit of using a stereo setup (IR-based sensors do not work outside in sunlight).

5.1. Evaluation on the Berkeley Motion Segmentation Dataset

We evaluate our method by running it on the Berkeley motion segmentation dataset provided by Brox and Malik [4]. In this dataset no stereo or depth information is available, so we evaluate based on RGB and motion data only. The dataset provides 26 annotated sequences. We compare our method based on the overall clustering error: the number of bad labels over the total number of labels on a per-pixel basis. As in [4], multiple clusters (superpixels) can be assigned to the same region, to avoid high penalties for over-segmentation that actually makes sense. For instance, the arms of a person move independently even though only the whole person is annotated in the dataset. For the segmentation of a frame, we only use the current and the previous frame; we do not use the 10 previous frames. The superpixel segmentation is based on the RGB image itself and the output of the optical flow algorithm.
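As a sketch of our reading of this metric: map each superpixel to the ground-truth region it overlaps most (allowing several superpixels per region), then count the disagreeing pixels. Names are illustrative.

import numpy as np

def overall_clustering_error(superpixels, ground_truth):
    """Fraction of wrongly labeled pixels after assigning every
    superpixel to its majority ground-truth region."""
    predicted = np.empty_like(ground_truth)
    for sp in np.unique(superpixels):
        mask = superpixels == sp
        regions, counts = np.unique(ground_truth[mask],
                                    return_counts=True)
        predicted[mask] = regions[np.argmax(counts)]  # majority region
    return float(np.mean(predicted != ground_truth))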

The results in Table 2 show that the presented method slightly outperforms the segmentation in [4]. This could be explained by the fact that the superpixel method also takes the RGB input into account, and not just the optical flow. The outlier for the marple9 sequence is due to the bodies of the subjects not moving during the entire sequence. However, as shown in the 5th row of Table 2, the method does correctly segment the heads, which do move in the sequence. The marple10 sequence also performs below average, because in the ground truth, besides a person, one of the walls is also considered an object; this wall is hard to detect as a separate object because the other walls move with a similar motion.

Table 2. Left: evaluation results (overall clustering error per sequence; values lost in transcription are left blank). Right: segmentation examples from the Berkeley motion segmentation dataset.

sequence | overall error
cars1    | 4.01%
cars2    | 4.79%
cars3    | 3.93%
cars4    | 0.42%
cars5    | 1.22%
cars6    | 0.47%
cars7    | 0.62%
cars8    | 2.42%
cars9    | 1.77%
cars10   |
marple1  | 3.36%
marple2  | 2.18%
marple3  | 2.96%
marple4  | 2.65%
marple5  | 2.05%
marple6  | 7.63%
marple7  | 5.12%
marple8  | 3.47%
marple9  |
marple10 |
marple11 |
marple12 |
marple13 |
people1  | 0.99%
people2  | 1.06%
tennis   | 3.40%
average  | 5.67%

5.2. Benefit of using Depth and Motion

We compare the superpixel segmentation for four cases: (1) based on color only; (2) based on color and depth; (3) based on color and motion; and (4) based on color, depth and motion. The examples in Figure 6b show that the segmentation is improved by adding depth: color alone is often not enough to distinguish between different objects. Figure 6c shows that by taking motion into account, we can segment moving objects or object parts into separate superpixels. By using both depth and motion, as shown in Figure 6d, superpixels are obtained that segment the person and the moving parts correctly. This experiment shows the usefulness of incorporating depth in order to correctly segment static foreground objects, and of incorporating optical flow in order to segment moving objects.
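In terms of the weights of Eq. (1), these four cases can be read as the following settings. This mapping is our hedged interpretation, not the authors' stated procedure: depth is disabled via w_z = 0 and motion via n = 0, while the spatial term stays active.

# Hypothetical weight settings reproducing the four cases of Figure 6.
CONFIGS = {
    "color only":             dict(w_z=0.0,  n=0.0),
    "color + depth":          dict(w_z=10.0, n=0.0),
    "color + motion":         dict(w_z=0.0,  n=10.0),
    "color + depth + motion": dict(w_z=10.0, n=10.0),
}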
5.3. PrimeSense vs. Stereo

We also compare the performance based on an IR-based sensor to that based on real-time stereo. As shown in Figure 4, the discontinuities in the disparity image from IR-based sensors are limited enough that one could segment based on a threshold alone. In the case of real-time stereo, however, the disparity image is very noisy. Example results from a PrimeSense sensor are shown in Figure 7, and from a real-time stereo setup in Figure 8. For this experiment the PrimeSense sensor and the stereo camera pair were mounted on the same tripod, and the same sequences were recorded simultaneously (note that the PrimeSense sensor has a wider angle of view). The results show that for the case of real-time stereo, despite the noisy disparity image, a good segmentation can still be obtained by incorporating color and motion.

5.4. Outdoors

One of the main reasons for choosing real-time stereo over IR-based sensors is outdoor functionality. Therefore we show some examples recorded outdoors on an autonomous urban robot. These results are shown in Figure 9, with examples of traffic and interaction scenarios. Notice that the monotonous appearance of the street makes the stereo disparity image even noisier. Nevertheless, the system is able to segment moving cars at crossings, persons in front of the robot, and moving (waving) arms.

6. Conclusion

In this paper, a novel superpixel segmentation technique was presented that takes noisy depth and motion data and produces a useful object segmentation. A color-spatio-temporal distance measure was introduced that incorporates color (RGB), spatial (xyz) and temporal information (optical flow).

Furthermore we introduced temporal simple linear iterative clustering (temporal SLIC), which exploits the temporal aspect of video in order to minimize the number of iterations required per frame. We presented a real-time stereo system that provides object boundaries based on color, depth and motion, and thus makes it possible to detect foreground objects and moving objects at the same time.

The superpixel segmentation is not the end of the pipeline. The segmentation can be improved by subsequent steps that process the superpixels. These can be based on simple features such as color, depth and motion, but also on additional, more complicated features to further identify the content of the superpixels. However, this post-processing is outside the scope of this paper.

The presented system will be applied to detecting waving and pointing hand gestures on a mobile robot platform, which will be used outdoors. The method allows for detection of persons in the foreground, moving body parts, and moving objects in the background. In these circumstances we cannot fall back on IR-based sensors, and the segmentation that combines depth and motion is useful.

It should be noted that the quality of the stereo/depth input is not always the same, and the real-time stereo can sometimes fail completely. This influences the superpixel segmentation, as we use a fixed weight parameter (w_z) to control the importance of the depth input. This could be resolved by using a dynamic weight which changes depending on the confidence of the depth values for that frame or pixel. There is also the option to extend the optical flow to 3D: based on the depth data it should be possible to calculate a 3D flow vector. One could also look into incorporating additional input values, especially in the outdoor case, where the input from laser scans or other sensors could be used. Furthermore, for the future it will be good to look into a tighter integration of the stereo, optical flow and superpixel clustering. These processes are closely related, and a smarter integration should make it possible to further reduce processing time while improving the quality of the segmentation.

Acknowledgments. This work is carried out in the context of the Seventh Framework Programme of the European Commission, EU Project FP7 ICT Interactive Urban Robot (IURO), and the SNF project Vision-supported Speech-based Human Machine Interaction.

References

[1] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk. SLIC superpixels. EPFL Technical Report, June 2010.
[2] M. Bleyer and M. Gelautz. Simple but effective tree structures for dynamic programming-based stereo matching. In International Conference on Computer Vision Theory and Applications (VISAPP), 2008.
[3] X. Boix, J. M. Gonfaus, J. van de Weijer, A. D. Bagdanov, J. Serrat, and J. Gonzalez. Harmony potentials: Fusing global and local scale for semantic image segmentation.
[4] T. Brox and J. Malik. Object segmentation by long term analysis of point trajectories. In Proceedings of the European Conference on Computer Vision, Greece, 2010.
[5] D. Comaniciu and P. Meer. Mean shift: a robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5):603-619, 2002.
[6] P. F. Felzenszwalb and D. P. Huttenlocher. Efficient graph-based image segmentation. International Journal of Computer Vision, 59(2):167-181, 2004.
[7] C. Fowlkes, D. Martin, and J. Malik. Learning affinity functions for image segmentation: Combining patch-based and gradient-based approaches. In Proceedings of CVPR, 2003.
[8] B. Fulkerson, A. Vedaldi, and S. Soatto. Class segmentation and object localization with superpixel neighborhoods. In Proceedings of ICCV, 2009.
[9] A. Geiger, M. Roser, and R. Urtasun. Efficient large-scale stereo matching. In Proceedings of the Asian Conference on Computer Vision, Queenstown, New Zealand, November 2010.
[10] A. Griesser, S. De Roeck, A. Neubeck, and L. Van Gool. GPU-based foreground-background segmentation using an extended colinearity criterion. In Proceedings of Vision, Modeling, and Visualization (VMV), November 2005.
[11] B. Jedynak, H. Zheng, and M. Daoudi. Skin detection using pairwise models. Image and Vision Computing, 2005.
[12] D. Martin, C. Fowlkes, and J. Malik. Learning to detect natural image boundaries using local brightness, color and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5):530-549, 2004.
[13] R. Mester, T. Aach, and L. Dümbgen. Illumination-invariant change detection using a statistical colinearity criterion. In Pattern Recognition: Proceedings of the 23rd DAGM Symposium, 2001.
[14] M. Werlberger, T. Pock, and H. Bischof. Motion estimation with non-local total variation regularization. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, June 2010.
[15] M. Werlberger, W. Trobin, T. Pock, A. Wedel, D. Cremers, and H. Bischof. Anisotropic Huber-L1 optical flow. In Proceedings of the British Machine Vision Conference (BMVC), London, UK, September 2009.

Figure 6. Superpixels depending on different input types (the image in the background is faded to make the superpixels more visible): (a) superpixels based on RGB values; (b) on RGB and depth; (c) on RGB and optical flow; (d) on RGB, depth and optical flow. By adding depth, the persons are segmented more cleanly. By adding motion, the moving arms are segmented as a separate object. By using both, we get cleanly segmented persons and moving body parts.

Figure 7. Experiments with RGB and depth data taken from a PrimeSense sensor. The clean and pixel-accurate depth registration allows the system to produce a clean segmentation.

Figure 8. Experiments with RGB images taken from two Point Grey Grasshopper cameras and depth provided by real-time stereo.

Figure 9. The superpixel segmentation on some outdoor scenes recorded from a robot. From left to right: RGB input image, stereo disparity, optical flow, resulting superpixels, resulting segmentation.


Graph-Based Superpixel Labeling for Enhancement of Online Video Segmentation Graph-Based Superpixel Labeling for Enhancement of Online Video Segmentation Alaa E. Abdel-Hakim Electrical Engineering Department Assiut University Assiut, Egypt alaa.aly@eng.au.edu.eg Mostafa Izz Cairo

More information

3D Time-of-Flight Image Sensor Solutions for Mobile Devices

3D Time-of-Flight Image Sensor Solutions for Mobile Devices 3D Time-of-Flight Image Sensor Solutions for Mobile Devices SEMICON Europa 2015 Imaging Conference Bernd Buxbaum 2015 pmdtechnologies gmbh c o n f i d e n t i a l Content Introduction Motivation for 3D

More information

Edge tracking for motion segmentation and depth ordering

Edge tracking for motion segmentation and depth ordering Edge tracking for motion segmentation and depth ordering P. Smith, T. Drummond and R. Cipolla Department of Engineering University of Cambridge Cambridge CB2 1PZ,UK {pas1001 twd20 cipolla}@eng.cam.ac.uk

More information

Feature Tracking and Optical Flow

Feature Tracking and Optical Flow Feature Tracking and Optical Flow Prof. D. Stricker Doz. G. Bleser Many slides adapted from James Hays, Derek Hoeim, Lana Lazebnik, Silvio Saverse, who in turn adapted slides from Steve Seitz, Rick Szeliski,

More information

Highlight detection with application to sweet pepper localization

Highlight detection with application to sweet pepper localization Ref: C0168 Highlight detection with application to sweet pepper localization Rotem Mairon and Ohad Ben-Shahar, the interdisciplinary Computational Vision Laboratory (icvl), Computer Science Dept., Ben-Gurion

More information

Image Segmentation Via Iterative Geodesic Averaging

Image Segmentation Via Iterative Geodesic Averaging Image Segmentation Via Iterative Geodesic Averaging Asmaa Hosni, Michael Bleyer and Margrit Gelautz Institute for Software Technology and Interactive Systems, Vienna University of Technology Favoritenstr.

More information

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching Stereo Matching Fundamental matrix Let p be a point in left image, p in right image l l Epipolar relation p maps to epipolar line l p maps to epipolar line l p p Epipolar mapping described by a 3x3 matrix

More information

Motion Estimation using Block Overlap Minimization

Motion Estimation using Block Overlap Minimization Motion Estimation using Block Overlap Minimization Michael Santoro, Ghassan AlRegib, Yucel Altunbasak School of Electrical and Computer Engineering, Georgia Institute of Technology Atlanta, GA 30332 USA

More information

Object Tracking with an Adaptive Color-Based Particle Filter

Object Tracking with an Adaptive Color-Based Particle Filter Object Tracking with an Adaptive Color-Based Particle Filter Katja Nummiaro 1, Esther Koller-Meier 2, and Luc Van Gool 1,2 1 Katholieke Universiteit Leuven, ESAT/VISICS, Belgium {knummiar,vangool}@esat.kuleuven.ac.be

More information

MOVING OBJECT DETECTION USING BACKGROUND SUBTRACTION ALGORITHM USING SIMULINK

MOVING OBJECT DETECTION USING BACKGROUND SUBTRACTION ALGORITHM USING SIMULINK MOVING OBJECT DETECTION USING BACKGROUND SUBTRACTION ALGORITHM USING SIMULINK Mahamuni P. D 1, R. P. Patil 2, H.S. Thakar 3 1 PG Student, E & TC Department, SKNCOE, Vadgaon Bk, Pune, India 2 Asst. Professor,

More information

A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation

A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation Alexander Andreopoulos, Hirak J. Kashyap, Tapan K. Nayak, Arnon Amir, Myron D. Flickner IBM Research March 25,

More information

CS 534: Computer Vision Segmentation and Perceptual Grouping

CS 534: Computer Vision Segmentation and Perceptual Grouping CS 534: Computer Vision Segmentation and Perceptual Grouping Ahmed Elgammal Dept of Computer Science CS 534 Segmentation - 1 Outlines Mid-level vision What is segmentation Perceptual Grouping Segmentation

More information