
KERNEL-BASED TRACKING USING SPATIAL STRUCTURE

Nicole M. Artner 1, Salvador B. López Mármol 2, Csaba Beleznai 1 and Walter G. Kropatsch 2

Abstract

We extend the concept of kernel-based tracking by modeling the spatial structure of multiple tracked feature points belonging to the same object with a simple graph-based representation. Tracking parts or multiple feature points of an object without considering the underlying structure becomes ambiguous if the target representation (for example, color histograms) is similar to that of other nearby targets or of the background. Instead of treating the tracking of multiple targets as isolated processes, we propose an approach that incorporates spatial dependencies between tracked targets, together with an iterative technique that efficiently locates the spatial arrangement of targets maximizing the joint posterior. We present a series of experiments demonstrating that the proposed method provides improved tracking stability and accuracy compared to standard Mean Shift tracking. Furthermore, we analyze the robustness of the proposed method by assessing its performance in scenarios where occlusions are present.

1. Introduction

Object representation is a crucial issue for object tracking and has a significant impact on the quality of a tracking algorithm. Tracking quality can, for example, be gauged in terms of spatial accuracy and temporal stability. To achieve reliable tracking performance, the employed object representation needs to be discriminative, invariant against variations in object appearance, and computationally efficient enough to be applicable to multiple targets, as is often the case in realistic scenarios. These requirements are especially difficult to meet for targets with deformable shape that simultaneously undergo photometric variations. Similar challenges exist in the field of visual object recognition.
For object recognition, significant achievements have been made in recent years by devising part-based object representations of high representational power at low computational complexity [5, 4]. Part-based or structural representations, however, are still relatively unexplored for the tracking task, where mainly either strictly local (point tracking) or global object representations prevail. This is surprising, considering that structure is an important invariant. In this paper, we propose an initial concept for combining deterministic tracking of object parts with a graph representation encoding structural dependencies between the parts. In general, image graphs can be used to represent structure and topology. The output of any segmentation algorithm which produces regions with closed boundaries (e.g. the watershed algorithm) can be represented as a region adjacency graph (RAG) [16]. We use a Maximally Stable Extremal Regions (MSER) detector [10] to generate regions, which represent the nodes of the graph. For this purpose, other region detectors (e.g. Harris-Affine, Salient Regions [11]) can also be used. The edges between the nodes define the region adjacencies. We compute color histograms on the nodes, thus obtaining an attributed graph (AG). Other region descriptors, such as SIFT [8], can also be used to encode node properties into a compact representation.

1 Austrian Research Centers GmbH - ARC, Smart Systems Division, Vienna, Austria, {nicole.artner, csaba.beleznai}@arcs.ac.at
2 PRIP, Vienna University of Technology, Austria, {salva, krw}@prip.tuwien.ac.at

When the structure of the tracked object is represented by a graph, the data association task between adjacent frames (an integral part of most tracking algorithms) becomes a graph matching problem. As graph matching is NP-complete, it is only feasible on graphs with few nodes; this usually motivates the use of multiple resolutions of the graph structure [3]. In our case, we avoid the complexity of graph matching and use the mode-seeking property of the Mean Shift algorithm for inter-frame node association. During object tracking, the color histograms of the AG and the spring-like edge energies of the structure drive a gradient-based iteration on the joint (color similarity and structure) surface, maximizing color similarity while minimizing structural energy.

The remainder of this paper is organized as follows: Section 2 gives an overview of related approaches employing structure to track objects. Section 3 explains our approach and its algorithmic steps. Section 4 presents the experiments and discusses the results. Finally, Section 5 concludes and describes future work.

2. Related work

The few approaches which use a graph-based structure representation for tracking can be grouped into three categories:

2.1. Graph-based methods using graph matching

Graphs offer a way to represent structure in a rich and compact manner. After setting up node attributes - such as size, average color and position - edges are defined to specify the spatial relationships (adjacency, border) between the nodes.
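As an illustration, such an attributed region graph can be held in ordinary containers: each node carries a color histogram and a center, and each edge records an adjacency. The region format (`pixels_xy`, `colors`) and the helper names below are hypothetical conveniences for this sketch, not part of the paper:

```python
import numpy as np

def color_histogram(colors, bins=8):
    """Normalized RGB histogram of an (n, 3) array of pixel colors."""
    hist, _ = np.histogramdd(colors, bins=(bins,) * 3, range=((0, 256),) * 3)
    hist = hist.flatten()
    return hist / hist.sum()

def build_attributed_graph(regions, adjacent):
    """Nodes: region index -> attributes (center, histogram).
    Edges: set of sorted index pairs for adjacent regions."""
    nodes = {i: {"center": r["pixels_xy"].mean(axis=0),
                 "hist": color_histogram(r["colors"])}
             for i, r in enumerate(regions)}
    edges = {tuple(sorted(pair)) for pair in adjacent}
    return nodes, edges

# Two toy regions with random pixel positions and colors, declared adjacent.
rng = np.random.default_rng(0)
regions = [{"pixels_xy": rng.uniform(0, 50, (40, 2)),
            "colors": rng.integers(0, 256, (40, 3))} for _ in range(2)]
nodes, edges = build_attributed_graph(regions, [(0, 1)])
```

In the paper the node attributes are the 3D color histograms used by the Mean Shift trackers, and the edges come from a Delaunay triangulation of the region centers.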
Graph matching methods can be used to associate structures acquired at different time instances. In [7], links between RAGs of consecutive frames are established using temporal edges, which connect the same regions in different frames. Tang and Tao [15] model tracked targets with Scale-Invariant Feature Transform (SIFT) features and represent their relationships with Attributed Relational Graphs (ARGs); the graph matching problem is solved efficiently by relaxation labeling. The quality of a RAG obtained by segmenting an image depends heavily on the characteristics of the image, and the process is usually slow. Because of this, in [14] graph matching is used where some features are detected in the image directly instead of on the RAG.

2.2. Graph-based methods not using graph matching

In these approaches, graphs are only used to represent structure, not to associate consecutive measurements. For example, Ma et al. [9] represent the spatial configuration of multiple targets by a graph and solve the association problem with a maximum a posteriori formulation. Graph matching in such a case would be highly complex, since the spatial relations between multiple individual objects may change significantly over time. A different approach to graph-based tracking is proposed by Conte et al. in [3]. They use graph pyramids to describe each frame at several levels of detail. By the use of graph pyramids, the method is able to assign labels (such as occluded, not occluded or background) to each pixel of a moving foreground region during partial occlusions.

2.3. Other approaches

Some methods use graphs to represent task-specific prior knowledge. For example, Rehg and Kanade deal in [12] with self-occluding articulated objects, applying a kinematic model to predict occlusions and using a graph with just one level. In their experiments they track the fingers of a hand, and they distinguish occlusion cases by the ordering of the finger templates relative to the camera. Related approaches attempt to recover the pose of a 3D articulated model from 2D video sequences. Sminchisescu and Triggs present in [13] an approach which uses graph-based structural constraints on human motion together with a high-dimensional search strategy.

3. Our approach

This section introduces the methods used in our approach and explains their combination.

3.1. MSER

For initializing the attributed graph (representing the object to be tracked), we use the Maximally Stable Extremal Regions (MSER) detector, developed by Matas et al. [10]. It has been evaluated [6, 11] as the most reliable interest point detector in terms of detection repeatability across various geometric transformations, image blur and photometric changes. Maximally stable extremal regions are connected components of an image thresholded according to a specific scheme. The term extremal refers to the property that all pixels within an extremal region have either higher (bright or positive extremal regions) or lower (dark or negative extremal regions) intensity values than the pixels at the region's outer boundary. The maximally stable property refers to the criterion used to select an optimum threshold for a given region.
The threshold selection criterion accepts a given threshold - and creates a maximally stable extremal region - if, in the neighborhood of the current threshold, the rate of area change has a local minimum with respect to the threshold variation. The output of the MSER algorithm is not a simple binary image: for certain parts of an image, multiple thresholds may exist (each fulfilling the criterion of maximum stability), creating in such cases a nested subset of regions. For more details refer to [10]. The MSER detector is applied to the image region containing the object, delineated manually by the user. The MSER computation is used only once, to initialize the graph structure and the Mean Shift trackers at each node. Given the extremal property of the MSER regions, the computed node-specific local histograms - used by the Mean Shift tracking - are well-defined (narrowly peaked) due to the high color uniformity within the detected regions.

3.2. Mean Shift

The Mean Shift algorithm is used to associate the nodes of the structure between adjacent image frames. Mean Shift is a robust statistical procedure which locates local density maxima in a given probability distribution. It uses a search window positioned over a section of the probability distribution; within this window, the density maximum can be estimated by a simple weighted average computation. The search window is then moved to the position of this maximum and the calculation is repeated until the algorithm converges. Convergence of the mode seeking process implies that the nearest local density maximum (mode) is found and that the Mean Shift offset becomes, after a certain number of iterations, very small.

The implementation of tracking with Mean Shift in this paper mainly follows the ideas in [2]. For every region obtained in the initialization step by the MSER algorithm, we build a target model \hat{q} in the form of a 3D color histogram. Every dimension of the histogram corresponds to one channel of the RGB color space. The histogram is subdivided into bins u = 1 \ldots m to reduce the amount of data and to cluster similar colors. The discrete distribution of color probabilities is computed according to the following formula [2]:

\hat{q}_u = C \sum_{i=1}^{n} k(\|x_i^*\|^2)\, \delta(b(x_i) - u),   (1)

where C is a normalizing factor such that

\sum_{u=1}^{m} \hat{q}_u = 1.   (2)

k in Equation (1) stands for the Epanechnikov kernel [1] and is used to control the influence of the pixels in the region on the target model: the pixels are weighted depending on their distance to the center of the region. x_i are the pixel positions in the image and x_i^* are the normalized pixel positions. b is a function mapping a pixel in the 2D image space to the 1D space of histogram bin indices: depending on the RGB value of a pixel, b provides the index of the corresponding histogram bin. \delta is the Kronecker delta function. As proposed in [2], in every frame we calculate a candidate model \hat{p} in addition to the target model \hat{q} from the initialization. The candidate model

\hat{p}_u(c) = C \sum_{i=1}^{n} k(\|x_i^*\|^2)\, \delta(b(x_i) - u)   (3)

is created from the pixels in the search window at the current position c = (c_x, c_y); Equation (3) is a reformulation of Equation (1) at position c. The candidate and target histogram models are used to compute the new position

c' = \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i}   (4)

of the target object within the Mean Shift iterations.
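The target model of Equation (1) can be sketched in a few lines of numpy. The function names and the use of precomputed flat bin indices b(x_i) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def epanechnikov_profile(r2):
    """Kernel profile k(r^2): proportional to 1 - r^2 inside the unit ball, 0 outside."""
    return np.where(r2 < 1.0, 1.0 - r2, 0.0)

def target_model(positions, bin_indices, center, bandwidth, m):
    """Eq. (1): kernel-weighted color histogram q_hat over m bins.
    positions: (n, 2) pixel coordinates; bin_indices: b(x_i) in [0, m)."""
    x_star = (positions - center) / bandwidth            # normalized pixel positions
    weights = epanechnikov_profile((x_star ** 2).sum(axis=1))
    q = np.bincount(bin_indices, weights=weights, minlength=m)
    return q / q.sum()                                   # normalization C, so Eq. (2) holds

# Toy region: 200 random pixel positions with random bin indices.
rng = np.random.default_rng(1)
pos = rng.uniform(-5, 5, (200, 2))
bins_ = rng.integers(0, 16, 200)
q_hat = target_model(pos, bins_, center=np.zeros(2), bandwidth=6.0, m=16)
```

The paper's 3D RGB histogram corresponds to m = bins^3 flat indices here; pixels near the region center contribute more, pixels outside the kernel support contribute nothing.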
The pixels used to calculate the position c' are weighted according to

w_i = \sum_{u=1}^{m} \sqrt{\frac{\hat{q}_u}{\hat{p}_u(c)}}\, \delta(b(x_i) - u),   (5)

where \hat{q} and \hat{p}(c) are the target and candidate models. The obtained weight w_i denotes the probability value of a pixel within the search window.
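One Mean Shift position update (Equations 3-5) might then look as follows. This is a sketch assuming precomputed bin indices and the Bhattacharyya-derived square-root weighting of [2]; names and the toy example are illustrative:

```python
import numpy as np

def mean_shift_step(positions, bin_indices, q_hat, c, bandwidth):
    """One Mean Shift position update.
    positions: (n, 2) pixel coordinates in the search window; bin_indices: b(x_i)."""
    m = len(q_hat)
    x_star = (positions - c) / bandwidth
    k = np.maximum(1.0 - (x_star ** 2).sum(axis=1), 0.0)    # Epanechnikov profile
    p = np.bincount(bin_indices, weights=k, minlength=m)
    p = p / p.sum()                                         # candidate model, Eq. (3)
    ratio = np.sqrt(np.divide(q_hat, p, out=np.zeros(m), where=p > 0))
    w = ratio[bin_indices]                                  # pixel weights, Eq. (5)
    return (positions * w[:, None]).sum(axis=0) / w.sum()   # new position, Eq. (4)

# Toy example: the target model favors bin 0, whose pixels sit at the origin,
# so the window center is pulled from (1.5, 0) toward the bin-0 cluster.
positions = np.array([[0.0, 0.0]] * 10 + [[3.0, 0.0]] * 10)
bins_ = np.array([0] * 10 + [1] * 10)
q_hat = np.array([1.0, 0.0])
new_c = mean_shift_step(positions, bins_, q_hat, np.array([1.5, 0.0]), bandwidth=5.0)
```

Iterating this update until the offset ‖c' − c‖ falls below a small tolerance reproduces the mode seeking behavior described above.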

Figure 1. Edge relaxation examples. B' and B denote the deformed and equilibrium node locations, respectively.

3.3. Graph relaxation

The graph encoding the structural dependencies between the MSER regions is obtained using the Delaunay triangulation. Our objective is to link the processes of (1) structural energy minimization of the graph and (2) color histogram similarity maximization at the nodes by Mean Shift tracking. The graph relaxation step introduces a mechanism which - upon drift in the Mean Shift tracking results - imposes structural constraints on the Mean Shift mode seeking process. As the tracked objects are rigid, the objective of the relaxation is to keep the tracked structure as similar as possible to the initial structure; graph relaxation thus minimizes the dissimilarity between the initial and the tracked structure. This is an energy minimization problem on the total energy of the structure, E_t. The total energy of the structure in the initial state is zero, because the initial structure is considered the true object structure. During tracking, E_t usually changes because of the spatial tracking errors of the Mean Shift tracker. The structural energy E_t is computed using the concept of spring-like edges between nodes:

E_t = \sum_{e} k\,(e' - e)^2,   (6)

where e' and e denote the deformed and undeformed edge lengths. The variations of the edge lengths and their directions are used to determine a structural offset component for each node. The direction of the offset points toward the maximum descent of the structural energy function; in other words, the offset vector represents the direction in which a given node should move so that its edges restore their initial lengths and the energy of the structure is minimized.
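A minimal sketch of the spring energy of Equation (6), assuming edges are stored as node-index pairs with their initial (rest) lengths; the data layout is an illustrative assumption:

```python
import numpy as np

def total_energy(node_pos, edges, rest_length, k=0.2):
    """Eq. (6): E_t = sum over edges of k * (|e'| - |e|)^2, where |e'| is the
    current (deformed) edge length and |e| the undeformed (initial) length."""
    e_t = 0.0
    for (a, b) in edges:
        cur = np.linalg.norm(node_pos[a] - node_pos[b])
        e_t += k * (cur - rest_length[(a, b)]) ** 2
    return e_t

# A single edge of rest length 1: zero energy initially, positive when stretched.
edges = [(0, 1)]
rest = {(0, 1): 1.0}
e0 = total_energy({0: np.array([0.0, 0.0]), 1: np.array([1.0, 0.0])}, edges, rest)
e1 = total_energy({0: np.array([0.0, 0.0]), 1: np.array([2.0, 0.0])}, edges, rest)
```

With k = 0.2 as in the experiments, stretching the edge by one unit yields e1 = 0.2, while the undeformed structure has e0 = 0.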
We calculate this structural energy minimization offset vector O for each node n as follows:

O(n) = \sum_{e \in E(n)} 2k\,(e' - e)\,(-d(e, n)),   (7)

where E(n) is the set of edges e incident to node n, k is the elasticity constant of the edges in the structure, and d(e, n) is the unit vector in the direction of edge e that points toward n. Figure 1(a) shows three possible states of an edge. First, the initial state is shown. Next, a state where the edge is contracted: in this case, the offset vector O forces node B to move so that the edge is enlarged back to its initial length. In the third case the edge is too long, so O tends to contract it. Figure 1(b) shows how the sum of the offset vectors of the incident edges moves node B' to its structurally correct position B.

3.4. Combining Mean Shift and graph relaxation

In the proposed combined Algorithm 1, graph relaxation is embedded into the iterative Mean Shift tracking process. For every frame we perform Mean Shift and structural iterations until the algorithm converges, i.e. until a maximum number of iterations ɛ_i is reached or the graph structure attains equilibrium (its total energy falls beneath a threshold ɛ_e). To compute the position of each region (node), the Mean Shift offset and the structure-induced offset are combined using a mixing coefficient g. The ordering of the region selection during the iterations is randomized to minimize deterministic errors. One could instead order the regions by the confidences of their Mean Shift trackers; this would have the advantage that the iteration process does not start with an occluded region, but the drawback that it could introduce the deterministic errors mentioned above. The algorithmic combination represents a joint iterative mode seeking process on the color similarity and structural energy surfaces. As demonstrated in the next section, the joint use of Mean Shift and structural constraints significantly improves tracking in the presence of occlusions, or when multiple similarly colored nearby objects are tracked on a patterned background. The calculation of the 3D color histograms for the Mean Shift iterations represents the largest part of the computational cost; consequently, the complexity of our algorithm scales linearly with the number of regions.
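The structural offset of Equation (7) and the mixing step of Algorithm 1 can be sketched together. The constants g = 0.55 and k = 0.2 follow the experimental settings; the data layout and names are illustrative assumptions:

```python
import numpy as np

def structural_offset(n, node_pos, edges_of, rest_length, k=0.2):
    """Eq. (7): offset moving node n toward lower structural energy.
    edges_of[n] lists the edges (a, b) incident to n; rest_length maps
    sorted edge pairs to their undeformed lengths."""
    o = np.zeros(2)
    for (a, b) in edges_of[n]:
        other = b if a == n else a
        d = node_pos[n] - node_pos[other]
        cur = np.linalg.norm(d)
        d_unit = d / cur                          # unit vector pointing toward n
        rest = rest_length[(min(a, b), max(a, b))]
        o += 2 * k * (cur - rest) * (-d_unit)     # stretched edge pulls n inward
    return o

def combined_update(n, p_ms, node_pos, edges_of, rest_length, g=0.55):
    """Algorithm 1 mixing step: p_n = (1 - g) * p_ms + g * p_s."""
    p_s = node_pos[n] + structural_offset(n, node_pos, edges_of, rest_length)
    return (1 - g) * p_ms + g * p_s

# Node 1 sits one unit too far from node 0 (rest length 1), so the structural
# offset pulls it back, and the combined update mixes this with the Mean Shift
# position (here taken as the current position of node 1).
node_pos = {0: np.array([0.0, 0.0]), 1: np.array([2.0, 0.0])}
edges_of = {1: [(0, 1)]}
rest = {(0, 1): 1.0}
o1 = structural_offset(1, node_pos, edges_of, rest)
p_n = combined_update(1, node_pos[1], node_pos, edges_of, rest)
```

Running this update for all nodes in randomized order, recomputing E_t, and repeating until E_t < ɛ_e or the iteration limit ɛ_i is reached corresponds to the inner loop of Algorithm 1.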
Algorithm 1 Mean Shift using spatial structure

1: TRACKER(V, nf, R, nr, S, g)
   (V video sequence, nf number of frames, R regions from MSER, nr number of regions, S initial structure, g mixing coefficient; ɛ_e threshold for the total energy of the structure, ɛ_i maximum number of iterations)
2: i ← 1, counter ← 1   (iteration counters)
3: while (i ≤ nf) do
4:   converged ← false
5:   while (not converged) do
6:     rir ← list of randomized indices of the regions R
7:     for ir ← 1, nr do
8:       p_ms ← Mean Shift iteration for R(rir(ir))   (p_ms: position from Mean Shift)
9:       p_s ← structural iteration for R(rir(ir))   (p_s: position from structure)
10:      calculate the new position p_n for R(rir(ir)): p_n = (1 − g) p_ms + g p_s
11:    end for
12:    E_t ← total energy of the structure
13:    counter ← counter + 1
14:    if E_t < ɛ_e or counter ≥ ɛ_i then
15:      converged ← true
16:    end if
17:  end while
18:  i ← i + 1
19: end while
20: end

4. Results and discussion

In this section, the results for one synthetic and two real video sequences are presented. In our experiments we used a Matlab implementation, which can process a frame in less than one second on a 2.8 GHz Pentium with 512 MB RAM. Note that our implementation is only a prototype, and better performance could certainly be achieved. For all sequences, the MSER algorithm is used to initialize the Mean Shift tracking process and to build the graph representing the structure. In all experiments, a mixing coefficient g of 0.55 and a spring constant k of 0.2 are used. The thresholds defined in Algorithm 1, ɛ_i and ɛ_e, are set to 4 and 0.5, respectively.

Figure 2. Tracking results for the synthetic sequence (the interesting parts of the frames were cropped). Top row (a, b, c, d): without structure. Bottom row (e, f, g, h): with structure. The black graphs are ground truth and the white graphs are the results.

Figure 3. Deviation from ground truth for the synthetic sequence. (a) Without structure. (b) With structure.

4.1. Comparison of Mean Shift with and without structure

In the synthetic sequence, the task is to track 2 homogeneous, rigidly connected regions of an image pattern. During the sequence the pattern is translated and rotated. Figure 2 shows the results with and without the use of structure; the accuracy of tracking without structure is clearly worse. Figure 3 shows the time evolution of the spatial deviation (Euclidean distance) between ground truth and the results: tracking without structure gradually loses several tracked nodes, while with structure no significant drift is present.

Figure 4. Temporal evolution of the total structural energy, (a) for the whole synthetic sequence and (b) over the iterations for one frame.

4.2. Energy evolution of the structure over time

The total energy E_t of the structure depends on the configuration of the graph. Figure 4(a) visualizes the evolution of E_t over time for the synthetic sequence (see Figure 2). Comparing Figure 4(a) with Figure 3(b) shows the coherence between energy and spatial deviation. During the iterations of one frame, the total energy of the structure is minimized as far as possible; Figure 4(b) shows the evolution of the energy during the iterations for one frame of the synthetic sequence.

4.3. Robustness against occlusions

Occlusions are a serious problem for tracking with Mean Shift: they corrupt the color distribution locally and lead to erroneous Mean Shift offsets. Figure 5 contains several frames of the synthetic video sequence with an occlusion, shown without structure (top row) and with structure (bottom row); the use of structure again produces improved results. Figure 6 displays the temporal evolution of the spatial deviations for the occluded case. Figure 7 demonstrates the robustness of Mean Shift tracking using structure against occlusions, and Table 1 summarizes the occlusion parameters and the obtained results. The synthetic sequence was used for tracking with an increasing area of the occluding block (white rectangle). The spatial deviations do not change significantly (see Figure 7(a) to (c)) even though the occlusion size grows. When the occluding region becomes too large (see Figure 7(d)), too many nodes of the graph are affected by erroneous Mean Shift measurements and the structure - while keeping the correct topology - starts to drift. The temporal evolution of the total energy shows increasing oscillations with a growing amount of occlusion (see Figure 7(c) and (d)).
These are due to the occluder-induced local perturbations. Nevertheless, the global structural constraints are able - up to a point - to restore the equilibrium structure.

4.4. Behavior in real video sequences

The two real video sequences show a checkerboard moving around in the scene. In real video sequence 1 the checkerboard is rotated, and in sequence 2 it is occluded by a hand. Both real video sequences showed that tracking with structure was successful in comparison to tracking without structure.

Figure 5. Comparison of the tracking performance during occlusion without (a, b, c, d) and with (e, f, g, h) structure. Ground truth is marked with black graphs and the results with white. The occlusion is the white rectangle.

Figure 6. Spatial deviations over time from ground truth within the synthetic sequence (55 x 55 pixels occlusion). (a) Deviations without structure. (b) Deviations with structure.

Table 1. Results for the synthetic sequence with different occlusions using structure. First column: plot index in Figure 7. Second column: occlusion size (pixels). Third column: maximum occluded area relative to the area of the pattern. Fourth column: maximum number of occluded nodes out of the 2 nodes of the graph. Fifth column: maximum spatial deviation during the sequence. Last column: highest energy.

Figure 7. Deviations from ground truth (red, thin) and evolution of E_t (black, bold) within the synthetic sequence with occlusions, using structure, for four increasing occlusion sizes (a)-(d).

The challenge in the real videos is, on the one hand, the noise and, on the other hand, the task of tracking part of a checkerboard pattern without drifting. Figures 8 and 9 show interesting frames of the sequences with and without structure.

5. Conclusion and future work

The approach proposed in this paper improves tracking stability, accuracy and robustness compared to standard Mean Shift in difficult scenes (similar objects and background) and during occlusions. The iterative concept of our approach enables real-time performance, although there is no guarantee that the algorithm converges to the global optimum, due to the local nature of the employed search mechanism. Nevertheless, the experiments show that the simultaneous use of structural and color similarity constraints produces the optimum solution in most cases. The method is easily extensible with stochastic optimization, such as particle filtering, which is planned as a further improvement. Future work will extend the proposed method to non-rigid objects (structures) and include an adaptation process for the graph representation.

Figure 8. Results for real video sequence 1 without (top row) and with (bottom row) the use of structure.

Figure 9. Results for real video sequence 2 without (top row) and with (bottom row) the use of structure.

6. Acknowledgment

Partially supported by the Austrian Science Fund under grants P18716-N13 and S913-N13.

References

[1] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. PAMI, 24(5):603-619, 2002.
[2] D. Comaniciu, V. Ramesh, and P. Meer. Kernel-based object tracking. PAMI, 25(5):564-577, 2003.

[3] D. Conte, P. Foggia, J.-M. Jolion, and M. Vento. A graph-based, multi-resolution algorithm for tracking objects in presence of occlusions. In Graph-Based Representations in Pattern Recognition. Springer, 2005.
[4] D. J. Crandall and D. P. Huttenlocher. Composite models of objects and scenes for category recognition. In CVPR, pages 1-8, 2007.
[5] P. Felzenszwalb and D. Huttenlocher. Pictorial structures for object recognition. IJCV, 61(1):55-79, 2005.
[6] F. Fraundorfer and H. Bischof. A novel performance evaluation method of local detectors on non-planar scenes. In CVPR Workshop on Empirical Evaluation Methods in Computer Vision, pages 1-8, 2005.
[7] J. K. Lee, J. H. Oh, and S. Hwang. Clustering of video objects by graph matching. In IEEE International Conference on Multimedia and Expo, July 2005.
[8] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[9] Y. Ma, Q. Yu, and I. Cohen. Multiple hypothesis target tracking using merge and split of graph's nodes. In Advances in Visual Computing. Springer, 2006.
[10] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide baseline stereo from maximally stable extremal regions. In British Machine Vision Conference, pages 384-393, 2002.
[11] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool. A comparison of affine region detectors. IJCV, 65:43-72, 2005.
[12] J. M. Rehg and T. Kanade. Model-based tracking of self-occluding articulated objects. In ICCV, Boston, 1995. IEEE.
[13] C. Sminchisescu and B. Triggs. Covariance scaled sampling for monocular 3D body tracking. In CVPR, 2001. IEEE.
[14] M. Taj, E. Maggio, and A. Cavallaro. Multi-feature graph-based object tracking. In Multimodal Technologies for Perception of Humans. Springer, 2007.
[15] F. Tang and H. Tao. Object tracking with dynamic feature graph. In International Conference on Computer Communications and Networks, pages 25-32, Washington, DC, 2005. IEEE.
[16] A. Tremeau and P. Colantoni. Regions adjacency graph applied to color image segmentation. IEEE Trans. on Image Processing, 9(4), 2000.


More information

A Comparison and Matching Point Extraction of SIFT and ISIFT

A Comparison and Matching Point Extraction of SIFT and ISIFT A Comparison and Matching Point Extraction of SIFT and ISIFT A. Swapna A. Geetha Devi M.Tech Scholar, PVPSIT, Vijayawada Associate Professor, PVPSIT, Vijayawada bswapna.naveen@gmail.com geetha.agd@gmail.com

More information

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit Augmented Reality VU Computer Vision 3D Registration (2) Prof. Vincent Lepetit Feature Point-Based 3D Tracking Feature Points for 3D Tracking Much less ambiguous than edges; Point-to-point reprojection

More information

Structure Guided Salient Region Detector

Structure Guided Salient Region Detector Structure Guided Salient Region Detector Shufei Fan, Frank Ferrie Center for Intelligent Machines McGill University Montréal H3A2A7, Canada Abstract This paper presents a novel method for detection of

More information

Learning Efficient Linear Predictors for Motion Estimation

Learning Efficient Linear Predictors for Motion Estimation Learning Efficient Linear Predictors for Motion Estimation Jiří Matas 1,2, Karel Zimmermann 1, Tomáš Svoboda 1, Adrian Hilton 2 1 : Center for Machine Perception 2 :Centre for Vision, Speech and Signal

More information

Vision and Image Processing Lab., CRV Tutorial day- May 30, 2010 Ottawa, Canada

Vision and Image Processing Lab., CRV Tutorial day- May 30, 2010 Ottawa, Canada Spatio-Temporal Salient Features Amir H. Shabani Vision and Image Processing Lab., University of Waterloo, ON CRV Tutorial day- May 30, 2010 Ottawa, Canada 1 Applications Automated surveillance for scene

More information

CS 4495 Computer Vision A. Bobick. CS 4495 Computer Vision. Features 2 SIFT descriptor. Aaron Bobick School of Interactive Computing

CS 4495 Computer Vision A. Bobick. CS 4495 Computer Vision. Features 2 SIFT descriptor. Aaron Bobick School of Interactive Computing CS 4495 Computer Vision Features 2 SIFT descriptor Aaron Bobick School of Interactive Computing Administrivia PS 3: Out due Oct 6 th. Features recap: Goal is to find corresponding locations in two images.

More information

Feature Based Registration - Image Alignment

Feature Based Registration - Image Alignment Feature Based Registration - Image Alignment Image Registration Image registration is the process of estimating an optimal transformation between two or more images. Many slides from Alexei Efros http://graphics.cs.cmu.edu/courses/15-463/2007_fall/463.html

More information

Local Image Features

Local Image Features Local Image Features Ali Borji UWM Many slides from James Hayes, Derek Hoiem and Grauman&Leibe 2008 AAAI Tutorial Overview of Keypoint Matching 1. Find a set of distinctive key- points A 1 A 2 A 3 B 3

More information

Visual Tracking. Antonino Furnari. Image Processing Lab Dipartimento di Matematica e Informatica Università degli Studi di Catania

Visual Tracking. Antonino Furnari. Image Processing Lab Dipartimento di Matematica e Informatica Università degli Studi di Catania Visual Tracking Antonino Furnari Image Processing Lab Dipartimento di Matematica e Informatica Università degli Studi di Catania furnari@dmi.unict.it 11 giugno 2015 What is visual tracking? estimation

More information

Building a Panorama. Matching features. Matching with Features. How do we build a panorama? Computational Photography, 6.882

Building a Panorama. Matching features. Matching with Features. How do we build a panorama? Computational Photography, 6.882 Matching features Building a Panorama Computational Photography, 6.88 Prof. Bill Freeman April 11, 006 Image and shape descriptors: Harris corner detectors and SIFT features. Suggested readings: Mikolajczyk

More information

Object Recognition with Invariant Features

Object Recognition with Invariant Features Object Recognition with Invariant Features Definition: Identify objects or scenes and determine their pose and model parameters Applications Industrial automation and inspection Mobile robots, toys, user

More information

Image Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58

Image Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58 Image Features: Local Descriptors Sanja Fidler CSC420: Intro to Image Understanding 1/ 58 [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 2/ 58 Local Features Detection: Identify

More information

Salient Visual Features to Help Close the Loop in 6D SLAM

Salient Visual Features to Help Close the Loop in 6D SLAM Visual Features to Help Close the Loop in 6D SLAM Lars Kunze, Kai Lingemann, Andreas Nüchter, and Joachim Hertzberg University of Osnabrück, Institute of Computer Science Knowledge Based Systems Research

More information

Computer Vision I - Filtering and Feature detection

Computer Vision I - Filtering and Feature detection Computer Vision I - Filtering and Feature detection Carsten Rother 30/10/2015 Computer Vision I: Basics of Image Processing Roadmap: Basics of Digital Image Processing Computer Vision I: Basics of Image

More information

Global localization from a single feature correspondence

Global localization from a single feature correspondence Global localization from a single feature correspondence Friedrich Fraundorfer and Horst Bischof Institute for Computer Graphics and Vision Graz University of Technology {fraunfri,bischof}@icg.tu-graz.ac.at

More information

A Keypoint Descriptor Inspired by Retinal Computation

A Keypoint Descriptor Inspired by Retinal Computation A Keypoint Descriptor Inspired by Retinal Computation Bongsoo Suh, Sungjoon Choi, Han Lee Stanford University {bssuh,sungjoonchoi,hanlee}@stanford.edu Abstract. The main goal of our project is to implement

More information

SEARCH BY MOBILE IMAGE BASED ON VISUAL AND SPATIAL CONSISTENCY. Xianglong Liu, Yihua Lou, Adams Wei Yu, Bo Lang

SEARCH BY MOBILE IMAGE BASED ON VISUAL AND SPATIAL CONSISTENCY. Xianglong Liu, Yihua Lou, Adams Wei Yu, Bo Lang SEARCH BY MOBILE IMAGE BASED ON VISUAL AND SPATIAL CONSISTENCY Xianglong Liu, Yihua Lou, Adams Wei Yu, Bo Lang State Key Laboratory of Software Development Environment Beihang University, Beijing 100191,

More information

Bundling Features for Large Scale Partial-Duplicate Web Image Search

Bundling Features for Large Scale Partial-Duplicate Web Image Search Bundling Features for Large Scale Partial-Duplicate Web Image Search Zhong Wu, Qifa Ke, Michael Isard, and Jian Sun Microsoft Research Abstract In state-of-the-art image retrieval systems, an image is

More information

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale.

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale. Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe presented by, Sudheendra Invariance Intensity Scale Rotation Affine View point Introduction Introduction SIFT (Scale Invariant Feature

More information

Computer Vision for HCI. Topics of This Lecture

Computer Vision for HCI. Topics of This Lecture Computer Vision for HCI Interest Points Topics of This Lecture Local Invariant Features Motivation Requirements, Invariances Keypoint Localization Features from Accelerated Segment Test (FAST) Harris Shi-Tomasi

More information

Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds

Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds 9 1th International Conference on Document Analysis and Recognition Detecting Printed and Handwritten Partial Copies of Line Drawings Embedded in Complex Backgrounds Weihan Sun, Koichi Kise Graduate School

More information

Video Google faces. Josef Sivic, Mark Everingham, Andrew Zisserman. Visual Geometry Group University of Oxford

Video Google faces. Josef Sivic, Mark Everingham, Andrew Zisserman. Visual Geometry Group University of Oxford Video Google faces Josef Sivic, Mark Everingham, Andrew Zisserman Visual Geometry Group University of Oxford The objective Retrieve all shots in a video, e.g. a feature length film, containing a particular

More information

Construction of Precise Local Affine Frames

Construction of Precise Local Affine Frames Construction of Precise Local Affine Frames Andrej Mikulik, Jiri Matas, Michal Perdoch, Ondrej Chum Center for Machine Perception Czech Technical University in Prague Czech Republic e-mail: mikulik@cmp.felk.cvut.cz

More information

Local Image Features

Local Image Features Local Image Features Computer Vision CS 143, Brown Read Szeliski 4.1 James Hays Acknowledgment: Many slides from Derek Hoiem and Grauman&Leibe 2008 AAAI Tutorial This section: correspondence and alignment

More information

Robust Online Object Learning and Recognition by MSER Tracking

Robust Online Object Learning and Recognition by MSER Tracking Computer Vision Winter Workshop 28, Janez Perš (ed.) Moravske Toplice, Slovenia, February 4 6 Slovenian Pattern Recognition Society, Ljubljana, Slovenia Robust Online Object Learning and Recognition by

More information

Human Upper Body Pose Estimation in Static Images

Human Upper Body Pose Estimation in Static Images 1. Research Team Human Upper Body Pose Estimation in Static Images Project Leader: Graduate Students: Prof. Isaac Cohen, Computer Science Mun Wai Lee 2. Statement of Project Goals This goal of this project

More information

Fuzzy based Multiple Dictionary Bag of Words for Image Classification

Fuzzy based Multiple Dictionary Bag of Words for Image Classification Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2196 2206 International Conference on Modeling Optimisation and Computing Fuzzy based Multiple Dictionary Bag of Words for Image

More information

Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching

Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching Akshay Bhatia, Robert Laganière School of Information Technology and Engineering University of Ottawa

More information

Shape recognition with edge-based features

Shape recognition with edge-based features Shape recognition with edge-based features K. Mikolajczyk A. Zisserman C. Schmid Dept. of Engineering Science Dept. of Engineering Science INRIA Rhône-Alpes Oxford, OX1 3PJ Oxford, OX1 3PJ 38330 Montbonnot

More information

Real-Time Human Detection using Relational Depth Similarity Features

Real-Time Human Detection using Relational Depth Similarity Features Real-Time Human Detection using Relational Depth Similarity Features Sho Ikemura, Hironobu Fujiyoshi Dept. of Computer Science, Chubu University. Matsumoto 1200, Kasugai, Aichi, 487-8501 Japan. si@vision.cs.chubu.ac.jp,

More information

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking Feature descriptors Alain Pagani Prof. Didier Stricker Computer Vision: Object and People Tracking 1 Overview Previous lectures: Feature extraction Today: Gradiant/edge Points (Kanade-Tomasi + Harris)

More information

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT SIFT: Scale Invariant Feature Transform; transform image

More information

Chapter 9 Object Tracking an Overview

Chapter 9 Object Tracking an Overview Chapter 9 Object Tracking an Overview The output of the background subtraction algorithm, described in the previous chapter, is a classification (segmentation) of pixels into foreground pixels (those belonging

More information

UNSUPERVISED OBJECT MATCHING AND CATEGORIZATION VIA AGGLOMERATIVE CORRESPONDENCE CLUSTERING

UNSUPERVISED OBJECT MATCHING AND CATEGORIZATION VIA AGGLOMERATIVE CORRESPONDENCE CLUSTERING UNSUPERVISED OBJECT MATCHING AND CATEGORIZATION VIA AGGLOMERATIVE CORRESPONDENCE CLUSTERING Md. Shafayat Hossain, Ahmedullah Aziz and Mohammad Wahidur Rahman Department of Electrical and Electronic Engineering,

More information

Viewpoint Invariant Features from Single Images Using 3D Geometry

Viewpoint Invariant Features from Single Images Using 3D Geometry Viewpoint Invariant Features from Single Images Using 3D Geometry Yanpeng Cao and John McDonald Department of Computer Science National University of Ireland, Maynooth, Ireland {y.cao,johnmcd}@cs.nuim.ie

More information

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Section 10 - Detectors part II Descriptors Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering

More information

Comparison of Local Feature Descriptors

Comparison of Local Feature Descriptors Department of EECS, University of California, Berkeley. December 13, 26 1 Local Features 2 Mikolajczyk s Dataset Caltech 11 Dataset 3 Evaluation of Feature Detectors Evaluation of Feature Deriptors 4 Applications

More information

Scale Invariant Segment Detection and Tracking

Scale Invariant Segment Detection and Tracking Scale Invariant Segment Detection and Tracking Amaury Nègre 1, James L. Crowley 1, and Christian Laugier 1 INRIA, Grenoble, France firstname.lastname@inrialpes.fr Abstract. This paper presents a new feature

More information

ECG782: Multidimensional Digital Signal Processing

ECG782: Multidimensional Digital Signal Processing Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu ECG782: Multidimensional Digital Signal Processing Spring 2014 TTh 14:30-15:45 CBC C313 Lecture 10 Segmentation 14/02/27 http://www.ee.unlv.edu/~b1morris/ecg782/

More information

Finding people in repeated shots of the same scene

Finding people in repeated shots of the same scene 1 Finding people in repeated shots of the same scene Josef Sivic 1 C. Lawrence Zitnick Richard Szeliski 1 University of Oxford Microsoft Research Abstract The goal of this work is to find all occurrences

More information

IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES

IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES IMAGE RETRIEVAL USING VLAD WITH MULTIPLE FEATURES Pin-Syuan Huang, Jing-Yi Tsai, Yu-Fang Wang, and Chun-Yi Tsai Department of Computer Science and Information Engineering, National Taitung University,

More information

Color Image Segmentation Using a Spatial K-Means Clustering Algorithm

Color Image Segmentation Using a Spatial K-Means Clustering Algorithm Color Image Segmentation Using a Spatial K-Means Clustering Algorithm Dana Elena Ilea and Paul F. Whelan Vision Systems Group School of Electronic Engineering Dublin City University Dublin 9, Ireland danailea@eeng.dcu.ie

More information

SIFT - scale-invariant feature transform Konrad Schindler

SIFT - scale-invariant feature transform Konrad Schindler SIFT - scale-invariant feature transform Konrad Schindler Institute of Geodesy and Photogrammetry Invariant interest points Goal match points between images with very different scale, orientation, projective

More information

Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation

Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation Obviously, this is a very slow process and not suitable for dynamic scenes. To speed things up, we can use a laser that projects a vertical line of light onto the scene. This laser rotates around its vertical

More information

The SIFT (Scale Invariant Feature

The SIFT (Scale Invariant Feature The SIFT (Scale Invariant Feature Transform) Detector and Descriptor developed by David Lowe University of British Columbia Initial paper ICCV 1999 Newer journal paper IJCV 2004 Review: Matt Brown s Canonical

More information

AK Computer Vision Feature Point Detectors and Descriptors

AK Computer Vision Feature Point Detectors and Descriptors AK Computer Vision Feature Point Detectors and Descriptors 1 Feature Point Detectors and Descriptors: Motivation 2 Step 1: Detect local features should be invariant to scale and rotation, or perspective

More information

School of Computing University of Utah

School of Computing University of Utah School of Computing University of Utah Presentation Outline 1 2 3 4 Main paper to be discussed David G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, IJCV, 2004. How to find useful keypoints?

More information

Segmentation & Clustering

Segmentation & Clustering EECS 442 Computer vision Segmentation & Clustering Segmentation in human vision K-mean clustering Mean-shift Graph-cut Reading: Chapters 14 [FP] Some slides of this lectures are courtesy of prof F. Li,

More information

CS231A Section 6: Problem Set 3

CS231A Section 6: Problem Set 3 CS231A Section 6: Problem Set 3 Kevin Wong Review 6 -! 1 11/09/2012 Announcements PS3 Due 2:15pm Tuesday, Nov 13 Extra Office Hours: Friday 6 8pm Huang Common Area, Basement Level. Review 6 -! 2 Topics

More information

Local invariant features

Local invariant features Local invariant features Tuesday, Oct 28 Kristen Grauman UT-Austin Today Some more Pset 2 results Pset 2 returned, pick up solutions Pset 3 is posted, due 11/11 Local invariant features Detection of interest

More information

Deformation Invariant Image Matching

Deformation Invariant Image Matching Deformation Invariant Image Matching Haibin Ling David W. Jacobs Center for Automation Research, Computer Science Department University of Maryland, College Park {hbling, djacobs}@ umiacs.umd.edu Abstract

More information

Video Google: A Text Retrieval Approach to Object Matching in Videos

Video Google: A Text Retrieval Approach to Object Matching in Videos Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic, Frederik Schaffalitzky, Andrew Zisserman Visual Geometry Group University of Oxford The vision Enable video, e.g. a feature

More information

Lecture 10 Detectors and descriptors

Lecture 10 Detectors and descriptors Lecture 10 Detectors and descriptors Properties of detectors Edge detectors Harris DoG Properties of detectors SIFT Shape context Silvio Savarese Lecture 10-26-Feb-14 From the 3D to 2D & vice versa P =

More information

SCALE INVARIANT FEATURE TRANSFORM (SIFT)

SCALE INVARIANT FEATURE TRANSFORM (SIFT) 1 SCALE INVARIANT FEATURE TRANSFORM (SIFT) OUTLINE SIFT Background SIFT Extraction Application in Content Based Image Search Conclusion 2 SIFT BACKGROUND Scale-invariant feature transform SIFT: to detect

More information

A Novel Algorithm for Color Image matching using Wavelet-SIFT

A Novel Algorithm for Color Image matching using Wavelet-SIFT International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 A Novel Algorithm for Color Image matching using Wavelet-SIFT Mupuri Prasanth Babu *, P. Ravi Shankar **

More information

Patch Descriptors. EE/CSE 576 Linda Shapiro

Patch Descriptors. EE/CSE 576 Linda Shapiro Patch Descriptors EE/CSE 576 Linda Shapiro 1 How can we find corresponding points? How can we find correspondences? How do we describe an image patch? How do we describe an image patch? Patches with similar

More information

Motion Estimation. There are three main types (or applications) of motion estimation:

Motion Estimation. There are three main types (or applications) of motion estimation: Members: D91922016 朱威達 R93922010 林聖凱 R93922044 謝俊瑋 Motion Estimation There are three main types (or applications) of motion estimation: Parametric motion (image alignment) The main idea of parametric motion

More information

Matching Local Invariant Features with Contextual Information: An Experimental Evaluation.

Matching Local Invariant Features with Contextual Information: An Experimental Evaluation. Matching Local Invariant Features with Contextual Information: An Experimental Evaluation. Desire Sidibe, Philippe Montesinos, Stefan Janaqi LGI2P - Ecole des Mines Ales, Parc scientifique G. Besse, 30035

More information

Designing Applications that See Lecture 7: Object Recognition

Designing Applications that See Lecture 7: Object Recognition stanford hci group / cs377s Designing Applications that See Lecture 7: Object Recognition Dan Maynes-Aminzade 29 January 2008 Designing Applications that See http://cs377s.stanford.edu Reminders Pick up

More information

Reinforcement Matching Using Region Context

Reinforcement Matching Using Region Context Reinforcement Matching Using Region Context Hongli Deng 1 Eric N. Mortensen 1 Linda Shapiro 2 Thomas G. Dietterich 1 1 Electrical Engineering and Computer Science 2 Computer Science and Engineering Oregon

More information

Classifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao

Classifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao Motivation Image search Building large sets of classified images Robotics Background Object recognition is unsolved Deformable shaped

More information

Local Features: Detection, Description & Matching

Local Features: Detection, Description & Matching Local Features: Detection, Description & Matching Lecture 08 Computer Vision Material Citations Dr George Stockman Professor Emeritus, Michigan State University Dr David Lowe Professor, University of British

More information

EE795: Computer Vision and Intelligent Systems

EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 14 130307 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Stereo Dense Motion Estimation Translational

More information

The goals of segmentation

The goals of segmentation Image segmentation The goals of segmentation Group together similar-looking pixels for efficiency of further processing Bottom-up process Unsupervised superpixels X. Ren and J. Malik. Learning a classification

More information

Image Feature Evaluation for Contents-based Image Retrieval

Image Feature Evaluation for Contents-based Image Retrieval Image Feature Evaluation for Contents-based Image Retrieval Adam Kuffner and Antonio Robles-Kelly, Department of Theoretical Physics, Australian National University, Canberra, Australia Vision Science,

More information

Computer Vision I - Basics of Image Processing Part 2

Computer Vision I - Basics of Image Processing Part 2 Computer Vision I - Basics of Image Processing Part 2 Carsten Rother 07/11/2014 Computer Vision I: Basics of Image Processing Roadmap: Basics of Digital Image Processing Computer Vision I: Basics of Image

More information

CAP 5415 Computer Vision Fall 2012

CAP 5415 Computer Vision Fall 2012 CAP 5415 Computer Vision Fall 01 Dr. Mubarak Shah Univ. of Central Florida Office 47-F HEC Lecture-5 SIFT: David Lowe, UBC SIFT - Key Point Extraction Stands for scale invariant feature transform Patented

More information

Verslag Project beeldverwerking A study of the 2D SIFT algorithm

Verslag Project beeldverwerking A study of the 2D SIFT algorithm Faculteit Ingenieurswetenschappen 27 januari 2008 Verslag Project beeldverwerking 2007-2008 A study of the 2D SIFT algorithm Dimitri Van Cauwelaert Prof. dr. ir. W. Philips dr. ir. A. Pizurica 2 Content

More information

Simultaneous Recognition and Homography Extraction of Local Patches with a Simple Linear Classifier

Simultaneous Recognition and Homography Extraction of Local Patches with a Simple Linear Classifier Simultaneous Recognition and Homography Extraction of Local Patches with a Simple Linear Classifier Stefan Hinterstoisser 1, Selim Benhimane 1, Vincent Lepetit 2, Pascal Fua 2, Nassir Navab 1 1 Department

More information

Color Image Segmentation

Color Image Segmentation Color Image Segmentation Yining Deng, B. S. Manjunath and Hyundoo Shin* Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 93106-9560 *Samsung Electronics Inc.

More information

HISTOGRAMS OF ORIENTATIO N GRADIENTS

HISTOGRAMS OF ORIENTATIO N GRADIENTS HISTOGRAMS OF ORIENTATIO N GRADIENTS Histograms of Orientation Gradients Objective: object recognition Basic idea Local shape information often well described by the distribution of intensity gradients

More information

2D Image Processing Feature Descriptors

2D Image Processing Feature Descriptors 2D Image Processing Feature Descriptors Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Overview

More information

3D reconstruction how accurate can it be?

3D reconstruction how accurate can it be? Performance Metrics for Correspondence Problems 3D reconstruction how accurate can it be? Pierre Moulon, Foxel CVPR 2015 Workshop Boston, USA (June 11, 2015) We can capture large environments. But for

More information

MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES

MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES Mehran Yazdi and André Zaccarin CVSL, Dept. of Electrical and Computer Engineering, Laval University Ste-Foy, Québec GK 7P4, Canada

More information

ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL

ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL Maria Sagrebin, Daniel Caparròs Lorca, Daniel Stroh, Josef Pauli Fakultät für Ingenieurwissenschaften Abteilung für Informatik und Angewandte

More information

Shape Descriptors for Maximally Stable Extremal Regions

Shape Descriptors for Maximally Stable Extremal Regions Shape Descriptors for Maximally Stable Extremal Regions Per-Erik Forssén and David G. Lowe Department of Computer Science University of British Columbia {perfo,lowe}@cs.ubc.ca Abstract This paper introduces

More information

Particle Filtering. CS6240 Multimedia Analysis. Leow Wee Kheng. Department of Computer Science School of Computing National University of Singapore

Particle Filtering. CS6240 Multimedia Analysis. Leow Wee Kheng. Department of Computer Science School of Computing National University of Singapore Particle Filtering CS6240 Multimedia Analysis Leow Wee Kheng Department of Computer Science School of Computing National University of Singapore (CS6240) Particle Filtering 1 / 28 Introduction Introduction

More information

A Novel Extreme Point Selection Algorithm in SIFT

A Novel Extreme Point Selection Algorithm in SIFT A Novel Extreme Point Selection Algorithm in SIFT Ding Zuchun School of Electronic and Communication, South China University of Technolog Guangzhou, China zucding@gmail.com Abstract. This paper proposes

More information

Detecting Object Instances Without Discriminative Features

Detecting Object Instances Without Discriminative Features Detecting Object Instances Without Discriminative Features Edward Hsiao June 19, 2013 Thesis Committee: Martial Hebert, Chair Alexei Efros Takeo Kanade Andrew Zisserman, University of Oxford 1 Object Instance

More information