
SYMPOSIUM ON VIRTUAL AND AUGMENTED REALITY 2007 MINICOURSE

REAL-TIME PATTERN RECOGNITION USING THE OPENCV LIBRARY

João Paulo Lima, Thiago Farias, Eduardo Apolinário, Guilherme Moura, Daliton Silva, Veronica Teichrieb, Judith Kelner

MAY/JUNE 2007

Contents

1. Introduction
1.1. Motivation
1.2. License
1.3. Installation
1.4. Documentation
2. Functionalities
2.1. Image enhancement and feature selection
2.1.1. Smoothing
2.1.2. Edge detection
2.1.3. Corner detection
2.2. Segmentation
2.2.1. Thresholding
2.2.2. Line detection
2.2.3. Contour detection
2.2.4. Connected components labeling
2.3. Object tracking
2.3.1. Template matching
2.3.2. CamShift
2.3.3. Optical flow
2.4. Object detection
3. Example applications
3.1. Square detection
3.2. CamShift demo
3.3. Kanade Lucas tracker
3.4. Face detection
4. Final considerations
References
Appendix

1. Introduction

OpenCV is an open source software library for offline and real time image processing [1]. It can be applied in many different areas, such as human-computer interaction (HCI), object identification, segmentation and recognition, face recognition, gesture recognition, motion tracking, ego motion, motion understanding and structure from motion (SfM), and mobile robotics. Table 1 shows a more detailed description of the functions supported by OpenCV. The library has been developed by Intel since 2001, with its first release version distributed in October 2006. There are OpenCV versions targeting both the Windows and Linux operating systems.

Table 1. OpenCV main functions

Image data manipulation: allocation, release, copying, setting, conversion
Image and video I/O: file and camera based input, image/video file output
Matrix and vector manipulation and linear algebra routines: products, solvers, eigenvalues, singular value decomposition (SVD)
Various dynamic data structures: lists, queues, sets, trees, graphs
Basic image processing: filtering, edge detection, corner detection, sampling and interpolation, color conversion, morphological operations, histograms, image pyramids
Structural analysis: connected components, contour processing, distance transform, various moments, template matching, Hough transform, polygonal approximation, line fitting, ellipse fitting, Delaunay triangulation
Camera calibration: finding and tracking calibration patterns, calibration, fundamental matrix estimation, homography estimation, stereo correspondence
Motion analysis: optical flow, motion segmentation, tracking
Object recognition: eigen-methods, hidden Markov models (HMM)
Basic GUI: display image/video, keyboard and mouse handling, scroll-bars
Image labeling: line, conic, polygon, text drawing

OpenCV's source code is fully available for download and anyone is free to modify it, as long as the license clauses are followed. OpenCV is written in C/C++, and provides support for developers that use Microsoft Visual Studio, Eclipse Project and C++ Builder (when using Windows), and makefiles (when using Linux).

OpenCV can be divided in four modules: cv, cvaux, cxcore and highgui. The cv module contains the main functions and can be considered the library's heart. The cvaux module, as its name suggests, implements auxiliary functions that complement OpenCV's use. The cxcore module is responsible for data structures and linear algebra operations. Finally, the highgui module provides support for GUI functions, such as showing a window containing images captured by a webcam.

1.1. Motivation

Real time pattern recognition is an important task in applications from diverse areas, such as Augmented Reality, Robotics and HCI. OpenCV provides out of the box functionalities for building user interfaces, loading images and capturing video, for example. The OpenCV library is an open source solution for computer vision that provides many functionalities adequate for the types of application listed before. Since it has a well defined and simple programming interface, it can be easily integrated into existing projects.

A considerable problem involving pattern recognition is the heavy processing load that this type of task demands. Because of that, OpenCV's implementation contains many optimizations that result in good performance. Since OpenCV is implemented by Intel, there is a series of optimizations specially designed for the processors it produces. Intel has been making large scale investments in OpenCV development. Many Intel researchers dedicate their work to the development of new OpenCV functionalities, which are incorporated into the library every time a new version is released. As an example, the cvGoodFeaturesToTrack function, based on Carlo Tomasi's work, can be cited.

1.2. License

OpenCV's license allows modifying and freely distributing its source code and binaries, as long as some conditions are obeyed. Among these conditions, the following can be cited:
- The source code and/or binaries distributed must contain all the information listed in OpenCV's original license;
- The name of Intel or any of its partners cannot be used to promote applications that utilize the OpenCV library without previous written authorization.

1.3. Installation

OpenCV can be installed through a wizard available for download on the project page, or it can be obtained directly from its CVS repository, following the instructions provided there. In both cases the source code and some precompiled DLLs are copied. In order to recompile or modify any OpenCV module, one has to open it using one of its supported IDEs.

In sequence, OpenCV's step-by-step installation using the wizard is described. Figure 1 shows the screen that appears when the OpenCV installation starts. By clicking on the Next button, the use license is shown. In order to agree with the listed terms, the user must check the option I accept the agreement and click on the Next button again.

The following three screens of the installation process show the standard installation configuration (the place on the hard disk where OpenCV should be installed and the addition of the installation directory to the PATH environment variable). Figure 1 illustrates the first of these screens. After confirming all options, OpenCV is copied to the hard disk and is available for use.

Figure 1. OpenCV's first installation screen

In order to recompile the OpenCV source code using Microsoft Visual Studio, it is necessary to open the opencv.sln file, located inside the _make folder. This way, the entire development environment is configured for a correct code compilation. In sequence, the user must click on the Build Solution option from the Build menu, as shown in Figure 2. Each OpenCV module is compiled into two link libraries (.dll and .lib). They integrate OpenCV with projects developed by the user.

Figure 2. Compiling the OpenCV source code

1.4. Documentation

OpenCV has a complete documentation and many related forums and discussion groups, resulting in efficient support for the library users. The installation application automatically copies a series of user support documents. Among these documents, opencvman_old.pdf deserves special attention, since it is the library's reference manual. In this document a detailed description of all functions implemented by the OpenCV modules can be found, together with example code. The manual also explains general concepts related to OpenCV, such as the specific data types defined and common implementation guidelines used in image processing. An HTML documentation is also included in the OpenCV installation; it is a simpler and more frequently updated version of the library reference manual.

A discussion group about OpenCV is also available. When users register themselves (registration on the discussion group is free), they gain access to an extensive question and answer database involving OpenCV practical use situations. They also have the possibility of posting questions that could not be solved using only the existing documentation. There is a great chance that such a question will be answered soon, since OpenCV's developer community has a large number of active members.

2. Functionalities

This chapter explains a number of OpenCV functions related to several steps of the image processing pipeline. Functions related to preprocessing, segmentation, representation, and recognition are put into context and their usage is explained.

2.1. Image enhancement and feature selection

The main objective of image enhancement is to process an image so that the result is more suitable than the original image for a set of specific applications [2]. There are basically two approaches for image enhancement: spatial domain methods and frequency domain methods. The term spatial domain refers to the image plane itself, and approaches in this category are based on direct manipulation of image pixels. Frequency domain techniques are based on modifying the Fourier transform of an image. The OpenCV library handles only a subset of spatial domain techniques. Besides that, in this section, some functions related to feature selection (e.g. edges and corners) are presented.

2.1.1. Smoothing

Smoothing filters are used for blurring and noise reduction. Blurring is used in preprocessing steps, such as the removal of small details from an image. Noise reduction can be accomplished by blurring with a linear filter or by nonlinear filtering. An example of the use of smoothing is shown in Figure 3.

Figure 3. Original picture (left), a uniform kernel (center) and a Gaussian kernel (right)

In the OpenCV library, there is a function related to smoothing named cvSmooth. Since the smoothing operation is nothing more than a convolution with a specific matrix, it is also possible to use the function cvFilter2D. Both functions are explained below.

void cvSmooth( const CvArr* src, CvArr* dst, int smoothtype, int param1, int param2, double param3);

The src argument is the source image, dst is the destination image and smoothtype is the type of smoothing to be applied (all the types are summarized in Table 2).

Table 2. OpenCV smoothing types

CV_BLUR_NO_SCALE: summation over a param1 x param2 pixel neighborhood.
CV_BLUR: summation over a param1 x param2 pixel neighborhood with subsequent scaling by 1/(param1 x param2).
CV_GAUSSIAN: convolution of the image with a param1 x param2 Gaussian kernel.
CV_MEDIAN: median of the param1 x param1 neighborhood (i.e. a square neighborhood).
CV_BILATERAL: application of a bilateral 3x3 filtering with color sigma=param1 and space sigma=param2, as described in [3].

The only catch in the use of the function cvSmooth is when the Gaussian smoothtype is used, because param3 indicates the Gaussian sigma (i.e., the standard deviation). If it is zero, sigma is calculated from the kernel size according to the following formula: sigma = (n/2 - 1)*0.3 + 0.8, where n is param1 or param2, depending on the kernel orientation (horizontal or vertical).
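As a minimal illustration of the signature above, the following sketch loads a grayscale image, applies a 5x5 Gaussian smoothing and saves the result; the file names and the kernel size are placeholders chosen only for this example.

#include <cv.h>
#include <highgui.h>

int main( void )
{
    /* "input.png" is a placeholder file name; 0 = load as 8-bit grayscale */
    IplImage* src = cvLoadImage( "input.png", 0 );
    if( !src )
        return -1;

    /* destination image with the same size and depth as the source */
    IplImage* dst = cvCreateImage( cvGetSize( src ), src->depth, src->nChannels );

    /* 5x5 Gaussian kernel; param3 = 0 makes sigma be derived from the kernel size */
    cvSmooth( src, dst, CV_GAUSSIAN, 5, 5, 0 );

    cvSaveImage( "smoothed.png", dst );

    cvReleaseImage( &src );
    cvReleaseImage( &dst );
    return 0;
}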

Another function related to convolutions is cvFilter2D, which convolves an arbitrary linear kernel with an image. Its signature is:

void cvFilter2D( const CvArr* src, CvArr* dst, const CvMat* kernel, CvPoint anchor);

The src argument is the source image, dst is the destination image and kernel is the convolution kernel, a single-channel floating point matrix. The anchor indicates the relative position of a filtered point within the kernel; the anchor point must lie within the kernel. The special default value (-1,-1) means that it is at the kernel center.

2.1.2. Edge detection

Edge detection techniques are inherently easy to implement and have a low computational complexity. As an example of the application of edge detection in tracking, RAPiD [4] is often cited, and can be seen in Figure 4.

Figure 4. Some points are sampled along model edges (left), and these points are joined and used to infer pose (center). Occlusion can be treated in a robust way (right).

This subsection presents the OpenCV functions related to edge detection. The Sobel, Laplacian and Canny operators are described.

Sobel operator

The Sobel operator performs a 2D spatial gradient measurement on an image, emphasizing regions of high spatial gradient that correspond to edges. Typically it is used to find the approximate absolute gradient magnitude at each point in an input grayscale image. An example of the use of the Sobel operator is illustrated in Figure 5.

Figure 5. Grayscale image (left), Sobel x-gradient image (center) and Sobel y-gradient image (right)

In theory, the operator consists of a pair of 3x3 convolution masks, as shown in Figure 6. One mask is simply the other rotated by 90 degrees.

Figure 6. Sobel convolution masks

These masks are designed to respond maximally to edges running vertically and horizontally relative to the pixel grid, one mask for each of the two perpendicular orientations. The masks can be applied separately to the input image, producing separate measurements of the gradient component in each orientation (Gx and Gy). These can then be combined to find the absolute magnitude of the gradient at each point and the orientation of that gradient. The gradient magnitude is given by:

|G| = sqrt(Gx^2 + Gy^2)

Typically, though, an approximate magnitude is computed in a much faster way using:

|G| = |Gx| + |Gy|

In the OpenCV library, the Sobel operator is implemented via the function cvSobel, and its signature is:

void cvSobel( const CvArr* src, CvArr* dst, int xorder, int yorder, int aperture_size);

The src argument is the source image, dst is the destination image, xorder is the order of the x derivative and yorder is the order of the y derivative. The parameter aperture_size is the size of the extended Sobel kernel; it must be 1, 3, 5 or 7. In all cases except 1, an aperture_size x aperture_size kernel will be used to calculate the derivative. There is also a special value for aperture_size, CV_SCHARR, that corresponds to a 3x3 Scharr filter [5].
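A short sketch of cvSobel usage follows; the file names are placeholders. For an 8-bit input, the destination images are created here with 16-bit signed depth so that negative derivative values are not truncated, and cvConvertScaleAbs is used only to obtain a displayable 8-bit result.

#include <cv.h>
#include <highgui.h>

int main( void )
{
    /* "input.png" is a placeholder; 0 = load as grayscale */
    IplImage* gray = cvLoadImage( "input.png", 0 );
    if( !gray )
        return -1;

    /* 16-bit signed destinations avoid overflow of the 8-bit input range */
    IplImage* dx = cvCreateImage( cvGetSize( gray ), IPL_DEPTH_16S, 1 );
    IplImage* dy = cvCreateImage( cvGetSize( gray ), IPL_DEPTH_16S, 1 );

    cvSobel( gray, dx, 1, 0, 3 );   /* first derivative in x, 3x3 kernel */
    cvSobel( gray, dy, 0, 1, 3 );   /* first derivative in y, 3x3 kernel */

    /* convert back to 8 bits (absolute value) just for visualization */
    IplImage* edges = cvCreateImage( cvGetSize( gray ), IPL_DEPTH_8U, 1 );
    cvConvertScaleAbs( dx, edges, 1, 0 );
    cvSaveImage( "sobel_x.png", edges );

    cvReleaseImage( &gray );
    cvReleaseImage( &dx );
    cvReleaseImage( &dy );
    cvReleaseImage( &edges );
    return 0;
}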

Laplacian operator

The Laplacian is a 2D isotropic measure of the second spatial derivative of an image. The Laplacian of an image highlights regions of rapid intensity change and is therefore often used for edge detection.

The Laplacian is frequently applied to an image that has first been smoothed with an approximated Gaussian filter in order to reduce its sensitivity to noise; the two variants will be described together here. The operator normally takes a single graylevel image as input and produces another graylevel image as output.

The Laplacian L(x, y) of an image with pixel intensity values I(x, y) is given by:

L(x, y) = ∂²I/∂x² + ∂²I/∂y²

This can be approximated by a convolution filter. Since the input image is represented as a set of discrete pixels, it is necessary to find a discrete convolution kernel that can approximate the second derivatives in the definition of the Laplacian. Two commonly used small kernels are shown in Figure 7.

Figure 7. Two commonly used discrete approximations to the Laplacian filter

The OpenCV function that implements the Laplacian of an image is cvLaplace, and its signature is:

void cvLaplace( const CvArr* src, CvArr* dst, int aperture_size);

The src argument is the input image, dst is the resulting image, and aperture_size has the same meaning as for the Sobel operator.

Canny operator

The Canny operator was designed to be an optimal edge detector (according to particular criteria; there are other detectors around that also claim to be optimal with respect to slightly different criteria). It takes a grayscale image as input, and produces as output an image showing the positions of tracked intensity discontinuities. An example of the Canny operator is shown in Figure 8. As can be seen, the edges of the image were highlighted.

Figure 8. Raw image (left), and Canny operator resulting image (right)

The Canny operator works in a multi-stage process. First of all the image is smoothed by Gaussian convolution. Then, a simple 2D first derivative is applied to the smoothed image to highlight regions with high first spatial derivatives. Edges give rise to ridges in the gradient magnitude image. The algorithm then tracks along the top of these ridges and sets to zero all pixels that are not actually on the ridge top, so as to give a thin line in the output; this process is known as nonmaximal suppression. The tracking process exhibits hysteresis controlled by two thresholds, T1 and T2, with T1 > T2. Tracking can only begin at a point on a ridge higher than T1. Tracking then continues in both directions out from that point until the height of the ridge falls below T2. This hysteresis helps to ensure that noisy edges are not broken up into multiple edge fragments.

The Canny operator is implemented in OpenCV via the function cvCanny and its signature is as follows:

void cvCanny( const CvArr* image, CvArr* edges, double threshold1, double threshold2, int aperture_size);

The image argument is the input image, edges is the image that will store the results, threshold1 and threshold2 are the two thresholds used by the operator, and aperture_size is the aperture parameter, just as with the Sobel operator.
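The following minimal sketch applies the operator to a placeholder grayscale image; the two hysteresis thresholds and the 3x3 aperture are illustrative values, not values prescribed by the library.

#include <cv.h>
#include <highgui.h>

int main( void )
{
    /* "input.png" is a placeholder file name; 0 = load as grayscale */
    IplImage* gray = cvLoadImage( "input.png", 0 );
    if( !gray )
        return -1;

    IplImage* edges = cvCreateImage( cvGetSize( gray ), IPL_DEPTH_8U, 1 );

    /* hysteresis thresholds T2 = 50 and T1 = 150, 3x3 Sobel aperture */
    cvCanny( gray, edges, 50, 150, 3 );

    cvSaveImage( "canny.png", edges );

    cvReleaseImage( &gray );
    cvReleaseImage( &edges );
    return 0;
}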

2.1.3. Corner detection

One of the computer vision approaches available in the literature to extract certain kinds of features from an image is corner detection or, more generally, interest point detection. Figure 9 shows an interest point based markerless augmented reality application [6]. Using some interest points, this algorithm is able to track the position and orientation of an object, in this case a computer. As can be noted in Figure 9 (top left), most of the interest points are localized around corners; Figure 9 (bottom left) and (bottom right) show that the algorithm is robust to rotation and translation, since the board position is tracked correctly.

Figure 9. The wireframe is matched with the computer (top left), eight points are tracked (top right), and a board inside the computer is tracked (bottom left and bottom right)

Some uses of feature detection include motion detection, tracking, image mosaicking, panorama stitching, 3D modeling, and object recognition. In the OpenCV library, there are three ways of detecting interest points: (1) directly calculating eigenvectors and eigenvalues, (2) using the Harris operator and (3) using the Good Features to Track function. Alternatives (1) and (2) are concerned with detecting corners in the image; although (3) is based on (2), the class of features it is able to handle is larger than just corners. These alternatives are described in sequence.

Eigenvectors and eigenvalues

The majority of feature detection algorithms are based on the calculation of eigenvalues and eigenvectors, and this operation is implemented in OpenCV via two functions, cvCornerEigenValsAndVecs and cvCornerMinEigenVal. Both functions are related to the covariation matrix of derivatives around a certain pixel. The covariation matrix is of the form:

C = | (dI/dx)^2        (dI/dx)(dI/dy) |
    | (dI/dx)(dI/dy)   (dI/dy)^2      |

where I(x, y) is the intensity of the pixel at coordinates (x, y). The function cvCornerEigenValsAndVecs calculates both eigenvalues and eigenvectors of the covariation matrix, while cvCornerMinEigenVal just calculates and stores its eigenvalues.

The signature of cvCornerEigenValsAndVecs is:

void cvCornerEigenValsAndVecs( const CvArr* image, CvArr* eigenvv, int block_size, int aperture_size);

The image argument is the input image, eigenvv is an image that stores the results and must be 6 times wider than image, block_size is the neighborhood size, and aperture_size is the aperture parameter, just as with the Sobel operator.

The function signature of cvCornerMinEigenVal is:

void cvCornerMinEigenVal( const CvArr* image, CvArr* eigenval, int block_size, int aperture_size);

The parameters are the same as for the cvCornerEigenValsAndVecs function, except for eigenval: this function just calculates and stores the minimal eigenvalue associated with each pixel, therefore eigenval has the same size as image, whereas the output of cvCornerEigenValsAndVecs is 6 times wider because it stores both eigenvalues and both eigenvectors.

Harris corner detector

The Harris corner detector computes the locally averaged moment matrix calculated from the image gradients, and then combines the eigenvalues of the moment matrix to compute a corner strength; the maximum values of this result indicate the corner positions. It is also based on the covariation matrix. The key for unlocking the power of this matrix lies in its eigenvalues: when the matrix has two large eigenvalues, this corresponds to two separate principal directions in the underlying image gradient. The corner response is calculated quickly and efficiently using a simple equation and stored in each image pixel:

R = det(C) - k*(trace(C))^2

where k is a tunable parameter which determines how edge-phobic the response of the algorithm is. The result obtained by applying the algorithm to a picture can be seen in Figure 10. In Figure 10 (right) it is possible to see some white regions; these are local maxima that indicate corners in those regions. It should be noted that the Harris corner detector is also able to find edges, as can be seen in the dark gray regions of Figure 10.

Figure 10. Grayscale image (left), and the result of applying the Harris corner detector to the image (right)

The function signature is shown below:

void cvCornerHarris( const CvArr* image, CvArr* harris_response, int block_size, int aperture_size, double k);

The image argument is the input image and harris_response is the image that will store the Harris detector result. The parameter block_size is the neighborhood size, as discussed for eigenvectors and eigenvalues. The aperture_size is the same as for the Sobel operator. Finally, k is the tunable parameter mentioned above.

Good Features to Track

Shi and Tomasi's work on feature tracking is implemented in OpenCV via the function cvGoodFeaturesToTrack. It is strongly based on the Harris corner detection operator. The basic idea of the algorithm is to monitor the quality of image features during tracking by using a measure of feature dissimilarity that quantifies the change of appearance of a feature between the first and the current frame. In Figure 11, the features found in the left image are enhanced and shown in the right one.

Figure 11. The first frame of the sequence (left), and the features selected according to the established criterion (right)

There are some concerns about this function. One of them is the need for two temporary floating-point 32-bit images of the same size as the input image. Therefore, in applications where memory is a scarce resource, some problems may appear when using this function.

The function signature is shown below:

void cvGoodFeaturesToTrack( const CvArr* image, CvArr* eig_image, CvArr* temp_image, CvPoint2D32f* corners, int* corner_count, double quality_level, double min_distance, const CvArr* mask, int block_size, int use_harris, double k);

The image argument is the source single-channel image, and eig_image and temp_image are temporary floating-point 32-bit images of the same size as image. The parameter corners is a previously allocated structure to store the detected corners, and corner_count is the number of detected corners. The parameter quality_level is a number indicating the minimal accepted quality of image corners, and min_distance is the minimal Euclidean distance between corners. The parameter mask is the region of interest; if NULL is passed, the whole image is used. The parameter block_size is the same as for the Harris corner detector function. It is possible to use the Harris corner detector by setting use_harris to any value different from zero. At last, k is the free parameter of the Harris operator.

2.2. Segmentation

The task of analyzing an image in order to distinguish specific elements is called segmentation. The following subsections explain some techniques used for accomplishing this goal.

2.2.1. Thresholding

This technique consists in separating the objects of interest from the image background. It is often done on grayscale images, but can also be applied to other formats, such as color images. A common practice when thresholding color images is to consider the sum of the color components of each pixel. Thresholding is done by estimating level ranges that determine the pixels that belong to the image background and to the objects of interest. As can be seen in Figure 12 (left), when there is only one object to segment in an image f(x, y), a threshold level T is specified and two level ranges are defined: f(x, y) > T and f(x, y) ≤ T. This operation is named single-level thresholding and the result is a binary image that distinguishes the pixels that belong to each level range. In multilevel thresholding, there are n objects to segment, requiring the use of different threshold levels T1, ..., Tn, as shown in Figure 12 (right). The result of this operation is a grayscale image with n distinct levels.

Figure 12. Single-level (left) and multi-level (right) thresholding

Thresholding can also be classified as global or local. In global thresholding, the same threshold levels Tk are used for all pixels in the image. In local thresholding, the threshold levels Tk depend on local properties of the pixels, such as the average level of their neighborhood. Global thresholding is more adequate for images where the objects have a constant illumination. Figure 13 shows an example where an object with constant illumination is properly segmented using global thresholding, while the results obtained with local thresholding are not satisfactory. Local thresholding is more indicated for cases where the objects' illumination is variable, since it takes into account local image features. In Figure 14, local thresholding segments more pixels from the object than global thresholding, due to the variable nature of the objects' illumination.

Figure 13. Source image where the object has a constant illumination (left), global thresholding result (center) and local thresholding result (right)

Figure 14. Source image where objects have a variable illumination (left), global thresholding result (center) and local thresholding result (right)

Many image processing algorithms used in real time applications do not handle color images. As a result, the original image has to be converted to a more suitable format, such as binary. This is where thresholding takes place.

An example of the use of thresholding in real time pattern recognition is present in the marker tracking performed by the ARToolKit augmented reality library [7]. The color images captured by a camera are analyzed in order to segment the dark pixels relative to the markers from the background. A global single-level thresholding is applied to the source image, with the threshold level T being specified by the user. A common value for T is 150. Considering a color pixel (rp, gp, bp) from the source image, if (rp + gp + bp)/3 < T, then the pixel is classified as a marker pixel, else it is classified as a background pixel. Figure 15 shows the results obtained by ARToolKit when thresholding an input frame.

Figure 15. Global thresholding used in ARToolKit: source color image (left) and global thresholding result (right)

OpenCV implements single-level thresholding and offers both a global and a local thresholding function for grayscale images. Global thresholding is implemented by the function cvThreshold, which has the following signature:

void cvThreshold(const CvArr* src, CvArr* dst, double threshold, double max_value, int threshold_type);

The src argument is the grayscale image to be thresholded. The dst argument is where the resulting binary image will be stored. The global threshold value T is specified by the threshold argument. The max_value argument is the level that will be used to distinguish the object pixels from the background pixels. The threshold_type argument determines which function will be used in the thresholding operation. The five available thresholding types and their respective functions are presented in Table 3, and their visual representation is illustrated in Figure 16.

Table 3. OpenCV thresholding types and their functions

CV_THRESH_BINARY: dst(x,y) = max_value if src(x,y) > threshold, 0 otherwise
CV_THRESH_BINARY_INV: dst(x,y) = 0 if src(x,y) > threshold, max_value otherwise
CV_THRESH_TRUNC: dst(x,y) = threshold if src(x,y) > threshold, src(x,y) otherwise
CV_THRESH_TOZERO: dst(x,y) = src(x,y) if src(x,y) > threshold, 0 otherwise
CV_THRESH_TOZERO_INV: dst(x,y) = 0 if src(x,y) > threshold, src(x,y) otherwise

Figure 16. OpenCV thresholding types visual representation
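As an illustration of global thresholding, the sketch below binarizes a placeholder grayscale image with T = 150, the typical ARToolKit value mentioned above; pixels above the threshold receive max_value = 255.

#include <cv.h>
#include <highgui.h>

int main( void )
{
    /* "input.png" is a placeholder; 0 = load as 8-bit grayscale */
    IplImage* gray = cvLoadImage( "input.png", 0 );
    if( !gray )
        return -1;

    IplImage* bin = cvCreateImage( cvGetSize( gray ), IPL_DEPTH_8U, 1 );

    /* global single-level thresholding: pixels above 150 become 255, the rest become 0 */
    cvThreshold( gray, bin, 150, 255, CV_THRESH_BINARY );

    cvSaveImage( "binary.png", bin );

    cvReleaseImage( &gray );
    cvReleaseImage( &bin );
    return 0;
}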

Local thresholding is implemented by the function cvAdaptiveThreshold, which has the following signature:

void cvAdaptiveThreshold(const CvArr* src, CvArr* dst, double max_value, int adaptive_method, int threshold_type, int block_size, double param1);

The src, dst, max_value and threshold_type arguments are the same as for cvThreshold. The only local thresholding types available are CV_THRESH_BINARY and CV_THRESH_BINARY_INV. The threshold level T is computed for each pixel of the input image. The adaptive_method argument determines how T is calculated. If it is CV_ADAPTIVE_THRESH_MEAN_C, then T is the mean of the block_size x block_size pixel neighborhood, subtracted by param1. If it is CV_ADAPTIVE_THRESH_GAUSSIAN_C, then T is a Gaussian weighted sum of the block_size x block_size pixel neighborhood, subtracted by param1. This means that the nearest pixels will have a bigger influence on the result than the farthest ones.

2.2.2. Line detection

In order to detect straight lines in an image, the first step is to perform an edge enhancement operation on the input image. Then, a threshold is applied to the result. The binary output of this operation can be used for the detection of lines. Some real time markerless 3D tracking techniques rely on the detection of line segments in the image [8]. A wireframe model of the object to be tracked is projected onto the image using an estimated camera pose. The projection and the image lines are compared in order to calculate the current camera pose. Figure 17 illustrates the process.

Figure 17. Detected lines on the image (left) and estimated pose of the car (right)

The operator used for performing line detection is the Hough transform. The idea behind the Hough transform is to decrease the computational complexity of line detection by using a line representation in the parameter space rather than in the xy plane. Considering a point (xi, yi) in the xy plane, there is an infinite number of lines passing through it. These lines have the form yi = a*xi + b. Rearranging the equation in terms of the parameters a and b, the resulting equation is b = -xi*a + yi. This is equivalent to a line in the parameter space. Considering another point (xj, yj), the set of lines that passes through it is represented in the parameter space by b = -xj*a + yj.

The intersection of these two lines in the parameter space, at a point (a', b'), determines the parameters of the line that passes through both points (xi, yi) and (xj, yj) in the xy plane. Figure 18 shows graphics that illustrate this idea.

Figure 18. Line representation on the xy plane (left) and on the parameter space (right)

In the representation used by the algorithm, each axis of the parameter space is subdivided into ranges of equal size, as shown in Figure 19, generating cells named accumulators. Each accumulator (a, b) in the parameter space represents a line in the xy plane.

Figure 19. Accumulators in the parameter space

The first step of the algorithm is to set all accumulators to zero. Then, the input image is scanned for edge points. For each edge point (xi, yi), the parameters (a, b) of all the lines that pass through it are evaluated by using all the values allowed by the subdivision in the equation b = -xi*a + yi. The corresponding accumulator for each parameter pair is incremented. After all edge points are treated, the values in the accumulators will be the number of edge points contained in the corresponding line. A threshold can then be applied to select the lines with a higher number of points.

The use of the equation b = -x*a + y for representing lines leads to a problem when detecting vertical lines, since a tends to infinity. Due to this, the polar representation is preferred, which has the form ρ = x cosθ + y sinθ, where ρ is the distance between the line and the origin and θ is the angle between the x-axis and the line normal.

The execution time of the Hough transform can be reduced without significant loss in the quality of the results by using a probabilistic approach [9]. Instead of utilizing all edge points, only a random fraction of these points is considered.

Line detection is performed by the cvHoughLines2 OpenCV function:

CvSeq* cvHoughLines2( CvArr* image, void* line_storage, int method, double rho, double theta, int threshold, double param1, double param2 );

The image argument is the binary image from which the lines will be retrieved. The line_storage argument is a container for the detected lines data. The Hough transform variant to be used is specified in the method argument. Table 4 presents the available Hough transform methods. The rho and theta arguments are the distance and angle resolution, respectively. The threshold argument determines the minimum number of points needed by a line. The param1 and param2 arguments are only used if method is CV_HOUGH_PROBABILISTIC or CV_HOUGH_MULTI_SCALE. In the probabilistic Hough transform, param1 is the minimum line length and param2 is the maximum distance between segments lying on the same line that does not cause their joining. In the multi-scale Hough transform, param1 is the divisor for rho and param2 is the divisor for theta.

Table 4. OpenCV Hough transform methods

CV_HOUGH_STANDARD: classical Hough transform. Retrieves all lines detected. Each line is represented by the distance to the origin (ρ) and the angle between the x-axis and the line normal (θ).
CV_HOUGH_PROBABILISTIC: probabilistic Hough transform. Retrieves all line segments detected. Each segment is represented by its starting and ending points.
CV_HOUGH_MULTI_SCALE: multi-scale variant of the classical Hough transform. Retrieves all lines detected in the same way as CV_HOUGH_STANDARD.
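A brief sketch of the probabilistic variant is given below; the file name and numeric parameters are illustrative. Here line_storage is passed as a CvMemStorage, and each returned element of the sequence is a pair of CvPoint structures holding the segment endpoints.

#include <cv.h>
#include <highgui.h>
#include <stdio.h>

int main( void )
{
    /* "input.png" is a placeholder; 0 = load as grayscale */
    IplImage* gray = cvLoadImage( "input.png", 0 );
    if( !gray )
        return -1;

    IplImage* edges = cvCreateImage( cvGetSize( gray ), IPL_DEPTH_8U, 1 );
    cvCanny( gray, edges, 50, 150, 3 );

    CvMemStorage* storage = cvCreateMemStorage( 0 );

    /* probabilistic Hough: 1 pixel and 1 degree resolution, at least 50 votes,
       minimum segment length 30, maximum gap 10 (all illustrative values) */
    CvSeq* lines = cvHoughLines2( edges, storage, CV_HOUGH_PROBABILISTIC,
                                  1, CV_PI / 180, 50, 30, 10 );

    int i;
    for( i = 0; i < lines->total; i++ )
    {
        /* each element is an array of two CvPoint: the segment endpoints */
        CvPoint* pts = (CvPoint*)cvGetSeqElem( lines, i );
        printf( "segment %d: (%d,%d) -> (%d,%d)\n",
                i, pts[0].x, pts[0].y, pts[1].x, pts[1].y );
    }

    cvReleaseMemStorage( &storage );
    cvReleaseImage( &gray );
    cvReleaseImage( &edges );
    return 0;
}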

2.2.3. Contour detection

After applying an image enhancement operator for edge detection and thresholding the resulting image, the contours can be extracted from the image. In order to perform this task, two steps need to be covered: contour tracing and contour representation. In contour tracing, the existing contours are followed in the image. In contour representation, the contours are described in a meaningful way. Contour detection is widely used in real time pattern recognition solutions. In ARToolKit, the contours relative to the markers present in the input frame need to be segmented in order to enable recognition and pose estimation [7]. Figure 20 illustrates the operation.

Figure 20. Contour detection used in ARToolKit: source color image (left), enhanced edges (center) and detected contours highlighted on the image (right)

OpenCV uses the Suzuki algorithm to perform contour tracing [10]. In this algorithm, at first, the upper left contour pixel is found. Then, the neighborhood of the first pixel is checked in clockwise direction to find the next pixel of the contour. From then on, the search for the other pixels of the contour is done in anti-clockwise direction and ends when the first two pixels of the contour are found again.

OpenCV performs contour representation using two different description types: chain codes and polygonal representation. Chain codes consist in a sequence of numbers that determine in which neighborhood of a contour pixel the next contour pixel is found. Figure 21 shows the codes for each neighborhood direction (left) and an example of a contour represented by a chain code (right). It can be noted that choosing a different starting point for the contour can give different chain code representations. This can be avoided by shifting the numbers of the chain code in a way that results in the integer of minimum magnitude.

Figure 21. Chain codes for each neighborhood direction (left) and a chain code representation of a contour (right)

The polygonal representation is a sequence of vertices that, once linked together, symbolizes the contour essence. Figure 22 exemplifies this type of representation, in which there is a tradeoff between contour fidelity and codification overhead.

Figure 22. Polygonal representation

OpenCV provides several ways to organize the contours retrieved from an image. For example, the contours can be stored in a tree, where a contour C1 is a parent of a contour C2 if and only if C1 contains C2. Figure 23 shows how nested contours can be described by a tree.

Figure 23. OpenCV hierarchical representation of contours

The function used to extract contours from a binary image is called cvFindContours and is defined as follows:

int cvFindContours( CvArr* image, CvMemStorage* storage, CvSeq** first_contour, int header_size, int mode, int method, CvPoint offset);

The image argument is the binary image from which the contours will be retrieved. The storage argument is a container for contour data. The first_contour argument is where a pointer to the first detected contour will be available after the function call. The header_size argument is the size of the contour structure. The mode argument determines how the contours should be organized. Table 5 presents the available contour retrieval modes and their description. The method argument specifies the contour representation to be used. Table 6 describes the available representation methods. The offset argument is used to shift the retrieved points by an explicit amount of pixels. The function returns the number of contours found in the image.

Table 5. OpenCV contour retrieval modes

CV_RETR_EXTERNAL: retrieve only the extreme outer contours.
CV_RETR_LIST: retrieve all the contours in a list.
CV_RETR_CCOMP: retrieve the connected components of the image (see topic 2.2.4).
CV_RETR_TREE: retrieve all the contours hierarchically in a tree.

Table 6. OpenCV contour representation methods

CV_CHAIN_CODE: chain code representation.
CV_CHAIN_APPROX_NONE: polygon representation where all contour pixels are returned as vertices.
CV_CHAIN_APPROX_SIMPLE: polygon representation where only the ending points of horizontal, vertical or diagonal segments of the contour are returned as vertices.
CV_CHAIN_APPROX_TC89_L1 and CV_CHAIN_APPROX_TC89_KCOS: two variants of the Teh-Chin polygon representation [11]. Only the contour pixels with high curvature are returned as vertices.
CV_LINK_RUNS: polygon representation where only the ending points of horizontal segments of the contour are returned as vertices.

After extracting the contours of an image, they can be handled in order to get information such as area and polygonal approximation. The contour area is calculated using the cvContourArea function, described next:

double cvContourArea( const CvArr* contour, CvSlice slice);

The area of the contour specified in the contour argument is returned by the function. If only a section of interest of the contour has to be considered in the calculation, it can be determined using the slice argument.

The polygonal approximation of a contour is obtained using the cvApproxPoly function:

CvSeq* cvApproxPoly(const void* src_seq, int header_size, CvMemStorage* storage, int method, double parameter, int parameter2);

The src_seq argument is the contour to be approximated. The header_size and storage arguments have the same purpose as the ones from cvFindContours.

The only currently acceptable value for the method argument is CV_POLY_APPROX_DP, which corresponds to the Douglas-Peucker polygon approximation algorithm [12]. The parameter argument determines the tolerance value ε to be used by the algorithm. The parameter2 argument specifies whether the hierarchical organization should be respected or whether the contour is closed or not.

The Douglas-Peucker algorithm is based on the distance between a vertex and an edge segment, and on a tolerance ε. To start the algorithm, two extreme points of the polygon are connected. This connection defines the first edge to be used. Then, the distance between each remaining vertex and this edge is tested. If there are distances bigger than ε, then the vertex with the biggest distance from the edge is added to the simplification. This process continues recursively for each edge of the current step until all distances between the vertices of the original polyline and the simplification are within the tolerance distance. Figure 24 illustrates the process.

Figure 24. Douglas-Peucker polygon approximation algorithm
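The sketch below combines cvFindContours and cvApproxPoly on a placeholder binary image; the tolerance of 2% of each contour's perimeter is an arbitrary choice for illustration, and the loop only walks the top level of the contour tree.

#include <cv.h>
#include <highgui.h>
#include <stdio.h>
#include <math.h>

int main( void )
{
    /* "binary.png" is a placeholder for an already thresholded image; 0 = grayscale */
    IplImage* bin = cvLoadImage( "binary.png", 0 );
    if( !bin )
        return -1;

    CvMemStorage* storage = cvCreateMemStorage( 0 );
    CvSeq* contours = NULL;

    /* note: cvFindContours modifies the input image */
    int n = cvFindContours( bin, storage, &contours, sizeof(CvContour),
                            CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, cvPoint( 0, 0 ) );
    printf( "%d contours found\n", n );

    CvSeq* c;
    for( c = contours; c != NULL; c = c->h_next )  /* walk the top level only */
    {
        /* Douglas-Peucker approximation with epsilon = 2% of the contour perimeter */
        CvSeq* poly = cvApproxPoly( c, sizeof(CvContour), storage,
                                    CV_POLY_APPROX_DP,
                                    cvContourPerimeter( c ) * 0.02, 0 );
        printf( "contour with %d points approximated by %d vertices, area %.1f\n",
                c->total, poly->total, fabs( cvContourArea( poly, CV_WHOLE_SEQ ) ) );
    }

    cvReleaseMemStorage( &storage );
    cvReleaseImage( &bin );
    return 0;
}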

2.2.4. Connected components labeling

Labeling of a binary image refers to the act of assigning a unique value to pixels belonging to the same connected region. After thresholding the input image, neighboring pixels that belong to objects of interest are associated with a label. Knowledge of the connected components of an image is very useful for automated recognition processes, and this operation is often used in real time pattern recognition applications. Labeling is present in the image processing pipeline of the ARToolKit library [7]. The connected components of the thresholded image are labeled in order to identify the regions relative to the marker borders and inner template, as can be seen in Figure 25.

Figure 25. Connected components labeling used in ARToolKit: thresholded image (left) and labeling results (right)

The principle of the labeling algorithm consists in scanning the binary image from the upper left pixel looking for object pixels. For each object pixel, its neighboring pixels that have already been scanned are examined. There are three options for the labeling of the pixel:
- If there are no object pixels among the examined pixels, a new label is created and assigned to the current pixel;
- If there is one and only one object pixel among the examined pixels, the label of this pixel is assigned to the current pixel;
- If there is more than one object pixel among the examined pixels, one of the labels is assigned to the current pixel and an equivalence between the different labels is created.

When all the object pixels are labeled, the equivalent labels are grouped in equivalence classes, each one with a unique label. The image is then scanned once again to solve the equivalences and define the connected components.

In OpenCV, labeling is performed using the cvFindContours function with CV_RETR_CCOMP set as the mode argument. The connected components are organized in a hierarchical manner. The external contours of the components are put on the first level of the tree. The contours of the holes present in a component are put on the second level.

2.3. Object tracking

The OpenCV library comes bundled with some object tracking functions. This tutorial covers three techniques which may be implemented using these functions. Template matching uses a simple template to scan the image, storing the results. CamShift is an adaptive algorithm which tries to find an object based on its histogram back-projection. Optical flow is a technique that uses the flow of each pixel to track the object. Other combinations can be performed in order to achieve different results.

2.3.1. Template matching

Template matching is one of the simplest ways of finding an object within a picture. The goal of this technique is to scan all the pixels of the image and find every instance of a specific object described by a template. An application of template matching for real time pattern recognition can be found in some markerless augmented reality systems, such as the one shown in Figure 26 and described in [13].

The difference between a region of the image and a reference template is minimized. After that, the parameters of a function that warps the template into the target image are calculated, so that tracking can be done.

Figure 26. 3D tracking with template matching - green lines mark the template image (left) and augmented scene (right)

OpenCV provides such a feature through the cvMatchTemplate function:

void cvMatchTemplate( const CvArr* image, const CvArr* templ, CvArr* result, int method );

All the comparison results are stored in the result variable, which can be interpreted like an image. The image argument stands for the original image, and templ is the object which is going to be detected. The appearance of the result image is related to the template matching: the closer the matched template, the higher the stored pixel value, which will be seen as a bright pixel. The method argument tells the function which method to use in the comparison; the methods supported by OpenCV for comparing the template and the image are detailed in Table 7.

Table 7. Template matching methods

CV_TM_SQDIFF: the squared difference between the template and the image.
CV_TM_SQDIFF_NORMED: the normalized squared difference.
CV_TM_CCORR: the cross correlation between template and image.
CV_TM_CCORR_NORMED: the normalized cross correlation.
CV_TM_CCOEFF: the correlation coefficient.
CV_TM_CCOEFF_NORMED: the normalized correlation coefficient.
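A minimal sketch of the function is shown below; both file names are placeholders. cvMinMaxLoc is used to locate the best match, which for the correlation-based methods is the global maximum of the result map (for the squared difference methods it would be the minimum).

#include <cv.h>
#include <highgui.h>
#include <stdio.h>

int main( void )
{
    /* "image.png" and "template.png" are placeholder file names */
    IplImage* img   = cvLoadImage( "image.png", 1 );
    IplImage* templ = cvLoadImage( "template.png", 1 );
    if( !img || !templ )
        return -1;

    /* the result map is (W - w + 1) x (H - h + 1), single-channel 32-bit float */
    CvSize rsize = cvSize( img->width - templ->width + 1,
                           img->height - templ->height + 1 );
    IplImage* result = cvCreateImage( rsize, IPL_DEPTH_32F, 1 );

    cvMatchTemplate( img, templ, result, CV_TM_CCOEFF_NORMED );

    /* for the correlation-based methods the best match is the global maximum */
    double min_val, max_val;
    CvPoint min_loc, max_loc;
    cvMinMaxLoc( result, &min_val, &max_val, &min_loc, &max_loc, NULL );
    printf( "best match at (%d, %d), score %.3f\n", max_loc.x, max_loc.y, max_val );

    cvReleaseImage( &img );
    cvReleaseImage( &templ );
    cvReleaseImage( &result );
    return 0;
}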

Figure 27 shows an example of a template used to match against the input image illustrated in Figure 28 (top). Figure 28 (bottom) shows the results obtained by matching this template with the input image.

Figure 27. Template used by the template matching technique

Figure 28. Example of template matching usage: original image (top) and template matching result (bottom)

2.3.2. CamShift

CamShift stands for Continuously Adaptive Mean Shift and is an iterative algorithm [14]. It uses the mean shift algorithm, which iterates to find the object center given its 2D color probability distribution image. Then, the algorithm calculates the object's size and orientation. The algorithm's workflow is shown in Figure 29, and its steps are described in sequence.

Figure 29. CamShift workflow

The CamShift algorithm is described step by step next:
1. Set the calculation region of the probability distribution to the whole image.
2. Choose the initial location of the 2D mean shift search window.
3. Calculate the color probability distribution in the 2D region centered at the search window location, in a region of interest (ROI) slightly larger than the mean shift window size.
4. Run the mean shift algorithm [14] to find the search window center. Store the moment (area or size) and center location.
5. For the next video frame, center the search window at the mean location stored in Step 4 and set the window size to a function of the moment found there. Go to Step 3.

The CamShift algorithm is implemented in OpenCV by the cvCamShift function:

int cvCamShift( const CvArr* prob_image, CvRect window, CvTermCriteria criteria, CvConnectedComp* comp, CvBox2D* box);

The prob_image parameter is the back projection of the object histogram, which can be calculated by the cvCalcBackProject function, as shown in Figure 30. The window parameter is the initial search window which the algorithm will use in the iterations. The criteria parameter is used to tell the algorithm when it should stop searching for the next window: the user can specify whether the criterion is the number of iterations or an epsilon threshold for the windows similarity. The comp parameter stands for a connected component which will contain information about the converged search window; the window coordinates can be retrieved from the comp->rect field and the sum of all pixels inside the window from the comp->area field. The box parameter contains the object size and orientation at the end of the algorithm. Figure 31 shows an example of face tracking using CamShift.
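The sketch below condenses one tracking iteration, under the assumption that the caller has already extracted the hue plane of the current frame and computed the object's hue histogram (for instance with cvCvtColor, cvSplit and cvCalcHist); the termination criteria values are illustrative.

#include <cv.h>

/* one CamShift iteration: "hue" is the hue plane of the current frame,
   "hist" is the object's hue histogram and "window" is the previous search window
   (all assumed to be prepared by the caller) */
CvBox2D track_object( IplImage* hue, CvHistogram* hist, CvRect* window )
{
    IplImage* backproject = cvCreateImage( cvGetSize( hue ), IPL_DEPTH_8U, 1 );

    /* probability image: how well each pixel's hue matches the object histogram */
    cvCalcBackProject( &hue, backproject, hist );

    CvConnectedComp comp;
    CvBox2D box;

    /* stop after 10 iterations or when the window moves by less than 1 pixel */
    cvCamShift( backproject, *window,
                cvTermCriteria( CV_TERMCRIT_EPS | CV_TERMCRIT_ITER, 10, 1 ),
                &comp, &box );

    *window = comp.rect;   /* reuse the converged window in the next frame */
    cvReleaseImage( &backproject );
    return box;            /* oriented box with the object size and orientation */
}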

Figure 30. Histogram back projection example

Figure 31. CamShift face tracking example

2.3.3. Optical flow

Optical flow is a technique for measuring the velocity of image pixels by comparing them with a previous frame. The displacement of these pixels over time can be used to estimate camera movement in applications like the one shown in Figure 32, which performs 3D face tracking [15].

Figure 32. Optical flow used for 3D face tracking

There are several algorithms to implement optical flow, and OpenCV implements four of them, which are explained next.

Lucas & Kanade technique

The optical flow task is reduced to a linear system by applying the optical flow equation to a group of adjacent pixels and assuming that all of them have the same velocity [16].

This technique is fast enough to be used in real time, because it does not process the whole image. The OpenCV function for this technique is cvCalcOpticalFlowLK, which is used by a sample described in Chapter 3.

void cvCalcOpticalFlowLK( const CvArr* prev, const CvArr* curr, CvSize win_size, CvArr* velx, CvArr* vely );

The prev and curr arguments correspond to the two temporally adjacent frames used to calculate the velocity. The win_size parameter is the size of the averaging window used for grouping the pixels. The velx and vely parameters are, respectively, the horizontal and vertical components of the optical flow for every pixel. They have the same size as the input images.

OpenCV also implements a pyramidal approach of the Lucas & Kanade algorithm [17]. The overall pyramidal tracking algorithm works as follows: first, the optical flow is computed at the deepest pyramid level Lm. Then, the result of that computation is propagated to the upper level in the form of an initial guess for the pixel displacement. Given that initial guess, the refined optical flow is computed at that level, and so on up to level 0 (the original image). The OpenCV function for this technique is called cvCalcOpticalFlowPyrLK.

void cvCalcOpticalFlowPyrLK( const CvArr* prev, const CvArr* curr, CvArr* prev_pyr, CvArr* curr_pyr, const CvPoint2D32f* prev_features, CvPoint2D32f* curr_features, int count, CvSize win_size, int level, char* status, float* track_error, CvTermCriteria criteria, int flags );

The prev and curr arguments are the two frames used for the optical flow calculation. The prev_pyr and curr_pyr arguments are buffers used by the algorithm to store the pyramid images. The prev_features and curr_features arguments are the arrays of points that will be tracked: the first one contains the points, and the second one will store the newly found positions. The count argument is the number of points to be tracked. The win_size parameter stands for the size of the search window at each pyramid level, and level stands for the maximum pyramid level. The status parameter is an array which contains 1 if the corresponding feature has been successfully tracked and the flow calculated, and 0 otherwise. The track_error argument is an optional parameter which contains the differences between patches around the original and moved points. The criteria parameter is used to tell the algorithm when it should stop searching for the next window. The flags parameter can be set to save some processing time, since it is possible to assume that one pyramid buffer is already calculated. The available flags are listed in Table 8.

Table 8. Pyramidal optical flow flags

CV_LKFLOW_PYR_A_READY: the pyramid for the first frame is precalculated before the call.
CV_LKFLOW_PYR_B_READY: the pyramid for the second frame is precalculated before the call.
CV_LKFLOW_INITIAL_GUESSES: the array curr_features contains initial coordinates of the features before the function call.
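The following sketch combines cvGoodFeaturesToTrack (Subsection 2.1.3) with the pyramidal tracker, assuming two consecutive 8-bit grayscale frames are available; the number of features, window size, pyramid depth and termination criteria are illustrative values.

#include <cv.h>
#include <stdio.h>

#define MAX_FEATURES 100

/* tracks corners detected in "prev" into "curr"; both are 8-bit grayscale frames */
void track_features( IplImage* prev, IplImage* curr )
{
    CvSize size = cvGetSize( prev );

    /* temporary buffers required by cvGoodFeaturesToTrack */
    IplImage* eig  = cvCreateImage( size, IPL_DEPTH_32F, 1 );
    IplImage* temp = cvCreateImage( size, IPL_DEPTH_32F, 1 );

    CvPoint2D32f prev_features[MAX_FEATURES], curr_features[MAX_FEATURES];
    int count = MAX_FEATURES;

    /* select up to MAX_FEATURES strong corners, at least 10 pixels apart */
    cvGoodFeaturesToTrack( prev, eig, temp, prev_features, &count,
                           0.01, 10, NULL, 3, 0, 0.04 );

    /* pyramid buffers and per-feature tracking status */
    IplImage* prev_pyr = cvCreateImage( size, IPL_DEPTH_8U, 1 );
    IplImage* curr_pyr = cvCreateImage( size, IPL_DEPTH_8U, 1 );
    char status[MAX_FEATURES];

    cvCalcOpticalFlowPyrLK( prev, curr, prev_pyr, curr_pyr,
                            prev_features, curr_features, count,
                            cvSize( 10, 10 ), 3, status, NULL,
                            cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 20, 0.03 ),
                            0 );

    int i;
    for( i = 0; i < count; i++ )
        if( status[i] )
            printf( "feature %d moved to (%.1f, %.1f)\n",
                    i, curr_features[i].x, curr_features[i].y );

    cvReleaseImage( &eig );
    cvReleaseImage( &temp );
    cvReleaseImage( &prev_pyr );
    cvReleaseImage( &curr_pyr );
}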

Horn & Schunck technique

This is a function for finding the optical flow pattern which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image [18]. The Horn & Schunck optical flow technique is implemented by OpenCV through the cvCalcOpticalFlowHS function:

void cvCalcOpticalFlowHS( const CvArr* prev, const CvArr* curr, int use_previous, CvArr* velx, CvArr* vely, double lambda, CvTermCriteria criteria );

The use_previous parameter denotes whether or not to use the previously calculated velocity field. The lambda parameter is called the Lagrangian multiplier. It must be smaller for a noisy image and larger for a clean and accurate image. The prev, curr, velx, vely and criteria arguments are the same as explained in the Lucas & Kanade technique topic.

Block matching technique

This technique does not use an optical flow equation directly. Instead, it works more like a pattern matching technique. It takes a block inside the first image and tries to find another block of the same size in the second image that is similar to the first one. This algorithm gives an approximate result compared to the other techniques. It is implemented by the cvCalcOpticalFlowBM function:

void cvCalcOpticalFlowBM( const CvArr* prev, const CvArr* curr, CvSize block_size, CvSize shift_size, CvSize max_range, int use_previous, CvArr* velx, CvArr* vely );

The block_size and shift_size parameters denote the characteristics of the block used. The max_range parameter is the size of the neighborhood of pixels around the block that is scanned. The prev, curr, use_previous, velx and vely arguments are the same as explained in the previous topics.

2.4. Object detection

Object detection is a feature bundled with the OpenCV library. It uses a classifier, proposed by Paul Viola [19] and Rainer Lienhart [20], which can identify many different objects. The process begins by selecting a few hundred sample views of the desired object in many angles and illumination conditions. This set of images is called the positive samples. Then, another set of images must be chosen, containing arbitrary images that do not include the selected object. This set is called the negative samples. Afterwards, both sets are used for training a cascade of boosted classifiers.

The word "cascade" in the classifier context means that the resulting classifier consists of several simpler classifiers that are applied subsequently to a region of interest until at some stage the candidate is rejected or all the stages are passed. The word "boosted" means that the classifiers at every stage of the cascade are themselves complex and built out of basic classifiers using one of four different boosting techniques (weighted voting). Currently, Discrete AdaBoost, Real AdaBoost, Gentle AdaBoost and LogitBoost are supported [21].

After the classifier is trained, it can be used to detect the object in a region of interest. The classifier is designed so that it can detect objects of different sizes by resizing the region of interest. OpenCV has a complete face detection project, detailed in Chapter 3, which implements the cascade classifier described above. This project can be easily adapted to any object, since a new set of images can be used in the training stage without modifying much of the code. Figure 33 illustrates an example of an object detection application [22].

Figure 33. Component detection example

There are some important functions that should be detailed for a better understanding. The OpenCV function cvLoad is used to import the XML file generated during the training stage:

void* cvLoad( const char* filename, CvMemStorage* memstorage, const char* name, const char** real_name);

The single parameter that must be passed to this function when loading a cascade is filename, which must contain the path of the XML file. The other arguments do not need to be specified when loading a cascade file.

The cvHaarDetectObjects function returns a sequence of rectangles, one for every object detected:

CvSeq* cvHaarDetectObjects(const CvArr* image, CvHaarClassifierCascade* cascade, CvMemStorage* storage, double scale_factor, int min_neighbors, int flags, CvSize min_size);

The image parameter stands for the image where the objects must be detected. The cascade parameter receives the result returned by the cvLoad function. During the search, potential rectangle candidates are stored in the storage parameter. The scale_factor is the factor by which the search window is scaled in every subscan. The min_neighbors parameter works by grouping neighboring rectangles. The only flag currently supported by the flags parameter is CV_HAAR_DO_CANNY_PRUNING, which uses a Canny edge detector (see Subsection 2.1.2) to reject some image regions that do not contain the searched object. min_size stands for the minimum detection window size.
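A minimal sketch of the detection step is given below; the cascade file and input image names are placeholders (OpenCV ships several trained face cascades in its data folder), and the numeric parameters are illustrative.

#include <cv.h>
#include <highgui.h>
#include <stdio.h>

int main( void )
{
    /* both file names below are placeholders used only for illustration */
    CvHaarClassifierCascade* cascade =
        (CvHaarClassifierCascade*)cvLoad( "haarcascade_frontalface_default.xml",
                                          NULL, NULL, NULL );
    IplImage* img = cvLoadImage( "people.png", 1 );
    if( !cascade || !img )
        return -1;

    CvMemStorage* storage = cvCreateMemStorage( 0 );

    /* scale the search window by 1.1 per subscan, require 3 neighboring hits,
       prune with Canny, and ignore windows smaller than 30x30 pixels */
    CvSeq* faces = cvHaarDetectObjects( img, cascade, storage,
                                        1.1, 3, CV_HAAR_DO_CANNY_PRUNING,
                                        cvSize( 30, 30 ) );

    int i;
    for( i = 0; i < (faces ? faces->total : 0); i++ )
    {
        /* each element starts with the CvRect of the detected object */
        CvRect* r = (CvRect*)cvGetSeqElem( faces, i );
        printf( "face %d at (%d, %d), size %dx%d\n", i, r->x, r->y, r->width, r->height );
    }

    cvReleaseMemStorage( &storage );
    cvReleaseImage( &img );
    return 0;
}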
3. Example applications

Some code samples are also distributed in the default package delivered by Intel. Inside the samples subfolder, located in the installation folder, there are source files that may help a beginner OpenCV user. Among these files, some samples deserve a closer look, taking into account their relevance for pattern detection and their use of concepts previously presented in this tutorial. Therefore, the samples regarding square detection, the CamShift object tracking algorithm, the Kanade Lucas tracker and face detection are described in the following subsections.
Square detection

The square detection sample attempts to find and highlight squares contained in loaded pictures. In order to find squares, this sample implements a function that returns a sequence containing the vertices of the detected squares. First, the sequence that will carry the result is created, making it possible to add vertices to it throughout the function. After that, some filters are applied to enhance edges, considering several thresholds. At each threshold, the cvFindContours function splits the image into contours and stores them in a sequence object. For each contour, the cvApproxPoly function calculates an approximated polygon based on the result output by cvFindContours. If some conditions are satisfied (such as having four vertices and a convex contour), the detected polygon is accepted as a square. Figure 34 illustrates two samples with squares highlighted by this demo.

Figure 34. Square detection sample.
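The core of that idea can be sketched as follows. This is a simplification rather than the shipped sample: the real demo also varies thresholds and checks corner angles, and the function name, area threshold and approximation accuracy below are illustrative.

#include <cv.h>
#include <math.h>

/* Hedged sketch: collect the vertices of convex quadrilaterals found in an
   8-bit single-channel edge/threshold image. */
CvSeq* find_quads( IplImage* binary, CvMemStorage* storage )
{
    CvSeq* contours = 0;
    CvSeq* quads = cvCreateSeq( 0, sizeof(CvSeq), sizeof(CvPoint), storage );

    /* cvFindContours modifies its input, so pass a working copy if needed */
    cvFindContours( binary, storage, &contours, sizeof(CvContour),
                    CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE, cvPoint( 0, 0 ) );

    for( ; contours != 0; contours = contours->h_next )
    {
        /* approximate the contour with a polygon; the accuracy is a
           fraction of the contour perimeter */
        CvSeq* poly = cvApproxPoly( contours, sizeof(CvContour), storage,
                                    CV_POLY_APPROX_DP,
                                    cvContourPerimeter( contours ) * 0.02, 0 );

        /* keep convex quadrilaterals with a non-negligible area */
        if( poly->total == 4 &&
            fabs( cvContourArea( poly, CV_WHOLE_SEQ ) ) > 500 &&
            cvCheckContourConvexity( poly ) )
        {
            int i;
            for( i = 0; i < 4; i++ )
                cvSeqPush( quads, (CvPoint*)cvGetSeqElem( poly, i ) );
        }
    }
    return quads;  /* four vertices per accepted square */
}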
CamShift demo

This sample shows how to apply the cvCamShift function to track a pattern in images acquired from a webcam, based on information given by a histogram. The histogram contains the color information of a pattern, and the demo uses it to find an area of the image that matches the histogram data. To make tracking possible, the user must give as input to the application a subarea of the captured frame. This area is taken as the basis for computing a hue histogram, which serves as the pattern searched for over the whole frame. A histogram sample, representing the face detected in Figure 36, can be seen in Figure 35; this histogram serves as a color identity. The functions used for creating and calculating the histogram are cvCreateHist and cvCalcHist, respectively.
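A sketch of that step is given below. It is not the demo's exact code; the bin count and variable names are illustrative, and the selection rectangle is assumed to come from the user's mouse input.

#include <cv.h>

/* Hedged sketch: build a hue histogram from a user-selected region of a
   BGR frame. */
CvHistogram* make_hue_histogram( IplImage* frame, CvRect selection )
{
    int h_bins = 30;
    float h_range[] = { 0, 180 };        /* hue range of OpenCV's 8-bit HSV */
    float* ranges[] = { h_range };
    CvHistogram* hist;

    IplImage* hsv = cvCreateImage( cvGetSize( frame ), IPL_DEPTH_8U, 3 );
    IplImage* hue = cvCreateImage( cvGetSize( frame ), IPL_DEPTH_8U, 1 );

    cvCvtColor( frame, hsv, CV_BGR2HSV );
    cvSplit( hsv, hue, 0, 0, 0 );        /* keep only the hue plane */

    /* restrict the computation to the selected area and fill the bins */
    cvSetImageROI( hue, selection );
    hist = cvCreateHist( 1, &h_bins, CV_HIST_ARRAY, ranges, 1 );
    cvCalcHist( &hue, hist, 0, 0 );
    cvResetImageROI( hue );

    cvReleaseImage( &hsv );
    cvReleaseImage( &hue );              /* the histogram keeps its own bins */
    return hist;
}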
Figure 35. Hue histogram.

To find the pattern in a frame, one must compute the back projection of the hue plane using the histogram. The back projection is an image in which each position holds the probability of belonging to an object like the one being tracked. To calculate the back projection of the image based on a histogram, the user must call cvCalcBackProject. The back projection result is illustrated in Figure 36 (right). Given the back projection, the CamShift algorithm is applied to return the track box, an oriented rectangle that surrounds the matched area. The CamShift algorithm is implemented by the cvCamShift function; it searches for the largest area of connected neighbors with a high probability of being part of the tracked object. An example of the CamShift sample applied to a face can be found in Figure 36 (left).

Figure 36. CamShift detecting a face (left) and back projection of the image (right).
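One tracking iteration built from these two functions might look like the sketch below. This is an illustration only: hue extraction is assumed to have been done as in the histogram sketch above, and the function and variable names are placeholders.

#include <cv.h>

/* Hedged sketch: back-project the hue plane with the pattern histogram and
   run CamShift around the previous search window. */
CvBox2D track_step( IplImage* hue, CvHistogram* hist, CvRect* search_window )
{
    CvConnectedComp comp;
    CvBox2D box;
    IplImage* backproject = cvCreateImage( cvGetSize( hue ), IPL_DEPTH_8U, 1 );

    /* each back projection pixel receives the histogram value of its hue,
       i.e. a likelihood of belonging to the tracked pattern */
    cvCalcBackProject( &hue, backproject, hist );

    /* CamShift returns the oriented rectangle surrounding the matched area */
    cvCamShift( backproject, *search_window,
                cvTermCriteria( CV_TERMCRIT_EPS | CV_TERMCRIT_ITER, 10, 1 ),
                &comp, &box );
    *search_window = comp.rect;   /* reuse the found window in the next frame */

    cvReleaseImage( &backproject );
    return box;
}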
Kanade Lucas tracker

The Kanade Lucas tracker is a common feature tracker implemented using a pyramidal approach, which uses optical flow to compute the new positions of the selected features. A feature can be understood as a point in a mostly rigid scene that may be matched with another point in the next frame, preserving the semantic value of being the same 3D point of the real scene. To generate some features to populate the application and test the behavior of the tracker, a function called cvGoodFeaturesToTrack can be used. This function selects the strongest corners in the image to be the features that will be tracked by the application.
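As a rough illustration (not the sample's own code; the quality and distance values are just typical choices, and the function name is invented for the sketch), corner selection with cvGoodFeaturesToTrack could look like this:

#include <cv.h>

/* Hedged sketch: pick up to max_corners strong corners from an 8-bit
   grayscale image. */
int select_features( IplImage* gray, CvPoint2D32f* corners, int max_corners )
{
    int count = max_corners;

    /* work buffers required by the function */
    IplImage* eig  = cvCreateImage( cvGetSize( gray ), IPL_DEPTH_32F, 1 );
    IplImage* temp = cvCreateImage( cvGetSize( gray ), IPL_DEPTH_32F, 1 );

    /* quality_level = 0.01 keeps corners whose score is at least 1% of the
       strongest one; min_distance = 10 enforces spacing between corners */
    cvGoodFeaturesToTrack( gray, eig, temp, corners, &count,
                           0.01, 10, 0, 3, 0, 0.04 );

    cvReleaseImage( &eig );
    cvReleaseImage( &temp );
    return count;   /* number of corners actually found */
}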
To refine the position of the selected corners, cvFindCornerSubPix may be used in addition to cvGoodFeaturesToTrack. A result of this function can be verified in Figure 37.

Figure 37. Frame showing the "good features to track".

In the main loop, the program tests whether the user has added a new feature manually, and also refines the position of this point to match the closest corner in the image. This is needed because corners are easier to track and facilitate the optical flow calculation. To calculate the next position of the features found in the last frame, the cvCalcOpticalFlowPyrLK function is used. This function implements the Kanade Lucas tracker and needs some temporary images to represent the pyramidal approach. They act as subsampled images that help the algorithm track optical flows with large movements. The result of this function is the new position of the points given as parameters.
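A single tracking step with this function can be sketched as follows. This is a simplification of what the sample does: pyramid buffer reuse between frames and the related flags are omitted, and the window size and iteration criteria are illustrative.

#include <cv.h>

/* Hedged sketch: propagate `count` features from the previous frame to the
   current one with the pyramidal Lucas-Kanade tracker. */
void track_features( IplImage* prev_gray, IplImage* curr_gray,
                     CvPoint2D32f* prev_pts, CvPoint2D32f* curr_pts,
                     int count, char* status )
{
    /* temporary pyramid buffers used internally by the tracker */
    IplImage* prev_pyr = cvCreateImage( cvGetSize( prev_gray ), IPL_DEPTH_8U, 1 );
    IplImage* curr_pyr = cvCreateImage( cvGetSize( curr_gray ), IPL_DEPTH_8U, 1 );

    cvCalcOpticalFlowPyrLK( prev_gray, curr_gray, prev_pyr, curr_pyr,
                            prev_pts, curr_pts, count,
                            cvSize( 10, 10 ),  /* search window per level */
                            3,                 /* number of pyramid levels */
                            status,            /* 1 where the flow was found */
                            0,                 /* per-feature error (unused) */
                            cvTermCriteria( CV_TERMCRIT_ITER | CV_TERMCRIT_EPS,
                                            20, 0.03 ),
                            0 );               /* flags */

    cvReleaseImage( &prev_pyr );
    cvReleaseImage( &curr_pyr );
}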
Face detection

The face detection sample uses a Haar classifier to recognize a face pattern. The demo loads a pretrained Haar classifier and uses it to detect the objects in the current frame. The classifier used in this demo is a CvHaarClassifierCascade, which differs from a single classifier by cascading multiple Haar classifiers. To load the classifier, the sample uses the cvLoad function and then casts the result to the CvHaarClassifierCascade type. This classifier is used as a parameter of the cvHaarDetectObjects function, with which the user can find all the regions that match the objects the cascade has been trained for. As return value, the cvHaarDetectObjects function outputs a sequence of rectangles containing the matched objects. A good snapshot of a runtime instance of face detection is shown in Figure 38.
Figure 38. Haar classifier detecting faces.

4. Final considerations

This tutorial has shown that OpenCV is a sophisticated computer vision library containing a large number of functions related to pattern recognition. These functions are the basis for any image processing project that aims to detect a specific or a general pattern. The OpenCV base functions for pattern detection are easy to use and follow the same structure, sharing common parameters offered by the library. Beyond the framework's simplicity, OpenCV comes with an interface library that speeds up the prototyping of a test or application using OpenCV. This library is called HighGUI, and a template code sample using it is shown in the Appendix of this tutorial.

5. References

[1] G. Bradski and V. Pisarevsky. Intel's Computer Vision Library: applications in calibration, stereo, segmentation, tracking, gesture, face and object recognition, Proceedings of the Conference on Computer Vision and Pattern Recognition, Hilton Head Island, USA.
[2] R. Gonzalez and R. Woods. Digital Image Processing. Prentice Hall.
[3] C. Tomasi and R. Manduchi. Bilateral Filtering for Gray and Color Images, Proceedings of the International Conference on Computer Vision, Bombay, India.
[4] C. Harris. Tracking with Rigid Objects. MIT Press.
[5] H. Scharr. Digitale Bildverarbeitung und Papier: Texturanalyse mittels Pyramiden und Grauwertstatistiken am Beispiel der Papierformation. Diplomarbeit, Fakultät für Physik und Astronomie, Ruprecht-Karls-Universität Heidelberg.
[6] M. Uenohara and T. Kanade. Vision-Based Object Registration for Real-Time Image Overlay. Journal of Cognitive Neuroscience 3, 71-86.
[7] H. Kato and M. Billinghurst. Marker Tracking and HMD Calibration for a Video-Based Augmented Reality Conferencing System, Proceedings of the Workshop on Augmented Reality, San Francisco, USA.
