Fabio Remondino. Tracking of human movements in image space


Tracking of human movements in image space

Table of contents

1. Introduction
2. Human tracking overview
3. Data acquisition
4. Algorithms overview
   4.1 The least square matching tracker
   4.2 Object tracking
   4.3 The Shi-Tomasi-Kanade tracker
   4.4 Detection and tracking of moving objects
5. Features selection for tracking human body parts
6. Results
   6.1 Least square matching tracking
   6.2 Shi-Tomasi-Kanade tracker
   6.3 Detection and tracking of moving objects
   6.4 Object tracking
7. Conclusions
8. Future works
Bibliography

1. Introduction

Human motion analysis is receiving increasing attention from researchers in different fields of study. The interest is motivated by a wide spectrum of applications, such as athletic performance analysis, surveillance, man-machine interfaces, video-conferencing, human-computer interaction and motion capture (games and animation).

A complete model of a human consists of both the movements and the shape of the body. Many of the available systems treat the two modeling processes as separate, even though they are closely related. Depending on the application (animation, visualization, medical imaging), different methods can be used for the measurement of the body shape: laser scanners, infra-red light scanners, photogrammetry, structured light. The modeling of the movement is often obtained by capturing the motion with tracking processes: this can be achieved with photogrammetric methods, electromagnetic or mechanical sensor systems and image-based methods.

In general the tracking process can be described as the establishment of correspondences of the image structure between consecutive frames, based on features related to position, velocity, shape, color and texture. The main problem is to establish automatically the corresponding features in different images. Tracking is required for 2D and 3D object localization and it is also used for object detection, classification and identification. The main goals of motion studies are to detect moving regions (points, features, areas), estimate the motion, model articulated objects and interpret the motion. It is a very hard task because:
- the appearance of people can vary dramatically from frame to frame;
- people can appear in arbitrary poses;
- the human body can deform in complex ways;
- tracked points can be occluded, resulting in ambiguities and multiple interpretations;
- tracked points (joints) are often not well observable (clothing hides the underlying structure);
- it is a geometrically under-constrained problem (images are 2D entities of a 3D world).

This work focuses on the tracking of movements of humans in monocular sequences of images. Section 2 gives a general overview of tracking techniques, including motion capture, human modelling processes and moving object detection. In Section 3 the techniques used for image acquisition and the contrast enhancement process are presented. In Section 4 an overview of the implemented algorithms is given, while a short description of interest features is contained in Section 5. Finally, Section 6 shows all the results for the validation of the algorithms.

2. Human tracking overview

The main problem of tracking humans (and in particular human movements) is how to capture the position and motion in space of the articulated parts of the human body. Typically the tracking process involves matching between frames using pixels, points, lines and blobs, based on their motion, shape or other visual information. Tracking the movements of persons and modeling the different parts of the human body are two applications very close to each other. There are two main techniques to capture human motion [2]:

(a) Tracking using body markers

These tracking systems can be divided into [13]:
1. Systems which employ sensors on the body that sense artificial external sources (e.g. an electromagnetic field) or natural external sources. These systems provide 3D world-based information, but their workspace and accuracy are generally limited due to the use of the external sources, and their form factor restricts their use to medium and larger sized body parts.
2. Systems which employ an external sensor that senses artificial sources or markers on the body (e.g. an electro-optical system that tracks reflective markers) or natural sources on the body (e.g. a video-camera based system that tracks the pupil and cornea). These systems generally suffer from occlusion and a limited workspace.
3. Systems which employ sensors and sources that are both on the body (e.g. a glove with piezoresistive flex sensors). The sensors generally have small form factors and are therefore especially suitable for tracking small body parts. These systems allow for the capture of any body movement and an unlimited workspace, but generally do not provide 3D world-based information.

In figure 2.1 some systems for motion capture are presented.

Fig.2.1: Different systems for motion capture. Left and right: retro-reflective markers. Middle: electro-mechanical system

All these techniques are used especially in motion capture, where an object's position and orientation in physical space are recorded as information in a suitable form that animators can use to control elements in a computer generated scene. The disadvantages of these techniques are:
- displacement of the markers during movement introduces uncertainty in the results;
- difficulty of placing markers on complex articulations (like shoulders or knees);
- rigidity in movement (psychological effects);
- difficult calibration of the system.

The main advantage is the capability of some systems to process the data and produce 3D results in real time.

(b) Tracking without markers (marker-free methods)

Marker-free methods are based on image sequence processing/analysis. These methods are often model-based; the image sequences can be acquired either from one camera (monocular vision) or from multiple cameras (multi-view). In the monocular case different approaches can be used to track the human body: matching point features, contour extraction (sensitive to noise), 3-D geometric primitives (projected onto the images) [13], probabilistic models of the joint positions [22], particle filtering [3], active part decomposition. In the multi-view approach, multiple cameras acquire simultaneously different views of the person and the 3-D body poses and motions at each time instant can be recovered from the multi-image sequences [7].

The marker-free methods offer the subject complete freedom of movement, which is not the case for tracking with markers. Image understanding and extrapolation of the third dimension are the main problems of these methods, especially in monocular vision. In this case the 3D coordinates can be inferred from the 2D image coordinates, e.g. using a Bayesian approach and a set of training data [11], or by fitting the projection of a three-dimensional person model through the sequence [21, 24]. The main problems of these approaches are the models of the different parts of the body (using cylinders, cones, elliptical cylinders), the large number of degrees of freedom of the model (body joints, rotations, orientations) and the modeling of the motion (prediction of the next steps). In the multi-image approach, stereo-vision can be used to extract 3D information from the sequence.

Fig.2.2: Left: geometric primitives projected onto the image [21]. Right: a volumetric human model [1]

The interest in human motion analysis can also be limited to detecting moving objects in image sequences. In applications such as real-time tracking, monitoring of wide-area sites or surveillance, tracking approaches based on moving object localization and on body shape or body boundary tracking are used (fig.2.3). The moving objects can be identified in the images using background subtraction or optical flow. If a motion of the camera is also present, a rectification of the frames must be performed in order to apply the background knowledge [12]. Moving objects in the scene are often segmented, while occlusion problems can be solved using temporal analysis and trajectory prediction (Kalman filter) [17].

Fig.2.3: Moving shapes tracking

3. Data acquisition

Four sequences (fig.3.1, 3.2, 3.3, 3.4) have been acquired with a Sony DCR-VX700E, a Sony digital handycam that records images in digital format on a mini DV tape. The images are stored in DV format with a size of 720x576 pixels and 24 bit color resolution. DV is a Sony proprietary, compressed digital video and audio recording standard. As CCD cameras are interlaced, i.e. a full frame is split into two different fields which are recorded and read out consecutively, the odd and even lines of an image are captured at different times and a saw-tooth pattern is created during the digitizing process. For this reason only the odd (or even) lines of an image are used in the algorithms, reducing the resolution in the vertical direction by 50 per cent.

Two other sequences (fig.3.5, 3.6) have been acquired by digitizing an old VHS tape. Also in this case the digitization process creates a saw-tooth pattern in the images; therefore reduced images are used for the validation of the algorithms.

Fig.3.1: Sequence of 24 frames of a walking man: the camera is rotating on a tripod

Fig.3.2: Sequence of 60 frames: the camera is still and the guy is just raising his arms

The two sequences acquired from the VHS tape (fig.3.5, 3.6) have very low resolution because of the video-tape and the digitization process (RAZOR software). No attempt to enhance the frames was successful: different filters and also motion blur compensation did not achieve good results. Therefore just a local contrast enhancement has been applied.
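As a minimal sketch of the field extraction just described, assuming the frames are held as numpy arrays (the function name is illustrative, not part of the original implementation):

```python
import numpy as np

def extract_field(frame, odd=True):
    """Keep only the odd (or even) lines of an interlaced frame.

    The result has half the vertical resolution, which removes the
    saw-tooth pattern caused by the two fields being exposed at
    different times.
    """
    return frame[(1 if odd else 0)::2, ...]

# Example: a 576x720 DV frame becomes a 288x720 single-field image.
frame = np.zeros((576, 720, 3), dtype=np.uint8)   # placeholder frame
print(extract_field(frame).shape)                  # (288, 720, 3)
```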

Fig.3.3: Sequence of 9 frames acquired from VHS tape

Fig.3.4: Sequence of 10 frames from VHS tape

Fig.3.5: Sequence of 100 frames: two people walking past each other; their trajectories are perpendicular to the viewing direction of the camera, which is stationary and far away from them

Fig.3.6: Sequence of 50 frames of people walking towards the camera

4. Algorithms overview

In this section the implemented algorithms are described: the least square matching tracker, object tracking and extraction, the Shi-Tomasi-Kanade tracker, and the detection and tracking of moving objects.

4.1 The least square matching tracker

The basic idea of this algorithm is to track a selected point through a sequence of images using least square matching (LSM). The process is based on the adaptive least squares technique [9] and is similar to [4]. Assume two image regions are given as discrete two-dimensional functions f(x,y) and g(x,y), where f(x,y) is the template in one image and g(x,y) the patch in the other image; a correspondence is established if

f(x,y) = g(x,y)    (4.1)

Because of random effects (noise) in both images, the above equation is not consistent. Therefore a noise vector e(x,y) is added, resulting in

f(x,y) - e(x,y) = g(x,y)    (4.2)

The location of the function values g(x,y) must be determined in order to provide the match point. This is achieved by minimizing a goal function which measures the distances between the grey levels in the template and in the other patch. The goal function to be minimized in this approach is the L2-norm of the residuals of the least squares estimation. Eq. (4.2) can be considered a non-linear observation equation which models the vector of observations f(x,y) with a function g(x,y), whose location in the other image must be estimated. The location is usually described by shift parameters, which are estimated with respect to an initial position of g(x,y). In order to account for a variety of systematic image deformations and to obtain a better match, image shaping parameters (affine image shaping) and radiometric corrections can be introduced besides the shift parameters [9]. An affine transformation is often used and the pixel coordinates of the matched point are computed as

x_new = a_0 + a_1 x + a_2 y    (4.3.1)
y_new = b_0 + b_1 x + b_2 y    (4.3.2)

where the 6 parameters of the affine transformation must be estimated from eq. (4.2) by minimizing the sum of the squares of the differences between the grey values in the image patches. The function g(x,y) in eq. (4.2) is linearized with respect to the unknown parameters and the obtained linear system is iterated using a Gauss-Markov method [9].

The implemented algorithm uses two images, one as template and the other as search image. The patches in the search image are modified by the affine transformation (translations, rotation, shearing and scaling) and the corresponding point is found in the search image after some iterations. Fig.4.1 shows the result of the least squares matching: the red box is the selected patch in the template image and the green box represents the affinely transformed patch in the search image (emphasized).

Fig.4.1: LSM algorithm: template image (left) and search image (right)
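To make the estimation concrete, here is a minimal sketch of the LSM iteration for the translation-only case, i.e. eq. (4.3) reduced to the shift parameters a_0 and b_0; the full implementation also estimates the remaining affine and radiometric parameters. Function name and window size are illustrative assumptions, not taken from the original code:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def lsm_shift(template, search, xt, yt, xs, ys, half=10, max_iter=20):
    """Translation-only least squares matching (eq. 4.2 with shifts only).

    The fixed patch f is taken around (xt, yt) in the template image;
    the patch g, resampled around (xs + dx, ys + dy) in the search
    image, is shifted until the sum of squared grey-level differences
    is minimal (Gauss-Markov iteration).
    """
    gy_, gx_ = np.mgrid[-half:half + 1, -half:half + 1]
    f = template[yt - half:yt + half + 1, xt - half:xt + half + 1].astype(float)
    dx = dy = 0.0
    for _ in range(max_iter):
        g = map_coordinates(search.astype(float),
                            [gy_ + ys + dy, gx_ + xs + dx], order=1)
        gv, gu = np.gradient(g)                       # patch gradients (y, x)
        A = np.column_stack([gu.ravel(), gv.ravel()])
        l = (f - g).ravel()                           # grey-level residuals
        upd, *_ = np.linalg.lstsq(A, l, rcond=None)   # shift update
        dx += upd[0]
        dy += upd[1]
        if np.hypot(upd[0], upd[1]) < 0.01:           # converged (pixels)
            break
    return xs + dx, ys + dy
```

Solving for all six affine parameters follows the same Gauss-Markov scheme, with four extra columns (x·gu, y·gu, x·gv, y·gv) in the design matrix.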

In [4] three sequences of images from three synchronized cameras are available: spatial correspondences between the three images at the same instant t, as well as temporal correspondences between subsequent frames of each camera, are computed, so a 3D trajectory can be determined. In our case the algorithm works with monocular sequences of images and only temporal correspondences can be found. The fundamental operations of the tracking process are three (a sketch of the whole loop is given below):
1. predict the position in the next frame;
2. search the position with the highest cross-correlation value;
3. establish the point in the next frame using least square matching.

If the images have been taken at near time instants, they are strongly related to each other and the image positions of two corresponding features are very similar. Therefore, for the frame at time t+1, the predicted position of a point is the same as at time t (fig.4.2). Around this position a search box is defined (blue box) and scanned for the position with the highest cross-correlation. This position is considered an approximation of the exact position of the point to be tracked. The LSM algorithm is then applied at that position (red cross) and the result of the matching is considered the exact position of the tracked point in the next frame.

Fig.4.2: The cross-correlation process to find the approximation for LSM. Frame at time t: in red the patch for LSM. Frame at time t+1: in blue the search area for cross-correlation

For the frame at time t+2 a linear prediction of the position of the point from the two previous frames is computed (fig.4.3). Then a search box is defined around this predicted position and the point with the highest cross-correlation is used for the LSM computation. For the next frames a linear prediction (based on the previous positions) is always computed, even if a more complicated interpolation could be implemented (splines or a Kalman filter, especially after occlusions).

Fig.4.3: Linear prediction to find the approximated position of the point

As the algorithm works with monocular sequences, few automatic controls on the corresponding matched points can be performed. In order to verify the reliability of the tracked points, two post-processing verifications have been implemented:
1. cross-correlation computation: it checks if the matched point is reliable between two frames. If the cross-correlation coefficient of a point in two consecutive images is smaller than a predefined threshold value, the point is rejected;
2. distance between two matched joints: this test can be performed if the camera does not zoom and is stationary, or if its movements are slower than the moving objects; in these cases a distance can be computed, in each frame, between two points on the body that must remain at the same distance (e.g. foot-knee, wrist-shoulder). Then the difference of this distance in two consecutive frames is calculated and, if the difference does not belong to a predefined interval, the tracked point is rejected.
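A hedged sketch of the predict / cross-correlate / refine loop described above, reusing lsm_shift from the previous sketch; this is a reconstruction of the procedure, not the original program, and the 0.75 rejection threshold anticipates the default quoted in section 6.1:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation coefficient of two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    d = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / d) if d > 0 else 0.0

def track_point(frames, x0, y0, half=10, search=15, rho_min=0.75):
    """Predict / cross-correlate / LSM-refine loop for one point."""
    track = [(float(x0), float(y0))]
    for t in range(1, len(frames)):
        x1, y1 = [int(round(v)) for v in track[-1]]
        # 1. prediction: same position for the first pair, linear afterwards
        if t == 1:
            xp, yp = x1, y1
        else:
            xp = int(round(2 * track[-1][0] - track[-2][0]))
            yp = int(round(2 * track[-1][1] - track[-2][1]))
        tmpl = frames[t - 1][y1 - half:y1 + half + 1,
                             x1 - half:x1 + half + 1].astype(float)
        # 2. scan the search box for the highest cross-correlation
        best, bx, by = -1.0, xp, yp
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                p = frames[t][yp + dy - half:yp + dy + half + 1,
                              xp + dx - half:xp + dx + half + 1]
                if p.shape == tmpl.shape:
                    r = ncc(tmpl, p.astype(float))
                    if r > best:
                        best, bx, by = r, xp + dx, yp + dy
        if best < rho_min:        # reliability check: reject weak matches
            break
        # 3. refine with least squares matching (lsm_shift, sketched above)
        track.append(lsm_shift(frames[t - 1], frames[t], x1, y1, bx, by, half))
    return track
```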

A cross-correlation computation has also been implemented to recover lost points after occlusions. The user must manually select the last image where the point is visible and the image where the point reappears. The process finds the new position after the occlusion using a suitable window; these coordinates are considered an approximation of the point and the LSM is applied to compute the correct position. If the tracked points have been selected in correspondence of the human joints, a final animation of the tracked points can be done and the 2D trajectories can be drawn.

4.2 Object tracking

A tracking process can also involve the extraction of parts of objects using a few tracked points. Using an image matching process [4] which establishes many correspondences in three consecutive images, it is possible to extract the full body (or part of it) through the sequence. The process is based on the adaptive least squares method [9] and automatically determines a dense set of corresponding points between the images, starting from a few seed points sparse on the surface to extract. The template image is divided into polygonal regions according to which of the seed points is closest (Voronoi tessellation, fig.4.4; a small sketch of this assignment is given below).

Fig.4.4: Search strategy for the establishment of correspondences between images (seed points and matched points, with zoom)

Starting from the seed points, and using a user-defined border of the object of interest, the algorithm tries to match corresponding points in three consecutive images. The central image is used as template and the other two as search images. The matcher searches for the corresponding points in the two images independently. The process starts from a selected point, shifts horizontally in the template and in the search images and applies the LSM algorithm at the shifted location. If the quality of the matching is good, the matched point is stored and the process continues horizontally until it reaches the region boundaries. The covering of the entire polygonal region of a seed point is achieved by sequential horizontal and vertical shifts. In monocular sequences the reliability of the matched surfaces depends only on the matching parameters; in multi-view sequences a control can be done using the computed 3D coordinates to check for wrong correspondences [5].
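A small sketch of the seed-point assignment, assuming scipy is available (function and variable names are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

def voronoi_regions(shape, seeds):
    """Label every template pixel with the index of its closest seed
    point: the polygonal matching regions of fig. 4.4."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    pixels = np.column_stack([xx.ravel(), yy.ravel()])
    _, idx = cKDTree(np.asarray(seeds, dtype=float)).query(pixels)
    return idx.reshape(h, w)

# Hypothetical seed points in (x, y) image coordinates; each region is
# then grown from its seed by horizontal and vertical shifts, applying
# LSM at every shifted location until the region boundary is reached.
seeds = [(120, 80), (200, 150), (160, 220)]
labels = voronoi_regions((288, 360), seeds)
```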

To evaluate the quality of the matched points the following indicators are used:
- the a posteriori standard deviation of the least squares adjustment;
- the standard deviations of the shifts in the x and y directions.

If the quality of the matching is not satisfactory, the algorithm computes the process again, changing some parameters, like a smaller shift from the neighbor or a bigger patch size. At the end of the process a cloud of 2D points is obtained (fig.4.5, second row), even if some holes due to non-analyzed areas can appear in the results: the algorithm tries to close these gaps by searching from all directions around them. If the holes are in areas with low texture, the matching does not find many correspondences; therefore the results can be improved by increasing the number of seed points in these areas or by using neighborhood information.

Fig.4.5: Triplet of successive frames and found 2D correspondences

4.3 The Shi-Tomasi-Kanade tracker

In this section the Shi-Tomasi-Kanade tracker [14, 19, 23] is briefly described. In general, any function of three variables I(x,y,t), where the space variables x and y as well as the time variable t are discrete and suitably bounded, can represent the intensity of an image sequence. If the camera moves, the patterns of image intensities change in a complex way; but images taken at near time instants are usually strongly related to each other, because in general they refer to the same scene taken from only slightly different viewpoints.

Consider an image sequence I(x,t), with x = [u,v]^T the coordinates of an image point. If the time sampling frequency is sufficiently high, we can assume that small image regions are displaced but that their intensities remain unchanged. Therefore I(x,t) is not arbitrary but satisfies

I(x,t) = I(δ(x), t+Δt)    (4.4)

where δ(x) is the motion field, specifying the warping that is applied to image points between the time instants t and t+Δt. The fast-sampling hypothesis allows us to approximate the motion with a translation, that is, δ(x) = x + d, where d is a displacement vector. So, a later image taken at time t+Δt can be obtained by moving every point in the current image, taken at time t, by a suitable amount d. As the image motion model is not perfect, and because of image noise, equation (4.4) is not exactly satisfied and can be written as

I(x,t) = I(δ(x), t+Δt) + n(x)    (4.5)

where n is a noise function.

The tracker's task is to compute the displacement d for a number of selected points for each pair of successive frames in the sequence. The displacement is computed by minimizing the SSD (Sum of Squared Differences) residual

ε = Σ_W [ I(x+d, t+Δt) - I(x,t) ]²    (4.6)

where W is a small image window centered on the point for which d is computed. By plugging the first-order Taylor expansion of I(x+d, t+Δt) into eq. (4.6) and imposing that the derivatives with respect to d are zero, we obtain the linear system

G d = e    (4.7)

where

G = Σ_W [ I_u²      I_u I_v ]
        [ I_u I_v   I_v²    ]    (4.8.1)

with

I_u = ∂I/∂u,  I_v = ∂I/∂v    (4.8.2)

and e, the error vector, is

e = -Σ_W I_t [ I_u  I_v ]^T    (4.8.3)

with I_t = ∂I/∂t.

The derivatives of the function I can be computed with finite pixel differences, but there are always problems with image noise and local minima. A better solution can be achieved by convolving the function with a special filter (Gaussian kernel). The tracker is based on eq. (4.7): given a pair of successive frames, d is the solution of (4.7), that is d = G⁻¹e, and is used to compute the position in the new frame. The procedure is iterated according to a Newton-Raphson scheme, until the displacement estimate converges.

The translation model δ(x) = x + d cannot account for certain transformations of the feature window we are tracking, for instance rotation, scaling and shear. An affine motion model is more accurate [19]:

[ x+u ]   [ a_1  a_2 ] [ x ]   [ a_5 ]
[ y+v ] = [ a_3  a_4 ] [ y ] + [ a_6 ]    (4.9)

because two rotations, two translations, a scale in x/y and a shear are considered.

It computes δ(x) of eq. (4.4) as

δ(x) = A x + d    (4.10)

where d is a displacement and A is a 2x2 matrix accounting for the affine warping, which can be written as A = I + D, with D = [d_ij] a deformation matrix and I the identity matrix. As in the translational case, the motion parameters D and d are estimated by minimizing the SSD residual

ε = Σ_W [ I(Ax+d, t+Δt) - I(x,t) ]²    (4.11)

Equation (4.11) is differentiated with respect to the unknown entries of the matrix D and of the vector d, and the results are set to zero. Linearizing the resulting system by Taylor expansion, we obtain the linear system

T z = a    (4.12)

where

z = [ d_11  d_12  d_21  d_22  d_1  d_2 ]^T    (4.13.1)

contains the unknown entries of the deformation matrix D and of the displacement vector d;

a = -Σ_W I_t [ u I_u  u I_v  v I_u  v I_v  I_u  I_v ]^T    (4.13.2)

is the error vector that depends on the differences between the two images;

T = Σ_W [ U    V^T ]
        [ V    G   ]    (4.13.3)

where U is a 4x4 matrix containing the products of the first 4 elements of the vector a with each of these elements, V is a 2x4 matrix containing the products of the elements I_u and I_v with the first 4 elements of a, and G is as in equation (4.8.1). Finally, equation (4.12) can be solved iteratively for the entries of z.

In both cases (translational and affine model) the feature selection is very important. In [19] it is recommended that T (or G) be well conditioned, i.e. the ratio between the largest and the smallest eigenvalue of T (or G) should not be too big (corner selection). Once the displacement has been found and the new position of the point has been determined, a control on the new position must be done. The control is computed with a cross-correlation process: given a template window around the point in frame n and a slave window around the matched point in frame n+1, a cross-correlation coefficient ρ is computed. The corresponding feature in frame n+1 is accepted if the computed ρ is bigger than a user-defined threshold value ρ_0.
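A compact sketch of the translational case (eqs. (4.6)-(4.8)), with Gaussian smoothing of the images before differentiation as suggested above; this is a plain reconstruction of the published scheme, not the code used in this work. In practice one also checks that G is well conditioned before accepting the feature:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, sobel

def klt_translation(I, J, x, y, half=7, max_iter=10, sigma=1.0):
    """Translational STK step for one feature: solve G d = e (eq. 4.7)
    and iterate Newton-Raphson style until the displacement converges."""
    Is = gaussian_filter(I.astype(float), sigma)   # smooth against noise
    Js = gaussian_filter(J.astype(float), sigma)
    Iu = sobel(Is, axis=1) / 8.0                   # derivative along u (x)
    Iv = sobel(Is, axis=0) / 8.0                   # derivative along v (y)
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    win = lambda img, cx, cy: map_coordinates(img, [yy + cy, xx + cx], order=1)
    f, fu, fv = win(Is, x, y), win(Iu, x, y), win(Iv, x, y)
    G = np.array([[(fu * fu).sum(), (fu * fv).sum()],
                  [(fu * fv).sum(), (fv * fv).sum()]])
    d = np.zeros(2)
    for _ in range(max_iter):
        It = win(Js, x + d[0], y + d[1]) - f       # temporal difference
        e = -np.array([(It * fu).sum(), (It * fv).sum()])
        step = np.linalg.solve(G, e)               # update from G d = e
        d += step
        if np.hypot(step[0], step[1]) < 0.01:      # converged (pixels)
            break
    return x + d[0], y + d[1]
```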

Usually the STK tracker is not used for tracking human movements in image sequences; but if the images have been taken at near time instants, they are usually strongly related to each other and this (extended) tracker can give quite good results for not very long sequences of well-textured images.

4.4 Detection and tracking of moving objects

In applications like video-surveillance and monitoring of human activities, the main idea is to detect and track moving objects (people, vehicles, etc.) as they move through the scene. Considering one image, the regions of moving objects should be separated from the static environment. To identify and separate the moving objects, different approaches have been proposed: background subtraction [17], 2D active shape models [18], a combination of motion, skin color and face detection [8]. If the camera is stationary, or its movements are very small compared to those of the objects, a simple subtraction of two consecutive frames can be used (fig.4.6-c). The resulting image has much larger values for the moving components of the frame than for the stationary components. A moving object produces two regions having large values:
1. a front region of the object, caused by the covering of the background by the object;
2. a rear region of the object, caused by the uncovering of the background by the object.

Therefore, by thresholding the image it is possible to detect the rear region of the moving object. The threshold value is determined by experiments. The binary thresholded image can contain some noise, which can easily be removed with an erosion process or with a median filter (fig.4.6-d). A minimal sketch of this differencing step is given below.

Fig.4.6: Example of image subtraction. Two frames of a sequence (a, b). Binary image after absolute image difference, with noise (c): black pixels represent movements. Result after median filter (d)
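A minimal sketch of the differencing step, assuming greyscale frames as numpy arrays (the threshold value is illustrative and, as noted above, determined by experiments):

```python
import numpy as np
from scipy.ndimage import median_filter

def moving_pixels(frame_a, frame_b, thresh=25):
    """Binary motion mask from two consecutive frames (fig. 4.6):
    absolute image difference, threshold, median filter for the noise."""
    diff = np.abs(frame_a.astype(int) - frame_b.astype(int))
    mask = (diff > thresh).astype(np.uint8)   # 1 = moving pixel
    return median_filter(mask, size=3)        # remove isolated noise pixels
```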

Once the moving objects have been localized, their bounding boxes can be computed. For this purpose a vertical projection of the binary image is first performed (fig.4.7). The different objects in the image are often already visible from this projection. The positions of the objects along the horizontal axis are determined by slicing the vertical projection. If the counted number of pixels in a slice is higher than a threshold, the slice is identified as an area of moving activity. This is done for all the slices along the horizontal axis, and finally the adjacent slices with moving activity are joined together, obtaining a set of areas where moving activities have been detected (fig.4.7; a sketch of this slicing is given at the end of this section). The size of the slices can be adapted to the specific conditions of the acquired images. The smaller the slices are, the better the precision of the detected areas will be; but if the slices are too small, different moving objects could be detected as a single moving object. The threshold for the identification of a slice as a moving area depends on the size of the slices and has to be determined by experiments.

Fig.4.7: Vertical projection (left) with 2 peaks representing the two men. Vertical lines (right) delimiting the moving objects

Then the same process is performed with the horizontal projections of the areas determined along the horizontal axis. The horizontal projection of a person is sometimes divided into 2 different moving areas: indeed the middle of the body is usually not moving during the walk, and therefore it is not detected. Once the moving areas are detected, the square bounding boxes can be obtained.

Fig.4.8: Horizontal projections of the x-axis areas (left) and computed bounding boxes (right)

In case of occlusions (two people walking one towards the other), it can be difficult to divide the vertical projection into its components. To avoid this problem, the center of gravity is computed and the boxes are calculated with respect to this center. Occlusions can also be predicted, detected and handled properly by estimating the positions and velocities of the objects and projecting these estimates onto the image plane [14]. Once the boxes have been computed, it is possible to visualize the moving foreground regions using background subtraction.
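The slicing of the projections described above might be sketched as follows; the slice width and count threshold are illustrative assumptions, and the per-area horizontal step is simplified to taking the nonzero row extent instead of repeating the slicing:

```python
import numpy as np

def moving_areas(mask, slice_w=8, min_count=30):
    """Slice the x-axis, keep slices whose vertical projection exceeds
    a threshold and join adjacent active slices (fig. 4.7)."""
    proj = mask.sum(axis=0)                   # vertical projection
    active = [proj[i:i + slice_w].sum() > min_count
              for i in range(0, mask.shape[1], slice_w)]
    areas, start = [], None
    for i, a in enumerate(active + [False]):  # sentinel closes the last run
        if a and start is None:
            start = i * slice_w
        elif not a and start is not None:
            areas.append((start, min(i * slice_w, mask.shape[1])))
            start = None
    return areas                              # list of (x_min, x_max)

def bounding_boxes(mask, x_areas):
    """Vertical extent of each x-area, giving the final boxes (fig. 4.8)."""
    boxes = []
    for x0, x1 in x_areas:
        rows = np.nonzero(mask[:, x0:x1].sum(axis=1))[0]
        if rows.size:
            boxes.append((x0, int(rows[0]), x1, int(rows[-1])))
    return boxes                              # (x_min, y_min, x_max, y_max)
```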

5. Features selection for tracking human body parts

Regardless of the method used for tracking, not all the parts of an image contain motion information. Moreover, along an edge we can only determine the motion component orthogonal to the edge, so we must take care in selecting the features to follow through the sequence. In general, to avoid these difficulties, only regions with enough texture are used. In fact a single pixel cannot be tracked unless it has a very distinctive brightness with respect to all of its neighbors; as a consequence, it is often hard or impossible to determine where the single pixel has moved in the subsequent frame based only on local information. Because of these problems, we do not track single points but windows containing good features and sufficient texture.

Point features are usually extracted by local operators, often called interest operators. The attributes are computed within a rectangular or circular window, in selected or in all directions, and are usually compared to a threshold to decide whether a feature is good or not. Many feature point extractors have been proposed in recent years [6, 10, 20]. Concerning all these interest operators, some common characteristics can be found:
1. they work with a predefined or arbitrary idea of what a good window looks like;
2. they assume that a good feature is independent of the tracking algorithm;
3. they often find features that are well trackable only in pure translation;
4. they often find features which are good only in the first frames.

So the resulting features are not guaranteed to be the best for the tracking algorithm over the whole sequence. Therefore a feature point must be consistent and should have enough information in its neighborhood over the different frames. Concerning tracking operations, researchers have proposed to track features such as corners, windows with high spatial frequency content, or regions where some mix of second-order derivatives is sufficiently high [19]. But for tracking human body movements, as we want to extract 2D or 3D information from the tracked points, we cannot take the features randomly all over the body, as an interest operator would do, or just in correspondence of edges; we must select precise points (joints). We are interested in capturing the movement of the human body, therefore we should select points which can define the motion. Usually points in correspondence of the head, shoulders, elbows, wrists, hips, knees and ankles are selected. Once this set of points has been extracted from the image, a human skeleton can be drawn (fig.5.1).

Fig.5.1: Skeleton of human body (EPFL)
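For completeness, the corner-quality measure of [19] mentioned above (the smaller eigenvalue of the gradient matrix G of eq. (4.8.1)) can be sketched as follows; this is how an interest operator would rank windows, whereas for body tracking the joints are picked manually:

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def min_eigenvalue_map(image, win=7):
    """Smaller eigenvalue of the gradient matrix G for every window:
    high values mark well-textured, trackable features (corners)."""
    img = image.astype(float)
    Iu = sobel(img, axis=1) / 8.0
    Iv = sobel(img, axis=0) / 8.0
    # windowed averages of the derivative products (proportional to G)
    a = uniform_filter(Iu * Iu, win)
    b = uniform_filter(Iu * Iv, win)
    c = uniform_filter(Iv * Iv, win)
    # closed-form smaller eigenvalue of the symmetric matrix [[a, b], [b, c]]
    return (a + c) / 2.0 - np.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
```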

6. Results

After selecting some points of interest, we can apply the different algorithms to track the points. The first two parts of this chapter present the results obtained with the least square matching tracker and the Shi-Tomasi-Kanade tracker. The results of the detection of moving objects, their tracking and the computation of bounding boxes are presented in the third part, while the tracking of a whole object and its visualization is shown in the last part. All the results are in image space: 3D coordinates will be recovered in future work.

6.1 Least square matching tracking

The least square matching tracking process starts from some points selected on the image. These results consider points selected manually and in particular positions (fig.6.1.1), as we want to extract a skeleton of the human body. Using this set of coordinates, the algorithm computes the corresponding points in the other frames. The parameter file used in the computation contains (a hypothetical rendering is given below):
- used/not used flags for the parameters of the affine transformation;
- max sigma_0 of the matching;
- max sigma-x and sigma-y in the computation of the affine parameters a_0 and b_0;
- max value for the affine parameters a_0 and b_0 (translation parameters);
- size of the window in the template and search image for LSM;
- size of the window in the search image for cross-correlation between the first and second frame;
- size of the window in the search image for cross-correlation in the next frames;
- step for the cross-correlation computation in the search image;
- size of a bigger window in the search image for cross-correlation when the value of LSM is not satisfactory.

A result is stored when the computed values of the three sigmas and of the two translation parameters are smaller than the default ones in the parameter file. The default value for sigma_0 is 25.0 and for sigma-x and sigma-y it is 0.20; usually all 6 parameters of the affine transformation are used and the max value for a_0 and b_0 is set to 4.0. A post-processing computation checks the reliability of the matched points by computing the cross-correlation coefficient between consecutive frames. The default threshold value is 0.75, but it can be decreased for low resolution images. In the next pages some results of the LSM tracking process are shown.

Fig.6.1.1: Points selected on the image
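A hypothetical rendering of such a parameter file as a Python dictionary; the sigma and translation limits are the defaults quoted above, while the window sizes and step are illustrative assumptions:

```python
# Hypothetical rendering of the LSM tracker parameter file; values
# marked "assumed" are illustrative, not taken from the original file.
lsm_params = {
    "use_affine_params": [1, 1, 1, 1, 1, 1],  # all 6 affine parameters used
    "max_sigma0": 25.0,          # max a posteriori sigma of the matching
    "max_sigma_xy": 0.20,        # max sigma of the translation parameters
    "max_translation": 4.0,      # max value for a_0 and b_0 (pixels)
    "lsm_patch_size": 21,        # template/search patch for LSM (assumed)
    "cc_window_first": 31,       # cross-correlation window, first pair (assumed)
    "cc_window_next": 21,        # cross-correlation window, later frames (assumed)
    "cc_step": 1,                # step of the cross-correlation scan (assumed)
    "cc_window_retry": 41,       # enlarged window when LSM fails (assumed)
}
```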

The first sequence has been acquired from a VHS tape and has very low resolution; the camera is panning, following the walking man, and 10 frames were available. 14 points were selected in the first frame; at the end of the process, 10 points had been tracked over the whole sequence (fig.6.1.2). The average cross-correlation coefficient of all the points is …. The LSM algorithm worked with a sigma_0 of 30, while sigma-x and sigma-y were fixed to ….

Fig.6.1.2: Some frames of the sequence (frames nr.1, 5 and 9) with the tracked points

In the next sequence, consisting of 60 frames, 14 points were selected in correspondence of body joints: head, neck, shoulders, elbows, wrists, hips, knees and ankles. After 10 frames the points in correspondence of the elbows were lost, while all the other joints were tracked over the whole sequence (fig.6.1.3). The sigma-y was fixed to 0.30 because the guy was moving his arms in the vertical direction and the images have half resolution in the vertical direction, as only the odd lines are used.

Fig.6.1.3: Points tracked in a sequence of 60 frames (frames nr.1, 11, 30, 50 and 60)

Because of the presence of clothes, when the guy was moving his arms the folds of the sweater changed, so points selected in correspondence of big movements of the folds were not matched (or not well matched). The cross-correlation coefficient between tracked points in two consecutive frames was calculated and the results are summarized in Table 1. All the 12 points tracked over the sequence had a cross-correlation coefficient bigger than 0.9.

If the camera is still and stays approximately at the same distance from the subject, another control on the tracked points can be done by computing the differences of the distances between two points with fixed separation, namely foot-knee, neck-shoulder or neck-head. Fig.6.1.4 shows the computed differences of the distances in all the frames. There is just one big outlier (with a difference of 4 pixels), while all the other differences are in the interval [-2.4, +2.2] pixels, that is an average error of one pixel for every matched point. The big outlier can be due to the folds of the sweater on the wrist, as said before.

Table 1: Average cross-correlation coefficient of the tracked points. Pt1: wrist left; Pt2: shoulder left; Pt3: neck; Pt4: head; Pt5: shoulder right; Pt6: wrist right; Pt7: hip left; Pt8: hip right; Pt9: knee left; Pt10: knee right; Pt11: ankle left; Pt12: ankle right

Fig.6.1.4: Differences of the distances between some joints over the sequence of 60 frames (head-neck, foot-knee left and right, wrist-shoulder left and right; distances in pixels)

Fig.6.1.5: Visualization of the computed 2D skeleton of the human body (axes: X, Y, frames)

Fig.6.1.6: Cylindric reconstruction of the human skeleton from the 2D points computed with the LSM tracking

Another sequence is presented in fig.6.1.7.

Fig.6.1.7: Tracked frames with occlusion of some points (frames nr.2, 6, 12, 15 and 22)

This sequence is composed of 24 frames. 13 points were selected in correspondence of joints. A point on the left wrist was lost quite soon (after 9 frames) because of occlusion; the points on the left leg were also lost due to occlusion. When occlusions occur, a point can be wrongly matched; from the analysis of the cross-correlation results it is possible to remove the outlier and to track the point again after the occlusion. The points on the leg have been recovered after the occlusion using a cross-correlation process (fig.6.1.9). A template around the point in the last image where it is visible is used; the search area is taken from the image where the point reappears (the user must select both images). The point is found in correspondence of the center of the window with the biggest cross-correlation coefficient. Then the LSM algorithm can track the recovered points in the other frames (fig.6.1.8).

Fig.6.1.8: Some frames of the sequence with recovered points after occlusions

Fig.6.1.9: Cross-correlation procedure to recover a point lost because of occlusion

The mean cross-correlation coefficient of the points over the whole sequence is 0.88 and the differences of the distances between joints are in the interval [-2.5, +2.5] pixels. The graph of the differences of the distances is shown in fig.6.1.10.

Fig.6.1.10: Differences of the distances between some joints

A final visualization of the sequence with the reconstructed human skeleton is shown in fig.6.1.11.

Fig.6.1.11: Visualization (every 3 frames) of the skeleton built with the tracked points

The last sequence has been acquired from a VHS tape; the camera was moving, following the running man, and 9 frames were used. 12 points were selected in the first frame and 7 points were tracked over the whole sequence. The LSM sigma_0 was equal to 30, while the cross-correlation coefficient had an average of …. In fig.6.1.12 some frames of the sequence are presented with the stylized skeleton overlaid.

Fig.6.1.12: A low resolution sequence of 9 frames (frames nr.1, 5 and 9): the camera is moving following the running man. 7 points have been tracked in all the frames.

6.2 Shi-Tomasi-Kanade tracker

The core of the STK algorithm was already available on the web; a GUI to select and visualize the tracked points and a routine to run the process on a whole sequence have been added. Given two consecutive frames I(x,t) and J(x,t+1), the principal steps of the program are:
- compute the matrix T (or G) and the vector a (or e) of eq. (4.12): the image gradients in both windows (for fast convergence) are computed with a Gaussian kernel;
- compute the translation d (in the first few iterations) and the affine parameters (in the last iterations) such that the SSD difference of I(Ax+d) - J(x) is minimized (equation 4.11);
- re-warp J with sub-pixel 2D bilinear interpolation using the computed affine motion (see the sketch below);
- check the SSD error.

For every point, the algorithm computes n iterations and selects the affine motion parameters with the smallest SSD error. The algorithm is very time consuming.
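The re-warping step can be sketched as follows, assuming greyscale frames as numpy arrays; this is a reconstruction of the step described above, not the downloaded code:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def affine_warp_window(J, A, d, cx, cy, half=7):
    """Re-warp the window of J around (cx, cy) with the current affine
    motion (x' = A x + d) using sub-pixel bilinear interpolation."""
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xw = A[0, 0] * xx + A[0, 1] * yy + d[0] + cx   # warped x coordinates
    yw = A[1, 0] * xx + A[1, 1] * yy + d[1] + cy   # warped y coordinates
    return map_coordinates(J.astype(float), [yw, xw], order=1)

# After each iteration the SSD between the template window and this
# re-warped window is checked; the motion parameters with the smallest
# SSD error are kept.
```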

In the first sequence of 24 frames, all the points were tracked (recovering those lost to occlusions with the cross-correlation process previously described). The results are shown in fig.6.2.1.

Fig.6.2.1: Four frames of the sequence (frames nr.1, 9, 15 and 21). In red the points tracked between consecutive frames, in yellow the reconstructed human skeleton.

The cross-correlation coefficient ρ between consecutive frames has an average of …. The mean SSD (Sum of Squared Differences) error over the whole sequence was equal to …, while the differences of the distances between selected joints are in the interval [-3, +2.3] pixels (except for a big outlier of 4 pixels) (see fig.6.2.2).

Fig.6.2.2: Graph with the computed differences of the distances between selected joints. Only one big outlier is present, while the other values belong to the interval [-3, +2.3] pixels

In the successive sequence, 30 frames have been used to validate the algorithm. In the first frame (fig.6.2.3-a) 14 points were selected in correspondence of human joints; in the last frames (fig.6.2.3-d,e) 10 points were still tracked, while the others were lost due to a small cross-correlation coefficient and a big SSD.

Fig.6.2.3: Some frames of the sequence (frames nr.1, 5, 9, 20 and 30) with the points tracked with the STK algorithm. In yellow the reconstructed human skeleton

From the results shown in fig.6.2.3 we can see that the point on the lower left border of the sweater seems not to be correct in the last frames; but a movement of the sweater, following the lifting of the arms, is clearly visible. Nevertheless the cross-correlation coefficient of that point through the sequence is ….

With the sequences acquired from VHS tape, the STK algorithm did not give very good results; the selected points were tracked just for 2-3 frames with reliable precision and then were lost or mismatched. The STK algorithm needs very good features (in particular in case of movements) and very good texture around the point that must be tracked.

6.3 Detection and tracking of moving objects

The detection and tracking of moving objects has been tested on two sequences in which two people were walking. The program can work with a sequence of n frames and gives as output the images with the different moving objects in colored boxes. The first sequence (100 frames) shows motions that are roughly on a linear path. The trajectories are linear and parallel to the camera plane, and there are occlusions when the two men pass directly in front of each other. In the results (fig.6.3.1) there are two color-coded boxes, one for each tracked object.

Fig.6.3.1: Results of moving people detection (frames nr.9, 45, 47 and 71): tracking before, during and after occlusions

In fig.6.3.1 the first column shows the projections of the pixels along the vertical and horizontal axes; it is easy to divide the vertical projection into its components when there are no occlusions, but when they occur it can be difficult to distinguish the two parts (peaks) of the projection. To avoid this problem, the center of gravity of the projections is computed and is used to assign the bounding box to the correct object. The middle column of fig.6.3.1 shows the computed bounding boxes projected onto the image differences, while the last column presents the projections of the boxes onto the original images. Occlusions are visible in the second and third rows: the bounding boxes are not very precise, because there is overlap between the vertical projections and the limits of the boxes are based only on these projections. More sophisticated computations, such as temporal analysis or trajectory prediction, could be implemented.

The moving foreground regions can be visualized with background subtraction. This part of the process is not automatic, but it could be if a model of the empty scene were available [17]: once the bounding boxes have been computed, an image where the area inside the boxes is just background is selected. Then a subtraction between the two windows is performed and the moving foreground can be reconstructed with a few processes of erosion and dilation (fig.6.3.2).

Fig.6.3.2: Foreground moving regions detected by background subtraction (frames nr.9, 45 and 71)

In the second sequence (fig.6.3.3), 50 frames were available; two people were walking towards the stationary camera and their trajectories were not perpendicular to the camera.

Fig.6.3.3: Bounding boxes of two moving people walking towards the camera (frames nr.5, 25 and 45)

The computed bounding boxes depend on the sliced projections, and the size of the slices can be adapted to the specific conditions of the frames. The projections depend on the image differences; therefore in some frames small movements of the humans (e.g. the feet) are not included in the boxes.

Fig.6.3.4: Foreground regions detected by background subtraction (frame nr.45)

6.4 Object tracking

To complete the tracking procedure, once a few points of an object have been tracked over a sequence, it is possible to extract and visualize the whole moving object by establishing many correspondences in some images, starting from the few tracked points. A cloud of 2D points is obtained and visualized by displaying the matched grey values of the image. Using the sequence of fig.3.1, groups of three frames have been created and the middle frame has been used as template image. The seed points have been tracked with the LSM tracker over the whole sequence (fig.6.4.1, upper row) and then used to establish the correspondences. A cloud of points was computed in every frame and then projected onto the image (fig.6.4.1, second row). In fig.6.4.2 all the matched grey values of the triplet are displayed.

Fig.6.4.1: Computed correspondences in a triplet of images

Fig.6.4.2: Object extraction from the computed 2D correspondences in a triplet of images.

The central image of fig.6.4.2 is the template image: the number of matched correspondences is bigger than in the search images, where many more holes are present due to non-analyzed areas. The gaps can be due to poor texture, low contrast of the area, or wrong matching. In figure 6.4.3 some b/w and color results of the sequence are shown.

Fig.6.4.3: Central frames of the triplets (frames nr.2, 14 and 23): visualized matched points representing the computed 2D correspondences extracted from b/w images (upper rows). Tracked object in color images (bottom row).

A problem encountered in object tracking is the texture: even if the image has high resolution, the matching process does not work with low texture, giving big holes located in those regions where the texture of the subject is uniform (the central part of the trousers or of the sweater). Some indicators that evaluate the quality of the results are shown in Table 2 as an average over the whole sequence:

Table 2: Some indicators to evaluate the quality of the process
mean sigma_0 (std. dev.): …
mean sigma-x (std. dev.): …
mean sigma-y (std. dev.): …

In the second sequence, consisting of 10 frames, the process worked quite well but many gaps occurred in the results (fig.6.4.4). The seed points used for the measurement have been computed using the LSM tracker; there were 18 points in the first triplet, but in the successive frames the number decreased because of incorrect matching or occlusions; therefore in the next triplets some seed points have been added. The holes in the tracked object are bigger than in the other sequences because of the low resolution of the images (fig.6.4.5).

Fig.6.4.4: A triplet of the sequence: in the middle column the template image with the matched grey values, at the borders the correspondences found in the search images.

In table 3 the indicators of the process are presented.

Table 3: Some indicators to evaluate the quality of the matching (average of all the 9 frames)
mean sigma_0 (std. dev.): …
mean sigma-x (std. dev.): …
mean sigma-y (std. dev.): …

Fig.6.4.5: Central templates of the next three triplets of the sequence: big lacks of texture on the tracked object are visible because of the few seed points and the correspondences not found

In fig.6.4.6 and 6.4.7 other triplets are shown. In this sequence the tracked person was moving only his arms: in the first experiments, only 14 points were selected as seed points.

Fig.6.4.6: Triplets of a sequence: template image with 16 seed points (upper row). Central template image and search images at the borders, with the computed 2D clouds of correspondences (lower row)

But big holes occurred because of unmatched points (high sigma_0) in regions of uniform texture. It was necessary to add two more points on the torso of the man in order to extract the whole body (fig.6.4.6, 6.4.7). Also here the matching algorithm failed in regions with low contrast or

homogeneous texture (fig.6.4.7, 6.4.8), as homologous points cannot be assigned reliably or corresponding points cannot be found at all in the images.

Fig.6.4.7: Object extraction from the triplet of images. In order: first search image (frame t-1), template image (frame t), second search image (frame t+1)

Fig.6.4.8: Central template image of the next three triplets with the found correspondences.

7. Conclusions

An overview of some methods for human movement detection and tracking in image space has been presented. Two algorithms that track points in image sequences have been used: the first is based on classic photogrammetric least squares matching, the other one is based on a model of affine image changes proposed by Shi, Tomasi and Kanade and available on the net. Both algorithms have been tested on different sequences and the best results came from the LSM tracking. This algorithm can work with longer sequences, with higher precision, and is more reliable than the other one; moreover the LSM tracker can also work with low texture images and, if no occlusions occur, no big outliers are present. On the other hand, the STK algorithm needs very good texture around the points to track, and an efficient outlier rejection scheme too; this algorithm is a very good tracker for indoor sequences full of features (corners) with high texture, but it is very time consuming.

The object tracking algorithm produced nice results when the images had good and non-uniform texture and the seed points were well spread over the object to measure; in low resolution images many holes occurred in the results. It can be considered a process for object extraction based on tracked points and image matching. The detection algorithm is an automatic process to determine the bounding boxes of moving people in a sequence of frames; it is a very simple implementation, but it can work with long sequences and avoid problems of occlusions. The precision of the boxes depends on the projections of the pixels and their slices; therefore the choice of the threshold value used to compute the image difference was very important.

8. Future works

1. The LSM tracker must be improved in the rejection of outliers. The cross-correlation process should be integrated in the main algorithm to reject the mismatched points in real time and not in post-processing.
2. A more accurate and refined process to detect and track objects in case of occlusions should be added. Occlusions can be predicted and avoided with sophisticated algorithms, while foreground extraction can be performed with a better background subtraction technique.
3. A camera model could be defined to reconstruct the 3D world from the image coordinates extracted with the tracking process.
4. The object tracking algorithm can be improved by adding neighborhood information in the matching process to close the gaps occurring in the results.


More information

Matching. Compare region of image to region of image. Today, simplest kind of matching. Intensities similar.

Matching. Compare region of image to region of image. Today, simplest kind of matching. Intensities similar. Matching Compare region of image to region of image. We talked about this for stereo. Important for motion. Epipolar constraint unknown. But motion small. Recognition Find object in image. Recognize object.

More information

Dense Image-based Motion Estimation Algorithms & Optical Flow

Dense Image-based Motion Estimation Algorithms & Optical Flow Dense mage-based Motion Estimation Algorithms & Optical Flow Video A video is a sequence of frames captured at different times The video data is a function of v time (t) v space (x,y) ntroduction to motion

More information

Lecture 16: Computer Vision

Lecture 16: Computer Vision CS4442/9542b: Artificial Intelligence II Prof. Olga Veksler Lecture 16: Computer Vision Motion Slides are from Steve Seitz (UW), David Jacobs (UMD) Outline Motion Estimation Motion Field Optical Flow Field

More information

Lecture 16: Computer Vision

Lecture 16: Computer Vision CS442/542b: Artificial ntelligence Prof. Olga Veksler Lecture 16: Computer Vision Motion Slides are from Steve Seitz (UW), David Jacobs (UMD) Outline Motion Estimation Motion Field Optical Flow Field Methods

More information

Stereo Vision. MAN-522 Computer Vision

Stereo Vision. MAN-522 Computer Vision Stereo Vision MAN-522 Computer Vision What is the goal of stereo vision? The recovery of the 3D structure of a scene using two or more images of the 3D scene, each acquired from a different viewpoint in

More information

Detecting and Identifying Moving Objects in Real-Time

Detecting and Identifying Moving Objects in Real-Time Chapter 9 Detecting and Identifying Moving Objects in Real-Time For surveillance applications or for human-computer interaction, the automated real-time tracking of moving objects in images from a stationary

More information

Face Tracking : An implementation of the Kanade-Lucas-Tomasi Tracking algorithm

Face Tracking : An implementation of the Kanade-Lucas-Tomasi Tracking algorithm Face Tracking : An implementation of the Kanade-Lucas-Tomasi Tracking algorithm Dirk W. Wagener, Ben Herbst Department of Applied Mathematics, University of Stellenbosch, Private Bag X1, Matieland 762,

More information

Robot vision review. Martin Jagersand

Robot vision review. Martin Jagersand Robot vision review Martin Jagersand What is Computer Vision? Computer Graphics Three Related fields Image Processing: Changes 2D images into other 2D images Computer Graphics: Takes 3D models, renders

More information

(Refer Slide Time 00:17) Welcome to the course on Digital Image Processing. (Refer Slide Time 00:22)

(Refer Slide Time 00:17) Welcome to the course on Digital Image Processing. (Refer Slide Time 00:22) Digital Image Processing Prof. P. K. Biswas Department of Electronics and Electrical Communications Engineering Indian Institute of Technology, Kharagpur Module Number 01 Lecture Number 02 Application

More information

C E N T E R A T H O U S T O N S C H O O L of H E A L T H I N F O R M A T I O N S C I E N C E S. Image Operations II

C E N T E R A T H O U S T O N S C H O O L of H E A L T H I N F O R M A T I O N S C I E N C E S. Image Operations II T H E U N I V E R S I T Y of T E X A S H E A L T H S C I E N C E C E N T E R A T H O U S T O N S C H O O L of H E A L T H I N F O R M A T I O N S C I E N C E S Image Operations II For students of HI 5323

More information

CS 4495 Computer Vision Motion and Optic Flow

CS 4495 Computer Vision Motion and Optic Flow CS 4495 Computer Vision Aaron Bobick School of Interactive Computing Administrivia PS4 is out, due Sunday Oct 27 th. All relevant lectures posted Details about Problem Set: You may *not* use built in Harris

More information

Segmentation and Tracking of Partial Planar Templates

Segmentation and Tracking of Partial Planar Templates Segmentation and Tracking of Partial Planar Templates Abdelsalam Masoud William Hoff Colorado School of Mines Colorado School of Mines Golden, CO 800 Golden, CO 800 amasoud@mines.edu whoff@mines.edu Abstract

More information

Comparison between Motion Analysis and Stereo

Comparison between Motion Analysis and Stereo MOTION ESTIMATION The slides are from several sources through James Hays (Brown); Silvio Savarese (U. of Michigan); Octavia Camps (Northeastern); including their own slides. Comparison between Motion Analysis

More information

Motion Analysis. Motion analysis. Now we will talk about. Differential Motion Analysis. Motion analysis. Difference Pictures

Motion Analysis. Motion analysis. Now we will talk about. Differential Motion Analysis. Motion analysis. Difference Pictures Now we will talk about Motion Analysis Motion analysis Motion analysis is dealing with three main groups of motionrelated problems: Motion detection Moving object detection and location. Derivation of

More information

Motion and Tracking. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

Motion and Tracking. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE) Motion and Tracking Andrea Torsello DAIS Università Ca Foscari via Torino 155, 30172 Mestre (VE) Motion Segmentation Segment the video into multiple coherently moving objects Motion and Perceptual Organization

More information

EXAM SOLUTIONS. Image Processing and Computer Vision Course 2D1421 Monday, 13 th of March 2006,

EXAM SOLUTIONS. Image Processing and Computer Vision Course 2D1421 Monday, 13 th of March 2006, School of Computer Science and Communication, KTH Danica Kragic EXAM SOLUTIONS Image Processing and Computer Vision Course 2D1421 Monday, 13 th of March 2006, 14.00 19.00 Grade table 0-25 U 26-35 3 36-45

More information

Displacement estimation

Displacement estimation Displacement estimation Displacement estimation by block matching" l Search strategies" l Subpixel estimation" Gradient-based displacement estimation ( optical flow )" l Lukas-Kanade" l Multi-scale coarse-to-fine"

More information

Outdoor Scene Reconstruction from Multiple Image Sequences Captured by a Hand-held Video Camera

Outdoor Scene Reconstruction from Multiple Image Sequences Captured by a Hand-held Video Camera Outdoor Scene Reconstruction from Multiple Image Sequences Captured by a Hand-held Video Camera Tomokazu Sato, Masayuki Kanbara and Naokazu Yokoya Graduate School of Information Science, Nara Institute

More information

Visual Tracking (1) Feature Point Tracking and Block Matching

Visual Tracking (1) Feature Point Tracking and Block Matching Intelligent Control Systems Visual Tracking (1) Feature Point Tracking and Block Matching Shingo Kagami Graduate School of Information Sciences, Tohoku University swk(at)ic.is.tohoku.ac.jp http://www.ic.is.tohoku.ac.jp/ja/swk/

More information

Estimating Human Pose in Images. Navraj Singh December 11, 2009

Estimating Human Pose in Images. Navraj Singh December 11, 2009 Estimating Human Pose in Images Navraj Singh December 11, 2009 Introduction This project attempts to improve the performance of an existing method of estimating the pose of humans in still images. Tasks

More information

Problem definition Image acquisition Image segmentation Connected component analysis. Machine vision systems - 1

Problem definition Image acquisition Image segmentation Connected component analysis. Machine vision systems - 1 Machine vision systems Problem definition Image acquisition Image segmentation Connected component analysis Machine vision systems - 1 Problem definition Design a vision system to see a flat world Page

More information

UNIT-2 IMAGE REPRESENTATION IMAGE REPRESENTATION IMAGE SENSORS IMAGE SENSORS- FLEX CIRCUIT ASSEMBLY

UNIT-2 IMAGE REPRESENTATION IMAGE REPRESENTATION IMAGE SENSORS IMAGE SENSORS- FLEX CIRCUIT ASSEMBLY 18-08-2016 UNIT-2 In the following slides we will consider what is involved in capturing a digital image of a real-world scene Image sensing and representation Image Acquisition Sampling and quantisation

More information

Edge and local feature detection - 2. Importance of edge detection in computer vision

Edge and local feature detection - 2. Importance of edge detection in computer vision Edge and local feature detection Gradient based edge detection Edge detection by function fitting Second derivative edge detectors Edge linking and the construction of the chain graph Edge and local feature

More information

CS4442/9542b Artificial Intelligence II prof. Olga Veksler

CS4442/9542b Artificial Intelligence II prof. Olga Veksler CS4442/9542b Artificial Intelligence II prof. Olga Veksler Lecture 8 Computer Vision Introduction, Filtering Some slides from: D. Jacobs, D. Lowe, S. Seitz, A.Efros, X. Li, R. Fergus, J. Hayes, S. Lazebnik,

More information

Interpolation is a basic tool used extensively in tasks such as zooming, shrinking, rotating, and geometric corrections.

Interpolation is a basic tool used extensively in tasks such as zooming, shrinking, rotating, and geometric corrections. Image Interpolation 48 Interpolation is a basic tool used extensively in tasks such as zooming, shrinking, rotating, and geometric corrections. Fundamentally, interpolation is the process of using known

More information

CS4442/9542b Artificial Intelligence II prof. Olga Veksler

CS4442/9542b Artificial Intelligence II prof. Olga Veksler CS4442/9542b Artificial Intelligence II prof. Olga Veksler Lecture 2 Computer Vision Introduction, Filtering Some slides from: D. Jacobs, D. Lowe, S. Seitz, A.Efros, X. Li, R. Fergus, J. Hayes, S. Lazebnik,

More information

Autonomous Navigation for Flying Robots

Autonomous Navigation for Flying Robots Computer Vision Group Prof. Daniel Cremers Autonomous Navigation for Flying Robots Lecture 7.1: 2D Motion Estimation in Images Jürgen Sturm Technische Universität München 3D to 2D Perspective Projections

More information

Motion Estimation for Video Coding Standards

Motion Estimation for Video Coding Standards Motion Estimation for Video Coding Standards Prof. Ja-Ling Wu Department of Computer Science and Information Engineering National Taiwan University Introduction of Motion Estimation The goal of video compression

More information

Computer Vision 2. SS 18 Dr. Benjamin Guthier Professur für Bildverarbeitung. Computer Vision 2 Dr. Benjamin Guthier

Computer Vision 2. SS 18 Dr. Benjamin Guthier Professur für Bildverarbeitung. Computer Vision 2 Dr. Benjamin Guthier Computer Vision 2 SS 18 Dr. Benjamin Guthier Professur für Bildverarbeitung Computer Vision 2 Dr. Benjamin Guthier 1. IMAGE PROCESSING Computer Vision 2 Dr. Benjamin Guthier Content of this Chapter Non-linear

More information

EE795: Computer Vision and Intelligent Systems

EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 11 140311 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Motion Analysis Motivation Differential Motion Optical

More information

Colour Segmentation-based Computation of Dense Optical Flow with Application to Video Object Segmentation

Colour Segmentation-based Computation of Dense Optical Flow with Application to Video Object Segmentation ÖGAI Journal 24/1 11 Colour Segmentation-based Computation of Dense Optical Flow with Application to Video Object Segmentation Michael Bleyer, Margrit Gelautz, Christoph Rhemann Vienna University of Technology

More information

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS Cognitive Robotics Original: David G. Lowe, 004 Summary: Coen van Leeuwen, s1460919 Abstract: This article presents a method to extract

More information

Low Cost Motion Capture

Low Cost Motion Capture Low Cost Motion Capture R. Budiman M. Bennamoun D.Q. Huynh School of Computer Science and Software Engineering The University of Western Australia Crawley WA 6009 AUSTRALIA Email: budimr01@tartarus.uwa.edu.au,

More information

1 (5 max) 2 (10 max) 3 (20 max) 4 (30 max) 5 (10 max) 6 (15 extra max) total (75 max + 15 extra)

1 (5 max) 2 (10 max) 3 (20 max) 4 (30 max) 5 (10 max) 6 (15 extra max) total (75 max + 15 extra) Mierm Exam CS223b Stanford CS223b Computer Vision, Winter 2004 Feb. 18, 2004 Full Name: Email: This exam has 7 pages. Make sure your exam is not missing any sheets, and write your name on every page. The

More information

Dynamic Time Warping for Binocular Hand Tracking and Reconstruction

Dynamic Time Warping for Binocular Hand Tracking and Reconstruction Dynamic Time Warping for Binocular Hand Tracking and Reconstruction Javier Romero, Danica Kragic Ville Kyrki Antonis Argyros CAS-CVAP-CSC Dept. of Information Technology Institute of Computer Science KTH,

More information

Computer Vision I. Announcements. Fourier Tansform. Efficient Implementation. Edge and Corner Detection. CSE252A Lecture 13.

Computer Vision I. Announcements. Fourier Tansform. Efficient Implementation. Edge and Corner Detection. CSE252A Lecture 13. Announcements Edge and Corner Detection HW3 assigned CSE252A Lecture 13 Efficient Implementation Both, the Box filter and the Gaussian filter are separable: First convolve each row of input image I with

More information

Visual Tracking (1) Tracking of Feature Points and Planar Rigid Objects

Visual Tracking (1) Tracking of Feature Points and Planar Rigid Objects Intelligent Control Systems Visual Tracking (1) Tracking of Feature Points and Planar Rigid Objects Shingo Kagami Graduate School of Information Sciences, Tohoku University swk(at)ic.is.tohoku.ac.jp http://www.ic.is.tohoku.ac.jp/ja/swk/

More information

Anno accademico 2006/2007. Davide Migliore

Anno accademico 2006/2007. Davide Migliore Robotica Anno accademico 6/7 Davide Migliore migliore@elet.polimi.it Today What is a feature? Some useful information The world of features: Detectors Edges detection Corners/Points detection Descriptors?!?!?

More information

Structure from Motion. Prof. Marco Marcon

Structure from Motion. Prof. Marco Marcon Structure from Motion Prof. Marco Marcon Summing-up 2 Stereo is the most powerful clue for determining the structure of a scene Another important clue is the relative motion between the scene and (mono)

More information

Multi-stable Perception. Necker Cube

Multi-stable Perception. Necker Cube Multi-stable Perception Necker Cube Spinning dancer illusion, Nobuyuki Kayahara Multiple view geometry Stereo vision Epipolar geometry Lowe Hartley and Zisserman Depth map extraction Essential matrix

More information

Ruch (Motion) Rozpoznawanie Obrazów Krzysztof Krawiec Instytut Informatyki, Politechnika Poznańska. Krzysztof Krawiec IDSS

Ruch (Motion) Rozpoznawanie Obrazów Krzysztof Krawiec Instytut Informatyki, Politechnika Poznańska. Krzysztof Krawiec IDSS Ruch (Motion) Rozpoznawanie Obrazów Krzysztof Krawiec Instytut Informatyki, Politechnika Poznańska 1 Krzysztof Krawiec IDSS 2 The importance of visual motion Adds entirely new (temporal) dimension to visual

More information

Massachusetts Institute of Technology Department of Computer Science and Electrical Engineering 6.801/6.866 Machine Vision QUIZ II

Massachusetts Institute of Technology Department of Computer Science and Electrical Engineering 6.801/6.866 Machine Vision QUIZ II Massachusetts Institute of Technology Department of Computer Science and Electrical Engineering 6.801/6.866 Machine Vision QUIZ II Handed out: 001 Nov. 30th Due on: 001 Dec. 10th Problem 1: (a (b Interior

More information

Local Feature Detectors

Local Feature Detectors Local Feature Detectors Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Slides adapted from Cordelia Schmid and David Lowe, CVPR 2003 Tutorial, Matthew Brown,

More information

Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong)

Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong) Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong) References: [1] http://homepages.inf.ed.ac.uk/rbf/hipr2/index.htm [2] http://www.cs.wisc.edu/~dyer/cs540/notes/vision.html

More information

CS664 Lecture #18: Motion

CS664 Lecture #18: Motion CS664 Lecture #18: Motion Announcements Most paper choices were fine Please be sure to email me for approval, if you haven t already This is intended to help you, especially with the final project Use

More information

A Vision System for Automatic State Determination of Grid Based Board Games

A Vision System for Automatic State Determination of Grid Based Board Games A Vision System for Automatic State Determination of Grid Based Board Games Michael Bryson Computer Science and Engineering, University of South Carolina, 29208 Abstract. Numerous programs have been written

More information

EE795: Computer Vision and Intelligent Systems

EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 WRI C225 Lecture 04 130131 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Histogram Equalization Image Filtering Linear

More information

Computer Vision Lecture 20

Computer Vision Lecture 20 Computer Perceptual Vision and Sensory WS 16/17 Augmented Computing Computer Perceptual Vision and Sensory WS 16/17 Augmented Computing Computer Perceptual Vision and Sensory WS 16/17 Augmented Computing

More information

Product information. Hi-Tech Electronics Pte Ltd

Product information. Hi-Tech Electronics Pte Ltd Product information Introduction TEMA Motion is the world leading software for advanced motion analysis. Starting with digital image sequences the operator uses TEMA Motion to track objects in images,

More information

Computer Vision Lecture 20

Computer Vision Lecture 20 Computer Perceptual Vision and Sensory WS 16/76 Augmented Computing Many slides adapted from K. Grauman, S. Seitz, R. Szeliski, M. Pollefeys, S. Lazebnik Computer Vision Lecture 20 Motion and Optical Flow

More information

Towards the completion of assignment 1

Towards the completion of assignment 1 Towards the completion of assignment 1 What to do for calibration What to do for point matching What to do for tracking What to do for GUI COMPSCI 773 Feature Point Detection Why study feature point detection?

More information

Image Processing Fundamentals. Nicolas Vazquez Principal Software Engineer National Instruments

Image Processing Fundamentals. Nicolas Vazquez Principal Software Engineer National Instruments Image Processing Fundamentals Nicolas Vazquez Principal Software Engineer National Instruments Agenda Objectives and Motivations Enhancing Images Checking for Presence Locating Parts Measuring Features

More information

Optical flow and tracking

Optical flow and tracking EECS 442 Computer vision Optical flow and tracking Intro Optical flow and feature tracking Lucas-Kanade algorithm Motion segmentation Segments of this lectures are courtesy of Profs S. Lazebnik S. Seitz,

More information

CS4733 Class Notes, Computer Vision

CS4733 Class Notes, Computer Vision CS4733 Class Notes, Computer Vision Sources for online computer vision tutorials and demos - http://www.dai.ed.ac.uk/hipr and Computer Vision resources online - http://www.dai.ed.ac.uk/cvonline Vision

More information

COMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION

COMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION COMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION Mr.V.SRINIVASA RAO 1 Prof.A.SATYA KALYAN 2 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING PRASAD V POTLURI SIDDHARTHA

More information

Introduction to behavior-recognition and object tracking

Introduction to behavior-recognition and object tracking Introduction to behavior-recognition and object tracking Xuan Mo ipal Group Meeting April 22, 2011 Outline Motivation of Behavior-recognition Four general groups of behaviors Core technologies Future direction

More information

Automatic Generation of Animatable 3D Personalized Model Based on Multi-view Images

Automatic Generation of Animatable 3D Personalized Model Based on Multi-view Images Automatic Generation of Animatable 3D Personalized Model Based on Multi-view Images Seong-Jae Lim, Ho-Won Kim, Jin Sung Choi CG Team, Contents Division ETRI Daejeon, South Korea sjlim@etri.re.kr Bon-Ki

More information

Visual motion. Many slides adapted from S. Seitz, R. Szeliski, M. Pollefeys

Visual motion. Many slides adapted from S. Seitz, R. Szeliski, M. Pollefeys Visual motion Man slides adapted from S. Seitz, R. Szeliski, M. Pollefes Motion and perceptual organization Sometimes, motion is the onl cue Motion and perceptual organization Sometimes, motion is the

More information

Computer Vision I - Filtering and Feature detection

Computer Vision I - Filtering and Feature detection Computer Vision I - Filtering and Feature detection Carsten Rother 30/10/2015 Computer Vision I: Basics of Image Processing Roadmap: Basics of Digital Image Processing Computer Vision I: Basics of Image

More information

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT SIFT: Scale Invariant Feature Transform; transform image

More information

Model-Based Human Motion Capture from Monocular Video Sequences

Model-Based Human Motion Capture from Monocular Video Sequences Model-Based Human Motion Capture from Monocular Video Sequences Jihun Park 1, Sangho Park 2, and J.K. Aggarwal 2 1 Department of Computer Engineering Hongik University Seoul, Korea jhpark@hongik.ac.kr

More information

Introduction to Medical Imaging (5XSA0) Module 5

Introduction to Medical Imaging (5XSA0) Module 5 Introduction to Medical Imaging (5XSA0) Module 5 Segmentation Jungong Han, Dirk Farin, Sveta Zinger ( s.zinger@tue.nl ) 1 Outline Introduction Color Segmentation region-growing region-merging watershed

More information

Filtering Images. Contents

Filtering Images. Contents Image Processing and Data Visualization with MATLAB Filtering Images Hansrudi Noser June 8-9, 010 UZH, Multimedia and Robotics Summer School Noise Smoothing Filters Sigmoid Filters Gradient Filters Contents

More information

Complex Sensors: Cameras, Visual Sensing. The Robotics Primer (Ch. 9) ECE 497: Introduction to Mobile Robotics -Visual Sensors

Complex Sensors: Cameras, Visual Sensing. The Robotics Primer (Ch. 9) ECE 497: Introduction to Mobile Robotics -Visual Sensors Complex Sensors: Cameras, Visual Sensing The Robotics Primer (Ch. 9) Bring your laptop and robot everyday DO NOT unplug the network cables from the desktop computers or the walls Tuesday s Quiz is on Visual

More information

BIL Computer Vision Apr 16, 2014

BIL Computer Vision Apr 16, 2014 BIL 719 - Computer Vision Apr 16, 2014 Binocular Stereo (cont d.), Structure from Motion Aykut Erdem Dept. of Computer Engineering Hacettepe University Slide credit: S. Lazebnik Basic stereo matching algorithm

More information

Basic relations between pixels (Chapter 2)

Basic relations between pixels (Chapter 2) Basic relations between pixels (Chapter 2) Lecture 3 Basic Relationships Between Pixels Definitions: f(x,y): digital image Pixels: q, p (p,q f) A subset of pixels of f(x,y): S A typology of relations:

More information

Lecture 4: Spatial Domain Transformations

Lecture 4: Spatial Domain Transformations # Lecture 4: Spatial Domain Transformations Saad J Bedros sbedros@umn.edu Reminder 2 nd Quiz on the manipulator Part is this Fri, April 7 205, :5 AM to :0 PM Open Book, Open Notes, Focus on the material

More information

Dense 3-D Reconstruction of an Outdoor Scene by Hundreds-baseline Stereo Using a Hand-held Video Camera

Dense 3-D Reconstruction of an Outdoor Scene by Hundreds-baseline Stereo Using a Hand-held Video Camera Dense 3-D Reconstruction of an Outdoor Scene by Hundreds-baseline Stereo Using a Hand-held Video Camera Tomokazu Satoy, Masayuki Kanbaray, Naokazu Yokoyay and Haruo Takemuraz ygraduate School of Information

More information

Real-Time Scene Reconstruction. Remington Gong Benjamin Harris Iuri Prilepov

Real-Time Scene Reconstruction. Remington Gong Benjamin Harris Iuri Prilepov Real-Time Scene Reconstruction Remington Gong Benjamin Harris Iuri Prilepov June 10, 2010 Abstract This report discusses the implementation of a real-time system for scene reconstruction. Algorithms for

More information

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm Group 1: Mina A. Makar Stanford University mamakar@stanford.edu Abstract In this report, we investigate the application of the Scale-Invariant

More information

Dense 3D Reconstruction. Christiano Gava

Dense 3D Reconstruction. Christiano Gava Dense 3D Reconstruction Christiano Gava christiano.gava@dfki.de Outline Previous lecture: structure and motion II Structure and motion loop Triangulation Today: dense 3D reconstruction The matching problem

More information

Using temporal seeding to constrain the disparity search range in stereo matching

Using temporal seeding to constrain the disparity search range in stereo matching Using temporal seeding to constrain the disparity search range in stereo matching Thulani Ndhlovu Mobile Intelligent Autonomous Systems CSIR South Africa Email: tndhlovu@csir.co.za Fred Nicolls Department

More information

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching Stereo Matching Fundamental matrix Let p be a point in left image, p in right image l l Epipolar relation p maps to epipolar line l p maps to epipolar line l p p Epipolar mapping described by a 3x3 matrix

More information

COMPUTER AND ROBOT VISION

COMPUTER AND ROBOT VISION VOLUME COMPUTER AND ROBOT VISION Robert M. Haralick University of Washington Linda G. Shapiro University of Washington T V ADDISON-WESLEY PUBLISHING COMPANY Reading, Massachusetts Menlo Park, California

More information