Using dynamic Bayesian network for scene modeling and anomaly detection


SIViP (2010) 4:1–10
ORIGINAL PAPER

Imran N. Junejo

Received: 7 July 2008 / Revised: 4 December 2008 / Accepted: 5 December 2008 / Published online: 8 January 2009
© Springer-Verlag London Limited 2008

I. N. Junejo (B)
INRIA-Rennes, Campus Universitaire de Beaulieu, Rennes Cedex, France
e-mail: ijunejo@cs.ucf.edu

Abstract In this paper, we address the problem of scene modeling for performing video surveillance. The problem consists of using the trajectories, obtained by observing objects in a scene, to construct a scene model that can be used to distinguish a normal and acceptable behavior from an atypical one. In this regard, the proposed method is divided into a training phase and a testing phase. During the training phase, the input trajectories are used to identify different paths or routes commonly taken by the objects in a scene. Important discriminative features are then extracted from these identified paths to learn a dynamic Bayesian network (DBN). During the testing phase, the learned network is used to classify the incoming trajectories based on their size, location, speed, acceleration, and spatio-temporal curvature characteristics. The proposed method (i) handles trajectories of varying lengths, (ii) automatically detects the number of paths present in a scene, and (iii) introduces a novel usage of the DBN, which is very intuitive and accurately captures the dynamics of the scene. We show results on four datasets of varying lengths, demonstrating both path clustering and anomalous behavior detection.

Keywords Path modeling · Dynamic Bayesian network · Video surveillance

1 Introduction

Video surveillance has brought computer vision into the limelight in recent years. This is primarily due to increased security concerns, but also because the technology has advanced to a stage where we no longer have to wait for hours to get results from a number-crunching machine. The task of processing a video sequence, performing background subtraction and detecting foreground objects, in addition to performing other higher-level event detection tasks, can now be done in real time. One such higher-level task is building a scene model¹ for performing video surveillance. Once an acceptable or normal object behavior is obtained from a scene, the path modeling problem basically involves building a system that is able to learn routes or paths most commonly taken by objects in that scene. Based on this learned model, we aim to classify an incoming or test behavior as conforming to our model or not. For example, consider the problem of monitoring an area of interest, e.g. a building entrance, a parking lot, a port facility, an embassy, or an airport lobby, using stationary cameras. Our goal in these scenarios would be to model the behavior of objects of interest, e.g. cars or pedestrians, with the intent to correctly detect any unusual movement or activity when it occurs. Although the solution we propose is general, the objects of interest addressed in this paper are pedestrians. As objects tend to follow well-established lines of travel while entering or exiting a scene, due to the presence of benches, trees, etc., we contend that it is critical to identify these different areas which receive extra attention from pedestrians. We refer to this process of segmenting different lines of travel as determining the paths in the model. A path or route can be defined as any established line of travel or access [1]. This is the region that is most used by the objects.
¹ We shall use the terms path modeling and scene modeling interchangeably.

A trajectory can be defined as a path followed

by an object moving through the space. An example is shown in Fig. 1, where a person walking on a paved path is tracked by a system and correctly assigned a unique identifier. However, the definition of an unusual behavior might be different for different applications. People waiting several minutes by a conveyer belt at the airport may be considered acceptable, while the same may not be true for a person standing outside a bank. Similarly, a person running on a sidewalk may be an acceptable behavior for a certain application but may not be suitable at an airport lobby. This difference of context is mediated by dividing the process into a training phase and a testing phase. Thus, for any application, the training phase consists of acceptable behaviors that objects may demonstrate in a scene, obtained in terms of their tracked trajectories. Any test trajectory having characteristics different from the scene model built in the training phase is discarded as abnormal, flagging an alert. It has recently been argued by Junejo and Foroosh [2] that a camera needs to be calibrated for performing path modeling. This is primarily due to the effects of perspective projection: objects tend to grow larger as they approach the camera and smaller while moving away from it, making it difficult to characterize objects in terms of their sizes and motions. However, we contend that this perspective effect is also a characteristic of the scene, depending on both the intrinsic and extrinsic parameters of the camera as well as on the locations and orientations of the paths present in the scene. We also make a novel usage of an object's size by using its bounding box information. This gives us prior knowledge about the expected size of an object at any location, allowing us to reject, for example, the presence of a car where a pedestrian was expected.
In this paper, we provide a novel method to perform path modeling for video surveillance. Our main contributions are: (i) a simple and intuitive method to segment the trajectories obtained during the training phase into spatially different paths by the application of eigendecomposition, (ii) the extraction of useful novel features from each trajectory present in a detected path, characterizing the location, velocity, acceleration, spatio-temporal curvature and size of the observed pedestrians, and (iii) a novel usage of the dynamic Bayesian network (DBN) to learn each of the unique paths detected in the scene so that normal behavior can be distinguished from abnormal behavior. The rest of the paper is organized as follows: a brief introduction to the related work is given next. The process of segmenting input trajectories into different paths is described in Sect. 2. Novel features are extracted from these segmented paths to learn a path model for each of the detected paths using the DBN, as described in Sect. 3. We show promising results in Sect. 4 before concluding.

1.1 Related work

It is beyond the scope of the current work to summarize the existing work on video surveillance; we therefore refer the reader to a recent survey [3]. Similarly, commonly used distance measures and their comparisons can be found in [4–7]. For video surveillance, Grimson et al. [8] use a distributed system of cameras to cover a scene, and employ an adaptive tracker to detect moving objects. A set of parameters is recorded for each detected object, e.g. position, direction of motion, velocity, size, and aspect ratio of each connected region. Tracked patterns (e.g. the aspect ratio of a tracked object) are used to classify objects or actions. Tracks are clustered using spatial features based on the vector quantization approach. Once these clusters are obtained, unusual activities are detected by matching incoming trajectories to these clusters.
Thus, unusual activities are outliers in the clustered distributions. Boyd et al. [9] demonstrate the use of network tomography for statistical tracking of activities in a video sequence. The method estimates the number of trips made from one region to another based on the inter-region boundary traffic counts accumulated over time. It does not track an object through the scene but only logs the event when an object crosses a boundary. The method only determines the mean traffic intensities based on the calculated statistics, and no information is given about trajectories. Johnson et al. [10] use a neural network to model the trajectory distribution for event recognition and prediction. Piciarelli et al. [11] cluster trajectories based on normalized Euclidean distances, Khalid and Naftel [12] employ a Mahalanobis classifier for the detection of anomalous trajectories, and Calderara et al. [13] use a mixture of von Mises distributions for abnormal behavior detection. However, in terms of path modeling, the most relevant work is that of Makris and Ellis [1,14], who develop a spatial model to represent routes in an image. Once the trajectory of a moving object is obtained, it is matched with routes already existing in a database using a simple distance measure. If a match is found, the existing route is updated by a weight update function; otherwise a new route is created for this new trajectory with some initial weight. Spatially proximal routes are merged together and a graph representation of the scene is generated. One limitation of this approach is that only spatial information is used for trajectory clustering and behavior recognition. The system cannot distinguish between a person walking and a person lingering around, or between a person running and walking, since their model uses only spatial measurements. There also does not exist any stopping criterion for the merging of routes.
Following the work of Makris and Ellis [1], Junejo and Foroosh [15] recently proposed a method that overcomes some of its limitations, in addition

to calibrating the camera by observing the pedestrians. Trajectories are clustered into paths by performing Normalized Cuts [16]. By applying dynamic time warping (DTW), a path envelope is built, representing the spatial extent of each path. Features are then extracted from these paths and each path is divided into segments. The mean and standard deviation of these features are computed to create a Gaussian representation of each path. The Mahalanobis distance is used to check the conformity of a test trajectory with the created model. However, they argue that a camera needs to be calibrated for performing this task, whereas we show that this is not necessary, as perspective effects also represent a characteristic of the scene. We show results on their dataset and obtain comparable results. Moreover, they adopt the tedious approach of constructing a path envelope and dividing each path into segments. We overcome this drawback with the help of techniques from machine learning. Recently, Wright and Pless [17] used the 3D structure tensor for representing global patterns of local motion. Zhang et al. [18] built a generic rule induction framework based on trajectory series analysis for learning events in a scene. Jiang et al. [19] propose an HMM-based solution for event detection by performing dynamic hierarchical clustering. They only use the object positions to train the HMM, which severely limits the application of their method, as they are not able to account for the crookedness of, or the difference in velocities between, the training and the testing trajectories. Moreover, they train an HMM on each trajectory, which can be very time consuming. A hierarchical approach was also adopted by Fu et al. [20]. Wang et al. [21] describe an unsupervised framework, where the scene is segmented initially into people and vehicles. However, they only use size information for this initial yet crucial step.
This feature is very prone to perspective effects and generally leads to unstable results [2] (a person on a skateboard might appear to be of the same size as a car or a golf cart in the scene). Their proposed clustering criterion is similar to normalized cuts [16], using only the location and velocity features, which might not be able to distinguish between a running person and a person riding a bicycle. In addition to these features, we make use of the spatio-temporal curvature feature, which captures more accurately the dynamics of a pedestrian's motion. In contrast to the existing work, we propose a simple yet efficient solution that avoids the noise-prone task of camera calibration [2] and does not construct a path envelope or mean path [1], which can lead to serious model drift [1]. For this problem of path modeling, we introduce the novel usage of the DBN, which is able to account for the instantaneous dependencies between different time instances of a tracked object. Moreover, we extract features that can differentiate between people walking at different speeds, between people walking with different spatio-temporal characteristics (i.e. in a crooked zigzag path), and we also obtain prior information regarding the size of the object that should be visible at a particular location.

Fig. 1 People generally tend to follow established lines of travel. In this example, a person is walking straight on a paved pathway, and is assigned a unique identifier by the object tracker. For our method, instead of tracking the centroid of an object, we track the feet locations

2 Behavior class identification

The object tracker is able to correctly detect an object as it enters the scene and successfully tracks it until it exits. For our case of a single stationary camera, we use the tracker proposed by Javed and Shah [22].
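As the Fig. 1 caption notes, we track feet locations rather than object centroids. A minimal sketch of that computation (our own illustration, not the paper's code; boxes are assumed to be given as top-left corner plus width and height, with image y growing downward):

```python
# Sketch: convert a tracker bounding box (top-left corner, width,
# height) into the feet location: the midpoint of the bottom edge.

def feet_location(bbox):
    """bbox = (tx, ty, w, h). Returns the (x, y) feet point."""
    tx, ty, w, h = bbox
    return (tx + w / 2.0, ty + h)

# Per-frame observations combine the feet point with the box size:
boxes = [(100, 50, 20, 60), (104, 52, 20, 60)]
observations = [feet_location(b) + b[2:] for b in boxes]
```

The resulting (x, y, w, h) tuples are exactly the positional and size components used for learning in Sect. 3.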
With the unique label given by the tracker to a new object in the scene, we are able to correctly maintain a history of the object as it traverses the scene, as shown in Fig. 1. This history, in 2D image coordinates, can be represented as:

T_i = {l_1^i, l_2^i, ..., l_n^i}    (1)

where T_i is the trajectory of an object labeled i = 1, 2, ..., N, N is the total number of observed pedestrians throughout the whole video sequence, and l_t^i is the location of an observed object i at time instance t. People walk at different speeds, thus the length of the trajectory T_i will be different for different people. The vector l_t^i represents the bounding box of the detected object and contains (t_x, t_y, w, h), where (t_x, t_y) denote the x and y coordinates of the top-left corner of the bounding box, and (w, h) represent the width and the height of the bounding box, respectively. However, instead of using (t_x, t_y) for learning the path/scene model, we extract the feet location of the object at each time instance, denote it by (x, y), and use it along with the two quantities (w, h) for learning, as we discuss in Sect. 3. The feet location is computed simply as the midpoint of the bottom line of the bounding box.

2.1 Identifying paths from motion trajectories

Before we can learn the motion behavior of objects in the scene, we need to segment the different classes of object

Fig. 2 a Trajectories used for learning the scene model and for building the affinity matrix W. By successive application of the eigenvalue decomposition, different clusters, representing different paths in the scene, are obtained, as shown in b–e

motion. That is, we need to identify the number of paths S in the scene, and assign each trajectory its appropriate path label. In this regard, we aim to adopt an approach that is easy to implement, is able to distinguish between spatially dissimilar trajectories, and is flexible enough to accommodate different definitions of acceptable behavior. In order to perform this task, we need to measure the affinity between trajectories. We define the N × N symmetric affinity matrix [W_ij], where 1 ≤ i, j ≤ N, as:

        | w(1,1)  w(1,2)  ...  w(1,N) |
        |   *     w(2,2)  ...  w(2,N) |
    W = |   .       .     .      .    |
        |   *       *     ...  w(N,N) |

where * indicates symmetric values and N indicates the total number of trajectories in the training dataset. Each entry of this matrix is computed as:

w(i, j) = e^(−max{d(T_i, T_j), d(T_j, T_i)} / (2σ_w²))    (2)

where σ_w controls the decay of the similarities (set to σ_w = 2 for our experiments), w(i, i) = 0, and d(T_i, T_j) is a distance between the xy-coordinates of two trajectories T_i and T_j. Our task at this stage is to cluster the input trajectories into paths based only on their spatial similarity. In order to do so, we choose the Hausdorff distance to compute the distance d(T_i, T_j) in (2), defined as:

d(T_i, T_j) = max_{a ∈ T_i} min_{b ∈ T_j} ||a − b||    (3)

One advantage of using the Hausdorff distance is that trajectories of different lengths can be compared; it is also fairly efficient to compute. So far, we have computed the Hausdorff distance between all the trajectories T = {T_1, ..., T_i, ..., T_N} using Eq. 2 and have built the affinity matrix W. In order to segment the entire trajectory (i.e.
the training) dataset into different paths and to determine the total number of classes S in the scene, rather than looking at the first eigenvector of W, we look at its generalized eigenvectors [23]. Let d(i) = Σ_j w(i, j) be the total affinity of a trajectory T_i to all the other trajectories in the training dataset. Let D be the N × N diagonal matrix with d on its diagonal; then define the generalized eigenvector γ_i as a solution to

(D − W) γ_i = λ_i D γ_i    (4)

and the second generalized eigenvector is defined as the γ_i corresponding to the second smallest λ_i. In other words, once we have constructed W, we apply the eigendecomposition recursively until the value associated with the second generalized eigenvector is above a threshold. Shi and Malik [23] show that the second generalized eigenvector is the solution to a continuous version of a discrete problem, where the solution is to find a segmentation that minimizes the affinity between groups normalized by the affinity within each group. After applying this process, the entire training set of trajectories is clustered into spatially similar paths, and we denote the total number of paths detected in the scene by S. After this, each trajectory is assigned the label of its corresponding path. The result of applying the above method to our dataset is shown in Fig. 2. All trajectories used for learning are depicted in Fig. 2a. We then perform the generalized eigendecomposition, which divides the matrix into two segments. The eigendecomposition is again applied on the smaller of the two segments for further division, while thresholding on the second generalized eigenvector. The different detected paths used by the objects in the scene are shown in Fig. 2b–e.

3 Path modeling

In the previous section, we described an approach for segmenting trajectories into different paths. Once this is achieved, we need to extract useful features from these extracted paths so that the scene can be accurately modeled.
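For concreteness, the segmentation approach of the previous section — pairwise Hausdorff distances (Eq. 3), the affinity of Eq. 2 (with the exponent taken as negative, so identical trajectories score near 1), and one bipartition step of Eq. 4 — can be sketched as below. This is our own minimal illustration, not the paper's implementation; numpy is assumed to be available for the eigendecomposition:

```python
import math
import numpy as np

def directed_hausdorff(Ti, Tj):
    """d(Ti, Tj) of Eq. 3: the largest distance from a point of Ti to
    its nearest neighbour in Tj; Ti and Tj may have different lengths."""
    return max(min(math.dist(a, b) for b in Tj) for a in Ti)

def affinity_matrix(trajs, sigma_w=2.0):
    """Symmetric affinity matrix of Eq. 2, with w(i, i) = 0."""
    N = len(trajs)
    W = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1, N):
            d = max(directed_hausdorff(trajs[i], trajs[j]),
                    directed_hausdorff(trajs[j], trajs[i]))
            W[i, j] = W[j, i] = math.exp(-d / (2.0 * sigma_w ** 2))
    return W

def bipartition(W):
    """One step of the recursive clustering: solve (D - W) g = lambda D g
    (Eq. 4) via the standard problem D^{-1}(D - W) g = lambda g, take the
    eigenvector of the second smallest eigenvalue, and split trajectory
    indices by its sign (the normalized-cut relaxation)."""
    d = W.sum(axis=1)
    M = np.diag(1.0 / d) @ (np.diag(d) - W)
    vals, vecs = np.linalg.eig(M)
    g = vecs[:, np.argsort(vals.real)[1]].real
    idx = np.arange(len(g))
    return idx[g >= 0].tolist(), idx[g < 0].tolist()
```

Recursing on each resulting group, while thresholding as described above, yields the S paths.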
We aim to extract features that enable us to distinguish between (i) spatially dissimilar trajectories, (ii) trajectories of people walking at different speeds, (iii) crooked trajectories, (iv) trajectories where people make unexpected u-turns, or (v) trajectories where people unexpectedly vary their speed.

The novel set of features extracted from the trajectory of an object i at time instance t is an 11-tuple:

ψ_t^i = (x, y, w, h, v, v_x, v_y, a, a_x, a_y, κ)

Feature description: As described above, (x, y) represents the feet location of a detected object, essential for characterizing the positional information of the trajectories present in the path model. (w, h) are the width and the height, respectively, of the bounding box around the detected object. For example, consider the case when we only observe pedestrians during the training phase. The values of the observed w and h will be quantitatively lower than those of, for example, the bounding box around an observed car. Thus this feature is very useful for detecting objects of unusual or unexpected size in the scene. In order to distinguish between objects moving at different speeds, we compute the velocity feature v, which can be derived as the magnitude of the first derivative of the x and y positions. Similarly, we compute v_x and v_y, the first derivatives of the x and y coordinates, respectively. This feature is essential for differentiating between people walking at different velocities, e.g. walking vs. running. In this paper, for the sake of simplicity, we discard the velocity direction information. But it could be very useful for cases where, for instance, road traffic is observed and all the vehicles are expected to follow one particular direction. There might also occur a case where a pedestrian suddenly changes his/her velocity. Many reasons can be attributed to this behavior. For example, a person falls while walking, a running person suddenly stops, or a person intermittently alternates between a slow and a fast walk. For addressing these types of scenarios, we compute the second derivative of the x and y positions, i.e. the acceleration magnitude a and its x and y components a_x and a_y, respectively.
In order to distinguish a crooked trajectory from a straight-line motion trajectory, we extract the velocity and acceleration discontinuities, i.e. the spatio-temporal curvature feature. This is where the importance of the velocity and acceleration features is greatly realized, for instance in distinguishing between a person walking normally in a straight line and a drunkard walking waywardly. At any time instance, this feature is computed as [24]:

κ = sqrt( a_y² + a_x² + (v_x a_y − a_x v_y)² ) / (v_x² + v_y² + 1)^(3/2)    (5)

3.1 Learning the dynamic Bayesian network

Fig. 3 The dynamic Bayesian network (DBN) used in our method. The blue shaded circles are the observed variables while the white circles are the latent variables. q_c represents the path label of a trajectory during the training phase, while z_n and y_n are the continuous latent variable and 11D input variable, respectively

From the previous step, the scene is clustered into S paths or classes, and thus each trajectory T_i carries the label of the class to which it belongs. Let C = {C_1, C_2, ..., C_i, ..., C_S} be the set consisting of the class labels. Let the set of extracted features for any trajectory i in class C_S be represented by Γ_i^{C_S} = {ψ_i^1, ψ_i^2, ..., ψ_i^{n_i}}. Similarly, let Γ = {Γ^{C_1}, Γ^{C_2}, ..., Γ^{C_S}} be the set representing the features of all the trajectories used in our training process. We aim to learn the scene model by applying machine learning techniques to this feature set. As described above, a common approach adopted for learning the set of input trajectories is to build a path envelope and to divide each trajectory into segments [1], which can be a very tedious task. Other approaches that use machine learning techniques, such as the HMM, assume equal length for all trajectories in a path [19]. However, by adopting the DBN [25], we aim to provide a solution that overcomes the restrictions imposed by the existing methods.
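The 11-tuple extraction above, including the curvature of Eq. 5, can be sketched as follows. This is our own illustration (not the paper's code): derivatives are approximated by finite differences over consecutive frames, so the first two frames of a trajectory are skipped:

```python
import math

def extract_features(traj, dt=1.0):
    """traj: list of per-frame (x, y, w, h) tuples (feet location plus
    box size). Returns per-frame 11-tuples
    (x, y, w, h, v, vx, vy, a, ax, ay, kappa)."""
    feats = []
    for t in range(2, len(traj)):
        x, y, w, h = traj[t]
        vx = (traj[t][0] - traj[t - 1][0]) / dt
        vy = (traj[t][1] - traj[t - 1][1]) / dt
        pvx = (traj[t - 1][0] - traj[t - 2][0]) / dt
        pvy = (traj[t - 1][1] - traj[t - 2][1]) / dt
        ax, ay = (vx - pvx) / dt, (vy - pvy) / dt
        v = math.hypot(vx, vy)
        a = math.hypot(ax, ay)
        # Spatio-temporal curvature of Eq. 5:
        kappa = math.sqrt(ax ** 2 + ay ** 2 + (vx * ay - ax * vy) ** 2) \
                / (vx ** 2 + vy ** 2 + 1) ** 1.5
        feats.append((x, y, w, h, v, vx, vy, a, ax, ay, kappa))
    return feats
```

A straight constant-velocity walk yields zero acceleration and zero curvature, while a zigzag or sudden stop produces nonzero a and κ, which is exactly what the classifier exploits.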
In addition, by employing a linear dynamical system (LDS), we learn the dynamics of the motion. Our goal is to propose a general solution to the problem, where individual components can be modified to cater to any specific application at hand. The trajectory of a tracked object (a car or a pedestrian) is temporal (time-series) data. For capturing the instantaneous correlations of this sequential data, it is natural to use directed graphical models. The arcs (or edges) between different time slices of these models can be either directed or undirected. When these arcs are directed, the models are known as DBNs. DBNs of different topologies exist, and the DBN that we use in this paper is shown in Fig. 3. This network is an example of an LDS (also commonly known as the Kalman filter [26]). Although this network appears similar to the input-output HMM (IOHMM) [27], the latent node z_n is continuous in our model. For a trajectory belonging to path label c (we omit the subscript i to keep the notation clutter-free), q_c is a discrete state variable, where the number of possible states is S, and y_n is the continuous 11D input feature vector.

The number of nodes per slice is three; the observable variables are shaded blue while the latent (hidden) variable is shaded white, as shown in the figure. Both the state variable q_c and the input variable y_n are connected to the latent variable z_n. The state space description of this model is:

z_n = f_c(z_{n−1}, q_c)    (6)
y_n = g_c(z_n, q_c)    (7)

where f_c and g_c are arbitrary differentiable functions (Gaussian), q_c ∈ C is a discrete state variable containing the class (or path) label of a trajectory, and z_n and y_n are the continuous latent variable and the 11D input (observation) variable, respectively. Thus the observable variable q_c represents the path label of each feature vector. As shown in Fig. 3, the input variable q_c, in addition to the output variable y_n, influences either the latent variables or the output variables or both. The network has the task of predicting the next state based on the current inputs q_c and y_n and the previous state z_{n−1}. As the LDS is a linear Gaussian model, the joint distribution over all the variables, as well as the marginals and the conditionals, will all be Gaussian [25]. Hence each latent variable z_n is taken as a Gaussian, and the observed 11D feature variable y_n is also assumed Gaussian. Traditionally, the conditional distribution of the latent variable, in terms of noisy linear equations, is given by

z_n = A_c z_{n−1} + w_n^c    (8)
y_n = C_c z_n + v_n^c    (9)
z_1 = µ_0 + u    (10)

where the noise terms w^c, v^c, and u are zero-mean Gaussians:

w^c ∼ N(0, B_c)    (11)
v^c ∼ N(0, Σ_c)    (12)
u ∼ N(0, V_0)    (13)

and the Gaussian distribution of the initial latent variable is given as:

p(z_1) = N(z_1 | µ_0, V_0).    (14)

Thus the parameters of a path with label c are denoted by Θ_c = {A_c, B_c, C_c, Σ_c, V_0, µ_0}.
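As a sanity check of the generative equations, the LDS of Eqs. 8–10 can be simulated forward. The sketch below is our own hypothetical illustration with a scalar latent state (the paper instead fits these parameters from data, as described next); all names are ours:

```python
import random

def simulate_lds(A, C, B_std, S_std, mu0, V0_std, n):
    """Draw a length-n observation sequence from a scalar LDS.
    A, C: transition and emission coefficients (Eqs. 8-9);
    B_std, S_std, V0_std: standard deviations of the process,
    emission and initial-state noise; mu0: initial mean (Eq. 10)."""
    z = mu0 + random.gauss(0.0, V0_std)              # Eq. 10
    ys = []
    for _ in range(n):
        ys.append(C * z + random.gauss(0.0, S_std))  # Eq. 9
        z = A * z + random.gauss(0.0, B_std)         # Eq. 8
    return ys
```

With the noise deviations set to zero the recursion becomes deterministic, which makes the roles of A and C easy to inspect.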
Therefore, once the membership of trajectories to their corresponding paths is determined, a unique DBN is trained for each path model by using the features extracted from the corresponding trajectories. We use the standard method of maximum likelihood to determine the model parameters Θ_c through the expectation-maximization (EM) algorithm. The publicly available BNT toolbox [28] is used for learning the proposed DBN.

3.2 Anomaly detection

Once the labeled trajectories have been used to learn the DBN model M, we can begin to test our method's ability to distinguish normal behavior from abnormal behavior. The model M is used to classify an unseen behavior pattern as belonging to one of the S model classes obtained from the training set. Once a test trajectory T_p is obtained, the membership of this trajectory is verified against each of the created models, as

Ŝ = arg max_c p(T_p | M_c) / N_p    (15)

where c ∈ C, and N_p is the length of the test trajectory T_p. Thus, a trajectory is assigned to the class M_c for which (15) is maximum. However, it is also critical to detect whether the test trajectory conforms to our constructed scene model or not. In order to do so, for every trajectory T_k in the model class c, we first compute the following quantity:

L̂_c = min_k p(T_k | M_c) / N_k    (16)

Thus L̂_c is the minimum of the weights that the trajectories obtain from a given path model. Based on this, a test trajectory is flagged as abnormal when:

p(T_p | M_Ŝ) / N_p < L̂_Ŝ    (17)

i.e. if the incoming test trajectory fails to obtain substantial support from any path of our model, that trajectory is rejected as containing unusual or abnormal behavior. As we shall show shortly in the next section, this measure works very well and we are able to obtain better results than the existing approaches without resorting to camera calibration, path envelopes or other hierarchical clustering approaches.
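The decision rule of Eqs. 15–17 can be sketched independently of the DBN itself. In the illustration below (names and interface are ours), `loglik(T, model)` stands in for the per-path score log p(T | M_c) produced by any trained model; normalizing it by trajectory length is the log-space analogue of the p(T | M_c)/N terms:

```python
def fit_thresholds(models, training_paths, loglik):
    """L_hat_c (Eq. 16): the minimum length-normalized score that the
    training trajectories of path c obtain from their own model."""
    return {c: min(loglik(T, models[c]) / len(T) for T in trajs)
            for c, trajs in training_paths.items()}

def classify(T, models, thresholds, loglik):
    """Assign T to the path with the maximum normalized score (Eq. 15);
    return None (anomalous) when even the best score falls below that
    path's threshold (Eq. 17)."""
    scores = {c: loglik(T, m) / len(T) for c, m in models.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= thresholds[best] else None
```

Any scoring function with this interface can be plugged in, which is what makes the rejection rule reusable beyond the specific DBN used here.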
The novel set of extracted features allows us to account for the changes in size, location, velocity, and acceleration discontinuities in the motions of the observed objects.

4 Experiments and results

We rigorously test the proposed method on a variety of test sequences. As described above, we first detect pedestrians in a scene and perform tracking. Often, due to tracking errors or noise in the image, the obtained trajectories are noisy. In order to remove this effect, we perform smoothing. Another important issue while performing object tracking is the occlusion that may occur. When an occlusion does occur, the accurate position and velocity of the occluded object cannot be determined. The types of occlusion are: (i) inter-object occlusion, which occurs when one object blocks

Fig. 4 Training and testing for G_1: a depicts all the trajectories in the dataset G_1. After performing path clustering, two distinct paths are obtained, shown in b and c. Two results are also shown in d and e. A bicyclist is detected in d, where its velocity is higher than the trained model. Hence, it is flagged red, i.e. unusual behavior. e shows the flagged trajectory of a person that makes a u-turn in the scene

the view of other objects in the field of view of the camera. The background subtraction method gives a single region for occluding objects. If two initially non-occluding objects cause an occlusion, then this condition can be easily detected. (ii) Occlusion of objects due to thin scene structures like poles or trees causes an object to break into two regions. Thus more than one extracted region can belong to the same object in such a scenario. (iii) Occlusion of objects due to large structures causes the objects to disappear completely for a certain amount of time, that is, there is no foreground region representing such objects. More details on how we handle these occlusions during the tracking process can be found in [22]. Although our tracking can handle occlusions to a great degree, not all cases can be handled correctly. As a result, we obtain incorrect trajectories, which affects our trajectory clustering method. During our training phase, two cases are considered: 1) When inter-object occlusion occurs: this kind of occlusion generates incomplete trajectories, i.e. a trajectory starts from one end of the image and ends before reaching the image boundary (possibly an exit point). We ignore such a trajectory and do not use it in our path building phase. 2) A new trajectory is generated not at the boundary of the image, but well inside the image plane. This generally occurs when scene structures cause an object to break, or when the tracker assigns new trajectories to objects emerging from occlusion.
We also ignore this type of trajectory. During the testing phase, trajectories resulting from occlusion are not treated specially. Even if such a trajectory satisfies the spatial proximity feature, it fails the motion and spatio-temporal features. This happens because there is no information regarding the velocity and the curvature of the trajectory at its missing sections. Testing on real data: We test the proposed scene modeling method on four video sequences provided by Junejo and Foroosh [2]. These sequences are captured from a single camera, and we label each dataset as G_i, where i = 1, 2, 3, 4. For visualization purposes, a conforming trajectory is marked green, while an abnormal trajectory is marked red in the results below. Once all the trajectories in the scene are obtained for the training phase, the affinity matrix is constructed by comparing the trajectories pair-wise. The purpose of this step is to distinguish between spatially and visually distinct paths. However, this segmentation can vary from application to application. For example, while monitoring a road for traffic surveillance, distinct paths may be based on the speed of the motorists, in addition to segmenting different lanes on the road. Eigendecomposition is performed on the computed W. Recursive application of this decomposition is a form of segmentation, dividing the matrix (and hence the set of trajectories used in its construction) into two sets. The decomposition is applied again on the larger of the two sets until the value of the second generalized eigenvector is above a certain threshold; more on this can be found in [23]. Useful features are then extracted from these trajectories. Once the object trajectories are segmented into paths, the DBN [28], as described in Sect. 3, is constructed for each segmented path. During the test phase, the support for the test trajectory is computed from each path.
The trajectory is assigned the class label that maximizes its marginal probability, or is rejected when that probability falls below a threshold.

Testing on G1: This is a small dataset of 3,730 frames, containing 15 instances of pedestrians walking in the scene. All the trajectories for this class are plotted in Fig. 4a. In this scene, people move horizontally or vertically, following the paved paths. Thus, two distinct paths are obtained, as shown in Fig. 4b and c.
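The clustering-and-classification pipeline described above can be illustrated with a minimal Python sketch. This is not the paper's implementation (which uses the Bayes Net Toolbox [28] in MATLAB): the affinity matrix `W` is assumed to be precomputed from pairwise trajectory comparison, the stopping rule uses the second-smallest eigenvalue of the normalized Laplacian as a stand-in for the eigenvector threshold mentioned above, and the per-path DBN likelihoods are placeholder numbers. Function names such as `cluster_paths` and `label_or_reject` are illustrative.

```python
import numpy as np

def fiedler(W):
    """Second generalized eigenpair of (D - W) v = lam * D v, computed
    via the symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_is = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(d)) - (d_is[:, None] * W) * d_is[None, :]
    vals, vecs = np.linalg.eigh(L)          # ascending eigenvalues
    return vals[1], d_is * vecs[:, 1]       # map back to generalized form

def cluster_paths(W, idx=None, cut_thresh=0.1, min_size=2):
    """Recursively bisect the trajectory set using the sign structure of
    the second generalized eigenvector; stop when a set is well connected
    (large second eigenvalue), i.e. it forms one coherent path."""
    if idx is None:
        idx = np.arange(W.shape[0])
    if len(idx) < min_size:
        return [idx]
    lam2, v = fiedler(W[np.ix_(idx, idx)])
    if lam2 >= cut_thresh:                  # no further structure: one path
        return [idx]
    left, right = idx[v >= np.median(v)], idx[v < np.median(v)]
    if len(left) == 0 or len(right) == 0:
        return [idx]
    return (cluster_paths(W, left, cut_thresh, min_size)
            + cluster_paths(W, right, cut_thresh, min_size))

def label_or_reject(path_likelihoods, reject_thresh):
    """Assign the label of the path model with the highest support,
    or reject the trajectory (None) when even the best support is low."""
    k = int(np.argmax(path_likelihoods))
    return k if path_likelihoods[k] >= reject_thresh else None
```

For a toy affinity matrix with two tight groups of trajectories, `cluster_paths` returns two clusters, and `label_or_reject` flags a trajectory whose support from every path model is negligible.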

Fig. 5 Training and testing for G2: a All the trajectories in dataset G2. Three unique paths are clustered from the training sequence, shown in b–d. Results are shown in e–h. Two cases of normal behavior are detected, as shown in e and g, where the accepted trajectories are plotted green. Two abnormal cases, one containing a person walking on the grass (f) and another containing a person walking in a zigzag (h), are flagged red

Fig. 6 Testing for G3: a depicts the rejected trajectory of a golf cart in the scene; this trajectory is rejected based on its unusual size and high speed. b, d–f depict four cases of walking behavior that match our scene model, as the object follows the learned paths. c shows the case where a person makes a left turn while entering from the right; this behavior is not part of the training process and is hence rejected. Another case is shown in g, where a person wanders on the grass; this is rejected based on its spatial characteristics

Once the scene model has been learned, we test on two trajectories. One trajectory is that of a bicyclist, shown in Fig. 4d, coming from the top of the scene and moving towards the left. This is marked as unusual for two reasons: it has a different spatial signature and a non-conforming velocity. Another case is shown in Fig. 4e: here, a person moves to the left of the scene and then makes a u-turn. This is detected as unusual behavior as well. As shown in Fig. 4b and c, the training trajectories consist of people walking in almost straight lines, so this specific trajectory is correctly detected as abnormal. This particular case shows the strength of the proposed method, as detection of this type of trajectory was not possible in [2,14].

Testing on G2: This is a sequence of 9,284 frames with 27 different trajectories forming three different path clusters. The length of the trajectories varies from 250 points to almost 800 points.
Figure 5a depicts all the trajectories obtained in this sequence. The scene contains a T-shaped paved path. The proposed method successfully dissects the training set into three paths, as shown in Fig. 5b–d. The training phase consisted of people walking straight, coming in from the left of the scene and going right, and going left. A normal case is detected in Fig. 5e, where a person walks in a straight line, receiving the highest probability from cluster one (cf. Fig. 5b). Figure 5f contains the trajectory, marked red, of a person walking on the grass; this trajectory fails to get support from any of the clustered paths and is hence marked red. Similarly, a case of a person making a right turn is detected as normal behavior (cf. Fig. 5g). Figure 5h depicts the case of a person walking in a zigzag. This is where our spatio-temporal curvature feature is useful, and this trajectory is correctly marked as unusual.

Testing on G3: This is a longer sequence, containing over 20 min of data forming over 100 trajectories of people walking. Figure 2a shows all the trajectories of the training sequence. Four clustered paths are shown in Fig. 2b–e. The first cluster contains people walking from the top and making a left turn, the second cluster contains people walking from the top and going down, the third cluster consists of people making a right turn, and the fourth and largest cluster contains people walking in a straight line. Test results for G3 are shown in Fig. 6. The case of a golf cart observed in the scene is rejected on account of its size and speed, as shown in Fig. 6a. Four cases of

acceptable walking are shown in Fig. 6b, d–f, plotted in green. A negative case is shown in Fig. 6c, where a pedestrian makes a left turn while coming in from the right of the scene; this is rejected because no such path was detected in the clustering process. Similarly, a case of a person deviating from the paved path and walking onto the grass is shown in Fig. 6g. This trajectory is spatially different, fails to get support from any of the paths in our scene model, and is thus rejected as an abnormal trajectory.

Fig. 7 Testing for G4: a–c depict three classes of clustered trajectories, together with some of the test results. Four detected cases of unusual behavior are shown in d–f and h, where the observed pedestrians do not follow the constructed model; these are marked in red. Samples of two positive detections are shown in g and i, with the trajectory marked blue. See text for more details

Testing on G4: This is the longest test sequence. The dataset contains tracks obtained from a surveillance camera observing the scene for almost 183 minutes. The sequence contains more than 500 trajectories of people walking, cars driving by, etc. Figure 7a–c show the three clustered paths for this scene. The first cluster contains people walking from the bottom of the scene and making a left turn, or vice versa. The second cluster covers a visible portion of a car parking lot, which is frequented by people as well. The third cluster consists of people walking from the bottom of the scene, walking straight, and then disappearing around the corner of the building. The test sequence contains around 30 trajectories; some results are shown in Fig. 7d–g. Two normal cases of walking are marked in blue, as shown in Fig. 7g and i, belonging to clusters 1 and 2, respectively. An abnormal case of a person entering the scene and performing a zigzag motion is shown in Fig. 7d, marked in red.
A case of a person entering from the right of the scene (next to the building) and going across the scene is detected as abnormal (red trajectory in Fig. 7e), as this kind of trajectory is not present in our training model. Similarly, a wayward motion is detected as unusual, as shown in Fig. 7f, where a person enters the scene, goes onto the road, and makes loops. Figure 7g depicts an interesting case: a person comes into the scene and sits on a chair for some time. This scenario is correctly detected as non-conforming to our model, as the training phase did not include any such behavior. As shown above, the proposed method works fairly robustly, and we are able to achieve very good results. We have proposed a very general solution, yet one flexible enough to be adapted for many applications monitoring the behavior of objects in a scene.

5 Conclusion

In order to address the growing need for new and innovative methods for efficient and accurate path modeling for video surveillance, we propose a novel method based on the DBN, whose performance is shown to be comparable with

the existing state-of-the-art methods. The advantage of our method, however, lies in doing away with the extra processing commonly carried out while performing scene modeling, such as calibrating cameras, obtaining prior knowledge of the number of different paths, or constructing path envelopes. In this paper, we showed that the trajectories obtained during the training phase can be clustered into different paths by the application of eigendecomposition. These identified paths are then used to model the scene: a novel feature set characterizing the motion of pedestrians, in terms of their speeds, sizes, locations, and discontinuities in their velocities and accelerations, is estimated from these clustered paths. These features are then used to train a DBN, which is used during the testing phase to classify the test trajectories as either accepted or rejected. We obtain good results on four datasets, demonstrating the practicality, generality, and accuracy of the proposed approach.

References

1. Makris, D., Ellis, T.: Path detection in video surveillance. Image Vis. Comput. (IVC) 20 (2002)
2. Junejo, I., Foroosh, H.: Trajectory rectification and path modeling for video surveillance. In: Proceedings of the 11th IEEE International Conference on Computer Vision (ICCV), pp. 1–7 (2007)
3. Hu, W., Tan, T., Wang, L., Maybank, S.: A survey on visual surveillance of object motion and behaviors. IEEE Trans. Syst. Man Cybern. C 34 (2004)
4. Zhang, Z., Huang, K., Tan, T.: Comparison of similarity measures for trajectory clustering in outdoor surveillance scenes. In: Proceedings of the 18th IEEE International Conference on Pattern Recognition (ICPR), vol. III (2006)
5. Vlachos, M., Gunopulos, D., Kollios, G.: Robust similarity measures for mobile object trajectories. In: Proceedings of the DEXA Workshops (2002)
6. Keogh, E.J., Pazzani, M.J.: Scaling up dynamic time warping for datamining applications. In: Knowledge Discovery and Data Mining (KDD) (2000)
7.
Rangarajan, K., Allen, W., Shah, M.: Matching motion trajectories using scale-space. Pattern Recognit. (1993)
8. Grimson, W., Stauffer, C., Romano, R., Lee, L.: Using adaptive tracking to classify and monitor activities in a site. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (1998)
9. Boyd, J.E., Meloche, J., Vardi, Y.: Statistical tracking in video traffic surveillance. In: Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV) (1999)
10. Johnson, N., Hogg, D.: Learning the distribution of object trajectories for event recognition. Image Vis. Comput. 14 (1996)
11. Piciarelli, C., Foresti, G., Snidaro, L.: Trajectory clustering and its applications for video surveillance. In: Proceedings of the 2nd IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS) (2005)
12. Khalid, S., Naftel, A.: Classifying spatiotemporal object trajectories using unsupervised learning of basis function coefficients. In: Proceedings of the 3rd ACM International Workshop on Video Surveillance and Sensor Networks (2005)
13. Calderara, S., Cucchiara, R., Prati, A.: Detection of abnormal behaviors using a mixture of von Mises distributions. In: Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS) (2007)
14. Makris, D., Ellis, T.: Learning semantic scene models from observing activity in visual surveillance. IEEE Trans. Syst. Man Cybern. B 35 (2005)
15. Junejo, I., Foroosh, H.: Euclidean path modeling for video surveillance. Image Vis. Comput. (IVC) 26 (2008)
16. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. (PAMI) 22 (2000)
17. Wright, J., Pless, R.: Analysis of persistent motion patterns using the 3D structure tensor. In: Proceedings of the IEEE Workshop on Motion and Video Computing (2005)
18. Zhang, Z., Huang, K., Tan, T., Wang, L.: Trajectory series analysis based event rule induction for visual surveillance.
In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8 (2007)
19. Jiang, F., Wu, Y., Katsaggelos, A.: Abnormal event detection from surveillance video by dynamic hierarchical clustering. In: Proceedings of the 14th IEEE International Conference on Image Processing (ICIP), vol. V (2007)
20. Fu, Z., Hu, W., Tan, T.: Similarity based vehicle trajectory clustering and anomaly detection. In: Proceedings of the 12th International Conference on Image Processing (ICIP) (2005)
21. Wang, X., Tieu, K., Grimson, E.: Learning semantic scene models by trajectory analysis. In: Proceedings of the 9th European Conference on Computer Vision (ECCV), pp. 1–8 (2006)
22. Javed, O., Shah, M.: Tracking and object classification for automated surveillance. In: Proceedings of the 7th European Conference on Computer Vision (ECCV) (2002)
23. Weiss, Y.: Segmentation using eigenvectors: a unifying view. In: Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV) (1999)
24. Rao, C., Shah, M.: A view-invariant representation of human action. Int. J. Comput. Vis. 50(2) (2002)
25. Bishop, C.M.: Pattern Recognition and Machine Learning, 1st edn. Springer, Berlin (2006)
26. Welch, G., Bishop, G.: An introduction to the Kalman filter. ACM SIGGRAPH 2001 Courses (2001)
27. Bengio, Y., Frasconi, P.: An input output HMM architecture. In: Advances in Neural Information Processing Systems (NIPS), vol. 7 (1995)
28. Murphy, K.P.: The Bayes Net Toolbox for MATLAB. Computing Science and Statistics 33 (2001)


More information

A Feature Point Matching Based Approach for Video Objects Segmentation

A Feature Point Matching Based Approach for Video Objects Segmentation A Feature Point Matching Based Approach for Video Objects Segmentation Yan Zhang, Zhong Zhou, Wei Wu State Key Laboratory of Virtual Reality Technology and Systems, Beijing, P.R. China School of Computer

More information

Class 3: Advanced Moving Object Detection and Alert Detection Feb. 18, 2008

Class 3: Advanced Moving Object Detection and Alert Detection Feb. 18, 2008 Class 3: Advanced Moving Object Detection and Alert Detection Feb. 18, 2008 Instructor: YingLi Tian Video Surveillance E6998-007 Senior/Feris/Tian 1 Outlines Moving Object Detection with Distraction Motions

More information

Beyond Bags of Features

Beyond Bags of Features : for Recognizing Natural Scene Categories Matching and Modeling Seminar Instructed by Prof. Haim J. Wolfson School of Computer Science Tel Aviv University December 9 th, 2015

More information

CHAPTER 5 MOTION DETECTION AND ANALYSIS

CHAPTER 5 MOTION DETECTION AND ANALYSIS CHAPTER 5 MOTION DETECTION AND ANALYSIS 5.1. Introduction: Motion processing is gaining an intense attention from the researchers with the progress in motion studies and processing competence. A series

More information

Background subtraction in people detection framework for RGB-D cameras

Background subtraction in people detection framework for RGB-D cameras Background subtraction in people detection framework for RGB-D cameras Anh-Tuan Nghiem, Francois Bremond INRIA-Sophia Antipolis 2004 Route des Lucioles, 06902 Valbonne, France nghiemtuan@gmail.com, Francois.Bremond@inria.fr

More information

Reference Point Detection for Arch Type Fingerprints

Reference Point Detection for Arch Type Fingerprints Reference Point Detection for Arch Type Fingerprints H.K. Lam 1, Z. Hou 1, W.Y. Yau 1, T.P. Chen 1, J. Li 2, and K.Y. Sim 2 1 Computer Vision and Image Understanding Department Institute for Infocomm Research,

More information

Idle Object Detection in Video for Banking ATM Applications

Idle Object Detection in Video for Banking ATM Applications Research Journal of Applied Sciences, Engineering and Technology 4(24): 5350-5356, 2012 ISSN: 2040-7467 Maxwell Scientific Organization, 2012 Submitted: March 18, 2012 Accepted: April 06, 2012 Published:

More information

Vehicle Detection Using Gabor Filter

Vehicle Detection Using Gabor Filter Vehicle Detection Using Gabor Filter B.Sahayapriya 1, S.Sivakumar 2 Electronics and Communication engineering, SSIET, Coimbatore, Tamilnadu, India 1, 2 ABSTACT -On road vehicle detection is the main problem

More information

Practical Camera Auto-Calibration Based on Object Appearance and Motion for Traffic Scene Visual Surveillance

Practical Camera Auto-Calibration Based on Object Appearance and Motion for Traffic Scene Visual Surveillance Practical Camera Auto-Calibration Based on Object Appearance and Motion for Traffic Scene Visual Surveillance Zhaoxiang Zhang, Min Li, Kaiqi Huang and Tieniu Tan National Laboratory of Pattern Recognition,

More information

Multi-Camera Occlusion and Sudden-Appearance-Change Detection Using Hidden Markovian Chains

Multi-Camera Occlusion and Sudden-Appearance-Change Detection Using Hidden Markovian Chains 1 Multi-Camera Occlusion and Sudden-Appearance-Change Detection Using Hidden Markovian Chains Xudong Ma Pattern Technology Lab LLC, U.S.A. Email: xma@ieee.org arxiv:1610.09520v1 [cs.cv] 29 Oct 2016 Abstract

More information

Chapter 9 Object Tracking an Overview

Chapter 9 Object Tracking an Overview Chapter 9 Object Tracking an Overview The output of the background subtraction algorithm, described in the previous chapter, is a classification (segmentation) of pixels into foreground pixels (those belonging

More information

Video Alignment. Final Report. Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin

Video Alignment. Final Report. Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin Final Report Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin Omer Shakil Abstract This report describes a method to align two videos.

More information

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM 1 PHYO THET KHIN, 2 LAI LAI WIN KYI 1,2 Department of Information Technology, Mandalay Technological University The Republic of the Union of Myanmar

More information

Adaptive Background Mixture Models for Real-Time Tracking

Adaptive Background Mixture Models for Real-Time Tracking Adaptive Background Mixture Models for Real-Time Tracking Chris Stauffer and W.E.L Grimson CVPR 1998 Brendan Morris http://www.ee.unlv.edu/~b1morris/ecg782/ 2 Motivation Video monitoring and surveillance

More information

Car tracking in tunnels

Car tracking in tunnels Czech Pattern Recognition Workshop 2000, Tomáš Svoboda (Ed.) Peršlák, Czech Republic, February 2 4, 2000 Czech Pattern Recognition Society Car tracking in tunnels Roman Pflugfelder and Horst Bischof Pattern

More information

Spatio-Temporal Stereo Disparity Integration

Spatio-Temporal Stereo Disparity Integration Spatio-Temporal Stereo Disparity Integration Sandino Morales and Reinhard Klette The.enpeda.. Project, The University of Auckland Tamaki Innovation Campus, Auckland, New Zealand pmor085@aucklanduni.ac.nz

More information

Real-Time Human Detection using Relational Depth Similarity Features

Real-Time Human Detection using Relational Depth Similarity Features Real-Time Human Detection using Relational Depth Similarity Features Sho Ikemura, Hironobu Fujiyoshi Dept. of Computer Science, Chubu University. Matsumoto 1200, Kasugai, Aichi, 487-8501 Japan. si@vision.cs.chubu.ac.jp,

More information

Learning and Recognizing Visual Object Categories Without First Detecting Features

Learning and Recognizing Visual Object Categories Without First Detecting Features Learning and Recognizing Visual Object Categories Without First Detecting Features Daniel Huttenlocher 2007 Joint work with D. Crandall and P. Felzenszwalb Object Category Recognition Generic classes rather

More information

Optimal Clustering and Statistical Identification of Defective ICs using I DDQ Testing

Optimal Clustering and Statistical Identification of Defective ICs using I DDQ Testing Optimal Clustering and Statistical Identification of Defective ICs using I DDQ Testing A. Rao +, A.P. Jayasumana * and Y.K. Malaiya* *Colorado State University, Fort Collins, CO 8523 + PalmChip Corporation,

More information

Optimizing Trajectories Clustering for Activity Recognition

Optimizing Trajectories Clustering for Activity Recognition Optimizing Trajectories Clustering for Activity Recognition Guido Pusiol, Luis Patino, Francois Bremond, Monique Thonnant, and Sundaram Suresh Inria, Sophia Antipolis, 2004 Route des Lucioles, 06902 Sophia

More information

Pedestrian Detection Using Correlated Lidar and Image Data EECS442 Final Project Fall 2016

Pedestrian Detection Using Correlated Lidar and Image Data EECS442 Final Project Fall 2016 edestrian Detection Using Correlated Lidar and Image Data EECS442 Final roject Fall 2016 Samuel Rohrer University of Michigan rohrer@umich.edu Ian Lin University of Michigan tiannis@umich.edu Abstract

More information

A Novelty Detection Approach for Foreground Region Detection in Videos with Quasi-stationary Backgrounds

A Novelty Detection Approach for Foreground Region Detection in Videos with Quasi-stationary Backgrounds A Novelty Detection Approach for Foreground Region Detection in Videos with Quasi-stationary Backgrounds Alireza Tavakkoli 1, Mircea Nicolescu 1, and George Bebis 1 Computer Vision Lab. Department of Computer

More information

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009 Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer

More information

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of

More information

Ambiguity Detection by Fusion and Conformity: A Spectral Clustering Approach

Ambiguity Detection by Fusion and Conformity: A Spectral Clustering Approach KIMAS 25 WALTHAM, MA, USA Ambiguity Detection by Fusion and Conformity: A Spectral Clustering Approach Fatih Porikli Mitsubishi Electric Research Laboratories Cambridge, MA, 239, USA fatih@merl.com Abstract

More information

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin

More information

Mobile Human Detection Systems based on Sliding Windows Approach-A Review

Mobile Human Detection Systems based on Sliding Windows Approach-A Review Mobile Human Detection Systems based on Sliding Windows Approach-A Review Seminar: Mobile Human detection systems Njieutcheu Tassi cedrique Rovile Department of Computer Engineering University of Heidelberg

More information

Fast Denoising for Moving Object Detection by An Extended Structural Fitness Algorithm

Fast Denoising for Moving Object Detection by An Extended Structural Fitness Algorithm Fast Denoising for Moving Object Detection by An Extended Structural Fitness Algorithm ALBERTO FARO, DANIELA GIORDANO, CONCETTO SPAMPINATO Dipartimento di Ingegneria Informatica e Telecomunicazioni Facoltà

More information

1168 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 16, NO. 4, APRIL /$ IEEE

1168 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 16, NO. 4, APRIL /$ IEEE 1168 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 16, NO. 4, APRIL 2007 Semantic-Based Surveillance Video Retrieval Weiming Hu, Dan Xie, Zhouyu Fu, Wenrong Zeng, and Steve Maybank, Senior Member, IEEE Abstract

More information

Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information

Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information Ana González, Marcos Ortega Hortas, and Manuel G. Penedo University of A Coruña, VARPA group, A Coruña 15071,

More information

Face Detection and Recognition in an Image Sequence using Eigenedginess

Face Detection and Recognition in an Image Sequence using Eigenedginess Face Detection and Recognition in an Image Sequence using Eigenedginess B S Venkatesh, S Palanivel and B Yegnanarayana Department of Computer Science and Engineering. Indian Institute of Technology, Madras

More information

Accelerometer Gesture Recognition

Accelerometer Gesture Recognition Accelerometer Gesture Recognition Michael Xie xie@cs.stanford.edu David Pan napdivad@stanford.edu December 12, 2014 Abstract Our goal is to make gesture-based input for smartphones and smartwatches accurate

More information

ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL

ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL Maria Sagrebin, Daniel Caparròs Lorca, Daniel Stroh, Josef Pauli Fakultät für Ingenieurwissenschaften Abteilung für Informatik und Angewandte

More information

SURVEY PAPER ON REAL TIME MOTION DETECTION TECHNIQUES

SURVEY PAPER ON REAL TIME MOTION DETECTION TECHNIQUES SURVEY PAPER ON REAL TIME MOTION DETECTION TECHNIQUES 1 R. AROKIA PRIYA, 2 POONAM GUJRATHI Assistant Professor, Department of Electronics and Telecommunication, D.Y.Patil College of Engineering, Akrudi,

More information

A Framework for Multiple Radar and Multiple 2D/3D Camera Fusion

A Framework for Multiple Radar and Multiple 2D/3D Camera Fusion A Framework for Multiple Radar and Multiple 2D/3D Camera Fusion Marek Schikora 1 and Benedikt Romba 2 1 FGAN-FKIE, Germany 2 Bonn University, Germany schikora@fgan.de, romba@uni-bonn.de Abstract: In this

More information

Machine Learning. Unsupervised Learning. Manfred Huber

Machine Learning. Unsupervised Learning. Manfred Huber Machine Learning Unsupervised Learning Manfred Huber 2015 1 Unsupervised Learning In supervised learning the training data provides desired target output for learning In unsupervised learning the training

More information

Detecting Multiple Symmetries with Extended SIFT

Detecting Multiple Symmetries with Extended SIFT 1 Detecting Multiple Symmetries with Extended SIFT 2 3 Anonymous ACCV submission Paper ID 388 4 5 6 7 8 9 10 11 12 13 14 15 16 Abstract. This paper describes an effective method for detecting multiple

More information

An Edge-Based Approach to Motion Detection*

An Edge-Based Approach to Motion Detection* An Edge-Based Approach to Motion Detection* Angel D. Sappa and Fadi Dornaika Computer Vison Center Edifici O Campus UAB 08193 Barcelona, Spain {sappa, dornaika}@cvc.uab.es Abstract. This paper presents

More information

A Study on Similarity Computations in Template Matching Technique for Identity Verification

A Study on Similarity Computations in Template Matching Technique for Identity Verification A Study on Similarity Computations in Template Matching Technique for Identity Verification Lam, S. K., Yeong, C. Y., Yew, C. T., Chai, W. S., Suandi, S. A. Intelligent Biometric Group, School of Electrical

More information

Globally Stabilized 3L Curve Fitting

Globally Stabilized 3L Curve Fitting Globally Stabilized 3L Curve Fitting Turker Sahin and Mustafa Unel Department of Computer Engineering, Gebze Institute of Technology Cayirova Campus 44 Gebze/Kocaeli Turkey {htsahin,munel}@bilmuh.gyte.edu.tr

More information

Pattern Feature Detection for Camera Calibration Using Circular Sample

Pattern Feature Detection for Camera Calibration Using Circular Sample Pattern Feature Detection for Camera Calibration Using Circular Sample Dong-Won Shin and Yo-Sung Ho (&) Gwangju Institute of Science and Technology (GIST), 13 Cheomdan-gwagiro, Buk-gu, Gwangju 500-71,

More information