Social Network Model for Crowd Anomaly Detection and Localization


Rima Chaker, Zaher Al Aghbari, Imran N. Junejo

Abstract

In this work, we propose an unsupervised approach for crowd scene anomaly detection and localization using a social network model. Using a window-based approach, a video scene is first partitioned at spatial and temporal levels, and a set of spatio-temporal cuboids is constructed. Objects exhibiting scene dynamics are detected and the crowd behavior in each cuboid is modeled using local social networks (LSN). From these local social networks, a global social network (GSN) is built for the current window to represent the global behavior of the scene. As the scene evolves with time, the global social network is updated accordingly using the LSNs, to detect and localize abnormal behaviors. We demonstrate the effectiveness of the proposed Social Network Model (SNM) approach on a set of benchmark crowd analysis video sequences. The experimental results reveal that the proposed method outperforms the majority, if not all, of the state-of-the-art methods in terms of anomaly detection accuracy.

Keywords: crowd modeling, social network model, crowd analysis, anomaly detection, anomaly localization, scene understanding, video surveillance.

1. Introduction

A crowd is defined as a collection of a large number of people in a confined space. Socio-psychological studies [49][50] have shown that people in a crowd tend to walk in groups, thus forming collective entities [31], each of which has a specific goal and similar characteristics such as speed and trajectory. Early detection, or prediction, of abnormal behaviors occurring in surveillance scenes is of utmost significance: by alerting human operators, potentially dangerous consequences can be reduced or prevented. However, the analysis of crowded scenes is a very challenging task, since the analysis of human actions is still not a fully solved problem. The significance of understanding crowd scenes lies in its potential applications, such as crowd management [41], video surveillance [3], and public space design [2]. Recently, crowd motion segmentation [42][5], crowd density estimation [7][8], and the identification of individuals' behavioral goals within a crowd [6] have all been subjects of active research across different disciplines. This problem presents challenges of great complexity due to: (1) occlusion between individual objects, (2) random variations in the density of people over time, (3) low-resolution videos with dynamic backgrounds, and (4) the inherent difficulty in accurately modeling crowd behavior. What is needed is an automatic system for analyzing crowd scenes and alerting human operators once anomalous activities are detected, so that dangerous situations can be prevented.

Anomaly detection refers to modeling the normal scene behavior and then detecting behavior that does not conform to it. Thus, behavior patterns that appear frequently are referred to as normal behaviors and those appearing rarely are referred to as abnormal behaviors. In [10], anomaly detection is broadly classified into two types, namely local and global. Local abnormal behavior corresponds to the behavior of a group of objects in a localized region that differs from that of their neighbors in spatio-temporal terms [16]. On the other hand, global abnormal behavior corresponds to the abnormal behavior of a group of

Figure 1: A typical scenario (anomalies circled in red). (a) The region of unstable flow of pilgrims circling around the Kaaba is detected. (b) Sample frame with a detected anomaly (bicycle) in the UCSD dataset.

objects in the whole scene. The key to accurate detection of abnormal behavior is the selection of an appropriate model that properly captures both the local and the global behavior. Figure 1-(a) shows a typical scenario: the red circle marks the detected region of unstable flow around the Kaaba in Mecca. Another example is illustrated in Figure 1-(b): the appearance of the bicyclist, circled in red, represents an anomaly with respect to the overall behavior of its surrounding neighbors.

In this paper, we aim at detecting local and global abnormal behaviors in crowd scenes using a social network model: a data structure consisting of nodes and links between the nodes. In the crowd scene context, nodes can represent people and links reflect the social relationships among the people. First, the unsupervised approach extracts dense tracklets from the crowd motion data in a scene. Second, the video scene is partitioned at spatial and temporal levels; as a result, a set of spatio-temporal cuboids is constructed. The granularity of scene partitioning is proportional to the crowd density. Third, we cluster the objects in each cuboid based on the features of their tracklets, such as velocity, curvature and direction, to build the local social networks, which model the objects' local behavior. Fourth, for each of the subsequent time windows, the global social network is updated incrementally using its local social networks and the previous window's global social network. By analyzing these social networks (local and global), normal, or dominant, behavior and abnormal behavior can be identified. An earlier version of this work appeared in [51].

2. Related Work

Crowd behavior analysis comprises motion information extraction and behavior modeling. The model is then used to distinguish between normal and abnormal behavior. Basharat et al. [5] use object tracking [43] to detect unusual events in image sequences. Similarly, Ali et al. [12,13] track subjects in high-density crowd scenes captured from a distance. They learn the direction of motion as prior information based on a force model (floor fields). However, their method requires a manual selection of the individuals to be tracked in the crowd, which hinders automatic recognition of unexpected behavior. Moreover, floor fields are unreliable in crowded scenes as they result in highly inconsistent trajectories. For motion modeling, features such as optical flow [11], tracklets [26], or Mixtures of Dynamic Textures [16] are extracted at the pixel level. Different models are then built to cope with occlusion and clutter, including the Gaussian Mixture Model [21] and the Social Force Model [10]. For example, Mehran et al. [10] explore the socio-psychological concept of social force in combination with optical flow to compute interaction forces, which are later combined with Latent Dirichlet Allocation to model normal behaviors and detect abnormal ones. This method is further extended in [11] using Particle Swarm Optimization, in addition to the social force model, to optimize the computed interaction force and thus detect global abnormal activities. Ali and Shah [13] utilize the idea of coherent structures in fluid dynamics for

Figure 2: Scenario of detecting anomalies using the social network model: objects are detected and tracked. A spatio-temporal partitioning is constructed, producing a set of spatio-temporal cuboids that capture spatial and temporal features. A hierarchical social network is built to model crowd behavior. At the bottom level of this hierarchical network, a spatial clustering is applied on each cuboid to detect local anomalies in its local social network. Moving up the hierarchical social network, a hierarchical clustering approach is employed to build the global social network. A temporal clustering is then applied on the global social network to detect global anomalies in each time window. An on-line mechanism is applied to update the global social network for any subsequent time windows.

segmenting dominant crowd flows and detecting flow instabilities. Gaidon et al. [38] structure a video as a tree of nested motion components composed of short-duration point trajectories (tracklets). Chongjing et al. [26] analyze motion patterns by clustering the extracted tracklets in dynamic crowd scenes. The authors of [46] use a spatio-temporal Laplacian eigenmap to extract different crowd activities from videos. Despite the many different representations of video events, many existing works ignore the importance of contextual anomaly in the field of crowd analysis. A contextual anomaly arises when an individual exhibits behavior similar to others but anomalous in a specific context (e.g. neighborhood) [15]. Jiang et al. [15] focus on detecting contextual anomalies in the context of motion using statistical analysis. Leach et al. [18] detect subtle context-dependent behavioral anomalies based on contextual information. Besides motion information, other works include important object features such as appearance or size. Mahadevan et al. [16] apply Mixtures of Dynamic Textures (MDT) to jointly model the appearance and dynamics of crowded scenes. Their approach investigates both temporal and spatial abnormalities. Due to the reported heavy computational cost of [16], Reddy et al. [17] propose a more robust anomaly detection algorithm with relatively low complexity that analyzes size, motion and texture.

An important aspect of crowd behavior analysis is event/behavior recognition. Regular motion patterns such as direction and speed [24,25,40] can be used to estimate the behavior of a crowd in a given environment; a behavior that deviates from the normal behavior is considered abnormal. Two types of approaches are commonly used: object-based approaches and holistic approaches [10]. In object-based

approaches, the crowd is considered as a collection of individuals. Ozturk et al. [24] propose an approach for clustering a set of flow vectors into local dominant motion flows, which are later combined to determine the global dominant motion flows in a crowd scene. In holistic approaches, a crowd, or a portion of a crowd, is treated as a single entity to estimate the regular and abnormal motions. For example, Mehran et al. [10] explored the social force model, which is based on socio-psychological studies, to model the behavior of a crowd.

Anomaly Detection Techniques: To ensure public safety, the main objective of crowd analysis involves modeling the crowd dynamics and detecting video anomalies in the scene. However, detecting anomalies in crowd scenes is a challenging task due to the following [1][2]: the large number of moving objects in crowd scenes easily weakens a local anomaly detector; abnormal events are difficult to model, as they are rare and last for a short period of time; and it is difficult to obtain a training dataset that covers every possible normal behavior. The authors of [48] propose an informative structural context descriptor (SCD), in addition to the 3-D discrete cosine transform (DCT), for describing the crowd individual. Ullah et al. [20], Mehran et al. [10] and Cui et al. [22] detect abnormal events in escape-panic scenes. Ullah et al. [20] initialize a fixed grid of particles to extract the crowd motion features, and a Gaussian Mixture Model [27] is adopted to learn the crowd behavior. The closest works to the proposed method, in terms of considering people's social behaviors, are [10] and [22]. Mehran et al. [10] attempt to detect abnormal events with a social force model; a bag-of-words method and Latent Dirichlet Allocation are exploited to discriminate between normal and abnormal frames, and abnormal areas are localized as those exhibiting higher force magnitudes. Cui et al. [22] propose interaction energy potentials to model group activities based on social behavior analysis and finally detect escape-panic behavior in a crowd. Saligrama et al. [19] categorize approaches for detecting abnormal behavior in crowd scenes into two types: local abnormal event (LAE) or global abnormal event (GAE). For LAE, most of the state-of-the-art methods extract motion or appearance features from local patches, as in Mahadevan et al. [16]. For GAE, Mehran et al. [10] detect abnormal crowd behavior by adopting the social force model and then using Latent Dirichlet Allocation to discriminate abnormal frames from normal ones. The above methods are often computationally expensive [20]. We propose a simple yet robust approach in which motion features are extracted from corner features by repeatedly generating features-to-track over a temporal window using the KLT (Kanade-Lucas-Tomasi) tracker [28,39]. In addition, our method is application-independent: it detects abnormal behaviors in videos from different applications. The proposed method not only detects anomalous events accurately, but also adapts itself to both spatial and temporal changes witnessed in the environment over time. An overview of the proposed method is shown in Figure 2.

3. Scene Modeling with Social Networks

We are given a set of objects in a crowd scene, O = {o_1, o_2, ..., o_N}, where N is the number of objects and each o_i \in R^d is a feature vector, based on spatial and temporal characteristics, describing an individual object, with d being the feature dimensionality.
In order to capture the dynamics of the crowd, we extract motion tracklets [24,25] using the KLT keypoint tracker [39]. A tracklet, t_i, is a fragment of a long trajectory, tracked across a small number of frames. The short duration of tracklets limits drifting problems, i.e. trajectories deviating from the underlying tracked object.
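As a concrete illustration of this step, the sketch below extracts short tracklets with OpenCV's KLT tracker over one temporal window. This is not the authors' code; the window length and minimum feature distance follow values reported later in the paper, while the remaining parameter values are assumptions.

```python
# Illustrative sketch (not the authors' implementation): short KLT tracklets
# over a fixed-length temporal window using OpenCV.
import cv2
import numpy as np

def extract_tracklets(frames, window_len=50, min_distance=3, max_corners=500):
    """Track corner features across `window_len` frames; each tracklet is an
    array of (x, y) positions of one feature point."""
    gray = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames[:window_len]]
    pts = cv2.goodFeaturesToTrack(gray[0], maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=min_distance)
    tracklets = [[p.ravel()] for p in pts]
    alive = np.ones(len(pts), dtype=bool)
    prev_img, prev_pts = gray[0], pts
    for img in gray[1:]:
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_img, img, prev_pts, None)
        for i, (p, ok) in enumerate(zip(nxt, status.ravel())):
            if alive[i] and ok:
                tracklets[i].append(p.ravel())   # extend the live tracklet
            else:
                alive[i] = False                 # lost features end their tracklet
        prev_img, prev_pts = img, nxt
    return [np.array(t) for t in tracklets if len(t) > 1]
```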

3.1 Similarity Features

In order to group tracklets that exhibit similar behavior, we focus on selecting features that account for (i) the direction and magnitude of the motion, (ii) the distance between the moving objects, and (iii) the motion curvature of the object. Thus, we use the following measures.

Cosine Similarity: Let d_i and d_j denote the dominant directions of tracklets t_i and t_j, respectively. The cosine similarity is defined as [36]:

S_{dir}(t_i, t_j) = \frac{1}{2} \left( 1 + \frac{d_i \cdot d_j}{\|d_i\| \, \|d_j\|} \right)    (1)

Magnitude Similarity: Let m_i and m_j denote the magnitudes (i.e. the distance between the first and the last spatial coordinates) of tracklets t_i and t_j, respectively. The magnitude similarity is defined as:

S_{mag}(t_i, t_j) = 1 - \frac{|m_i - m_j|}{\max(m_i, m_j)}    (2)

Combining both similarity measures linearly produces a weighted similarity measure [36]:

S_{dm}(t_i, t_j) = \alpha \, S_{dir}(t_i, t_j) + (1 - \alpha) \, S_{mag}(t_i, t_j), with 0 \le \alpha \le 1,    (3)

where α is the parameter that balances the effect of the direction and the magnitude of the two tracklets.

Velocity Similarity Measure: The velocity is computed for each tracklet and Dynamic Time Warping (DTW) is used to measure the velocity similarity S_{vel}(t_i, t_j) between two tracklets t_i and t_j. We use the following local distance measure:

d_{vel}(t_i, t_j) = \exp \left( - \frac{d_x^2}{2\sigma_x^2} - \frac{d_y^2}{2\sigma_y^2} \right)    (4)

where d_x and d_y represent the velocity distances between the two tracklets along the x-axis and y-axis, respectively, and σ_x and σ_y are the standard deviation parameters of the x-velocity and the y-velocity, respectively.

Spatio-Temporal Curvature Similarity Measure: This measure, which captures discontinuities in the velocity, acceleration and position of an object, is given by:

\kappa = \frac{|v_x a_y - v_y a_x|}{(v_x^2 + v_y^2)^{3/2}}    (5)

where v_x and v_y are the x and y components of the velocity and a_x and a_y are the x and y components of the acceleration. The resulting similarity, denoted S_{curv}(t_i, t_j), is computed using DTW with the following local distance measure:

d_{curv}(t_i, t_j) = \exp \left( - \frac{(\kappa_i - \kappa_j)^2}{2\sigma_\kappa^2} \right)    (6)

where κ_i - κ_j is the curvature distance between tracklets t_i and t_j, and σ_κ is the standard deviation parameter of the spatio-temporal curvature.
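The sketch below shows how the pairwise similarities of Equations (1)-(6) can be computed for tracklets given as arrays of (x, y) positions. The normalizations are reconstructed from the text above, so the constants and the plain DTW routine are assumptions rather than the authors' implementation.

```python
# Illustrative sketch of the tracklet similarity measures (Eqs. (1)-(6)).
import numpy as np

def dtw_similarity(a, b, local_sim):
    """Plain DTW alignment; returns the average local similarity along the
    optimal path (1 = identical, 0 = dissimilar)."""
    if len(a) == 0 or len(b) == 0:
        return 0.0
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 1.0 - local_sim(a[i - 1], b[j - 1])   # DTW minimises dissimilarity
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return 1.0 - D[len(a), len(b)] / (len(a) + len(b))

def direction_magnitude_sim(ti, tj, alpha=0.5):
    """Eqs. (1)-(3): cosine similarity of dominant directions, magnitude
    similarity, and their weighted combination."""
    di, dj = ti[-1] - ti[0], tj[-1] - tj[0]
    cos = 0.5 * (1 + np.dot(di, dj) / (np.linalg.norm(di) * np.linalg.norm(dj) + 1e-9))
    mi, mj = np.linalg.norm(di), np.linalg.norm(dj)
    mag = 1.0 - abs(mi - mj) / (max(mi, mj) + 1e-9)
    return alpha * cos + (1 - alpha) * mag

def velocity_sim(ti, tj, sigma_x=1.0, sigma_y=1.0):
    """Eq. (4): DTW over per-frame velocities with a Gaussian local similarity."""
    vi, vj = np.diff(ti, axis=0), np.diff(tj, axis=0)
    local = lambda u, v: np.exp(-(u[0] - v[0])**2 / (2 * sigma_x**2)
                                - (u[1] - v[1])**2 / (2 * sigma_y**2))
    return dtw_similarity(vi, vj, local)

def curvature_profile(t):
    """Eq. (5): spatio-temporal curvature from velocity and acceleration."""
    v = np.diff(t, axis=0)
    a = np.diff(v, axis=0)
    v = v[1:]                                     # align lengths with acceleration
    return np.abs(v[:, 0] * a[:, 1] - v[:, 1] * a[:, 0]) / \
           ((v[:, 0]**2 + v[:, 1]**2)**1.5 + 1e-9)

def curvature_sim(ti, tj, sigma_k=1.0):
    """Eq. (6): DTW over curvature profiles with a Gaussian local similarity."""
    local = lambda u, v: np.exp(-(u - v)**2 / (2 * sigma_k**2))
    return dtw_similarity(curvature_profile(ti), curvature_profile(tj), local)
```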

Figure 3: Spatio-temporal cuboids at various spatial and temporal scales (represented by the upper arrows). The scale representation scheme is performed in the opposite direction (represented by the lower arrows).

The similarity measures defined above are used by the proposed method (SNM) to cover the following cases. Cosine Similarity: this covers tracklets with zero Euclidean distance but moving in different directions; they are considered dissimilar by SNM. Magnitude Similarity: this applies to tracklets moving in the same direction but having different lengths; a short tracklet is not considered similar to a long tracklet. Velocity Similarity: spatially dissimilar tracklets moving in the same direction and having almost equal lengths are not considered similar if they exhibit different motion behavior. Spatio-Temporal Curvature Similarity: tracklets similar in all the measures defined above but with different curvatures are considered dissimilar.

We are now able to define our two social similarity measures between two tracklets t_i and t_j:

Definition 1 (Velocity-based Social Similarity Measure). Let S_{dm}(t_i, t_j) denote the direction-magnitude similarity and S_{vel}(t_i, t_j) denote the velocity similarity between the two tracklets t_i and t_j. Then the social similarity measure between t_i and t_j is defined as

S^{v}_{soc}(t_i, t_j) = \beta \, S_{dm}(t_i, t_j) + (1 - \beta) \, S_{vel}(t_i, t_j), with 0 \le \beta \le 1,    (7)

where β is a parameter that balances the effect of direction and magnitude on one hand and the velocities of the two tracklets on the other.

Definition 2 (Curvature-based Social Similarity Measure). Let S_{dm}(t_i, t_j) denote the direction-magnitude similarity and S_{curv}(t_i, t_j) denote the spatio-temporal curvature similarity between the two tracklets t_i and t_j. Then the social similarity measure between t_i and t_j is defined as

S^{\kappa}_{soc}(t_i, t_j) = \gamma \, S_{dm}(t_i, t_j) + (1 - \gamma) \, S_{curv}(t_i, t_j), with 0 \le \gamma \le 1,    (8)

where γ is a parameter that balances the effect of direction and magnitude on one hand and the spatio-temporal curvatures of the two tracklets on the other.
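Continuing the sketch above, the two social similarity measures of Definitions 1 and 2 are convex combinations of the helpers already defined; the default weights follow the values reported in Section 4, under the parameterization assumed here.

```python
# Illustrative sketch of Eqs. (7) and (8); reuses direction_magnitude_sim,
# velocity_sim and curvature_sim from the previous sketch.
def social_sim_velocity(ti, tj, beta=0.4, alpha=0.5, sigma_x=1.0, sigma_y=1.0):
    """Definition 1: direction-magnitude similarity blended with velocity similarity."""
    return beta * direction_magnitude_sim(ti, tj, alpha) + \
           (1 - beta) * velocity_sim(ti, tj, sigma_x, sigma_y)

def social_sim_curvature(ti, tj, gamma=0.8, alpha=0.5, sigma_k=1.0):
    """Definition 2: direction-magnitude similarity blended with curvature similarity."""
    return gamma * direction_magnitude_sim(ti, tj, alpha) + \
           (1 - gamma) * curvature_sim(ti, tj, sigma_k)
```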

Figure 4: The procedure for producing an LSN per cuboid. (a) We partition the current time window into cuboids using the spatio-temporal partitioning approach, and then determine the tracklets within each processed cuboid; the tracklets in the cuboid are colored differently for clarity. (b) Symmetric adjacency matrix of tracklet-node similarity weights. (c) Connected tracklet nodes make up a local social network component, represented by its average-feature centroid (black dot).

The above two measures capture different behaviors of the scene. As we shall show, one of these measures might be more appropriate for a certain crowd scene than the other, depending on the application and the scene dynamics. Our social similarity measures are thus flexible and work with different features depending on the nature of the video.

3.2 Spatio-Temporal Partitioning

Inspired by multi-resolution approaches, we sub-divide the input videos into smaller regions. This spatio-temporal partitioning is performed at various spatial and temporal scales, producing a unique set of spatio-temporal volumes. We refer to an individual spatio-temporal volume as a cuboid; the numbers of rows and columns of the spatio-temporal partitions within a window Ω depend on the chosen scale. Each 3D spatio-temporal cuboid in a video is of size n_x × n_y × n_f, where n_x × n_y is the spatial dimension of the cuboid and n_f is the depth (the number of frames). Each cuboid consists of the tracklets found within its extent; therefore, a tracklet may belong to one or more cuboids within a time window Ω. Depending on the dataset and the crowd dynamics, the spatial partitioning may range from 2 × 2 to m × m blocks, with a temporal window of f frames. We observed that a shorter duration (< 50 frames) yields erroneous tracklets due to motion blur and self-occlusions; therefore, in our experiments we set f to 50. Figure 3 illustrates the construction of the video hierarchy, forming spatio-temporal cuboids at various levels, i.e. 2 × 2, 4 × 4 or 8 × 8: the higher the density of the crowd, the finer the granularity of the partitioning needed to capture the details of the scene dynamics (illustrated by the right-to-left arrows in Figure 3).
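A minimal sketch of the spatial part of this partitioning is shown below, assuming tracklets are given as (x, y) position arrays for one time window; the grid size m is a parameter, and a tracklet is assigned to every cuboid that any of its points falls into.

```python
# Illustrative sketch: assigning tracklets to an m x m grid of spatial cells
# (one time window, i.e. one temporal slice of cuboids).
from collections import defaultdict

def partition_into_cuboids(tracklets, frame_w, frame_h, m=8):
    """Return {(row, col): [tracklet indices]} for an m x m spatial grid."""
    cell_w, cell_h = frame_w / m, frame_h / m
    cuboids = defaultdict(set)
    for idx, t in enumerate(tracklets):
        for x, y in t:
            col = min(int(x // cell_w), m - 1)
            row = min(int(y // cell_h), m - 1)
            cuboids[(row, col)].add(idx)        # a tracklet may span several cells
    return {cell: sorted(ids) for cell, ids in cuboids.items()}
```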

3.3 Building Social Networks

A social network is represented as a graph [30] in which nodes represent objects and edges represent social interactions between people [29]. That is, each tracklet is represented by a node in the social network model, and the edge between two nodes represents the social interaction between them. The social interaction weights are based on our social similarity measure, Equation (7) or Equation (8). On a graph, the geodesic between two nodes is a path connecting the nodes with the smallest number of edges. Since similarly behaving tracklets also need to be spatially close to each other, in addition to the social similarity measure we use the closeness centrality among connected tracklet nodes, for pruning only. The closeness centrality is defined as (the inverse of) the average distance to all other nodes [44]. If similar nodes are spatially distant (greater than a threshold), their connecting edge is deleted. This is followed by applying the connected component algorithm [35] to the whole network to find the connected components of the social network in each cuboid. Each extracted connected component is considered a cluster and denoted a local social network (LSN). The aim is to identify the different dynamics of the scene, represented by the clusters in the network.

3.3.1 Building Local Social Networks (LSN)

Each cluster obtained above is denoted by its centroid, computed as the mean of the spatial (x, y), direction (θ), magnitude (m), velocity (v) and/or curvature (κ) features of the tracklets belonging to the cluster:

c_k = (\bar{x}, \bar{y}, \bar{\theta}, \bar{m}, \bar{v}, \bar{\kappa})    (9)

By finding the connected components, as defined above, we end up with a number of clusters within each cuboid, each referred to as a local social network (LSN). Figure 4 shows an example of processing one cuboid (shaded in red); the six extracted tracklets are colored differently for clarity, and each node is colored by its tracklet's color for ease of reference. Algorithm 1 uses a threshold on the computed social similarity measure between two tracklets t_i and t_j, and a threshold on the computed closeness centrality measure between tracklets. The results are stored in a symmetric similarity adjacency matrix A, in which a non-zero value represents the weight of similarity between two tracklet nodes: zero indicates dissimilarity and one indicates the highest similarity. Finally, for each component of a local social network, its centroid is computed (represented by the black dot in Figure 4, bottom right). Hence, the algorithm outputs the local social network component(s) of each cuboid together with the corresponding centroid(s).
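The sketch below mirrors this LSN construction: pairwise social similarities are thresholded into an adjacency matrix, edges between spatially distant tracklets are pruned, and connected components become LSN components with centroids as in Equation (9). For simplicity the pruning here uses the distance between tracklet starting points instead of closeness centrality, and the threshold names are assumptions.

```python
# Illustrative sketch of building local social networks (LSNs) within one cuboid.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def tracklet_features(t):
    """Eq. (9) features: mean position, direction angle, magnitude, mean speed."""
    d = t[-1] - t[0]
    v = np.diff(t, axis=0)
    return np.array([t[:, 0].mean(), t[:, 1].mean(),
                     np.arctan2(d[1], d[0]), np.linalg.norm(d),
                     np.linalg.norm(v, axis=1).mean()])

def build_lsns(tracklets, sim_fn, t_sim=0.7, t_dist=30.0):
    """Threshold pairwise similarities into an adjacency matrix, prune spatially
    distant pairs, and return connected components with their centroids."""
    n = len(tracklets)
    A = np.zeros((n, n))
    starts = np.array([t[0] for t in tracklets])
    for i in range(n):
        for j in range(i + 1, n):
            s = sim_fn(tracklets[i], tracklets[j])
            if s >= t_sim and np.linalg.norm(starts[i] - starts[j]) <= t_dist:
                A[i, j] = A[j, i] = s
    n_comp, labels = connected_components(csr_matrix(A), directed=False)
    lsns = []
    for c in range(n_comp):
        members = np.where(labels == c)[0]
        feats = np.array([tracklet_features(tracklets[k]) for k in members])
        lsns.append({"members": members, "centroid": feats.mean(axis=0)})
    return lsns
```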

3.3.2 Building Global Social Networks (GSN)

The GSN gives a general view of the activities occurring in a time window Ω. Hierarchical Agglomerative Clustering (HAC) [34] is applied to merge similar LSNs from different cuboids in a time window into a global social network, GSN, in a hierarchical fashion. Merging two components of the local social networks (say LSN_i and LSN_j) is based on the social similarity, Equation (7) or Equation (8), between their centroids c_i and c_j. That is, if the social similarity value between c_i and c_j is above the threshold, then LSN_i and LSN_j are merged together into a bigger social network and its new centroid is computed. This process continues up the hierarchy until no more merging is possible. The resulting global social network, GSN, may consist of one or more components. This bottom-up approach, as shown in Figure 5, aims to merge similar LSNs from different cuboids and finally discover the global social network within the time window Ω. This is performed by the GlobalSocialNetwork algorithm, which takes as input the local social networks from all cuboids within time window Ω, including the representative centroid of each LSN. The results of the LSN-to-LSN comparisons are stored in a symmetric similarity adjacency matrix, representing an undirected graph, in which a non-zero value represents the weight of similarity between two LSN components and zero indicates dissimilarity.
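A simplified sketch of this bottom-up construction is given below: LSN components (as produced by the previous sketch) are merged greedily whenever the social similarity between their centroids exceeds a threshold, and merged components receive a size-weighted centroid. This is an illustrative agglomerative pass, not the authors' exact HAC implementation.

```python
# Illustrative sketch of merging LSN components from all cuboids into a GSN.
def build_gsn(lsns, centroid_sim, t_lsn=0.7):
    """Greedy agglomerative merging of LSN components by centroid similarity."""
    comps = [dict(l) for l in lsns]
    merged = True
    while merged:
        merged = False
        for i in range(len(comps)):
            for j in range(i + 1, len(comps)):
                if centroid_sim(comps[i]["centroid"], comps[j]["centroid"]) >= t_lsn:
                    a, b = comps[i], comps.pop(j)
                    na, nb = len(a["members"]), len(b["members"])
                    a["members"] = list(a["members"]) + list(b["members"])
                    a["centroid"] = (na * a["centroid"] + nb * b["centroid"]) / (na + nb)
                    merged = True
                    break
            if merged:
                break
    return comps
```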

3.4 Anomaly Detection

The social similarity measure and the size of a social network are essential for detecting abnormal behavior. The social similarity measure separates the rare actions from the dominant ones: a resulting social network with very few nodes is regarded as deviating from the other, dominant social network(s). Thus, isolated and small (few-node) social networks are marked as anomalies. If the relative local size of a tested local social network component LSN_i, i.e. the ratio of the size of LSN_i to that of the largest local social network component in the cuboid, is less than t_s, then LSN_i is classified as an anomaly. t_s is set to 0.5 in our experiments:

RS(LSN_i) = \frac{|LSN_i|}{\max_{1 \le k \le n} |LSN_k|} < t_s    (10)

Table 1 shows an example of a window Ω consisting of 50 frames partitioned into 2 × 2 spatio-temporal cuboids. Processing the cuboid enclosed in red, for instance, produces seven LSN components, of which four are normal and the other three are abnormal. As shown in the Anomaly Detection Algorithm, the abnormality classification is based on the social similarity measure, Equation (7) or Equation (8), followed by the size of an LSN relative to the largest local social network component within the cuboid, Equation (10). Once an anomalous LSN is identified, it is localized simply by using the spatial features of the tracklet members in the anomalous LSN. In other words, the social similarity measure isolates the anomalous components from the normal ones, and the size feature is then used to identify those anomalous components (see the Anomaly Detection Algorithm). Within each window Ω, global anomalies at the top level of the hierarchy are identified using Equation (11): if the relative global size of a target GSN component GSN_j, i.e. the ratio of the size of GSN_j to that of the largest GSN component in Ω, is less than t_g, then GSN_j is classified as anomalous:

RS(GSN_j) = \frac{|GSN_j|}{\max_{1 \le k \le m} |GSN_k|} < t_g    (11)
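The size-based rules of Equations (10) and (11) reduce to the same test applied at the cuboid level and at the window level, as in the sketch below; t_s = 0.5 follows the paper, while reusing the same default for t_g is an assumption.

```python
# Illustrative sketch of the relative-size anomaly rule (Eqs. (10) and (11)).
import numpy as np

def flag_anomalous(components, t_ratio=0.5):
    """components: list of dicts with a 'members' list; returns a boolean flag
    per component, True when its size relative to the largest component in the
    same cuboid (local) or window (global) falls below the threshold."""
    sizes = np.array([len(c["members"]) for c in components], dtype=float)
    if len(sizes) == 0:
        return np.array([], dtype=bool)
    return (sizes / sizes.max()) < t_ratio

# Localization: the spatial features of the tracklets inside a flagged LSN
# directly give the anomalous image region.
```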

An example of global anomaly detection is shown in Table 2. By using the hierarchical partitioning scheme, we can zoom in to finer details of the crowd behavior, which increases the efficiency of detecting and localizing anomalies, especially local anomalies. Also, as we move up the hierarchy, certain tracklet nodes classified as abnormal in a lower-level LSN component might be merged with other nodes into a normal higher-level LSN component, and vice versa.

3.5 GSN-Update

The proposed hierarchical model maintains a link between the local and global social networks. In this phase, we seek to learn any newly observed events and, in turn, update the global social network, implicitly incorporating any changes at the bottom level of the hierarchy, i.e. in the local social networks. The GSN-Update process works as follows (illustrated in Figure 6; a sketch of this update appears after the list):

1. For every two successive windows, the cluster centroids of Equation (9) are compared, instead of performing tracklet-to-tracklet comparisons, which would demand a high computational time.
2. GSN components of the current window are merged with GSN components of the previous window, i.e. the corresponding GSNs of the two windows are compared. Two GSN components are merged only if their centroids exhibit similar features.
3. Non-matching GSN components from the two windows are dealt with as follows:
   a. Non-matching global social network component(s) that belong to the previously processed time window are destroyed.
   b. Non-matching global social network component(s) that belong to the current time window are preserved.
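A sketch of the GSN-Update step between two consecutive windows: components are matched by centroid similarity, matched pairs are merged, unmatched components of the previous window are dropped, and unmatched components of the current window are kept. The matching threshold and the exact merge rule are assumptions.

```python
# Illustrative sketch of GSN-Update between the previous and current windows.
def gsn_update(prev_gsn, curr_gsn, centroid_sim, t_match=0.7):
    """Return the updated GSN components for the current window."""
    updated, matched_prev = [], set()
    for comp in curr_gsn:
        best, best_sim = None, t_match
        for i, prev in enumerate(prev_gsn):
            sim = centroid_sim(comp["centroid"], prev["centroid"])
            if i not in matched_prev and sim >= best_sim:
                best, best_sim = i, sim
        if best is not None:
            prev = prev_gsn[best]
            matched_prev.add(best)
            n_c, n_p = len(comp["members"]), len(prev["members"])
            # merge: keep the current window's tracklets, blend the centroids
            comp = {"members": comp["members"],
                    "centroid": (n_c * comp["centroid"] + n_p * prev["centroid"]) / (n_c + n_p)}
        updated.append(comp)     # unmatched current components are preserved
    # unmatched previous components are destroyed (simply not carried over)
    return updated
```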

Figure 5: Constructing the GSN by hierarchically grouping similar LSN components from different cuboids. (a) Once the local social network components are obtained, the hierarchical clustering algorithm is employed to obtain a coarser view of the scene. (b) The bottom-up approach merges similar local social network components from different cuboids towards discovering the global social network within the time window.

As an example, Figure 6 shows three successive windows. GSN-Update on the first two windows merges their similar GSNs, i.e. the red components and the blue components, respectively. The component satisfying condition (a) above is destroyed, while the component satisfying condition (b) above is preserved. The result of GSN-Update between the first and second windows is then used as the input for the GSN-Update with the successive window, and so on.

Table 1: Example of local anomaly detection. The time window is partitioned into 2 × 2 cuboids; the cuboid enclosed in red produces 7 local social network components, of which 4 are classified as normal and 3 exhibit abnormal behaviors.

LSN Component   Dominant Feature(s)                 No. of Tracklets   Local Anomaly
LSN1            Direction & Magnitude               1                  Yes
LSN2            Magnitude                           24                 No
LSN3            Direction & Magnitude               29                 No
LSN4            Direction & Magnitude               15                 No
LSN5            Direction & Magnitude               14                 No
LSN6            Direction & Magnitude               8                  Yes
LSN7            Direction & Magnitude & Velocity    7                  Yes

Table 2: Example of global anomaly detection. Window 1 produces 3 global social network components, colored red, green and yellow; of these, 2 are classified as normal social network components and 1 exhibits abnormal activity.

GSN Component   Dominant Feature(s)      No. of Tracklets   Global Anomaly
GSN1            Direction                345                No
GSN2            Direction & Velocity     21                 Yes
GSN3            Direction                274                No

Figure 6: GSN-Update on three successive windows. Global social network components are labeled with the index of their window. Similar global social network components from different windows share the same shape and color and are merged together, as shown in level 1: the red and the blue global social network components are merged, respectively. A global social network component that no longer matches anything from the previously processed window is destroyed, while newly appearing, non-matching global social network components of the current window are preserved. The result of GSN-Update between the first and second windows is used as the input for the GSN-Update with the successive window.

4. Experiments & Results

All experiments were run on a PC with an Intel(R) Core(TM) i5 3.10 GHz CPU and 4 GB of RAM, using a MATLAB implementation. We have used the following publicly available datasets.

UCSD Dataset: The UCSD anomaly detection dataset uses an elevated stationary camera overlooking pedestrian walkways on the UCSD campus. The dataset represents a real scene in which the abnormalities occur naturally, and it contains videos of two different pedestrian scenes: UCSD Ped1, containing groups of people walking towards and away from the camera with some amount of perspective distortion; and UCSD Ped2, containing groups of people walking parallel to the camera plane. The crowd density in the walkways is variable, ranging from sparse to crowded. The normal events contain only pedestrians. The abnormal events are due to either (1) the appearance of non-pedestrian entities in the walkways and/or (2) anomalous pedestrian motion patterns. Commonly occurring anomalies include small carts, skaters, bikes, and people in wheelchairs. The UCSD dataset provides both frame-level and pixel-level ground truth.

UCD Dataset: The UCD dataset contains two outdoor videos of students moving between two buildings, lasting 12 and 5 minutes, respectively. Each sequence is segmented into two different subsequences, with people mainly moving in a horizontal direction in the scene. This dataset defines an anomaly as a deviation from what has been observed beforehand. The ground truth consists of the frame numbers at which someone starts moving against the dominant crowd motion.

In our experiments, the parameter β of Equation (7) and the parameter γ of Equation (8) are determined experimentally to be 0.4 and 0.8, respectively.

4.1 Performance Evaluation

For both local and global scene understanding and anomaly detection we use [16]: a frame-level criterion, in which a frame is considered anomalous, and denoted as positive, if it contains at least one abnormal pixel; and a pixel-level criterion, in which a frame is considered anomalous if (i) it is positive and (ii) at least 40 percent of its anomalous pixels are truly identified. For GSN evaluation, the Receiver Operating Characteristic (ROC) curve is computed and the Area Under the Curve (AUC) is used for comparison. In addition, we measure [16]: the Equal Error Rate (EER), the percentage of misclassified frames when the false positive rate (FPR) equals the false negative rate (miss rate), i.e. FPR = 1 - true positive rate (TPR), calculated for both pixel- and frame-level analyses; and the Rate of Detection (RD), which reports the detection rate at the equal-error point of the anomaly localization component, i.e. RD = 1 - EER [16].

A. UCSD dataset: Local Social Network Evaluation

We use 200 frames at a resolution of 158 × 238. Features are detected with a minimum distance of 3 pixels to obtain complete coverage of the scene. We partition the dataset into 4 time windows; each time window of 50 frames is partitioned into 8 × 8 spatio-temporal cuboids. The dataset contains the biker anomaly. The results are compared against the ground truth in terms of frame accuracy and pixel accuracy, and the average frame accuracy and average pixel accuracy are computed for each time window. As shown in Table 3, green cuboids represent false positive abnormal behavior in some of their LSNs, while orange cuboids represent true positive abnormal activity; the abnormal region is enclosed by a red border. For instance, the first time window, starting at frame 1 and ending at frame 50, contains 7 false positive cuboids. The false abnormality is due to the existence of tracklets that exhibit rare features relative to the surrounding neighborhood; cuboid 12, for example, produces an abnormal LSN whose tracklets exhibit a short magnitude compared to the dominant, longer tracklets in the surroundings. In the next time window, which starts at frame 51 and ends at frame 100, two abnormal cuboids out of four are detected correctly; the correctly detected abnormal LSN components are in cuboid 47 and cuboid 55. However, the abnormal LSN components in cuboids 46 and 54 are wrongly classified as normal because: (i) in cuboid 46, the incomplete abnormal tracklet(s) exhibit features similar to their surrounding neighbors and were thus assigned to the normal LSN component, and (ii) cuboid 54 contains only one tracklet. The tracklets in cuboid 54 in the third window, which starts at frame 101 and ends at frame 150, were removed.
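For reference, the frame-level metrics defined in Section 4.1 (AUC, EER and RD = 1 - EER) can be computed from per-frame anomaly scores as in the sketch below; the use of scikit-learn here is an assumption, not the authors' MATLAB implementation.

```python
# Illustrative sketch of the frame-level evaluation metrics.
import numpy as np
from sklearn.metrics import roc_curve, auc

def frame_level_metrics(scores, labels):
    """scores: per-frame anomaly scores; labels: 1 for anomalous frames, 0 otherwise."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr                               # miss rate
    eer_idx = np.argmin(np.abs(fpr - fnr))        # point where FPR ~= miss rate
    eer = (fpr[eer_idx] + fnr[eer_idx]) / 2.0
    return {"AUC": auc(fpr, tpr), "EER": eer, "RD": 1.0 - eer}
```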

Table 3: Average frame accuracy and average pixel accuracy of the LSN algorithm. The dataset is partitioned into 4 time windows, each consisting of 50 frames and partitioned into 8 × 8 spatio-temporal cuboids. Green-colored blocks represent false positive abnormal behavior while orange-colored blocks represent true positive abnormal activity. For each time window, the frame accuracy and pixel accuracy of the abnormal local social network components are averaged. (Columns: frame sequence, anomalous spatio-temporal cuboids, frame accuracy, pixel accuracy; the numeric entries are not reproduced here.)

Moreover, cuboid 46 later gained more feature information regarding the abnormal tracklets, which were then distinguished from their neighborhood. The same applies to cuboid 27 in the fourth window, which starts at frame 151 and ends at frame 200. Further analysis of Table 3 shows that, as more tracklet information is gained or new behavior is captured by the newly produced tracklets, both the average frame accuracy and the average pixel accuracy increase over time. Moreover, each time window reflects the changing environment in the scene.

Figure 7: Frame-level ROC curves on the UCSD Ped1 dataset. Left: our proposed approach SNM. Right: the state-of-the-art methods from [23].

Figure 8: Frame-level ROC curve of our proposed approach SNM on the UCSD Ped2 dataset.

As an example, cuboid 55 in the third window was removed from the abnormal behaviors, reflecting the ongoing crowd scenario in the video. Another example is cuboid 36, which was wrongly classified as normal in the first window, but whose truly abnormal detections later increased to an average frame accuracy of 78.5 and an average rate of detection of 83 over the next two windows.

B. UCSD dataset: Global Social Network Evaluation

For performance comparison, we choose five state-of-the-art methods, namely: the Mixture of Dynamic Textures (DTM) [16], the Social Force model (SF) [10], MPPCA [45], the Social Force model with MPPCA (MPPCA+SF) [16], and the optical-flow monitoring method of Adam et al. [46]. The quantitative results of these five methods are obtained from [16]. In addition, we also include the Sparse Reconstruction Cost (Sparse) method of [23]. Our proposed method is abbreviated SNM (Social Network Model). Two frame-level ROC curves are produced, for UCSD Ped1 and UCSD Ped2, as shown in Figure 7 and Figure 8, respectively. As UCSD Ped2 does not provide pixel-level ground truth, we only present the pixel-level ROC curve of UCSD Ped1, in Figure 9. In addition, Figure 10 shows the Equal Error Rate (EER) of our approach and the state-of-the-art methods.

Figure 9: Pixel-level ROC curves on the UCSD Ped1 dataset. Left: our proposed approach SNM. Right: the state-of-the-art methods from [23].

We also report the Area Under the Curve (AUC) values (Table 4) and the Rate of Detection (RD) values (Table 5); missing entries indicate unavailable results. Examples of frames with anomalies detected by the proposed approach and by some state-of-the-art methods are shown in Figure 11. Our frame-level ROC curve on UCSD Ped2 shows a higher anomaly detection rate than the existing methods, and on UCSD Ped1 it is only slightly lower than Sparse [23]. Our pixel-level results on UCSD Ped1 (see Table 5) outperform all state-of-the-art methods. For the EER, our frame-level EER on UCSD Ped1 (about 20%) outperforms all methods except the Sparse method [23] (about 19%), see Figure 10. However, for the more precise pixel-level criterion (RD) on UCSD Ped1 (see Table 5), our rate of detection of 48.5% exceeds the 46% of [23] and significantly outperforms all the state-of-the-art methods. For the AUC on the UCSD Ped1 and UCSD Ped2 datasets, we obtain 86.7% on average, which also outperforms all the other methods including [23], whose average AUC is 86.1% (see Table 4). This indicates that the remaining approaches may be enjoying good detection rates in the anomaly detection task due to lucky hits in terms of the frame-level criterion. Some image results are shown in Figure 11 (the abnormal events are labeled by red masks), in which the first column is generated by the DTM method [16], the second column by the MPPCA+SF method [16], and the third and fourth columns by our SNM method. The MPPCA+SF method completely misses the biker in Figure 11-(b). The DTM method does detect nearly all of the abnormal events, but its foreground mask is too large and hence inaccurate, as shown in the first column of Figure 11. Our method detects (third column of Figure 11) and tracks (fourth column of Figure 11) the abnormal objects, such as bikers, skaters and small carts, robustly and with more accurate masks. The proposed SNM method thus outperforms the other state-of-the-art methods. Our approach achieves a high anomaly localization rate due to the efficiency of the hierarchical construction of spatio-temporal cuboids at different spatial and temporal scales.

Figure 10: Frame-level Equal Error Rate on the UCSD Ped1 and UCSD Ped2 datasets.

Table 4: Quantitative comparison of the abnormality detection algorithms tested: AUC over the UCSD Ped1 and UCSD Ped2 datasets, and the average over the two datasets. A missing entry indicates an unavailable result.

Anomaly Detection Experiment: AUC
Algorithm    DTM [16]   SF [10]   MPPCA [45]   MPPCA+SF [16]   Adam et al. [46]   Sparse [23]   SNM
UCSD Ped1    81.8%      67.5%     59.0%        66.8%           -                  86.0%         85.5%
UCSD Ped2    84.8%      62.3%     77.4%        71.0%           63.4%              86.1%         87.9%
Average      83.3%      64.9%     68.2%        68.9%           63.4%              86.1%         86.7%

Table 5: Quantitative comparison of the rate of detection (RD) at the equal-error point for the anomaly localization task on UCSD Ped1, for DTM [16], SF [10], MPPCA [45], MPPCA+SF [16], Adam et al. [46], Sparse [23] and SNM. Our SNM approach achieves the highest detection rate among the compared methods (48.5%, versus 46% for Sparse [23]); the remaining numeric entries are not reproduced here.

Figure 11: Examples of abnormal detections using (i) the DTM approach [16], (ii) the MPPCA+SF approach [16], (iii) our detection approach and (iv) our tracking approach. The abnormal-detection foreground mask of DTM is too large, so its results are not accurate. MPPCA+SF inaccurately detects the small cart in (a), completely misses the bike in (b), completely misses the skater in (c), and produces a spurious abnormality at the near end of the camera view in (c). In contrast, our social network model approach outperforms the above approaches with a high detection accuracy.

Figure 12: The ROC curves of different spatio-temporal scales (2 × 2, 4 × 4 and 8 × 8) on the UCSD Ped2 dataset.

Table 6: The AUC of different spatio-temporal scales (2 × 2, 4 × 4 and 8 × 8) on the UCSD Ped2 dataset (the numeric AUC entries are not reproduced here).

Figure 13: Anomaly detection in the UCD dataset. The first row shows frames from video sequences representing normal crowd behavior. Examples of frames containing anomalies are shown in the second row for the GMM method and in the third row for the proposed method (SNM).

Spatio-temporal partitioning at different scales: In order to evaluate the impact of different spatio-temporal scales, we experiment with 2 × 2, 4 × 4 and 8 × 8 spatio-temporal scales on the UCSD Ped2 dataset. The comparative AUC values are shown in Table 6 and the ROC curves in Figure 12. The 8 × 8 spatio-temporal partitioning achieves the best result, the 4 × 4 partitioning degrades slightly, and the 2 × 2 partitioning produces the worst result, because it provides only a coarse view of the scene.

C. Comparison

From the above experiments we note the following. (i) SNM is general: it covers both local and global anomalous events. In contrast, the work of Adam et al. [46] detects only local abnormal events using Gaussian Mixture Models; furthermore, SF [10] is a spatial abnormality technique while MPPCA [45] is a temporal abnormality technique. (ii) SNM is an unsupervised method, whereas Sparse [23] requires a pre-learnt dictionary and the MPPCA+SF approach [16] requires a large training dataset; moreover, the performance of MPPCA+SF [16] degrades if the training dataset is small, and the social force model relies on offline learning. (iii) SNM extends to online event detection via its incremental update mechanism. Although Sparse [23] also supports online event detection, its training is completely offline, and DTM [16] is likewise an offline approach.

D. UCD Dataset

For this dataset, we compare SNM with the Gaussian mixture model (GMM) [21] and the crowd segmentation model (CSM) [4], based on the anomaly detection ground truth. The performance is measured by the detection accuracy rate. A quantitative comparison of SNM against the ground truth is shown in Table 7. SNM achieves the higher anomaly detection accuracy in all four video segments. For anomaly localization, we compare our results to the GMM [21] method. Figure 13 shows the results obtained on the UCD dataset: frames from video scenes where the crowd exhibits normal behavior are shown in the first row, the second row shows results from GMM [21], and the third row shows results from SNM. The anomalous behaviors are a student running from bottom left to top right and a group of four students running from left to right. Both events are identified as anomalous since they deviate from the dominant crowd motion. Although GMM [21] correctly identifies the anomalous behavior, highlighted by the red dots, SNM highlights the anomalous region of interest more comprehensively. Such dense coverage of the region of interest leads to better tracking and better performance in identifying the anomalous frames, as shown in Table 7. As mentioned before, the compositional information of a video enables the method to handle illumination variations, as shown in Figure 14: a person leaving a shop and moving in a direction opposite to the crowd motion (first row of Figure 14) is detected (second row of Figure 14), tracked, and successfully identified as an anomalous motion pattern (last row of Figure 14).

Table 7: Comparison of our method with the CSM method based on the UCD ground truth for anomaly detection (percent accuracy). Columns: segment number, ground-truth frames, SNM detection results (frames), SNM accuracy, CSM accuracy [4]. The per-segment frame counts and SNM accuracies are not reproduced here; the CSM accuracies for segments 1-4 are 93.7%, 88.5%, 82.6% and 84.9%, respectively.

E. Spatio-temporal partitioning at different scales

Similarly, we test the influence of different spatio-temporal scales (2 × 2, 4 × 4 and 8 × 8) on video segment 1 of the UCD dataset. The percent-accuracy results are tabulated in Table 8. The 4 × 4 and 8 × 8 spatio-temporal partitionings produce similar results; however, the 8 × 8 partitioning consumes more computational time than the 4 × 4 partitioning. For the UCD dataset, a higher scale gives better accuracy than a lower scale since the video is crowded; that is, higher resolutions tend to capture the details of a crowded video better than coarser resolutions.

Figure 14: An anomalous object detected successfully under illumination variation. The first row shows a sample frame with the object present; the second row shows the detection of the anomalous object, which is then tracked in the third row.

Table 8: Accuracy results of different spatio-temporal scales (2 × 2, 4 × 4 and 8 × 8) on segment 1 of the UCD dataset (the numeric entries are not reproduced here).

5. Conclusion

The proposed social network model, SNM, captures the scene dynamics and crowd interactions spatially and temporally by modeling crowd scenes as a social network. SNM has been shown to outperform the state-of-the-art methods in detecting and localizing anomalies in crowd scenes. Moreover, SNM allows for adaptive partitioning of crowd scenes to capture the details of the scene dynamics and thus detect fine anomalous events in the scene, as required by an application. Using a set of benchmark crowd analysis video sequences, our experiments show that the detection accuracy of SNM is higher than that of the other methods.

References:

[1] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: A survey," ACM Computing Surveys (CSUR), no. 3, September.
[2] J. Junior, S. Mussef, C. Jung, "Crowd Analysis using Computer Vision Techniques," IEEE Signal Processing Magazine, vol. 27, no. 5.
[3] S. Saxena, F. Brémond, M. Thonnat, and R. Ma, "Crowd behavior recognition for video surveillance," Advanced Concepts for Intelligent Vision Systems, pp. 1-12.
[4] H. Ullah and N. Conci, "Crowd motion segmentation and anomaly detection via multi-label optimization," ICPR Workshop on Pattern Recognition and Crowd Analysis.
[5] A. Basharat, A. Gritai, and M. Shah, "Learning object motion patterns for anomaly detection and improved object detection," 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, Jun. 2008.
[6] H. Ullah and N. Conci, "Structured Learning for Crowd Motion Segmentation," IEEE International Conference on Image Processing (ICIP).
[7] R. Mazzon, S. F. Tahir, and A. Cavallaro, "Person re-identification in crowd," Pattern Recognition Letters, vol. 33, no. 14, Oct.
[8] W. Ge and R. T. Collins, "Marked point processes for crowd counting," 2009 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 2009.
[9] Z. Wang, H. Liu, Y. Qian, and T. Xu, "Crowd Density Estimation Based on Local Binary Pattern Co-Occurrence Matrix," 2012 IEEE International Conference on Multimedia and Expo Workshops, Jul. 2012.
[10] R. Mehran, A. Oyama, and M. Shah, "Abnormal crowd behavior detection using social force model," IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] R. Raghavendra, A. D. Bue and M. Cristani, "Optimizing interaction force for global anomaly detection in crowded scenes," 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011.
[12] S. Ali and M. Shah, "Floor fields for tracking in high density crowd scenes," Computer Vision - ECCV, pp. 1-14.
[13] S. Ali and M. Shah, "A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis," IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-7.
[14] J. Feng, C. Zhang, and P. Hao, "Online Learning with Self-Organizing Maps for Anomaly Detection in Crowd Scenes," International Conference on Pattern Recognition, Aug.
[15] F. Jiang, Y. Wu and A. K. Katsaggelos, "Detecting contextual anomalies of crowd motion in surveillance video," IEEE International Conference on Image Processing (ICIP).
[16] V. Mahadevan, W. Li and V. Bhalodia, "Anomaly detection in crowded scenes," 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
[17] V. Reddy, C. Sanderson and B. C. Lovell, "Improved anomaly detection in crowded scenes via cell-based analysis of foreground speed, size and texture," 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2011.
[18] M. J. V. Leach, E. P. Sparks and N. M. Robertson, "Contextual anomaly detection in crowded surveillance scenes," Pattern Recognition Letters, vol. 44, Jul.
[19] V. Saligrama and Z. Chen, "Video Anomaly Detection Based on Local Statistical Aggregates," 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[20] H. Ullah, M. Ullah, and N. Conci, "Real-time anomaly detection in dense crowded scenes," SPIE Video Surveillance and Transportation Imaging Applications, vol. 9026, Mar.
[21] H. Ullah, L. Tenuti, and N. Conci, "Gaussian mixtures for anomaly detection in crowded scenes," IS&T/SPIE Electronic Imaging, Mar.
[22] X. Cui, Q. Liu, M. Gao, and D. N. Metaxas, "Abnormal detection using interaction energy potentials," 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2011.
[23] Y. Cong, J. Yuan, and J. Liu, "Sparse reconstruction cost for abnormal event detection," 2011 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 2011.
[24] O. Ozturk, T. Yamasaki, and K. Aizawa, "Detecting Dominant Motion Flows in Unstructured/Structured Crowd Scenes," International Conference on Pattern Recognition (ICPR), Aug.


More information

CROWD MOTION ANALYSIS: SEGMENTATION, ANOMALY DETECTION, AND BEHAVIOR CLASSIFICATION. Habib Ullah

CROWD MOTION ANALYSIS: SEGMENTATION, ANOMALY DETECTION, AND BEHAVIOR CLASSIFICATION. Habib Ullah CROWD MOTION ANALYSIS: SEGMENTATION, ANOMALY DETECTION, AND BEHAVIOR CLASSIFICATION Habib Ullah Advisor: Nicola Conci, PhD February 2015 Abstract The objective of this doctoral study is to develop efficient

More information

Realtime Anomaly Detection using Trajectory-level Crowd Behavior Learning

Realtime Anomaly Detection using Trajectory-level Crowd Behavior Learning Realtime Anomaly Detection using Trajectory-level Crowd Behavior Learning Aniket Bera University of North Carolina Chapel Hill, NC, USA ab@cs.unc.edu Sujeong Kim SRI International Princeton, NJ, USA sujeong.kim@sri.com

More information

Abnormal Event Detection at 150 FPS in MATLAB

Abnormal Event Detection at 150 FPS in MATLAB Abnormal Event Detection at 15 FPS in MATLAB Cewu Lu Jianping Shi Jiaya Jia The Chinese University of Hong Kong {cwlu, jpshi, leojia}@cse.cuhk.edu.hk Abstract Speedy abnormal event detection meets the

More information

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS Cognitive Robotics Original: David G. Lowe, 004 Summary: Coen van Leeuwen, s1460919 Abstract: This article presents a method to extract

More information

Mobile Human Detection Systems based on Sliding Windows Approach-A Review

Mobile Human Detection Systems based on Sliding Windows Approach-A Review Mobile Human Detection Systems based on Sliding Windows Approach-A Review Seminar: Mobile Human detection systems Njieutcheu Tassi cedrique Rovile Department of Computer Engineering University of Heidelberg

More information

A TRAJECTORY CLUSTERING APPROACH TO CROWD FLOW SEGMENTATION IN VIDEOS. Rahul Sharma, Tanaya Guha

A TRAJECTORY CLUSTERING APPROACH TO CROWD FLOW SEGMENTATION IN VIDEOS. Rahul Sharma, Tanaya Guha A TRAJECTORY CLUSTERING APPROACH TO CROWD FLOW SEGMENTATION IN VIDEOS Rahul Sharma, Tanaya Guha Electrical Engineering, Indian Institute of Technology Kanpur, India ABSTRACT This work proposes a trajectory

More information

A Robust Wipe Detection Algorithm

A Robust Wipe Detection Algorithm A Robust Wipe Detection Algorithm C. W. Ngo, T. C. Pong & R. T. Chin Department of Computer Science The Hong Kong University of Science & Technology Clear Water Bay, Kowloon, Hong Kong Email: fcwngo, tcpong,

More information

Real-time Detection of Illegally Parked Vehicles Using 1-D Transformation

Real-time Detection of Illegally Parked Vehicles Using 1-D Transformation Real-time Detection of Illegally Parked Vehicles Using 1-D Transformation Jong Taek Lee, M. S. Ryoo, Matthew Riley, and J. K. Aggarwal Computer & Vision Research Center Dept. of Electrical & Computer Engineering,

More information

Self Lane Assignment Using Smart Mobile Camera For Intelligent GPS Navigation and Traffic Interpretation

Self Lane Assignment Using Smart Mobile Camera For Intelligent GPS Navigation and Traffic Interpretation For Intelligent GPS Navigation and Traffic Interpretation Tianshi Gao Stanford University tianshig@stanford.edu 1. Introduction Imagine that you are driving on the highway at 70 mph and trying to figure

More information

Chapter 9 Object Tracking an Overview

Chapter 9 Object Tracking an Overview Chapter 9 Object Tracking an Overview The output of the background subtraction algorithm, described in the previous chapter, is a classification (segmentation) of pixels into foreground pixels (those belonging

More information

QMUL-ACTIVA: Person Runs detection for the TRECVID Surveillance Event Detection task

QMUL-ACTIVA: Person Runs detection for the TRECVID Surveillance Event Detection task QMUL-ACTIVA: Person Runs detection for the TRECVID Surveillance Event Detection task Fahad Daniyal and Andrea Cavallaro Queen Mary University of London Mile End Road, London E1 4NS (United Kingdom) {fahad.daniyal,andrea.cavallaro}@eecs.qmul.ac.uk

More information

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi hrazvi@stanford.edu 1 Introduction: We present a method for discovering visual hierarchy in a set of images. Automatically grouping

More information

PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE

PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE Hongyu Liang, Jinchen Wu, and Kaiqi Huang National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Science

More information

Beyond Bags of Features

Beyond Bags of Features : for Recognizing Natural Scene Categories Matching and Modeling Seminar Instructed by Prof. Haim J. Wolfson School of Computer Science Tel Aviv University December 9 th, 2015

More information

An Approach for Real Time Moving Object Extraction based on Edge Region Determination

An Approach for Real Time Moving Object Extraction based on Edge Region Determination An Approach for Real Time Moving Object Extraction based on Edge Region Determination Sabrina Hoque Tuli Department of Computer Science and Engineering, Chittagong University of Engineering and Technology,

More information

CS 664 Segmentation. Daniel Huttenlocher

CS 664 Segmentation. Daniel Huttenlocher CS 664 Segmentation Daniel Huttenlocher Grouping Perceptual Organization Structural relationships between tokens Parallelism, symmetry, alignment Similarity of token properties Often strong psychophysical

More information

CS229: Action Recognition in Tennis

CS229: Action Recognition in Tennis CS229: Action Recognition in Tennis Aman Sikka Stanford University Stanford, CA 94305 Rajbir Kataria Stanford University Stanford, CA 94305 asikka@stanford.edu rkataria@stanford.edu 1. Motivation As active

More information

IMPROVED FACE RECOGNITION USING ICP TECHNIQUES INCAMERA SURVEILLANCE SYSTEMS. Kirthiga, M.E-Communication system, PREC, Thanjavur

IMPROVED FACE RECOGNITION USING ICP TECHNIQUES INCAMERA SURVEILLANCE SYSTEMS. Kirthiga, M.E-Communication system, PREC, Thanjavur IMPROVED FACE RECOGNITION USING ICP TECHNIQUES INCAMERA SURVEILLANCE SYSTEMS Kirthiga, M.E-Communication system, PREC, Thanjavur R.Kannan,Assistant professor,prec Abstract: Face Recognition is important

More information

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin

More information

Texture Segmentation by Windowed Projection

Texture Segmentation by Windowed Projection Texture Segmentation by Windowed Projection 1, 2 Fan-Chen Tseng, 2 Ching-Chi Hsu, 2 Chiou-Shann Fuh 1 Department of Electronic Engineering National I-Lan Institute of Technology e-mail : fctseng@ccmail.ilantech.edu.tw

More information

Selection of Scale-Invariant Parts for Object Class Recognition

Selection of Scale-Invariant Parts for Object Class Recognition Selection of Scale-Invariant Parts for Object Class Recognition Gy. Dorkó and C. Schmid INRIA Rhône-Alpes, GRAVIR-CNRS 655, av. de l Europe, 3833 Montbonnot, France fdorko,schmidg@inrialpes.fr Abstract

More information

High Dense Crowd Pattern and Anomaly Detection Using Statistical Model

High Dense Crowd Pattern and Anomaly Detection Using Statistical Model High Dense Crowd Pattern and Anomaly Detection Using Statistical Model Muhammad Aatif, Amanullah Yasin CASE Pakistan atifmaju@gmail.com amanyasin@gmail.com ABSTRACT: Human crowd behavior analysis is a

More information

Color Local Texture Features Based Face Recognition

Color Local Texture Features Based Face Recognition Color Local Texture Features Based Face Recognition Priyanka V. Bankar Department of Electronics and Communication Engineering SKN Sinhgad College of Engineering, Korti, Pandharpur, Maharashtra, India

More information

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate

More information

IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION. Maral Mesmakhosroshahi, Joohee Kim

IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION. Maral Mesmakhosroshahi, Joohee Kim IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION Maral Mesmakhosroshahi, Joohee Kim Department of Electrical and Computer Engineering Illinois Institute

More information

Extracting Spatio-temporal Local Features Considering Consecutiveness of Motions

Extracting Spatio-temporal Local Features Considering Consecutiveness of Motions Extracting Spatio-temporal Local Features Considering Consecutiveness of Motions Akitsugu Noguchi and Keiji Yanai Department of Computer Science, The University of Electro-Communications, 1-5-1 Chofugaoka,

More information

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011 Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition

More information

Automatic visual recognition for metro surveillance

Automatic visual recognition for metro surveillance Automatic visual recognition for metro surveillance F. Cupillard, M. Thonnat, F. Brémond Orion Research Group, INRIA, Sophia Antipolis, France Abstract We propose in this paper an approach for recognizing

More information

DYNAMIC BACKGROUND SUBTRACTION BASED ON SPATIAL EXTENDED CENTER-SYMMETRIC LOCAL BINARY PATTERN. Gengjian Xue, Jun Sun, Li Song

DYNAMIC BACKGROUND SUBTRACTION BASED ON SPATIAL EXTENDED CENTER-SYMMETRIC LOCAL BINARY PATTERN. Gengjian Xue, Jun Sun, Li Song DYNAMIC BACKGROUND SUBTRACTION BASED ON SPATIAL EXTENDED CENTER-SYMMETRIC LOCAL BINARY PATTERN Gengjian Xue, Jun Sun, Li Song Institute of Image Communication and Information Processing, Shanghai Jiao

More information

2 Proposed Methodology

2 Proposed Methodology 3rd International Conference on Multimedia Technology(ICMT 2013) Object Detection in Image with Complex Background Dong Li, Yali Li, Fei He, Shengjin Wang 1 State Key Laboratory of Intelligent Technology

More information

Research on Recognition and Classification of Moving Objects in Mixed Traffic Based on Video Detection

Research on Recognition and Classification of Moving Objects in Mixed Traffic Based on Video Detection Hu, Qu, Li and Wang 1 Research on Recognition and Classification of Moving Objects in Mixed Traffic Based on Video Detection Hongyu Hu (corresponding author) College of Transportation, Jilin University,

More information

Preliminary Local Feature Selection by Support Vector Machine for Bag of Features

Preliminary Local Feature Selection by Support Vector Machine for Bag of Features Preliminary Local Feature Selection by Support Vector Machine for Bag of Features Tetsu Matsukawa Koji Suzuki Takio Kurita :University of Tsukuba :National Institute of Advanced Industrial Science and

More information

Detecting motion by means of 2D and 3D information

Detecting motion by means of 2D and 3D information Detecting motion by means of 2D and 3D information Federico Tombari Stefano Mattoccia Luigi Di Stefano Fabio Tonelli Department of Electronics Computer Science and Systems (DEIS) Viale Risorgimento 2,

More information

Street Scene: A new dataset and evaluation protocol for video anomaly detection

Street Scene: A new dataset and evaluation protocol for video anomaly detection MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Street Scene: A new dataset and evaluation protocol for video anomaly detection Jones, M.J.; Ramachandra, B. TR2018-188 January 19, 2019 Abstract

More information

International Journal of Research in Advent Technology, Vol.7, No.3, March 2019 E-ISSN: Available online at

International Journal of Research in Advent Technology, Vol.7, No.3, March 2019 E-ISSN: Available online at Performance Evaluation of Ensemble Method Based Outlier Detection Algorithm Priya. M 1, M. Karthikeyan 2 Department of Computer and Information Science, Annamalai University, Annamalai Nagar, Tamil Nadu,

More information

Video Alignment. Literature Survey. Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin

Video Alignment. Literature Survey. Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin Literature Survey Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin Omer Shakil Abstract This literature survey compares various methods

More information

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Ms. Gayatri Attarde 1, Prof. Aarti Deshpande 2 M. E Student, Department of Computer Engineering, GHRCCEM, University

More information

Color Image Segmentation

Color Image Segmentation Color Image Segmentation Yining Deng, B. S. Manjunath and Hyundoo Shin* Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 93106-9560 *Samsung Electronics Inc.

More information

Textural Features for Image Database Retrieval

Textural Features for Image Database Retrieval Textural Features for Image Database Retrieval Selim Aksoy and Robert M. Haralick Intelligent Systems Laboratory Department of Electrical Engineering University of Washington Seattle, WA 98195-2500 {aksoy,haralick}@@isl.ee.washington.edu

More information

Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601

Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601 Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network Nathan Sun CIS601 Introduction Face ID is complicated by alterations to an individual s appearance Beard,

More information

Abnormal Event Detection in Crowded Scenes using Sparse Representation

Abnormal Event Detection in Crowded Scenes using Sparse Representation Abnormal Event Detection in Crowded Scenes using Sparse Representation Yang Cong,, Junsong Yuan and Ji Liu State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences,

More information

Mean shift based object tracking with accurate centroid estimation and adaptive Kernel bandwidth

Mean shift based object tracking with accurate centroid estimation and adaptive Kernel bandwidth Mean shift based object tracking with accurate centroid estimation and adaptive Kernel bandwidth ShilpaWakode 1, Dr. Krishna Warhade 2, Dr. Vijay Wadhai 3, Dr. Nitin Choudhari 4 1234 Electronics department

More information

Robotics Programming Laboratory

Robotics Programming Laboratory Chair of Software Engineering Robotics Programming Laboratory Bertrand Meyer Jiwon Shin Lecture 8: Robot Perception Perception http://pascallin.ecs.soton.ac.uk/challenges/voc/databases.html#caltech car

More information

Graph-Based Superpixel Labeling for Enhancement of Online Video Segmentation

Graph-Based Superpixel Labeling for Enhancement of Online Video Segmentation Graph-Based Superpixel Labeling for Enhancement of Online Video Segmentation Alaa E. Abdel-Hakim Electrical Engineering Department Assiut University Assiut, Egypt alaa.aly@eng.au.edu.eg Mostafa Izz Cairo

More information

Scanner Parameter Estimation Using Bilevel Scans of Star Charts

Scanner Parameter Estimation Using Bilevel Scans of Star Charts ICDAR, Seattle WA September Scanner Parameter Estimation Using Bilevel Scans of Star Charts Elisa H. Barney Smith Electrical and Computer Engineering Department Boise State University, Boise, Idaho 8375

More information

ABNORMAL GROUP BEHAVIOUR DETECTION FOR OUTDOOR ENVIRONMENT

ABNORMAL GROUP BEHAVIOUR DETECTION FOR OUTDOOR ENVIRONMENT ABNORMAL GROUP BEHAVIOUR DETECTION FOR OUTDOOR ENVIRONMENT Pooja N S 1, Suketha 2 1 Department of CSE, SCEM, Karnataka, India 2 Department of CSE, SCEM, Karnataka, India ABSTRACT The main objective of

More information

Crowd Scene Understanding with Coherent Recurrent Neural Networks

Crowd Scene Understanding with Coherent Recurrent Neural Networks Crowd Scene Understanding with Coherent Recurrent Neural Networks Hang Su, Yinpeng Dong, Jun Zhu May 22, 2016 Hang Su, Yinpeng Dong, Jun Zhu IJCAI 2016 May 22, 2016 1 / 26 Outline 1 Introduction 2 LSTM

More information

An ICA based Approach for Complex Color Scene Text Binarization

An ICA based Approach for Complex Color Scene Text Binarization An ICA based Approach for Complex Color Scene Text Binarization Siddharth Kherada IIIT-Hyderabad, India siddharth.kherada@research.iiit.ac.in Anoop M. Namboodiri IIIT-Hyderabad, India anoop@iiit.ac.in

More information

Detecting and Tracking a Moving Object in a Dynamic Background using Color-Based Optical Flow

Detecting and Tracking a Moving Object in a Dynamic Background using Color-Based Optical Flow www.ijarcet.org 1758 International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) Detecting and Tracking a Moving Object in a Dynamic Background using Color-Based Optical Flow

More information

CHAPTER 4 SEMANTIC REGION-BASED IMAGE RETRIEVAL (SRBIR)

CHAPTER 4 SEMANTIC REGION-BASED IMAGE RETRIEVAL (SRBIR) 63 CHAPTER 4 SEMANTIC REGION-BASED IMAGE RETRIEVAL (SRBIR) 4.1 INTRODUCTION The Semantic Region Based Image Retrieval (SRBIR) system automatically segments the dominant foreground region and retrieves

More information

Tracking Pedestrians using Local Spatio-temporal Motion Patterns in Extremely Crowded Scenes

Tracking Pedestrians using Local Spatio-temporal Motion Patterns in Extremely Crowded Scenes 1 Submitted to IEEE Trans. on Pattern Analysis and Machine Intelligence Regular Paper Tracking Pedestrians using Local Spatio-temporal Motion Patterns in Extremely Crowded Scenes Louis Kratz and Ko Nishino

More information

Motion Estimation. There are three main types (or applications) of motion estimation:

Motion Estimation. There are three main types (or applications) of motion estimation: Members: D91922016 朱威達 R93922010 林聖凱 R93922044 謝俊瑋 Motion Estimation There are three main types (or applications) of motion estimation: Parametric motion (image alignment) The main idea of parametric motion

More information

Local Features: Detection, Description & Matching

Local Features: Detection, Description & Matching Local Features: Detection, Description & Matching Lecture 08 Computer Vision Material Citations Dr George Stockman Professor Emeritus, Michigan State University Dr David Lowe Professor, University of British

More information

Tri-modal Human Body Segmentation

Tri-modal Human Body Segmentation Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4

More information

AN IMPROVED K-MEANS CLUSTERING ALGORITHM FOR IMAGE SEGMENTATION

AN IMPROVED K-MEANS CLUSTERING ALGORITHM FOR IMAGE SEGMENTATION AN IMPROVED K-MEANS CLUSTERING ALGORITHM FOR IMAGE SEGMENTATION WILLIAM ROBSON SCHWARTZ University of Maryland, Department of Computer Science College Park, MD, USA, 20742-327, schwartz@cs.umd.edu RICARDO

More information

Automatic Tracking of Moving Objects in Video for Surveillance Applications

Automatic Tracking of Moving Objects in Video for Surveillance Applications Automatic Tracking of Moving Objects in Video for Surveillance Applications Manjunath Narayana Committee: Dr. Donna Haverkamp (Chair) Dr. Arvin Agah Dr. James Miller Department of Electrical Engineering

More information

Efficient Acquisition of Human Existence Priors from Motion Trajectories

Efficient Acquisition of Human Existence Priors from Motion Trajectories Efficient Acquisition of Human Existence Priors from Motion Trajectories Hitoshi Habe Hidehito Nakagawa Masatsugu Kidode Graduate School of Information Science, Nara Institute of Science and Technology

More information

Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification

Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 16, Issue 6, Ver. VI (Nov Dec. 2014), PP 29-33 Analysis of Image and Video Using Color, Texture and Shape Features

More information

Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision

Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision What Happened Last Time? Human 3D perception (3D cinema) Computational stereo Intuitive explanation of what is meant by disparity Stereo matching

More information

Classifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao

Classifying Images with Visual/Textual Cues. By Steven Kappes and Yan Cao Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao Motivation Image search Building large sets of classified images Robotics Background Object recognition is unsolved Deformable shaped

More information

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM

CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM 1 PHYO THET KHIN, 2 LAI LAI WIN KYI 1,2 Department of Information Technology, Mandalay Technological University The Republic of the Union of Myanmar

More information

TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK

TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK 1 Po-Jen Lai ( 賴柏任 ), 2 Chiou-Shann Fuh ( 傅楸善 ) 1 Dept. of Electrical Engineering, National Taiwan University, Taiwan 2 Dept.

More information

PROBLEM FORMULATION AND RESEARCH METHODOLOGY

PROBLEM FORMULATION AND RESEARCH METHODOLOGY PROBLEM FORMULATION AND RESEARCH METHODOLOGY ON THE SOFT COMPUTING BASED APPROACHES FOR OBJECT DETECTION AND TRACKING IN VIDEOS CHAPTER 3 PROBLEM FORMULATION AND RESEARCH METHODOLOGY The foregoing chapter

More information

3D Face and Hand Tracking for American Sign Language Recognition

3D Face and Hand Tracking for American Sign Language Recognition 3D Face and Hand Tracking for American Sign Language Recognition NSF-ITR (2004-2008) D. Metaxas, A. Elgammal, V. Pavlovic (Rutgers Univ.) C. Neidle (Boston Univ.) C. Vogler (Gallaudet) The need for automated

More information

Critique: Efficient Iris Recognition by Characterizing Key Local Variations

Critique: Efficient Iris Recognition by Characterizing Key Local Variations Critique: Efficient Iris Recognition by Characterizing Key Local Variations Authors: L. Ma, T. Tan, Y. Wang, D. Zhang Published: IEEE Transactions on Image Processing, Vol. 13, No. 6 Critique By: Christopher

More information

Short Survey on Static Hand Gesture Recognition

Short Survey on Static Hand Gesture Recognition Short Survey on Static Hand Gesture Recognition Huu-Hung Huynh University of Science and Technology The University of Danang, Vietnam Duc-Hoang Vo University of Science and Technology The University of

More information

Robust and accurate change detection under sudden illumination variations

Robust and accurate change detection under sudden illumination variations Robust and accurate change detection under sudden illumination variations Luigi Di Stefano Federico Tombari Stefano Mattoccia Errico De Lisi Department of Electronics Computer Science and Systems (DEIS)

More information

On-line Real-time Crowd Behavior Detection in Video Sequences

On-line Real-time Crowd Behavior Detection in Video Sequences On-line Real-time Crowd Behavior Detection in Video Sequences Andrea Pennisi a, Domenico D. Bloisi a,, Luca Iocchi a a Department of Computer, Control, and Management Engineering Sapienza University of

More information

Feature Tracking and Optical Flow

Feature Tracking and Optical Flow Feature Tracking and Optical Flow Prof. D. Stricker Doz. G. Bleser Many slides adapted from James Hays, Derek Hoeim, Lana Lazebnik, Silvio Saverse, who 1 in turn adapted slides from Steve Seitz, Rick Szeliski,

More information

Object detection using non-redundant local Binary Patterns

Object detection using non-redundant local Binary Patterns University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2010 Object detection using non-redundant local Binary Patterns Duc Thanh

More information

EE795: Computer Vision and Intelligent Systems

EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 14 130307 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Stereo Dense Motion Estimation Translational

More information