Dynamic saliency models and human attention: a comparative study on videos


Nicolas Riche, Matei Mancas, Dubravko Culibrk, Vladimir Crnojevic, Bernard Gosselin, Thierry Dutoit
University of Mons, Belgium; University of Novi Sad, Serbia

Abstract. Significant progress has been made on computational models of bottom-up visual attention (saliency). However, efficient ways of comparing these models for still images remain an open research question. The problem is even more challenging when dealing with videos and dynamic saliency. This paper proposes a framework for dynamic-saliency model evaluation, based on a new database of diverse videos for which eye-tracking data has been collected. In addition, we present evaluation results obtained for 4 state-of-the-art dynamic-saliency models, two of which have not been verified on eye-tracking data before.

1 Introduction

When faced with visual stimuli, the human visual system (HVS) does not process the whole scene in parallel. Part of the visual information sensed by the eyes is discarded in a systematic manner in order to attend to objects of interest [1][2]. The ability to orient rapidly towards salient objects in a cluttered visual scene allows an organism to quickly detect possible prey, mates or predators in the visual world, making it a clear evolutionary advantage [1].

Current research considers attentional deployment a two-component mechanism [1]. Subjects selectively direct attention to objects in a scene using both bottom-up, image-based saliency cues and top-down, task-dependent cues; an idea that dates back to the 19th-century work of William James [3]. Bottom-up processing is driven by the stimulus presented [4]. Some stimuli are conspicuous, or salient, in a given context. This suggests that saliency is computed in a pre-attentive and involuntary manner across the entire visual field, most probably in terms of hierarchical centre-surround mechanisms. The same holds for moving stimuli: they are perceived as moving only if they undergo motion different from that of their wider surround [2].

While the bottom-up factors that influence attention are well understood [5], the integration of top-down knowledge into these models remains an open, task-specific problem. Because of this, and because bottom-up saliency can be seen as a generic approximation of attention in the very first viewing moments, applications of visual attention commonly rely on bottom-up models [6][7][8][9][10].
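As a toy illustration of the centre-surround principle mentioned above (this sketch is not part of the original study, and the scale parameters are arbitrary), a location can be scored by how much its local response differs from a wider surround, e.g. via a Difference-of-Gaussians:

```python
# Minimal centre-surround saliency sketch: a Difference-of-Gaussians responds
# strongly where a location differs from its wider surround.
import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround_saliency(intensity, sigma_center=2.0, sigma_surround=8.0):
    """intensity: 2D grayscale array; sigmas are illustrative scales."""
    center = gaussian_filter(intensity.astype(float), sigma_center)
    surround = gaussian_filter(intensity.astype(float), sigma_surround)
    return np.abs(center - surround)   # high where local != surround
```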

Significant progress has been made on computational models of bottom-up visual saliency for still images [11][12][13][14]. Motion cues, despite their importance, have until recently usually been considered an extension of static saliency. Dynamic saliency models, operating on videos, are gaining interest [15][10][16]. Effective ways of comparing the performance of such models are still lacking, as video-based saliency raises several issues in comparative studies that do not arise for still images. Moreover, while eye-tracking databases for still images are becoming available for model assessment, video databases are still scarce.

In this study, we propose a framework for dynamic-saliency model comparison, based on ground-truth eye-tracking data collected from humans viewing a new, freely available video database [17]. We evaluate 4 state-of-the-art saliency models and discuss their performance on the proposed database. Two of the evaluated models have not previously been verified using eye-tracking data, and none has been evaluated on such an extensive set of videos.

The rest of the paper is organized as follows: Section 2 provides an overview of related published work. Section 3 is dedicated to the description of the proposed methodology. The results can be found in Section 4. Finally, Section 5 contains the discussion and our conclusions.

2 Related Work

Zhang et al. [14] proposed a Bayesian framework for saliency detection (SUN model). Their approach is based on the assumption that visual saliency is the probability of a target (e.g., food or predators) at every location, given the visual features observed. Bottom-up saliency is considered equal to what is known in information theory as the self-information of the random variable representing the visual features at a point. Saliency therefore depends solely on the probability distribution of feature values. Zhang et al. estimate this probability by computing feature response maps on a set of 138 images of natural scenes. The distribution for each feature is parametrized by fitting a zero-mean generalized Gaussian. The authors did not propose their own features, but chose to test their model using biologically plausible Difference-of-Gaussian filters, akin to those used in the seminal paper of Itti et al. [18]. In addition, features proposed by Bruce and Tsotsos [12] were considered; these are linear features derived from a set of natural images using Independent Component Analysis (ICA). In their study, Zhang et al. concluded that the performance of their approach is much better when the independent ICA features are used instead of the dependent DoG features. These features therefore form the foundation of the algorithm tested in this study.

Seo and Milanfar [13] proposed a framework for static and space-time saliency detection (SEO model). Their saliency model relies solely on centre-surround differences between a local window centred on a pixel location and its neighbouring windows. To represent image structure in the windows, they proposed the use of Local Steering Kernels (LSK). These features are canonical kernels fitted to pixel value differences based on estimated gradients. To take motion into account and achieve space-time (dynamic) saliency detection, Seo and Milanfar extend the pixel location with a time coordinate: the central window and the surrounding region become spatio-temporal cubes.
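The self-information idea behind SUN (saliency as -log p of the observed features; the same rarity principle reappears in the Mancas model below) can be sketched as follows. This is a hedged illustration only: it fits a simple histogram density rather than the generalized Gaussian used by Zhang et al., and generic feature responses stand in for their ICA features.

```python
# SUN-style self-information sketch: rare feature responses are salient.
import numpy as np

def self_information_saliency(feature_map, train_responses, n_bins=64):
    """feature_map: responses on the test image; train_responses: responses
    pooled over natural scenes, used to estimate p(F)."""
    hist, edges = np.histogram(train_responses, bins=n_bins, density=True)
    hist = np.clip(hist, 1e-8, None)                     # avoid log(0)
    idx = np.clip(np.digitize(feature_map, edges) - 1, 0, n_bins - 1)
    return -np.log(hist[idx])                            # saliency = -log p(F)
```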

Similar to the approach of Zhang et al., they estimate the conditional probability of a point being salient, given the features extracted for both the central window and the surrounding region. However, rather than using a parametric estimate, Seo and Milanfar estimate this conditional probability using Kernel Density Estimation (KDE). The saliency is then computed as the centre value of a normalized adaptive kernel computed over the centre-plus-surround region.

In [10], Culibrk et al. proposed the use of a multi-scale background modelling and foreground segmentation approach as an efficient saliency model (Culibrk model), driven by both motion and simple static cues, which adheres to the motion-saliency principles reported in [2]. The model employs the principles of multi-scale processing, cross-scale motion consistency, outlier detection and temporal coherence. The algorithm maintains a multi-scale model of the background in the form of a Gaussian pyramid. The background frames at each level are obtained by infinite-impulse-response (running-average) filtering, commonly used in background subtraction [19][20]. This allows the approach to take temporal consistency between frames into account and to reduce the saliency of static objects as time passes. Rather than attempting to learn the distribution of the features, outlier detection [21] is used to detect salient changes in the frame. The assumption is that the salient changes are those that differ significantly from the changes undergone by most of the pixels in the frame.

The Mancas model, proposed in [15], uses only dynamic features such as speed and direction. No static cues about colours, grey levels or orientations are included in the model. The algorithm is based on three main steps: motion feature extraction, spatio-temporal filtering, and rare motion extraction. The model is intended to quantify rare and abnormal motion. As a first step, features are extracted using Farneback's algorithm for optical flow [22]. The features are then discretized into 4 directions (north, south, west, east) and 5 speeds (very slow, slow, mean, fast, very fast). Next, a spatio-temporal low-pass filter is applied to the discretized feature channels; its purpose is to keep only consistent spatio-temporal events and provide a spatio-temporal description of each feature. After the filtering of each of the 9 feature channels (4 directions, 5 speeds), a histogram with 5 bins is computed for each resulting image and the self-information of the pixels in a given bin is computed. Self-information provides a pixel saliency index, and the different feature channels are fused using the maximum operator. The final saliency map specifies the rarity of the statistics of a given video volume at several scales.
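A minimal single-scale, single-frame-pair sketch of this pipeline is given below, assuming OpenCV's Farneback implementation. It is not the authors' code: the spatio-temporal filtering stage is omitted, and the quantile-based speed bins are our assumption for the five speed categories.

```python
# Motion-rarity sketch in the spirit of the Mancas model:
# optical flow -> direction/speed channels -> per-channel self-information
# -> max fusion. Spatio-temporal filtering and multi-scale processing omitted.
import cv2
import numpy as np

def motion_rarity_map(prev_gray, gray, n_bins=5):
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    # Four direction channels (quadrants of the flow angle), weighted by speed.
    channels = [np.where((ang >= a) & (ang < a + np.pi / 2), mag, 0.0)
                for a in np.arange(4) * (np.pi / 2)]
    # Five speed channels from flow-magnitude quantiles (very slow .. very fast).
    q = np.percentile(mag, [0, 20, 40, 60, 80, 100])
    q[-1] += 1e-6                                 # make the top bin inclusive
    channels += [((mag >= lo) & (mag < hi)).astype(float)
                 for lo, hi in zip(q[:-1], q[1:])]
    saliency = np.zeros_like(mag, dtype=float)
    for ch in channels:
        hist, edges = np.histogram(ch, bins=n_bins)
        p = np.clip(hist / float(ch.size), 1e-8, None)
        idx = np.clip(np.digitize(ch, edges) - 1, 0, n_bins - 1)
        saliency = np.maximum(saliency, -np.log(p[idx]))   # max-operator fusion
    return saliency
```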

3 Human attention versus dynamic saliency models

3.1 ASCMN (Abnormal Surveillance Crowd Moving Noise) dataset

The comparison of automatic saliency models first requires ground-truth data. For overt attention and the detection of eye fixation positions, the ground truth of human attentional behaviour can be measured with an eye-tracking device. Eye-tracking devices let us acquire, for several users, the precise gaze position at a video frame rate (30 frames per second).

There are very few databases containing both video and eye-tracking data. The best known is Itti's CRCNS-ORIG video database [23], which contains 50 videos along with eye-tracking data for 8 viewers. The DIEM dataset [24] contains 85 videos and eye data collected from 250 participants, who did not, however, necessarily view all the videos. In [25], an eye-tracking database with 12 standard videos used in image compression and quality estimation, viewed by 15 people, was proposed. Finally, the Lübeck dataset [26] contains, in addition to static images, 18 videos viewed by 54 subjects. All these databases contain some videos with different kinds of motion in the scene (the sequences are extracted mostly from TV programs, Hollywood movies, standard video databases or video games). However, none of them was designed specifically to contain anomalous motion which would attract attention in the presence of other motion and thus enable the testing of dynamic-saliency models.

The proposed ASCMN database [17] attempts to fill this gap. It contains videos obtained from Itti's CRCNS database, Vasconcelos's database [27] and a standard complex-background video surveillance database [19]. These have been extended with Internet and proprietary crowd movies. The database is divided into 5 classes of movies, as described in Table 1. In addition to videos for which eye-tracking data has previously been published in existing databases, the classes cover a type of video lacking in the other databases: videos that contain motion abnormalities and crowd motion. Eye-tracking data for the complex-background surveillance videos included in ASCMN has also not previously been published. Sample frames for the different classes of videos are shown in Figure 1. The ASCMN database therefore provides data covering a wider spectrum of video types than the existing databases, and accumulates previously published videos suitable for dynamic-saliency model evaluation. ASCMN contains 24 videos, together with eye-tracking data from 13 viewers, acquired using a commercial FaceLab eye-tracking system [28].

Table 1. The 5 classes of videos contained in the ASCMN database

1) ABNORMAL (abnormal motion): some moving blobs have a different speed or direction compared to the main stream (Figure 1, line 1). Videos 2, 4, 16, 18, 20.
2) SURVEILLANCE (video surveillance style): classical surveillance camera with no special motion event (Figure 1, line 2). Videos 1, 3, 5, 9.
3) CROWD (crowd motion): motion of more or less dense crowds (Figure 1, line 3). Videos 8, 10, 12, 14, 21.
4) MOVING (moving camera): videos taken with a moving camera (Figure 1, line 4). Videos 6, 19, 22, 24.
5) NOISE (motion noise with sudden salient motion): no motion during several seconds, followed by sudden important motion (Figure 1, line 5). Videos 7, 11, 13, 15, 17, 23.

Fig. 1. The 5 classes of videos from the ASCMN database. The first line represents ABNORMAL motion, with bikes and cars faster than people, for example. The second line shows classical SURVEILLANCE motion with nothing really salient in terms of movement. The third line shows CROWD motion with density increasing from left to right. Line four shows MOVING-camera videos. Line five displays videos with long periods of NOISE (frames 2, 4) and the sudden appearance of a salient object (frames 1, 3).

The FaceLab system allows small head movements and is thus less intrusive than other eye-tracking systems, making the viewer feel more comfortable. The viewers were PhD students and researchers ranging from 23 to 35 years old, both male and female. The eye-gaze positions are recorded and superimposed on the initial video for all the viewers, as shown in the first column of Figure 2. We then low-pass filter the subjects' gaze positions to obtain a heatmap, which can also be superimposed on the corresponding video frame (Figure 2, second column). This post-processing step is useful for estimating the mean gaze density, eliminating outliers and giving higher weight to focus points common to several users. Finally, depending on the metric used to assess the correspondence between the eye-tracking results and an automatic saliency model, a thresholded version of the heatmap can be obtained (Figure 2, third column).
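A minimal sketch of this post-processing follows. It is an illustration under our own assumptions: the Gaussian width and the threshold value are illustrative, not the values used in the study.

```python
# Gaze heatmap sketch: accumulate fixations, low-pass filter, threshold.
import numpy as np
from scipy.ndimage import gaussian_filter

def gaze_heatmap(fixations, shape, sigma=30.0, thresh=0.7):
    """fixations: iterable of (row, col) gaze positions over all viewers;
    shape: (height, width) of the video frame."""
    fix_map = np.zeros(shape, dtype=float)
    for r, c in fixations:
        fix_map[int(r), int(c)] += 1.0      # shared focus points get more weight
    heat = gaussian_filter(fix_map, sigma)  # low-pass filtering -> heatmap
    heat /= heat.max() + 1e-12              # normalize to [0, 1]
    return heat, (heat >= thresh)           # heatmap and thresholded binary mask
```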

Fig. 2. Left column: aggregated eye-tracking results; each red dot is the gaze position of one viewer. The middle column shows the smoothed gaze locations, producing heatmaps superimposed on the corresponding frame. The right column shows a thresholded version of the heatmap.

3.2 Comparison metrics for dynamic saliency models

Several similarity measures can be used to compare saliency models with eye-tracking data. We choose three of them, due to their complementary nature: the Normalized Scanpath Saliency (NSS) [29], the Correlation Coefficient (CC) and the area under the Receiver Operating Characteristic curve (AUC-ROC) [30]. These measures were designed to use fixations, but this approach is not optimal with video data, because the subjects' gaze does not correspond precisely to objects in motion (due to lag or anticipation; see Section 4.1) and there are few fixations per frame (several per second). Thus, we propose using fixation heatmaps instead of the fixations themselves to conduct the comparison. The goal of choosing more than one comparison metric is to ensure that the discussion of the results is as independent as possible from the choice of metric. Different metrics do not necessarily give the same result, but when two metrics agree, the interpretation is more robust.

For the initial comparison, we chose the NSS metric, which is the average of the response values at human eye positions in a model's saliency map. To compute it, the saliency map is normalized to have zero mean and unit standard deviation. The eye-tracking map can then be thresholded to become a binary mask, as shown in the third column of Figure 2. For the second metric, CC, the same procedure was used. To calculate both metrics, we used the freely available Matlab functions implemented by Ali Borji [31]. For NSS and CC, a selective threshold of 0.7 was chosen for the eye heatmap. It is set empirically, low enough to provide reasonably large regions but high enough to avoid potentially uninteresting regions: these two measures mainly validate the presence of high salience in the areas of high eye-gaze density.

The final evaluation is made using AUC-ROC, the area under the ROC curve. For this metric, we normalize the saliency map and use the density eye-tracking map (the heatmap in Figure 2, second column) directly. The heatmap is then thresholded to create a binary mask that separates the positive samples from the negative. Contrary to the NSS and CC measures, this threshold is much less selective and merely intends to eliminate the areas containing no fixations. Indeed, the ROC measure mainly focuses on the order and position of fixations, not on the fixation density. By thresholding the saliency map at several different thresholds and plotting the True Positive Rate (TPR) versus the False Positive Rate (FPR), a ROC curve is obtained. Once the ROC curve is computed, the Area Under the Curve (AUC) can be calculated. To compute this metric, we used the freely available Matlab functions implemented by Brian Lau [32].
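The three metrics can be sketched compactly as below. This is not the Matlab code of Borji [31] or Lau [32] used in the study; scikit-learn's ROC routine stands in for the threshold sweep, and the weak fixation threshold is an illustrative value.

```python
# Sketch of the three comparison metrics on a saliency map vs. a gaze heatmap.
import numpy as np
from sklearn.metrics import roc_auc_score

def nss(sal, fix_mask):
    """Mean normalized saliency at gazed locations (fix_mask: boolean array)."""
    s = (sal - sal.mean()) / (sal.std() + 1e-12)   # zero mean, unit std
    return s[fix_mask].mean()

def cc(sal, heat):
    """Pearson correlation between saliency map and gaze heatmap."""
    return np.corrcoef(sal.ravel(), heat.ravel())[0, 1]

def auc_roc(sal, heat, fix_thresh=0.1):
    """Weakly threshold the heatmap into labels, sweep the saliency map."""
    labels = heat.ravel() >= fix_thresh            # keep only fixated areas
    return roc_auc_score(labels, sal.ravel())      # TPR-vs-FPR threshold sweep
```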

4 Results

4.1 Comparison issues and data processing

The comparison of static saliency models on still images is already a difficult task. The data obtained from eye tracking and the saliency maps are of a different nature, and the saliency maps themselves can differ greatly (in spatial resolution, for example). In addition, parameters that control specific processes within the saliency estimation algorithms, such as smoothing, can significantly influence the quantitative results.

For dynamic data (videos) and dynamic saliency maps, the task is even more challenging than for still images. Problems due to object motion and eye tracking compound the common comparison issues. The main issue is that rapid motion is not followed precisely by people's gaze: the gaze position (fixation) sometimes lags behind the real object the subjects are tracking (in the case of high-speed objects), or moves ahead of the object when people try to anticipate its trajectory. This means that the maximum gaze density is not necessarily focused on the tracked object, but may lie somewhere in its surroundings. Saliency maps with high spatial resolution, which are very well focused on the tracked object, will thus be penalized. To avoid the effects of lag or anticipation, and to allow a fair relative comparison of the different dynamic saliency models, we propose a novel saliency-map preprocessing methodology.

Fig. 3. Preprocessing of the data to provide a fair comparison between the models.

The initial saliency maps of the four dynamic models can be seen in the left column of Figure 3. When the saliency maps are thresholded, as is done for the NSS and CC metrics, one can observe (second column of Figure 3) that the sizes of the detected objects are very different. We therefore applied mathematical-morphology operations and smoothing to the four saliency maps so that they all exhibit the same object size. The result is visible in the third column of Figure 3, while the fourth column shows the eye-tracking data after thresholding. At this point, the moving-object sizes are identical for all four saliency models. The morphological operations and smoothing parameters were chosen to maximize each model's results in terms of NSS, CC and AUC-ROC. These parameters ensure both the optimal evaluation of each model and fairness in their relative comparison. In addition, since substantial shifts between the tracked target and the real gaze position are observed, using the precise fixation position is much less meaningful than for still images. We therefore dilated the fixation points to larger disks before producing the heatmaps.

Due to the heterogeneous features used by the different saliency algorithms, an additional problem for the comparison of saliency maps and video eye scanpaths became apparent. Algorithms which do not use static cues will perform worse than the others on frames with no visible motion. The mean score for an entire video will therefore be influenced by the number of quasi-static frames within it.
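A minimal sketch of this preprocessing is given below. In the study the morphology and smoothing parameters are tuned per model to maximize its scores; the fixed kernel sizes here are illustrative assumptions only.

```python
# Preprocessing sketch: equalize detected-blob sizes across models, and grow
# fixation points into disks to absorb gaze lag/anticipation.
import cv2

def equalize_blob_size(sal, open_px=3, close_px=15, sigma=9.0):
    """sal: float32 saliency map. Opening drops speckle, closing merges
    fragments, Gaussian smoothing equalizes blob extent across models."""
    k_open = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (open_px, open_px))
    k_close = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (close_px, close_px))
    s = cv2.morphologyEx(sal, cv2.MORPH_OPEN, k_open)
    s = cv2.morphologyEx(s, cv2.MORPH_CLOSE, k_close)
    return cv2.GaussianBlur(s, (0, 0), sigma)

def dilate_fixations(fix_map, radius=20):
    """Dilate each fixation point into a disk before heatmap computation."""
    k = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                  (2 * radius + 1, 2 * radius + 1))
    return cv2.dilate(fix_map, k)
```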

4.2 Comparison results

Here, we compare the four models presented in Section 2 on the eye-tracking data collected for the ASCMN database, using the comparison metrics discussed in Section 3 and the preprocessing described in the previous section. In addition to the four models, a comparison with a simple frame-to-frame differencing algorithm was performed. This algorithm uses the absolute difference between two successive frames to detect moving objects. We will call this approach BASELINE in the following sections. The baseline is used to compare the saliency models against a very simple motion detection algorithm, in order to evaluate the advantage of the saliency models over motion detection alone.
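The BASELINE is fully specified by the description above; a direct sketch follows (the min-max normalization is our choice, added so the output is comparable to the normalized saliency maps):

```python
# BASELINE sketch: absolute difference of two successive frames as saliency.
import cv2

def baseline_saliency(prev_gray, gray):
    diff = cv2.absdiff(gray, prev_gray)                 # frame-to-frame change
    return cv2.normalize(diff.astype('float32'), None,  # rescale to [0, 1]
                         0.0, 1.0, cv2.NORM_MINMAX)
```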

The whole set of results is summarized in Figure 5 and in Tables 2 and 3. The first row of Figure 5 shows the results obtained for the CC metric, the second row for NSS and the third row for AUC-ROC, for each video class taken separately and for all classes together.

For ABNORMAL videos, the Mancas model provides the best results, followed by the Culibrk model. While the baseline scores well on CC and NSS, it drops below all the other models on the AUC-ROC metric. For SURVEILLANCE videos, the Culibrk model works best according to the CC and NSS metrics, while the Mancas model is best according to AUC-ROC. All the models do better than the baseline. For the CROWD videos, the Culibrk model is best according to the AUC-ROC metric, while the Mancas model is best according to the CC and NSS metrics. The baseline is very efficient here, as the main cue is motion itself. For the MOVING videos, the model of Seo is best according to the AUC-ROC metric, while the Mancas model is best according to the CC and NSS metrics. For NOISE videos, the Mancas model performs best on AUC-ROC, while it is outperformed by the Culibrk model on CC and by the Culibrk and SUN models on NSS.

Finally, the results for the whole dataset are aggregated, providing the mean and standard error (SE) [33]. The Mancas model is best, closely followed by the Culibrk model. It is interesting to note that the Seo model does not work very well here, as it is outperformed by the baseline on two of the three metrics.

5 Discussion and conclusion

We proposed a framework for evaluating dynamic-saliency models, based on a novel video database, which is publicly available [17]. The database contains 5 classes of videos covering a wide spectrum of realistic situations, and is designed specifically for evaluating dynamic-saliency models. To the best of our knowledge, the proposed database has the largest coverage of such real-life videos. Four dynamic saliency models, integrating both static and dynamic features, or only dynamic features in the case of the Mancas model, were tested on this database using three different metrics. They were also compared with a basic motion detection model, which was considered the baseline. The comparison was performed both separately, for each of the 5 video classes, and globally for the whole database. From these comparisons, several conclusions can be drawn.

Fig. 4. The left images show high fixation dispersion; the other two show low dispersion.

First, when compared to eye-tracking data, the saliency models are better than the baseline, which only detects motion. There is an exception to this general rule for the CROWD video class, where the baseline provides good results. In crowds, it seems that motion is of huge importance. As the people are small, it is difficult for humans to spot the difference between kinds of motion, and areas within the crowd with high motion will naturally attract attention. There are probably too many distractors for the eye to detect small groups of people behaving differently.

The NOISE videos proved difficult for all the models. This is due to the fact that, in a significant part of each video, nothing special happens and only motion noise is present, as shown in the two left frames of Figure 4. In this case it is impossible to use a saliency prediction model: the dispersion of human fixations is too high, with no common tendency.

As expected, the saliency models perform best on the ABNORMAL videos, and two of the four models achieve high selectivity in detecting the motion of interest, namely the Seo and Mancas models. Seo is the most selective, which is very good when the selected blob coincides with the one attracting people's gaze, but highly penalizing otherwise. As we use the mean score over the video, Seo's score drops dramatically each time the gaze target is missed, which yields a mediocre overall score for the whole video. The Mancas algorithm is also selective, but less so than Seo's, so it is not penalized as severely when it misses the target, since the other moving blobs are not completely ignored. The interesting conclusion is that a saliency model for video should of course be selective, but not too selective, in order to obtain good mean results.

Finally, we identified many issues in achieving a fair comparison, as the eye gaze is not necessarily always superimposed on the target. Before the preprocessing step described in Section 4.1, the Culibrk model was highly penalized by its overly high spatial resolution. After the preprocessing step, the nature of the saliency maps began to converge, which made the comparison between the maps easier. We used several metrics, and well-known implementations of those metrics, to obtain complementary scores and avoid relying on any single metric.

Four models of bottom-up dynamic saliency detection were compared on a database containing a wide variety of video classes. None of them stands out in all situations; they actually seem complementary depending on the video class, even if the Mancas and Culibrk models have the higher overall scores. Seo is also interesting as it is highly selective, while SUN's performance is below the others, probably due to its disregard for motion cues.

Table 2. Results for the 5 classes of videos and the 3 comparison metrics

1) ABNORMAL (abnormal motion)
   AUC-ROC mean: Mancas 74%, Culibrk 73%, SEO 71%, SUN 68%, Baseline 65%
   CC mean:      Mancas 0.28, Culibrk 0.25, SUN 0.24, SEO 0.23, Baseline 0.23
   NSS mean:     Mancas 1.63, Culibrk 1.43, SUN 1.40, SEO 1.39, Baseline 1.35

2) SURVEILLANCE (video surveillance style)
   AUC-ROC mean: Mancas 72%, SEO 70%, SUN 68%, Culibrk 65%, Baseline 63%
   CC mean:      Culibrk 0.27, SUN 0.25, SEO 0.20, Mancas 0.19, Baseline 0.16
   NSS mean:     Culibrk 1.68, SUN 1.51, SEO 1.23, Mancas 1.18, Baseline 1.02

3) CROWD (crowd motion)
   AUC-ROC mean: Culibrk 68%, Baseline 65%, Mancas 65%, SEO 61%, SUN 60%
   CC mean:      Mancas 0.18, Baseline 0.15, Culibrk 0.13, SUN 0.10, SEO 0.07
   NSS mean:     Mancas 0.95, Baseline 0.82, Culibrk 0.73, SUN 0.57, SEO 0.38

4) MOVING (moving camera)
   AUC-ROC mean: SEO 63%, Culibrk 63%, Mancas 60%, SUN 59%, Baseline 56%
   CC mean:      Mancas 0.17, SEO 0.14, SUN 0.13, Culibrk 0.10, Baseline 0.08
   NSS mean:     Mancas 0.99, SEO 0.81, SUN 0.74, Culibrk 0.58, Baseline

5) NOISE (motion noise with sudden salient motion)
   AUC-ROC mean: Mancas 67%, Culibrk 65%, SUN 62%, Baseline 61%, SEO 60%
   CC mean:      Culibrk 0.19, Mancas 0.17, SUN 0.16, Baseline 0.14, SEO 0.11
   NSS mean:     Culibrk 1.07, SUN 0.97, Mancas 0.95, Baseline 0.77, SEO 0.67

Table 3. Results for the whole dataset of videos and the 3 comparison metrics

Models     AUC-ROC mean (+/- SE)   CC mean (+/- SE)   NSS mean (+/- SE)
Mancas     (+/- )                  (+/- )             (+/- )
Culibrk    (+/- )                  (+/- )             (+/- )
SUN        (+/- )                  (+/- )             (+/- )
SEO        (+/- )                  (+/- )             (+/- )
Baseline   (+/- )                  (+/- )             (+/- )

Fig. 5. Correlation, NSS and ROC for the baseline and the four dynamic saliency methods, for each of the five video categories. The bar graphs at the bottom right show the mean and standard error (SE) for the whole dataset.

References

1. Itti, L., Koch, C.: Computational modelling of visual attention. Nature Reviews Neuroscience 2 (2001)
2. Olveczky, B.P., Baccus, S.A., Meister, M.: Segregation of object and background motion in the retina. Nature 423 (2003)
3. James, W.: The Principles of Psychology, Vol. 1. Dover Publications (1950)
4. Styles, E.A.: Attention, Perception, and Memory: An Integrated Introduction. Taylor & Francis Routledge, New York, NY (2005)
5. Wolfe, J.M.: Visual attention. In: Seeing, Academic Press (2000)
6. Marques, O., Mayron, L.M., Borba, G.B., Gamba, H.R.: An attention-driven model for grouping similar images with image retrieval applications. EURASIP Journal on Advances in Signal Processing 2007 (2007)
7. Siagian, C., Itti, L.: Rapid biologically-inspired scene classification using features shared with visual attention. IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (2007)
8. Siagian, C., Itti, L.: Biologically inspired mobile robot vision localization. IEEE Transactions on Robotics 25 (2009)
9. Ma, Y.F., Hua, X.S., Lu, L., Zhang, H.J.: A generic framework of user attention model and its application in video summarization. IEEE Transactions on Multimedia 7 (2005)

10. Culibrk, D., Mirkovic, M., Zlokolica, V., Pokric, M., Crnojevic, V., Kukolj, D.: Salient motion features for video quality assessment. IEEE Transactions on Image Processing (2010)
11. Itti, L.: Automatic foveation for video compression using a neurobiological model of visual attention. IEEE Transactions on Image Processing 13 (2004)
12. Bruce, N., Tsotsos, J.: Saliency based on information maximization. Advances in Neural Information Processing Systems 18 (2006)
13. Seo, H., Milanfar, P.: Static and space-time visual saliency detection by self-resemblance. Journal of Vision 9 (2009)
14. Zhang, L., Tong, M., Marks, T., Shan, H., Cottrell, G.: SUN: A Bayesian framework for saliency using natural statistics. Journal of Vision 8 (2008)
15. Mancas, M., Riche, N., Leroy, J., Gosselin, B.: Abnormal motion selection in crowds using bottom-up saliency. In: IEEE ICIP (2011)
16. Mahadevan, V., Vasconcelos, N.: Spatiotemporal saliency in dynamic scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (2010)
17. Mancas, M., Riche, N.: Attention website. ( )
18. Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1998)
19. Li, L., Huang, W., Gu, I., Tian, Q.: Statistical modeling of complex backgrounds for foreground object detection. IEEE Transactions on Image Processing 13 (2004)
20. Rosin, P.L.: Thresholding for change detection. In: Proc. of the Sixth International Conference on Computer Vision (ICCV 98) (1998)
21. Hodge, V.J., Austin, J.: A survey of outlier detection methodologies. Artificial Intelligence Review 22 (2004)
22. Farnebäck, G.: Two-frame motion estimation based on polynomial expansion. In: SCIA (2003)
23. Itti, L.: CRCNS-ORIG video and eye-tracking database. ( )
24. Henderson, J.M.: DIEM video and eye-tracking database. ( )
25. Hadizadeh, H., Enriquez, M.J., Bajic, I.V.: Eye-tracking database for a set of standard video sequences. Accepted for publication in IEEE Transactions on Image Processing (2011)
26. Dorr, M., Martinetz, T., Gegenfurtner, K.R., Barth, E.: Variability of eye movements when viewing dynamic natural scenes. Journal of Vision 10 (2010)
27. Mahadevan, V., Vasconcelos, N.: Spatiotemporal saliency in dynamic scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (2010)
28. Seeing Machines: FaceLab commercial eye-tracking system. ( )
29. Peters, R.J., Iyer, A., Itti, L., Koch, C.: Components of bottom-up gaze allocation in natural images. Vision Research 45 (2005)
30. Green, D.M., Swets, J.A.: Signal Detection Theory and Psychophysics. Wiley, New York (1966)
31. Borji, A.: Evaluation measures for saliency maps: CC and NSS. ( )
32. Lau, B.: Evaluation measures for saliency maps: AUC-ROC. ( under roc curve)
33. Gurland, J., Tripathi, R.C.: A simple approximation for unbiased estimation of the standard deviation. The American Statistician 25 (1971) 30-32


More information

Video Indexing. Option TM Lecture 1 Jenny Benois-Pineau LABRI UMR 5800/Université Bordeaux

Video Indexing. Option TM Lecture 1 Jenny Benois-Pineau LABRI UMR 5800/Université Bordeaux Video Indexing. Option TM Lecture 1 Jenny Benois-Pineau LABRI UMR 5800/Université Bordeaux Summary Human vision. Psycho-visual human system. Visual attention. «Subjective» visual saliency. Prediction of

More information

A Background Modeling Approach Based on Visual Background Extractor Taotao Liu1, a, Lin Qi2, b and Guichi Liu2, c

A Background Modeling Approach Based on Visual Background Extractor Taotao Liu1, a, Lin Qi2, b and Guichi Liu2, c 4th International Conference on Mechatronics, Materials, Chemistry and Computer Engineering (ICMMCCE 2015) A Background Modeling Approach Based on Visual Background Extractor Taotao Liu1, a, Lin Qi2, b

More information

Video Compression Using Block By Block Basis Salience Detection

Video Compression Using Block By Block Basis Salience Detection Video Compression Using Block By Block Basis Salience Detection 1 P.Muthuselvi M.Sc.,M.E.,, 2 T.Vengatesh MCA.,M.Phil.,(Ph.D)., 1 Assistant Professor,VPMM Arts & Science college for women, Krishnankoil

More information

Class 9 Action Recognition

Class 9 Action Recognition Class 9 Action Recognition Liangliang Cao, April 4, 2013 EECS 6890 Topics in Information Processing Spring 2013, Columbia University http://rogerioferis.com/visualrecognitionandsearch Visual Recognition

More information

Detecting Salient Contours Using Orientation Energy Distribution. Part I: Thresholding Based on. Response Distribution

Detecting Salient Contours Using Orientation Energy Distribution. Part I: Thresholding Based on. Response Distribution Detecting Salient Contours Using Orientation Energy Distribution The Problem: How Does the Visual System Detect Salient Contours? CPSC 636 Slide12, Spring 212 Yoonsuck Choe Co-work with S. Sarma and H.-C.

More information

Target Tracking Using Mean-Shift And Affine Structure

Target Tracking Using Mean-Shift And Affine Structure Target Tracking Using Mean-Shift And Affine Structure Chuan Zhao, Andrew Knight and Ian Reid Department of Engineering Science, University of Oxford, Oxford, UK {zhao, ian}@robots.ox.ac.uk Abstract Inthispaper,wepresentanewapproachfortracking

More information

Perception. Autonomous Mobile Robots. Sensors Vision Uncertainties, Line extraction from laser scans. Autonomous Systems Lab. Zürich.

Perception. Autonomous Mobile Robots. Sensors Vision Uncertainties, Line extraction from laser scans. Autonomous Systems Lab. Zürich. Autonomous Mobile Robots Localization "Position" Global Map Cognition Environment Model Local Map Path Perception Real World Environment Motion Control Perception Sensors Vision Uncertainties, Line extraction

More information

Tri-modal Human Body Segmentation

Tri-modal Human Body Segmentation Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4

More information

Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification

Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 16, Issue 6, Ver. VI (Nov Dec. 2014), PP 29-33 Analysis of Image and Video Using Color, Texture and Shape Features

More information

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin

More information

Nonparametric Bottom-Up Saliency Detection by Self-Resemblance

Nonparametric Bottom-Up Saliency Detection by Self-Resemblance Nonparametric Bottom-Up Saliency Detection by Self-Resemblance Hae Jong Seo and Peyman Milanfar Electrical Engineering Department University of California, Santa Cruz 56 High Street, Santa Cruz, CA, 95064

More information

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009 Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer

More information

Salient Region Detection using Weighted Feature Maps based on the Human Visual Attention Model

Salient Region Detection using Weighted Feature Maps based on the Human Visual Attention Model Salient Region Detection using Weighted Feature Maps based on the Human Visual Attention Model Yiqun Hu 2, Xing Xie 1, Wei-Ying Ma 1, Liang-Tien Chia 2 and Deepu Rajan 2 1 Microsoft Research Asia 5/F Sigma

More information

BLIND QUALITY ASSESSMENT OF VIDEOS USING A MODEL OF NATURAL SCENE STATISTICS AND MOTION COHERENCY

BLIND QUALITY ASSESSMENT OF VIDEOS USING A MODEL OF NATURAL SCENE STATISTICS AND MOTION COHERENCY BLIND QUALITY ASSESSMENT OF VIDEOS USING A MODEL OF NATURAL SCENE STATISTICS AND MOTION COHERENCY Michele A. Saad The University of Texas at Austin Department of Electrical and Computer Engineering Alan

More information

AUTONOMOUS IMAGE EXTRACTION AND SEGMENTATION OF IMAGE USING UAV S

AUTONOMOUS IMAGE EXTRACTION AND SEGMENTATION OF IMAGE USING UAV S AUTONOMOUS IMAGE EXTRACTION AND SEGMENTATION OF IMAGE USING UAV S Radha Krishna Rambola, Associate Professor, NMIMS University, India Akash Agrawal, Student at NMIMS University, India ABSTRACT Due to the

More information

Morphological Change Detection Algorithms for Surveillance Applications

Morphological Change Detection Algorithms for Surveillance Applications Morphological Change Detection Algorithms for Surveillance Applications Elena Stringa Joint Research Centre Institute for Systems, Informatics and Safety TP 270, Ispra (VA), Italy elena.stringa@jrc.it

More information

EE 5359 MULTIMEDIA PROCESSING. Implementation of Moving object detection in. H.264 Compressed Domain

EE 5359 MULTIMEDIA PROCESSING. Implementation of Moving object detection in. H.264 Compressed Domain EE 5359 MULTIMEDIA PROCESSING Implementation of Moving object detection in H.264 Compressed Domain Under the guidance of Dr. K. R. Rao Submitted by: Vigneshwaran Sivaravindiran UTA ID: 1000723956 1 P a

More information

Visual Tracking. Image Processing Laboratory Dipartimento di Matematica e Informatica Università degli studi di Catania.

Visual Tracking. Image Processing Laboratory Dipartimento di Matematica e Informatica Università degli studi di Catania. Image Processing Laboratory Dipartimento di Matematica e Informatica Università degli studi di Catania 1 What is visual tracking? estimation of the target location over time 2 applications Six main areas:

More information

Motion and Optical Flow. Slides from Ce Liu, Steve Seitz, Larry Zitnick, Ali Farhadi

Motion and Optical Flow. Slides from Ce Liu, Steve Seitz, Larry Zitnick, Ali Farhadi Motion and Optical Flow Slides from Ce Liu, Steve Seitz, Larry Zitnick, Ali Farhadi We live in a moving world Perceiving, understanding and predicting motion is an important part of our daily lives Motion

More information

A semiautomatic saliency model and its application to video compression

A semiautomatic saliency model and its application to video compression A semiautomatic saliency model and its application to video compression Vitaliy Lyudvichenko, Mikhail Erofeev, Yury Gitman, Dmitriy Vatolin Abstract This work aims to apply visual-attention modeling to

More information

A Hybrid Face Detection System using combination of Appearance-based and Feature-based methods

A Hybrid Face Detection System using combination of Appearance-based and Feature-based methods IJCSNS International Journal of Computer Science and Network Security, VOL.9 No.5, May 2009 181 A Hybrid Face Detection System using combination of Appearance-based and Feature-based methods Zahra Sadri

More information

Automatic Shadow Removal by Illuminance in HSV Color Space

Automatic Shadow Removal by Illuminance in HSV Color Space Computer Science and Information Technology 3(3): 70-75, 2015 DOI: 10.13189/csit.2015.030303 http://www.hrpub.org Automatic Shadow Removal by Illuminance in HSV Color Space Wenbo Huang 1, KyoungYeon Kim

More information