On the analysis of background subtraction techniques using Gaussian mixture models

Similar documents
A Fast Moving Object Detection Technique In Video Surveillance System

Abnormal behavior detection using a multi-modal stochastic learning approach

Adaptive Background Mixture Models for Real-Time Tracking

Detecting People in Images: An Edge Density Approach

Background Subtraction Techniques

DYNAMIC BACKGROUND SUBTRACTION BASED ON SPATIAL EXTENDED CENTER-SYMMETRIC LOCAL BINARY PATTERN. Gengjian Xue, Jun Sun, Li Song

Multi-Channel Adaptive Mixture Background Model for Real-time Tracking

A Hierarchical Approach to Robust Background Subtraction using Color and Gradient Information

ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL

QUT Digital Repository: This is the author version published as:

Vehicle tracking using projective particle filter

Detecting and Identifying Moving Objects in Real-Time

Motion Detection Using Adaptive Temporal Averaging Method

ADAPTIVE LOW RANK AND SPARSE DECOMPOSITION OF VIDEO USING COMPRESSIVE SENSING

Image retrieval based on bag of images

Automatic Shadow Removal by Illuminance in HSV Color Space

An ICA based Approach for Complex Color Scene Text Binarization

Background Initialization with A New Robust Statistical Approach

Evaluation of Moving Object Tracking Techniques for Video Surveillance Applications

Object detection using non-redundant local Binary Patterns

A MIXTURE OF DISTRIBUTIONS BACKGROUND MODEL FOR TRAFFIC VIDEO SURVEILLANCE

A Feature Point Matching Based Approach for Video Objects Segmentation

A Background Subtraction Based Video Object Detecting and Tracking Method

Moving Shadow Detection with Low- and Mid-Level Reasoning

Image Segmentation Using Iterated Graph Cuts Based on Multi-scale Smoothing

EE795: Computer Vision and Intelligent Systems

Dynamic Background Subtraction Based on Local Dependency Histogram

Clustering Based Non-parametric Model for Shadow Detection in Video Sequences

Robust color segmentation algorithms in illumination variation conditions

A Texture-Based Method for Modeling the Background and Detecting Moving Objects

Background Subtraction based on Cooccurrence of Image Variations

Human Motion Detection and Tracking for Video Surveillance

A NOVEL MOTION DETECTION METHOD USING BACKGROUND SUBTRACTION MODIFYING TEMPORAL AVERAGING METHOD

Spatio-Temporal Nonparametric Background Modeling and Subtraction

A Texture-based Method for Detecting Moving Objects

Fast digital optical flow estimation based on EMD

HYBRID CENTER-SYMMETRIC LOCAL PATTERN FOR DYNAMIC BACKGROUND SUBTRACTION. Gengjian Xue, Li Song, Jun Sun, Meng Wu

Fast Denoising for Moving Object Detection by An Extended Structural Fitness Algorithm

New structural similarity measure for image comparison

Background subtraction in people detection framework for RGB-D cameras

A PRACTICAL APPROACH TO REAL-TIME DYNAMIC BACKGROUND GENERATION BASED ON A TEMPORAL MEDIAN FILTER

A Novelty Detection Approach for Foreground Region Detection in Videos with Quasi-stationary Backgrounds

Affine-invariant scene categorization

A physically motivated pixel-based model for background subtraction in 3D images

Motion Detection Algorithm

Median mixture model for background foreground segmentation in video sequences

Robust Real-Time Background Subtraction based on Local Neighborhood Patterns

Detecting motion by means of 2D and 3D information

Change Detection by Frequency Decomposition: Wave-Back

Image Quality Assessment Techniques: An Overview

ADAPTIVE DEBLURRING OF SURVEILLANCE VIDEO SEQUENCES THAT DETERIORATE OVER TIME. Konstantinos Vougioukas, Bastiaan J. Boom and Robert B.

Learning the Three Factors of a Non-overlapping Multi-camera Network Topology

AUTOMATIC OBJECT DETECTION IN VIDEO SEQUENCES WITH CAMERA IN MOTION. Ninad Thakoor, Jean Gao and Huamei Chen

DEALING WITH GRADUAL LIGHTING CHANGES IN VIDEO SURVEILLANCE FOR INDOOR ENVIRONMENTS

Face Hallucination Based on Eigentransformation Learning

Non-linear Parametric Bayesian Regression for Robust Background Subtraction

SEMI-BLIND IMAGE RESTORATION USING A LOCAL NEURAL APPROACH

Pairwise Threshold for Gaussian Mixture Classification and its Application on Human Tracking Enhancement

SHADOW DETECTION USING TRICOLOR ATTENUATION MODEL ENHANCED WITH ADAPTIVE HISTOGRAM EQUALIZATION

A Novel Extreme Point Selection Algorithm in SIFT

Video shot segmentation using late fusion technique

Segmentation and Tracking of Partial Planar Templates

Video Surveillance for Effective Object Detection with Alarm Triggering

Adaptive Skin Color Classifier for Face Outline Models

Video Surveillance System for Object Detection and Tracking Methods R.Aarthi, K.Kiruthikadevi

2D to pseudo-3d conversion of "head and shoulder" images using feature based parametric disparity maps

Searching Video Collections:Part I

A Background Modeling Approach Based on Visual Background Extractor. Taotao Liu, Lin Qi and Guichi Liu

Chapter 7: Dual Modeling in the Presence of Constant Variance

A Bayesian Approach to Background Modeling

Learning a Sparse, Corner-based Representation for Time-varying Background Modelling

Background Subtraction for Urban Traffic Monitoring using Webcams

Background Segmentation with Feedback: The Pixel-Based Adaptive Segmenter

On the performance of turbo codes with convolutional interleavers

Image Segmentation Using Iterated Graph Cuts Based on Multi-scale Smoothing

Hybrid Cone-Cylinder Codebook Model for Foreground Detection with Shadow and Highlight Suppression

Accelerating Pattern Matching or How Much Can You Slide?

Spatial Mixture of Gaussians for dynamic background modelling

Implementation of the Gaussian Mixture Model Algorithm for Real-Time Segmentation of High Definition Video: A Review

LOCAL-GLOBAL OPTICAL FLOW FOR IMAGE REGISTRATION

Bagging for One-Class Learning

Illumination-Robust Face Recognition based on Gabor Feature Face Intrinsic Identity PCA Model

Real-time Detection of Illegally Parked Vehicles Using 1-D Transformation

MULTIVIEW REPRESENTATION OF 3D OBJECTS OF A SCENE USING VIDEO SEQUENCES

Background Subtraction in Varying Illuminations Using an Ensemble Based on an Enlarged Feature Set

Human Motion tracking using Gaussian Mixture Method and Beta-Likelihood Matching

Connected Component Analysis and Change Detection for Images

Background Image Generation Using Boolean Operations

Research Article Model Based Design of Video Tracking Based on MATLAB/Simulink and DSP

A Fuzzy Approach for Background Subtraction

Introduction to Digital Image Processing

Spatio-Temporal Stereo Disparity Integration

A Real Time Human Detection System Based on Far Infrared Vision

An Approach for Real Time Moving Object Extraction based on Edge Region Determination

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers

Binarization of Color Character Strings in Scene Images Using K-means Clustering and Support Vector Machines

Advanced Motion Detection Technique using Running Average Discrete Cosine Transform for Video Surveillance Application

Transcription:

University of Wollongong Research Online
Faculty of Informatics - Papers (Archive), Faculty of Engineering and Information Sciences, 2010

On the analysis of background subtraction techniques using Gaussian mixture models

Abdesselam Bouzerdoum, University of Wollongong, bouzer@uow.edu.au
Azeddine Beghdadi
Son Lam Phung, University of Wollongong, phung@uow.edu.au
Philippe L. Bouttefroy, philippe@uow.edu.au

Publication Details
Bouttefroy, P. L., Bouzerdoum, A., Beghdadi, A. & Phung, S. (2010). On the analysis of background subtraction techniques using Gaussian mixture models. 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 4042-4045). USA: IEEE.

Research Online is the open access institutional repository for the University of Wollongong. For further information contact the UOW Library: research-pubs@uow.edu.au

On the analysis of background subtraction techniques using Gaussian mixture models

Abstract
In this paper, we conduct an investigation into background subtraction techniques using Gaussian Mixture Models (GMM) in the presence of large illumination changes and background variations. We show that the techniques used to date suffer from the trade-off imposed by the use of a common learning rate to update both the mean and variance of the component densities, which leads to a degeneracy of the variance and creates saturated pixels. To address this problem, we propose a simple yet effective technique that differentiates between the two learning rates, and imposes a constraint on the variance so as to avoid the degeneracy problem. Experimental results are presented which show that, compared to existing techniques, the proposed algorithm provides more robust segmentation in the presence of illumination variations and abrupt changes in background distribution.

Keywords
techniques, analysis, models, mixture, gaussian, background, subtraction

Disciplines
Physical Sciences and Mathematics

Publication Details
Bouttefroy, P. L., Bouzerdoum, A., Beghdadi, A. & Phung, S. (2010). On the analysis of background subtraction techniques using Gaussian mixture models. 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 4042-4045). USA: IEEE.

This conference paper is available at Research Online: http://ro.uow.edu.au/infopapers/809

ON THE ANALYSIS OF BACKGROUND SUBTRACTION TECHNIQUES USING GAUSSIAN MIXTURE MODELS

P. L. M. Bouttefroy, A. Bouzerdoum, S. L. Phung
School of Electrical, Computer & Telecom. Engineering, University of Wollongong
Northfields Avenue, Wollongong, NSW 2522, Australia

A. Beghdadi
L2TI, Institut Galilée, Université Paris 13
Avenue J. B. Clément, 93430 Villetaneuse, France

ABSTRACT

In this paper, we conduct an investigation into background subtraction techniques using Gaussian Mixture Models (GMM) in the presence of large illumination changes and background variations. We show that the techniques used to date suffer from the trade-off imposed by the use of a common learning rate to update both the mean and variance of the component densities, which leads to a degeneracy of the variance and creates saturated pixels. To address this problem, we propose a simple yet effective technique that differentiates between the two learning rates, and imposes a constraint on the variance so as to avoid the degeneracy problem. Experimental results are presented which show that, compared to existing techniques, the proposed algorithm provides more robust segmentation in the presence of illumination variations and abrupt changes in background distribution.

Index Terms: Motion analysis, video signal processing, image segmentation, object detection.

1. INTRODUCTION

Background modeling of indoor and outdoor scenes is a challenging task due to the uncontrolled nature of the environment (e.g. changes in illumination). A number of techniques have been developed to extract relevant motion from videos since the frame differencing method proposed by Jain and Nagel in 1979 [1]. For instance, Elgammal et al. proposed to model the background through kernel density estimation [2], and Oliver et al. developed an algorithm that generates the eigenbackground of a scene, capturing the stable components or, equivalently, the background [3]. Nevertheless, the computational cost prohibits the use of these techniques for real-time applications. Parametric representations, which limit the processing load to the calculation of a few parameters, are therefore preferred. In particular, Stauffer and Grimson have developed an algorithm for foreground segmentation using the Gaussian mixture model [4]. However, despite the plethora of techniques proposed to model the background, there is a lack of rigorous analysis of the background model itself and, in particular, of the influence of the update rates. Lee proposed a variation of Stauffer and Grimson's algorithm which adapts quickly to fast variations in the background distribution imputable, for example, to changing illumination [5]; unfortunately, this approach leads to pixel saturation due to degeneracy of the variance, as we will show later. Other techniques proposed to handle illumination changes rely on chrominance correspondence between the incoming pixel and the reference background model (for more details see [6]). These techniques are therefore inefficient when the reference model has changed.

In this paper, we conduct an investigation into the effects of the Gaussian mixture parameters, in particular the influence of the variance, on the performance of Stauffer and Grimson's and Lee's algorithms. Our study shows that the variance can degenerate, creating so-called saturated pixels, which appear due to a change in illumination. Furthermore, we propose a method that can cope with rapid changes in the background model.
Contrary to existing techniques, the proposed method handles abrupt variations during the mixture update and not in a pre- or post-processing stage. The rest of the paper is organized as follows. Section 2 presents the general formulation of the mixture of Gaussians for background modeling. Section 3 describes the problem of variance degeneracy and defines saturated pixels. Section 4 introduces the adaptive mixture of Gaussians algorithm. Section 5 presents the experimental results, followed by concluding remarks in Section 6.

2. BACKGROUND MODELING USING GAUSSIAN MIXTURES

Background modeling by Gaussian mixtures is a pixel-based process. Let x be a random process representing the value of a given pixel in time. A convenient framework to model the probability density function of x is the parametric Gaussian mixture model, where the density is composed of a sum of Gaussians. Let p(x) denote the probability density function of a Gaussian mixture comprising K component densities:

p(x) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x; \mu_k, \Sigma_k),   (1)

where w_k are the weights, and \mathcal{N}(x; \mu_k, \Sigma_k) is the normal density of mean \mu_k and covariance matrix \Sigma_k = \sigma_k I (I denotes the identity matrix). The mixture of Gaussians algorithm, proposed by Stauffer and Grimson [4], estimates these parameters over time to obtain a robust representation of the background. First, the parameters are initialized with w_k = w_0, \mu_k = \mu_0 and \sigma_k = \sigma_0. If there is a match, i.e.

\| x - \mu_j \| / \sigma_j < \tau  for some j \in [1..K],   (2)

where \tau (> 0) is some threshold value, then the parameters of the mixture are updated as follows:

w_k(t) = (1 - \alpha) \, w_k(t-1) + \alpha M_k(t),   (3)
\mu_k(t) = (1 - \beta) \, \mu_k(t-1) + \beta x,   (4)
\sigma_k^2(t) = (1 - \beta) \, \sigma_k^2(t-1) + \beta (x - \mu_k(t))^2,   (5)

where M_k(t) is equal to 1 for the matching component j and 0 otherwise. If there is no match, the component with the lowest weight w_k is re-initialized with w_k = w_0, \mu_k = x and \sigma_k = \sigma_0. The learning rate \alpha is constant and \beta is defined as:

\beta = \alpha \, \mathcal{N}(x; \mu_k, \sigma_k).   (6)

Finally, the weights w_k are normalized at each iteration to add up to 1. Stauffer and Grimson proposed to sort the Gaussians by decreasing weight-to-standard-deviation ratio, w_k / \sigma_k, to represent the background. A threshold \lambda is applied to the cumulative sum of weights to find the set \{1 \dots B\} of Gaussians modeling the background, defined as:

B = \arg\min_{K_B} \left( \sum_{k=1}^{K_B} w_k > \lambda \right).   (7)

Intuitively, the Gaussians with the highest probability of occurrence, w_k, and the lowest variability in the distribution, measured by \sigma_k, indicating a representative mode, are the most likely to model the background.

3. VARIANCE DEGENERACY AND PIXEL SATURATION

The Gaussian mixture algorithm used for background subtraction by Stauffer and Grimson updates the mean \mu_k and the variance \sigma_k with the same learning rate \beta. Lee proposed a modified approach where the learning rate \beta is increased in the initial learning phase of the algorithm, hence providing a quicker adaptation when a new surface appears [5]. In this section, we focus on the approaches proposed by Stauffer and Grimson [4] and by Lee [5] to analyze the learning rates. (Techniques which implement an expectation-maximization algorithm are not considered due to their computational cost.) It is convenient herein to introduce the term saturated pixel, which denotes a pixel that responds neither to a foreground nor to a background object. In other words, a saturated pixel is classified either as foreground or as background, regardless of the surface present. The phenomenon is illustrated in Fig. 1. In this example, the saturated pixels are classified as background even though the zone inside the square alternates between foreground and background, due to the flow of cars.

Fig. 1. Original (first row) and foreground segmentation (second row) of an object passing through a saturated zone (aggregation of saturated pixels) in the HighwayII sequence. The saturated region is delimited with a rectangle.

The increase of the learning rate \beta and large variations of the pixel value distribution (e.g., recurrent changes in illumination) are responsible for the saturation of pixels. Indeed, the update of the variance (Eq. 5) with the square of the difference between the mean and the pixel value is prone to result in large, overestimated values of the variance. The variance increases until the component saturates the mixture by spanning over the entire pixel color range. As a consequence, pixels are constantly classified as either foreground or background, depending on the weight of the Gaussian component; indeed, Eq. (2) always holds if the variance is large enough.
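
For concreteness, here is a minimal per-pixel sketch (Python/NumPy, grayscale pixels) of the update rules in Eqs. (2)-(6) and of the background-component selection of Eq. (7). The parameter defaults (alpha, tau, w0, sigma0, lambda) and the choice of the first matching component are illustrative assumptions rather than values prescribed above; note that the single rate beta drives both Eq. (4) and Eq. (5), which is precisely the coupling behind the degeneracy just described.

import numpy as np

def gaussian(x, mu, sigma):
    # 1-D normal density N(x; mu, sigma)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def update_pixel_gmm(x, w, mu, sigma, alpha=0.005, tau=2.5, w0=0.05, sigma0=30.0):
    # One Stauffer-Grimson style update for a single grayscale pixel value x.
    # w, mu, sigma are float arrays of length K (one entry per mixture component).
    # The same beta drives both the mean (Eq. 4) and the variance (Eq. 5).
    K = len(w)
    match = np.abs(x - mu) / sigma < tau                 # matching test, Eq. (2)
    if match.any():
        j = np.argmax(match)                             # first matching component (illustrative choice)
        M = np.zeros(K); M[j] = 1.0
        w = (1 - alpha) * w + alpha * M                  # weight update, Eq. (3)
        beta = alpha * gaussian(x, mu[j], sigma[j])      # Eq. (6)
        mu[j] = (1 - beta) * mu[j] + beta * x            # mean update, Eq. (4)
        sigma[j] = np.sqrt((1 - beta) * sigma[j] ** 2
                           + beta * (x - mu[j]) ** 2)    # variance update, Eq. (5)
    else:
        k = np.argmin(w)                                 # re-initialize the weakest component
        w[k], mu[k], sigma[k] = w0, x, sigma0
    w /= w.sum()                                         # renormalize the weights
    return w, mu, sigma

def background_components(w, sigma, lam=0.7):
    # Indices of the components modeling the background, Eq. (7):
    # smallest prefix (sorted by w_k / sigma_k) whose cumulative weight exceeds lambda.
    order = np.argsort(-(w / sigma))
    cum = np.cumsum(w[order])
    B = np.searchsorted(cum, lam) + 1
    return order[:B]

In practice, update_pixel_gmm would be applied to every pixel of each incoming frame, and a pixel would be labelled background when its matching component belongs to background_components.
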
To the best of our knowledge, this phenomenon has never been investigated and was considered as an adaptation of the background to a changing distribution. However, it can be shown from Eq. (3) that this assumption does not hold, because the minimum time t_min for a new distribution to become part of the background model satisfies

t_{min} \geq \frac{1}{\ln(1 - \alpha)} \ln\!\left[ \frac{K + \lambda - 3}{(K - 2)(\alpha - 1)} \right].   (8)

(The derivation of the equation is omitted due to lack of space.) There is equality when a new object presents a surface with a constant color value. Considering \lambda = 0.7 and \alpha = 0.005, as in [5], a component is integrated into the background only after 72 frames for K = 3. Such a large number discards the assumption of fast background adaptation as an explanation of the phenomenon presented in Fig. 1.
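
As a rough numerical check of the time scale implied by Eq. (8), the sketch below iterates the weight recursion of Eq. (3) for a component that is matched at every frame and counts the frames until its weight exceeds 1 - \lambda, at which point the remaining components alone can no longer reach the cumulative threshold of Eq. (7). This inclusion criterion is an assumption made for illustration, not the paper's derivation; with alpha = 0.005 and lambda = 0.7 it yields 72 frames, matching the figure quoted above.

def frames_to_enter_background(alpha=0.005, lam=0.7, w_init=0.0):
    # Iterate w(t) = (1 - alpha) * w(t-1) + alpha  (Eq. (3) with M_k(t) = 1)
    # until the component's weight exceeds 1 - lambda (assumed inclusion criterion,
    # used here only as an order-of-magnitude check of Eq. (8)).
    w, t = w_init, 0
    while w <= 1.0 - lam:
        w = (1 - alpha) * w + alpha
        t += 1
    return t

print(frames_to_enter_background())   # prints 72 for alpha=0.005, lambda=0.7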

Figure 2 displays the percentage of saturated pixels for Stauffer and Grimson's method and for Lee's method. The learning speed-up in Lee's method leads to a significant percentage of the image being saturated.

Fig. 2. Percentage of saturated pixels: (a) Stauffer and Grimson's method (top), (b) Lee's method (bottom).

4. PROPOSED APPROACH

As seen in Section 3, the method proposed by Stauffer and Grimson lacks adaptation to fast changes due to the use of a constant learning rate. On the other hand, the method proposed by Lee suffers from pixel saturation because of the fast adaptation of the variance. Here, we propose an algorithm that handles the degeneracy of the variance via an adequate update of the parameters. To this end, the learning rates for the mean and the variance terms are decoupled. The first is an adaptive learning rate \gamma_k, updating the parameter \mu_k, that includes clues of the relative probability q_k = \mathcal{N}(x; \mu_k, \sigma_k) of a pixel belonging to the kth Gaussian:

\gamma_k(t) = \gamma_k(t-1) + \frac{K+1}{K} q_k - \frac{1}{K} \sum_{i=1}^{K} q_i.   (9)

Experiments showed that an initial value of \gamma_k(0) = 0.05 provides good results for the tested datasets. The above equation shows that the learning rate \gamma_k increases if the value of the kth Gaussian component is higher than the average value of the other components. In other words, \gamma_k is increased if the probability of the pixel belonging to the kth component is greater than the average probability of belonging to another component; otherwise, it is decreased. An adaptive coefficient allows a quicker update of the Gaussian mixture mean, as in [5], and copes with fast changes in illumination, thus addressing the problem encountered with Stauffer and Grimson's method. The learning rate \gamma_k replaces \beta in Eq. (4). The variance is updated independently from the mean so as to avoid the saturation phenomenon; a fast update would lead to the degeneracy problem highlighted in Section 3. Here we propose to use a semi-parametric model for the variance, which enables a quasi-linear adaptation in the case of small deviations from the mean and a flattened response for large deviations. The sigmoid function is used to this aim:

f_{a,b}(x, \mu_k) = a + \frac{b - a}{1 + e^{-s \, \varepsilon(x, \mu_k)}},   (10)

where \varepsilon(x, \mu_k) = (x - \mu_k)^T (x - \mu_k). The parameter s controls the slope of the sigmoid and is equal to 0.005 in the experiments. The update of the variance in Eq. (5) thus becomes

\sigma_k^2(t) = (1 - \eta) \, \sigma_k^2(t-1) + \eta \, f_{a,b}(x, \mu_k(t-1)),   (11)

where \eta = 0.6 is a constant learning rate. The function f_{a,b}(x, \mu) \in \mathbb{R}^+ bounds the variance to the domain D = [\frac{a+b}{2}, b]. The values of the parameters a and b are chosen so that the domain D spans over one Kth of the pixel range value.
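
The following Python/NumPy sketch puts Eqs. (9)-(11) together for a single grayscale pixel. The matching test, the weight update, the clipping of gamma to [0, 1], the re-initialization values and the particular choice a = 42, b = 212 (so that D spans roughly one third of the 0-255 range for K = 3) are illustrative assumptions; only the form of the adaptive rate, the sigmoid bound and the decoupled variance update are taken from the text above.

import numpy as np

def gaussian(x, mu, sigma):
    # 1-D normal density N(x; mu, sigma)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def sigmoid_bound(x, mu, a, b, s=0.005):
    # Variance target f_{a,b}(x, mu) of Eq. (10); its range is D = [(a+b)/2, b)
    eps = (x - mu) ** 2                                  # epsilon(x, mu_k) for a scalar pixel
    return a + (b - a) / (1.0 + np.exp(-s * eps))

def proposed_update(x, w, mu, sigma2, gamma, alpha=0.005, eta=0.6,
                    tau=2.5, a=42.0, b=212.0):
    # One per-pixel update of the proposed scheme (Section 4) for scalar pixels.
    # w, mu, sigma2 (variances) and gamma (adaptive rates) are float arrays of length K.
    K = len(w)
    sigma = np.sqrt(sigma2)
    q = gaussian(x, mu, sigma)                           # q_k = N(x; mu_k, sigma_k)
    gamma = gamma + (K + 1) / K * q - q.sum() / K        # adaptive rate, Eq. (9)
    gamma = np.clip(gamma, 0.0, 1.0)                     # keep the rates in [0, 1] (assumption)
    match = np.abs(x - mu) / sigma < tau                 # matching test, Eq. (2)
    if match.any():
        j = np.argmax(match)                             # first matching component
        M = np.zeros(K); M[j] = 1.0
        w = (1 - alpha) * w + alpha * M                  # weights kept as in Eq. (3) (assumption)
        target = sigmoid_bound(x, mu[j], a, b)           # f_{a,b}(x, mu_k(t-1)), Eq. (10)
        mu[j] = (1 - gamma[j]) * mu[j] + gamma[j] * x    # gamma replaces beta in Eq. (4)
        sigma2[j] = (1 - eta) * sigma2[j] + eta * target # bounded variance update, Eq. (11)
    else:
        k = np.argmin(w)                                 # re-initialize the weakest component
        w[k], mu[k], sigma2[k], gamma[k] = 0.05, x, 900.0, 0.05
    w /= w.sum()                                         # renormalize the weights
    return w, mu, sigma2, gamma

In a full implementation this update would run for every pixel of each frame; the background/foreground decision can presumably still be made by ranking components as in Eq. (7), although the paper does not restate that step for the proposed method.
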
Changes in illumination are fast but smooth due to the gradual transition from penumbra to umbra as described in [7]. We therefore model the variation with a time varying sine term sin (0.02πt), whose value is added to every pixel value in each frame of the HighwayII sequence, resulting in a modified sequence. To test the adaptation performance to illumination changes, the foreground from the original and the modified sequences are extracted by the three algorithms. Then, the pixels of the extracted foreground are counted for each frame. Here, the segmentation of the original sequence is used as reference to compare with, or pseudo-ground truth. Figures 3(a) (c) display the count of foreground pixels through time and Fig. 3(d) shows the average (over 50 frames) of the mean squared error (MSE). The difference in count between the original and the modified sequence represents the error of segmentation attributable to the illumination changes. The number of pixels extracted from the original sequence, the reference, is consistent across the 4044

To test the adaptation performance to illumination changes, the foreground from the original and the modified sequences is extracted by the three algorithms. Then, the pixels of the extracted foreground are counted for each frame. Here, the segmentation of the original sequence is used as a reference to compare with, or pseudo-ground truth. Figures 3(a)-(c) display the count of foreground pixels through time and Fig. 3(d) shows the average (over 50 frames) of the mean squared error (MSE). The difference in count between the original and the modified sequence represents the error of segmentation attributable to the illumination changes. The number of pixels extracted from the original sequence, the reference, is consistent across the three algorithms (dashed line). However, with illumination changes, Stauffer and Grimson's method is too slow in updating the model: some background is detected as foreground and portions of the sine are clearly visible. Lee's method shows only residual foreground detection after 120 frames due to the degeneracy of the variance for most pixels. Only the proposed algorithm is able to accurately segment the foreground in the presence of illumination changes.

Fig. 3. Number of detected foreground pixels for the original (dashed line) and modified (solid line) HighwayII sequence; (a), (b) and (c) represent Stauffer and Grimson's, Lee's and the proposed method, respectively. (d) is the average segmentation MSE for the three techniques.

Natural changes in illumination are difficult to evaluate since the ground truth is not available. Traditionally, a qualitative evaluation is performed. Figure 4 shows excerpts of foreground segmentation. The displays are raw foreground masks, i.e. no post-processing (opening, closing, median filtering, etc.) has been applied. The proposed algorithm outperforms Stauffer and Grimson's and Lee's techniques for global illumination changes (Fig. 4, rows 1 to 3) and for shadow edges (Fig. 4, row 4). It is worth noting that saturated pixels here appear as foreground (black) for Lee's method.

Fig. 4. Examples of foreground masks for indoor and outdoor scenes. From left to right: original, proposed algorithm, Stauffer and Grimson, Lee.

6. CONCLUSION

This paper proposed a new technique for background subtraction by mixture of Gaussians that can handle large variations in the background intensity distribution. It was shown that the phenomenon of pixel saturation is due to the degeneracy of the variance of some Gaussian mixture components, a consequence of a large learning rate. To address this problem, the variance is bounded, and the mean and variance of the Gaussian components are updated with different learning rates. The mean is updated with an adaptive rate, providing an accelerated update when sudden illumination changes occur. Experimental results were presented which show that the proposed method is robust to large variations in pixel intensity values and abrupt changes in background distribution.

7. REFERENCES

[1] R. C. Jain and H. H. Nagel, "On the analysis of accumulative difference pictures from image sequences of real world scenes," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, pp. 206-213, 1979.
[2] A. Elgammal, D. Harwood, and L. Davis, "Non-parametric model for background subtraction," in European Conf. on Computer Vision, Dublin, 2000, vol. 1843 of Lecture Notes in Computer Science, pp. 751-767.
[3] N. M. Oliver, B. Rosario, and A. P. Pentland, "A Bayesian computer vision system for modeling human interactions," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 831-843, 2000.
[4] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, 1999, vol. 2, pp. 246-252.
[5] D.-S. Lee, "Effective Gaussian mixture learning for video background subtraction," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 827-832, 2005.
[6] N. Martel-Brisson and A. Zaccarin, "Learning and removing cast shadows through a multidistribution approach," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 29, no. 7, pp. 1133-1146, 2007.
[7] L. Zhou, H. Kaiqi, T. Tieniu, and W. Liangsheng, "Cast shadow removal with GMM for surface reflectance component," in Proc. of IEEE International Conference on Pattern Recognition, 2006, vol. 1, pp. 727-730.