Gas Distribution Modeling Using Sparse Gaussian Process Mixture Models Cyrill Stachniss, Christian Plagemann, Achim Lilienthal, Wolfram Burgard University of Freiburg, Germany & Örebro University, Sweden
Applications of Gas Distribution Modeling Oil refinery surveillance Garbage dump site surveillance Pollution monitoring in cities Air quality monitoring Garbage detection Rescue Robotics Disaster Prevention
GDM with Autonomous Sensor Networks Integration of mobile sensors positioned by human operators or by mobile robots accurate positioning useful for other tasks central integration into one consistent model model should allow predictions about observed and unobserved locations
Gas Dispersal Turbulent Flow Characteristics turbulent transport is much faster than molecular diffusion gaseous ethanol at 25 °C and 1 atm: diffusion constant: 0.119 cm²/s diffusion velocity: 20.7 cm/h turbulent flow is chaotic/unpredictable the instantaneous velocity/concentration at one instant of time is generally insufficient to predict the velocity some time later high degree of vortical motion large-scale eddies cause a meandering dispersal small-scale eddies stretch and twist the gas distribution, resulting in a complicated patchy structure P.J.W. Roberts and D.R. Webster, "Turbulent Diffusion". In "Environmental Fluid Mechanics: Theories and Applications". ASCE Press, Reston, Virginia, 2002
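The two ethanol figures above are consistent with each other if the "diffusion velocity" is read as the characteristic diffusion length sqrt(D * t) covered in one hour. That reading is an assumption, not something the slide states. A minimal check of the arithmetic:

```python
import math

D_CM2_PER_S = 0.119        # diffusion constant of gaseous ethanol, from the slide
SECONDS_PER_HOUR = 3600.0

def diffusion_length_cm(d_cm2_per_s: float, t_s: float) -> float:
    """Characteristic diffusion length sqrt(D * t) in cm (assumed interpretation)."""
    return math.sqrt(d_cm2_per_s * t_s)

length_1h = diffusion_length_cm(D_CM2_PER_S, SECONDS_PER_HOUR)
print(f"{length_1h:.1f} cm covered in the first hour")
```

Because the length grows with the square root of time, pure diffusion slows down further over longer periods, which is why turbulent transport dominates in practice.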
Potential Environments
Statistical Gas Distribution Modeling Simulation of Turbulent Flow Characteristics? no general solution to the non-linear fluid dynamics equations numerical simulations computationally expensive and depend sensitively on the boundary conditions boundary conditions not known in typical scenarios model gas distribution statistically from a large number of measurements
Statistical Gas Distribution Modeling Interpret measurements in a statistical sense Build a representation of the observed gas distribution from a sequence of measurements Aim at a probabilistic representation gas sensor measurements treated as random variables Is it a good model? Allows inferring concentration levels "explains observations best and accurately predicts new ones" Allows inferring hidden parameters average concentrations gas source locations
Problem Definition Learn a predictive model $p(y_* \mid x_*, x_{1:n}, y_{1:n})$ with gas prediction $y_*$, query location $x_*$, measurement locations $x_{1:n}$, and gas measurements $y_{1:n}$ Or: estimate a posterior over gas distribution models
Related Approaches Average concentration on a grid [Ishida et al. 98] Grid with bicubic interpolation [Pyk et al. 06] Peak concentration estimates [Purnamadjaja & Russell 05] Kernel Extrapolation Algorithm [Lilienthal & Duckett 04] FastSLAM1 on grids combined with GDM [Lilienthal et al. 07] So far: no predictive uncertainty
Gaussian Processes (GPs) GPs are a framework for non-parametric regression They provide a predictive mean and variance A covariance function specifies the influence of neighboring data points The covariance function has hyperparameters that need to be learned Cost of learning: matrix inversion O(n^3) (n training samples) Predictive mean: $\mu(x_*) = k(x_*, X)\,[K(X, X) + \sigma_n^2 I]^{-1}\, y$ with $k(x_*, X)$ = Cov(test, train), $K(X, X)$ = Cov(train), $\sigma_n^2 I$ = observation noise, $y$ = targets
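The GP prediction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' Matlab code: the function names, the zero prior mean, the default hyperparameter values, and the toy data are all assumptions.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared exponential covariance between point sets a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise_var=0.01, **kernel_args):
    """Predictive mean and variance (including observation noise) of a zero-mean GP."""
    k_train = sq_exp_kernel(x_train, x_train, **kernel_args)
    k_train = k_train + noise_var * np.eye(len(x_train))      # Cov(train) + obs noise
    k_cross = sq_exp_kernel(x_test, x_train, **kernel_args)   # Cov(test, train)
    chol = np.linalg.cholesky(k_train)                        # the O(n^3) step
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_cross @ alpha
    v = np.linalg.solve(chol, k_cross.T)
    prior_var = np.diag(sq_exp_kernel(x_test, x_test, **kernel_args))
    var = prior_var - np.sum(v**2, axis=0) + noise_var
    return mean, var

# Toy 1D data, made up for illustration (not from the experiments).
x_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.sin(x_train[:, 0])
x_test = np.array([[1.0], [50.0]])
mean, var = gp_predict(x_train, y_train, x_test, noise_var=1e-6)
print(mean, var)
```

Far from all training data the variance returns to the prior level, while near a measurement it shrinks regardless of the measured value.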
GDM with Gaussian Processes Using all measurements is too expensive Use a subsampled set of observations (40 locations & gas measurements out of ~2500) Learn hyperparameters via cross-validation [Figure panels: data collection, mean prediction, predictive variance]
Observations Issues in gas distribution rather smooth gas distributions away from hotspots localized patches of high concentration diffusion GP: predictive uncertainty is independent of the gas measurements (for fixed hyperparameters it depends only on the measurement locations)
GP Mixture Models Each component is an individual GP A gating function specifies the influence of the individual models for each position In GDM: different components have the ability to represent the individual physical properties One GP component for the hotspots One GP component for the background GP mixture models first introduced by Tresp in 2000
GP Mixture Models Math
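A hedged sketch of the mixture prediction, using the standard moments of a mixture of Gaussians. The notation is assumed for illustration: $m$ components, gating probabilities $P(z(x_*) = i)$ that sum to one, and per-component GP predictions with mean $\bar{y}_i(x_*)$ and variance $\sigma_i^2(x_*)$.

```latex
% Mixture mean at a query location x_*
\bar{y}(x_*) = \sum_{i=1}^{m} P\big(z(x_*) = i\big)\, \bar{y}_i(x_*)

% Mixture variance (law of total variance)
\sigma^2(x_*) = \sum_{i=1}^{m} P\big(z(x_*) = i\big)
    \Big( \sigma_i^2(x_*) + \big(\bar{y}_i(x_*) - \bar{y}(x_*)\big)^2 \Big)
```

The second term inside the sum makes the predictive variance grow where the components disagree, so unlike a single GP the mixture's uncertainty can depend on the measured values.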
Learning the Mixture Model Selecting data points for training Initialization of the mixture components Learning the components Finding hyperparameters
Subsampling & Error GP Randomly draw data points Initialize the first component with the drawn data Learn an Error GP Draw data points in areas of high error Initialize the 2nd component with these values
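The two drawing steps can be sketched as follows. This is a simplification under stated assumptions: instead of learning an Error GP, it samples directly in proportion to the raw absolute residuals of the first component, and the function names and toy error values are hypothetical.

```python
import numpy as np

def draw_random_subset(n_total, k, rng):
    """Uniformly draw k distinct indices to initialize the first component."""
    return rng.choice(n_total, size=k, replace=False)

def draw_high_error_subset(abs_error, k, rng):
    """Draw k distinct indices with probability proportional to the absolute error."""
    weights = np.asarray(abs_error, dtype=float) + 1e-12   # keep every weight non-zero
    probs = weights / weights.sum()
    return rng.choice(len(weights), size=k, replace=False, p=probs)

# Toy residuals (made up): the first 100 of 1000 points are poorly explained.
rng = np.random.default_rng(0)
abs_error = np.full(1000, 0.001)
abs_error[:100] = 10.0
first_idx = draw_random_subset(len(abs_error), 20, rng)
second_idx = draw_high_error_subset(abs_error, 20, rng)
print(np.sum(second_idx < 100), "of 20 points come from the high-error region")
```

A smoothed error model such as the Error GP would additionally spread the sampling mass to neighbors of high-error points, which raw residuals cannot do.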
Learning the Components Learning is done in an EM-style procedure Determine which data point belongs to which component Recompute the GP components based on the new assignment/weight The weight models the influence of each training point on a component: weight 1 corresponds to a normal GP data point, weight 0 to a data point with infinite noise
Learning the Components M-step E-step
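A minimal sketch of one EM iteration, assuming the common formulation in which each component's observation noise is divided by the point's responsibility. All function names, the responsibility floor, and the toy numbers are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def gaussian_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y (broadcasts over arrays)."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def e_step(y, comp_means, comp_vars, gate_probs):
    """Responsibility of each component for each point, shape (n_points, n_comps)."""
    weighted = gate_probs * gaussian_pdf(y[:, None], comp_means, comp_vars)
    return weighted / weighted.sum(axis=1, keepdims=True)

def per_point_noise(responsibility, noise_var, floor=1e-6):
    """Noise per training point: a responsibility near 0 acts like infinite noise."""
    return noise_var / np.maximum(responsibility, floor)

def weighted_gp_mean(k_train, k_cross, y, point_noise):
    """M-step prediction of one component: GP mean with diagonal per-point noise."""
    return k_cross @ np.linalg.solve(k_train + np.diag(point_noise), y)

# Toy example: component 0 predicts 0.0 everywhere, component 1 predicts 5.0.
y = np.array([0.0, 5.0])
comp_means = np.array([[0.0, 5.0], [0.0, 5.0]])   # rows: points, columns: components
comp_vars = np.ones((2, 2))
gate_probs = np.full((2, 2), 0.5)
resp = e_step(y, comp_means, comp_vars, gate_probs)
noise_comp0 = per_point_noise(resp[:, 0], noise_var=0.1)
print(resp, noise_comp0)
```

In the toy example the second measurement is assigned almost entirely to component 1, so its noise in component 0 becomes very large and it barely influences that component.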
Finding Good Hyperparameters Squared exponential covariance function $k(x, x') = \sigma_f^2 \exp\!\big(-\|x - x'\|^2 / (2\ell^2)\big)$ noise: $\sigma_n^2$ Optimization inside the M-step of EM with Rasmussen's minimize: overfitting problems Sampling hyperparameters outside the EM Start with a good guess [Snelson & Ghahramani 06] Even iteration: sample random parameters Odd iteration: refine the best solution so far Cross-validation to avoid overfitting
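The alternating sampling scheme can be sketched as below. The details are assumptions: hyperparameters are handled in log space, "refine" is implemented as a small Gaussian perturbation of the best candidate, and `score_fn` is a hypothetical stand-in for the cross-validation error of a mixture trained with the candidate parameters.

```python
import random

def sample_hyperparameters(score_fn, initial, n_iter=60, step=0.1,
                           log_range=(-3.0, 3.0), seed=0):
    """Keep the best-scoring candidate; lower scores are better."""
    rng = random.Random(seed)
    best = list(initial)
    best_score = score_fn(best)
    for it in range(n_iter):
        if it % 2 == 0:
            # Even iteration: sample random parameters from the allowed range.
            cand = [rng.uniform(*log_range) for _ in best]
        else:
            # Odd iteration: refine the best solution so far.
            cand = [p + rng.gauss(0.0, step) for p in best]
        score = score_fn(cand)
        if score < best_score:
            best, best_score = cand, score
    return best, best_score

def toy_score(log_params):
    """Made-up stand-in for a cross-validation error with its optimum at 1.0."""
    return sum((p - 1.0) ** 2 for p in log_params)

best, best_score = sample_hyperparameters(toy_score, initial=[0.0, 0.0])
print(best, best_score)
```

Scoring on held-out data rather than on the training set is what guards against the overfitting seen when optimizing inside the M-step.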
1D Example
Experiments 5 real-world datasets
Experiments (A) [Figure panels: GP mean prediction, GP predictive variance, mixture components, mixture gating function, mixture mean prediction, mixture predictive variance]
Experiments (B) [Figure panels: initial guess, error GP, 2nd component; after EM: mixture mean, mixture predictive variance]
Experiments (B, 2D view) [Figure panels: GP mean prediction, GP predictive variance, mixture mean prediction, mixture predictive variance]
Comparison GP mixture models vs. Lilienthal & Duckett Comparable mean estimates, but GPs also provide a predictive uncertainty
Comparison 100 data points used for training 1000 data points used to evaluate the sampled hyperparameters Rest of the data for testing Average negative log likelihood
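The evaluation measure can be sketched as follows, assuming each test prediction is summarized by a single Gaussian with a predictive mean and variance. For a mixture, the exact density would be a sum of Gaussians; the function name and the toy values are illustrative.

```python
import math

def avg_neg_log_likelihood(y_true, pred_mean, pred_var):
    """Average negative log likelihood of test targets under Gaussian predictions."""
    total = 0.0
    for y, mu, var in zip(y_true, pred_mean, pred_var):
        total += 0.5 * math.log(2.0 * math.pi * var) + 0.5 * (y - mu) ** 2 / var
    return total / len(y_true)

# Toy comparison (made-up numbers): the same wrong mean with two variances.
honest = avg_neg_log_likelihood([0.0], [1.0], [1.0])
overconfident = avg_neg_log_likelihood([0.0], [1.0], [0.01])
print(honest, overconfident)
```

Lower is better. The measure penalizes overconfident wrong predictions heavily, so it rewards models whose predictive variance is informative and not only models with an accurate mean.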
Runtime 100 training points per model 2 model components Pentium 2.4 GHz, Matlab code Learning one mixture model (all steps) takes < 1s 60 hyperparameter sampling runs overall runtime < 1 min Complexity O(training points^3 + test points^2)
Conclusion Gas distribution modeling with Gaussian process mixtures Uses a sparse set of training data Provides a predictive model Most alternative approaches do not provide predictive variances; GP models do Outperforms the standard GP model
Future Work Non-stationary covariance functions Heteroscedastic GPs Robot action selection based on the model