Discovery of the Source of Contaminant Release


Devina Sanjaya, Henry Qin

1 Introduction

The ability of computers to model contaminant release events and predict the source of release in real time is crucial in many applications, especially environmental safety monitoring and homeland security. In the event of an unintentional industrial accident or a biological attack in an urban environment, an immediate, accurate response is required. A real-time computer program that identifies the contaminant, locates the source of the release, and predicts the subsequent path of contamination can assist the decision-making process for evacuation and countermeasures.

The contaminant source inversion problem involves intricate geometry, uncertain flow conditions, and limited, noisy sensor readings. Moreover, the problem is generally ill-conditioned in the sense that small changes in the sensor readings can cause large changes in the calculated source of release [2]. This makes single-point deterministic calculations not robust: some inputs may produce nearly the same outputs, especially once measurement error is taken into account. Statistical approaches increase robustness, but they often require large samples and numerous forward simulations, which quickly become computationally expensive. Multiple previous studies have sought to reduce this computational cost, for example through grid coarsening, reduced-order modeling, and stochastic expansions. Previous studies have also applied uncertainty quantification methods to analyze the propagation of input uncertainties.

In this paper, we combine machine learning models and computational fluid dynamics to discover the source of release for large-scale problems in real time. Various machine learning models are evaluated for robustness to noisy sensor readings and limited training data. We also compare our results with statistical results from Markov chain Monte Carlo (MCMC) with a single walker [3] and with ensemble walkers [1].

2 Data Format

Our training and test data are obtained using the computational fluid dynamics software XFLOW, developed by Dr. Fidkowski at the University of Michigan, Ann Arbor. For the 2D case, we simulate a contaminant release around cross sections of buildings (see Figure 1, left). Five sensors are placed around the buildings in a pseudorandom fashion, without iteration or tuning. Each sensor takes three readings spaced equally in time, for a total of 15 sensor readings. A spatial approximation order of p = 2 is used, and the Peclet number for the simulations, based on the mean velocity and the domain size in the x-direction, is Pe = 100. We use the 15 sensor readings as the input features to our machine learning models, and X and Y as the output features. Each forward simulation used to obtain sensor readings completes in less than 1 minute when parallelized across 8 processors.

Figure 1: Mesh and setup used for the CFD simulations in the 2D (left) and 3D (right) cases. Sensor readings from both cases are used as our training and test sets.

For the 3D case, we simulate a contaminant release in a realistic urban area (see Figure 1, right); this domain is the same as in Lieberman et al. [4]. Ten sensors are placed around the buildings, again in a pseudorandom fashion without iteration or tuning. Each sensor provides 4 readings, for a total of 40 sensor readings. A spatial approximation order of p = 1 was used, and the Peclet number for the simulations, based on the mean velocity and the domain extent in the direction of the velocity, was Pe = 50. Our input features are the 40 sensor readings, and our outputs are X, Y, Z, and Amplitude. Each forward simulation takes about 8 minutes on 100 processors.

3 Implementation & Discussion

In this section, we discuss how we apply machine learning models to our problem. All of our models are trained using the statistical programming language R or using Matlab. For the purposes of the discussion below, we assume that we are only attempting to predict X, since predicting the other variables (Y in the 2D case; Y, Z, and Amplitude in the 3D case) is symmetric. Moreover, to create realistic test cases, sensor errors are considered; we consider both uniform and Gaussian error distributions. Due to time constraints, we mainly discuss results from the 2D case.

3.1 Test Error Definition

All reported test errors are defined as a percentage of the interval of the parameter being predicted. Equation (1) shows how we compute the test error when predicting X:

\%\,\text{Test Error} = \frac{|x - \hat{x}|}{\max(X) - \min(X)} \times 100,    (1)

where x is the true value of X, \hat{x} is the predicted value of X, and \max(X) and \min(X) are the endpoints of the interval of X. In the 2D case, X ∈ [0, 1], and in the 3D case, X ∈ [0, 4.71]. We believe this definition of test error, rather than the standard definition of (expected − actual)/actual, allows us to fairly compare predictions across examples with different true locations of the source of release. Intuitively, since we are trying to predict a location rather than a quantity, an error of 0.01 model units should be interpreted the same way whether the ground truth is 0.1 or 0.2, and our definition of error reflects this. Furthermore, this percentage error definition enables us to compare test errors between the 2D and 3D cases.

3.2 Perfect Sensor Readings

First, we consider the case where all sensor readings are perfect. For our 2D test case, we have 144 examples in total: 72 for training and 72 for testing. We found that ordinary linear regression, which directly models the output values as a linear combination of the raw input feature values, does not perform well, with a mean error of 24%. However, linear regression with a logarithmic feature mapping (equation (2)) performs quite well, with a mean error of 1%. This suggests that there is a log relationship between the sensor readings and the location of the contaminant release:

X = \beta_0 + \beta_1 \log r_1 + \beta_2 \log r_2 + \cdots + \beta_{15} \log r_{15}.    (2)

To use the log-transformed model, we first replace any readings that are less than or equal to zero with the fixed constant 1 × 10^-10, take the log of each sensor reading, and then apply multiple linear regression. Figure 2 (left) shows the residual plot from our testing; the mean error is 1%, with a standard deviation of 1%, in predicting the source of release.

Next, we consider the 3D case. Here, we have 256 examples in total: 200 for training and 56 for testing.
As with the 2D case, ordinary linear regression does not perform well, with a mean error of 22.5%, while linear regression with the logarithmic feature mapping works well: performing the same steps as before, we found a mean error of 0.26% and a standard deviation of 0.26%, as shown in Figure 2 (right). We acknowledge that these errors are unusually low, and we do not claim that they will generalize to other 3D simulations, even assuming perfect sensors.
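To make the fitting procedure of Section 3.2 concrete, the following is a minimal R sketch of the log-feature regression and the test error of equation (1). It is an illustration only, not the code behind the reported numbers; the data frames train and test, the sensor column names r1, ..., r15, and the coordinate column x are assumed names.

# Log-feature linear regression (assumed data frames 'train'/'test' with
# sensor columns r1..r15 and the true source coordinate x).
log_features <- function(df) {
  r <- as.matrix(df[, paste0("r", 1:15)])
  r[r <= 0] <- 1e-10                             # floor non-positive readings before taking logs
  data.frame(x = df$x, log(r))
}

fit  <- lm(x ~ ., data = log_features(train))    # multiple linear regression on log readings
pred <- predict(fit, newdata = log_features(test))

# Percentage test error from equation (1); max(X) - min(X) = 1 for the 2D case
x_range <- 1
err <- abs(test$x - pred) / x_range * 100
c(mean = mean(err), sd = sd(err))

The 3D case follows the same steps, with 40 sensor columns and X ∈ [0, 4.71].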

Figure 2: Test residuals of linear regression with logarithmic feature mapping for the 2D (left) and 3D (right) cases with perfect sensor readings.

3.3 Sensor Readings with Uniform Error Distribution

Having successfully modeled the simple case, we move on to a more complex one: uniform, substantial sensor error. Based on knowledge of the field, 1 × 10^-2 is a reasonable sensor error. To apply the uniform error, we add this fixed constant to each of our sensor readings and treat the perturbed readings as the new raw features. During training, we naturally assume that the constant is unknown, as it would be in practice. Our training and test sets are the same as in the previous case.

Unfortunately, this more complex case clearly demonstrated that our previous method is not robust against uniform sensor error: our test errors increased by an order of magnitude. This behavior is consistent with the ill-conditioned nature of the inverse problem. In retrospect, we could have anticipated this. When a systematic sensor error is added, the true model takes the form of equation (3), while we were still trying to model it with equation (2):

X = \beta_0 + \beta_1 \log (r_1 + \epsilon) + \beta_2 \log (r_2 + \epsilon) + \cdots + \beta_{15} \log (r_{15} + \epsilon).    (3)

We tried to fit equation (3) using R's nonlinear least squares function nls, but ran into singularity problems. Since direct model fitting did not pan out, we implemented a hill-climbing algorithm to greedily discover the value of the hidden constant ε. Specifically, the algorithm varies ε to maximize the R^2 statistic of a least squares fit against log(r_i − ε). The procedure is as follows:

1. Initialize a step size s to the constant 0.001.
2. Choose a random starting value ε_0 in [1 × 10^-2, 5 × 10^-2].
3. Fit a least squares model using the features log(r_i − ε_0).
4. Fit two more least squares models using ε = ε_0 − s and ε = ε_0 + s.
5. Set ε_0 equal to the ε among these three models that produced the highest R^2 statistic.
6. Halve the step size: s ← s/2.
7. If R^2 > 0.99, terminate the algorithm and report ε_0; otherwise, return to step 3.

This algorithm pinpoints ε well in the 2D case, and we can then substitute the recovered ε into equation (3) before modeling the data. However, the hill-climbing algorithm does not work well in the 3D case due to multiple local maxima.
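A minimal R sketch of this hill-climbing search is given below. It is an illustrative reimplementation of steps 1-7, not the authors' original code; the data frame train and the sensor/target column names are the same assumed names as in the earlier sketch.

# R^2 of a log-feature least squares fit for a trial offset eps
r2_for_eps <- function(eps, df) {
  r <- as.matrix(df[, paste0("r", 1:15)]) - eps  # remove the assumed uniform offset
  r[r <= 0] <- 1e-10                             # keep the logarithms defined
  feats <- data.frame(x = df$x, log(r))
  summary(lm(x ~ ., data = feats))$r.squared
}

# Greedy hill climbing over eps (a safety cap on iterations is added for the sketch)
hill_climb_eps <- function(df, s = 1e-3, max_iter = 50) {
  eps0 <- runif(1, 1e-2, 5e-2)                   # random start in [1e-2, 5e-2]
  for (i in seq_len(max_iter)) {
    cand <- c(eps0 - s, eps0, eps0 + s)
    r2   <- vapply(cand, r2_for_eps, numeric(1), df = df)
    eps0 <- cand[which.max(r2)]                  # keep the eps with the highest R^2
    s    <- s / 2                                # halve the step size
    if (max(r2) > 0.99) break                    # terminate once R^2 exceeds 0.99
  }
  eps0
}

eps_hat <- hill_climb_eps(train)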

3.4 Mixed Sensor Readings

We now consider the case where some sensor readings happen to contain no errors while others have uniform or Gaussian-distributed errors. To simulate these errors, we first replicate the original set of examples three times, creating a new data set with three times the number of examples. Next, we add a constant error term to one full replica, add Gaussian-distributed errors (mean 0, standard deviation 1 × 10^-2) to the second full replica, and then randomize the order of the data set. From this mixed data set, we randomly select half the examples for training and hold out the other half as a test set.

Multiple machine learning models are trained: linear regression, linear regression with logarithmic feature mapping, locally weighted linear regression with logarithmic feature mapping, decision trees, boosting, random forests, and K-nearest neighbors. Figure 3 compares these models on several test error metrics (mean, standard deviation, median, and 90th percentile) for our 2D case. The random forest gives both the lowest mean test error, about 5%, and the lowest 90th-percentile error, about 10%. Figure 4 shows the test residuals of the random forest for our 2D and 3D cases.

Figure 3: Test error metrics (mean, standard deviation, median, and 90th percentile) for all machine learning methods (linear, linear with log mapping, locally weighted linear with log mapping, decision tree, boosting, random forest, and K-nearest neighbor) applied to the 2D case with mixed sensor readings.

Figure 4: Test residuals of the random forest applied to the 2D (left) and 3D (right) cases with mixed sensor readings. Note that these figures are not on the same scale because the 2D and 3D cases have different ranges for their dimensions.
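For reference, the mixed data set and the random forest of this section can be sketched in R roughly as follows. This is an illustration under assumptions, not the authors' code: the clean 2D data frame clean with columns r1, ..., r15 and x is an assumed name, and the CRAN randomForest package is assumed as the random forest implementation.

library(randomForest)                            # assumed random forest implementation

sensors <- paste0("r", 1:15)

# Three replicas: clean, constant (uniform) offset of 1e-2, and Gaussian noise (sd = 1e-2)
rep_unif <- clean
rep_unif[sensors] <- rep_unif[sensors] + 1e-2
rep_gauss <- clean
rep_gauss[sensors] <- rep_gauss[sensors] +
  matrix(rnorm(nrow(clean) * length(sensors), sd = 1e-2), nrow = nrow(clean))

mixed <- rbind(clean, rep_unif, rep_gauss)
mixed <- mixed[sample(nrow(mixed)), ]            # randomize the order of the data set

train_idx <- sample(nrow(mixed), nrow(mixed) %/% 2)    # half for training, half held out
rf   <- randomForest(x ~ ., data = mixed[train_idx, c(sensors, "x")])
pred <- predict(rf, newdata = mixed[-train_idx, ])
err  <- abs(mixed$x[-train_idx] - pred) * 100          # equation (1) with max(X) - min(X) = 1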

3.5 Comparison with MCMC

We compare our results with the statistical results from MCMC with single and ensemble walkers presented in [5]. Using MCMC with noisy sensor readings (errors of 1 × 10^-2), we obtained errors of less than 1% in predicting X for both the 2D and 3D cases. Although the results from MCMC are far more accurate, it takes substantial time to obtain a single prediction from MCMC, because generating a set of sensor readings at each proposed location of the MCMC walker(s) is time consuming. For instance, the 2D case converges in about 4 minutes on 32 processors, and the 3D case converges in about 6 minutes on 100 processors. On the other hand, we can train a random forest and evaluate hundreds of examples in less than 1.5 seconds on a single processor.

4 Conclusion

To summarize, we make the following contributions in this paper. With perfect sensor data, the relationship between the sensor readings and the contaminant source is a simple log-linear one. Of all the models we experimented with, the random forest proved to be the most robust against noisy data. Compared with MCMC, supervised learning requires far less computational power, but it is less accurate and less robust to noisy data. More research is required to increase the robustness of supervised learning to noisy data. To the best of our knowledge, predicting the location of a contaminant release in a realistic setting remains an open problem.

5 Acknowledgement

We gratefully acknowledge Dr. Fidkowski at the University of Michigan, Ann Arbor for the use of his computing resources and simulation software (XFLOW) in generating our training and test data.

References

[1] J. Goodman and J. Weare. Ensemble samplers with affine invariance. Communications in Applied Mathematics and Computational Science, 5(1):65-80, 2010.

[2] J. Hadamard. Lectures on the Cauchy Problem in Linear Partial Differential Equations. Yale University Press, 1923.

[3] W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97-109, 1970.

[4] C. Lieberman, K. Fidkowski, K. Willcox, and B. van Bloemen Waanders. Hessian-based model reduction: large-scale inversion and prediction. International Journal for Numerical Methods in Fluids, 2012.

[5] D. Sanjaya, I. Tobasco, and K. Fidkowski. Adjoint-accelerated statistical and deterministic inversion of atmospheric contaminant transport. Unpublished.