On internal consistency, conditioning and models of uncertainty

Jef Caers, Stanford University

Abstract

Recent research has been tending towards building models of uncertainty of the Earth, rather than a single (or a few) detailed Earth models. However, just as with any model, models of uncertainty usually need to be constrained/conditioned to data to have any predictive power or be useful in decision making. In this presentation, I propose the concept of internal consistency as a scientific basis for studying prior (unconditional) and posterior (conditional) models of uncertainty as well as the various sampling techniques involved. In statistical science, internal consistency is the extent to which tests or procedures assess the same characteristic, skill or quality. In the context of uncertainty, I therefore define internal consistency as the degree to which sampling methods honor the relationship between the unconditional model of uncertainty (prior) and the conditional model of uncertainty (posterior) as specified under a (subjectively) chosen theory (for example, Bayes rule). The tests performed are then various different ways of sampling from the same (conditional or unconditional) distributions. If these distributions are related to each other via a theory, then such tests should yield similar results. I propose various such tests using Bayes rule as the theory. A first test is simply to generate unconditional models, extract data from them using a forward model, and generate conditional models from this randomized data. Internal consistency with Bayes rule would mean that both sets of conditional and unconditional models span the exact same space of uncertainty, simply because the data span the uncertainty in the prior. I show that this is not true for a number of popular conditional stochastic modeling methods: sequential simulation with hard data, gradual deformation, and ensemble Kalman filters for solving inverse problems.
I also show that in some cases a lack of internal consistency leads to a considerable artificial reduction of (conditional) uncertainty, which may have important consequences if such models are used for prediction. A case involving the prediction of flow behavior is presented. Finally, I offer some discussion on the importance of internal consistency in practical applications and introduce some novel approaches to conditioning that are internally consistent as well as computationally efficient.

Introduction

There has been a shift in recent years towards building models of uncertainty instead of just building models. What do I mean by that? In 3D modeling, one is interested in creating a 3D gridded model of the Earth representing the data and the geological understanding of the spatial structures and components. In a way, such a model is a model of what we know. In modeling uncertainty, we are interested in covering the uncertainty about a certain response evaluated on these models, such as flow simulations; we are not merely interested in building a 3D model. Such a model is a model of what we don't know. Modeling uncertainty is therefore more than just cranking the random numbers of the same model a few more times (Caers, 2011); it requires a conceptual change in thinking. What is our state of knowledge, what is our lack of understanding, and how do we quantify this? Often the critical parameters in this uncertainty model need to be identified, because too many parameters are uncertain for any model of uncertainty to be useful. Just like 3D models that are constrained to data, models of uncertainty need to be constrained to data. But what does this mean? Does it simply mean that every model in the set of 3D models generated needs to match the data in the same fashion? To judge such conditioning, we introduce the concept of internal consistency. In order not to invent yet another term, I borrowed this notion from statistical science, where it refers to the extent to which tests or procedures assess the same characteristic, skill or quality. It is a measure of the precision between the observers or measuring instruments used in a study. For example, a researcher designs a questionnaire to find out about college students' dissatisfaction with a particular textbook. Analyzing the internal consistency of the survey items dealing with dissatisfaction will reveal the extent to which the items on the questionnaire focus on the notion of dissatisfaction.
In this paper we will not necessarily follow this literal interpretation, but will focus on the fact that if two procedures test the same skill or property, then their scores should be similar. The property being studied here is the conditioning of various methods in reservoir modeling, be it to well-log or production data. At first, we will focus on a simple test: if models of uncertainty are conditioned to a random set of data, then this conditional model of uncertainty should be the same as the unconditional model of uncertainty. If that is not the case, then the conditioning method is not internally consistent with the theory that links conditional and unconditional models. We will see, for example, that conditioning techniques that perfectly match the data can in fact be internally inconsistent.

Internal consistency for Earth models

Two schools of thought

Internal consistency requires a theory, hypothesis or objective; that is why we call it internal. There is no such thing as absolute internal consistency or external consistency. In conditioning we can use

Bayes rule as such a theory. But it should be stated that we do not need to use Bayes rule; we could invent other rules, and as long as we stay consistent with those rules, we know what we are doing and are following a scientifically rigorous path. Bayes rule is very simple:

f_{M|D}(m|d) = f_{D|M}(d|m) f_M(m) / f_D(d)

where M is the random vector describing a gridded Earth model and D is the random vector describing the data outcomes. Bayes rule states the relationship between the prior or unconditional model of uncertainty f_M and the conditional or posterior model of uncertainty f_{M|D}. Once you choose the prior and the likelihood function, the posterior is fixed; you can no longer choose it independently without being internally inconsistent with Bayes rule.

Figure 1: flowchart depiction of the two schools of thought. Bayesian view (explicit theory): prior P(m) and data d feed a sampler of P(m|d), producing models m(1), m(2), m(3), ... Frequentist view (implicit theory): a model of spatial continuity statistics and data d feed an algorithm, producing models m(1), m(2), m(3), ...

Generally, two schools of thought have prevailed in geostatistics; see Figure 1. The Bayesian school of thinking prescribes that one should explicitly state the prior and likelihood, then use Bayes rule to determine the posterior, and then sample as accurately as possible from this posterior distribution. By accurate sampling, we mean uniform sampling. This way of thinking is very rigorous, but also very cumbersome. First, very few explicit multivariate distributions are known. Secondly, parameterizing and

stating explicit parameters for these distributions is tedious; as a result, simple models, such as the multi-Gaussian with simplified parameter assumptions (such as homoscedasticity of the variance), are assumed. Last but not least, sampling methods such as MCMC are impractical if the data-model relationship is complex or CPU-demanding to evaluate. But the Bayesian view is internally consistent. In juxtaposition, there is a more frequentist approach to modeling, namely, that any set of 3D Earth models represents a model of uncertainty, whether these models are conditional or unconditional. In this view, there is no need to specify any distribution functions explicitly. As long as one can create models, one is fine. Any algorithm can be viewed as a model, and this algorithm can be conditional or unconditional. Bayes rule is used in various ways, for example to constrain to data, but it is generally not used in an explicit way to link conditional and unconditional models of uncertainty. But are models created in this way internally consistent? Let's consider an example.

Testing for internal consistency

We design a simple test for internal consistency between conditioning mechanism, theory and models of uncertainty. Recall that as the theory we chose Bayes rule: a subjective choice, but a choice nonetheless. Our test works as follows, for both the Bayesian and frequentist views. We assume that the data-model relationship is given by a forward model, namely

d = g(m)

In the Bayesian world the test would be:

1. Sample m from f_M
2. Generate d from d = g(m)
3. Specify the likelihood f_{D|M} and therefore the posterior
4. Sample from f_{M|D}(m|d)

For the frequentist view, one has the same test, but expressed differently in practice:

1. Generate an unconditional simulation m using an unconditional algorithm
2. Generate d from d = g(m)
3. Generate a conditional simulation m|d using the conditional algorithm

What is the purpose here?
In repeating this workflow, we obtain multiple conditional Earth models. The distribution of these conditional Earth models should be exactly the same as the unconditional or prior model; indeed,

∫ f_{M|D}(m|d) f_D(d) dd = ∫ f_{D|M}(d|m) f_M(m) dd = f_M(m) ∫ f_{D|M}(d|m) dd = f_M(m)    (1)
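Equation (1) can be checked numerically. The sketch below is a minimal illustration (not from the paper) using a toy linear-Gaussian setup in which the posterior is known in closed form: a standard normal prior on m, a forward model d = m plus Gaussian noise, and the conjugate posterior. Repeating steps 1-4 with randomized data should return the prior; the noise variance of 0.5 is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 0.5          # likelihood (noise) variance, assumed for illustration
n = 100_000

# 1. Sample m from the prior f_M = N(0, 1)
m_prior = rng.standard_normal(n)

# 2. Generate data d = g(m) + noise, here with g(m) = m
d = m_prior + rng.normal(0.0, np.sqrt(sigma2), n)

# 3. Conjugate posterior for each d: N(d/(1+sigma2), sigma2/(1+sigma2))
post_mean = d / (1.0 + sigma2)
post_var = sigma2 / (1.0 + sigma2)

# 4. Sample from the posterior f_{M|D}(m|d)
m_cond = post_mean + np.sqrt(post_var) * rng.standard_normal(n)

# Eq. (1): the conditional models, averaged over the randomized data,
# reproduce the prior N(0, 1)
print(m_cond.mean(), m_cond.var())   # both close to 0 and 1
```

An internally inconsistent conditioner (for example, one that always sets the posterior variance too small) would fail this check even though every sample "matches" its data.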

This makes sense, since we randomize the data in such a way that it is consistent with the prior. Note that in this derivation we used our theory, namely Bayes rule.

Figure 2: outline of the simple Boolean model (a 1D grid of 100 cells, with the conditioning location at i=50)

A simple example

To illustrate this internal consistency test, we use a simple but perhaps baffling example of what can go wrong. Consider a 1D grid; see Figure 2. The prior model is a simple Boolean model, which consists of placing exactly five objects on this line, where each object's length is drawn from a given uniform distribution on [minl, maxl]. The objects can overlap. As data, we consider the exact observation of the absence/presence of an object at the middle location; see Figure 2. We define the random variable A(i), where A(i)=1 indicates that an object is present and A(i)=0 that an object is absent. Consider a simple conditioning method, i.e. a method for generating conditional realizations. Clearly we have two cases: either A(50)=1 or A(50)=0. In the first case, we generate a conditional model as follows:

1. Draw an object with a certain length from the uniform distribution
2. Place it uniformly around the conditioning location 50
3. Generate the four remaining objects

When A(50)=0, we do the following:

1. Draw the length of a single object
2. Place it uniformly among those locations that will not violate the conditioning data A(50)=0
3. Repeat this until you have 5 objects

Clearly this will generate conditional simulations that appear to follow the Boolean model. Not quite. Let's run our consistency test. First we need to know the marginal P(A(i)=1). This is easy: we simply generate 1000 unconditional models and calculate their ensemble average. To run the test, we then generate N conditional models with A(50)=1 and N conditional models with A(50)=0, calculate their ensemble averages, and average them according to the marginal, namely

EA = EA_1 P(A=1) + EA_0 (1 − P(A=1))

where EA_1 and EA_0 are the ensemble averages of the conditional models with A(50)=1 and A(50)=0, respectively. If the conditioning method were internally consistent, one should obtain EA = P(A=1), that is, get back the marginal, as stated in Eq. (1). Figure 3 shows that this is not the case: the marginal is shown in red, while the result of our test is shown in blue. The conclusion is that this method is not internally consistent. What happened? To further analyze this puzzling result, we use a sampler in the rigorous Bayesian sense that is known to be exact: the rejection sampler. In rejection sampling we simply generate a model, accept it if it matches the data, and reject it otherwise. Consider first rejection sampling when A(50)=0. In Figure 4 we plot the ensemble average (conditional mean) of 1000 rejection sampler results together with the ensemble average of the simple conditioning method. We get a perfect match. The result is, however, different for the case A(50)=1; see Figure 4. It seems that the average size of the objects generated with our technique is too small. This makes sense: an observation of an object is more likely when the object is large than when it is small, a fact that was not accounted for in our simple conditioning method.

Figure 3: results of the internal consistency test along the 1D grid (the marginal P(A=1), in red, versus ∫ P(m|d)P(d)dd estimated by the test, in blue)
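The Boolean example and the test above can be sketched in a few lines of code. This is an illustrative reimplementation, not the original SCRF code: the grid of 100 cells and the five objects follow the text, while the length range [5, 15] and the 0-based conditioning index 50 are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, NOBJ, MINL, MAXL, LOC = 100, 5, 5, 15, 50   # [MINL, MAXL] and LOC assumed

def unconditional():
    """Prior: place exactly five objects of uniform random length."""
    grid = np.zeros(N, dtype=int)
    for _ in range(NOBJ):
        L = rng.integers(MINL, MAXL + 1)
        s = rng.integers(0, N - L + 1)
        grid[s:s + L] = 1
    return grid

def naive_conditional(a50):
    """The simple conditioning method described in the text."""
    grid = np.zeros(N, dtype=int)
    if a50 == 1:
        # one object placed uniformly among positions covering LOC,
        # then the four remaining objects placed freely
        L = rng.integers(MINL, MAXL + 1)
        s = rng.integers(LOC - L + 1, LOC + 1)
        grid[s:s + L] = 1
        for _ in range(NOBJ - 1):
            L = rng.integers(MINL, MAXL + 1)
            s = rng.integers(0, N - L + 1)
            grid[s:s + L] = 1
    else:
        # each object placed uniformly among starts that avoid LOC
        for _ in range(NOBJ):
            L = rng.integers(MINL, MAXL + 1)
            starts = [s for s in range(N - L + 1) if not (s <= LOC < s + L)]
            s = starts[rng.integers(len(starts))]
            grid[s:s + L] = 1
    return grid

def rejection_conditional(a50):
    """Exact sampler: generate unconditional models, keep only matches."""
    while True:
        g = unconditional()
        if g[LOC] == a50:
            return g

M = 2000
marginal = np.mean([unconditional() for _ in range(M)], axis=0)   # P(A(i)=1)
p1 = marginal[LOC]
ea1 = np.mean([naive_conditional(1) for _ in range(M)], axis=0)
ea0 = np.mean([naive_conditional(0) for _ in range(M)], axis=0)
ea = p1 * ea1 + (1 - p1) * ea0
# internal consistency requires ea ~ marginal everywhere; the naive method
# ignores the size bias of an observed object, so the two differ near LOC
print(np.abs(ea - marginal).max())
```

Replacing `naive_conditional` with `rejection_conditional` in the blended average recovers the marginal up to Monte Carlo error, which is the behavior shown in Figures 3 and 4.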

Figure 4: comparing the rejection sampler with the naïve conditioning method along the 1D grid, for both cases A(50)=0 and A(50)=1

Figure 5: example training image (channel sand in background mud) and a grid with a single hard data point

More examples

In the current and previous SCRF reports we have provided several example illustrations of this internal consistency property. I will briefly summarize the results.

Figure 6: ensemble average of the rejection sampler (using unconditional sequential simulation) and of conditional sequential MPS simulation with snesim (random path)

Conditional MPS simulation

The conditional MPS simulation algorithms often differ from the unconditional ones; just as in the Boolean example, this can lead to internal inconsistency. Consider the simple example shown in Figure 5. A single hard conditioning data point indicating sand is located in the center. A training image of simple sinuous channels is given on the left. Consider first conditioning using the rejection sampler: 150 models are created that match the data. The rejection sampler uses the unconditional version of snesim. The ensemble average is shown on the left in Figure 6. Next, 150 models are created using the same snesim algorithm, but now the conditional version. Clearly, the ensemble average in Figure 6 differs from that of the rejection sampler, meaning that there exists an internal consistency problem between conditional and unconditional snesim. Where does this problem occur? Since snesim works on multi-grids, the single hard data point needs to be relocated to the nearest coarse grid node. This data relocation does not occur when performing unconditional simulation; hence, it is the source of the discrepancy. In the work of Honarkhah (2011), this problem is resolved using a different data relocation algorithm; see his work for details. Figure 7 shows that his dispat code indeed produces results comparable to rejection sampling, even in cases with complex data configurations, as opposed to the simple data relocation implemented in snesim.
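The relocation step can be illustrated with a small sketch. On the multi-grid of level g, with node spacing 2^g, any hard datum that does not fall on a coarse node is snapped to the nearest one; this perturbation is absent from unconditional simulation. This is a simplified illustration of the idea, not the actual snesim code.

```python
import numpy as np

def relocate_to_coarse(points, level):
    """Snap hard-data locations to the nearest node of the multi-grid
    at the given level (node spacing 2**level). Simplified illustration,
    not the snesim implementation."""
    spacing = 2 ** level
    pts = np.asarray(points, dtype=float)
    return (np.round(pts / spacing) * spacing).astype(int)

# two hard data on a 100x100 grid, coarsest multi-grid level 3 (spacing 8)
print(relocate_to_coarse([(50, 50), (13, 27)], level=3))
# the datum at (50, 50) moves to (48, 48); (13, 27) moves to (16, 24)
```

On the finest grid (level 0) the data stay put, so the discrepancy between conditional and unconditional simulation is introduced only on the coarse grids.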

Figure 7: results from dispat

History matching by ensemble Kalman filters

Ensemble Kalman filters (EnKF) have recently become popular in research on methods for obtaining multiple history matched models. In this method, a set of initial or prior reservoir models is generated. Next, a first time step of the flow simulation is executed and the dynamic variables are calculated. The initial set of models is then updated (linearly) as a whole, using the difference between the field data and the response of the initial set, as well as the covariance matrix between the static and dynamic variables. The theory requires that the models be multi-Gaussian and that the relationship between data and model be linear (or nearly linear). This update is repeated until the last time step of the flow simulation. Consider the example in Figure 8. A simple injector and producer configuration is shown on a 31x31 grid. As prior

model, we have a training image (see Figure 9) from which a set of initial reservoir models can be generated; see Figure 10. To apply the EnKF method to these clearly non-Gaussian fields, we use the metric ensemble Kalman filter (Park, 2011), an adaptation of the ensemble Kalman filter performed after a kernel transformation. Figures 11 and 12 show that success is obtained in generating 30 models that reproduce the training image patterns as well as match the data. So, to the eye of the innocent bystander, everything seems perfect. Consider, however, generating 30 history matched models using the rejection sampler. When comparing the conditional variance of the permeability fields generated using these two techniques (see Figure 13), we notice the low variance of the ensemble Kalman filter compared to the rejection sampler. Clearly, the linear and Gaussian hypotheses of the Kalman filter lead to internal inconsistency.

Figure 8: setup of the problem for the EnKF

Figure 9: training image
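The linear update described above is the standard perturbed-observation EnKF analysis step. The sketch below is a generic textbook version (not the metric EnKF of Park, 2011), demonstrated on a toy linear-Gaussian problem where the update's assumptions actually hold; the ensemble size, forward operator and noise variance are all illustrative.

```python
import numpy as np

def enkf_update(M, D, d_obs, obs_var, rng):
    """One EnKF analysis step.
    M: (n_vars, n_ens) static model ensemble; D: (n_data, n_ens) predicted
    (dynamic) data; d_obs: observed data. The whole ensemble is updated
    linearly via the model-data cross-covariance."""
    n_ens = M.shape[1]
    A = M - M.mean(axis=1, keepdims=True)          # model anomalies
    B = D - D.mean(axis=1, keepdims=True)          # predicted-data anomalies
    C_md = A @ B.T / (n_ens - 1)                   # cross-covariance
    C_dd = B @ B.T / (n_ens - 1) + obs_var * np.eye(len(d_obs))
    K = C_md @ np.linalg.inv(C_dd)                 # Kalman gain
    # perturbed observations: one noisy copy of d_obs per ensemble member
    d_pert = d_obs[:, None] + rng.normal(0, np.sqrt(obs_var),
                                         (len(d_obs), n_ens))
    return M + K @ (d_pert - D)

rng = np.random.default_rng(2)
M0 = rng.standard_normal((3, 200))                 # prior ensemble, 3 unknowns
H = np.array([[1.0, 0.5, 0.0]])                    # linear stand-in "simulator"
Ma = enkf_update(M0, H @ M0, d_obs=np.array([1.0]), obs_var=0.1, rng=rng)
print((H @ Ma).var())   # much smaller than (H @ M0).var(): data assimilated
```

When the fields are non-Gaussian or the forward model nonlinear, this same linear gain is still applied, which is exactly where the artificial variance reduction discussed above can enter.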

In the work of Park (2011), a different approach is taken that requires neither linearity nor Gaussianity. His distance-based approach has been tested on the same example, resulting in Figure 13. Clearly, while not yet perfect, it achieves a much greater randomization than the EnKF, while matching the data equally well.

Figure 10: unconditional simulations

Figure 11: conditional simulations (some posterior models obtained by EnKF)

Figure 12: history matching with the rejection sampler (10,000 forward simulations) and EnKF (30 forward simulations)

Figure 13: comparing the conditional variance of the rejection sampler, EnKF and distance methods

Figure 14: the data (9 wells of a real case, rock types and boundary surfaces) and the model (surface-based model)

History matching by means of optimization

A shortcoming of the EnKF method is that it is basically a kriging-type approach (linear updating, covariances); hence it only works for posterior determination if that posterior is multi-Gaussian. One can therefore question how well other optimization techniques work in terms of internal consistency when applied to the history matching problem, since EnKF, like kriging, is basically an optimization technique.

Figure 15: (top) matching performance for the 8 wells; (bottom) prediction performance for the 9th well, in terms of bias and variance

Consider therefore a more complex example, studied in Bertoncello (2011). In his work, a complex surface-based forward model is built (see Figure 14) based on various input parameters, such as the length and height of the lobes, origin location, migration and progradation statistics, as well as several rules related to the deposition and erosion of these bodies. Spatial uncertainty is modeled by placing the lobe surfaces in various different positions. A complex but geologically realistic model can thus be created. In order to fit such a model to data, such as well data, one can execute an iterative trial-and-error type optimization algorithm that modifies the lobe parameters and placement such that some objective function quantifying the mismatch between data and model is minimized. To check how predictive such optimization is, consider the realistic case outlined in Bertoncello (2011). A surface-based model is matched iteratively to 8 wells, with the aim of predicting the outcome of a 9th well in a real-case dataset. Figure 15 shows that as the iterations proceed, the match gets better. Several models were matched, each starting from a different initial solution, providing an envelope of mismatches. All models reached a good match, as long as the optimization was run long enough. How well do these models predict the 9th well? The prediction is good up to a certain amount of matching. Clearly, when the iterations are run for a long time, the predictions start to deteriorate, meaning that the solutions span an uncertainty space that has become too narrow and also biased. Focusing too much on matching data may therefore create poor models of uncertainty. There may be various reasons for this in this example. First, the forward model may not accurately capture reality, hence strict matching leads to a bias. Secondly, optimization methods tend to provide too narrow a space of uncertainty.

Does it matter?
One can wonder whether this sudden focus on internal consistency really matters. Consider again the MPS conditional simulation, and consider now predicting flow in a neighboring producer well; see Figure 16. Clearly, the uncertainty in terms of flow can be highly affected. Nevertheless, one should have a broader discussion of internal consistency than this simple example. If we return to Bayes rule, then the role of the prior becomes important. Often we have a good handle on the likelihood, that is, on how well we should match the data. The problem in modeling uncertainty often lies in the prior. Clearly, in the above examples the data was matched, but it was matched incorrectly. In previous years, we put a lot of emphasis on geological consistency. This geological consistency issue is not the cause of the incorrect matching: all models reproduce the prior statistics and the data, but the posterior has become inconsistent with the prior. Does this matter? It matters if considerable effort has been put into constructing the prior, as for example in the case with multiple training images. If, however, the prior is multi-Gaussian, then, in my opinion, there is no need to be consistent with it, since it is already a fabricated model that is not very much in tune with reality.

Figure 16: the consequence of internal inconsistency for flow prediction (% water produced versus time for the injector-producer configuration, with P10 and P90 envelopes for the rejection sampler and the conditional simulation)

References

Bertoncello, A., 2011. Conditioning of Surface-Based Models to Wells and Thickness Maps. PhD dissertation, Stanford University.

Caers, J., 2011. Modeling Uncertainty in the Earth Sciences. Wiley-Blackwell, 250p.

Honarkhah, M., 2011. Stochastic Simulation of Patterns Using Distance-based Pattern Modeling. PhD dissertation, Stanford University.

Park, K., 2011. Modeling Uncertainty in Metric Space. PhD dissertation, Stanford University.