SIMPAT: Stochastic Simulation with Patterns

G. Burc Arpat
Stanford Center for Reservoir Forecasting, Stanford University, Stanford, CA 94305-2220
April 26, 2004

Abstract

Flow in a reservoir is mostly controlled by the connectivity of extreme permeabilities (both high and low), which are generally associated with marked, multiple-scale geological patterns. Thus, accurate characterization of such patterns is required for successful flow performance and prediction studies. In this paper, a new pattern-based geostatistical algorithm (SIMPAT) is proposed that redefines reservoir characterization as an image construction problem. The approach utilizes the training image concept of multiple-point geostatistics but, instead of exporting statistics, it infers the multiple-scale geological patterns that occur within the training image and uses these patterns as the building blocks for image (reservoir) construction. The method works equally well with both continuous and categorical variables while conditioning to a variety of local subsurface data such as well logs and 3D seismic. The paper includes the technical details of the SIMPAT algorithm and various complex examples to demonstrate the flexibility of the approach.

1 Introduction

Sequential simulation is one of the most widely used stochastic imaging techniques within the Earth Sciences. The sequential simulation idea found its first use in traditional variogram-based geostatistics algorithms such as sequential Gaussian simulation (SGSIM) and sequential indicator simulation (SISIM), as described in Deutsch and Journel (1998). In the early nineties, multiple-point geostatistics introduced the training image concept, proposing to replace the variogram with the training image within an extended sequential simulation framework (Guardiano and Srivastava, 1993). Later, the landmark SNESIM algorithm of Strebelle (2000) made the use of multiple-point geostatistics practical for real reservoirs (Caers et al., 2003; Strebelle et al., 2002).

In this paper, a different approach to the use of training images within the extended sequential simulation framework is explored. The problem is redefined as an image processing problem where one finds the geological patterns in a training image instead of exporting multiple-point statistics through conditional probabilities. Once these patterns are found, the method applies sequential simulation such that, during the simulation, a local pattern similarity criterion is honored instead of local conditional distributions. The method accounts for multiple-scale interactions of geological patterns through a new multiple-grid approach where the scale relations are tightly coupled (see Tran (1994) for the details of the traditional multiple-grid method). The same algorithm works equally well with both categorical and continuous variables, such as facies and permeability, while conditioning to a variety of local subsurface data such as well logs and 3D seismic.

This algorithm is named SIMPAT (SIMulation with PATterns). For a discussion of the general philosophy of the basic algorithm, the reader is referred to Caers and Arpat (2004). This paper is intended to document the technical details of SIMPAT. The paper first introduces a formal notation for the proposed algorithm (Section 2). Then, a detailed discussion of the SIMPAT algorithm using this notation is given (Section 3). This section is followed by the results section where several examples are discussed (Section 4). Finally, future work plans for the algorithm are laid out in Section 5.

2 Notation

For clarity, this section introduces the required notation for explaining the SIMPAT algorithm on a binary (sand/non-sand) case. The extension to multiple categories and continuous variables is discussed later in Section 3.1.1.

2.1 Grids and Templates

Define i(u) as a realization of an indicator variable I(u) modeling the occurrence of a binary event (for example, a model for the spatial distribution of two facies, e.g. sand/non-sand):

$$ i(\mathbf{u}) = \begin{cases} 1 & \text{if at } \mathbf{u} \text{ the event occurs (sand)} \\ 0 & \text{if at } \mathbf{u} \text{ the event does not occur (non-sand)} \end{cases} \quad (1) $$

where u = (x, y, z) ∈ G and G is the regular Cartesian grid discretizing the field of study. When node u of i is unknown or missing, it is denoted by i(u) = χ.

i_T(u) indicates a specific multiple-point event of indicator values within a template T centered at u; i.e., i_T(u) is the vector:

$$ i_T(\mathbf{u}) = \{ i(\mathbf{u} + \mathbf{h}_0), i(\mathbf{u} + \mathbf{h}_1), i(\mathbf{u} + \mathbf{h}_2), \ldots, i(\mathbf{u} + \mathbf{h}_\alpha), \ldots, i(\mathbf{u} + \mathbf{h}_{n_T - 1}) \} \quad (2) $$

where the h_α vectors define the geometry of the n_T nodes of template T and α = 0, ..., n_T − 1. The vector h_0 = 0 identifies the central location u.

To distinguish the realization (the simulation grid), the training image, the primary data (such as well data) and the secondary data (such as seismic), the notations re, ti, dt_1 and dt_2 are used in place of i. For example,

$$ ti_T(\mathbf{u}') = \{ ti(\mathbf{u}' + \mathbf{h}_0), ti(\mathbf{u}' + \mathbf{h}_1), ti(\mathbf{u}' + \mathbf{h}_2), \ldots, ti(\mathbf{u}' + \mathbf{h}_\alpha), \ldots, ti(\mathbf{u}' + \mathbf{h}_{n_T - 1}) \} \quad (3) $$

denotes a multiple-point event scanned from the training image ti at location u', where u' ∈ G_ti and G_ti is the regular Cartesian grid discretizing the training image. Notice that the training image grid G_ti need not be the same as the simulation grid G.

2.2 Patterns

A pattern pat_T^k is the particular k-th configuration of the previous vector of indicator values ti_T(u') of the training image ti, with each indicator value now denoted by pat_T^k(h_α), where k = 0, ..., n_pat − 1 and n_pat is the total number of available patterns in the pattern database associated with the training image ti. The pattern geometry is again defined by a template T containing n_T nodes with the vectors h_α, where α = 0, ..., n_T − 1. That configuration is location-independent; hence the definition of a particular pattern is itself location-independent. Thus:

$$ pat_T^k(\mathbf{h}_\alpha) = \begin{cases} 1 & \text{if at the } \alpha\text{-th template location } \mathbf{h}_\alpha \text{ the event occurs (sand)} \\ 0 & \text{if at the location } \mathbf{h}_\alpha \text{ the event does not occur (non-sand)} \end{cases} \quad (4) $$

Different from i_T(u) defined in Eq. (1), a pattern pat_T^k cannot have unknown/missing nodes, since patterns are extracted from the training image ti and the training image is always fully known by definition. The k-th pattern is then defined by the vector pat_T^k of n_T binary indicator values:

$$ pat_T^k = \{ pat_T^k(\mathbf{h}_0), pat_T^k(\mathbf{h}_1), pat_T^k(\mathbf{h}_2), \ldots, pat_T^k(\mathbf{h}_\alpha), \ldots, pat_T^k(\mathbf{h}_{n_T - 1}) \} \quad (5) $$

where k = 0, ..., n_pat − 1. All n_pat patterns are defined on the same template T. In any finite training image ti, there is a finite maximum number n_pat^ti of patterns that can be extracted over a template T. A filter can be applied to discard patterns with undesirable characteristics to reach the final, smaller n_pat count. Therefore, the final number n_pat of patterns retained might be smaller than n_pat^ti.

2.3 Data Events

A data event dev_T(u) is defined as the vector of indicator values

$$ dev_T(\mathbf{u}) = \{ dev_T(\mathbf{u} + \mathbf{h}_0), dev_T(\mathbf{u} + \mathbf{h}_1), \ldots, dev_T(\mathbf{u} + \mathbf{h}_\alpha), \ldots, dev_T(\mathbf{u} + \mathbf{h}_{n_T - 1}) \} \quad (6) $$

with dev_T(u + h_α) = re(u + h_α), where re is the realization (simulation grid) indicator function as previously defined in Section 2.1. In other words, dev_T(u) = re_T(u).
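Pattern database construction (scanning ti with template T and retaining unique patterns, anticipating steps P-1 and P-2 of Section 3.1) can be sketched as follows. This is a minimal Python sketch, not the SIMPAT implementation: it assumes a 2D numpy array for the training image, a square template, and a fully interior scan in which boundary nodes where the template does not fit are skipped; the function name and array layout are illustrative.

```python
import numpy as np

def extract_patterns(ti, half):
    """Scan a 2D training image with a (2*half+1) x (2*half+1) template
    and return the unique patterns found (unique-only filter)."""
    ny, nx = ti.shape
    side = 2 * half + 1
    patterns = set()
    # Visit every node where the template fits entirely inside the image;
    # patterns never contain unknown nodes because ti is fully informed.
    for y in range(half, ny - half):
        for x in range(half, nx - half):
            pat = ti[y - half:y + half + 1, x - half:x + half + 1]
            patterns.add(tuple(pat.ravel()))  # location-independent configuration
    # Return as an (n_pat, n_T) array: one row per pattern pat_T^k.
    return np.array(sorted(patterns)), side

# Example: a tiny binary (sand/non-sand) training image and a 3x3 template.
ti = np.array([[0, 0, 1, 1, 0],
               [0, 1, 1, 0, 0],
               [1, 1, 0, 0, 0],
               [1, 0, 0, 0, 1]])
pat_db, side = extract_patterns(ti, half=1)
print(pat_db.shape)  # (n_pat, 9): each row is one 3x3 pattern
```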

2.4 Similarity

d⟨x, y⟩ is used to denote a generic dissimilarity (distance) function. The vector entries x and y can be patterns or data events. Ideally, d⟨x, y⟩ should satisfy the conditions:

1. d⟨x, y⟩ ≥ 0,
2. d⟨x, y⟩ = 0 only if x = y,
3. d⟨x, y⟩ = d⟨y, x⟩,
4. d⟨x, z⟩ ≤ d⟨x, y⟩ + d⟨y, z⟩.

In practice, the last two conditions, which are required to make d⟨x, y⟩ a metric, are sometimes relaxed for certain distance functions. A commonly used distance function is the Minkowski metric (Duda et al., 2001). Applied to the case of d⟨x, y⟩ for a data event and a pattern, one has

$$ d\langle dev_T(\mathbf{u}), pat_T^k \rangle = \left( \sum_{\alpha=0}^{n_T - 1} \left| dev_T(\mathbf{u} + \mathbf{h}_\alpha) - pat_T^k(\mathbf{h}_\alpha) \right|^q \right)^{1/q} \quad (7) $$

where q ≥ 1 is a selectable parameter. Setting q = 2 gives the familiar Euclidean distance, while setting q = 1 gives the Manhattan or city block distance.

The entries of d⟨x, y⟩, i.e. the vectors x and y, might contain unknown/missing components. For example, a data event dev_T(u) may have unknown nodes. In such a case, these missing components are not used in the distance calculations; i.e., if u + h_α is unknown, the node α is skipped during the summation, such that

$$ d\langle dev_T(\mathbf{u}), pat_T^k \rangle = \left( \sum_{\alpha=0}^{n_T - 1} d_\alpha\langle dev_T(\mathbf{u} + \mathbf{h}_\alpha), pat_T^k(\mathbf{h}_\alpha) \rangle \right)^{1/q} \quad (8) $$

and

$$ d_\alpha\langle x_\alpha, y_\alpha \rangle = \begin{cases} |x_\alpha - y_\alpha|^q & \text{if } x_\alpha \text{ and } y_\alpha \text{ are both known, i.e. } \neq \chi \\ 0 & \text{if either } x_\alpha \text{ or } y_\alpha \text{ is unknown/missing} \end{cases} \quad (9) $$

The above expression prevents the node h_α from contributing to the similarity calculation if it contains an unknown/missing value in either one of the entries of d_α⟨x_α, y_α⟩.
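Eqs. (7)-(9) translate directly into code. The sketch below is a hypothetical Python rendering, with NaN standing in for the unknown marker χ; it is an illustration of the formulas, not SIMPAT's actual implementation.

```python
import numpy as np

def minkowski_distance(dev, pat, q=1):
    """Distance between a data event and a pattern, Eqs. (7)-(9).
    Unknown nodes of the data event (encoded as NaN, standing in for chi)
    are skipped so they contribute zero to the summation."""
    dev = np.asarray(dev, dtype=float)
    pat = np.asarray(pat, dtype=float)
    known = ~np.isnan(dev)                 # patterns are always fully known
    terms = np.abs(dev[known] - pat[known]) ** q
    return terms.sum() ** (1.0 / q)

# q = 1 gives the Manhattan distance used by SIMPAT; the unknown node is ignored.
dev = np.array([1, 0, np.nan, 1])
pat = np.array([1, 1, 0, 1])
print(minkowski_distance(dev, pat, q=1))  # 1.0
```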

2.5 Multiple-grids

On a Cartesian grid, the multiple-grid view of a grid G is defined by a set of cascading coarse grids G^g and sparse templates T^g, instead of a single fine grid and one large dense template, where g = 0, ..., n_g − 1 and n_g is the total number of multiple-grids for grid G. See Fig. 1 for an illustration of multiple-grid concepts.

[Figure 1: Illustration of multiple-grid concepts for n_g = 2: (a) coarse grid; (b) fine grid; (c) coarse template; (d) fine (base) template.]

The g-th (0 ≤ g ≤ n_g − 1) coarse grid is constituted by every 2^g-th node of the final grid (for g = 0, G^0 = G). If T is a template defined by vectors h_α, α = 0, ..., n_T − 1, where n_T is the number of nodes in the template, then the template T^g used for the grid G^g is defined by

$$ \mathbf{h}_\alpha^g = 2^g \cdot \mathbf{h}_\alpha $$

and has the same configuration of n_T nodes as T but with spacing 2^g times larger.

2.6 Dual Templates

For a given coarse template T^g, one can define a 3D bounding box around the nodes of T^g such that this imaginary box is expanded to contain all the nodes of the template T^g but is no larger (Fig. 2). Then, the dual template of T^g, denoted T̃^g, is defined as the template

that contains all the nodes within this bounding box. In other words, the dual template T̃^g contains the coarse nodes of the current coarse grid G^g as well as all the finest nodes from the finest grid G^0 which fall inside the bounding box. The total number of nodes within the dual template T̃^g is denoted by n_T̃ and typically n_T << n_T̃. Fig. 2 illustrates a coarse grid template (also called a "primal template") and its corresponding dual template for g = 1 for a base template of size 3 × 3.

[Figure 2: Illustration of a primal template and its corresponding dual template for g = 1: (a) primal template; (b) dual template, with the bounding box marked; (c) base template.]

By definition, a dual template T̃^g always shares the center node h_0^g with its corresponding primal template T^g. Using this property, the dual of a pattern pat_T^k, denoted pat_T̃^k, is defined as the pattern extracted at the same location u' in the ti as pat_T^k but using the template T̃^g instead of T^g. A dual data event dev_T̃(u) can be defined similarly.
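As a concrete illustration of Sections 2.5-2.6, the sketch below expands a base template to its g-th coarse version (h_α^g = 2^g · h_α) and builds the corresponding dual template as all fine-grid offsets inside the bounding box of the coarse template. This is a minimal 2D Python sketch with hypothetical function names, not SIMPAT's code.

```python
import numpy as np

def coarse_template(base_offsets, g):
    """Coarse template T^g: same n_T node configuration, spacing 2^g larger."""
    return [(2**g * dy, 2**g * dx) for (dy, dx) in base_offsets]

def dual_template(coarse_offsets):
    """Dual template: every (fine-grid) offset inside the bounding box
    that encloses all nodes of the coarse template."""
    ys = [dy for dy, _ in coarse_offsets]
    xs = [dx for _, dx in coarse_offsets]
    return [(dy, dx)
            for dy in range(min(ys), max(ys) + 1)
            for dx in range(min(xs), max(xs) + 1)]

# 3x3 base template, offsets h_alpha relative to the center node h_0 = (0, 0).
base = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
T1 = coarse_template(base, g=1)      # 9 nodes, spacing doubled
dual = dual_template(T1)             # all 5x5 = 25 nodes in the bounding box
print(len(T1), len(dual))            # 9 25  (n_T << n_dual in general)
```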

3 The SIMPAT Algorithm

In this section, the details of the SIMPAT algorithm are presented using the above notation. For clarity, the algorithm is first presented without considering the details of the application of the multiple-grid approach (such as the use of dual templates) and data conditioning. These topics are elaborated in the subsequent subsections.

3.1 Single-grid, Unconditional SIMPAT

The basic algorithm can be divided into two modules: preprocessing and simulation. A sketch of the resulting simulation loop follows this list.

Preprocessing of the training image ti:

P-1. Scan the training image ti using the template T for the grid G to obtain all existing patterns pat_T^k, k = 0, ..., n_pat^ti − 1, that occur over the training image.

P-2. Reduce the number of patterns to n_pat by applying filters to construct the pattern database. Typically, only unique patterns are taken.

Simulation on the realization re:

S-1. Define a random path on the grid G of re to visit each node only once. Note that there is a variation of this step, as explained in Section 3.1.3.

S-2. At each node u, retain the data event dev_T(u) and find the pattern pat_T* that minimizes d⟨dev_T(u), pat_T^k⟩ for k = 0, ..., n_pat − 1; i.e., pat_T* is the most similar pattern to dev_T(u). See Section 3.1.4 for details of how the most similar pattern is searched within the pattern database.

S-3. Once the most similar pattern pat_T* is found, assign pat_T* to dev_T(u); i.e., for all the n_T nodes u + h_α within the template T, set dev_T(u + h_α) = pat_T*(h_α). See Section 3.1.2 for details of why the entire pattern pat_T* is pasted onto the realization re.

S-4. Move to the next node of the random path and repeat the above steps until all the grid nodes along the random path are exhausted.
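Putting steps S-1 to S-4 together, a minimal single-grid, unconditional loop might look as follows. This Python sketch reuses the hypothetical helpers sketched earlier (extract_patterns, minkowski_distance, and the tiny ti example), with a brute-force linear search standing in for the most-similar-pattern search of Section 3.1.4; it illustrates the algorithm's structure and is not the SIMPAT code itself.

```python
import numpy as np

def simpat_single_grid(ti, shape, half, rng=np.random.default_rng(0)):
    """Single-grid, unconditional SIMPAT sketch (steps S-1 to S-4)."""
    pat_db, side = extract_patterns(ti, half)          # P-1 and P-2
    re = np.full(shape, np.nan)                        # all nodes unknown (chi)
    ny, nx = shape
    # S-1: random path over the interior nodes, each visited once.
    path = [(y, x) for y in range(half, ny - half)
                   for x in range(half, nx - half)]
    rng.shuffle(path)
    for y, x in path:
        # S-2: retain the data event and find the most similar pattern
        # by linear search over the pattern database.
        dev = re[y - half:y + half + 1, x - half:x + half + 1].ravel()
        dists = [minkowski_distance(dev, pat, q=1) for pat in pat_db]
        best = pat_db[int(np.argmin(dists))]
        # S-3: paste the entire most similar pattern onto the realization;
        # pasted nodes stay revisable until the simulation completes.
        re[y - half:y + half + 1, x - half:x + half + 1] = best.reshape(side, side)
    return re  # S-4: loop ends when the random path is exhausted

realization = simpat_single_grid(ti, shape=(16, 16), half=1)
```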

3.1.1 Multiple-categories and Continuous Variables

Section 2 explains the concepts used in the SIMPAT algorithm applied to a binary case. Yet, one may notice that the only time the actual values of the training image ti or the realization re are used is when calculating the distance between a pattern and a data event. Within this context, the extension to multiple categories and continuous variables can be implemented in a trivial manner. As long as the chosen distance function is capable of handling the nature of the simulation variable (i.e. categorical or continuous), all the concepts explained in Section 2 as well as the above basic version of the SIMPAT algorithm remain intact; i.e., one does not need to modify the algorithm to accommodate multiple-category or continuous variables. The Minkowski metric explained in Section 2.4 is indifferent to the nature of the simulation variable and thus can be used with both multiple-category and continuous variables. There exist other distance functions that can only operate on a certain type of variable. Examples of such distance functions are discussed in Section 5.

3.1.2 Enhancing Pattern Reproduction

Step S-3 of the above algorithm dictates that, at node u, the entire content of the most similar pattern pat_T* is assigned to the current data event dev_T(u). The rationale behind this pasting of the entire pattern pat_T* onto the data event dev_T(u) is to improve the overall pattern reproduction within the realization re.

This aim is achieved through two mechanisms. First, assigning values to multiple nodes of re at every step of the random path results in a rapid reduction of the unknown nodes within re. Second, assigning the entire pat_T* to dev_T(u) better communicates the shape of the pattern existing at node u: when another node u' is visited along the random path that happens to fall in the neighborhood of u, the distance calculations include not only the value of u but many of the n_T nodes previously determined at u. Thus, the shape information provided by pat_T* is retained more strongly by encouraging a greater interaction between the patterns selected at neighboring nodes.

It is important to understand that nodes assigned by pasting pat_T* onto dev_T(u) are not marked visited; i.e., they will be visited again during the traversal of the random path. Furthermore, although the current node u is marked visited after Step S-3, its value is not fixed. When visiting another node u' that falls in the neighborhood of u, the previously determined value of u might be updated depending on the pat_T* selected for node u'. In other words, the algorithm is allowed to revise its previous decisions regarding the most similar pattern selection; i.e., any node value is temporary and may change until the simulation completes. Due to this property, any node u of re is updated approximately n_T times during a simulation.

3.1.3 Skipping Grid Nodes

Whenever a pat_T* defined over a template T is pasted onto the realization, the algorithm actually decides on the values of several nodes of re, not only the visited central value. One might consider not visiting all or some of these nodes later during the simulation; i.e., they might be skipped while visiting the remaining nodes of the random path. This results in visiting fewer nodes within re and thus improves the CPU efficiency of the algorithm. In SIMPAT, skipping of already calculated nodes is achieved by visiting only grid nodes separated by a given distance (called the "skip size"). In essence, this is a modification of Step S-1.

Consider the 2D 9 × 9 Cartesian grid given in Fig. 3. Assume a 5 × 5 template is used and is currently located on u = u_20. If the template T is defined by h_α vectors with α = 0, ..., n_T − 1 and n_T = 25, then the data event dev_T(u) is defined by the vectors u + h_α such that u + h_α = {u_0, ..., u_4, u_9, ..., u_13, u_18, ..., u_22, u_27, ..., u_31, u_36, ..., u_40}. Assume the skip size is set to 3. During the simulation, when node u = u_20 is visited, the values of the most similar pattern pat_T* are used to populate all the values of the data event dev_T(u). As the skip size is set to 3, all the nodes within the 3 × 3 neighborhood of the node u defined by the h_β vectors, β = 0, ..., 8, such that u + h_β = {u_10, u_11, u_12, u_19, u_20, u_21, u_28, u_29, u_30}, are marked visited and removed from the random path. This removal from the random path does not mean the values of these

nodes are fixed, as explained in Section 3.1.2. It simply means the algorithm will not perform an explicit most-similar-pattern search for these nodes.

[Figure 3: (a) Visiting node u = u_20 on a 9 × 9 grid. The 5 × 5 template is placed on the node; due to the skip size of 3, the nodes within the 3 × 3 shaded area are marked visited and removed from the random path. (b) Visiting node u = u_23 later during the simulation.]

In the above example, a single most-similar-pattern search is performed but a total of 9 values are calculated. Hence, the number of nodes that need to be visited decreases by a factor of 9, which results in 9 times fewer most-similar-pattern searches. Thus, depending on the skip size, considerable speed gains might be achieved. In general, only skip sizes much smaller than the template size should be considered. The sensitivity of the overall pattern reproduction quality of a realization to the skip size is discussed further in Section 4.2.

3.1.4 Searching for the Most Similar Pattern

In Step S-2, the algorithm requires finding the most similar pattern pat_T* given a data event dev_T(u), minimizing d⟨dev_T(u), pat_T^k⟩ for k = 0, ..., n_pat − 1. This is a well-understood problem in computational geometry known as the nearest-neighbor problem, or sometimes the "post office problem" due to Knuth (1997).

This problem can be trivially solved in O(n_pat) time by calculating all possible d⟨dev_T(u), pat_T^k⟩ and taking the minimum-distance pat_T^k as pat_T*. Yet, this trivial

solution (also known as "linear search") can be highly CPU demanding, especially when both n_pat and n_T are large, typically due to a large and complex 3D training image calling for the use of a large template. The computational geometry literature has many known algorithms that solve the exact nearest-neighbor problem in at least O(log n_pat) time. Approximate solutions that perform better than this logarithmic time are also known. For a comprehensive review of exact and approximate solutions, the reader is referred to Smid (1997).

Typically, it is not possible to take a computational geometry algorithm and immediately apply it to SIMPAT, as SIMPAT operates on data events with missing (unknown) nodes, which is uncommon in computer science and thus generally not addressed by the above cited solutions. Furthermore, the proposed solutions generally apply to cases where n_T < 100, whereas n_T > 250 is typical in SIMPAT. Due to these issues, SIMPAT currently employs a slightly improved variation of the linear search for finding the most similar pattern (see Section 4.5 of Duda et al. (2001) for the details of this improved search). Further possible improvements of this scheme are discussed in Section 5.

3.2 Multiple-grid, Unconditional SIMPAT

In SIMPAT, the multiple-grid simulation of a realization is achieved by successively applying the single-grid algorithm to the multiple-grids, starting from the coarsest grid G^(n_g − 1). After each multiple-grid simulation, g is set to g − 1, and this succession of multiple-grids continues until g = 0. The application of the multiple-grid approach can further be discussed in two parts: interaction between grids and use of dual templates.

3.2.1 Interaction Between Coarse and Fine Grids

At the beginning of the multiple-grid simulation, g is set to n_g − 1; i.e., first the coarsest grid G^(n_g − 1) is to be simulated. Then, the single-grid SIMPAT algorithm is applied to this grid G^g using template T^g. When the current multiple-grid simulation is completed, the values calculated on the current grid are transferred to G^(g − 1) and g is set to g − 1. It

should be noted that, different from the classical multiple-grid approach (Strebelle, 2002; Tran, 1994), SIMPAT does not freeze the coarse grid values when they are transferred to a finer grid; i.e., such values are still allowed to be updated and visited by the algorithm in the subsequent multiple-grid simulations. On a grid G^g, the previously calculated coarse grid values contribute to the distance calculation of Step S-2; i.e., if a previous coarse grid value exists on node u, this value is taken into account when minimizing d⟨dev_Tg(u), pat_Tg^k⟩. Thus, the new value of node u is calculated conditioned to both the coarse grid value and the newly available neighborhood information dev_Tg(u).

3.2.2 Using Dual Templates

Dual templates allow the algorithm to populate the nodes of the finest grid along with the nodes of the current coarse grid. To use dual templates, Step S-3 of the algorithm is modified. On node u, first, the most similar pattern pat_Tg* is found based on the minimization of the distance d⟨dev_Tg(u), pat_Tg^k⟩. Then, the corresponding dual pattern pat_T̃g* is retrieved from the pattern database. Finally, instead of populating the nodes of dev_Tg(u), the algorithm populates all the nodes of the dual data event dev_T̃g(u); i.e., not only the coarse nodes but also the fine nodes of re are updated.

Dual templates are used only when populating the nodes of the dual data event and not during the distance calculations. In other words, Step S-2 of the algorithm is not modified: the distance calculations are still performed on the original grid template T^g using the data event dev_Tg(u). This property means the use of dual templates has no effect on the current multiple-grid simulation, as the finer nodes calculated by the dual pattern pat_T̃g* are never taken into account during the distance calculations of the current grid. The effects of using dual templates will only be seen when simulating the next finer grid.

Use of dual templates can be envisioned as a dimensionality reduction method. Instead of minimizing d⟨dev_T̃g(u), pat_T̃g^k⟩, where each vector entry to the distance func-

tion has n_T̃g nodes such that n_T̃g >> n_Tg, the algorithm uses the minimization of d⟨dev_Tg(u), pat_Tg^k⟩ as an approximation to the more costly minimization between the dual data event and the dual pattern.

3.3 Data Conditioning

This section explains the details of how conditioning to both primary and secondary data is performed.

3.3.1 Conditioning to Primary Data

In SIMPAT, conditioning to primary data is performed in Step S-2 of the algorithm, during the search for the pattern yielding the minimum d⟨dev_T(u), pat_T^k⟩, k = 0, ..., n_pat − 1. If conditioning data exists on dev_T(u + h_α), the algorithm first checks whether pat_T^k(h_α) is within a threshold ε_d of this data, i.e. |dev_T(u + h_α) − pat_T^k(h_α)| ≤ ε_d. Typically, this threshold ε_d is taken as zero for hard conditioning information such as well data (ε_d = 0). If the pattern pat_T^k does not fulfill this condition, it is skipped and the algorithm searches for the next most similar pattern until a match is found. If none of the available patterns fulfill the condition, the algorithm selects the pattern that minimizes d⟨dev_T(u + h_α), pat_T^k(h_α)⟩; i.e., only the nodes of dev_T(u) that carry conditioning information are considered during the distance calculations. If several pat_T^k fulfill this condition, then a second minimization is performed on the non-conditioning nodes of dev_T(u) using only these patterns. Note that such a case generally occurs when the training image is not representative of the conditioning data.

The above conditioning methodology used in SIMPAT can be summarized as follows:

1. At node u, retrieve the data event dev_T(u).

2. Divide the data event into two parts, dev_T,1 and dev_T,2, such that dev_T,1 contains only the conditioning nodes of dev_T(u) and dev_T,2 contains only the non-conditioning and uninformed nodes of dev_T(u).

3. Minimize the distance d⟨dev_T,1, pat_T^k⟩ to find the pattern pat_T* most similar to the conditioning data event. If only one such pattern exists, take pat_T* as the final most similar pattern and proceed with Step S-3 of the SIMPAT algorithm.

4. If several patterns minimize the distance d⟨dev_T,1, pat_T^k⟩, perform another minimization, namely d⟨dev_T,2, pat_T^k⟩, only within these conditioned patterns to find the final most similar pat_T*.

Conditioning data might impose certain problems when used in a multiple-grid setting. A particular multiple-grid G^g might not include the location containing the conditioning data. In such a case, as conditioning is performed during the distance calculations using the coarse template T^g, the conditioning data will not be included within the calculation, resulting in poorly conditioned results. SIMPAT avoids this situation by making use of dual templates when conditioning to primary data. While minimizing d⟨dev_T,1, pat_T^k⟩, each corresponding dual pattern pat_T̃^k is checked against the dual data event dev_T̃(u) to discover whether the pat_T̃^k conflicts with the primary data. Such conflicting patterns are ignored during the minimization. Recall that, by definition, a dual template contains extra fine grid nodes in addition to the coarse grid nodes of the primal template. Thus, checking dev_T̃(u) against pat_T̃^k guarantees that the final most similar pattern pat_T* selected is conditioned to the primary data at the finest grid and not only at the current coarse grid.

3.3.2 Conditioning to Secondary Data

In Earth Sciences, secondary data typically refers to data obtained from an indirect measurement such as seismic surveys. Thus, secondary data is nothing but a filtered view of the original field of study (reservoir), where the filtering is performed by some forward model F. Generally, the forward model F is not fully known and is approximated by a known model F*. Furthermore, secondary data typically exists in exhaustive form; i.e., for every node u of the field of study, dt_2(u) is known, where dt_2 is the vector of all secondary data values.

Within this context, in SIMPAT, conditioning to secondary data calls for additional steps in the basic algorithm. First, a secondary training image ti_2 is required. This training image can be obtained by applying the approximate forward model F* to ti. Once the secondary training image is obtained, Step P-1 of the algorithm is modified such that, for every pat_T^k of the primary training image, a corresponding secondary pattern pat2_T^k is extracted from the secondary training image at the same location. In essence, patterns now exist in pairs in the pattern database. Once the preprocessing module of the algorithm is modified this way, another modification is made to Step S-2, i.e. the search for the most similar pattern. Instead of minimizing d⟨dev_T(u), pat_T^k⟩, the algorithm now minimizes the summation of d⟨dev_T(u), pat_T^k⟩ and d⟨dev2_T(u), pat2_T^k⟩, where dev2_T(u) denotes the secondary data event obtained from dt_2, i.e. dev2_T(u) = dt2_T(u).

The net result of the above modifications is that, for every node u, the algorithm now finds the most similar pattern based not only on the previously calculated nodes but also on the secondary data. When there is also primary data available, this minimization is performed only after the patterns that condition to the primary data are found, as explained in Section 3.3.1; i.e., primary data has priority over secondary data.

The values of the secondary data events and the secondary patterns need not be in the same range as the values of the primary data event and the primary patterns. This is typically the case when seismic information is used as secondary data. In such cases, the combined distance minimization might result in biased distances since, for a node u, d⟨dev2_T(u), pat2_T^k⟩ might be several orders of magnitude greater than d⟨dev_T(u), pat_T^k⟩, effectively dominating the distance calculation and shadowing the contribution of the previously calculated re nodes to the combined distance.

To prevent this problem, the individual distances are normalized such that both d⟨dev_T(u), pat_T^k⟩ and d⟨dev2_T(u), pat2_T^k⟩ range between the same values, typically [0, 1]. Furthermore, a weight is attached to the combined summation to let the user of the algorithm give more weight to either the primary or the secondary values, reflecting the trust of the user in the secondary data. Thus, for the pairs ⟨dev_T(u), pat_T^k⟩ and

⟨dev2_T(u), pat2_T^k⟩, the final form of the distance function becomes:

$$ d_{1,2} = \omega \cdot d_1\langle dev_T(\mathbf{u}), pat_T^k \rangle + (1 - \omega) \cdot d_2\langle dev2_T(\mathbf{u}), pat2_T^k \rangle \quad (10) $$

where d_{1,2} is the final combined distance that will be minimized to find the most similar pattern pat_T*, d_1 and d_2 are the normalized distances, and ω is the weight factor with, typically, ω ∈ [0, 1]. If desired, a local ω(u) can be used instead of a global ω. This reflects the fact that one has varying degrees of trust in the secondary data in different regions of the reservoir.
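Eq. (10) can be sketched as below. This is a hypothetical Python rendering reusing the minkowski_distance helper from Section 2.4; both distances are normalized to [0, 1] by their ranges over the pattern database before being combined, which is an assumption, since the text does not specify the normalization scheme.

```python
import numpy as np

def combined_distance(dev, dev2, pat_db, pat2_db, omega=0.5, q=1):
    """Weighted primary/secondary distance of Eq. (10) for every pattern pair.
    Returns the index k* of the most similar pattern pair."""
    d1 = np.array([minkowski_distance(dev, pat, q) for pat in pat_db])
    d2 = np.array([minkowski_distance(dev2, pat2, q) for pat2 in pat2_db])
    # Normalize each distance to [0, 1] so neither term dominates the sum;
    # the exact normalization used by SIMPAT is not detailed in the text.
    def normalize(d):
        span = d.max() - d.min()
        return (d - d.min()) / span if span > 0 else np.zeros_like(d)
    d12 = omega * normalize(d1) + (1.0 - omega) * normalize(d2)
    return int(np.argmin(d12))  # index of pat_T* (and its paired pat2_T*)
```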

4 Examples

In this section, the results of several runs performed using the SIMPAT algorithm are discussed. The results are presented in three subsections: (1) simple 2D, unconditional examples; (2) complex 3D, multiple-category and continuous, unconditional examples; and (3) complex 3D, multiple-category, conditional examples. All results discussed in this section are presented at the end of the section. Reported CPU times are obtained using an Intel Pentium IV 2.4 GHz.

4.1 2D Simple Examples

Fig. 5 illustrates the intermediate steps of an application of the multiple-grid, unconditional SIMPAT to a binary (sand/non-sand) case. Using the 250 × 250 training image given in Fig. 5a, the final realization (Fig. 5d) is obtained using an 11 × 11 base template, 3 multiple-grids and a skip size of 2. Fig. 5b and Fig. 5c are the intermediate multiple-grid results for the coarsest and the next finer grids, g = 2 and g = 1. Despite the fact that these results are taken from intermediate steps of SIMPAT, they are still completely informed due to the use of dual templates, as explained in Section 3.2.2. In the figure, the algorithm starts by roughly placing the channel patterns onto the realization re during the coarsest grid simulation. In essence, the multiple-grid approach ignores the finer-scale details on the coarser grids. Later, the algorithm corrects the details of the realization as subsequent multiple-grid simulations are performed. Fig. 5e and Fig. 5f show closeups of the same region on the second coarsest grid and on the finest grid. Here, the distance calculations detect the broken channels that do not exist in the training image and replace them with connected channel patterns.

Fig. 6 provides some insight into the sensitivity to template size when a fixed skip size of 2 is used. As expected, using a small template (3 × 3) results in poor pattern reproduction (Fig. 6b). Increasing the template size to 5 × 5 (which effectively triples the number of template nodes n_T) improves the pattern reproduction (Fig. 6c). Successive increases in the template size continue to improve the final results; yet, it appears

that after a certain threshold, increasing the template size does not provide any further improvement. This threshold (sometimes called the "optimum template size") is related to the complexity of the patterns found in a training image. Given a training image, automatic discovery of the optimum template size is one of the active research topics of pattern-based geostatistics.

Fig. 7 shows the application of SIMPAT to different types of 2D training images. Of these, Fig. 7e and 7f demonstrate the capability of pattern-based geostatistics to inherently reproduce the lower-order statistics that exist in the training image. A visual check of the SIMPAT realization in Fig. 7f shows that the algorithm successfully reproduces the Gaussian patterns found in the training image. Variogram comparisons between the two figures further confirm this check (Fig. 4).

[Figure 4: Comparison of the 0°, 45° and 90° variograms of Fig. 7e and 7f. Red denotes the calculated variograms of the SIMPAT realization.]

4.2 3D Complex Examples

Fig. 8a is an example of wavy bed lamination. This continuous training image of size 90 × 90 × 80 was generated using the SBED software (Wen et al., 1998). In this example, the training image contains complex patterns at the finest scale. Yet, the higher-scale patterns are fairly repetitive and thus can be considered simple. A template size of 11 × 11 × 7

is chosen to reflect the complexity of the fine-scale patterns. Due to the apparently repetitive and simple nature of the higher-scale patterns, a skip size of 6 is used on all grids, which significantly improved the CPU time of the run (35 minutes). The unconditional SIMPAT realization for this case is shown in Fig. 8b.

Fig. 9a is a 7-facies, complex training image of size 100 × 75 × 25, depicting a tidal channel system. Another notable property of Fig. 9a is that the image is highly non-stationary, especially at the higher scales, where one facies appears only locally (the facies represented by the yellow color, which runs along the x-axis). Fig. 9b shows the unconditional SIMPAT realization obtained using a 15 × 15 × 5 template and a skip size of 4 on all grids. As the figure illustrates, the algorithm successfully captures the non-stationary, local behavior of the training image, while also retaining the other, more stationary facies relations. The CPU time for this run was approximately 120 minutes.

Fig. 10 studies the sensitivity to the skip size. The figure demonstrates that the pattern reproduction quality of realizations might be sensitive to the skip size setting, especially in complex 3D training images where one facies dominates the others (such as the non-channel, background facies of Fig. 10a). In such cases, one might observe proportion reproduction problems, which in turn might cause pattern reproduction problems. Fig. 10b and Fig. 10c reveal this issue: when a higher skip size setting is used, the realization honors the patterns in the training image better. Removing this sensitivity to skip size is an active area of research and is further discussed in Section 5. In this case, the better realization of Fig. 10b required approximately 100 minutes of CPU time; Fig. 10c, albeit producing worse results, took approximately 400 minutes.

4.3 3D Conditional Examples

Fig. 11 shows a 100 × 100 × 50 synthetic reference case with 6 facies generated using the SBED software. Two different data sets are sampled from this reference case to test the primary conditioning capabilities of SIMPAT (Fig. 11b and 11c).

The first, dense data set is used with a training image that is highly representative

of the reference (Fig. 12a). Using such a training image guarantees that, during the simulation, the number of conflicting patterns is kept to a minimum. Yet, in a real reservoir, one would expect a reasonable amount of conflict between the available data and the selected training image. Thus, the final SIMPAT realization obtained for this case (Fig. 12c) should be viewed as a check of the conditioning capabilities of SIMPAT rather than a representative example of primary data conditioning in a real reservoir.

The second, sparse data set is used with a training image that contains patterns which are more likely to conflict with the available data (Fig. 13a). In this case, the data dictates stacked channels whereas the training image only has isolated channels. Combined with the sparse data set, the use of this training image can be considered a representative conditioning example for a real reservoir. Fig. 13c is the final conditional SIMPAT realization obtained.

Figs. 14, 15 and 16 demonstrate the application of secondary data conditioning using SIMPAT. In this case, the secondary data is obtained by applying a seismic forward model F* to the binary reference case (see Wu and Journel (2004) for details of the forward model used). The same model is applied to the training image to obtain the secondary training image. The final SIMPAT realization (Fig. 16b) was generated using an 11 × 11 × 3 template size and a skip size of 2. The realization conditions to the secondary data relatively well, but pattern reproduction is somewhat degraded, as made evident by the disconnected channel pieces in Fig. 16b. It is believed that this problem occurs because the template T used for the secondary data events dev2_T(u) might not be representative of the volume support of the secondary data. This issue is further discussed in Section 5.

[Figure 5: Intermediate steps of the multiple-grid, unconditional SIMPAT on the 250 × 250 binary (sand/non-sand) case: (a) training image; (b) multiple-grid (g = 2); (c) multiple-grid (g = 1); (d) SIMPAT realization (g = 0); (e) and (f) closeups of the marked regions on (c) and (d).]

[Figure 6: SIMPAT realizations obtained using different template sizes with a fixed skip size of 2 for all grids: (a) 250 × 250 training image; (b)-(f) realizations for template sizes 3 × 3, 5 × 5, 7 × 7, 11 × 11 and 21 × 21.]

[Figure 7: SIMPAT realizations obtained from different training images: a binary training image generated using the SBED software, a 3-facies (channel/levee/non-sand) training image generated using an object-based model, and a continuous training image generated from an unconditional SGSIM run. Each training image is paired with its SIMPAT realization.]

[Figure 8: A wavy bed lamination training image (continuous) and its SIMPAT realization.]

[Figure 9: A tidal channel training image (7 facies) and its SIMPAT realization.]

[Figure 10: A channel training image (4 facies) and two SIMPAT realizations obtained using different skip size settings: (b) skip sizes 4, 4, 2; (c) skip sizes 2, 2, 1. Each panel shows two z-slices.]

[Figure 11: Reference for primary data conditioning and two data sets randomly sampled from this reference: (b) dense well data (150 wells); (c) sparse well data (50 wells).]

[Figure 12: Primary data conditioning result obtained using a training image that is highly representative of the reference given in Fig. 11 and the dense data set (150 wells).]

[Figure 13: Primary data conditioning result obtained using a training image that has patterns conflicting with the given sparse data set (50 wells).]

[Figure 14: Reference for secondary data conditioning and the corresponding synthetic seismic obtained using an approximate forward model F*.]

[Figure 15: Training image for secondary data conditioning and the corresponding secondary training image obtained using the same F* as in Fig. 14.]

[Figure 16: The conditional SIMPAT realization generated using the secondary data shown in Fig. 14 and the training image in Fig. 15: (a) reference; (b) conditional SIMPAT realization.]

5 Future Work

In this section, several ongoing additions to the current SIMPAT implementation, in various stages of development and testing, are discussed.

5.1 Conditional SIMPAT

As discussed in Section 4.3, there are two main issues related to the conditioning capabilities of SIMPAT that require further study:

1. Conflicting training image patterns and primary data: As discussed in conjunction with Fig. 13, a conflict between the patterns and the primary data might result in poor pattern reproduction. One possible solution is to modify the primary conditioning scheme such that, when comparing the conditioning data event dev_T,1 to pat_T^k, data closer to the center node u of the template T are given more weight in the distance calculation (see the sketch after this list). This modification is expected to improve pattern reproduction without sacrificing too much data conditioning when the training image patterns and the primary data are at most moderately conflicting. If there is a strong disagreement between the training image and the primary data, the user is advised to revise the training image selection instead.

2. The issue of volume support when conditioning to secondary data: The secondary data conditioning methodology explained in Section 3.3.2 dictates that the same template T be used both for the primary data event dev_T(u) and the secondary data event dev2_T(u). Yet, in Earth Sciences, secondary data (such as seismic) typically has a different volume support than that of the primary variable. Thus, for secondary data events and patterns, using a different template T_2 to reflect this difference might improve the pattern reproduction quality of the realizations. Figs. 17-19 illustrate the case where a different template T_2 is used for the secondary data events and patterns. In the figures, the primary template T has size 11 × 11 × 3 and the secondary template T_2 has size 5 × 5 × 7.
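A center-weighted distance of the kind suggested in item 1 might look as follows. This is a speculative Python sketch: the text describes the modification only qualitatively, so the inverse-distance weighting kernel below is an assumption made purely for illustration.

```python
import numpy as np

def center_weighted_distance(dev, pat, offsets, q=1):
    """Distance in which conditioning data closer to the template center
    weigh more. 'offsets' holds the h_alpha vectors of the template; the
    kernel w = 1/(1+|h|) is hypothetical, not SIMPAT's."""
    dev = np.asarray(dev, dtype=float)
    pat = np.asarray(pat, dtype=float)
    radii = np.array([np.hypot(dy, dx) for dy, dx in offsets])
    weights = 1.0 / (1.0 + radii)          # assumed center-weighting kernel
    known = ~np.isnan(dev)                 # unknown nodes still contribute zero
    terms = weights[known] * np.abs(dev[known] - pat[known]) ** q
    return terms.sum() ** (1.0 / q)
```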

[Figure 17: Reference case for secondary data conditioning using a different T_2: (a) reference; (b) secondary data, obtained by applying a low-pass filter to the binary facies model.]

[Figure 18: Training image for secondary data conditioning using a different T_2: (a) training image; (b) secondary training image.]

[Figure 19: The conditional SIMPAT realization generated using the secondary data shown in Fig. 17 and the training image in Fig. 18: (a) reference; (b) conditional SIMPAT realization.]

5.2 Relational Distance

The Minkowski metric explained in Section 2.4 can be used with both multiple-category and continuous variables. Yet, the generalized nature of this distance function might not be desired in all cases, especially when used with multiple categories. In Earth Sciences, multiple categories are typically used to generate facies models of the field of study (reservoir). It is well understood that, in such models, certain facies exhibit tight relations (for example, a channel facies and an attached levee facies), and such relations should be honored in the generated realizations.

In SIMPAT, facies relations are described through the training image patterns. The distance function used (the Manhattan distance) is indifferent to these relations; i.e., it does not differentiate between facies and treats them all equally. For large templates, this property of the Manhattan distance might be undesirable due to its possible effect on the most similar pattern search (Duda et al., 2001). For example, the sensitivity to skip size discussed in Section 4.2 is believed to be one such undesirable effect. To circumvent the problem, one might envision replacing the Manhattan distance such that, in the new distance function, the facies relations are expressed by defining certain categories to be less distant from each other.

One possible implementation of the distance function described above can be achieved through defining a matrix of category (facies) relations and using this matrix to calculate the individual pixel distances within the summation over the template nodes h_α. Consider the general distance function:

$$ d\langle dev_T(\mathbf{u}), pat_T^k \rangle = \sum_{\alpha=0}^{n_T - 1} d_\alpha\langle dev_T(\mathbf{u} + \mathbf{h}_\alpha), pat_T^k(\mathbf{h}_\alpha) \rangle \quad (11) $$

where d_α⟨x_α, y_α⟩ is currently defined as |x_α − y_α|. Instead of this definition, one can define d_α⟨x_α, y_α⟩ such that each (x_α, y_α) pair points to an entry in a matrix that defines relative distances between categories. For categorical variables, x_α and y_α are indicator values, i.e. 0, 1, ..., n_f − 1, where n_f is the number of categories. Thus, an (x_α, y_α) pair can be taken as the column and row indices of the matrix, and d_α⟨x_α, y_α⟩ as the cell value of the matrix at this column and row. Unknown/missing values are trivially handled by defining the unknown identifier χ as another, unique category and setting the matrix entry to zero for all (x_α, y_α) pairs in which either x_α or y_α equals χ.

This distance function is named the "Relational Distance" and will be implemented in a future version of SIMPAT. Two benefits are expected from the use of the Relational Distance: (1) better pattern reproduction due to explicit honoring of facies relations within the distance function, and (2) lower sensitivity to skip size.
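A minimal sketch of how such a relation matrix could plug into Eq. (11) follows; hypothetical Python, with an invented 3-facies matrix in which channel (1) and levee (2) are defined to be less distant from each other than either is from background (0). The matrix values are purely illustrative.

```python
import numpy as np

# Relative distances between categories background (0), channel (1), levee (2),
# plus the unknown identifier chi as an extra category whose row/column is all
# zero, so unknown nodes never contribute. Values are invented for illustration.
CHI = 3
REL = np.array([[0.0, 1.0, 1.0, 0.0],
                [1.0, 0.0, 0.3, 0.0],   # channel and levee are "close"
                [1.0, 0.3, 0.0, 0.0],
                [0.0, 0.0, 0.0, 0.0]])  # chi row: unknowns contribute zero

def relational_distance(dev, pat, rel=REL):
    """Eq. (11) with d_alpha looked up from a category-relation matrix."""
    dev = np.asarray(dev, dtype=int)
    pat = np.asarray(pat, dtype=int)
    return rel[dev, pat].sum()

dev = np.array([1, 2, CHI, 0])   # data event with one unknown node
pat = np.array([1, 1, 0, 0])
print(relational_distance(dev, pat))  # 0.3: only the channel/levee mismatch counts
```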

5.3 Conditioning to Local Angles and Affinities

Constraining the realizations using available local angle and affinity information (typically obtained from seismic or via expert input) can be achieved by using multiple training images in a single simulation run. The SNESIM algorithm employs a similar approach (Zhang, 2002). Consider the case of angle conditioning only. SNESIM first divides the angle data into classes. Say the angles of the field in question range between [−90, 90] degrees and a total of 6 angle classes are desired. SNESIM implicitly rotates the given training image to angles −75°, −45°, −15°, 15°, 45° and 75° and uses these rotated training images for different parts of the simulation grid, as dictated by the angle data. For example, if the angle at node u falls between [0, 30] degrees (i.e. the bin whose center is 15°), the training image that is rotated 15° is used to perform the data event lookup. A similar mechanism is utilized for affinities.

A slightly different approach will be used in future implementations of SIMPAT. Instead of implicitly transforming the training images internally, SIMPAT will accept explicit training images for different parts of the simulation grid (called "regions"). A separate piece of software that performs angle and affinity transformations on the training images will be provided to the users of SIMPAT for convenience. To understand the approach better, consider the case of angle conditioning only. Simply put, one can control the angle of individual patterns by using rotated training images in different parts of the simulation grid while ensuring a smooth transition between these different parts (regions). At a given node u, the most similar pattern search will be

performed using the particular rotated image, thus resulting in a rotated pattern. The same mechanism is also applicable to affinity conditioning. When both angle and affinity information are to be honored at the same location, a training image that is both rotated and scaled can be used.

5.4 Computational Issues

Step S-2 of the SIMPAT algorithm requires finding the most similar pattern pat_T* given a data event dev_T(u), minimizing d⟨dev_T(u), pat_T^k⟩ for k = 0, ..., n_pat − 1. As explained in Section 3.1.4, SIMPAT currently employs a modified linear search to perform this minimization. For complex training images and large simulation grids, the performance of linear search is undesirable, especially when a large template is used.

One solution to this issue is to parallelize the algorithm and exploit multiple CPUs to improve the search performance. From an algorithmic point of view, parallelization of SIMPAT is trivial: one can divide the pattern database into n_cpu parts, where n_cpu is the number of CPUs available. Then, instead of minimizing d⟨dev_T(u), pat_T^k⟩ for k = 0, ..., n_pat − 1, each CPU minimizes d⟨dev_T(u), pat_T^k⟩ for k = α, ..., β, where α and β denote the beginning and ending indices of ranges of size n_pat / n_cpu. In essence, each CPU searches for a similar pattern pat_T^(*,c), where c denotes the CPU index and c = 0, ..., n_cpu − 1. Once the results of these minimizations are obtained, a final minimization over d⟨dev_T(u), pat_T^(*,c)⟩ is performed to find the final most similar pattern pat_T*. This method of parallelization is currently being tested and is expected to be included in the next version of SIMPAT.

Another possible solution to the CPU time problem can be achieved through the use of dual templates. As explained in Section 2.6, once the multiple-grid simulation of the coarsest grid G^(n_g − 1) is complete, the entire realization re is fully informed, as dual templates populate the finest grid nodes in addition to the coarse grid nodes. Thus, the algorithm guarantees that for any grid G^g other than the coarsest grid G^(n_g − 1), the data event dev_Tg(u) is also fully informed and does not contain unknown/missing nodes.
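The parallelization scheme of Section 5.4 maps naturally onto a split/min-reduce pattern. Below is a hypothetical Python sketch using the standard library's process pool and reusing the minkowski_distance helper from Section 2.4; SIMPAT itself is not a Python code, and its actual parallel implementation is not described beyond the splitting idea.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def best_in_chunk(args):
    """Each worker c scans one chunk of the pattern database and returns
    the locally most similar pattern pat_T^(*,c) and its distance."""
    dev, chunk, offset = args
    dists = [minkowski_distance(dev, pat, q=1) for pat in chunk]
    k = int(np.argmin(dists))
    return dists[k], offset + k

def parallel_most_similar(dev, pat_db, n_cpu=4):
    """Split the database into n_cpu ranges, minimize within each range,
    then take the final minimum over the per-CPU winners."""
    chunks = np.array_split(pat_db, n_cpu)
    offsets = np.cumsum([0] + [len(c) for c in chunks[:-1]])
    # Note: in a script, process pools require the usual __main__ guard.
    with ProcessPoolExecutor(max_workers=n_cpu) as pool:
        results = list(pool.map(best_in_chunk,
                                [(dev, c, o) for c, o in zip(chunks, offsets)]))
    _, k_star = min(results)  # final minimization over the pat_T^(*,c)
    return k_star             # index of the most similar pattern pat_T*
```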