Using Statistical Techniques to Improve the QC Process of Swell Noise Filtering

Similar documents
Summary. Introduction

DI TRANSFORM. The regressive analyses. identify relationships

Network Traffic Measurements and Analysis

An ICA based Approach for Complex Color Scene Text Binarization

Th SRS3 07 A Global-scale AVO-based Pre-stack QC Workflow - An Ultra-dense Dataset in Tunisia

Machine Learning Methods in Visualisation for Big Data 2018

Clustering CS 550: Machine Learning

Outlier detection using autoencoders

Chemometrics. Description of Pirouette Algorithms. Technical Note. Abstract

How reliable is statistical wavelet estimation? Jonathan Edgar*, BG Group, and Mirko van der Baan, University of Alberta

Emerge Workflow CE8 SAMPLE IMAGE. Simon Voisey July 2008

Chapter 5. 3D data examples REALISTICALLY COMPLEX SYNTHETIC INVERSION. Modeling generation and survey design

University of Florida CISE department Gator Engineering. Data Preprocessing. Dr. Sanjay Ranka

Introduction to Pattern Recognition Part II. Selim Aksoy Bilkent University Department of Computer Engineering

Clustering and Visualisation of Data

International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.7, No.3, May Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani

Data Preprocessing. Data Preprocessing

EMERGE Workflow CE8R2 SAMPLE IMAGE. Simon Voisey Hampson-Russell London Office

Data Mining Chapter 3: Visualizing and Exploring Data Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS

Response to API 1163 and Its Impact on Pipeline Integrity Management

Machine Learning for Pre-emptive Identification of Performance Problems in UNIX Servers Helen Cunningham

ELEC Dr Reji Mathew Electrical Engineering UNSW

AUTOMATIC SCORING OF THE SEVERITY OF PSORIASIS SCALING

An Experiment in Visual Clustering Using Star Glyph Displays

P068 Case Study 4D Processing OBC versus Streamer Example of OFON filed, Block OML102, Nigeria

ChristoHouston Energy Inc. (CHE INC.) Pipeline Anomaly Analysis By Liquid Green Technologies Corporation

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Closing the Loop via Scenario Modeling in a Time-Lapse Study of an EOR Target in Oman

Introduction to Data Mining

Automated Classification of Quilt Photographs Into Crazy and Noncrazy

Supplementary text S6 Comparison studies on simulated data

Fluid flow modelling with seismic cluster analysis

Exploratory data analysis for microarrays

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005

Chapter DM:II. II. Cluster Analysis

Feature Selection Technique to Improve Performance Prediction in a Wafer Fabrication Process

Exploratory Analysis: Clustering

Statistical Analysis of Metabolomics Data. Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte

Unsupervised learning in Vision

Examples of GLOBE Claritas Processing

A Review on Plant Disease Detection using Image Processing

Keywords: clustering algorithms, unsupervised learning, cluster validity

Principles and Techniques of VSP Data Processing

Intensity Augmented ICP for Registration of Laser Scanner Point Clouds

Spectral Classification

Programs for MDE Modeling and Conditional Distribution Calculation

Bioimage Informatics

Visual Analytics. Visualizing multivariate data:

Bioimage Informatics

then assume that we are given the image of one of these textures captured by a camera at a different (longer) distance and with unknown direction of i

CS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008

Lecture 25: Review I

Clustering Analysis based on Data Mining Applications Xuedong Fan

Unsupervised Learning : Clustering

The Effect of Changing Grid Size in the Creation of Laser Scanner Digital Surface Models

Effects of multi-scale velocity heterogeneities on wave-equation migration Yong Ma and Paul Sava, Center for Wave Phenomena, Colorado School of Mines

CS 229: Machine Learning Final Report Identifying Driving Behavior from Data

Table of Contents (As covered from textbook)

University of Ghana Department of Computer Engineering School of Engineering Sciences College of Basic and Applied Sciences

Introduction to digital image classification

Selection of an optimised multiple attenuation scheme for a west coast of India data set

Radial Basis Function (RBF) Neural Networks Based on the Triple Modular Redundancy Technology (TMR)

Analyzing ICAT Data. Analyzing ICAT Data

Using the DATAMINE Program

Machine Learning in the Wild. Dealing with Messy Data. Rajmonda S. Caceres. SDS 293 Smith College October 30, 2017

ICA vs. PCA Active Appearance Models: Application to Cardiac MR Segmentation

Fabric Defect Detection Based on Computer Vision

An Improved Document Clustering Approach Using Weighted K-Means Algorithm

Digital Image Processing. Prof. P.K. Biswas. Department of Electronics & Electrical Communication Engineering

An Intelligent Clustering Algorithm for High Dimensional and Highly Overlapped Photo-Thermal Infrared Imaging Data

Machine-learning Based Automated Fault Detection in Seismic Traces

Inverse Continuous Wavelet Transform Deconvolution Summary ICWT deconvolution Introduction

DESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES

Texture Classification by Combining Local Binary Pattern Features and a Self-Organizing Map

Automatic Grayscale Classification using Histogram Clustering for Active Contour Models

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM

OPTIMIZING A VIDEO PREPROCESSOR FOR OCR. MR IBM Systems Dev Rochester, elopment Division Minnesota

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University

First Order Statistics Classification of Sidescan Sonar Images from the Arctic Sea Ice

R. James Purser and Xiujuan Su. IMSG at NOAA/NCEP/EMC, College Park, MD, USA.

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering

Gene Clustering & Classification

MATH11400 Statistics Homepage

AN EXAMINING FACE RECOGNITION BY LOCAL DIRECTIONAL NUMBER PATTERN (Image Processing)

Understanding Clustering Supervising the unsupervised

Chapter 9 Object Tracking an Overview

Special Review Section. Copyright 2014 Pearson Education, Inc.

AN IMAGE BASED SYSTEM TO AUTOMATICALLY

Seismic Noise Attenuation Using Curvelet Transform and Dip Map Data Structure

Application of Clustering Techniques to Energy Data to Enhance Analysts Productivity

Tutorial: Using Tina Vision s Quantitative Pattern Recognition Tool.

A Knowledge Framework For Sensor Data Quality Control

Unsupervised Learning

5. Spike Sorting. By: Clinton Zheng Da Goh

Learning a Manifold as an Atlas Supplementary Material

CHAPTER 5 CLUSTER VALIDATION TECHNIQUES

UNSUPERVISED LEARNING, CLUSTERING

Introduction. Surface and Interbed Multtple Elimination

High throughput Data Analysis 2. Cluster Analysis

Transcription:

Using Statistical Techniques to Improve the QC Process of Swell Noise Filtering A. Spanos* (Petroleum Geo-Services) & M. Bekara (PGS - Petroleum Geo- Services) SUMMARY The current approach for the quality control (QC) process of the swell noise filtering phase carries significant drawbacks. This abstract investigates the use of statistical techniques to improve the QC process. The main approach of this analysis is to combine attributes that capture properties of the filtering process and plot them against each other in order to identify outliers. It is shown that cases which have been badly or not properly filtered will cluster away from the main group and may be detected using statistical methods such as clustering algorithms. Statistical techniques may potentially play an important role in the development of a data-driven automated QC platform.

Introduction A common quality control (QC) tool for the swell noise filtering process is to visualise the root-meansquares (RMS) maps of the data before and after the filtering in order to identify any anomalies. Anomalies are mainly abnormal high RMS values in the output map that are not consistent with their spatial neighbourhood, making them a discrepancy that is more likely to be attributable to bad filtering performance, rather than to a geological feature of the subsurface. Often after spotting these anomalies, the user visualises the corresponding seismic sections and performs further investigations. Two main drawbacks characterise the above QC process: 1. The approach of identifying anomalies in colour-maps is sensitive to thresholds in the colourcode used to generate them. This introduces some user subjectivity; different users may endup spotting different sets of anomalies. 2. The performance of the filtering is captured by only few attributes, e.g. two RMS values per shot (before/after the filtering), which constitutes a considerably coarse representation. The number of attributes is limited as a result of the human assimilation limitation to simultaneously inspect and inter-correlate a different set of maps at a time. In order to improve the QC process, one needs to include more attributes that quantify the performance of the filtering and to adopt a data-driven method to detect anomalies. This can only be achieved if the process of attribute analysis that includes inter-attribute dependency extraction and anomaly identification is done by a computer program. Such an automated system would ease the QC process for the user and make it more streamlined. This abstract investigates the use of statistical techniques to improve the QC step for swell noise filtering processes. The main idea is to combine two attributes: (1) amplitude reduction after the filtering and (2) statistical properties of the swell noise. The performance of the filtering process on the dataset (e.g. shots within a sailine) is visualised by plotting the two attributes in a scatter plot. A clustering algorithm (Fraley and Raftery, 2002) is used to identify the different clusters that exist in the scatter plot. Shots that have been badly or not properly filtered will cluster away from the main scatter of shots. The prospective goal of this work is to lay the foundations for the development of an automated QC platform that can identify cases of undesirable filtering by taking advantage of machine learning and pattern recognition techniques. The above idea is expected to work better for a filtering process that targets high amplitude noise such as swell and seismic interference, as the properties of these types of noise are considerably different to the desired output of the filtering process; this allows more useful statistical inferences to take place. Methods Firstly, the input, output and difference RMS of seismic amplitudes per shot are combined into a single attribute, representing normalised energy reduction. This normalisation aims to eliminate variations related to the power of the seismic source. Secondly, statistical properties of the filtering can be obtained by computing for every trace within a shot the quantity

and then by extracting some measures from the statistical distribution of for each shot point. The motivation for considering this expression for stems from the fact that the swell noise and the signal (geological events) are statistically independent. Hence for a case of good filtering, the difference (input output) will be approximately equal to the noise. It then follows that, therefore should have a value close to zero for good filtering. This means that if the filtering is properly done for a given shot, we expect the distribution of for that shot to have approximately zero mean. Any large deviation from the zero mean is therefore considered as an indication of undesirable filtering. Moreover, swell noise, contrary to random noise, affects only a limited set of traces within a shot. This means that the variance of for an ideal filtering is expected to be greater than the corresponding variance for a simple high-pass filtering which affects all traces within a shot. The reference to a high-pass filter is invoked here because it is considered a bad filtering method for swell noise. Therefore, the variance of can potentially be an informative attribute that measures the performance of the swell noise filtering process. The above three attributes (energy reduction, mean of, variance of ) provide a per-shot quantification of filtering properties and can be plotted against each other in a scatter plot. Prior to this step, it is also possible to implement dimension reduction techniques such as principal component analysis (PCA) and independent component analysis (ICA) (Hyvärinen et al., 2001) in order to obtain an optimally informative pair of attributes, which are easy to visualize in a scatter plot. Finally, a clustering algorithm can be applied to detect groups of outliers in the scatter plot. This may be performed by machine learning techniques; either by an unsupervised learning algorithm (hierarchical, centroid, statistical clustering) or by a supervised learning algorithm. The latter case can be implemented to allow the user to preselect known cases of good filtering, thus defining a rule that can be used to infer the classification as good or bad for the remaining cases. Examples Figure 1 shows the result of applying two filtering methods to remove swell noise on a real seismic dataset and serves as an example that showcases divergent filtering performance. An optimised FXde-spiking with prediction interpolation method (Schonewille et al., 2008) achieves good filtering results, whereas a high-pass filter achieves bad results. Both filters are applied over the same frequency range of 0Hz-20Hz. Figure 2 shows histograms of the statistical distribution for data belonging to a single test shot of values for the high-pass filter (Figure 2-a) and the FX-prediction filter (Figure 2-b). Both distributions have a mean value close to zero. This was expected for the FX-prediction filter, but also for the high-pass filter because of the different frequency content of the difference and the output sections. This implies that if the taper range is large enough to encompass all swell noise frequencies, the difference and output datasets will be uncorrelated; if some swell noise is present in the output section, non-zero cross-correlation values are to be expected. This makes the mean uninformative, but the distinctly dissimilar shapes of these distributions can be easily captured by the second-order statistical moment of variance. It is obvious that the variance for the high-pass filter output is much smaller compared to the F-X prediction output; hence it is more informative as a measure of discrimination between these good and bad filtering outcomes. The next objective is to validate whether the proposed attributes are informative with respect to filtering performance in a realistic situation. This can be attained by investigating whether they can provide a reliable detection of few cases of bad filtering in real data from a sailine, as this is usually the case in practice. To mimic this situation, some shots that have been filtered with the high-pass

filter are mixed with shots filtered using the F-X prediction method. In Figure 3, the corresponding scatter plot of the variance of versus per shot energy reduction is shown, displaying favourable results. The bad filter shots are only slightly distinguishable at the energy reduction (vertical) axis, but are clearly separated from the good filter shots at the -variance (horizontal) axis. The combination of the two attributes conveniently creates two clear natural groupings of shots, the good group and the bad group. In this example, the classification is easily made by visual inspection, however for more realistic cases a clustering algorithm can be applied order to successfully reveal the desired groupings. It is also possible to modify the clustering algorithm to return a classification of more than two groups to allow for more complex data distributions. Figure 1 Comparison of the industry-standard FX-de-spiking with prediction interpolation and crude high-pass methods performance on the filtering of swell noise. The high-pass method clearly underperforms compared to the industry-standard, as it fails to remove large amounts of swell noise and also undesirably removes significant portions of primary signal as seen in the difference section. Figure 2 Histograms that show the distribution of for data belonging to a single test shot filtered for swell noise with (a) the crude high-pass method and (b) the industry-standard FX-despiking with prediction interpolation method.

Figure 3 Scatter plot of the per shot variance of versus energy reduction for sailine data from a mixed dataset of good filter shots (black) and bad filter shots (red). Conclusions Incorporating statistics-based information about the swell noise filtering process can help enhance the accuracy of the QC step in terms of its capacity to highlight locations where the filtering was not performed well enough. The visualisation of the attributes in a scatter plot rather than maps, accompanied with a visualisation of the corresponding seismic sections can be used to further explore potential relationships between observed attribute values and isolated anomalies. Clustering techniques may also be implemented in order to automatically detect anomalies that are not clearly distinguishable by an initial visual examination. In addition to the above, the visualisation of, for example, a -variance map in addition to RMS maps of input and output could also provide more information about the performance of the filtering process. The above methodology is an exploratory step towards the development of a data-driven automated QC platform that takes into account techniques from the fields of statistics and machine learning. Acknowledgements We would like to thank Prof Charles Taylor (School of Mathematics, University of Leeds) for his helpful input during the analysis stage and Paul LeCocq for his advices. References Fraley C, Raftery A. Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association, 97(458):611 631, 2002. Hyvärinen A, Karhunen J, Oja E. Independent Component Analysis. Adaptive and Learning Systems for Signal Processing, Communications, and Control. John Wiley & Sons, 2001. Schonewille M, Vigner A and Ryder A. Advances in swell-noise attenuation, First break vol. 26, pp.103-108, 2008.