Rapid Processing of Synthetic Seismograms using Windows Azure Cloud

Similar documents
ACCELERATING THE PRODUCTION OF SYNTHETIC SEISMOGRAMS BY A MULTICORE PROCESSOR CLUSTER WITH MULTIPLE GPUS

The State of High Performance Computing in the Cloud Sanjay P. Ahuja, 2 Sindhu Mani

Most real programs operate somewhere between task and data parallelism. Our solution also lies in this set.

Developing Microsoft Azure Solutions

Developing Microsoft Azure and Web Services. Course Code: 20487C; Duration: 5 days; Instructor-led

Distributed Systems. Tutorial 9 Windows Azure Storage

Windows Azure Services - At Different Levels

A Scalable Parallel LSQR Algorithm for Solving Large-Scale Linear System for Seismic Tomography

Introduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson, Nelson Araujo, Dennis Gannon, Wei Lu, and

Developing In The Cloud

[MS20487]: Developing Windows Azure and Web Services

On the Use of Burst Buffers for Accelerating Data-Intensive Scientific Workflows

COMP6511A: Large-Scale Distributed Systems. Windows Azure. Lin Gu. Hong Kong University of Science and Technology Spring, 2014

Course Outline. Introduction to Azure for Developers Course 10978A: 5 days Instructor Led

Corral: A Glide-in Based Service for Resource Provisioning

COURSE 20487B: DEVELOPING WINDOWS AZURE AND WEB SERVICES

10 years of CyberShake: Where are we now and where are we going

Developing Windows Azure and Web Services

escience in the Cloud: A MODIS Satellite Data Reprojection and Reduction Pipeline in the Windows

Sustainable Networks: Challenges and Opportunities. Anne Meltzer

Saranya Sriram Developer Evangelist Microsoft Corporation India

Data Intensive processing with irods and the middleware CiGri for the Whisper project Xavier Briand

High Performance Computing. Introduction to Parallel Computing

Load Balancing Algorithm over a Distributed Cloud Network

Seismic Reflection Method

Microsoft Azure Storage

MS-20487: Developing Windows Azure and Web Services

Chapter N/A: The Earthquake Cycle and Earthquake Size

Developing Microsoft Azure Solutions (70-532) Syllabus

How to scale Windows Azure Application

Liquefaction Analysis in 3D based on Neural Network Algorithm

Multi-Domain Pattern. I. Problem. II. Driving Forces. III. Solution

En oversikt En, oversikt likheter, og forskjeller Rune Zakariassen Microsoft Micr

The Omega Seismic Processing System. Seismic analysis at your fingertips

D ATA P R O C E S S I N G S E R V I C E S. S e i s U P S O F T W A R E & D E V E L O P M E N T

Ray and wave methods in complex geologic structures

High performance Computing and O&G Challenges

Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications

Serverless Computing: Design, Implementation, and Performance. Garrett McGrath and Paul R. Brenner

Developing Microsoft Azure Solutions: Course Agenda

Full-waveform Based Complete Moment Tensor Inversion and. Stress Estimation from Downhole Microseismic Data for. Hydrofracture Monitoring.

Automatic Moment Tensor solution for SeisComP3

Course Outline. Lesson 2, Azure Portals, describes the two current portals that are available for managing Azure subscriptions and services.

CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase. Chen Zhang Hans De Sterck University of Waterloo

Developing Microsoft Azure Solutions (70-532) Syllabus

Wave-equation migration from topography: Imaging Husky

Vendor: Microsoft. Exam Code: Exam Name: Developing Microsoft Azure Solutions. Version: Demo

Programming model and implementation for processing and. Programs can be automatically parallelized and executed on a large cluster of machines

Yves Goeleven. Solution Architect - Particular Software. Shipping software since Azure MVP since Co-founder & board member AZUG

Windows Azure Overview

RE-IMAGINING THE DATACENTER. Lynn Comp Director of Datacenter Solutions and Technologies

Electromagnetic migration of marine CSEM data in areas with rough bathymetry Michael S. Zhdanov and Martin Čuma*, University of Utah

Tu P13 08 A MATLAB Package for Frequency Domain Modeling of Elastic Waves

Integrated IoT and Cloud Environment for Fingerprint Recognition

The PageRank method for automatic detection of microseismic events Huiyu Zhu*, Jie Zhang, University of Science and Technology of China (USTC)

Parallelizing a seismic inversion code using PVM: a poor. June 27, Abstract

Research Article Mobile Storage and Search Engine of Information Oriented to Food Cloud

Microsoft. [MS20762]: Developing SQL Databases

Microsoft Developing Windows Azure and Web Services

Headwave Stacking in Terms of Partial Derivative Wavefield

MONTE CARLO SIMULATION FOR RADIOTHERAPY IN A DISTRIBUTED COMPUTING ENVIRONMENT

Exam Questions

Welcome to the. Migrating SQL Server Databases to Azure

Angle-gather time migration a

A Cloud Framework for Big Data Analytics Workflows on Azure

Dr. Fabrizio Gagliardi

Downhole Data Dictionary and Data Formatting Requirements Jamison Steidl. University of California at Santa Barbara Institute for Crustal Studies

A Framework for Online Inversion-Based 3D Site Characterization

Introduction to Deadlocks

Adaptive spatial resampling as a Markov chain Monte Carlo method for uncertainty quantification in seismic reservoir characterization

E044 Ray-based Tomography for Q Estimation and Q Compensation in Complex Media

Wave Imaging Technology Inc.

OPENSTACK PRIVATE CLOUD WITH GITHUB

McAfee Endpoint Security for Servers Product Guide

Course Outline. Developing Microsoft Azure Solutions Course 20532C: 4 days Instructor Led

THE HYBRID CLOUD. Private and Public Clouds Better Together

Introduction to Parallel Computing

3D Inversion of Time-Domain Electromagnetic Data for Ground Water Aquifers

Introduction to Oracle

Chapter 7: Deadlocks. Operating System Concepts 9 th Edition

Microsoft Developing Microsoft Azure Solutions.

ITBraindumps. Latest IT Braindumps study guide

Programming Windows Azure

Real-Time Hazard Map as an Application of Enhanced Integrated Earthquake Simulation (IES) with High Performance Computing Technique

1. Query and manipulate data with Entity Framework.

Parallelizing Multiple Group by Query in Shared-nothing Environment: A MapReduce Study Case

Inversion after depth imaging

Two-Level Dynamic Load Balancing Algorithm Using Load Thresholds and Pairwise Immigration

Efficient On-Demand Operations in Distributed Infrastructures

Sentinet for Microsoft Azure SENTINET

GraphTrek: Asynchronous Graph Traversal for Property Graph-Based Metadata Management

Source Depth and Mechanism Inversion at Teleseismic Distances Using a Neighborhood Algorithm

On the Scattering Effect of Lateral Discontinuities on AVA Migration

VANILLAIP SIP TRUNKING

High Resolution Imaging by Wave Equation Reflectivity Inversion

Developing Microsoft Azure Solutions (70-532) Syllabus

Developing Microsoft Azure Solutions (MS 20532)

Downloaded 10/23/13 to Redistribution subject to SEG license or copyright; see Terms of Use at

Kenneth A. Hawick P. D. Coddington H. A. James

Finite-difference elastic modelling below a structured free surface

Transcription:

Rapid Processing of Synthetic Seismograms using Windows Azure Cloud Vedaprakash Subramanian, Liqiang Wang Department of Computer Science University of Wyoming En-Jui Lee, Po Chen Department of Geology and Geophysics University of Wyoming 1

Introduction Scientific applications Large amount of computing resources Massive storage for datasets Traditionally Handled by HPC Clusters But Cost-ineffective Claim: Cloud Computing is a better substitute 2

more Conduct numerical simulation of synthetic seismograms Implemented a system on Azure cloud to generate synthetic seismograms on the fly based on user queries to deliver them in real-time 3

Synthetic Seismograms Seismogram is a record of the ground shaking Synthetic seismograms are generated by solving seismic wave equation Method is rapid centroid moment tensor (CMT) inversion method Based on 3D velocity model To increase efficiency, we store Receiver Green Tensors strain fields Generated at the receiver s location 4

more The input parameters for this wave-equation are latitude, longitude and depth of the earthquake locations strike, dip, and rake (source parameters) 5

more Why synthetic seismograms is important? Helps to map the Earth s internal structure Locate and measure the size of different seismic sources For realistic interpretation of structures Seismic Hazard Analysis 6

Windows Azure 7

Windows Azure Windows Azure is Platform as a service architecture Provides Scalable cloud operating system Data storage system User controls the hosted application User cannot control the underlying infrastructure 8

Service Architecture Azure service consists of : Web role For web application For user interface Worker role For generalized development For background processing Roles are virtual machines Say, 2 instances of worker role = 2 virtual machines running the code of worker role

Storage Abstractions Blobs Tables Queues Drives Our system uses only first three

Blob Storage Blobs Interface for storing files Two types namely Page blobs and Block blobs Containers for grouping Account Geo Container California Texas Blob File1 txt File2 txt File1 txt

Table Storage Tables Structured storage Consists set of entities, which contain a set of properties Partition key and Row key Account Geo Table Customers Photos Entity Name = Email = Name = Email = Photo Id = Date =

Queue Storage Queues Reliable storage and delivery of messages Communication between roles Account Geo Queue Orders Message ID = ID =

Load Balancing Azure master system Automatically load balance based on the partition key Partition key for various storage abstractions Blobs Container Name + Blob Name Entities Table Name + Partition Key Messages Queue Name 14

Implementation of the System 15

User Overview of the System The architecture of the system : Web Role User interface Job Manager Coordinate the computation Monitor the system Computation Input Queue Request Queue Computation Output Queue Windows Azure Storage (Blob, Table, Queue)

User Computation Worker Role Computation stuff 3 Azure Queues more Communication interface between the roles Computation Input Queue Request Queue Computation Output Queue Windows Azure Storage (Blob, Table, Queue)

User Web Role Receive the user input Place the request as message in request queue Request Queue Windows Azure Storage (Blob, Table, Queue)

Job Manager Retrieve the message from request queue Read the job Divide the job into sub-jobs Place the sub-jobs as message in inputcomputation queue Request Queue Computation Input Queue Computation Output Queue Windows Azure Storage (Blob, Table, Queue)

Coordinate the Computation Num of computation worker roles : 5 Num of CPU cores in each instance : 8 Input : 1000 source locations Num of queue messages : 1000 / 8 = 125 125 queue messages served by 5 computation worker roles Performance gain factor (Theoretical) = Num of CPU cores * Num of instances = 8 * 5 = 40

Inside Computation Worker Role Retrieve the message Process the sub-jobs in parallel using.net Task Parallel Library Write the result Send message stating job completed Computation Input Queue Computation Output Queue Windows Azure Storage (Blob, Table, Queue)

Monitor System Response Based on the response time of the message Threshold response time is 2ms If exceeds Allocate a new instance of computational worker role Maintain its detail

more De-allocate if any new instance has been allocated and its lifetime > one hour and no message in the queue

Monitor System Response Request to allocate VM msg msg msg Computation Input Queue msg Distributed File System

Job Manager Allocation & de-allocation are asynchronous So, mutual exclusion lock are enforced between Job assignment De-allocation of an instance

Data Storage Seismic wave observations stored as data file Point (40 59 N, 122 7 W) 4059-1227 Each data file is represented by its own latitude and longitude So, blob name = latitude + longitude For grouping the blobs, the region of California is divided into blocks based on the seismic wave observation stations 26

more Currently 4096 stations = 4096 blocks 35 25 N, 119 3 W 35 25 N, 116 47 W Each block is characterized by range of latitude and longitude 34 51 N, 119 3 W 34 51 N, 116 47 W So, block identification number = range of latitude + longitude 3525-3451-1193-11647 Container name = identification number 27

more Divide California region into 16 parts 1 Table for 1 part Store list of blocks in the part into the table Helps in better retrieval 28

Data Query Algorithm Locate blob corresponding to the given point Retrieve the container name and blob name Retrieve container name from the table Locate the table Do a linear search inside the table Blob name = latitude + longitude 29

Data Query Algorithm 41 43 N, 124 6 W 41 43 N, 120 43 W 40 85 N, 123 56 W 40 85 N, 122 52 W 40 59 N, 122 7 W Point 40 27 N, 123 56 W 40 27 N, 122 52 W 40 27 N, 124 6 W 39 11 N, 124 6 W 39 11 N, 122 52 W 39 11 N, 120 43 W 30

Experiment 31

Experiment Performance was evaluated on various configurations and number of instances of computational worker role datasets from different number of seismic wave observation 32

Execution Time (seconds) Performance Measurement Execution Time 1000 900 800 700 600 500 400 300 200 100 0 0 100 200 300 400 500 600 700 800 900 1000 Number of stations Single Worker (4 core) Four Worker (4 core) Single worker (4 core) + TPL Four Worker (4 core) + TPL Two worker (8 core) + TPL 33

Conclusion 34

Conclusion Implemented the system on Windows Azure Hence Cloud is a better substitute Future Work : Add Seismic Hazard Analysis feature to the system 35

Thanks 36