Scientific Workflows and Cloud Computing. Gideon Juve USC Information Sciences Institute

Similar documents
Clouds: An Opportunity for Scientific Applications?

A Cloud-based Dynamic Workflow for Mass Spectrometry Data Analysis

Pegasus Workflow Management System. Gideon Juve. USC Informa3on Sciences Ins3tute

Basics of Cloud Computing Lecture 2. Cloud Providers. Satish Srirama

POSTGRESQL ON AWS: TIPS & TRICKS (AND HORROR STORIES) ALEXANDER KUKUSHKIN. PostgresConf US

Pegasus. Automate, recover, and debug scientific computations. Rafael Ferreira da Silva.

HOW TO PLAN & EXECUTE A SUCCESSFUL CLOUD MIGRATION

Corral: A Glide-in Based Service for Resource Provisioning

What is. Thomas and Lori Duncan

Service Description. IBM DB2 on Cloud. 1. Cloud Service. 1.1 IBM DB2 on Cloud Standard Small. 1.2 IBM DB2 on Cloud Standard Medium

Basics of Cloud Computing Lecture 2. Cloud Providers. Satish Srirama

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

MySQL In the Cloud. Migration, Best Practices, High Availability, Scaling. Peter Zaitsev CEO Los Angeles MySQL Meetup June 12 th, 2017.

Next Generation Storage for The Software-Defned World

Cloud Computing for Science

Managing large-scale workflows with Pegasus

CPET 581 Cloud Computing: Technologies and Enterprise IT Strategies

COP Cloud Computing. Presented by: Sanketh Beerabbi University of Central Florida

Cloud Infrastructure

Basics of Cloud Computing Lecture 2. Cloud Providers. Satish Srirama

AWS: Basic Architecture Session SUNEY SHARMA Solutions Architect: AWS

INTEGRATING HPFS IN A CLOUD COMPUTING ENVIRONMENT

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe

IBM Terms of Use SaaS Specific Offering Terms. IBM DB2 on Cloud. 1. IBM SaaS. 2. Charge Metrics. 3. Charges and Billing

Evaluating Cloud Storage Strategies. James Bottomley; CTO, Server Virtualization

Integration of Workflow Partitioning and Resource Provisioning

On the Use of Cloud Computing for Scientific Workflows

Service Description. IBM DB2 on Cloud. 1. Cloud Service. 1.1 IBM DB2 on Cloud Standard Small. 1.2 IBM DB2 on Cloud Standard Medium

Designing MQ deployments for the cloud generation

THE DEFINITIVE GUIDE FOR AWS CLOUD EC2 FAMILIES

Cloud Computing with Nimbus

BeoLink.org. Design and build an inexpensive DFS. Fabrizio Manfredi Furuholmen. FrOSCon August 2008

Joint Center of Intelligent Computing George Mason University. Federal Geographic Data Committee (FGDC) ESIP 2011 Winter Meeting

MySQL in the Cloud Tricks and Tradeoffs

IBM Terms of Use SaaS Specific Offering Terms. IBM DB2 on Cloud. 1. IBM SaaS. 2. Charge Metrics

Cloud Computing Lecture 4

IBM Compose Managed Platform for Multiple Open Source Databases

EFFICIENT ALLOCATION OF DYNAMIC RESOURCES IN A CLOUD

FROM MONOLITH TO DOCKER DISTRIBUTED APPLICATIONS

Aneka Dynamic Provisioning

Performance Characterization of ONTAP Cloud in Amazon Web Services with Application Workloads

Large Scale Sky Computing Applications with Nimbus

MySQL and Virtualization Guide

Oracle Database 11g Direct NFS Client Oracle Open World - November 2007

Non-disruptive, two node high-availability (HA) support keeps you operating against unplanned storage failures in the cloud

NFS, GPFS, PVFS, Lustre Batch-scheduled systems: Clusters, Grids, and Supercomputers Programming paradigm: HPC, MTC, and HTC

CS / Cloud Computing. Recitation 3 September 9 th & 11 th, 2014

City of Carlsbad Web Mapping in the Amazon Cloud. Karl von Schlieder, GIS Manager June Acosta, GIS Administrator October 9, 2013

SoftNAS Cloud Performance Evaluation on AWS

Science Clouds: Early Experiences in Cloud Computing for Scientific Applications

5 Fundamental Strategies for Building a Data-centered Data Center

Performance Comparison for Resource Allocation Schemes using Cost Information in Cloud

Best Practices and Performance Tuning on Amazon Elastic MapReduce

Write On Aws. Aws Tools For Windows Powershell User Guide using the aws tools for windows powershell (p. 19) this section includes information about

DISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing WHAT IS CLOUD COMPUTING? 2. Slide 3. Slide 1. Why is it called Cloud?

Determining the IOPS Needs for Oracle Database on AWS

SAP VORA 1.4 on AWS - MARKETPLACE EDITION FREQUENTLY ASKED QUESTIONS

The Secret World of Cloud IaaS Pricing: How to Compare Apples and Oranges Among Cloud Providers

Contextualization: Providing One-Click Virtual Clusters

High Performance Computing Cloud - a PaaS Perspective

What is Cloud Computing? Cloud computing is the dynamic delivery of IT resources and capabilities as a Service over the Internet.


Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications

Ioan Raicu. Everyone else. More information at: Background? What do you want to get out of this course?

Pegasus WMS Automated Data Management in Shared and Nonshared Environments

AWS Solution Architecture Patterns

Analysis in the Big Data Era

Cloud Computing with Nimbus

CS15-319: Cloud Computing. Lecture 3 Course Project and Amazon AWS Majd Sakr and Mohammad Hammoud

A Comparative Study of Various Computing Environments-Cluster, Grid and Cloud

Research Challenges in Cloud Infrastructures to Provision Virtualized Resources

RACKSPACE ONMETAL I/O V2 OUTPERFORMS AMAZON EC2 BY UP TO 2X IN BENCHMARK TESTING

Introduction to Cloud Computing

CIT 668: System Architecture. Amazon Web Services

Welcome to the New Era of Cloud Computing

POSTGRESQL ON AWS: TIPS & TRICKS (AND HORROR STORIES) ALEXANDER KUKUSHKIN

Workflow scheduling algorithms for hard-deadline constrained cloud environments

Building Effective CyberGIS: FutureGrid. Marlon Pierce, Geoffrey Fox Indiana University

Pocket: Elastic Ephemeral Storage for Serverless Analytics

Cloud Providers more AWS, Aneka

BERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

Persistent Storage with Docker in production - Which solution and why?

vrealize Business Standard User Guide

Cloudera s Enterprise Data Hub on the Amazon Web Services Cloud: Quick Start Reference Deployment October 2014

CLOUD computing has become a popular computing infrastructure

Next-Generation Cloud Platform

The Use of Cloud Computing Resources in an HPC Environment

Secure Block Storage (SBS) FAQ

Practical MySQL Performance Optimization. Peter Zaitsev, CEO, Percona July 02, 2015 Percona Technical Webinars

Licensing Oracle on Amazon EC2, RDS and Microsoft Azure now twice as expensive!

Overcoming Data Locality: an In-Memory Runtime File System with Symmetrical Data Distribution

CLOUD computing has become an important computing

Sky Computing on FutureGrid and Grid 5000 with Nimbus. Pierre Riteau Université de Rennes 1, IRISA INRIA Rennes Bretagne Atlantique Rennes, France

Introduction to Amazon Web Services

SCALABLE RESOURCE MANAGEMENT IN CLOUD COMPUTING IMAN SADOOGHI DEPARTMENT OF COMPUTER SCIENCE

Paperspace. Architecture Overview. 20 Jay St. Suite 312 Brooklyn, NY Technical Whitepaper

TPP On The Cloud. Joe Slagel

Improving Performance using the LINUX IO Scheduler Shaun de Witt STFC ISGC2016

Multiple Disk VM Provisioning

POSTGRESQL ON AWS: TIPS & TRICKS (AND HORROR STORIES) ALEXANDER KUKUSHKIN. PGConf.EU 2017, Warsaw

Transcription:

Scientific Workflows and Cloud Computing Gideon Juve USC Information Sciences Institute gideon@isi.edu

Scientific Workflows Loosely-coupled parallel applications Expressed as directed acyclic graphs (DAGs) Nodes = Tasks, Edges = Dependencies Data is communicated via files Small Montage Workflow

Workflow Management System Pegasus workflow planner Efficiently maps tasks and data to resources DAGMan workflow engine Tracks dependencies, releases tasks, retries tasks Condor task manager Dispatches tasks (and data) to resources Pegasus WMS

Amazon Web Services (AWS) IaaS Cloud Services Elastic Compute Cloud (EC2) Provision virtual machine instances Simple Storage Service (S3) Object-based storage system Put/Get files from a global repository Elastic Block Store (EBS) Block-based storage system Unshared, SAN-like volumes Others (queue, key-value, RDBMS, MapReduce, etc.)

Workflows and Clouds Benefits User control over environment On-demand provisioning / Elasticity SLA, support, reliability, maintenance Drawbacks Complexity (more control = more work) Cost Performance Resource Availability Vendor Lock-In

Questions About Clouds How can we deploy workflows in the cloud? Install and configure software Execute workflow tasks Store workflow data How well do workflows perform in the cloud? Compared to grids and clusters Using various storage systems How much does it cost to run a workflow? To provision resources To store data To transfer data

Deploying Workflows in the Cloud Virtual Machines/Virtual Machine Images Clouds provide resources, but the software is up to the user Virtual Clusters Collections of virtual machines used together Configured to mimic traditional clusters Contextualization Dynamically configuring virtual clusters is not trivial Nimbus Context Broker automates provisioning and configuration of virtual clusters

Execution Environment Cloud Grid

Workflow Storage In the Cloud Executables Transfer into cloud Store in VM image Input Data Transfer into cloud Store in cloud Intermediate Data Use local disk (single node only) Use distributed storage system Output Data Transfer out of cloud Store in cloud

Resource Type Experiments Run workflows on single instances of different resource types (using local disk) Goals: Compare performance/cost of cloud resources Compare performance of grid and cloud Characterize virtualization overhead Quantify performance benefit of network/file system Resource Types Used

Storage System Experiments Investigate different options for storing intermediate data in a virtual cluster Goals Determine how to deploy storage systems Compare performance/cost of storage systems Determine which storage system Amazon Issues EC2 does not allow kernel patches (no Lustre, Ceph) EBS volumes cannot be shared between nodes Use c1.xlarge resources

Local Disk Storage Systems RAID0 across available partitions with XFS NFS: Network file system 1 dedicated node (m1.xlarge) PVFS: Parallel, striped cluster file system Workers host PVFS and run tasks GlusterFS: Distributed file system Workers host GlusterFS and run tasks NUFA, and Distribute modes Amazon S3: Object-based storage system Non-POSIX interface required changes to Pegasus Data is cached on workers

Example Applications Montage (astronomy) I/O: High Memory: Low CPU: Low Epigenome (bioinformatics) I/O: Low Memory: Medium CPU: High Broadband (earthquake science) I/O: Medium Memory: High CPU: Medium

Resource Type Performance Virtualization overhead is less than 10% Network/file system are biggest advantage for grid c1.xlarge is good, m1.small is bad Montage (high I/O) likes Lustre, Epigenome (high CPU) doesn t care

Storage System Performance GlusterFS (NUFA) is best overall Epigenome file system doesn t matter NFS, PVFS perform relatively poorly S3 performs poorly when reuse is low, and # files is large

Cost Components Resource Cost Cost for VM instances Billed by the hour Transfer Cost Cost to copy data to/from cloud over network Billed by the GB Storage Cost Cost to store VM images, application data Billed by the GB-month, # of accesses

Resource Cost (by Resource Type) The per-workflow cost is not bad m1.small is not the cheapest m1.large is most cost-effective Resources with best performance are not cheapest Per-hour billing affects price/ performance tradeoff

Resource Cost (by Storage System) Cost tracks performance Adding resources does not reduce cost (except in unusual cases) S3, NFS are at a disadvantage because of extra charges

Application Input Output Logs Montage 4291 MB 7970 MB 40 MB Broadband 4109 MB 159 MB 5.5 MB Epigenome 1843 MB 299 MB 3.3 MB Transfer Sizes Transfer Cost Cost of transferring data to/from cloud Input: $0.10/GB (first 10 TB, free till June 30) Output: $0.17/GB (first 10 TB, now $0.15) Transfer costs are a relatively large Application Input Output Logs Total Montage $0.42 $1.32 < $0.01 $1.75 Broadband $0.40 $0.03 < $0.01 $0.43 Epigenome $0.18 $0.05 < $0.01 $0.23 Transfer Costs For Montage, transferring data costs more than computing it Costs can be reduced by storing input data in the cloud and using it for multiple workflows

Storage Cost Storage Charge Price for storing data Per GB-month Access Charge Price for accessing data Per operation S3 Storage: $0.15 / GB-month Access: PUT: $0.01 / 1,000 GET: $0.01 / 10,000 EBS Storage: $0.10 / GB-month Access: $0.10 / million IOs Application Volume Size Monthly Cost Montage 5GB $0.66 Broadband 5GB $0.60 Epigenome 2GB $0.26 Storage of Inputs in EBS Image Size Monthly Cost 32-bit 773 MB $0.11 64-bit 729 MB $0.11 Storage of VM images in S3

Conclusions Deployment and Usability Easy to start using, but some work is required to generate images and automate configuration Tools like Nimbus Context Broker can help Little maintenance, good reliability Performance Not bad given resources, but not as good as dedicated clusters & grids VM overhead is less than 10% for apps tested c1.xlarge has best performance overall Avoid using m1.small

Conclusions Cost m1.small is not always the cheapest resource Transferring data is relatively expensive Store inputs long-term if possible Using multiple nodes is not cost-effective

Web Resources Pegasus http://pegasus.isi.edu Condor/DAGMan http://cs.wisc.edu/condor Nimbus Context Broker http://www.nimbusproject.org/ Amazon Web Services http://aws.amazon.com