Pydron. Semi-Automatic Parallelization for Astronomy Data Processing. Stefan C. Müller Gustavo Alonso Adam Amara André Csillaghy

Size: px
Start display at page:

Download "Pydron. Semi-Automatic Parallelization for Astronomy Data Processing. Stefan C. Müller Gustavo Alonso Adam Amara André Csillaghy"

Transcription

1 Pydron Semi-Automatic Parallelization for Astronomy Data Processing Stefan C. Müller Gustavo Alonso Adam Amara André Csillaghy Pydron 10/8/2014 1

2 RHESSI Gustavo Alonso Systems Group ETHZ Adam Amara Institute of Astronomy ETHZ Solar Orbiter André Csillaghy Institute of 4D Technologies FHNW Stefan C. Müller Systems Group ETHZ Institute of 4D Technologies FHNW Dark Energy Survey Euclid RHESSI: NASA/Goddard Space Flight Center Conceptual Image Lab Solar Orbiter: ESA / AOES Dark Energy Survey: T. Abbott and NOAO/AURA/NSF Euclid: ESA/C. Carreau 10/8/2014 2

3 Overview Astronomy Processing Codes def main(raw_images): calibrated_files = [] for raw_image in raw_images: bias = measure_bias(raw_image) ft= measure_flat(raw_image) Python cal_file = calib(raw_image,bias,ft) calibrated_files.append(cal_file) Semi-Automatic Parallelization Pydron Cloud, Cluster, Multicore Amazon EC2 Pydron 10/8/2014 3

4 Outline Astronomy Data Processing Using Pydron How it works def main(raw_images): calibrated_files = [] for raw_image in raw_images: bias = measure_bias(raw_image) ft= measure_flat(raw_image) Python cal_file = calib(raw_image,bias,ft) calibrated_files.append(cal_file) Pydron 4 Results Amazon EC2 Pydron 10/8/2014 4

5 Why is Astronomy Data Processing Different? Lack of Economies of Scale Pydron 10/8/2014 5

6 # of Users One of a Kind Commercial Systems: Thousands of deployments Millions of users Astronomy Projects: Single Deployment Hundreds of Researchers < 10 that run jobs Innovative Instrument Eperiment... Never seen before.. Kind of data Data volume Data Processing Analysis... Code Reusability is very limited Pydron 10/8/2014 6

7 Code changes constantly Idea Write Code Run Study Results Unproductive Time Astronomers are Developers, not Users Write Paper never run again Pydron 10/8/2014 7

8 Sequential Script Write Code Run Parallel Code Eplain to Developer Write Code Run Pydron Write Code Run Sequential Script + Scaling Time Pydron 10/8/2014 8

9 How is Pydron used? Pydron 10/8/2014 9

10 Sequential Python Code def main(raw_images): calibrated_files = [] for raw_image in raw_images: bias = measure_bias(raw_image) flat = measure_flat(raw_image) cal_file = calibrate(raw_image,bias,flat) calibrated_files.append(cal_file) def measure_bias(raw_image):... def measure_flat(raw_image):... def calibrate(raw_image,bias,bg):... Pydron 10/8/

11 Look for parallelism in here Minimal code changes required no def main(raw_images): calibrated_files = [] for raw_image in raw_images: bias = measure_bias(raw_image) flat = measure_flat(raw_image) cal_file = calibrate(raw_image,bias,flat) def def def calibrate(raw_image,bias,bg):... Pydron 10/8/

12 Just run it Once function is invoked, Pydron will... For the cloud: Start cloud nodes & Start Python processes Transfer code (and libraries) to nodes Schedule tasks Send results back to workstation Shutdown nodes function returns with the result. Pydron 10/8/

13 Machine Learning Random Forest Training Uses scipy native library Multi-Core Cluster Pydron 10/8/

14 How does it work? Pydron 10/8/

15 Data-flow Graph def main(raw_images): cal_files = [] for raw_image in raw_images: bias = measure_bias(raw_image) ft = measure_flat(raw_image) cal_file = cal(raw_image,bias,ft) cal_files.append(cal_file) Epressions Task nodes Variables Value nodes for-each raw_image measure_bias measure_flat bias flat cal Static Single Assignment Form Sub-Graphs for compounds cal_file Pydron cal_files 15

16 Python Challenges Data-dependent control-flow No static type information Invoked functions unknown Decorators unknown Adapt Graph with Runtime Information Change graph depending on data Use dynamic type information Detect function decorators Dynamically adapt granularity Pydron 16

17 Data dependent Control-Flow while f(): = g() s += h() Pydron 17

18 Data dependent Control-Flow while f(): = g() s += h() translate s f while-task f g h s + s while s s Pydron 18

19 Data dependent Control-Flow while f(): = g() s += h() translate s f f g while h s + s s s Pydron 19

20 Data dependent Control-Flow while f(): = g() s += h() translate & replace while-task (3) g h g h g h f while s s + s + s + s Pydron 20

21 Data dependent Control-Flow while f(): = g() s += h() translate & replace while-task (3) g h g h g h f while s s + s + s + s Pydron 21

22 Data dependent Control-Flow while f(): = g() s += h() translate & replace while-task (3) g h g h g h f while s s + s + s + s Pydron 22

23 Results Pydron 23

24 Eo-Planet Detection Eecution Time Overhead Pydron 10/8/

25 Future Optimizations Scheduling algorithm Data specific optimizations Pre-fetching Dynamic Resource Allocation Pydron 10/8/

26 Conclusions Pydron lowers the boundary to use parallel infrastructure Non-intrusiveness & low learning curve over ideal performance Pydron 10/8/

27 Q & A Pydron 10/8/

In astroinformatics, the processing and analyzing. Scaling Astroinformatics: Python + Automatic Parallelization COVER FEATURE

In astroinformatics, the processing and analyzing. Scaling Astroinformatics: Python + Automatic Parallelization COVER FEATURE Scaling Astroinormatics: Python + Automatic Parallelization Stean C. Müller, ETH Zürich and University o Applied Sciences Northwestern Switzerland Gustavo Alonso, ETH Zürich André Csillaghy, University

More information

Advances in GIS help create Smarter Communities

Advances in GIS help create Smarter Communities Advances in GIS help create Smarter Communities POP(ovich) Quiz Who is a Desktop User? Who is an ArcGIS Online User? Who is a ArcGIS Server Admin? Who is a Programmer? Who works with or for a government

More information

Scaling MATLAB. for Your Organisation and Beyond. Rory Adams The MathWorks, Inc. 1

Scaling MATLAB. for Your Organisation and Beyond. Rory Adams The MathWorks, Inc. 1 Scaling MATLAB for Your Organisation and Beyond Rory Adams 2015 The MathWorks, Inc. 1 MATLAB at Scale Front-end scaling Scale with increasing access requests Back-end scaling Scale with increasing computational

More information

BIDS 2016 Santa Cruz de Tenerife

BIDS 2016 Santa Cruz de Tenerife BIDS 2016 Santa Cruz de Tenerife 1 EUCLID: Orchestrating the software development and the scientific data production in a map reduce paradigm Christophe Dabin (CNES) M. Poncet, K. Noddle, M. Holliman,

More information

Performance-related aspects in the Big Data Astronomy Era: architects in software optimization

Performance-related aspects in the Big Data Astronomy Era: architects in software optimization Performance-related aspects in the Big Data Astronomy Era: architects in software optimization Daniele Tavagnacco - INAF-Observatory of Trieste on behalf of EUCLID SDC-IT Design and Optimization image

More information

Adaptive Cluster Computing using JavaSpaces

Adaptive Cluster Computing using JavaSpaces Adaptive Cluster Computing using JavaSpaces Jyoti Batheja and Manish Parashar The Applied Software Systems Lab. ECE Department, Rutgers University Outline Background Introduction Related Work Summary of

More information

Cloud interoperability and elasticity with COMPSs

Cloud interoperability and elasticity with COMPSs www.bsc.es Cloud interoperability and elasticity with COMPSs Interoperability Demo Days Dec 12-2014, London Daniele Lezzi Barcelona Supercomputing Center Outline COMPSs programming model COMPSs tools COMPSs

More information

THE EUCLID ARCHIVE SYSTEM: A DATA-CENTRIC APPROACH TO BIG DATA

THE EUCLID ARCHIVE SYSTEM: A DATA-CENTRIC APPROACH TO BIG DATA THE EUCLID ARCHIVE SYSTEM: A DATA-CENTRIC APPROACH TO BIG DATA Sara Nieto on behalf of B.Altieri, G.Buenadicha, J. Salgado, P. de Teodoro European Space Astronomy Center, European Space Agency, Spain O.R.

More information

An Introduction to Big Data Formats

An Introduction to Big Data Formats Introduction to Big Data Formats 1 An Introduction to Big Data Formats Understanding Avro, Parquet, and ORC WHITE PAPER Introduction to Big Data Formats 2 TABLE OF TABLE OF CONTENTS CONTENTS INTRODUCTION

More information

Immersion Day. Getting Started with AWS Lambda. August Rev

Immersion Day. Getting Started with AWS Lambda. August Rev Getting Started with AWS Lambda August 2016 Rev 2016-08-19 Table of Contents Overview... 3 AWS Lambda... 3 Amazon S3... 3 Amazon CloudWatch... 3 Handling S3 Events using the AWS Lambda Console... 4 Create

More information

ArcGIS Enterprise in the Amazon Cloud

ArcGIS Enterprise in the Amazon Cloud ArcGIS Enterprise in the Amazon Cloud Cherry Lin (clin@esri.com) David Cordes (dcordes@esri.com) This Presentation Available at http://bit.ly/2tz2hpu AWS SIG Date: 07/13/2017 Time: 12:00pm - 1:00pm Location:

More information

YOUR APPLICATION S JOURNEY TO THE CLOUD. What s the best way to get cloud native capabilities for your existing applications?

YOUR APPLICATION S JOURNEY TO THE CLOUD. What s the best way to get cloud native capabilities for your existing applications? YOUR APPLICATION S JOURNEY TO THE CLOUD What s the best way to get cloud native capabilities for your existing applications? Introduction Moving applications to cloud is a priority for many IT organizations.

More information

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe Cloud Programming Programming Environment Oct 29, 2015 Osamu Tatebe Cloud Computing Only required amount of CPU and storage can be used anytime from anywhere via network Availability, throughput, reliability

More information

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015 Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document

More information

STK: DEMYSTIFYING THE CLOUD AND VIRTUAL ENVIRONMENTS

STK: DEMYSTIFYING THE CLOUD AND VIRTUAL ENVIRONMENTS STK: DEMYSTIFYING THE CLOUD AND VIRTUAL ENVIRONMENTS Analytical Graphics Inc. January 2017 CONTENTS Abstract... 3 Introduction... 3 Virtual Environment System Requirements... 3 Hardware Accelerated Graphics...

More information

Keywords AST, Pattern Matching, Automatic Parallelization, Loop Parallelization, Python

Keywords AST, Pattern Matching, Automatic Parallelization, Loop Parallelization, Python Volume 7, Issue 3, March 217 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com An Automatic Parallelizing

More information

ALIENVAULT USM FOR AWS SOLUTION GUIDE

ALIENVAULT USM FOR AWS SOLUTION GUIDE ALIENVAULT USM FOR AWS SOLUTION GUIDE Summary AlienVault Unified Security Management (USM) for AWS is a unified security platform providing threat detection, incident response, and compliance management

More information

How Verizon boosted product delivery with Dynatrace Software Intelligence

How Verizon boosted product delivery with Dynatrace Software Intelligence How Verizon boosted product delivery with Dynatrace Software Intelligence 3x faster build and test cycles 2x faster deployments 33 percent faster revenue realization 50 percent reduction in issues 2019

More information

THE EUCLID ARCHIVE SYSTEM: A DATA-CENTRIC APPROACH TO BIG DATA

THE EUCLID ARCHIVE SYSTEM: A DATA-CENTRIC APPROACH TO BIG DATA THE EUCLID ARCHIVE SYSTEM: A DATA-CENTRIC APPROACH TO BIG DATA Rees Williams on behalf of A.N.Belikov, D.Boxhoorn, B. Dröge, J.McFarland, A.Tsyganov, E.A. Valentijn University of Groningen, Groningen,

More information

Euclid Archive Science Archive System

Euclid Archive Science Archive System Euclid Archive Science Archive System Bruno Altieri Sara Nieto, Pilar de Teodoro (ESDC) 23/09/2016 Euclid Archive System Overview The EAS Data Processing System (DPS) stores the data products metadata

More information

Image Processing on the Cloud. Outline

Image Processing on the Cloud. Outline Mars Science Laboratory! Image Processing on the Cloud Emily Law Cloud Computing Workshop ESIP 2012 Summer Meeting July 14 th, 2012 1/26/12! 1 Outline Cloud computing @ JPL SDS Lunar images Challenge Image

More information

Work Queue + Python. A Framework For Scalable Scientific Ensemble Applications

Work Queue + Python. A Framework For Scalable Scientific Ensemble Applications Work Queue + Python A Framework For Scalable Scientific Ensemble Applications Peter Bui, Dinesh Rajan, Badi Abdul-Wahid, Jesus Izaguirre, Douglas Thain University of Notre Dame Distributed Computing Examples

More information

Important DevOps Technologies (3+2+3days) for Deployment

Important DevOps Technologies (3+2+3days) for Deployment Important DevOps Technologies (3+2+3days) for Deployment DevOps is the blending of tasks performed by a company's application development and systems operations teams. The term DevOps is being used in

More information

Workflow as a Service: An Approach to Workflow Farming

Workflow as a Service: An Approach to Workflow Farming Workflow as a Service: An Approach to Workflow Farming Reginald Cushing, Adam Belloum, Vladimir Korkhov, Dmitry Vasyunin, Marian Bubak, Carole Leguy Institute for Informatics University of Amsterdam 3

More information

ESA Science Archives Architecture Evolution. Iñaki Ortiz de Landaluce Science Archives Team 13 th Sept 2013

ESA Science Archives Architecture Evolution. Iñaki Ortiz de Landaluce Science Archives Team 13 th Sept 2013 ESA Science Archives Architecture Evolution Iñaki Ortiz de Landaluce Science Archives Team 13 th Sept 2013 Outline Introduction: ESA Science Archives Archives Architecture Evolution User Interfaces and

More information

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Layers of an information system. Design strategies.

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Layers of an information system. Design strategies. Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 2 Distributed Information Systems Architecture Chapter Outline

More information

A Spot Capacity Market to Increase Power Infrastructure Utilization in Multi-Tenant Data Centers

A Spot Capacity Market to Increase Power Infrastructure Utilization in Multi-Tenant Data Centers A Spot Capacity Market to Increase Power Infrastructure Utilization in Multi-Tenant Data Centers Mohammad A. Islam, Xiaoqi Ren, Shaolei Ren, and Adam Wierman This work was supported in part by the U.S.

More information

Clouds: An Opportunity for Scientific Applications?

Clouds: An Opportunity for Scientific Applications? Clouds: An Opportunity for Scientific Applications? Ewa Deelman USC Information Sciences Institute Acknowledgements Yang-Suk Ki (former PostDoc, USC) Gurmeet Singh (former Ph.D. student, USC) Gideon Juve

More information

FuncX: A Function Serving Platform for HPC. Ryan Chard 28 Jan 2019

FuncX: A Function Serving Platform for HPC. Ryan Chard 28 Jan 2019 FuncX: A Function Serving Platform for HPC Ryan Chard 28 Jan 2019 Outline - Motivation FuncX: FaaS for HPC Implementation status Preliminary applications - Machine learning inference Automating analysis

More information

Container-Native Storage

Container-Native Storage Container-Native Storage Solving the Persistent Storage Challenge with GlusterFS Michael Adam Manager, Software Engineering José A. Rivera Senior Software Engineer 2017.09.11 WARNING The following presentation

More information

Gustavo Alonso, ETH Zürich. Web services: Concepts, Architectures and Applications - Chapter 1 2

Gustavo Alonso, ETH Zürich. Web services: Concepts, Architectures and Applications - Chapter 1 2 Chapter 1: Distributed Information Systems Gustavo Alonso Computer Science Department Swiss Federal Institute of Technology (ETHZ) alonso@inf.ethz.ch http://www.iks.inf.ethz.ch/ Contents - Chapter 1 Design

More information

Private Cloud Public Cloud Edge. Consistent Infrastructure & Consistent Operations

Private Cloud Public Cloud Edge. Consistent Infrastructure & Consistent Operations Hybrid Cloud Native Public Cloud Private Cloud Public Cloud Edge Consistent Infrastructure & Consistent Operations VMs and Containers Management and Automation Cloud Ops DevOps Existing Apps Cost Management

More information

When, Where & Why to Use NoSQL?

When, Where & Why to Use NoSQL? When, Where & Why to Use NoSQL? 1 Big data is becoming a big challenge for enterprises. Many organizations have built environments for transactional data with Relational Database Management Systems (RDBMS),

More information

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Distributed transactions (quick refresh) Layers of an information system

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Distributed transactions (quick refresh) Layers of an information system Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 2 Distributed Information Systems Architecture Chapter Outline

More information

Execution Architecture

Execution Architecture Execution Architecture Software Architecture VO (706.706) Roman Kern Institute for Interactive Systems and Data Science, TU Graz 2018-11-07 Roman Kern (ISDS, TU Graz) Execution Architecture 2018-11-07

More information

A Cloud-based Dynamic Workflow for Mass Spectrometry Data Analysis

A Cloud-based Dynamic Workflow for Mass Spectrometry Data Analysis A Cloud-based Dynamic Workflow for Mass Spectrometry Data Analysis Ashish Nagavaram, Gagan Agrawal, Michael A. Freitas, Kelly H. Telu The Ohio State University Gaurang Mehta, Rajiv. G. Mayani, Ewa Deelman

More information

CSIRO and the Open Data Cube

CSIRO and the Open Data Cube CSIRO and the Open Data Cube Dr Robert Woodcock, Matt Paget, Peter Wang, Alex Held CSIRO Overview The challenge The Earth Observation Data Deluge Integrated science needs Data volume, rate of growth and

More information

Visual Design Flows for Faster Debug and Time to Market FlowTracer White Paper

Visual Design Flows for Faster Debug and Time to Market FlowTracer White Paper Visual Design Flows for Faster Debug and Time to Market FlowTracer White Paper 2560 Mission College Blvd., Suite 130 Santa Clara, CA 95054 (408) 492-0940 Introduction As System-on-Chip (SoC) designs have

More information

Machine Learning in WAN Research

Machine Learning in WAN Research Machine Learning in WAN Research Mariam Kiran mkiran@es.net Energy Sciences Network (ESnet) Lawrence Berkeley National Lab Oct 2017 Presented at Internet2 TechEx 2017 Outline ML in general ML in network

More information

At Course Completion Prepares you as per certification requirements for AWS Developer Associate.

At Course Completion Prepares you as per certification requirements for AWS Developer Associate. [AWS-DAW]: AWS Cloud Developer Associate Workshop Length Delivery Method : 4 days : Instructor-led (Classroom) At Course Completion Prepares you as per certification requirements for AWS Developer Associate.

More information

Accelerating Simulink Optimization, Code Generation & Test Automation Through Parallelization

Accelerating Simulink Optimization, Code Generation & Test Automation Through Parallelization Accelerating Simulink Optimization, Code Generation & Test Automation Through Parallelization Ryan Chladny Application Engineering May 13 th, 2014 2014 The MathWorks, Inc. 1 Design Challenge: Electric

More information

NC Education Cloud Feasibility Report

NC Education Cloud Feasibility Report 1 NC Education Cloud Feasibility Report 1. Problem Definition and rationale North Carolina districts are generally ill-equipped to manage production server infrastructure. Server infrastructure is most

More information

An Oracle White Paper October Minimizing Planned Downtime of SAP Systems with the Virtualization Technologies in Oracle Solaris 10

An Oracle White Paper October Minimizing Planned Downtime of SAP Systems with the Virtualization Technologies in Oracle Solaris 10 An Oracle White Paper October 2010 Minimizing Planned Downtime of SAP Systems with the Virtualization Technologies in Oracle Solaris 10 Introduction When business-critical systems are down for a variety

More information

Cycle Sharing Systems

Cycle Sharing Systems Cycle Sharing Systems Jagadeesh Dyaberi Dependable Computing Systems Lab Purdue University 10/31/2005 1 Introduction Design of Program Security Communication Architecture Implementation Conclusion Outline

More information

Seeking Supernovae in the Clouds: A Performance Study

Seeking Supernovae in the Clouds: A Performance Study Seeking Supernovae in the Clouds: A Performance Study Keith R. Jackson, Lavanya Ramakrishnan, Karl J. Runge, Rollin C. Thomas Lawrence Berkeley National Laboratory Why Do I Care About Supernovae? The rate

More information

A Container On a Virtual Machine On an HPC? Presentation to HPC Advisory Council. Perth, July 31-Aug 01, 2017

A Container On a Virtual Machine On an HPC? Presentation to HPC Advisory Council. Perth, July 31-Aug 01, 2017 A Container On a Virtual Machine On an HPC? Presentation to HPC Advisory Council Perth, July 31-Aug 01, 2017 http://levlafayette.com Necessary and Sufficient Definitions High Performance Computing: High

More information

Planning and Administering SharePoint 2016

Planning and Administering SharePoint 2016 SharePoint Course - 203391 Planning and Administering SharePoint 2016 Length 5 days Prerequisites In addition to their professional experience, students who attend this training should already have the

More information

CDS. André Schaaff1, François-Xavier Pineau1, Gilles Landais1, Laurent Michel2 de Données astronomiques de Strasbourg, 2SSC-XMM-Newton

CDS. André Schaaff1, François-Xavier Pineau1, Gilles Landais1, Laurent Michel2 de Données astronomiques de Strasbourg, 2SSC-XMM-Newton Docker @ CDS André Schaaff1, François-Xavier Pineau1, Gilles Landais1, Laurent Michel2 1Centre de Données astronomiques de Strasbourg, 2SSC-XMM-Newton Paul Trehiou Université de technologie de Belfort-Montbéliard

More information

Data Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 10: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application

More information

SOLVING THE MOBILE TESTING CONUNDRUM

SOLVING THE MOBILE TESTING CONUNDRUM SOLVING THE MOBILE TESTING CONUNDRUM Even though mobile testing is complex, it can be done successfully with the correct strategy. A sound mobile test automation strategy must include test automation frameworks,

More information

Exploring Architecture Options for a Federated, Cloud-based Systems Biology Knowledgebase. Ian Gorton, Jenny Liu, Jian Yin

Exploring Architecture Options for a Federated, Cloud-based Systems Biology Knowledgebase. Ian Gorton, Jenny Liu, Jian Yin Exploring Architecture Options for a Federated, Cloud-based Systems Biology Knowledgebase Ian Gorton, Jenny Liu, Jian Yin 1 Systems Biology Systems Biology Integrated study of organisms as a whole Obtain,

More information

En oversikt En, oversikt likheter, og forskjeller Rune Zakariassen Microsoft Micr

En oversikt En, oversikt likheter, og forskjeller Rune Zakariassen Microsoft Micr En oversikt, likheter og forskjeller Rune Zakariassen Microsoft Historic Computing Transformations We are all excited about the cloud IDC Sees Cloud Market Maturing Quickly In 2009, approximately $17 billion

More information

CLOUD AND AWS TECHNICAL ESSENTIALS PLUS

CLOUD AND AWS TECHNICAL ESSENTIALS PLUS 1 P a g e CLOUD AND AWS TECHNICAL ESSENTIALS PLUS Contents Description... 2 Course Objectives... 2 Cloud computing essentials:... 2 Pre-Cloud and Need for Cloud:... 2 Cloud Computing and in-depth discussion...

More information

AWS Lambda. 1.1 What is AWS Lambda?

AWS Lambda. 1.1 What is AWS Lambda? Objectives Key objectives of this chapter Lambda Functions Use cases The programming model Lambda blueprints AWS Lambda 1.1 What is AWS Lambda? AWS Lambda lets you run your code written in a number of

More information

Data Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 9: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application

More information

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018 Cloud Computing 2 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning

More information

Using CernVM-FS to deploy Euclid processing S/W on Science Data Centres

Using CernVM-FS to deploy Euclid processing S/W on Science Data Centres Using CernVM-FS to deploy Euclid processing S/W on Science Data Centres M. Poncet (CNES) Q. Le Boulc h (IN2P3) M. Holliman (ROE) On behalf of Euclid EC SGS System Team ADASS 2016 1 Outline Euclid Project

More information

Hop Onset Network: Adequate Stationing Schema for Large Scale Cloud- Applications

Hop Onset Network: Adequate Stationing Schema for Large Scale Cloud- Applications www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 6 Issue 10 October 2017, Page No. 22616-22626 Index Copernicus value (2015): 58.10 DOI: 10.18535/ijecs/v6i10.11

More information

THE PROCESS OF PARALLELIZING THE CONJUNCTION PREDICTION ALGORITHM OF ESAS SSA CONJUNCTION PREDICTION SERVICE USING GPGPU

THE PROCESS OF PARALLELIZING THE CONJUNCTION PREDICTION ALGORITHM OF ESAS SSA CONJUNCTION PREDICTION SERVICE USING GPGPU THE PROCESS OF PARALLELIZING THE CONJUNCTION PREDICTION ALGORITHM OF ESAS SSA CONJUNCTION PREDICTION SERVICE USING GPGPU M. Fehr, V. Navarro, L. Martin, and E. Fletcher European Space Astronomy Center,

More information

Planning and Administering SharePoint 2016

Planning and Administering SharePoint 2016 Planning and Administering SharePoint 2016 20339-1; 5 Days; Instructor-led Course Description This five-day course will provide you with the knowledge and skills to plan and administer a Microsoft SharePoint

More information

Design Patterns for the Cloud. MCSN - N. Tonellotto - Distributed Enabling Platforms 68

Design Patterns for the Cloud. MCSN - N. Tonellotto - Distributed Enabling Platforms 68 Design Patterns for the Cloud 68 based on Amazon Web Services Architecting for the Cloud: Best Practices Jinesh Varia http://media.amazonwebservices.com/aws_cloud_best_practices.pdf 69 Amazon Web Services

More information

Managing large-scale workflows with Pegasus

Managing large-scale workflows with Pegasus Funded by the National Science Foundation under the OCI SDCI program, grant #0722019 Managing large-scale workflows with Pegasus Karan Vahi ( vahi@isi.edu) Collaborative Computing Group USC Information

More information

MATE-EC2: A Middleware for Processing Data with Amazon Web Services

MATE-EC2: A Middleware for Processing Data with Amazon Web Services MATE-EC2: A Middleware for Processing Data with Amazon Web Services Tekin Bicer David Chiu* and Gagan Agrawal Department of Compute Science and Engineering Ohio State University * School of Engineering

More information

Your Complete Guide to Backup and Recovery for MongoDB

Your Complete Guide to Backup and Recovery for MongoDB Your Complete Guide to Backup and Recovery for MongoDB EBOOK Your Complete Guide to Backup and Recovery for MongoDB Table of Contents Part I: Backup and Recovery for MongoDB Part II: Customer Case Study

More information

Lecture 1: January 22

Lecture 1: January 22 CMPSCI 677 Distributed and Operating Systems Spring 2018 Lecture 1: January 22 Lecturer: Prashant Shenoy Scribe: Bin Wang 1.1 Introduction to the course The lecture started by outlining the administrative

More information

Models for model integration

Models for model integration Models for model integration Oxford, UK, June 27, 2017 Daniel S. Katz Assistant Director for Scientific Software & Applications, NCSA Research Associate Professor, CS Research Associate Professor, ECE

More information

Smarter Systems In Your Cloud Deployment

Smarter Systems In Your Cloud Deployment Smarter Systems In Your Cloud Deployment Hemant S Shah ASEAN Executive: Cloud Computing, Systems Software. 5 th Oct., 2010 Contents We need Smarter Systems for a Smarter Planet Smarter Systems = Systems

More information

CS 552 -Final Study Guide Summer 2015

CS 552 -Final Study Guide Summer 2015 CS 552 -Final Study Guide Summer 2015 TRUE/FALSE 1. Most people feel comfortable purchasing complex devices, such as cars, home theater systems, and computers. 2. To make an informed choice when purchasing

More information

The GISandbox: A Science Gateway For Geospatial Computing. Davide Del Vento, Eric Shook, Andrea Zonca

The GISandbox: A Science Gateway For Geospatial Computing. Davide Del Vento, Eric Shook, Andrea Zonca The GISandbox: A Science Gateway For Geospatial Computing Davide Del Vento, Eric Shook, Andrea Zonca 1 Paleoscape Model and Human Origins Simulate Climate and Vegetation during the Last Glacial Maximum

More information

Hadoop/MapReduce Computing Paradigm

Hadoop/MapReduce Computing Paradigm Hadoop/Reduce Computing Paradigm 1 Large-Scale Data Analytics Reduce computing paradigm (E.g., Hadoop) vs. Traditional database systems vs. Database Many enterprises are turning to Hadoop Especially applications

More information

Windows Server 2008 Active Directory Lab Manual READ ONLINE

Windows Server 2008 Active Directory Lab Manual READ ONLINE Windows Server 2008 Active Directory Lab Manual READ ONLINE 70-640, Package: Windows Server 2008 Active - Official Academic Course Series) 9780470874981 - Exam 70-640 Windows Server 2008-9780470874981

More information

"Charting the Course... MOC /2: Planning, Administering & Advanced Technologies of SharePoint Course Summary

Charting the Course... MOC /2: Planning, Administering & Advanced Technologies of SharePoint Course Summary Description Course Summary This five-day course will provide you with the knowledge and skills to plan and administer a Microsoft environment. The course teaches you how to deploy, administer, and troubleshoot

More information

Course Outline. Advanced Automated Administration with Windows PowerShell Course 10962: 3 days Instructor Led

Course Outline. Advanced Automated Administration with Windows PowerShell Course 10962: 3 days Instructor Led Advanced Automated Administration with Windows PowerShell Course 10962: 3 days Instructor Led Prerequisites: Before attending this course, students must have: Knowledge and experience working with Windows

More information

CONCURRENT DISTRIBUTED TASK SYSTEM IN PYTHON. Created by Moritz Wundke

CONCURRENT DISTRIBUTED TASK SYSTEM IN PYTHON. Created by Moritz Wundke CONCURRENT DISTRIBUTED TASK SYSTEM IN PYTHON Created by Moritz Wundke INTRODUCTION Concurrent aims to be a different type of task distribution system compared to what MPI like system do. It adds a simple

More information

Millisort: An Experiment in Granular Computing. Seo Jin Park with Yilong Li, Collin Lee and John Ousterhout

Millisort: An Experiment in Granular Computing. Seo Jin Park with Yilong Li, Collin Lee and John Ousterhout Millisort: An Experiment in Granular Computing Seo Jin Park with Yilong Li, Collin Lee and John Ousterhout Massively Parallel Granular Computing Massively parallel computing as an application of granular

More information

Discovering Dependencies between Virtual Machines Using CPU Utilization. Renuka Apte, Liting Hu, Karsten Schwan, Arpan Ghosh

Discovering Dependencies between Virtual Machines Using CPU Utilization. Renuka Apte, Liting Hu, Karsten Schwan, Arpan Ghosh Look Who s Talking Discovering Dependencies between Virtual Machines Using CPU Utilization Renuka Apte, Liting Hu, Karsten Schwan, Arpan Ghosh Georgia Institute of Technology Talk by Renuka Apte * *Currently

More information

Distributed Systems COMP 212. Lecture 18 Othon Michail

Distributed Systems COMP 212. Lecture 18 Othon Michail Distributed Systems COMP 212 Lecture 18 Othon Michail Virtualisation & Cloud Computing 2/27 Protection rings It s all about protection rings in modern processors Hardware mechanism to protect data and

More information

Aws Certified Solutions Architect Ustoreore

Aws Certified Solutions Architect Ustoreore AWS CERTIFIED SOLUTIONS ARCHITECT USTOREORE PDF - Are you looking for aws certified solutions architect ustoreore Books? Now, you will be happy that at this time aws certified solutions architect ustoreore

More information

Beyond Hive Pig and Python

Beyond Hive Pig and Python Beyond Hive Pig and Python What is Pig? Pig performs a series of transformations to data relations based on Pig Latin statements Relations are loaded using schema on read semantics to project table structure

More information

Sourcefire Solutions Overview Security for the Real World. SEE everything in your environment. LEARN by applying security intelligence to data

Sourcefire Solutions Overview Security for the Real World. SEE everything in your environment. LEARN by applying security intelligence to data SEE everything in your environment LEARN by applying security intelligence to data ADAPT defenses automatically ACT in real-time Sourcefire Solutions Overview Security for the Real World Change is constant.

More information

SECURING AWS ACCESS WITH MODERN IDENTITY SOLUTIONS

SECURING AWS ACCESS WITH MODERN IDENTITY SOLUTIONS WHITE PAPER SECURING AWS ACCESS WITH MODERN IDENTITY SOLUTIONS The Challenges Of Securing AWS Access and How To Address Them In The Modern Enterprise Executive Summary When operating in Amazon Web Services

More information

Hybrid MapReduce Workflow. Yang Ruan, Zhenhua Guo, Yuduo Zhou, Judy Qiu, Geoffrey Fox Indiana University, US

Hybrid MapReduce Workflow. Yang Ruan, Zhenhua Guo, Yuduo Zhou, Judy Qiu, Geoffrey Fox Indiana University, US Hybrid MapReduce Workflow Yang Ruan, Zhenhua Guo, Yuduo Zhou, Judy Qiu, Geoffrey Fox Indiana University, US Outline Introduction and Background MapReduce Iterative MapReduce Distributed Workflow Management

More information

EASILY DEPLOY AND SCALE KUBERNETES WITH RANCHER

EASILY DEPLOY AND SCALE KUBERNETES WITH RANCHER EASILY DEPLOY AND SCALE KUBERNETES WITH RANCHER 2 WHY KUBERNETES? Kubernetes is an open-source container orchestrator for deploying and managing containerized applications. Building on 15 years of experience

More information

Data Clustering on the Parallel Hadoop MapReduce Model. Dimitrios Verraros

Data Clustering on the Parallel Hadoop MapReduce Model. Dimitrios Verraros Data Clustering on the Parallel Hadoop MapReduce Model Dimitrios Verraros Overview The purpose of this thesis is to implement and benchmark the performance of a parallel K- means clustering algorithm on

More information

Implementing and Maintaining Microsoft SQL Server 2008 Integration Services

Implementing and Maintaining Microsoft SQL Server 2008 Integration Services Course 6235A: Implementing and Maintaining Microsoft SQL Server 2008 Integration Services Course Details Course Outline Module 1: Introduction to SQL Server 2008 Integration Services The students will

More information

A Practitioner s Guide to Migrating Workloads to VMware Cloud on AWS

A Practitioner s Guide to Migrating Workloads to VMware Cloud on AWS A Practitioner s Guide to Migrating Workloads to VMware Cloud on AWS Adam Osterholt, VMware, Inc. Paul Gifford, VMware, Inc. #vmworld HYP1496BU #HYP1496BU Disclaimer This presentation may contain product

More information

"Charting the Course... MOC 6435 B Designing a Windows Server 2008 Network Infrastructure Course Summary

Charting the Course... MOC 6435 B Designing a Windows Server 2008 Network Infrastructure Course Summary MOC 6435 B Designing a Windows Network Infrastructure Course Summary Description This five-day course will provide students with an understanding of how to design a Windows Network Infrastructure that

More information

An Intelligent Service Oriented Infrastructure supporting Real-time Applications

An Intelligent Service Oriented Infrastructure supporting Real-time Applications An Intelligent Service Oriented Infrastructure supporting Real-time Applications Future Network Technologies Workshop 10-11 -ETSI, Sophia Antipolis,France Karsten Oberle, Alcatel-Lucent Bell Labs Karsten.Oberle@alcatel-lucent.com

More information

An Eclipse-based Environment for Programming and Using Service-Oriented Grid

An Eclipse-based Environment for Programming and Using Service-Oriented Grid An Eclipse-based Environment for Programming and Using Service-Oriented Grid Tianchao Li and Michael Gerndt Institut fuer Informatik, Technische Universitaet Muenchen, Germany Abstract The convergence

More information

Stream Processing on IoT Devices using Calvin Framework

Stream Processing on IoT Devices using Calvin Framework Stream Processing on IoT Devices using Calvin Framework by Ameya Nayak A Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science Supervised

More information

KNIME for the life sciences Cambridge Meetup

KNIME for the life sciences Cambridge Meetup KNIME for the life sciences Cambridge Meetup Greg Landrum, Ph.D. KNIME.com AG 12 July 2016 What is KNIME? A bit of motivation: tool blending, data blending, documentation, automation, reproducibility More

More information

On demand Satellite Image Processing Next generation technology for processing Terabytes of imagery on the Cloud White Paper July, 2011

On demand Satellite Image Processing Next generation technology for processing Terabytes of imagery on the Cloud White Paper July, 2011 On demand Satellite Image Processing Next generation technology for processing Terabytes of imagery on the Cloud White Paper July, 2011 On-Demand Satellite Image Processing Next generation technology for

More information

QLIK INTEGRATION WITH AMAZON REDSHIFT

QLIK INTEGRATION WITH AMAZON REDSHIFT QLIK INTEGRATION WITH AMAZON REDSHIFT Qlik Partner Engineering Created August 2016, last updated March 2017 Contents Introduction... 2 About Amazon Web Services (AWS)... 2 About Amazon Redshift... 2 Qlik

More information

Alteryx Technical Overview

Alteryx Technical Overview Alteryx Technical Overview v 1.5, March 2017 2017 Alteryx, Inc. v1.5, March 2017 Page 1 Contents System Overview... 3 Alteryx Designer... 3 Alteryx Engine... 3 Alteryx Service... 5 Alteryx Scheduler...

More information

PowerShell for System Center Configuration Manager Administrators

PowerShell for System Center Configuration Manager Administrators Course 55133A: PowerShell for System Center Configuration Manager Administrators - Course details Course Outline Module 1: Review of System Center Configuration Manager Concepts This module explains the

More information

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI Overview Unparalleled Value Product Portfolio Software Platform From Desk to Data Center to Cloud Summary AI researchers depend on computing performance to gain

More information

Advanced Topics in Computer Architecture

Advanced Topics in Computer Architecture Advanced Topics in Computer Architecture Lecture 7 Data Level Parallelism: Vector Processors Marenglen Biba Department of Computer Science University of New York Tirana Cray I m certainly not inventing

More information

Machine Learning in WAN Research

Machine Learning in WAN Research Machine Learning in WAN Research Mariam Kiran mkiran@es.net Energy Sciences Network (ESnet) Lawrence Berkeley National Lab Oct 2017 Presented at Internet2 TechEx 2017 Outline ML in general ML in network

More information

The Fusion Distributed File System

The Fusion Distributed File System Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique

More information