Computing over the Internet: Beyond Embarrassingly Parallel Applications. BOINC Workshop 09. Fernando Costa

Similar documents
2013 AWS Worldwide Public Sector Summit Washington, D.C.

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Efficient On-Demand Operations in Distributed Infrastructures

Assignment 5. Georgia Koloniari

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

Chapter 11 DISTRIBUTED FILE SYSTEMS

MATE-EC2: A Middleware for Processing Data with Amazon Web Services

Big Compute, Big Net & Big Data: How to be big

Distributed Systems Final Exam

Analytics in the cloud

CS 470 Spring Distributed Web and File Systems. Mike Lam, Professor. Content taken from the following:

CSE6331: Cloud Computing

Dynamically Estimating Reliability in a Volunteer-Based Compute and Data-Storage System

CS 470 Spring Distributed Web and File Systems. Mike Lam, Professor. Content taken from the following:

MapReduce. U of Toronto, 2014

Geant4 on Azure using Docker containers

Introduction to MapReduce

Forget about the Clouds, Shoot for the MOON

The Use of Cloud Computing Resources in an HPC Environment

PROFILING BASED REDUCE MEMORY PROVISIONING FOR IMPROVING THE PERFORMANCE IN HADOOP

Distributed Systems. peer-to-peer Johan Montelius ID2201. Distributed Systems ID2201

MapReduce programming model

Mixing and matching virtual and physical HPC clusters. Paolo Anedda

Distributed NoSQL Storage for Extreme-scale System Services

Chapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT.

CSE 444: Database Internals. Lecture 23 Spark

Introduction to Hadoop and MapReduce

BigData and Map Reduce VITMAC03

BANDWIDTH MODELING IN LARGE DISTRIBUTED SYSTEMS FOR BIG DATA APPLICATIONS

Data Intensive Scalable Computing

Overview. Why MapReduce? What is MapReduce? The Hadoop Distributed File System Cloudera, Inc.

Pocket: Elastic Ephemeral Storage for Serverless Analytics

AWS Solution Architecture Patterns

Parallel VS Distributed

More AWS, Serverless Computing and Cloud Research

Distributed Systems. 21. Content Delivery Networks (CDN) Paul Krzyzanowski. Rutgers University. Fall 2018

CS November 2018

Course Microsoft Dynamics 365 Customization and Configuration with Visual Development (CRM)

Improving the MapReduce Big Data Processing Framework

MapReduce for Data Intensive Scientific Analyses

Genomics on Cisco Metacloud + SwiftStack

Large-scale volunteer computing over the Internet

Automatic Identification of Application I/O Signatures from Noisy Server-Side Traces. Yang Liu Raghul Gunasekaran Xiaosong Ma Sudharshan S.

Hadoop/MapReduce Computing Paradigm

Introduction to Distributed Computing Systems

MOHA: Many-Task Computing Framework on Hadoop

Datacenter replication solution with quasardb

ICN for Cloud Networking. Lotfi Benmohamed Advanced Network Technologies Division NIST Information Technology Laboratory

Op#mizing MapReduce for Highly- Distributed Environments

Peer-to-Peer Architectures and Signaling. Agenda

Google File System (GFS) and Hadoop Distributed File System (HDFS)

Available online at ScienceDirect. Procedia Computer Science 52 (2015 )

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

Peer-to-Peer Systems and Distributed Hash Tables

PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS

Ian Foster, CS554: Data-Intensive Computing

Today. Why might P2P be a win? What is a Peer-to-Peer (P2P) system? Peer-to-Peer Systems and Distributed Hash Tables

CS60021: Scalable Data Mining. Sourangshu Bhattacharya

HPC Storage Use Cases & Future Trends

Chapter 12: Multiprocessor Architectures. Lesson 12: Message passing Systems and Comparison of Message Passing and Sharing

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018

Cloud Bursting Jim Thompson Avere Systems

Federated data storage system prototype for LHC experiments and data intensive science

Using peer to peer. Marco Danelutto Dept. Computer Science University of Pisa

Spark Over RDMA: Accelerate Big Data SC Asia 2018 Ido Shamay Mellanox Technologies

EECS 498 Introduction to Distributed Systems

Lustre* - Fast Forward to Exascale High Performance Data Division. Eric Barton 18th April, 2013

Simplifying Collaboration in the Cloud

Distributed Systems Conclusions & Exam. Brian Nielsen

Peer to Peer Computing

Cloud Computing Paradigms for Pleasingly Parallel Biomedical Applications

Professor: Ioan Raicu. TA: Wei Tang. Everyone else

Cloud Computing 3. CSCI 4850/5850 High-Performance Computing Spring 2018

Pegasus WMS Automated Data Management in Shared and Nonshared Environments

OCP Engineering Workshop - Telco

Page 1. Goals for Today" Background of Cloud Computing" Sources Driving Big Data" CS162 Operating Systems and Systems Programming Lecture 24

CA485 Ray Walshe Google File System

FLAT DATACENTER STORAGE. Paper-3 Presenter-Pratik Bhatt fx6568

Scalable Tools - Part I Introduction to Scalable Tools

Lecture 8: Application Layer P2P Applications and DHTs

Serverless The Future of the Cloud?!

Why Multiprocessors?

LHCb Computing Status. Andrei Tsaregorodtsev CPPM

The amount of data increases every day Some numbers ( 2012):

6. Peer-to-peer (P2P) networks I.

2/26/2017. The amount of data increases every day Some numbers ( 2012):

Synonymous with supercomputing Tightly-coupled applications Implemented using Message Passing Interface (MPI) Large of amounts of computing for short

Sayantan Sur, Intel. ExaComm Workshop held in conjunction with ISC 2018

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed?

XSEDE Visualization Use Cases

A Breakthrough in Non-Volatile Memory Technology FUJITSU LIMITED

Distributed Computer Systems = Making Computer Systems Scalable, Reliable, Performant, etc., Yet Able to Form an Efficient Ecosystem

CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase. Chen Zhang Hans De Sterck University of Waterloo

Chapter 5. The MapReduce Programming Model and Implementation

Leveraging Flash in HPC Systems

CPET 581 Cloud Computing: Technologies and Enterprise IT Strategies

System Models for Distributed Systems

Cloud Computing CS

Scibox: Online Sharing of Scientific Data via the Cloud

System models for distributed systems

Transcription:

Computing over the Internet: Beyond Embarrassingly Parallel Applications BOINC Workshop 09 Barcelona Fernando Costa University of Coimbra

Overview Motivation Computing over Large Datasets Supporting new Applications MapReduce over the Internet Scientific Workflows Conclusion 2

Motivation Volunteer Computing potential increasing PS3, GPU PCs have increased network and storage capabilities Limited to embarrassingly parallel apps Master/worker model Limitations/Problems with current model 3

Motivation - Problems Current Projects Centralized architecture Data distribution limitations Storage problems Not many new projects HPC stonewalled VC New types of applications needed to reach new projects Unable to take advantage of VC full potential 4

Goals Apply P2P techniques to solve scalability problems in current Volunteer Computing projects Introduce storage layer as support for new computing paradigms and application types Allow new projects to use internet-wide computing Provide mechanisms to handle more demanding applications Adapt existing Grid applications Support data-intensive applications Jobs with dependencies 5

Computing over large Datasets Amazon Model Store large datasets for free Clients pay for computation and storage used by their applications How to adapt to BOINC? Take advantage of previous work with BitTorrent 6

Previous Work Improve data distribution BitTorrent Shared input files Proposal for a Collaborative CDN Super-peer organization (P2P-ADICS) Data Centers Data Lookup Service: DHT with volunteers 7

Computing over large Datasets BOINC + BitTorrent library Wrapper to set BitTorrent as read-only filesystem Use large datasets as inputs Possible command sequence: fd = open(tracker, objid) read(fd, buffer, offset, len) 8

BOINC + BT model fd = open(tracker, objid) read(fd, buffer, offset, len) 9

BOINC + BT Advantages Easy to implement as first version Allows initial testing to evaluate the solution Possible to add read/write support Next step: large outputs or intermediate results as inputs Problem Assumes inter-client communication Solution: Guarantee that at least N% are accessible (public IP) Communication over UDP hole punching techniques Turn this into a super-peer scenario? 10

New Applications on BOINC Build over storage layer Leverage direct transfers Export information for applications MapReduce over the Internet Wider use, but harder to find application Scientific Workflows Not too complex for a VC environment 11

New Applications - MapReduce MapReduce over the Internet Adapt Hadoop to internet-wide computing Volunteer Cloud Computing? Problems with typical applications 12

MapReduce over Internet Applications that would fit Lower Communication - Computation ratio Longer running time Lower latency requirements More shared files Volunteer genomic computations? MapReduce Workflows Separate Dimensions 13

New Applications - Workflows New types of applications Data-intensive applications E.g.: Handle CERN data-intensive computations Workflow Extremely variable characteristics: long or short running, data-intensive or compute-intensive 14

New Applications Current Work Handling new applications Science Workflows Volunteer storage system Store intermediate results and final output Two alternatives Data stored in all nodes: metada in central server Chosen nodes act as data centers 15

Current Work - Cliques Clique Complete graph: each peer is connected to every node Building the overlay/p2p system Peers replicate data between themselves Event-driven Simulator 16

Current Work - Cliques Advantages More resources; Higher availability; Higher transfer speed; Disadvantages Connectivity; Security; Upload bandwidth restrictions; New Issues Accountability Byzantine and selfish/rational behaviour Fault Tolerance Security Authorization Authentication 17

Supporting New Applications Problem How to find a suitable application? Current Focus Virtual machines, GPU and multiprocessor applications Build around existing application Don t develop system that may never be used No requests for computing against large datasets or workflow apps Solution: Collaborations with existing/new projects 18

Conclusion Current Work Building Volunteer Storage Platform Wrapper to use BT as read-only file system Leveraging the Storage Layer Working on simulator that uses Cliques to support workflows MapReduce Paradigm Data-intensive applications Combining with virtualization: Volunteer Cloud Computing? Finding partners Research is meaningless unless it is advantageous to SOMEONE 19