Volunteer Computing with BOINC

Dr. David P. Anderson, University of California, Berkeley
SC10, Nov. 14, 2010

Goals
- Explain volunteer computing
- Teach how to create a volunteer computing project using BOINC
- Target audience: high-throughput computing users
- Technical skills: basic Linux/Apache sysadmin; familiarity with PHP, SQL and XML; C/C++ (optional)

Outline
- Why use volunteer computing?
- Basic concepts of BOINC
- Developing BOINC applications
- (15 minute break)
- Deploying a BOINC server
- Deploying applications
- Submitting jobs
- Organizational issues

Part 1: Why use volunteer computing?

The Consumer Digital Infrastructure
- 1 billion PCs
- Current GPUs: 1 TeraFLOPS each (1,000 ExaFLOPS total)
- Storage: ~1,000 Exabytes
- Commodity Internet: 10-1,000 Mbps to the home
- Consumers pay for: hardware, sysadmin, network costs, electricity
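The aggregate figures on this slide follow from simple multiplication. A quick check, assuming one 1-TeraFLOPS GPU per PC and about 1 TB of storage per PC (the per-PC storage figure is an assumption; the slide gives only the total):

```python
# Back-of-the-envelope check of the slide's aggregate figures.
PCS = 1_000_000_000           # 1 billion PCs (from the slide)
GPU_FLOPS = 10**12            # 1 TeraFLOPS per GPU (from the slide)
STORAGE_PER_PC = 10**12       # ASSUMPTION: about 1 TB of storage per PC
EXA = 10**18

total_exaflops = PCS * GPU_FLOPS // EXA
total_exabytes = PCS * STORAGE_PER_PC // EXA

print(f"Aggregate compute: {total_exaflops:,} ExaFLOPS")
print(f"Aggregate storage: {total_exabytes:,} Exabytes")
```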

Volunteer computing
- PC owners donate computing resources to projects (e.g., computational science)
- Applications run at zero priority while the PC is in use, and/or while the PC is not in use

Examples

Project          Start  Where        Area            Peak #hosts
GIMPS            1994   -            math            10,000
distributed.net  1995   -            cryptography    100,000
SETI@home I      1999   UCB          SETI            600,000
Folding@home     1999   Stanford     biology         200,000
United Devices   2002   commercial   biomedicine     200,000
CPDN             2003   Oxford       climate change  150,000
LHC@home         2004   CERN         physics         60,000
Predictor@home   2004   Scripps      biology         100,000
WCG              2004   commercial   biomedicine     200,000
Einstein@home    2005   LIGO         astrophysics    200,000
SETI@home II     2005   UCB          SETI            850,000
Rosetta@home     2005   U. Wash      biology         100,000
SIMAP            2005   T.U. Munich  bioinformatics  10,000
...

Current status
- ~50 projects
- 500,000 volunteers
- 800,000 computers

[Figure: computing platforms placed by number of processors (1, 100, 1000, 10K-1M) and by workload (single job vs. multiple jobs). High-performance computing: cluster (MPI), supercomputer. High-throughput computing: cluster (batch), Grid, commercial cloud, volunteer computing.]

Volunteer computing is different
- You don't buy resources; you ask for them
- Resources are:
  - heterogeneous
  - sporadically available and connected
  - untrusted and not private
  - behind firewalls/NATs/proxies

Part 2: Basic concepts of BOINC

About BOINC
- Funded by NSF since 2002
- Open-source (LGPL)
- Based at UC Berkeley
- Few staff, but lots of volunteers:
  - software testing
  - translation
  - documentation
  - support (email lists, message boards, Skype)

Volunteers and projects
[Figure: volunteers are linked by attachments to projects such as CPDN, LHC@home and WCG.]

BOINC software overview
[Figure: the project server (scheduler, MySQL, daemons, data server) talks over HTTP to the volunteer host (client, GUI, screensaver, apps).]

BOINC scheduler
- On the server: applications have app versions (Win32, Win64, Win32 + NVIDIA, Win32 N-core, Mac OS X); jobs have instances
- The client's request carries:
  - HW, SW description
  - existing workload
  - per resource type: # of instances requested, # of seconds requested
- The scheduler's reply carries:
  - app version descriptions
  - job descriptions

Job replication
- Job instances may fail or return wrong results
- Job replication: run 2 instances, see if they agree
  - "agree" may be fuzzy
- Homogeneous replication
  - numerical equivalence of hosts
- Adaptive replication
  - reduce replication for hosts that seem trustworthy
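A minimal sketch of the replication idea, not BOINC's validator code: results "agree" when they match within a relative tolerance, a job is settled once a quorum of instances agree, and a host with a long streak of valid results gets less replication. The names `Instance`, `find_canonical`, `replication_for` and the threshold of 10 are all hypothetical, and a real policy for reducing replication could well be probabilistic rather than a hard cutoff.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Instance:
    host_id: int
    value: float  # the numeric result this host returned


def agree(a: float, b: float, rel_tol: float = 1e-5) -> bool:
    """Fuzzy comparison: equal within a relative tolerance."""
    return abs(a - b) <= rel_tol * max(abs(a), abs(b), 1.0)


def find_canonical(instances: Sequence[Instance],
                   min_quorum: int = 2) -> Optional[Instance]:
    """Return the first instance that at least min_quorum instances
    (itself included) agree with, or None if there is no quorum yet."""
    for candidate in instances:
        matches = [i for i in instances if agree(candidate.value, i.value)]
        if len(matches) >= min_quorum:
            return candidate
    return None


def replication_for(consecutive_valid: int, threshold: int = 10) -> int:
    """Adaptive replication (hypothetical rule): a host with a long
    streak of valid results runs jobs without a second copy."""
    return 1 if consecutive_valid >= threshold else 2


results = [Instance(1, 3.1415926), Instance(2, 3.1415927), Instance(3, 2.7)]
canonical = find_canonical(results)
print("canonical result from host", canonical.host_id if canonical else None)
```

With these inputs, hosts 1 and 2 agree within tolerance and host 3 is the outlier, so the quorum of 2 is met by the first two instances.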

The job pipeline
[Figure: work generator -> BOINC -> validator -> assimilator]

The BOINC data model
- App versions, job inputs and job outputs can each consist of arbitrarily many files
- Each file has a physical name (unique, immutable); each reference to a file has a logical name
- Files have various attributes (e.g., sticky)
- Each file can have one or more URLs and is transferred via HTTP
- App version files are digitally signed
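The split between physical and logical names can be pictured with a small data structure. This is an illustrative sketch only; the class names, fields and the example URL are hypothetical and are not BOINC's database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FileInfo:
    physical_name: str        # unique and immutable within the project
    urls: Tuple[str, ...]     # one or more HTTP URLs
    sticky: bool = False      # example attribute from the slide


@dataclass
class FileRef:
    file: FileInfo
    logical_name: str         # the name the application opens


@dataclass
class Job:
    name: str
    inputs: List[FileRef] = field(default_factory=list)

    def name_map(self) -> Dict[str, str]:
        """Map each logical name to the physical file behind it."""
        return {ref.logical_name: ref.file.physical_name for ref in self.inputs}


# Two jobs open the same logical name "in" but read different physical files.
f1 = FileInfo("12ja04aa", ("http://example.org/download/12ja04aa",))
f2 = FileInfo("12ja04ab", ("http://example.org/download/12ja04ab",))
job_a = Job("wu_a", [FileRef(f1, "in")])
job_b = Job("wu_b", [FileRef(f2, "in")])
print(job_a.name_map(), job_b.name_map())
```

The application code stays the same across jobs because it only ever sees the logical name.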

What kinds of jobs can BOINC handle?
- Pretty much anything you'd run on a Grid
- Bag of tasks (but IPC support soon)
- Short/long jobs
- Data intensive, up to a point
- Geared towards:
  - few apps, many jobs (high startup cost per app)
  - jobs with high slack time

Part 3: Application development for BOINC

The BOINC runtime environment
[Figure: processes and files in the BOINC runtime environment]

Native BOINC applications
- boinc_init(): create runtime system thread
- boinc_finish(): write finish file
- boinc_resolve_filename(logical, physical)
- boinc_fraction_done(x)

Checkpointing
- bool boinc_time_to_checkpoint(): call when in a checkpointable state
- boinc_checkpoint_done()
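The checkpointing pattern these calls support can be sketched without the BOINC library. The code below is a Python stand-in, not the C/C++ API: the functions `load_checkpoint`, `save_checkpoint` and `run`, the JSON state file and the time-based trigger are all assumptions made for illustration. In a real application the BOINC client decides when it is time to checkpoint.

```python
import json
import os
import tempfile
import time


def load_checkpoint(path):
    """Return (next_step, total), or (0, 0) when no checkpoint exists."""
    try:
        with open(path) as f:
            state = json.load(f)
        return state["next_step"], state["total"]
    except FileNotFoundError:
        return 0, 0


def save_checkpoint(path, next_step, total):
    """Write a temp file and rename it, so an interruption never leaves
    a half-written checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_step": next_step, "total": total}, f)
    os.replace(tmp, path)


def run(path, n_steps, interval=60.0, stop_after=None):
    """Sum the integers below n_steps, resuming from a checkpoint.

    stop_after simulates the host interrupting the job after that many
    steps; in that case the function returns None.
    """
    step, total = load_checkpoint(path)
    last = time.monotonic()
    done_this_run = 0
    while step < n_steps:
        total += step
        step += 1
        done_this_run += 1
        # State is consistent here: the analogue of asking
        # boinc_time_to_checkpoint() and then calling boinc_checkpoint_done().
        if time.monotonic() - last >= interval:
            save_checkpoint(path, step, total)
            last = time.monotonic()
        # A real app would also report progress here: boinc_fraction_done().
        if stop_after is not None and done_this_run >= stop_after:
            return None
    return total


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "state.json")
    run(ckpt, 10, interval=0.0, stop_after=4)   # interrupted after 4 steps
    print("resumed result:", run(ckpt, 10, interval=0.0))
```

The second call picks up at step 4 rather than starting over, which is the property that matters on hosts that are only sporadically available.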

The BOINC wrapper
- Can be used for legacy apps
- XML input file lists sub-jobs
  - executable, input files
- What it does:
  - interfaces to BOINC client
  - copies files to/from slot directory
  - runs executables
  - does checkpointing at sub-job level

Building app versions
- Linux: gcc
- Windows: Visual Studio, MinGW (gcc)
- Mac OS X: Xcode

Multithread apps
- boinc_init_parallel()
- Allows suspend/resume of all threads
  - Unix: fork/exec
  - Windows: direct thread control

GPU app versions
- Develop for NVIDIA or ATI, with CUDA, CAL, OpenCL, etc. (BOINC supplies samples)
- Each version has a plan class
- For each plan class, supply a function that determines:
  - can the app run on this host? (hardware, driver version, etc.)
  - what resources will it use? (#CPUs, #GPUs, GPU RAM, etc.)
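The shape of a plan class function can be sketched as follows: it takes a host description and returns either nothing (the app version cannot run there) or the resources it would use. This is a Python illustration, not BOINC's server code; the `Host` and `Usage` types, the plan class names and every threshold are made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Host:
    n_cpus: int
    gpu_vendor: Optional[str]   # "nvidia", "ati", or None when there is no GPU
    gpu_ram_mb: int
    driver_version: int


@dataclass
class Usage:
    n_cpus: float
    n_gpus: float
    gpu_ram_mb: int


def plan_cuda(host: Host) -> Optional[Usage]:
    """Hypothetical 'cuda' plan class: NVIDIA GPU, recent driver, enough RAM."""
    min_driver = 19000      # made-up threshold
    min_gpu_ram_mb = 256    # made-up threshold
    if host.gpu_vendor != "nvidia":
        return None
    if host.driver_version < min_driver or host.gpu_ram_mb < min_gpu_ram_mb:
        return None
    return Usage(n_cpus=0.1, n_gpus=1.0, gpu_ram_mb=min_gpu_ram_mb)


def plan_multithread(host: Host) -> Optional[Usage]:
    """Hypothetical 'mt' plan class: needs 2+ cores and uses all of them."""
    if host.n_cpus < 2:
        return None
    return Usage(n_cpus=float(host.n_cpus), n_gpus=0.0, gpu_ram_mb=0)


PLAN_CLASSES: Dict[str, Callable[[Host], Optional[Usage]]] = {
    "cuda": plan_cuda,
    "mt": plan_multithread,
}


def usable_plan_classes(host: Host) -> Dict[str, Usage]:
    """Evaluate every plan class against one host."""
    usable = {}
    for name, plan in PLAN_CLASSES.items():
        usage = plan(host)
        if usage is not None:
            usable[name] = usage
    return usable


laptop = Host(n_cpus=2, gpu_vendor=None, gpu_ram_mb=0, driver_version=0)
gaming_pc = Host(n_cpus=4, gpu_vendor="nvidia", gpu_ram_mb=1024, driver_version=19500)
print(sorted(usable_plan_classes(laptop)))
print(sorted(usable_plan_classes(gaming_pc)))
```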

VM apps
- Develop apps on your favorite OS
- Create a VirtualBox VM image
- App version consists of:
  - VM wrapper (supplied by BOINC)
  - VM image
  - app executable

Part 4: Deploying a BOINC server

Hardware options
- Native Linux host
  - download/compile BOINC software
- BOINC server VM (VMware/Debian)
- BOINC Amazon EC2 image

Components of a project
- Master URL
- Name
- MySQL database
- Directory hierarchy
- A set of daemon processes and cron jobs

Processes
[Figure: clients contact the scheduler; the scheduler, feeder, work generator, validator, assimilator, transitioner, file deleter and DB purger are all connected to the MySQL DB.]

Project directory hierarchy
apps/            application files
bin/             daemon programs
cgi-bin/         BOINC scheduler and upload CGI
config.xml       configuration file
download/        downloadable files
html/            web site; master URL points here
keys/            keys for code signing, upload auth
log_(hostname)   daemon log files
project.xml      list of platforms and apps
upload/          uploaded files

BOINC database
Tables: platform, app, app_version, user, host, workunit, result, ...

Creating a project
make_project name creates:
- directory hierarchy
- DB
- mods for httpd.conf
- crontab entry

Project configuration and control
- config.xml
  - scheduling and other options
  - list of daemons
  - list of periodic tasks
- project control
  - bin/start: start daemons, enable scheduler
  - bin/stop: stop daemons, disable scheduler
  - bin/status

Scaling a BOINC server
- Components can run on different machines sharing a file system
- Each component can be distributed
- MySQL server is typically the bottleneck
- 1 server machine can issue ~100K jobs/day; 4 machines can issue > 1 million
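To put the dispatch figures in perspective, the arithmetic below converts jobs per day into a dispatch rate and into the number of cores such a stream can keep busy. The 6-hour job length is an assumption chosen for illustration; the slide does not state one.

```python
SECONDS_PER_DAY = 86_400


def dispatch_rate(jobs_per_day: int) -> float:
    """Average number of jobs the server must issue per second."""
    return jobs_per_day / SECONDS_PER_DAY


def cores_kept_busy(jobs_per_day: int, job_seconds: int) -> float:
    """Cores that stay fully occupied if every job runs for job_seconds."""
    return jobs_per_day * job_seconds / SECONDS_PER_DAY


JOB_SECONDS = 6 * 3600  # ASSUMPTION: each job takes 6 hours of CPU time

for label, jobs in [("1 server", 100_000), ("4 servers", 1_000_000)]:
    print(f"{label}: {dispatch_rate(jobs):.2f} jobs/s, "
          f"{cores_kept_busy(jobs, JOB_SECONDS):,.0f} cores busy")
```

Longer jobs stretch the same dispatch capacity over more hosts, which is one reason the earlier slide favors jobs with a high startup cost amortized over long runs.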

Part 5: Deploying applications

Adding an application
- Edit project.xml:
    <app>
        <name>multi_thread</name>
        <user_friendly_name>test multi-thread apps</user_friendly_name>
    </app>
- Run bin/xadd

Adding an application version
- Create application version directory:
    apps/
        uppercase/
            uppercase_6.14_windows_intelx86__cuda.exe/
                uppercase_6.14_windows_intelx86__cuda.exe
                graphics_app=uppercase_graphics_6.14_windows_intelx86.exe
                logo.jpg
                Helvetica.txf
- Sign files on offline computer
- Run bin/update_versions

Part 6: Submitting jobs

Describing job inputs
Input template file:
    <file_info>
        <number>0</number>
    </file_info>
    <workunit>
        <file_ref>
            <file_number>0</file_number>
            <open_name>in</open_name>
        </file_ref>
        <target_nresults>1</target_nresults>
        <min_quorum>1</min_quorum>
        <command_line>-cpu_time 60</command_line>
        <rsc_fpops_bound>446797000000000</rsc_fpops_bound>
        <rsc_fpops_est>279248000000000</rsc_fpops_est>
    </workunit>
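The two rsc_fpops values are floating-point operation counts: an estimate of the job's size and an upper bound on it. Dividing them by a host's speed gives a rough expected runtime and a runtime limit. The sketch below parses the template with Python's standard library; the 2 GFLOPS host speed is an assumption, and the real client's runtime estimate takes more factors into account.

```python
import xml.etree.ElementTree as ET

TEMPLATE = """
<file_info>
    <number>0</number>
</file_info>
<workunit>
    <file_ref>
        <file_number>0</file_number>
        <open_name>in</open_name>
    </file_ref>
    <target_nresults>1</target_nresults>
    <min_quorum>1</min_quorum>
    <command_line>-cpu_time 60</command_line>
    <rsc_fpops_bound>446797000000000</rsc_fpops_bound>
    <rsc_fpops_est>279248000000000</rsc_fpops_est>
</workunit>
"""


def parse_workunit(template: str) -> dict:
    """Extract the numeric job parameters from an input template."""
    # The template has two top-level elements, so wrap it in a dummy root.
    root = ET.fromstring(f"<root>{template}</root>")
    wu = root.find("workunit")
    fields = ("rsc_fpops_est", "rsc_fpops_bound", "min_quorum", "target_nresults")
    return {name: int(wu.findtext(name)) for name in fields}


HOST_FLOPS = 2_000_000_000  # ASSUMPTION: a host sustaining 2 GFLOPS

params = parse_workunit(TEMPLATE)
est_seconds = params["rsc_fpops_est"] / HOST_FLOPS
bound_seconds = params["rsc_fpops_bound"] / HOST_FLOPS
print(f"estimated runtime: {est_seconds / 3600:.1f} h")
print(f"runtime limit:     {bound_seconds / 3600:.1f} h")
```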

Describing job outputs
Output template file:
    <file_info>
        <name><outfile_0/></name>
        <generated_locally/>
        <upload_when_present/>
        <max_nbytes>5000000</max_nbytes>
        <url><upload_url/></url>
    </file_info>
    <result>
        <file_ref>
            <file_name><outfile_0/></file_name>
            <open_name>out</open_name>
        </file_ref>
    </result>

Submitting a job
- Stage input files:
    cp test_files/12ja04aa `bin/dir_hier_path 12ja04aa`
- Submit job:
    create_work -appname A -wu_name B -wu_template C -result_template D
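The staging step places each input file in a subdirectory chosen from a hash of its name, so no single directory grows too large. The sketch below shows the general idea only: the function `staged_path`, the use of MD5, the first-8-hex-digits rule and the fanout of 1024 are assumptions, and BOINC's own hash may differ. On a real project, always use bin/dir_hier_path to get the path.

```python
import hashlib
from pathlib import PurePosixPath


def staged_path(download_dir: str, filename: str, fanout: int = 1024) -> PurePosixPath:
    """Pick a subdirectory for filename from a hash of its name.

    Illustrative scheme only; not guaranteed to match bin/dir_hier_path.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % fanout
    return PurePosixPath(download_dir) / format(bucket, "x") / filename


path = staged_path("download", "12ja04aa")
print(path)
```

Because the subdirectory depends only on the file name, any server component can recompute where a file lives without consulting the database.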

Part 7: Organizational issues

Single-scientist projects
- Need to:
  - port apps
  - get publicity
  - interface with the public
  - maintain servers
- Not many research groups have the resources
- And it creates a lot of competing brands

Umbrella projects
- The project provides: publicity, web development, sysadmin, app porting
- Example: IBM World Community Grid

The Berkeley@home model
- A university has:
  - scientists
  - a powerful brand
  - PR resources
  - IT infrastructure
  - lots of alumni (UCB: 500,000)

Hubs
- nanoHUB: science portal for nanoscience
  - social network + app store
  - sharing of ideas, data, software
  - computational portal
- HUBzero: generalization to other areas
  - currently ~20 hubs
- Integration of BOINC with HUBzero
  - each hub has a volunteer computing project