Grid Computing M2 DL dacosta@irit.fr
A Brain is a Lot of Data! (Mark Ellisman, UCSD) And comparisons must be made among many. We need to get to one micron to know the location of every cell. We're just now starting to get to 10 microns.
Outline A very short introduction to Grids A brief introduction to parallelism A not so short introduction to Grids Gridification of a sequential program Some grid middleware and grid projects In-depth study of the Globus middleware
Grid concepts: an analogy. Electric power distribution: the electric network and high voltage.
Grid concepts. Computer power distribution: the Internet network and high performance (parallelism and distribution).
A Definition of Grid Computing. Ian Foster: "Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations." http://www-fp.mcs.anl.gov/~foster/ Current views: see for instance the survey from Heinz Stockinger in 2006: http://hst.web.cern.ch/hst/publications/definingthegrid-1.1.pdf Oldest: Corbató in 1965: http://www.multicians.org/fjcc3.html See http://en.wikipedia.org/wiki/grid_computing for more or alternate definitions
Concrete examples of Grids. Europe: EGEE www.eu-egee.org/ 240 institutions, 45 countries, 72,000 CPUs, 20 PB disk, 200,000 concurrent jobs. High Energy Physics, Bioinformatics, Astrophysics. Several projects in US labs. Japan's NAREGI, GridBus in Australia, projects in France, ...
Virtual Organization: CERN's Large Hadron Collider. 1800 physicists, 150 institutes, 32 countries. 30 PB of data yearly; 70,000 CPUs.
A Market or a niche? Only a scientist thing? Market: $12 billion by 2008 (IDC). Major vendors are on the hype: IBM, Sun, Intel, Microsoft, Oracle, HP, Hitachi, ... About 30 big companies involved in the Open Grid Forum: www.gridforum.org + a myriad of smaller companies + services/consulting companies (CS-SI, IBM, Atos, Accenture, Cap Gemini, ...). IN2P3, CEA, CNES, Météo France, Airbus, IFP, banks, Peugeot, FT, ... Simulations (physics, environment, finance), genetics, chemistry, games, data centers, ...
Where does all this come from???
Parallelism: an introduction. Grids date back only to 1996; parallelism is older! (first classification in 1972). Motivations: need more computing power (weather forecast, atomic simulation, genetics...), need more storage capacity (petabytes and more); in a word: improve performance! 3 ways... Work harder --> use faster hardware. Work smarter --> optimize algorithms. Get help --> use more computers!
Parallelism: the old classification. Flynn (1972): parallel architectures classified by the number of instructions (single/multiple) performed and the number of data items (single/multiple) processed at the same time. SISD: Single Instruction, Single Data. SIMD: Single Instruction, Multiple Data. MISD: Multiple Instructions, Single Data. MIMD: Multiple Instructions, Multiple Data.
SIMD architectures. Vector machines, in decline since 1997 (disappeared from the market place). Concept: the same instruction is performed by several CPUs (as many as 16,384) on different data; the data are processed in parallel.
MIMD architectures. Different instructions are performed in parallel on different data. Divide and conquer: many subtasks in parallel to shorten global execution time. Large heterogeneity of systems.
Another taxonomy, based on how memories and processors interconnect: SMP: Symmetric Multiprocessors; MPP: Massively Parallel Processors; Constellations; Clusters; Distributed systems.
Symmetric Multi-Processors (1/3). Small number of identical processors (2-64). Share-everything architecture: single memory (shared-memory architecture), single I/O, single OS, equal access to resources. [Diagram: CPUs linked by a network/bus to a single shared memory and disk]
Symmetric Multi-Processors (2/3). Pro: easy to program: only one address space to exchange data (but the programmer must take care of synchronization in memory accesses: critical sections).
Symmetric Multi-Processors (3/3). Cons: poor scalability: when the number of processors increases, the cost to transfer data becomes too high; more CPUs = more memory accesses over the network = more memory bandwidth needed! Direct processor-to-processor transfers (-> MPP). Different interconnection schemes (a full interconnect is impossible: it grows in O(n²) while the number of processors grows in O(n)): bus, crossbar, multistage crossbar, ...
Massively Parallel Processors (1/2). Several hundred nodes with a high-speed interconnection network/switch. A share-nothing architecture: each node owns its memory (distributed memory) and one or more processors, and each runs an OS copy. [Diagram: nodes, each pairing a CPU with its own memory, linked by the interconnection network]
Massively Parallel Processors (2/2). Pros: good scalability. Cons: communication between nodes is slower than in shared memory; improved interconnection schemes: hypercube, torus (2D or 3D), fat-tree, multistage crossbars. Harder to program: data and/or tasks have to be explicitly distributed to nodes; remote procedure calls (RPC, Java RMI); message passing between nodes (PVM, MPI), synchronous or asynchronous communications. DSM: Distributed Shared Memory: a virtual shared memory. Upgrade: processors and/or communication?
Constellations. A small number of processors (up to 16) clustered in SMP nodes (fast connection); the SMPs are connected through a less costly network with poorer performance. With DSM, memory may be addressed globally: each CPU has a global memory view, and memory and cache coherence is guaranteed (cc-NUMA). [Diagram: SMP nodes, each with several CPUs sharing a memory, joined by an interconnection network, plus peripherals]
Clusters. A collection of workstations (PCs for instance) interconnected through a high-speed network, acting as an MPP/DSM, with network RAM and software RAID (redundant storage, parallel I/O). Clusters = a specialized version of NOW: Network Of Workstations. Pros: low cost, standard components, take advantage of unused computing power.
Distributed systems. Interconnection of independent computers: each node runs its own OS; each node might be any of SMPs, MPPs, constellations, clusters, or individual computers. The heart of the Grid! "A distributed system is a collection of independent computers that appear to the users of the system as a single computer" (Distributed Operating Systems, A. Tanenbaum, Prentice Hall, 1994).
Where are we today (2016)? A source for efficient and up-to-date information: www.top500.org, the 500 best architectures! We are at 54 petaflops = 54,000 teraflops. 1 flops = 1 floating-point operation per second. 1 teraflops = 1,000 gigaflops = 1,000,000 megaflops = 10^12 flops = one thousand billion operations per second.
Today's best: comparison on a matrix maths test (Linpack): solve Ax = b.
NEC Earth Simulator. Single-stage crossbar: 2,700 km of cables. A MIMD with distributed memory. 700 TB disk space, 1.6 PB mass storage. Area: 4 tennis courts, 3 floors.
How does it grow?
in 1993 (14 years ago!): n°1: 59.7 GFlops; n°500: 0.4 GFlops; sum = 1.17 TFlops
in 2007 (yesterday?): n°1: 280 TFlops (x4666); n°500: 4 TFlops (x10000); sum = 4,920 TFlops
in Nov 2012 (a few days ago): n°1: 27,100 TFlops (x100); n°500: 132 TFlops (x33)
MPP = Massively Parallel Processing
Problems of parallelism. Two models of parallelism: driven by data flow (how to distribute the data?) and driven by control flow (how to distribute the tasks?). Scheduling: which task to execute, on which data, when? How to ensure the best use of compute time (overlap communication and computation)? Communication: using shared memory? using explicit node-to-node communication? what about the network? Concurrent access: to memory (in shared-memory systems), to input/output (parallel I/O).
The performance? Ideally grows linearly. Speed-up: if TS is the best time to treat a problem sequentially, the time should be TP = TS/P with P processors! Speedup = TS/TP. Limited (Amdahl's law): any program has a sequential and a parallel part, TS = F + T//, thus the speedup is bounded: S = (F + T//) / (F + T///P) < 1/F (taking TS = 1, so F is the sequential fraction). Scale-up: if TPS is the time to treat a problem of size S with P processors, then TPS should also be the time to treat a problem of size n*S with n*P processors.
Network performance analysis. Scalability: can the network be extended? (limited wire length, physical problems). Fault tolerance: what if one node is down, for instance in a hypercube? Multiple access to the medium? Deadlock? The metrics: latency (time to establish a connection), bandwidth (measured in MB/s).
Tools/environments for parallelism (1/2). Communication between nodes: by global memory (if possible, plain or virtual)! Otherwise: low-level communication: sockets: s = socket(AF_INET, SOCK_STREAM, 0); mid-level communication libraries (PVM, MPI): info = pvm_initsend(PvmDataDefault); info = pvm_pkint(array, 10, 1); info = pvm_send(tid, 3); remote service/object calls (RPC, RMI, CORBA): the service runs on a distant node; only its name and parameters (in, out) have to be known.
Tools/environments for parallelism (2/2). Programming tools: threads (lightweight processes); data-parallel languages (for distributed-memory architectures): HPF (High Performance Fortran): say how data (arrays) are placed and the system infers the best placement of computation (to minimize total execution time, e.g. further communications); task-parallel languages (for shared-memory architectures): OpenMP: compiler directives and library routines, based on threads. The parallel program is close to the sequential one; it is a step-by-step transform: parallel loop directives (PARALLEL DO), task-parallel constructs (PARALLEL SECTIONS), PRIVATE and SHARED data declarations.
Bibliography / Webography.C Fox, R.D William and P.C Messina, "Parallel Computing Works!" Morgan Kaufmann publisher, 1994, ISBN 1-55860-253-4 M. Cosnard and D Trystram, "Parallel Algorithms and Architectures" Thomson Learning publisher, 1994, ISBN 1-85032-125-6 M. Gengler, S. Ubéda and F. Desprez, "Initiation au parallélisme : concepts, architectures et algorithmes" Masson, 1995, ISBN 2-225-85014-3 Parallelism: www.ens-lyon.fr/~desprez/schedule/tutorials.html www.buyya.com/cluster Grids : www.lri.fr/~fci/hammamet/cosnard-hammamet-9-4-02.ppt TOP 500 : www.top500.org OpenMP : www.openmp.org PVM:www.csm.ornl.gov/pvm HPF : www.crpc.rice.edu/hpff 37
Computational grid: HW and SW infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities. Performance criteria: security, reliability, computing power, latency, services, throughput.
Grid Definition Refined: uses open protocols, is decentralized, delivers non-trivial QoS.
Levels of cooperation:
End system (computer, disk, sensor...): multithreading, local I/O
Cluster (heterogeneous): synchronous communications, DSM, parallel I/O, parallel processing
Intranet: heterogeneity, distributed admin, distributed FS and databases, low supervision, resource discovery, high throughput
Internet: no control, collaborative systems, (international) WAN, brokers, negotiation
Grid Characteristics Large Scale Heterogeneity Multiple Domains of Administration Autonomy but coordination Dynamicity Flexibility Extensibility Security
Basic services Authentication/Authorization/Traceability Activity control (monitoring) Resource information Resource brokering Scheduling Job submission, data access/migration and execution Accounting
Layered Grid Architecture (by analogy to the Internet architecture):
Collective layer: coordinating multiple resources: ubiquitous infrastructure services, app-specific distributed services
Resource layer: sharing single resources: negotiating access, controlling use
Connectivity layer: talking to things: communication (Internet protocols) & security
Fabric layer: controlling things locally: access to, & control of, resources
(Internet Protocol Architecture counterparts: Application, Transport, Internet, Link.) From I. Foster
Elements of the Problem (from I. Foster):
Resource sharing: computers, storage, sensors, networks, ...; heterogeneity of device, mechanism, policy; sharing conditional: negotiation, payment, ...
Coordinated problem solving: integration of distributed resources; compound quality-of-service requirements
Dynamic, multi-institutional virtual orgs: dynamic overlays on classic org structures; map to underlying control mechanisms
Aspects of the Problem (from I. Foster). Need for interoperability when different groups want to share resources: diverse components, policies, mechanisms; e.g., standard notions of identity, means of communication, resource descriptions. Need for shared infrastructure services to avoid repeated development and installation: e.g., one port/service/protocol for remote access to computing, not one per tool/application; e.g., Certificate Authorities: expensive to run. A common need for protocols & services.
Resources Description Advertising Cataloging Matching Claiming Reserving Checkpointing
Resource layers:
Application layer
Application resource management layer: resource matching, global brokering
Owner layer: intertask resource management, execution environment
System layer: tasks, resource requests; owner policy: who may use what
End-resource layer: end-resource policy (e.g. the OS)
Resource management (1). Services and protocols depend on the infrastructure. Some parameters: stability of the infrastructure (same set of resources or not), freshness of the resource-availability information, reservation facilities, multiple-resource or single-resource brokering. Example request: "I need from 10 to 100 CEs, each with at least 128 MB RAM and a computing power of 50 MIPS".
Resource management and scheduling (1). Levels of scheduling:
job scheduling (global level; perf: throughput)
resource scheduling (perf: fairness, utilization)
application scheduling (perf: response time, speedup, produced data...)
Mapping/scheduling:
resource discovery and selection
assignment of tasks to computing resources
data distribution
task scheduling on the computing resources
(communication scheduling)
Individual performances are not necessarily consistent with the global (system) performance!
Resource management and scheduling (2). Grid problems: predictions are not definitive: dynamicity! Heterogeneous platforms. Checkpointing and migration.
A Resource Management System example (Globus). [Diagram: the application submits RSL; a broker specializes the RSL using queries to the Information Service; a co-allocator splits the ground RSL into simple ground RSL handed to GRAM local resource managers, which drive local schedulers such as LSF, Condor and NQE]
Resource information (1). What is to be stored? Organization, people, computing resources, software packages, communication resources, event producers, devices... and what about data??? A key issue in such dynamic environments. A first approach: a (distributed) directory (LDAP): easy to use, tree structure, distribution; but static, mostly read (inefficient updates), hierarchical, poor procedural language.
Resource information (2). But: dynamicity, complex relationships, frequent updates, complex queries. A second approach: a (relational) database.
Programming the grid: potential programming models. Message passing (PVM, MPI); Distributed Shared Memory; data parallelism (HPF, HPC++); task parallelism (Condor); client/server - RPC; agents; integration systems (CORBA, DCOM, RMI).
Program execution: issues. Parallelize the program with the right job structure, communication patterns/procedures, algorithms. Discover the available resources. Select the suitable resources. Allocate or reserve these resources. Migrate the data (or the code). Initiate computations. Monitor the executions; checkpoints? React to changes. Collect results.
Data management. It was long forgotten, though it is a key issue! Issues: indexing, retrieval, replication, caching, traceability (auditing). And security!!!
From computing grids to information grids
From computing grids to information grids (1). Grids long lacked most of the tools mandatory to share (index, search, access), analyze, secure and monitor semantic data (information). Several reasons: history, money, difficulty. Why is it so difficult? Sensitivity but openness. Multiple administrative domains and multiple actors, heterogeneity, but a single global architecture/view/system. Dynamicity and unpredictability but robustness. Wideness but high performance.
From computing grids to information grids (2): e.g. the Replica Management Problem. Maintain a mapping between logical names for files and collections and one or more physical locations. Decide where and when a piece of data must be replicated. Important for many applications. Example: CERN high-level trigger data: multiple petabytes of data per year; a copy of everything at CERN (Tier 0); subsets at national centers (Tier 1); smaller regional centers (Tier 2); individual researchers have copies of pieces of data. Much more complex with sensitive and complex data like medical data!!!
From computing grids to information grids (3): some (still!) open issues. Security, security, security (incl. privacy, monitoring, traceability...) at a semantic level. Access protocols (incl. replication, caching, migration...). Indexing tools. Brokering of data (incl. accounting). (Content-based) query optimization and execution. Mediation of data. Data integration, data warehousing and analysis tools. Knowledge discovery and data mining.
Functional View of Grid Data Management. [Diagram:
Application -> Planner: data location, replica selection, selection of compute and storage nodes
Metadata Service: location based on data attributes
Replica Location Service: location of one or more physical replicas
Information Services: state of grid resources, performance measurements and predictions
Executor: initiates data transfers and computations (data movement, data access)
crosscutting: Security and Policy; underneath: Compute Resources, Storage Resources]
Security: why grid security is hard. Resources being used may be extremely valuable & the problems being solved extremely sensitive. Resources are often located in distinct administrative domains. Users may be different. The set of resources used by a single computation may be large, dynamic, and/or unpredictable. Each resource may have its own policies & procedures. Not just client/server. The security service must be broadly available & applicable: standard, well-tested, well-understood protocols; integration with a wide variety of tools.
Grid Security: various views.
User view: 1) easy to use; 2) single sign-on; 3) run applications (ftp, ssh, MPI, Condor, Web, ...); 4) user-based trust model; 5) proxies, delegation.
Resource owner view: 1) specify local access control; 2) auditing, accounting, etc.; 3) integration with the local system (Kerberos, AFS, license manager); 4) protection from compromised resources.
Developer view: API/SDK with authentication, flexible message protection, flexible communication, delegation, ...; direct calls to various security functions (e.g. GSS-API), or security integrated into higher-level SDKs (e.g. GlobusIO, Condor).
Grid security: requirements. Authentication; authorization and delegation of authority; assurance; accounting; auditing and monitoring; traceability; integrity and confidentiality.
Query optimization and execution. Old wine in new bottles? Yes and no: it seems the problem has not changed, but the operational context has changed so much that classical heuristics and methods are no longer pertinent. Key issues: dynamicity, unpredictability, adaptability. Very few works have specifically addressed this problem. Use mobile agents?
Service Oriented Architecture. Open Grid Service Architecture. WSRF: Web Service Resource Framework: everything is a resource (WS-Resources); access is through services.
Globus Tutorial
Bibliography. The Grid 2: Blueprint for a New Computing Infrastructure, Ian Foster and Carl Kesselman. Grid Computing: The Savvy Manager's Guide, Pawel Plaszczak and Richard Wellner Jr.