Component-based Grid Programming Using the HOC-Service Architecture
Sergei Gorlatch
University of Münster, Germany
PARALLEL AND DISTRIBUTED COMPUTING: TOWARD GRIDS
Demand growing: Grand Challenges
  Scientific computing (numeric simulations in climate research, aircraft construction, biology, etc.)
  Distributed and cooperative Internet applications (games, e-learning, e-commerce)
Conditions improving:
  Processors and networks are more powerful and cheaper
  Parallelism on multiple levels:
    Inside a processor: pipelining, look-ahead, etc.
    Across multiple processors of PCs and parallel computers
    Across multiple computers (Internet, computational grids)
Software advancing?
  + Standards (MPI, Java+RMI, etc.)
  + Portability across platforms
  - Low-level, cumbersome, error-prone
PROGRAMMING COMPUTATIONAL GRIDS
[Figure: a service request is sent to high-performance grid hosts, which return the result]
Compute-intensive applications are distributed among multiple high-performance computers connected via the Internet.
Application programming is supported by so-called grid middleware (recent standard: Globus Toolkit).
PROGRAMMING: FROM HPC TO GRIDS
The low-level programming model remains a challenge in the HPC world (e.g., MPI with explicit send-recv):
  Large numbers of processors are difficult to manage
Grids connect several HPC systems, which makes the application programmer's task even more challenging
Grid-specific tasks are added:
  heterogeneous processing nodes
  heterogeneous, dynamic interconnects
Grid middleware deals with these tasks:
  Grid Services extend Web Services to become stateful and transient
  Web Services allow for remote procedure calls via the Internet (passing proxies and firewalls), using SOAP via HTTP
PROGRAMMING GRIDS WITH MODERN MIDDLEWARE
[Figure labels: Application Client (request service, deliver result, SOAP/HTTP); Grid Servers hosting the Grid Middleware; virtually stateful Web Service; Resources (interact); Resource Home (manage resources, obtain endpoints)]
Middleware should take care of resources.
What should the programmer take care of?
PROGRAMMER'S TASKS IN GLOBUS
Configure services and resources using an extended WSDL format
Map service namespaces to implementation packages
Implement services, resources and "homes" (factories)
Write the WSDD deployment configuration
Deploy the Grid Application Archive (JNDI)
[Figure labels: Developer; Service Interfaces (WSDL); Deployment Descriptor (WSDD); Implementation (GT 4 code); GAR = Grid Application Archive; Globus (OGSA, WSN, WSRF, GT 4)]
AN EXAMPLE WSDL-INTERFACE FOR THE GLOBUS WSRF
<?xml version="1.0" encoding="utf-8"?>
<wsdl:definitions name="masterservice"
    targetNamespace="http://org.gridhocs/master"
    xmlns:wsdlpp="http://www.globus.org/namespaces/2004/10/wsdlpreprocessor"
    ... <!-- more namespace declarations -->
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <wsdl:import location="../../wsrf/properties/ws-resourceproperties.wsdl"/>
  ... <!-- more WSR-import statements -->
  <wsdl:types>
    <schema targetNamespace="http://org.gridhocs/master" xmlns="http://www.w3.org/2001/xml
      <import namespace="http://schemas.xmlsoap.org/soap/encoding/"/>
      <complexType name="arrayof_xsd_double">
        <complexContent>
          <restriction base="soapenc:Array">
            <attribute ref="soapenc:arrayType" wsdl:arrayType="xsd:double[]"/>
          </restriction>
        </complexContent>
      </complexType>
      ... <!-- more parameter type and -->
      </element> <!-- resource property declarations -->
    </schema>
  </wsdl:types>
  <wsdl:message name="configurerequest">
    <wsdl:part name="in0" type="impl:arrayof_xsd_string"/>
  </wsdl:message>
  <wsdl:message name="configureresponse">
    <part name="parameters" element="impl:void"/>
  </wsdl:message>
  <!-- more message declarations -->
  <wsdl:portType name="masterporttype"
      wsdlpp:extends="wsrpw:GetResourceProperty wsrlw:ImmediateResourceTermination"
      wsrp:ResourceProperties="tns:masterresourceproperties">
    <wsdl:operation name="configure" parameterOrder="in0">
      <wsdl:input message="impl:configurerequest"/>
      <wsdl:output message="impl:configureresponse"/>
    </wsdl:operation>
    ... <!-- more operation declarations -->
  </wsdl:portType>
</wsdl:definitions>
APPROACH: GRID PROGRAMMING WITH HOCS
HOC = Higher-Order Component: captures a typical pattern of parallel/distributed behaviour, with application-specific codes as parameters
[Figure: the Grid expert maintains a component repository (Map HOC, Farm HOC, Pipeline HOC, DWT HOC, Divide & Conquer HOC); the application developer selects a HOC and writes, in Java or a script language, the application-specific code, e.g., the worker parameter for an image-processing farm, which is passed as a mobile code parameter]
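To make the idea of "application-specific codes as parameters" concrete, here is a minimal local sketch of a Divide & Conquer HOC in Java. It is not the HOC-SA API: the class name `DivideConquerHoc`, the method `evaluate` and the four parameter names are hypothetical, and the recursion runs sequentially in one JVM, whereas a grid implementation would presumably distribute the subproblems.

```java
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical local sketch of a Divide & Conquer HOC; not the HOC-SA API.
final class DivideConquerHoc {

    // The four functional arguments are the application-specific code
    // parameters; the recursion scheme is the generic, reusable component.
    static <P, R> R evaluate(P problem,
                             Predicate<P> isTrivial,
                             Function<P, R> solveDirectly,
                             Function<P, List<P>> divide,
                             BinaryOperator<R> combine) {
        if (isTrivial.test(problem)) {
            return solveDirectly.apply(problem);
        }
        // A grid implementation could ship each subproblem to a remote host;
        // this sketch simply recurses sequentially.
        R result = null;
        for (P sub : divide.apply(problem)) {
            R partial = evaluate(sub, isTrivial, solveDirectly, divide, combine);
            result = (result == null) ? partial : combine.apply(result, partial);
        }
        return result;
    }

    // Example code parameters: sum a list of integers by halving it.
    static int sum(List<Integer> numbers) {
        return DivideConquerHoc.<List<Integer>, Integer>evaluate(
            numbers,
            list -> list.size() <= 1,
            list -> list.isEmpty() ? 0 : list.get(0),
            list -> List.of(list.subList(0, list.size() / 2),
                            list.subList(list.size() / 2, list.size())),
            Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(List.of(3, 1, 4, 1, 5, 9)));
    }
}
```

The application developer only writes the four small lambdas in `sum`; everything in `evaluate` would be the Grid expert's responsibility.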
HOCS: ROLE DISTRIBUTION
Components are developed by Grid experts capable of writing efficient parallel code for the target machines
The component repository is packaged with the necessary configuration files (WSDL, WSDD) in a GAR file and deployed remotely
[Figure: the Grid expert provides the repository of HOCs (Map HOC, Farm HOC, Pipeline HOC, Scan HOC, Divide & Conquer HOC), stored in a GAR that contains GWSDL, GT3 code and GWSDD; the application developer selects a HOC and writes the mobile code parameter in Java]
GRID PROGRAMMING USING HOCS
Example: For a farm application, the application programmer provides two parameters: Master and Worker
A distributed implementation, including all needed configuration files, is provided as Grid Services by the component developer.
[Figure: the application developer supplies the Master parameter and the Worker parameter; the Grid expert supplies the farm service implementation (GT 3 code), the service interfaces (GWSDL) and the deployment descriptor (GWSDD)]
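The division of labour for the farm can be sketched in plain Java. This is an illustration under stated assumptions, not the HOC-SA interfaces: the names `FarmSketch`, `Master`, `Worker`, `split`, `join`, `compute` and `evaluate` are hypothetical, and a local thread pool stands in for the remote worker services.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: a thread pool stands in for the remote worker services.
final class FarmSketch {

    // Master code parameter: splits the input and joins the partial results.
    interface Master<I, O, T, U> {
        List<T> split(I input, int parts);
        O join(List<U> partialResults);
    }

    // Worker code parameter: processes one independent task.
    interface Worker<T, U> {
        U compute(T task);
    }

    // Generic farm: the part a component developer would provide once.
    static <I, O, T, U> O evaluate(Master<I, O, T, U> master, Worker<T, U> worker,
                                   I input, int numWorkers) {
        ExecutorService pool = Executors.newFixedThreadPool(numWorkers);
        try {
            List<Future<U>> pending = new ArrayList<>();
            for (T task : master.split(input, numWorkers)) {
                Callable<U> job = () -> worker.compute(task);
                pending.add(pool.submit(job));
            }
            List<U> partial = new ArrayList<>();
            for (Future<U> f : pending) {
                partial.add(f.get());   // collect results in task order
            }
            return master.join(partial);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("farm evaluation failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Application-specific parameters: sum of squares over an int array.
    static long sumOfSquares(int[] data, int numWorkers) {
        Master<int[], Long, int[], Long> master = new Master<int[], Long, int[], Long>() {
            @Override public List<int[]> split(int[] input, int parts) {
                List<int[]> chunks = new ArrayList<>();
                int size = (input.length + parts - 1) / parts;
                for (int from = 0; from < input.length; from += size) {
                    chunks.add(Arrays.copyOfRange(input, from, Math.min(from + size, input.length)));
                }
                return chunks;
            }
            @Override public Long join(List<Long> partialResults) {
                long total = 0;
                for (long p : partialResults) total += p;
                return total;
            }
        };
        Worker<int[], Long> worker = chunk -> {
            long s = 0;
            for (int v : chunk) s += (long) v * v;
            return s;
        };
        return evaluate(master, worker, data, numWorkers);
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(new int[] {1, 2, 3, 4}, 2));
    }
}
```

Only `sumOfSquares` contains application code; `evaluate` could be replaced by a distributed implementation without touching the two parameters.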
AN EXAMPLE: FARM-HOC EVALUATION
➀ client obtains Farm Service instance, ➁ Farm Service obtains worker services ➂, distributes calculations ➃ and reassembles result ➄
[Figure: the client, the Farm HOC Factory creating 1) the Farm Service, and Worker Factories creating 2) the Worker Services, annotated with steps 1 to 5]
HOC-SA: SERVICE ARCHITECTURE FOR HOCS
A framework of HOC implementations in the form of grid-enabled Web services (incl. the configuration required for their deployment onto the established grid middleware)
Programming by selection, composition and parametrization
Java/C++ code parameter:
  provided by the client
  application-specific code only
Server-side implementation:
  distributed, parallel algorithm
  efficient and architecture-tuned
  generic, i.e., not application-specific
[Figure: Farm HOC with 1) the Farm Service and 2) the Worker Services created by Worker Factories]
THE CODE SERVICE & CLASS LOADER OF THE HOC-SA
[Figure: the local code of the grid client calls HOC1(Aid, Bid); the code for A and the code for B are held by the code service on server X; on server Y the service container hosts HOC1 and HOC2, and for the instantiation of HOC1 the remote class loader obtains the code units (A, B, C) from the code service]
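The interplay of code service and remote class loader can be approximated with standard Java class loading. The sketch below is an assumption-laden illustration, not the HOC-SA implementation: `CodeService`, `upload`, `download` and `RemoteClassLoader` are hypothetical names, the code service is an in-memory map rather than a remote Web service, and the class name doubles as the code ID.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; names are illustrative, not the HOC-SA API.
final class CodeLoadingSketch {

    // Stand-in for the code service: maps a code ID to uploaded bytecode.
    static final class CodeService {
        private final Map<String, byte[]> store = new ConcurrentHashMap<>();

        void upload(String codeId, byte[] bytecode) {
            store.put(codeId, bytecode.clone());
        }

        byte[] download(String codeId) {
            return store.get(codeId);   // null if nothing was uploaded
        }
    }

    // Class loader that resolves classes unknown to its parent through the
    // code service. Assumption: the class name is used as the code ID.
    static final class RemoteClassLoader extends ClassLoader {
        private final CodeService codeService;

        RemoteClassLoader(CodeService codeService, ClassLoader parent) {
            super(parent);
            this.codeService = codeService;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            byte[] bytecode = codeService.download(name);
            if (bytecode == null) {
                throw new ClassNotFoundException(name);
            }
            // Turn the transferred bytes into a class inside the container.
            return defineClass(name, bytecode, 0, bytecode.length);
        }
    }
}
```

Because `ClassLoader.loadClass` delegates to the parent first, classes already available on the server are never fetched from the code service; only the client's code parameters are.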
USING THE HOC-SA FOR IMAGE FILTERING
[Figure: the application client sends the image to the master host and retrieves the result; the master host splits the image into overlapping subimages and recombines the results; the worker hosts process the subimages]
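Why the subimages must overlap can be seen in a small Java sketch. It assumes a 3x3 mean filter on a grayscale image stored as `int[][]` (the slides do not name the filter), and the names `OverlapFilterSketch`, `blur` and `blurInStrips` are hypothetical. Each strip carries one extra row above and below, so that filtering strip by strip gives the same pixels as filtering the whole image.

```java
import java.util.Arrays;

// Hypothetical sketch of splitting an image into overlapping strips.
final class OverlapFilterSketch {

    // Worker side: 3x3 mean filter; border pixels average only the
    // neighbours that exist.
    static int[][] blur(int[][] img) {
        int h = img.length;
        int w = (h == 0) ? 0 : img[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sum = 0, n = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int yy = y + dy, xx = x + dx;
                        if (yy >= 0 && yy < h && xx >= 0 && xx < w) {
                            sum += img[yy][xx];
                            n++;
                        }
                    }
                }
                out[y][x] = sum / n;
            }
        }
        return out;
    }

    // Master side: split into strips with a one-row overlap, filter each
    // strip (sequentially here, on worker hosts in a grid), recombine.
    static int[][] blurInStrips(int[][] img, int strips) {
        int h = img.length;
        int[][] result = new int[h][];
        int rowsPerStrip = (h + strips - 1) / strips;
        for (int start = 0; start < h; start += rowsPerStrip) {
            int end = Math.min(start + rowsPerStrip, h);
            int lo = Math.max(start - 1, 0);      // overlap row above
            int hi = Math.min(end + 1, h);        // overlap row below
            int[][] filtered = blur(Arrays.copyOfRange(img, lo, hi));
            for (int y = start; y < end; y++) {
                result[y] = filtered[y - lo];     // drop the overlap rows
            }
        }
        return result;
    }
}
```

Without the overlap rows, pixels on strip boundaries would be averaged over too few neighbours and visible seams would appear in the recombined image.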
APIS AND DEVELOPER ROLES IN THE HOC-SA
[Figure: the HOC-SA comprises a component repository (Farm HOC, Reduce HOC, Divide & Conquer HOC) and a component framework (interfaces, service definitions, configuration; client API and portal; service API, class loader and code service). The application programmer uses the client API to request a service, send parameters and get the result; the component developer derives HOCs from the service API and deploys them]
CASE STUDY: COMPUTING JULIA SETS WITH THE HOC-SA
Calculating fractal images is a compute-intensive task
The procedure can be applied to multiple independent tiles: straightforward parallelization is possible
The dynamic process has varying time costs
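A minimal Java sketch of the per-tile computation shows both properties: tiles are independent, and their cost varies. The sketch assumes the standard escape-time iteration z -> z^2 + c over the square [-2, 2] x [-2, 2]; the slides do not give the exact formula or parameters used in the case study, and the names `JuliaTileSketch`, `escapeTime`, `computeTile` and `cost` are hypothetical.

```java
// Hypothetical sketch of a Julia-set worker; parameters are assumptions.
final class JuliaTileSketch {

    // Number of iterations of z -> z^2 + c until |z| > 2, capped at maxIter.
    static int escapeTime(double zr, double zi, double cr, double ci, int maxIter) {
        int i = 0;
        while (i < maxIter && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2 * zr * zi + ci;
            zr = t;
            i++;
        }
        return i;
    }

    // Worker: compute one tile of a size x size image covering [-2,2] x [-2,2].
    // Tiles share no data, so each can run on a different host.
    static int[][] computeTile(int x0, int y0, int tileW, int tileH,
                               int size, double cr, double ci, int maxIter) {
        int[][] tile = new int[tileH][tileW];
        for (int y = 0; y < tileH; y++) {
            for (int x = 0; x < tileW; x++) {
                double zr = -2.0 + 4.0 * (x0 + x) / size;
                double zi = -2.0 + 4.0 * (y0 + y) / size;
                tile[y][x] = escapeTime(zr, zi, cr, ci, maxIter);
            }
        }
        return tile;
    }

    // Total iteration count of a tile, a rough proxy for its compute time.
    static long cost(int[][] tile) {
        long total = 0;
        for (int[] row : tile) {
            for (int v : row) total += v;
        }
        return total;
    }
}
```

Points far from the set escape after a few iterations while points inside it run to `maxIter`, so a tile near the image corner is much cheaper than one near the centre; this is the varying time cost that makes static load balancing difficult.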
MEASURED EXECUTION TIMES
For all experiments, the distance between the server running the master code unit and the servers running the workers was ca. 500 kilometers.

Remote servers   Processors    Execution time
1                4             198.212 sec
2                4 + 8         128.165 sec
3                4 + 8 + 12    48.377 sec

Observations from the experiments:
  The sequential time for a local evaluation was more than 1000 seconds (more than five times higher than using a remote server with 4 processors)
  Transferring the result via SOAP takes much time (about 60 sec), due to the complexity of the SOAP encoding
PERFORMANCE PREDICTION FOR HOCS
[Figure: the client sends data and code parameters over the Internet to an application composed of HOC1 and HOC2; the bytecode of the code parameters (add, mul, push, pop, call, ...) is analysed and combined with server characteristics for runtime prediction and scheduling; after execution the result is returned]
A SCALABILITY MODEL FOR COMPUTER GAMES
Real-time games (player positions etc.) are updated step by step:
  states $\dots, S_{i-2}, S_{i-1}, S_i, S_{i+1}, S_{i+2}, \dots$ with transitions $\varphi_{S_{i-3},S_{i-2}}, \varphi_{S_{i-2},S_{i-1}}, \varphi_{S_{i-1},S_i}, \varphi_{S_i,S_{i+1}}$
Analytical model for the maximum number of clients
[Figure: two plots. Supported clients versus number of proxy servers l (measured cs, measured proxy, estimated). Bandwidth in kbytes/sec versus number of clients n for l = 1, 2, 5, 10: D_prxy(l,n,0), estimated and measured]
EXAMPLE: COMPUTATION OVERLOAD
EXAMPLE: COMPUTATION WITHIN THE TIME LIMIT
CONCLUSION
The HOC-Service Architecture (HOC-SA) simplifies grid application development considerably
Many recurring patterns of parallel computation can be implemented as HOCs (Pipes, Divide & Conquer, ...)
HOCs allow for communication across the boundaries of heterogeneous hardware and software in a grid-aware manner, using standards like Web services and Globus/WSRF
The abstraction offered by HOCs does not cause a significant performance loss
The higher-order programming model allows for accurate performance prediction