Derivation and Verification of Parallel Components for the Needs of an HPC Cloud


XVII Brazilian Symposium on Formal Methods, in: III Brazilian Conference on Software: Theory and Practice (CBSOFT'2013)
Derivation and Verification of Parallel Components for the Needs of an HPC Cloud
Thiago Braga Marcilon, Francisco Heron de Carvalho Junior
ParGO Research Group, Pós-Graduação em Ciência da Computação (MDCC/UFC), Universidade Federal do Ceará, Fortaleza/CE, Brazil

Topics
- Context and motivations (HPC Storm);
- Goals of this study;
- The system of formal contracts of HPC Storm;
- Contract-based formal derivation process;
- Case studies on derivation of parallel code using Circus/HCL;
- Conclusions.

Context and Motivations: HPC Storm
HPC Storm is HPC in clouds through components.
[Diagram labels: HPC (applications in computational sciences and engineering); components (CBHPC: CCA, Hash, Fractal, GCM); clouds (HPC in clouds, CBSE in clouds); HPC Storm.]

Context and Motivations: HPC Storm
Services:
- IaaS (infrastructure), comprising parallel computing platforms;
- PaaS (platform), for developing components and applications that may exploit the potential performance of these parallel computing platforms;
- SaaS (software): applications, built from components, that serve HPC users.
Stakeholders:
- Domain specialists (final users), who use applications;
- Application providers, who build applications;
- Component developers, who build components;
- Platform maintainers, who manage the infrastructure.
Architecture: Front-End / Core / Back-End.
- Front-End (SaaS): applications, built from components;
- Core (PaaS): components;
- Back-End (IaaS): includes the parallel computing platforms.

Context and Motivations: Hash Component Model
- A model of parallel components;
- Targeted at distributed-memory parallel computing platforms;
- Units + overlapping composition + component kinds;
- Makes it possible to isolate parallelism concerns inside components.
Hash Programming Environment (HPE), http://hash-programming-environment.googlecode.com
- Reference implementation of Hash, for cluster computing platforms.
Hash Type System (HTS)
- Discovery and binding of components according to assumptions about the execution context (application + target parallel computing platform);
- Contextual abstraction, abstract components and instantiation types;
- Which component is the best for a given context (instantiation type)?

HPC Storm: The System of Formal Contracts
- Each application orchestrates a set of available components to find (computational) solutions for problems in a specific domain of interest;
- Components may be built from the (overlapping) composition of other components, which defines dependencies (Hash);
- A component dependency is described by a contract.
The contract of a software component specifies:
- The assumptions of the component about the execution environment (context = application + parallel computing platform) [HTS extension];
- The computational task performed by the component [the problem addressed in this paper];
- A specification of how the component performs its task [the problem addressed in this paper];
- The guarantees of the component about its performance (QoS).
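The four parts of a contract can be pictured as a plain record. The following is only a minimal sketch: the class `Contract`, its field names, the function `satisfies` and the example values are hypothetical and are not taken from HPE or HPC Storm. It treats the contextual assumptions as key/value pairs and checks them against a concrete execution context.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    """Hypothetical record holding the four parts of a component contract."""
    component: str        # name of the abstract component
    assumptions: dict     # context: application + parallel computing platform
    task: str             # what the component computes
    specification: str    # how it computes it (e.g. a Circus/HCL text)
    guarantees: dict = field(default_factory=dict)  # performance (QoS)


def satisfies(context: dict, contract: Contract) -> bool:
    """True when every assumption of the contract holds in the given context."""
    return all(context.get(key) == value
               for key, value in contract.assumptions.items())


# Illustrative values only; a real contract would carry a formal specification.
is_contract = Contract(
    component="IntegerSort",
    assumptions={"platform": "cluster", "memory": "distributed"},
    task="sort a sequence of integer keys",
    specification="<Circus/HCL specification goes here>",
)
```

A context that offers more than the contract assumes (for instance an extra `nodes` entry) still satisfies it; a context that lacks or contradicts an assumption does not.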

HPC Storm: The System of Formal Contracts
[Diagram: an abstract component and its contract. The implementation assumptions (contextual abstraction) include platform assumptions and performance requirements. The contract covers Functionality (what it does), Algorithms (how it does it) and Behaviour (when it does it); the algorithms are influenced by the implementation assumptions. A second component, labelled "component #2", also appears.]

HPC Storm: The System of Formal Contracts
[Diagram: Front-End (application); Core (component catalog), holding the contracts of components; Back-End, with platform 1, platform 2 and platform 3.]

Context and Motivations: HPC Storm Component Certification
- Specialists demand correctness and performance obligations on component orchestrations from providers;
- This avoids the high costs of unexpected errors and performance bottlenecks in long-running intensive computations;
- In turn, providers demand correctness and performance obligations on component implementations from developers;
- How to certify components in the cloud under HPC assumptions?
- In this paper, we are interested in the problem of how to certify that a component performs the computation specified in its contract;
- Performance concerns are still addressed only loosely.

Goals of this Study
Propose a certification process for components in HPC Storm:
- With some emphasis on HPC assumptions;
- Parallel components of the Hash component model;
- Circus specification language for describing component contracts;
- F. H. de Carvalho Junior; R. D. Lins (2010). Compositional Specification of Parallel Components Using Circus. Electronic Notes in Theoretical Computer Science, vol. 260, n. 1, pages 47-72 (FACS'2008 proceedings).
Evaluate the feasibility of deriving parallel code by refinement and translation of contracts written in an extension of Circus. Realistic case studies require:
1) An informal description ("pencil-and-paper"), from which a contract is derived;
2) An existing tuned implementation, built by professional HPC programmers.
Both are provided by the NAS Parallel Benchmarks (NPB).

HPC Storm: Contract-Based Certification Process
contract (abstract component + implementation assumptions, i.e. the context) -> refinement steps, through a sequence of abstract specifications -> concrete specification -> translation -> source code
- Refinement: refining towards a specification of the appropriate algorithms for the context;
- Translation: translating towards the best techniques for implementing the algorithms, taking advantage of particular features of the target parallel computing platform.

HPC Storm: Contract-Based Certification Process
Participants: Front-End (application), Core (component catalog).
(1) The component is derived by refinement and translation from a contract of a catalogued abstract component, yielding a concrete specification and source code;
(2) The application demands a component described by a contract of the given abstract component;
(3) For running, the application asks the Core for a component that implements the contract;
(4) If a component is found, its concrete specification must be matched against the abstract component specification, to ensure that it was derived using refinement rules.
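The four steps can be sketched as a small catalog lookup. This is a toy model under strong assumptions: the names `Catalog`, `Component`, `is_certified` and the rule names in `KNOWN_RULES` are hypothetical, and the check only inspects a recorded list of rule names. A real Core would match specifications with the help of a theorem prover rather than compare strings.

```python
from dataclasses import dataclass

# Hypothetical rule names; the actual refinement and translation laws
# are not listed in the slides.
KNOWN_RULES = {"data-refinement", "action-refinement",
               "loop-introduction", "translate-to-source"}


@dataclass
class Component:
    contract_id: str   # contract of the abstract component it implements
    derivation: list   # names of the rules applied, in order
    source_code: str   # result of the final translation step


def is_certified(component: Component) -> bool:
    """Accept a derivation made only of known rules that ends in translation."""
    steps = component.derivation
    return (bool(steps)
            and steps[-1] == "translate-to-source"
            and all(step in KNOWN_RULES for step in steps))


class Catalog:
    """Toy stand-in for the Core (component catalog)."""

    def __init__(self):
        self._components = []

    def register(self, component: Component) -> None:
        # Step (1): a derived component is stored in the catalog.
        self._components.append(component)

    def find(self, contract_id: str):
        # Steps (3) and (4): look up an implementation of the contract
        # and return it only if its derivation passes the check.
        for component in self._components:
            if component.contract_id == contract_id and is_certified(component):
                return component
        return None
```

A component whose recorded derivation contains an unknown step, or that does not end with the translation step, is never returned by `find`.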

HPC Storm: Contract-Based Certification Process
We propose Circus for the specification of abstract components. Circus is a synergetic combination of Z, CSP and Dijkstra's guarded commands for the formal specification of concurrent programs.
Why Circus?
- It supports concurrency and, as a consequence, parallelism;
- It separates behavioural (CSP) and functional (Z) concerns;
- Z may specify the functional tasks (actions) of components;
- CSP may specify the orchestration of component actions towards a goal;
- It supports refinement!
- There are practical experiences with tools for verification (ProofPower-Z), refinement (CRefine) and automatic code generation (JCircus).
In a previous work (FACS'2008), we proposed Circus/HCL, an extension of Circus for the specification of parallel components in HPE.

Circus/HCL
HCL = Hash Configuration Language: an architecture description and configuration language for composition and orchestration of parallel components in HPE.
HCL + Circus (what Circus may offer to HCL):
- A language for specifying the parallel computations performed by components of the Hash component model in HPE;
- Circus/HCL incorporates HTS (contextual abstraction).
Circus + HCL (what HCL may offer to Circus):
- A mechanism for (overlapping) composition of Circus specifications describing parallel computations.

Circus/HCL
[Figure: Circus/HCL specification listing; only the names VecVecProduct ... and dot_product ... survive in this transcript.]
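The specification shown on this slide is not recoverable from the transcript, so the following is not a rendering of it. It is only a sketch of the kind of computation the names suggest: a vector-vector (dot) product whose operands are split across the units of a parallel component, with the partial results combined at the end. The helper `partition`, the `units` parameter and the sequential loop standing in for parallel units are all assumptions.

```python
def partition(vector, units):
    """Split a vector into `units` contiguous blocks of near-equal size."""
    n = len(vector)
    bounds = [n * u // units for u in range(units + 1)]
    return [vector[bounds[u]:bounds[u + 1]] for u in range(units)]


def dot_product(x, y, units=4):
    """Dot product computed as a sum of per-unit partial products.

    Each (x_block, y_block) pair plays the role of the data held by one
    unit; here the units are simulated sequentially.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    partials = [sum(a * b for a, b in zip(x_block, y_block))
                for x_block, y_block in zip(partition(x, units),
                                            partition(y, units))]
    return sum(partials)  # the reduction step
```

On a distributed-memory platform the final `sum` would typically be a collective reduction among the units.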

Case Studies: NAS Parallel Benchmarks (NPB)
- Evaluate the performance of high-end parallel computing platforms for CFD (Computational Fluid Dynamics) code;
- 8 original benchmarks: 5 kernels (EP, IS, CG, MG, FT) and 3 pseudo-applications (SP, BT, LU);
- Standard workload sizes (problem classes): S, W, A, B, C, D, E, F;
- Informal specifications ("pencil-and-paper") that must be implemented so as to exploit the features of the target parallel computing platforms;
- Reference implementations developed by HPC specialists;
- Many versions: 1.x, 2.x, 3.x;
- Different parallel programming platforms: MPI, OpenMP, HPF, Globus, Java.

Case Studies: IS and CG
IS (Integer Sorting):
- Symbolic computation (memory intensive);
- Bucketsort algorithm;
- Reference implementation in C/MPI.
CG (Conjugate Gradient):
- Numeric computation (floating-point intensive);
- Applies the inverse power iteration method to find the smallest eigenvalue of a sparse, symmetric, positive-definite matrix;
- Reference implementation in Fortran/MPI.
Circus/HCL specifications have been derived for both kernels from the pencil-and-paper informal descriptions. The IS and CG contracts are aimed at refinement and translation towards C#.
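To make the CG kernel concrete, here is a minimal sequential sketch of inverse power iteration in which each linear system is solved by the conjugate gradient method. It is a simplification and not the NPB reference code: it uses a small dense matrix stored as nested lists instead of a sparse one, applies no shift, and the function names and iteration counts are illustrative choices.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, iterations=25, tol=1e-12):
    """Approximately solve A z = b for a symmetric positive-definite A."""
    z = [0.0] * len(b)
    r = list(b)          # residual b - A z, with z = 0
    p = list(r)          # search direction
    rho = dot(r, r)
    for _ in range(iterations):
        if rho < tol:    # squared residual norm is small enough
            break
        q = matvec(A, p)
        alpha = rho / dot(p, q)
        z = [zi + alpha * pi for zi, pi in zip(z, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rho_new = dot(r, r)
        p = [ri + (rho_new / rho) * pi for ri, pi in zip(r, p)]
        rho = rho_new
    return z


def smallest_eigenvalue(A, outer_iterations=50):
    """Inverse power iteration: repeatedly solve A z = x and renormalize."""
    x = [1.0] * len(A)
    estimate = 0.0
    for _ in range(outer_iterations):
        z = conjugate_gradient(A, x)
        # z approximates A^-1 x, so (x.x)/(x.z) approaches the smallest eigenvalue.
        estimate = dot(x, x) / dot(x, z)
        norm = dot(z, z) ** 0.5
        x = [zi / norm for zi in z]
    return estimate
```

For the matrix `[[4.0, 1.0], [1.0, 3.0]]` the estimate converges to (7 - sqrt(5)) / 2, the smaller of its two eigenvalues.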

Case Studies: Derivation of Parallel Code using Circus/HCL
[Figure: the IS contract (Bucketsort) in Circus/HCL, with its state, actions and protocol.]
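The contract itself is not legible in this transcript. As a reminder of the algorithm it is based on, the following is a sequential sketch of bucket sort for integer keys. The function name, the `buckets` parameter and the assumption that keys lie in the range 0 to `max_key - 1` are illustrative; in a parallel version each bucket would typically be owned by one process, and the keys would be exchanged among processes before the local sorts.

```python
def bucket_sort(keys, max_key, buckets=4):
    """Sort integer keys in range(0, max_key) by distributing them into buckets."""
    width = -(-max_key // buckets)        # ceiling of max_key / buckets
    bins = [[] for _ in range(buckets)]
    for key in keys:
        bins[key // width].append(key)    # bucket b holds keys in [b*width, (b+1)*width)
    result = []
    for bucket in bins:                   # buckets are already ordered by key range
        result.extend(sorted(bucket))     # local sort inside each bucket
    return result
```

Because every key in bucket b is smaller than every key in bucket b + 1, concatenating the locally sorted buckets yields the globally sorted sequence.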

Case Studies: Derivation of Parallel Code using Circus/HCL
Circus -> refinement + translation -> C#

IS: IS contract -> refine -> IS concrete specification -> translate -> IS C# code; the IS component is verified against the IS contract.

CG: CG contract -> refine -> CG concrete specification -> translate -> CG C# code; the CG component is verified against the CG contract. See the details of the refinement and translation steps in the paper.

Conclusions
- The specification (from the pencil-and-paper description), refinement and translation processes were time-consuming and error-prone:
  - High mathematical skills are necessary;
  - This reinforces the need to develop tools for guiding the code derivation process.
- It is possible to choose an appropriate sequence of refinement and translation rules for tuning the component performance:
  - In the case of IS and CG, we used rectangular arrays instead of jagged ones, and ordered loops for improving data locality;
  - These are common assumptions of HPC programmers;
  - The reference implementations of IS and CG were useful as a baseline;
  - Question: is it feasible to systematically guide the application of refinement and translation rules according to context in a semi-automatic derivation tool?
- Language abstractions and syntactic sugar on top of Circus could help translation towards HPC code:
  - Ex: type Array k (T) N K T, where _[_,_,...,_] Array k (T) N K T (indexing).
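The choice between rectangular and jagged arrays can be illustrated with a small sketch. In C#, the target language of the case studies, a rectangular array is declared as `double[,]` and a jagged one as `double[][]`. The Python class below is a hypothetical stand-in for the rectangular case: it stores all elements in one list in row-major order and supports multi-index access in the spirit of the `_[_,_,...,_]` indexing operator mentioned above. Python lists hold references, so this shows the layout and loop order only, not the actual cache behaviour.

```python
class RectArray:
    """Two-dimensional rectangular array in a single row-major list."""

    def __init__(self, rows, cols, fill=0.0):
        self.rows, self.cols = rows, cols
        self.data = [fill] * (rows * cols)

    def __getitem__(self, index):
        i, j = index
        return self.data[i * self.cols + j]   # row-major offset

    def __setitem__(self, index, value):
        i, j = index
        self.data[i * self.cols + j] = value


def total(a):
    """Sum all elements with loops ordered to match the storage layout."""
    s = 0.0
    for i in range(a.rows):        # outer loop over rows
        for j in range(a.cols):    # inner loop visits consecutive offsets
            s += a[i, j]
    return s
```

Swapping the two loops would visit the same elements with a stride of `cols` between consecutive accesses, which is the access pattern that ordered loops are meant to avoid.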

Conclusions
- This work has collected evidence about the feasibility of using formal methods of specification and derivation for HPC software, in the context of the HPC Storm project;
- It also outlines a process of certified software development for the needs of HPC Storm.
(Challenging) further work:
- Performance comparison of the derived code and the reference implementations (journal version?);
  - The experimental methodology must be rigorous in order to obtain useful evidence;
- Investigate how to use contextual abstraction for guiding code derivation;
- Incorporate the certification process in the HPC Storm implementation:
  - The Core (component catalog) must manage the component code, their specifications, and contract matching on top of an existing theorem prover;
  - The component developer's Front-End must support specification and semi-automatic derivation of code.
