Lecture 24: Chapel Introduction and Overview of X10 and Fortress
John Cavazos
Dept. of Computer & Information Sciences
University of Delaware
www.cis.udel.edu/~cavazos/cisc879

But before that: a simple sample program for Cell.

Main program (PPU):
- Creates an array of 50 doubles (precomputedData)
- Initializes the array with the numbers 0 through 49
- Sends each SPE (through its mailbox) the address of the array
- Waits for the threads to finish

SPU program:
- Reads the address of the array from the mailbox
- DMA (get) transfers the array data from main memory to the LS
- Prints out the local array contents

~cavazos/project2-sample-program

Lecture 24: Overview

Chapel: Cascade High-Productivity Language
- Developed for the DARPA HPCS program (High Productivity Computing Systems)
- Characteristics:
  - Global-view parallel language
  - Support for general parallelism
  - Locality-aware
  - Object-oriented
  - Generic programming

Global vs. Fragmented models

Global-view programming model
- Algorithms and data structures are expressed as a whole
- The program begins executing as a single thread; parallelism is introduced through language constructs
- Examples: Chapel, OpenMP, HPF

Fragmented programming model
- Algorithms are expressed on a task-by-task basis
- Explicit decomposition of data structures and control flow
- Examples: MPI, UPC, Titanium

Global vs. Fragmented models

Global-view languages leave the low-level details to the compiler and runtime; fragmented languages obfuscate the algorithm with per-task decomposition code (see the sketch below).
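To make the contrast concrete, here is a minimal Chapel sketch of a global-view vector sum (my example, not from the slides): the computation is written once, over the whole index space, and the compiler and runtime handle any decomposition. A fragmented (MPI-style) version of the same loop would require each rank to compute its local bounds, allocate local slices, and exchange data explicitly.

    config const n = 1000;

    var A, B, C: [1..n] real;

    B = 1.0;
    C = 2.0;

    // One global-view statement: no per-task index arithmetic needed
    forall i in 1..n do
      A(i) = B(i) + C(i);

    writeln(A(1), " ", A(n));   // 3.0 3.0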

Support for General Parallelism

Most existing models support only a single level of parallelism, reflecting the prevalence of the SPMD model:
- MPI (very popular): supports coarse-grained parallelism
- OpenMP: supports fine-grained parallelism

A general parallel language should support nested parallelism, and should also cleanly support both data and task parallelism (see the sketch below).
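As a hedged illustration of what nesting looks like in Chapel (example names are mine), task parallelism and data parallelism compose freely: here a coforall creates one task per chunk, and each task runs a data-parallel forall over its chunk.

    config const n = 100, numChunks = 4;

    var A: [1..n] real;

    // Outer task parallelism: one task per chunk
    coforall c in 1..numChunks {
      const lo = 1 + (c-1)*n/numChunks,
            hi = c*n/numChunks;
      // Inner data parallelism nested within each task
      forall i in lo..hi do
        A(i) = i:real;
    }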

Data Distribution and Locality

It is hard for a compiler to do a good job of these automatically, so they remain the responsibility of the performance-minded programmer. The language should provide abstractions to:
- control data distribution
- control the locality of interacting variables

(A sketch of Chapel's abstractions for this follows.)
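For example, Chapel provides standard distributions and on-clauses for this; a minimal sketch, assuming the BlockDist module and the classic Block syntax from earlier Chapel releases:

    use BlockDist;

    config const n = 8;

    // Block-distribute the indices 1..n across the available locales
    const D = {1..n} dmapped Block(boundingBox={1..n});
    var A: [D] real;

    // Iteration follows the distribution: each element is
    // updated on the locale that owns it
    forall i in D do
      A(i) = here.id;      // record which locale did the work

    // Explicit locality control with an on-clause
    on Locales[numLocales-1] do
      writeln("running on locale ", here.id);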

Object-Oriented Programming
- Proven successful in mainstream languages
- Separates interfaces from implementations
- Enables code reuse
- Encapsulates related code and data
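As a small hypothetical illustration of encapsulation in Chapel (not from the slides), a class bundles related data and the code that operates on it:

    class Counter {
      var count: int;
      proc increment() { count += 1; }
    }

    var c = new Counter();   // 'owned' by default in current Chapel
    c.increment();
    writeln(c.count);        // 1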

Generic Programming
- Algorithms are written without specifying types; the types are instantiated later
- Latent types: the compiler can infer a type from the program's context
  - A variable's type is inferred from its initialization expression
  - Function arguments' types are inferred from the actual arguments at call sites
- If the compiler cannot infer a type, it declares an error
- Chapel is statically typed: all types are inferred (and type checking is done) at compile time, for performance reasons

(A sketch of this inference follows.)
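A minimal sketch of what this inference looks like in Chapel (example names are mine, not from the slides):

    // Type inferred from the initialization expression: x is an int
    var x = 42;

    // Generic function: the argument and return types are inferred
    // at each call site
    proc double(val) {
      return val + val;
    }

    writeln(double(3));       // instantiated with int
    writeln(double(1.5));     // instantiated with real
    writeln(double("ab"));    // instantiated with string ("abab")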

Chapel: Data Parallelism

// a 2D ARITHMETIC DOMAIN storing indices (1,1)..(m,n)
var D: domain(2) = [1..m, 1..n];

// an m x n array of floating-point values
var A: [D] float;

// an INFINITE DOMAIN storing string indices
var People: domain(string);

// an array of integers indexed with strings in the People domain
var Age: [People] int;

People += "John";     // add string "John" to the People domain
Age("John") = 62;     // set John's age
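(Note: this is the Chapel syntax of the era; in current Chapel, floating-point values are declared as real rather than float, and domain literals use braces, e.g. {1..m, 1..n}.)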

Chapel: Data Parallelism

// FORALL over domain D (the index is a tuple of integers)
forall ij in D {
  A(ij) = ...;
}

// FORALL over the strings in the People domain
forall i in People {
  Age(i) = ...;
}

// Simple example
forall i in 1..N do
  a(i) = b(i);
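Chapel's data parallelism also includes whole-array operations and reduce expressions; a small hedged sketch (not on the slides):

    config const n = 10;
    var a, b: [1..n] real;

    b = 1.0;                 // whole-array (promoted) assignment

    forall i in 1..n do
      a(i) = b(i) * i;

    // Reduce expression: combine all elements with +
    const total = + reduce a;
    writeln(total);          // 55.0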

Chapel: Task Parallelism

// The begin statement spawns a new task
begin writeln("output from spawned task");
writeln("output from main task");

// The cobegin statement:
// synchronization happens at the end of the cobegin block
cobegin {
  stmt1();
  stmt2();
  stmt3();
}

Chapel: Task Parallelism

// NOTE: parallel tasks can coordinate with sync variables
var finishedMainOutput$: sync bool;

begin {
  finishedMainOutput$;    // read blocks until the variable is written
  writeln("output from spawned task");
}

writeln("output from main task");
finishedMainOutput$ = true;
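(A sync variable is logically empty until written: a read blocks until the variable is full and leaves it empty, and a write blocks until it is empty and leaves it full. Here the spawned task therefore cannot print until the main task has printed and written true.)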

X10 Overview
- Developed at IBM
- X10 is an extended subset of Java
  - Base language = Java 1.4
- Some features removed from the Java language:
  - Java concurrency: threads, synchronized
  - Java arrays, replaced with X10 arrays
- Some features added to the Java language:
  - Concurrency: async, finish, foreach, ateach, etc.
  - Distribution: points, distributions
  - X10 arrays: distributed arrays, array reductions/initializers

X10: Activities, Places & PGAS

Fortress Overview
- Developed at Sun
- An entirely new language
- Fortress features:
  - Targeted at scientific computing
  - Mathematical notation
  - Implicitly parallel whenever possible
  - Constructs and annotations to serialize when necessary
  - Whenever possible, language features are implemented in libraries

Fortress Code