A General Discussion on! Parallelism!

Similar documents
A General Discussion on! Parallelism!

Parallelism I. John Cavazos. Dept of Computer & Information Sciences University of Delaware

Introduction II. Overview

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620

Parallel Processors. The dream of computer architects since 1950s: replicate processors to add performance vs. design a faster processor

WHY PARALLEL PROCESSING? (CE-401)

Technology for a better society. hetcomp.com

A taxonomy of computer architectures

Types of Parallel Computers

GPUs and Emerging Architectures

Processor Performance and Parallelism Y. K. Malaiya

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany

Course II Parallel Computer Architecture. Week 2-3 by Dr. Putu Harry Gunawan

THREAD LEVEL PARALLELISM

Computing architectures Part 2 TMA4280 Introduction to Supercomputing

Serial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing

The Art of Parallel Processing

Duksu Kim. Professional Experience Senior researcher, KISTI High performance visualization

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

HPC Middle East. KFUPM HPC Workshop April Mohamed Mekias HPC Solutions Consultant. Agenda

Embedded Systems Architecture. Computer Architectures

BlueGene/L (No. 4 in the Latest Top500 List)

Parallel Computing: Parallel Architectures Jin, Hai

Parallel Computing Introduction

Introduction to Parallel Programming and Computing for Computational Sciences. By Justin McKennon

CS4230 Parallel Programming. Lecture 3: Introduction to Parallel Architectures 8/28/12. Homework 1: Parallel Programming Basics

represent parallel computers, so distributed systems such as Does not consider storage or I/O issues

Overview. CS 472 Concurrent & Parallel Programming University of Evansville

Accelerating Financial Applications on the GPU

Chapter 6. Parallel Processors from Client to Cloud. Copyright 2014 Elsevier Inc. All rights reserved.

Trends and Challenges in Multicore Programming

Test on Wednesday! Material covered since Monday, Feb 8 (no Linux, Git, C, MD, or compiling programs)

Message Passing Interface (MPI)

Chapter 3 Parallel Software

GP-GPU. General Purpose Programming on the Graphics Processing Unit

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

Programming Models for Multi- Threading. Brian Marshall, Advanced Research Computing

Parallel Hybrid Computing Stéphane Bihan, CAPS

COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES

Lecture 1: Introduction and Basics

COSC 6385 Computer Architecture - Multi Processor Systems

FLYNN S TAXONOMY OF COMPUTER ARCHITECTURE

Review of Last Lecture. CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions. Great Idea #4: Parallelism.

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004

Parallel Computing. Hwansoo Han (SKKU)

Particle-in-Cell Simulations on Modern Computing Platforms. Viktor K. Decyk and Tajendra V. Singh UCLA

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions

Introduction to Parallel Programming

Parallel computer architecture classification

Introduction. CSCI 4850/5850 High-Performance Computing Spring 2018

CS560 Lecture Parallel Architecture 1

Chapter 1. Introduction: Part I. Jens Saak Scientific Computing II 7/348

PROGRAMOVÁNÍ V C++ CVIČENÍ. Michal Brabec

Parallel Architectures

Vector Processors and Graphics Processing Units (GPUs)

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions

INTRODUCTION TO OPENACC. Analyzing and Parallelizing with OpenACC, Feb 22, 2017

CS4961 Parallel Programming. Lecture 3: Introduction to Parallel Architectures 8/30/11. Administrative UPDATE. Mary Hall August 30, 2011

Advanced Computer Architecture. The Architecture of Parallel Computers

Flynn Taxonomy Data-Level Parallelism

Parallel Architecture. Hwansoo Han

and Parallel Algorithms Programming with CUDA, WS09 Waqar Saleem, Jens Müller

OpenACC. Introduction and Evolutions Sebastien Deldon, GPU Compiler engineer

Parallel Computing. November 20, W.Homberg

An Introduction to Parallel Architectures

Parallel Programming for Graphics

Titan - Early Experience with the Titan System at Oak Ridge National Laboratory

GPU GPU CPU. Raymond Namyst 3 Samuel Thibault 3 Olivier Aumage 3

Lecture 1. Introduction Course Overview

Towards a codelet-based runtime for exascale computing. Chris Lauderdale ET International, Inc.

CSE 613: Parallel Programming Department of Computer Science SUNY Stony Brook Spring 2015

Parallel Computing Why & How?

Parallel Architectures

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

General introduction: GPUs and the realm of parallel architectures

! Readings! ! Room-level, on-chip! vs.!

RISC Processors and Parallel Processing. Section and 3.3.6

ECE/ME/EMA/CS 759 High Performance Computing for Engineering Applications

Parallel and Distributed Programming Introduction. Kenjiro Taura

GPU for HPC. October 2010

Multi-core Programming - Introduction

Addressing Heterogeneity in Manycore Applications

An Introduction to Parallel Programming

GPU Architecture. Alan Gray EPCC The University of Edinburgh

Introduction to parallel computing

Current Trends in High Performance Computing

Advanced Computer Architecture

The Heterogeneous Programming Jungle. Service d Expérimentation et de développement Centre Inria Bordeaux Sud-Ouest

CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav

Parallel Systems. Project topics

Turbostream: A CFD solver for manycore

COMP 322: Fundamentals of Parallel Programming. Flynn s Taxonomy for Parallel Computers

Multiprocessors. Flynn Taxonomy. Classifying Multiprocessors. why would you want a multiprocessor? more is better? Cache Cache Cache.

10 Parallel Organizations: Multiprocessor / Multicore / Multicomputer Systems

Flynn s Taxonomy of Parallel Architectures

CS650 Computer Architecture. Lecture 10 Introduction to Multiprocessors and PC Clustering

Warps and Reduction Algorithms

Chapel Introduction and

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it

Transcription:

Lecture 2! A General Discussion on! Parallelism! John Cavazos! Dept of Computer & Information Sciences! University of Delaware! www.cis.udel.edu/~cavazos/cisc879!

Lecture 2: Overview Flynn s Taxonomy of Architectures Types of Parallelism Parallel Programming Models Commercial Multicore Architectures

Flynn s Taxonomy of Arch. SISD - Single Instruction/Single Data SIMD - Single Instruction/Multiple Data MISD - Multiple Instruction/Single Data MIMD - Multiple Instruction/Multiple Data

Single Instruction/Single Data The was the typical machine before multicores. Slide Source: Wikipedia, Flynn s Taxonomy

Single Instruction/Multiple Data Processors that execute same instruction on multiple pieces of data (e.g., vector instructions). Slide Source: Wikipedia, Flynn s Taxonomy

Single Instruction/Multiple Data Each core executes same instruction simultaneously Vector-style of programming Natural for graphics and scientific computing Good choice for massively multicore

SISD versus SIMD SIMD very often requires compiler intervention. Slide Source: ars technica, Peakstream article

Multiple Instruction/Single Data Only Theoretical Machine. None ever implemented. Slide Source: Wikipedia, Flynn s Taxonomy

Multiple Instruction/Multiple Data Many mainstream multicores fall into this category. Slide Source: Wikipedia, Flynn s Taxonomy

Multiple Instruction/Multiple Data Each core simultaneously executing different instructions on different data private upper levels of memory hierarchy may share lower level of cache and RAM SIMD-extensions Programmed using a variety of methods OpenMP, MPI, pthreads, TBB, AMP,

Lecture 2: Overview Flynn s Taxonomy of Architecture Types of Parallelism Parallel Programming Models Commercial Multicore Architectures

Types of Parallelism Instructions: Slide Source: S. Amarasinghe, MIT 6189 IAP 2007

Pipelining Corresponds to SISD architecture. Slide Source: S. Amarasinghe, MIT 6189 IAP 2007

Instruction-Level Parallelism Dual instruction issue superscalar model. Again, corresponds to SISD architecture. Slide Source: S. Amarasinghe, MIT 6189 IAP 2007

Data-Level Parallelism What architecture model from Flynn s Taxonomy does this correspond to? Data Stream or Array Elements Slide Source: Arch. of a Real-time Ray-Tracer, Intel

Data-Level Parallelism Corresponds to SIMD architecture. Data Stream or Array Elements Slide Source: Arch. of a Real-time Ray-Tracer, Intel

Data-Level Parallelism One operation (e.g., +) produces multiple results. X, Y, and result are arrays. Slide Source: Klimovitski & Macri, Intel

Thread-Level Parallelism What architecture from Flynn s Taxonomy does this correspond to? Program partitioned into four threads. Four threads each executed on separate cores. P 1 P 2 P 3 P 4 P 5 P 6 Multicore with 6 cores. Slide Source: SciDAC Review, Threadstorm pic.

Thread-Level Parallelism Corresponds to MIMD architecture. Program partitioned into four threads. Four threads each executed on separate cores. P 1 P 2 P 3 P 4 P 5 P 6 Multicore with 6 cores. Slide Source: SciDAC Review, Threadstorm pic.

Lecture 2: Overview Flynn s Taxonomy of Architecture Types of Parallelism Parallel Programming Models Commercial Multicore Architectures

Multicore Programming Models Message Passing Interface (MPI) OpenMP Threads Pthreads Java Threads Parallel Libraries Intel s Thread Building Blocks (TBB) Microsoft s Task Parallel Library SWARM (GTech) Charm++ (UIUC) STAPL (Texas A&M)

GPU Programming Models CUDA (Nvidia) C/C++ extensions OpenCL (many vendors, including Nvidia and AMD) Pragma-based language extensions HMPP OpenACC Cray and PGI directives

Lecture 2: Overview Flynn s Taxonomy of Architecture Types of Parallelism Parallel Programming Models Commercial Multicore Architectures

Generalized Multicore Slide Source: Michael McCool, Rapid Mind, SuperComputing, 2007

Cell B.E. Architecture Slide Source: Michael McCool, Rapid Mind, SuperComputing, 2007

NVIDIA GPU Architecture G80 Slide Source: Michael McCool, Rapid Mind, SuperComputing, 2007

Commercial Multcores Slide Source: Dave Patterson, Manycore and Multicore Computing Workshop, 2007

Intel Teraflops Video http://www.youtube.com/watch?v=takg0uvtzpe

Next Time More Parallel Programming Models GPUs Architecture and Programming Models Read Chapters 1 of OpenCL book