Current Trends in High Performance Computing

Chokchai Box Leangsuksun, PhD
SWEPCO Endowed Professor*, Computer Science
Director, High Performance Computing Initiative
Louisiana Tech University
12 December 2011
*The SWEPCO endowed professorship is made possible by the LA Board of Regents.

Outline
- What is HPC?
- Current Trends
- More on PS3 and GPU computing
- Conclusion

Mainstream CPUs
- CPU speed plateaus at 3-4 GHz
- More cores in a single chip: dual/quad core is here now, along with many-core (GPGPU)
- Traditional applications won't get a free ride
- Conversion to parallel computing (HPC, multithreading) is required
- Diagram: the 3-4 GHz cap, from the "no free lunch" article in DDJ

New trends in computing
- Old and current: SMP, cluster
- Multicore computers: Intel Core 2 Duo, AMD 2x 64
- Many-core accelerators: GPGPU, FPGA, Cell
- More "brains" in one computer, rather than a higher CPU frequency
- Harness many computers: cluster computing

What is HPC?
- High Performance Computing: parallel computing, supercomputing
- Achieve the fastest possible computing outcome
- Subdivide a very large job into many pieces
- Enabled by multiple high-speed CPUs, networking, software, and programming paradigms
- Technologies that help solve non-trivial tasks in science, engineering, medicine, business, entertainment, etc.
- Time to insight, time to discovery, time to market

Parallel Programming Concepts
- Conventional serial execution: the problem is represented as a series of instructions that are executed by one CPU.
- Parallel execution: the problem is partitioned into multiple executable parts that are mutually exclusive and collectively exhaustive, represented as a partially ordered set exhibiting concurrency.
- Parallel computing takes advantage of concurrency to:
  - solve larger problems in less time
  - save wall-clock time
  - overcome memory constraints
  - use non-local resources
- Diagram: one problem feeding instructions to one CPU, versus one problem split into four tasks, each feeding instructions to its own CPU
Source: Thomas Sterling's intro to HPC
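The partitioning idea on the Parallel Programming Concepts slide can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the names `partition`, `sum_of_squares`, and `parallel_sum_of_squares` are hypothetical, and a thread pool is used just to show the structure. In CPython, threads do not speed up CPU-bound work, so a real run would more likely use a process pool or MPI.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, parts):
    """Split data into `parts` contiguous chunks.

    The chunks are mutually exclusive (no element appears twice) and
    collectively exhaustive (every element appears in some chunk).
    """
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The task each worker runs on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Partition the problem, run one task per chunk, then combine."""
    chunks = partition(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(sum_of_squares, chunks))
    return sum(partial_results)
```

Calling `parallel_sum_of_squares(list(range(10)))` gives the same answer as the serial sum; only the way the work is divided changes.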

HPC Applications and Major Industries
- Finite element modeling: auto/aero
- Fluid dynamics: auto/aero, consumer packaged goods manufacturers, process manufacturing, disaster preparedness (tsunami)
- Imaging: seismic and medical
- Finance and business: banks, brokerage houses (regression analysis, risk, options pricing, what-if analysis); Wal-Mart's HPC in its operations
- Molecular modeling: biotech and pharmaceuticals
- Common traits: complex problems, large datasets, long runs
This slide is from the Intel presentation "Technologies for Delivering Peak Performance on HPC and Grid Applications".

HPC Drives Knowledge Economy

Life Science Problem: an example of Protein Folding
- A molecular dynamics simulation of a protein folding problem takes a computing year in serial mode
- Petaflop = a thousand trillion floating point operations per second
Excerpted from IBM David Klepacki's "The future of HPC".

Disaster Preparedness: examples
- Project LEAD: severe weather (tornado) prediction, led by OU; HPC and dynamic adaptation to the weather forecast
- Professor Seidel's LSU CCT: hurricane route prediction for emergency preparedness
- Accuracy of prediction: 1 mile^2 = $1 M

HPC accelerates a product
- FE analysis on 1 CPU: 1,000,000 elements
- Numerical processing for 1 element = 0.1 secs
- One computer will take 100,000 secs = 27.8 hrs
- With, say, 100 CPUs and ideal scaling: 0.28 hrs, about 17 mins

Avian Flu Pandemic Modeled on a Supercomputer
- MIDAS (Models of Infectious Disease Agent Study) program
- The large-scale stochastic simulation model examines the nationwide spread of a pandemic influenza virus strain
- A simulation starts with 2 passengers infected with avian flu arriving at LAX
- The simulation rolls out a city-to-city and census-tract-level picture of the spread of infection through a synthetic population of 281 million people over the course of 180 days
- It is a very large-scale, complex, multi-variant simulation
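The FE analysis estimate above is simple arithmetic, and a short Python sketch makes the assumption behind it explicit. The function name `fe_runtime_seconds` is hypothetical, and the model assumes ideal scaling: the work divides evenly across CPUs with no communication or serial overhead, which real runs rarely achieve.

```python
def fe_runtime_seconds(elements, secs_per_element, cpus=1):
    """Ideal-scaling estimate: total work divided evenly across CPUs."""
    return elements * secs_per_element / cpus


# Figures from the slide: 1,000,000 elements at 0.1 s per element.
serial = fe_runtime_seconds(1_000_000, 0.1)
parallel = fe_runtime_seconds(1_000_000, 0.1, cpus=100)

print(f"1 CPU:    {serial:,.0f} s = {serial / 3600:.1f} h")
print(f"100 CPUs: {parallel:,.0f} s = {parallel / 60:.1f} min")
```

Under these assumptions the serial run comes to roughly 27.8 hours and the 100-CPU run to roughly 16.7 minutes, in line with the slide's rounded figures.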

Avian Flu Pandemic (90 days)
Timothy C. Germann, Kai Kadau, Catherine A. Macken (Los Alamos National Laboratory); Ira M. Longini Jr. (Emory University)

Avian Flu Pandemic (II)
- The results show that advance preparation of a modestly effective vaccine in large quantities appears to be preferable to waiting for the development of a well-matched vaccine that may arrive too late.
- The simulation models a synthetic population that matches U.S. census demographics and worker mobility data by randomly assigning the simulated individuals to households, workplaces, schools, and the like.
- The models serve as virtual laboratories to study how infectious diseases spread and which intervention strategies are more effective.
- Run on the Los Alamos supercomputer known as Pink, a 1,024-node (2,048-processor) LinuxBIOS/BProc cluster with 2 GB per node.

Significant indicators: why HPC now?
- Mainstream computers now ship with multicore CPUs (Intel or AMD)
  - In the past 1-2 years, CPU speed has flattened at 3+ GHz
  - More CPUs in one chip: dual-core and multicore chips
  - Traditional software won't take advantage of these new processors
  - Personal/desktop supercomputing
- Many real problems are highly computationally intensive
  - NSA uses supercomputing for data mining
  - DOE: fusion, plasma, and energy-related work (including weaponry)
  - Helps solve problems in many other important areas (nanotech, life science, etc.)
  - Product design, ERM/inventory management
- Giants have recently taken notice of HPC
  - Bush's State of the Union speech: 3 main S&T focus areas, one of which is supercomputing
  - Bill Gates' keynote speech at SC05: Microsoft goes after HPC
  - Google search engine: 100,000 nodes
  - PlayStation 3 is a personal supercomputing platform
  - Hollywood (entertainment) is HPC-bound (Pixar uses more than 3,000 CPUs to render animation)

HPC preparedness
- Build a workforce that understands the HPC paradigm and its applications
  - HPC/Grid curriculum in IT/CS/CE/ICT
  - Offer HPC-enabling tracks to other disciplines (engineering, life science, physics, computational chemistry, business, etc.)
  - Train the business community
- Bring awareness to the public
- National and strategic policies
- Improve infrastructure

Pause here
- Switch to a tour of the machine rooms, clusters, and our lab to show what students will be using
- Get student info on the signup sheet for accounts on our clusters (azul, quadcore, GPU, and PS3)
- Intro to Linux
- Then continue on to HPC101

How to Run Applications Faster?
There are 3 ways to improve performance, each with a computer analogy:
- Work harder: use faster hardware
- Work smarter: use optimized algorithms and techniques to solve computational tasks
- Get more help: use multiple computers to solve a particular task

Parallel Programming Concepts
Diagram: one problem split into four tasks, each feeding instructions to its own CPU.
Source: Thomas Sterling's intro to HPC

HPC objective
- High Performance Computing: parallel computing, supercomputing
- Achieve the fastest possible computing outcome
- Subdivide a very large job into many pieces
- Enabled by multiple high-speed CPUs, networking, software, and programming paradigms
- Technologies that help solve non-trivial tasks in science, engineering, medicine, business, entertainment, etc.

Flynn's Taxonomy of Computer Architectures
- SISD: Single Instruction/Single Data
- SIMD: Single Instruction/Multiple Data
- MISD: Multiple Instruction/Single Data
- MIMD: Multiple Instruction/Multiple Data

Single Instruction/Single Data
- PU: processing unit
- Example: your desktop, before the spread of dual-core CPUs
Slide source: Wikipedia, Flynn's Taxonomy

Flavors of SISD
Diagram: instructions.

More on pipelining

Single Instruction/Multiple Data
- Processors that execute the same instruction on multiple pieces of data, e.g. NVIDIA GPUs
Slide source: Wikipedia, Flynn's Taxonomy

Single Instruction/Multiple Data
- Each core runs the same set of instructions on different data
- Example: a GPGPU processes the pixels of an image in parallel
Slide source: Klimovitski & Macri, Intel

SISD versus SIMD
- Writing a compiler for SIMD architectures is VERY difficult (inter-thread communication complicates the picture)
Slide source: Ars Technica, PeakStream article

Multiple Instruction/Single Data
- Pipeline architectures, e.g. the CMU Warp machine
Slide source: Wikipedia, Flynn's Taxonomy

Multiple Instruction/Multiple Data
- Multicore systems are based on a MIMD architecture plus a programming paradigm such as OpenMP or multithreading
Slide source: Wikipedia, Flynn's Taxonomy

Multiple Instruction/Multiple Data
- The sky is the limit: each PU is free to do as it pleases
- Can be of either the shared-memory or the distributed-memory category

Current HPC Hardware
- Traditionally, HPC has adopted expensive parallel hardware:
  - Massively Parallel Processors (MPP)
  - Symmetric Multi-Processors (SMP)
  - Cluster computers
- Recent trends in HPC:
  - Multicore systems
  - Heterogeneous computing with accelerator boards (GPGPU, FPGA)

HPC cluster
Diagram labels: login, compile, submit job; at least 2 connections; run tasks.

Parallel Programming Environments and Tools
- Threads (PCs, SMPs, NOW...): POSIX Threads, Java Threads
- MPI: Linux, NT, and many supercomputers
- OpenMP (predominantly on SMP)
- PVM (old)
- UPC, Co-array Fortran
- CUDA, Brook+, OpenCL
- Software DSMs (Shmem)
- Compilers
- RAD (rapid application development) tools
- Debuggers
- Performance analysis tools
- Visualization tools

Recent Trends in HPC Hardware
- Multicore and many-core are here now
  - Multiple CPUs on a single die
  - Better power consumption
  - Tightly coupled, and better for multithreading
- GPGPU
- Used as building blocks for a much larger system
- New Top 500 HPC systems are clusters of multicore CPUs and GPGPUs

What are HPC systems

Current top 5 systems

Shared vs Distributed Memory

Shared memory
- Global memory space, accessible by all processors
- Processors may have local memory to hold copies of some global memory
- Consistency of copies is usually maintained by hardware (cache coherency)

Two typical classes of shared memory
- Uniform Memory Access (UMA): equal access times and identical processors; typically represented by Symmetric Multiprocessor (SMP) machines or multicores
- Non-Uniform Memory Access (NUMA): memory access times are not uniform, and memory access across a link is slower; often made by physically linking two or more SMPs, or found in heterogeneous computing

Advantages and Disadvantages
Advantages:
- Global address space is user-friendly
- Data sharing between tasks is fast
Disadvantages:
- The system may suffer from a lack of scalability: adding CPUs increases traffic on the shared memory-to-CPU path, especially in cache-coherent systems
- The programmer is responsible for correct synchronization
- Systems larger than an SMP need some special-purpose components

Distributed Memory
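The point that the programmer is responsible for correct synchronization can be illustrated with a minimal Python sketch. It is not code from the talk: `count_with_lock` is a hypothetical name, and Python threads stand in for the tasks of a shared-memory program. Every thread updates the same variable, so the update is guarded by a lock; without the lock, the read-modify-write steps of different threads could interleave and lose increments.

```python
import threading


def count_with_lock(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads, safely."""
    total = 0                  # shared state, visible to every thread
    lock = threading.Lock()    # guards every update of `total`

    def worker():
        nonlocal total
        for _ in range(increments):
            with lock:         # only one thread may update at a time
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()               # wait until all workers have finished
    return total
```

The lock also shows the scalability cost named on the slide: the more threads contend for the same shared data, the more time they spend waiting on each other.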

Multicores
Three multicore classifications:
- Homogeneous
- Heterogeneous
- Hybrid

Multicores (I): Homogeneous cores (a main CPU)
- All cores are identical
- A traditional multicore with few cores
- Good for a few jumbo tasks
- Not as many tasks/threads as accelerators or GPUs
- E.g. Intel Core 2 Duo, i3, i5, i7, AMD
- Programming: multithreading/OpenMP

Multicores (II): Homogeneous cores as an accelerator or compute device
- Need a main CPU system; act as attached processing units
- All cores are identical, and there are many of them
- Good for many SIMD tasks/threads
- E.g. NVIDIA GPGPU, ClearSpeed, FPGA
- Programming: library calls from a main program, or a new language extension, e.g. CUDA

Multicores (III): Heterogeneous cores
- All cores are NOT identical
- All in one die
- Programming is more difficult
- See more in the PS3 presentation

Multicores (IV): Hybrid system
- A mix of host cores and accelerator cores
- A typical host can be anything from a desktop to a server system, e.g. Intel or AMD
- Accelerator: NVIDIA, ATI Stream, or FPGA
- Programming model is more complex
- Issue: memory bandwidth between host and devices

Introduction to Cell BE (PS3) Programming
HPCI: High Performance Computing Initiative

PS3: an awesome HPC system
- IBM Cell processor
- Affordable
- But currently not many tools

Cell BE Architecture
- PowerPC Processor Element (PPE): main processor, 64-bit; also supports vector/SIMD; runs the OS and manages the SPEs
- Synergistic Processor Element (SPE): 128-bit RISC SIMD processor; 256 KB of local storage memory; uses DMA to transfer data between local storage and main memory

Cell Programming
- IBM Cell SDK
- The main process runs on the PPE
- Threads run on the SPEs
- PPE-centric programming paradigm
- Diagram: one PPE process launching three SPE threads

GPGPU: General Purpose Graphics Processing Unit

Two major players

Parallel Computing on a GPU
- NVIDIA GPU computing architecture
  - Via a hardware device interface
  - In laptops, desktops, workstations, servers
- 8-series GPUs deliver 50 to 500 GFLOPS on compiled parallel C applications
- Tesla T from 1-4 TFLOPS
- GPU parallelism is better than Moore's law, more than doubling every year
- A GPGPU is a GPU that lets the user run both graphics and non-graphics applications
- Pictured: Tesla D870, GeForce 8800
David Kirk/NVIDIA and Wen-mei W. Hwu, 2007, ECE 498AL, University of Illinois, Urbana-Champaign

NVIDIA GeForce 8800 (G80)
- The eighth generation of NVIDIA's GeForce graphics cards
- High-performance CUDA-enabled GPGPU
- 128 cores
- Memory MB or 1.5 GB in Tesla
- High-speed memory bandwidth
- Supports Scalable Link Interface (SLI)

NVIDIA Tesla(TM) features
- GPU computing for HPC
- No display ports
- Dedicated to computation
- For massively multithreaded computing
- Supercomputing performance

NVIDIA Tesla Card
- C-Series (card) = 1 GPU with 1.5 GB
- D-Series (deskside unit) = 2 GPUs
- S-Series (1U server) = 4 GPUs
Note: 1 G80 GPU = 128 cores = ~500 GFLOPS; 1 T10 = 240 cores = 1 TFLOPS
NVIDIA G80: David Kirk/NVIDIA and Wen-mei W. Hwu. This slide is from the NVIDIA CUDA tutorial.
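The per-unit peak figures follow from multiplying the GPU count of each Tesla product line by the per-GPU figure in the note. A small Python sketch, using only the rounded numbers quoted on the slide (the names `GFLOPS_PER_GPU`, `GPUS_PER_UNIT`, and `tesla_peak_gflops` are hypothetical, and these are theoretical peaks, not measured application performance):

```python
# Rounded peak figures from the slide's note, in GFLOPS per GPU.
GFLOPS_PER_GPU = {"G80": 500, "T10": 1000}

# GPUs per Tesla product line, as listed on the slide.
GPUS_PER_UNIT = {"C": 1, "D": 2, "S": 4}


def tesla_peak_gflops(series, gpu):
    """Theoretical peak of one unit: GPU count times per-GPU peak."""
    return GPUS_PER_UNIT[series] * GFLOPS_PER_GPU[gpu]


for series in ("C", "D", "S"):
    print(series, tesla_peak_gflops(series, "T10"), "GFLOPS")
```

With T10 GPUs the three product lines span 1 to 4 TFLOPS per unit.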

GPGPU Programming with CUDA
- CUDA (Compute Unified Device Architecture) is an SDK and API that allows a programmer to write C and Fortran programs that execute on a GPGPU
- Works with NVIDIA G80 or later and Tesla
- The GPGPU is viewed as a compute device

ATI Stream (1)

ATI

ATI 4870 X2

Architecture of ATI Radeon 4000 series
This slide is from an ATI presentation.

This slide is from an ATI presentation.

Introduction to OpenCL
Toward a new approach in computing
Moayad Almohaishi

Introduction to OpenCL
- OpenCL stands for Open Computing Language
- It comes from a consortium effort that includes Apple, NVIDIA, AMD, and others
- It is managed by the Khronos Group, which is also responsible for OpenGL
- It took 6 months to come up with the specification

OpenCL
1. Royalty-free.
2. Supports both task-parallel and data-parallel programming modes.
3. Works across GPGPUs from different vendors, as well as multicore CPUs.
4. Works on Cell processors.
5. Supports handhelds and mobile devices.
6. Based on the C language (C99).

OpenCL
- An application can query the available devices and build a context from them
- Programmers can program more freely for any kind of device
- Applications are more reusable even if the hardware changes in the future

OpenCL Platform Model
CPUs+GPU platforms

Performance of GPGPU
Note: a cluster of 30 dual-Xeon 2.8 GHz nodes has a peak performance of ~336 GFLOPS.
David Kirk/NVIDIA and Wen-mei W. Hwu
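The ~336 GFLOPS figure can be reproduced with the usual peak-performance formula: nodes × CPUs per node × clock rate × floating point operations per cycle. The slide does not state the last factor; the sketch below assumes 2 operations per cycle per CPU, which is the value that matches the quoted number. The function name `cluster_peak_gflops` is hypothetical.

```python
def cluster_peak_gflops(nodes, cpus_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak: every CPU retires its maximum FLOPs every cycle."""
    return nodes * cpus_per_node * clock_ghz * flops_per_cycle


# 30 nodes of dual 2.8 GHz Xeons.
# flops_per_cycle=2 is an assumption, not a figure from the slide.
peak = cluster_peak_gflops(nodes=30, cpus_per_node=2,
                           clock_ghz=2.8, flops_per_cycle=2)
print(f"Cluster peak: {peak:.0f} GFLOPS")
```

Under that assumption the 60-CPU cluster peaks at about 336 GFLOPS, below the ~500 GFLOPS the deck quotes for a single G80 GPU.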

Last words!
- An HPC or supercomputing system is not necessarily a gigantic installation in a big machine room; it is accessible to Thais and may now be sitting next to your desk
- Computing is a necessity, and fast computing provides a competitive edge, especially in the knowledge economy
- New trends in HPC include GPGPU and various multicore architectures
- Prepare ourselves and strengthen our S&T, industry, and business community for this phenomenon (HPC goes mainstream) before it is too late

Back up slides

Cancer Gene-mining
- Unsuccessful on a uniprocessor
- Our approach: novel parallel gene-mining algorithms
  - Input from microarray
  - Retains accuracy
  - Significant (superlinear) speedup
- IBM P5 supercomputer (128-node PPC)
- Chart: time taken (in secs) to run the algorithm versus number of processors, keeping the number of nodes fixed, for OvaMarker-based and GeneSetMine-based selection on bladder, mesothelioma, breast, renal, leukemia, prostate, lung, pancreas, colorectal, ovary, lymphoma, and melanoma data

Drug Delivery
By Wu & Palmer, Louisiana Tech U; assisted by HPCI
- A study of microcapsules for drug delivery
- Computational Fluid Dynamics methodology to model the generation of droplets or cores (using alginate and oil)
- Goal: a better understanding of the process parameters needed to generate cores of homogeneous size for the manufacturing of microcapsules

Droplet Generation: Experimental Procedure

Droplet Generation: Example Results
- Case 1: olive oil: density 930 kg/m3, viscosity 0.03 kg/m-s; alginate: density 1012 kg/m3, viscosity kg/m-s
- Case 2: phase 1: density 918 kg/m3, viscosity kg/m-s; phase 2: density kg/m3, viscosity kg/m-s
Source: Wu's thesis


Outline Marquette University COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations

More information

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms Complexity and Advanced Algorithms Introduction to Parallel Algorithms Why Parallel Computing? Save time, resources, memory,... Who is using it? Academia Industry Government Individuals? Two practical

More information

Computer Architecture

Computer Architecture Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 10 Thread and Task Level Parallelism Computer Architecture Part 10 page 1 of 36 Prof. Dr. Uwe Brinkschulte,

More information

Parallel Computing Introduction

Parallel Computing Introduction Parallel Computing Introduction Bedřich Beneš, Ph.D. Associate Professor Department of Computer Graphics Purdue University von Neumann computer architecture CPU Hard disk Network Bus Memory GPU I/O devices

More information

An Introduction to Parallel Programming

An Introduction to Parallel Programming Dipartimento di Informatica e Sistemistica University of Pavia Processor Architectures, Fall 2011 Denition Motivation Taxonomy What is parallel programming? Parallel computing is the simultaneous use of

More information

Computing architectures Part 2 TMA4280 Introduction to Supercomputing

Computing architectures Part 2 TMA4280 Introduction to Supercomputing Computing architectures Part 2 TMA4280 Introduction to Supercomputing NTNU, IMF January 16. 2017 1 Supercomputing What is the motivation for Supercomputing? Solve complex problems fast and accurately:

More information

COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES

COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES P(ND) 2-2 2014 Guillaume Colin de Verdière OCTOBER 14TH, 2014 P(ND)^2-2 PAGE 1 CEA, DAM, DIF, F-91297 Arpajon, France October 14th, 2014 Abstract:

More information

CS 668 Parallel Computing Spring 2011

CS 668 Parallel Computing Spring 2011 CS 668 Parallel Computing Spring 2011 Prof. Fred Annexstein @proffreda fred.annexstein@uc.edu Office Hours: 11-1 MW or by appointment Tel: 513-556-1807 Meeting: TuTh 2:00-3:25 in RecCenter 3240 Lecture

More information

Top500 Supercomputer list

Top500 Supercomputer list Top500 Supercomputer list Tends to represent parallel computers, so distributed systems such as SETI@Home are neglected. Does not consider storage or I/O issues Both custom designed machines and commodity

More information

GPU for HPC. October 2010

GPU for HPC. October 2010 GPU for HPC Simone Melchionna Jonas Latt Francis Lapique October 2010 EPFL/ EDMX EPFL/EDMX EPFL/DIT simone.melchionna@epfl.ch jonas.latt@epfl.ch francis.lapique@epfl.ch 1 Moore s law: in the old days,

More information

The MOSIX Scalable Cluster Computing for Linux. mosix.org

The MOSIX Scalable Cluster Computing for Linux.  mosix.org The MOSIX Scalable Cluster Computing for Linux Prof. Amnon Barak Computer Science Hebrew University http://www. mosix.org 1 Presentation overview Part I : Why computing clusters (slide 3-7) Part II : What

More information

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs An Extension of the StarSs Programming Model for Platforms with Multiple GPUs Eduard Ayguadé 2 Rosa M. Badia 2 Francisco Igual 1 Jesús Labarta 2 Rafael Mayo 1 Enrique S. Quintana-Ortí 1 1 Departamento

More information

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein Parallel & Cluster Computing cs 6260 professor: elise de doncker by: lina hussein 1 Topics Covered : Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 11

More information

Lecture 9: MIMD Architectures

Lecture 9: MIMD Architectures Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction A set of general purpose processors is connected together.

More information

Overview. CS 472 Concurrent & Parallel Programming University of Evansville

Overview. CS 472 Concurrent & Parallel Programming University of Evansville Overview CS 472 Concurrent & Parallel Programming University of Evansville Selection of slides from CIS 410/510 Introduction to Parallel Computing Department of Computer and Information Science, University

More information

CME 213 S PRING Eric Darve

CME 213 S PRING Eric Darve CME 213 S PRING 2017 Eric Darve Summary of previous lectures Pthreads: low-level multi-threaded programming OpenMP: simplified interface based on #pragma, adapted to scientific computing OpenMP for and

More information

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono Introduction to CUDA Algoritmi e Calcolo Parallelo References q This set of slides is mainly based on: " CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory " Slide of Applied

More information

High Performance Computing (HPC) Introduction

High Performance Computing (HPC) Introduction High Performance Computing (HPC) Introduction Ontario Summer School on High Performance Computing Scott Northrup SciNet HPC Consortium Compute Canada June 25th, 2012 Outline 1 HPC Overview 2 Parallel Computing

More information

Lecture 1: Gentle Introduction to GPUs

Lecture 1: Gentle Introduction to GPUs CSCI-GA.3033-004 Graphics Processing Units (GPUs): Architecture and Programming Lecture 1: Gentle Introduction to GPUs Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Who Am I? Mohamed

More information

3/24/2014 BIT 325 PARALLEL PROCESSING ASSESSMENT. Lecture Notes:

3/24/2014 BIT 325 PARALLEL PROCESSING ASSESSMENT. Lecture Notes: BIT 325 PARALLEL PROCESSING ASSESSMENT CA 40% TESTS 30% PRESENTATIONS 10% EXAM 60% CLASS TIME TABLE SYLLUBUS & RECOMMENDED BOOKS Parallel processing Overview Clarification of parallel machines Some General

More information

GPGPU. Peter Laurens 1st-year PhD Student, NSC

GPGPU. Peter Laurens 1st-year PhD Student, NSC GPGPU Peter Laurens 1st-year PhD Student, NSC Presentation Overview 1. What is it? 2. What can it do for me? 3. How can I get it to do that? 4. What s the catch? 5. What s the future? What is it? Introducing

More information

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield NVIDIA GTX200: TeraFLOPS Visual Computing August 26, 2008 John Tynefield 2 Outline Execution Model Architecture Demo 3 Execution Model 4 Software Architecture Applications DX10 OpenGL OpenCL CUDA C Host

More information

Technology for a better society. hetcomp.com

Technology for a better society. hetcomp.com Technology for a better society hetcomp.com 1 J. Seland, C. Dyken, T. R. Hagen, A. R. Brodtkorb, J. Hjelmervik,E Bjønnes GPU Computing USIT Course Week 16th November 2011 hetcomp.com 2 9:30 10:15 Introduction

More information

PARALLEL PROGRAMMING MANY-CORE COMPUTING: INTRO (1/5) Rob van Nieuwpoort

PARALLEL PROGRAMMING MANY-CORE COMPUTING: INTRO (1/5) Rob van Nieuwpoort PARALLEL PROGRAMMING MANY-CORE COMPUTING: INTRO (1/5) Rob van Nieuwpoort rob@cs.vu.nl Schedule 2 1. Introduction, performance metrics & analysis 2. Many-core hardware 3. Cuda class 1: basics 4. Cuda class

More information

Introduction. CSCI 4850/5850 High-Performance Computing Spring 2018

Introduction. CSCI 4850/5850 High-Performance Computing Spring 2018 Introduction CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University What is Parallel

More information

Introduction to Parallel Programming

Introduction to Parallel Programming Introduction to Parallel Programming January 14, 2015 www.cac.cornell.edu What is Parallel Programming? Theoretically a very simple concept Use more than one processor to complete a task Operationally

More information

Trends in HPC (hardware complexity and software challenges)

Trends in HPC (hardware complexity and software challenges) Trends in HPC (hardware complexity and software challenges) Mike Giles Oxford e-research Centre Mathematical Institute MIT seminar March 13th, 2013 Mike Giles (Oxford) HPC Trends March 13th, 2013 1 / 18

More information

Parallel Architectures

Parallel Architectures Parallel Architectures CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Parallel Architectures Spring 2018 1 / 36 Outline 1 Parallel Computer Classification Flynn s

More information

Finite Element Integration and Assembly on Modern Multi and Many-core Processors

Finite Element Integration and Assembly on Modern Multi and Many-core Processors Finite Element Integration and Assembly on Modern Multi and Many-core Processors Krzysztof Banaś, Jan Bielański, Kazimierz Chłoń AGH University of Science and Technology, Mickiewicza 30, 30-059 Kraków,

More information

THREAD LEVEL PARALLELISM

THREAD LEVEL PARALLELISM THREAD LEVEL PARALLELISM Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 4 is due on Dec. 11 th This lecture

More information

Trends and Challenges in Multicore Programming

Trends and Challenges in Multicore Programming Trends and Challenges in Multicore Programming Eva Burrows Bergen Language Design Laboratory (BLDL) Department of Informatics, University of Bergen Bergen, March 17, 2010 Outline The Roadmap of Multicores

More information

The Stampede is Coming: A New Petascale Resource for the Open Science Community

The Stampede is Coming: A New Petascale Resource for the Open Science Community The Stampede is Coming: A New Petascale Resource for the Open Science Community Jay Boisseau Texas Advanced Computing Center boisseau@tacc.utexas.edu Stampede: Solicitation US National Science Foundation

More information

High Performance Computing Course Notes HPC Fundamentals

High Performance Computing Course Notes HPC Fundamentals High Performance Computing Course Notes 2008-2009 2009 HPC Fundamentals Introduction What is High Performance Computing (HPC)? Difficult to define - it s a moving target. Later 1980s, a supercomputer performs

More information

Moore s Law. Computer architect goal Software developer assumption

Moore s Law. Computer architect goal Software developer assumption Moore s Law The number of transistors that can be placed inexpensively on an integrated circuit will double approximately every 18 months. Self-fulfilling prophecy Computer architect goal Software developer

More information

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology CS8803SC Software and Hardware Cooperative Computing GPGPU Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology Why GPU? A quiet revolution and potential build-up Calculation: 367

More information

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors Computer and Information Sciences College / Computer Science Department CS 207 D Computer Architecture Lecture 9: Multiprocessors Challenges of Parallel Processing First challenge is % of program inherently

More information

Introduction to GPU computing

Introduction to GPU computing Introduction to GPU computing Nagasaki Advanced Computing Center Nagasaki, Japan The GPU evolution The Graphic Processing Unit (GPU) is a processor that was specialized for processing graphics. The GPU

More information

High Performance Computing Course Notes Course Administration

High Performance Computing Course Notes Course Administration High Performance Computing Course Notes 2009-2010 2010 Course Administration Contacts details Dr. Ligang He Home page: http://www.dcs.warwick.ac.uk/~liganghe Email: liganghe@dcs.warwick.ac.uk Office hours:

More information

Duksu Kim. Professional Experience Senior researcher, KISTI High performance visualization

Duksu Kim. Professional Experience Senior researcher, KISTI High performance visualization Duksu Kim Assistant professor, KORATEHC Education Ph.D. Computer Science, KAIST Parallel Proximity Computation on Heterogeneous Computing Systems for Graphics Applications Professional Experience Senior

More information

Parallel Architecture. Hwansoo Han

Parallel Architecture. Hwansoo Han Parallel Architecture Hwansoo Han Performance Curve 2 Unicore Limitations Performance scaling stopped due to: Power Wire delay DRAM latency Limitation in ILP 3 Power Consumption (watts) 4 Wire Delay Range

More information

Current Trends in Computer Graphics Hardware

Current Trends in Computer Graphics Hardware Current Trends in Computer Graphics Hardware Dirk Reiners University of Louisiana Lafayette, LA Quick Introduction Assistant Professor in Computer Science at University of Louisiana, Lafayette (since 2006)

More information

Multi-Processors and GPU

Multi-Processors and GPU Multi-Processors and GPU Philipp Koehn 7 December 2016 Predicted CPU Clock Speed 1 Clock speed 1971: 740 khz, 2016: 28.7 GHz Source: Horowitz "The Singularity is Near" (2005) Actual CPU Clock Speed 2 Clock

More information

Parallelism and Concurrency. COS 326 David Walker Princeton University

Parallelism and Concurrency. COS 326 David Walker Princeton University Parallelism and Concurrency COS 326 David Walker Princeton University Parallelism What is it? Today's technology trends. How can we take advantage of it? Why is it so much harder to program? Some preliminary

More information

What are Clusters? Why Clusters? - a Short History

What are Clusters? Why Clusters? - a Short History What are Clusters? Our definition : A parallel machine built of commodity components and running commodity software Cluster consists of nodes with one or more processors (CPUs), memory that is shared by

More information

CDA3101 Recitation Section 13

CDA3101 Recitation Section 13 CDA3101 Recitation Section 13 Storage + Bus + Multicore and some exam tips Hard Disks Traditional disk performance is limited by the moving parts. Some disk terms Disk Performance Platters - the surfaces

More information

The Use of Cloud Computing Resources in an HPC Environment

The Use of Cloud Computing Resources in an HPC Environment The Use of Cloud Computing Resources in an HPC Environment Bill, Labate, UCLA Office of Information Technology Prakashan Korambath, UCLA Institute for Digital Research & Education Cloud computing becomes

More information

Chapter 1: Introduction to Parallel Computing

Chapter 1: Introduction to Parallel Computing Parallel and Distributed Computing Chapter 1: Introduction to Parallel Computing Jun Zhang Laboratory for High Performance Computing & Computer Simulation Department of Computer Science University of Kentucky

More information

Introduction to Parallel Processing

Introduction to Parallel Processing Babylon University College of Information Technology Software Department Introduction to Parallel Processing By Single processor supercomputers have achieved great speeds and have been pushing hardware

More information

10 Parallel Organizations: Multiprocessor / Multicore / Multicomputer Systems

10 Parallel Organizations: Multiprocessor / Multicore / Multicomputer Systems 1 License: http://creativecommons.org/licenses/by-nc-nd/3.0/ 10 Parallel Organizations: Multiprocessor / Multicore / Multicomputer Systems To enhance system performance and, in some cases, to increase

More information

MANY-CORE COMPUTING. 7-Oct Ana Lucia Varbanescu, UvA. Original slides: Rob van Nieuwpoort, escience Center

MANY-CORE COMPUTING. 7-Oct Ana Lucia Varbanescu, UvA. Original slides: Rob van Nieuwpoort, escience Center MANY-CORE COMPUTING 7-Oct-2013 Ana Lucia Varbanescu, UvA Original slides: Rob van Nieuwpoort, escience Center Schedule 2 1. Introduction, performance metrics & analysis 2. Programming: basics (10-10-2013)

More information