CS650 Computer Architecture. Lecture 10 Introduction to Multiprocessors and PC Clustering

Andrew Sohn
Computer Science Department
New Jersey Institute of Technology
Lecture 10: Intro to Multiprocessors/Clustering 1/15 12/7/2003 A. Sohn

Key Issues
Run PhotoShop on 1 PC or N PCs
- Programmability: how to program a bunch of PCs viewed as a single logical machine
- Performance: scalability and speedup
    Run PhotoShop on 1 PC (forget the specs of this PC)
    Run PhotoShop on N PCs
    Will it run faster on N PCs? Speedup = ?

Types of Multiprocessors
Key: Data and Instruction
- Single Instruction Single Data (SISD): Intel processors, AMD processors
- Single Instruction Multiple Data (SIMD): array processor; Pentium X feature
- Multiple Instruction Single Data (MISD): systolic array; special-purpose machines
- Multiple Instruction Multiple Data (MIMD): typical multiprocessors (Sun, SGI, Cray, ...)
- Single Program Multiple Data (SPMD): programming model

Shared-Memory Multiprocessor
[Figure: processors connected through an interconnection network to main memory, storage, and I/O]

Distributed-Memory Multiprocessor
[Figure: nodes connected by an interconnection network]

Key Issues
- Programmability: Intel Pentium, AMD machines
- Performance (Scalability): Speedup
    Run PhotoShop on 1 machine, or on n machines
    Execution time on 1 machine = x
    Execution time on n machines = y
    Speedup = x/y
    The best speedup expected is n; the measured speedup is sometimes higher for various reasons, including cache behavior.
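The speedup definition above is small enough to write down directly. The following is a minimal sketch in C; the function names are illustrative, and `efficiency` (speedup divided by the number of machines) is a common companion metric that does not appear on the slide.

```c
/*
 * Speedup as defined on the slide: execution time on 1 machine (x)
 * divided by execution time on n machines (y).
 * Assumes both times are positive.
 */
double speedup(double time_one, double time_n)
{
    return time_one / time_n;
}

/*
 * Efficiency: speedup per machine. Not on the slide; added here as an
 * assumption because it makes "the best speedup is n" easy to check:
 * an efficiency of 1.0 means the speedup equals n.
 */
double efficiency(double time_one, double time_n, int n)
{
    return speedup(time_one, time_n) / n;
}
```

For example, if a job takes 100 seconds on 1 machine and 25 seconds on 4 machines, `speedup(100.0, 25.0)` is 4 and `efficiency(100.0, 25.0, 4)` is 1, which is the ideal case for n = 4.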

Parallelization of Applications
[Figure: a program divided into a parallel portion (80%) and a serial portion (20%)]

Parallelization for 4 Processors
Consider a loop with 100 iterations:
    for (i=0;i<100;i++)
The loop has no loop-carried dependence, so it is straightforward to parallelize the code for 4 processors. Each processor executes 1/4 of the loop, i.e., 25 iterations, as follows:
    P0: for (i=0;i<25;i++)
    P1: for (i=25;i<50;i++)
    P2: for (i=50;i<75;i++)
    P3: for (i=75;i<100;i++)
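The hand-written bounds for P0 to P3 follow a simple rule that can be computed for any processor count. The sketch below is one way to do it, assuming contiguous (block) partitioning as on the slide; the `struct range` type and the `block_range` function are hypothetical names, and the handling of iteration counts that do not divide evenly is an addition the slide does not cover.

```c
/* Half-open iteration range [begin, end) assigned to one processor. */
struct range {
    int begin;
    int end;
};

/*
 * Split iterations 0..total-1 into nprocs contiguous blocks.
 * When total is not a multiple of nprocs, the first (total % nprocs)
 * processors each take one extra iteration.
 * Assumes total >= 0, nprocs > 0, and 0 <= rank < nprocs.
 */
struct range block_range(int total, int nprocs, int rank)
{
    int base = total / nprocs;   /* iterations every processor gets */
    int extra = total % nprocs;  /* leftover iterations to hand out */
    struct range r;

    r.begin = rank * base + (rank < extra ? rank : extra);
    r.end = r.begin + base + (rank < extra ? 1 : 0);
    return r;
}
```

With `total = 100` and `nprocs = 4`, ranks 0 through 3 receive [0, 25), [25, 50), [50, 75), and [75, 100), matching the loops listed for P0 to P3. Each processor would then run `for (i = r.begin; i < r.end; i++)`.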

Shared-Memory Multiprocessor
[Figure: the shared-memory diagram (processors, interconnection network, main memory, storage, I/O), annotated with the loop for (i=0;i<100;i++)]

Distributed-Memory Multiprocessor
[Figure: the distributed-memory diagram, with one 25-iteration loop per node around the interconnection network:
    for (i=0;i<25;i++)
    for (i=25;i<50;i++)
    for (i=50;i<75;i++)
    for (i=75;i<100;i++)]
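To make the two pictures concrete, the sketch below simulates the 4-way split serially: each "processor" runs only its own 25 iterations and produces a partial result, and a final step combines them. Several things here are assumptions: the slides do not show a loop body, so summing the index `i` stands in for real work; the names `partial_sum` and `combined_sum` are hypothetical; and the combine step is an ordinary loop, whereas on a distributed-memory machine the partial results would have to travel over the interconnection network.

```c
#define NPROCS 4    /* number of processors, as on the slide */
#define NITER 100   /* total loop iterations, as on the slide */

/*
 * Work done by one processor: run its own block of 25 iterations.
 * Summing i is a stand-in loop body (assumption, not from the slides).
 */
long partial_sum(int rank)
{
    int chunk = NITER / NPROCS;   /* 25 iterations per processor */
    int begin = rank * chunk;
    int end = begin + chunk;
    long sum = 0;
    int i;

    for (i = begin; i < end; i++)
        sum += i;
    return sum;
}

/*
 * Combine step, simulated serially. With shared memory the partial
 * results could be written to one shared variable; with distributed
 * memory each node would send its result to a collecting node.
 */
long combined_sum(void)
{
    long total = 0;
    int rank;

    for (rank = 0; rank < NPROCS; rank++)
        total += partial_sum(rank);
    return total;
}
```

The combined result equals what the original single loop over 0..99 would produce, which is the property any correct parallelization of a loop without loop-carried dependence must preserve.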

Amdahl's Law
Maximum speedup is limited by the serial fraction of a program.
    N = the number of processors
    s = the time spent by a processor on the serial part of a program
    p = the time spent by a processor on the parallel part of a program
    t = total time = s + p = 1
Maximum possible speedup is given by:
\[ D = \frac{s + p}{s + \frac{p}{N}} = \frac{1}{1 - p + \frac{p}{N}} = \frac{1}{1 - p\left(1 - \frac{1}{N}\right)} \]
Solving for p:
\[ p = \frac{1 - \frac{1}{D}}{1 - \frac{1}{N}} \]

Amdahl's Law - Example
With N = 10 PCs and a target speedup of D = 8, the portion of the program to be parallelized is
\[ p = \frac{1 - \frac{1}{D}}{1 - \frac{1}{N}} = \frac{1 - \frac{1}{8}}{1 - \frac{1}{10}} = \frac{0.875}{0.9} \approx 0.972 \]
What do these numbers mean? To achieve an 8-fold speedup on 10 PCs, 97.2% of the entire code has to be parallelized!
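Both directions of Amdahl's law can be checked numerically. The following is a minimal sketch in C using the slide's symbols (p, N, D) with s = 1 - p; the function names are illustrative, not from the lecture.

```c
/*
 * Amdahl's law: speedup D on n processors when a fraction p of the
 * single-processor run time is parallelizable (serial part s = 1 - p).
 * Assumes 0 <= p <= 1 and n >= 1.
 */
double amdahl_speedup(double p, int n)
{
    return 1.0 / (1.0 - p + p / n);
}

/*
 * Inverse form: the parallel fraction p required to reach speedup d
 * on n processors. Assumes n > 1 and 1 <= d <= n.
 */
double required_parallel_fraction(double d, int n)
{
    return (1.0 - 1.0 / d) / (1.0 - 1.0 / n);
}
```

`required_parallel_fraction(8.0, 10)` evaluates to 0.875 / 0.9, about 0.972, matching the example. Going the other way, the 80% parallel portion pictured earlier in the lecture gives `amdahl_speedup(0.8, 4)` of about 2.5, and no processor count can push it to 5, because the 20% serial portion alone caps the speedup at 1 / 0.2.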

Amdahl's Law - Example
[Figure: main() { } divided into a parallel portion (97.2%) and a serial portion (2.8%)]

Clustering of PCs
A Poor Man's Supercomputer
- Linux box clustering
- MS Windows clustering
For scalability and high availability
- Mail server clustering: MS Exchange
- DB server clustering: MS SQL Server, MySQL server, ...
- VPN box clustering
- Firewall clustering
- Filtering clustering
- Web clustering

Data Center Infrastructure
Chapter 6 of our textbook