Flynn s Classification

Similar documents
Classification of Parallel Architecture Designs

Module 5 Introduction to Parallel Processing Systems

Parallel Computer Architectures. Lectured by: Phạm Trần Vũ Prepared by: Thoại Nam

EC 513 Computer Architecture

Course II Parallel Computer Architecture. Week 2-3 by Dr. Putu Harry Gunawan

Lecture 8: RISC & Parallel Computers. Parallel computers

Processor Architecture and Interconnect

Parallel Processors. The dream of computer architects since 1950s: replicate processors to add performance vs. design a faster processor

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it

Parallel computer architecture classification

Lecture 26: Parallel Processing. Spring 2018 Jason Tang

Multiple Issue and Static Scheduling. Multiple Issue. MSc Informatics Eng. Beyond Instruction-Level Parallelism

Computer Architecture

ARCHITECTURAL CLASSIFICATION. Mariam A. Salih

Chapter 4 Data-Level Parallelism

Lecture 7: Parallel Processing

Parallel Processors. Session 1 Introduction

Serial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing

Lecture 7: Parallel Processing

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model

ARCHITECTURES FOR PARALLEL COMPUTATION

WHY PARALLEL PROCESSING? (CE-401)

RISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard.

Online Course Evaluation. What we will do in the last week?

Parallel Processing. Computer Architecture. Computer Architecture. Outline. Multiple Processor Organization

Part VII Advanced Architectures. Feb Computer Architecture, Advanced Architectures Slide 1

! Readings! ! Room-level, on-chip! vs.!

RAID 0 (non-redundant) RAID Types 4/25/2011

10 Parallel Organizations: Multiprocessor / Multicore / Multicomputer Systems

PIPELINE AND VECTOR PROCESSING

Multiprocessors. Flynn Taxonomy. Classifying Multiprocessors. why would you want a multiprocessor? more is better? Cache Cache Cache.

Computer Architecture: Parallel Processing Basics. Prof. Onur Mutlu Carnegie Mellon University

Advanced d Processor Architecture. Computer Systems Laboratory Sungkyunkwan University

Hakim Weatherspoon CS 3410 Computer Science Cornell University

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors

CS425 Computer Systems Architecture

omputer Design Concept adao Nakamura

Advanced Computer Architectures CC721

Architectures of Flynn s taxonomy -- A Comparison of Methods

RISC, CISC, and ISA Variations

Pipelining to Superscalar

Multiple Issue ILP Processors. Summary of discussions

CMSC 611: Advanced. Parallel Systems

Microprocessor Architecture Dr. Charles Kim Howard University

18-447: Computer Architecture Lecture 30B: Multiprocessors. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/22/2013

Pipelining and Vector Processing

THREAD LEVEL PARALLELISM

The University of Texas at Austin

Unit 9 : Fundamentals of Parallel Processing

Instruction Level Parallelism. Appendix C and Chapter 3, HP5e

Static Compiler Optimization Techniques

ECE 172 Digital Systems. Chapter 4.2 Architecture. Herbert G. Mayer, PSU Status 6/10/2018

Introduction. EE 4504 Computer Organization

Architecture of parallel processing in computer organization

SMD149 - Operating Systems - Multiprocessing

Overview. SMD149 - Operating Systems - Multiprocessing. Multiprocessing architecture. Introduction SISD. Flynn s taxonomy

Announcement. Computer Architecture (CSC-3501) Lecture 25 (24 April 2008) Chapter 9 Objectives. 9.2 RISC Machines

Lecture 9: Multiple Issue (Superscalar and VLIW)

Computer Architecture Lecture 27: Multiprocessors. Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 4/6/2015

Computer Architecture Spring 2016

Issues in Parallel Processing. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

A taxonomy of computer architectures

ARCHITECTURES FOR PARALLEL COMPUTATION

Advanced Computer Architecture. The Architecture of Parallel Computers

ECE/CS 552: Pipelining to Superscalar Prof. Mikko Lipasti

LECTURE 10. Pipelining: Advanced ILP

Fundamentals of Computers Design

Multiprocessors & Thread Level Parallelism

This course provides an overview of the SH-2 32-bit RISC CPU core used in the popular SH-2 series microcontrollers

Advanced Parallel Architecture. Annalisa Massini /2017

Processor Performance and Parallelism Y. K. Malaiya

CS152 Computer Architecture and Engineering CS252 Graduate Computer Architecture. VLIW, Vector, and Multithreaded Machines

Hakam Zaidan Stephen Moore

Computer Architecture 计算机体系结构. Lecture 4. Instruction-Level Parallelism II 第四讲 指令级并行 II. Chao Li, PhD. 李超博士

Fundamentals of Computer Design

CMSC 313 Lecture 27. System Performance CPU Performance Disk Performance. Announcement: Don t use oscillator in DigSim3

Instruction Level Parallelism

Portland State University ECE 588/688. Cray-1 and Cray T3E

Objective. We will study software systems that permit applications programs to exploit the power of modern high-performance computers.

anced computer architecture CONTENTS AND THE TASK OF THE COMPUTER DESIGNER The Task of the Computer Designer

Chapter 18 Parallel Processing

Parallel Computing: Parallel Architectures Jin, Hai

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 6. Parallel Processors from Client to Cloud

Pipeline and Vector Processing 1. Parallel Processing SISD SIMD MISD & MIMD

Embedded Systems Architecture. Computer Architectures

COSC 6385 Computer Architecture - Thread Level Parallelism (I)

Multi-cycle Instructions in the Pipeline (Floating Point)

Design of Digital Circuits Lecture 21: GPUs. Prof. Onur Mutlu ETH Zurich Spring May 2017

DHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY. Department of Computer science and engineering

COSC4201 Multiprocessors

BlueGene/L (No. 4 in the Latest Top500 List)

What is Pipelining. work is done at each stage. The work is not finished until it has passed through all stages.

Architectures & instruction sets R_B_T_C_. von Neumann architecture. Computer architecture taxonomy. Assembly language.

CSE 490/590 Computer Architecture Homework 2

FLYNN S TAXONOMY OF COMPUTER ARCHITECTURE

Computer Architecture: SIMD and GPUs (Part I) Prof. Onur Mutlu Carnegie Mellon University

IF1/IF2. Dout2[31:0] Data Memory. Addr[31:0] Din[31:0] Zero. Res ALU << 2. CPU Registers. extension. sign. W_add[4:0] Din[31:0] Dout[31:0] PC+4

Adapted from instructor s. Organization and Design, 4th Edition, Patterson & Hennessy, 2008, MK]

Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. See P&H Appendix , and 2.21

These slides do not give detailed coverage of the material. See class notes and solved problems (last page) for more information.

Transcription:

Flynn s Classification Guang R. Gao ACM Fellow and IEEE Fellow Endowed Distinguished Professor Electrical & Computer Engineering University of Delaware ggao@capsl.udel.edu 652-14F-PXM-intro 1

Execution Model Programming Models Users Users Programming Environment Platforms Execution Model API Abstract Machine Models Execution Model and Abstract Machines 652-12F-PXM-intro 2

Classification of Parallel Architecture Designs Flynn (1972) Problem pipelined computer is not well classified there are arch. which may be in >1 classes Consider Pipelining Functional Array MIMD 652-14F-PXM-intro 3

Reading List Slides. Henn&Patt: Chapter 8.1, 8.11 (may change depending on your book s version). Other assigned readings from homework and classes 652-14F-PXM-intro 4

Level of Parallelism Job level between jobs between phases of a job Program level between parts of a program within do-loops between different function invocations Instruction stream (thread) level Instruction level between phases of instruction execution between instructions Arithmetic and bit level within ALU units 652-14F-PXM-intro 5

Job Level Different job phases exp: CPU activities - I/O activities the overlapping may be achieved by programmer visible scheduling of resources Different jobs OS T1 T2 CPU job 1 job 2 I/O job 2 job 1 Architecture requirement: a balanced set of replicated resources in a computer installation. 652-14F-PXM-intro 6

Common memory Computational processor (stage 1) Input/output processor (stage 2) Stage 1 2 Task 1 Task 2 Task 1 Task 3... Task 2 Task 1 Task 3 Task1 Time idle time CPU/IO Overlapping 652-14F-PXM-intro 7

Program Level Parallelism Different code sections: diff. procedure/functions diff. code blocks Different iterations for the same loop Data-dependencies and program partitioning 652-14F-PXM-intro 8

Instruction Level Parallelism (ILP) Between instructions parallel execution of different instructions - spatial key: dependency between instructions Between phases of instructions overlapping different suboperations - pipelining pipelining of a suboperations itself, e.g. ALU pipelining 652-14F-PXM-intro 9

Pipeline: Overlay vs. Pipeline tightly coupled subfunctions fixed basic stage time independent basic function evaluation Overlap loosely coupled subfunctions variable basic stage time dependency between function evaluation 652-14F-PXM-intro 10

1 2 3 N 1 2 3 N 1 2 3 N 1 2 3 N 1 2 3 N Time Hardware design method: RT Principles of Pipelining 652-14F-PXM-intro 11

Instruction Fetch Instruction Decode Execute Memory Op Register Update Pipelining of an Instruction Execution 652-14F-PXM-intro 12

Hazards Any conditions within the pipelined system that disrupt, delay or prevent smooth flow of tasks through the pipelines. The detection and resolution of hazards constitute a major aspect of pipeline design Types of hazards 652-14F-PXM-intro 13

Flynn(72) Classification of parallel architecture is not based on the structure of the machine, but based on how the machine relates its instructions (streams) to the data (stream) being processed. A stream: a sequence of items: Instructions/Data. being executed or operated on by a processor. 652-14F-PXM-intro 14

S I S D + I L P S I M D + Vector M I S D M I M D I L P gains increasing attention! 652-14F-PXM-intro 15

S I S D for practical purpose: only one processor is useful pipelining may or may not be used, exp: CDC 6600 (not pipelined) CDC 7600 (pipelined ALU) Amdahl 470V/6 IBM 360/91 RISC (Pipelined instruction processing) often called as serial scalar computer 652-14F-PXM-intro 16

SIMD (single inst stream/multiple data stream) single processor vector operations one v-op includes many ops on a data stream both pipelined processing or array of processors are possible Example: CRAY -1 ILLIAC-IV ICL DAP 652-14F-PXM-intro 17

............... IS I/O CU IS PU DS MU (a) SISD uniprocessor architecture Program loaded from host IS CU PE 1 LM DS 1 DS PE n LM n DS DS Data sets loaded from host (b) SIMD architecture (with distributed memory) Captions: C = Control Unit PU = Processing Unit MU= Memory Unit IS = Instructin Stream DS = Data Stream PE = Processing Element LM = Local Memory I/O I/O IS IS CU 1 CU 1 IS IS PU 1 PU n DS DS Shard Memory (c) MIMD architecture (with shared memory) 652-14F-PXM-intro 18

Problems of Flynn s Scheme too broad everything in SIMD: vector machines? MISD? 652-14F-PXM-intro 19

Computers Single I stream Multiple I stream Single unpipelined E unit Serial unicomputers pipelined or multiple E units Parallel unicomputers MIMD The broad subdivisions in computer architecture 652-14F-PXM-intro 20

Unpipelined S I S D + I L P Pipelined Vector Multiple E unit Only scalar instructions Vector instructions Horizontal control Issue-whenready reg-reg mem-mem CDC 6600 FPS AP-120B CDC 7600 CRAY-1 CDC Cyber-205 VLIW IBM 360/91 Parallel unicomputers based on functional parallelism and pipelining 652-14F-PXM-intro 21

ILP Architectures Multiple inst /op issuing + deep-pipelining Superscalar Multiple inst/cycle (Power PC, HP Precision, Intel Pentium) VLIW Multiple op in one inst/cycle ESL/polycyclic or Cydra5 Multiflow new HP/Intel mp Intel ia-64 architecture? Superpipelined MIPS 4000/8000 DEC Alpha (earlier versions) Decoupled Arch Multithreaded Arch EARTH/MTA Multiscalar [Sohi] 652-14F-PXM-intro 22