Chapter 06: Instruction Pipelining and Parallel Processing. Lesson 14: Example of the Pipelined CISC and RISC Processors

Similar documents
Chapter 04: Instruction Sets and the Processor organizations. Lesson 20: RISC and converged Architecture

Real instruction set architectures. Part 2: a representative sample

Basic Computer Architecture

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 14 Instruction Level Parallelism and Superscalar Processors

CISC / RISC. Complex / Reduced Instruction Set Computers

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-11: 80x86 Architecture

Processing Unit CS206T

EN164: Design of Computing Systems Lecture 24: Processor / ILP 5

Intel released new technology call P6P

Architectures & instruction sets R_B_T_C_. von Neumann architecture. Computer architecture taxonomy. Assembly language.

Chapter 02: Computer Organization. Lesson 02: Functional units and components in a computer organization- Part 1: Processor

RISC Processors and Parallel Processing. Section and 3.3.6

Advanced processor designs

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 12 Processor Structure and Function

Advanced Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Chapter 5:: Target Machine Architecture (cont.)

Superscalar Machines. Characteristics of superscalar processors

Advanced Processor Architecture

EN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design

EJEMPLOS DE ARQUITECTURAS

IF1 --> IF2 ID1 ID2 EX1 EX2 ME1 ME2 WB. add $10, $2, $3 IF1 IF2 ID1 ID2 EX1 EX2 ME1 ME2 WB sub $4, $10, $6 IF1 IF2 ID1 ID2 --> EX1 EX2 ME1 ME2 WB

Advanced d Processor Architecture. Computer Systems Laboratory Sungkyunkwan University

CPU Structure and Function

EE 4980 Modern Electronic Systems. Processor Advanced

Assembly Language for Intel-Based Computers, 4 th Edition. Chapter 2: IA-32 Processor Architecture Included elements of the IA-64 bit

EC 513 Computer Architecture

ECE 486/586. Computer Architecture. Lecture # 8

ECE 341. Lecture # 15

Lecture Topics. Branch Condition Options. Branch Conditions ECE 486/586. Computer Architecture. Lecture # 8. Instruction Set Principles.

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Superscalar Processors

Like scalar processor Processes individual data items Item may be single integer or floating point number. - 1 of 15 - Superscalar Architectures

ASSEMBLY LANGUAGE MACHINE ORGANIZATION

Ti Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr

Superscalar Processors

Case Study IBM PowerPC 620

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola

Complex Pipelines and Branch Prediction

Pipelining and Vector Processing

ELC4438: Embedded System Design Embedded Processor

Lecture 4: Instruction Set Design/Pipelining

Cycles Per Instruction For This Microprocessor

More advanced CPUs. August 4, Howard Huang 1

History of the Intel 80x86

CISC 662 Graduate Computer Architecture Lecture 13 - CPI < 1

PowerPC 740 and 750

Microprocessor Architecture Dr. Charles Kim Howard University

UNIT- 5. Chapter 12 Processor Structure and Function

Chapter 04: Instruction Sets and the Processor organizations. Lesson 08: Processor Instructions - Part 1

Computer System Architecture

SUPERSCALAR AND VLIW PROCESSORS

Instruction Set Architecture (ISA)

Computer Architecture and Data Manipulation. Von Neumann Architecture

Chapter 2 Logic Gates and Introduction to Computer Architecture

Computer Organization CS 206 T Lec# 2: Instruction Sets

omputer Design Concept adao Nakamura

Pipeline Processors David Rye :: MTRX3700 Pipelining :: Slide 1 of 15

Topics Power tends to corrupt; absolute power corrupts absolutely. Computer Organization CS Data Representation

Administration CS 412/413. Instruction ordering issues. Simplified architecture model. Examples. Impact of instruction ordering

CS 101, Mock Computer Architecture

SAE5C Computer Organization and Architecture. Unit : I - V

6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU

Lecture 5: Instruction Pipelining. Pipeline hazards. Sequential execution of an N-stage task: N Task 2

Superscalar Processing (5) Superscalar Processors Ch 14. New dependency for superscalar case? (8) Output Dependency?

Superscalar Processors Ch 14

Reduction of Data Hazards Stalls with Dynamic Scheduling So far we have dealt with data hazards in instruction pipelines by:

ECE 571 Advanced Microprocessor-Based Design Lecture 4

Chapter 12. CPU Structure and Function. Yonsei University

TECH. CH14 Instruction Level Parallelism and Superscalar Processors. What is Superscalar? Why Superscalar? General Superscalar Organization

Chapter 2: Data Manipulation

NPTEL. High Performance Computer Architecture - Video course. Computer Science and Engineering.

E0-243: Computer Architecture

CISC, RISC and post-risc architectures

Chapter 2: Data Manipulation

Lecture 4: RISC Computers

This Material Was All Drawn From Intel Documents

Chapter 13 Reduced Instruction Set Computers

Page 1. Structure of von Nuemann machine. Instruction Set - the type of Instructions

ECE 485/585 Microprocessor System Design

Processors. Young W. Lim. May 9, 2016

17. Instruction Sets: Characteristics and Functions

Advance CPU Design. MMX technology. Computer Architectures. Tien-Fu Chen. National Chung Cheng Univ. ! Basic concepts

Chapter 2: Data Manipulation. Copyright 2015 Pearson Education, Inc.

CS425 Computer Systems Architecture

CISC Attributes. E.g. Pentium is considered a modern CISC processor

Photo David Wright STEVEN R. BAGLEY PIPELINES AND ILP

Advanced Computer Architecture

Next Generation Technology from Intel Intel Pentium 4 Processor

Modern Computer Architecture (Processor Design) Prof. Dan Connors

Processor: Superscalars Dynamic Scheduling

The von Neumann Architecture. IT 3123 Hardware and Software Concepts. The Instruction Cycle. Registers. LMC Executes a Store.

ARM Processor. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

What is Superscalar? CSCI 4717 Computer Architecture. Why the drive toward Superscalar? What is Superscalar? (continued) In class exercise

Instruction Sets: Characteristics and Functions

Reorder Buffer Implementation (Pentium Pro) Reorder Buffer Implementation (Pentium Pro)

Lecture 4: RISC Computers

Department of Industrial Engineering. Sharif University of Technology. Operational and enterprises systems. Exciting directions in systems

ECE 486/586. Computer Architecture. Lecture # 7

PowerPC 620 Case Study

Modern Processors. RISC Architectures

Transcription:

Chapter 06: Instruction Pipelining and Parallel Processing Lesson 14: Example of the Pipelined CISC and RISC Processors 1

Objective To understand pipelines and parallel pipelines in CISC and RISC Processors by Examples

Most modern processors Improve performance SunSparc, Pentiums, PowerPCs are the superscalars and examples of pipelining and parallel processing of instructions 3

Pentium and PowerPC pipelines in parallel Performance improves in processors by allowing independent instructions to execute simultaneously (instruction-level parallelism) 4

Pentiums 5

Pentiums The superscalar processors Have simple I-bit-technique based dynamic branch prediction 6

Pentiums Two pipelines Named as U and V V pipeline for simple instructions and U for any instruction 64-bit wide external data bus The conditional jump was made a second instruction among the two running in parallel 7

Pentium U and V pipelines 8

Two instructions proceed through the parallel pipelines Two instructions proceed through the parallel pipelines at one stage per cycle, until they reach the register (result) write-back (WB) stage At WB the execution of instruction I n is complete in pipeline 1 and I n+1 in 2 9

Pentium U and V pipelines In earlier versions, Pentium processors executed two integer instructions or two floating-point instructions in parallel 10

Pentium U and V pipelines Pentium s later versions had three independent units, two for integer operations and one for floating-point operations Pentium Pro version had an additional pipeline and supported two integer operations and two floating-point operations in four pipelines 11

Pentium U and V pipelines Pentium II versions had additional MMX instructions 12

Pentium MMX (a) 128-bit MMX registers, each for two packed 64-bit floating-point operations and pipelines has 20-stages (b) 128-bit MMX registers, each for two packed 64-bit floating-point operations and pipelines has 10-stages 13

Pentium MMX (c) Two number 64-bit MMX registers, each for one packed 64-bit floating-point operations and pipelines has 5 stages (d) Four number 32-bit MMX registers, each for two packed 64-bit floating-point operations and pipelines has four sets of 5-stage pipelines. 14

Pentium III versions Introduced single instruction multiple data SIMD instructions with extension for execution of streaming floating point operations in pipelines Pipelines had 10 stages 15

Pentium IV versions Introduced in the twenty-first century with 1GHz plus clock cycles 128-bit XMM (extended multi media) registers, each for two packed 64-bit floatingpoint operations The 128-bit registers handle long integers also Pipelines have 20 stages 16

PowerPC Processor 17

PowerPC IBM, Motorola, and Apple developed the PowerPC superscalars An RISC design but provides for more than 200 instructions at the instruction set 18

PowerPC Load and store architecture (only the LD and ST access the memory and not the arithmetic and logic instructions) Separate GPRs and floating-point registers 19

PowerPC PowerPC 601 and 603 have three pipelines each one for branch processing, one for integers (E-integer Pipeline), and one for floating-point (E-floating point Pipeline) operations 20

Uneven number of Stages in Pipelines There are five stages in the integerprocessing pipeline, six in the floating-point pipeline, and two in the branch-processing pipeline stages 21

Stages in Pipelines The second stage in E-integer line decodes, reads, and dispatches operands The third stage in E-integer line executes and computes memory addresses The fourth stage is cache access 22

E-unit Integer Operations Pipeline 23

E-unit Floating point operations pipeline 24

E-branch pipeline 25

PowerPC version 604 Six pipelines Three integer and one floating-point instruction in parallel per clock cycle. There were six independent units, two for integer operations, two for floating-point operations, and two for branch processing 26

PowerPC version 620 PowerPC 620 had 64-bit operations. It also had six pipelines and three integers and one floatingpoint instruction in parallel per clock cycle 27

PowerPC 750 series Introduced in twenty-first century and has many features. MPC7XXX has eleven independent pipeline units load and store processing unit, branch-processing unit, floating-point unit, four integer operations, and two for floating-point operations and packed data operations. 28

ARM 29

ARM (Advance RISC Machines) ARM (Advance RISC Machines) developed the large variations in superscalars as per application-specific needs 30

ARM (Advance RISC Machines) It has separate 16 GPRs, CPSR, and SPSR (Current Process Saved Status Register and Saved Process Status Register) 31

ARM RISC design ARM7TDMI had three stages pipeline 32

ARM7 three stage pipeline 33

Pipelines in StrongARM and ARM 9 Five Clock Cycle Stages Pipeline Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 StrongARM has 5-stage pipeline and a high-speed multiplier circuitry ARM9 has 5-stage pipeline 34

Pipelines in ARM 10 Five Stages Clock Cycle Pipeline Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 Stage 6 ARM10 has 6-stage pipelines 35

Summary 36

We learnt Superscalar Processors Pentium, PowerPC ARM pipelines 37

End of Lesson 14 on Example of the Pipelined CISC and RISC Processors