C 1. Last Time. CSE 490/590 Computer Architecture. ISAs and MIPS. Instruction Set Architecture (ISA) ISA to Microarchitecture Mapping

Similar documents
ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 4 Reduced Instruction Set Computers

CS 152 Computer Architecture and Engineering. Lecture 3 - From CISC to RISC

Simple Instruction Pipelining

CS 152 Computer Architecture and Engineering. Lecture 3 - From CISC to RISC

Lecture 4 - Pipelining

C 1. Last time. CSE 490/590 Computer Architecture. Complex Pipelining I. Complex Pipelining: Motivation. Floating-Point Unit (FPU) Floating-Point ISA

ECE 4750 Computer Architecture Topic 2: From CISC to RISC

Lecture 3 - From CISC to RISC

Lecture 6 Datapath and Controller

Lecture 08: RISC-V Single-Cycle Implementa9on CSE 564 Computer Architecture Summer 2017

CS 152 Computer Architecture and Engineering. Lecture 2 - Simple Machine Implementations

CS 152 Computer Architecture and Engineering. Lecture 10 - Complex Pipelines, Out-of-Order Issue, Register Renaming

Lecture 3 - From CISC to RISC

CS 152 Computer Architecture and Engineering. Lecture 2 - Simple Machine Implementations

Introduction. What is Computer Architecture? Design constraints. What is Computer Architecture? Computer Architecture ELEC3441

Introduction. What is Computer Architecture? Meltdown & Spectre. Meltdown & Spectre. Computer Architecture ELEC /1/17. Dr. Hayden Kwok-Hay So

Computer Architecture ELEC3441

ECE 252 / CPS 220 Advanced Computer Architecture I. Lecture 8 Instruction-Level Parallelism Part 1

CS 152 Computer Architecture and Engineering. Lecture 16 - VLIW Machines and Statically Scheduled ILP

EC 513 Computer Architecture

ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 2 CISC and Microcoding

CS 152 Computer Architecture and Engineering. Lecture 13 - VLIW Machines and Statically Scheduled ILP

Introduction. What is Computer Architecture? Meltdown & Spectre. Meltdown & Spectre. Computer Architecture ELEC3441. Dr. Hayden Kwok-Hay So

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 13 - Out-of-Order Issue and Register Renaming

Computer Architecture 计算机体系结构. Lecture 2. Instruction Set Architecture 第二讲 指令集架构. Chao Li, PhD. 李超博士

ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 6 Pipelining Part 1

CS 152, Spring 2011 Section 8

CS 152 Computer Architecture and Engineering. Lecture 16: Vector Computers

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 3 - From CISC to RISC

Math 230 Assembly Programming (AKA Computer Organization) Spring MIPS Intro

CS 152, Spring 2011 Section 2

Math 230 Assembly Programming (AKA Computer Organization) Spring 2008

Computer Performance. Relative Performance. Ways to measure Performance. Computer Architecture ELEC /1/17. Dr. Hayden Kwok-Hay So

CS252 Spring 2017 Graduate Computer Architecture. Lecture 8: Advanced Out-of-Order Superscalar Designs Part II

Lecture 13 - VLIW Machines and Statically Scheduled ILP

ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 3 Early Microarchitectures

CS 152 Computer Architecture and Engineering. Lecture 12 - Advanced Out-of-Order Superscalars

Instruction Set Principles. (Appendix B)

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Lecture 4: Instruction Set Architecture

Hakim Weatherspoon CS 3410 Computer Science Cornell University

Computer Architecture

Typical Processor Execution Cycle

Instruction Set Architecture (ISA)

John Wawrzynek Electrical Engineering and Computer Sciences University of California at Berkeley

55:132/22C:160, HPCA Spring 2011

ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 15 Very Long Instruction Word Machines

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

CS 152 Computer Architecture and Engineering. Lecture 3 - From CISC to RISC. Last Time in Lecture 2

RISC, CISC, and ISA Variations

ECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 9 Instruction-Level Parallelism Part 2

Lecture 12 Branch Prediction and Advanced Out-of-Order Superscalars

ECE 252 / CPS 220 Advanced Computer Architecture I. Lecture 14 Very Long Instruction Word Machines

CS 152 Computer Architecture and Engineering. Lecture 11 - Virtual Memory and Caches

Lecture 4 Pipelining Part II

Alternate definition: Instruction Set Architecture (ISA) What is Computer Architecture? Computer Organization. Computer structure: Von Neumann model

CS 152 Computer Architecture and Engineering. Lecture 8 - Memory Hierarchy-III

Instructions: Language of the Computer

VLIW/EPIC: Statically Scheduled ILP

Lecture 7 Pipelining. Peng Liu.

CS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation

Lecture 14: Multithreading

Pipelining! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar DEIB! 30 November, 2017!

Chapter 4. The Processor. Computer Architecture and IC Design Lab

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

Lecture 3. Pipelining. Dr. Soner Onder CS 4431 Michigan Technological University 9/23/2009 1

Lecture 4: Review of MIPS. Instruction formats, impl. of control and datapath, pipelined impl.

ECE 486/586. Computer Architecture. Lecture # 7

EC 513 Computer Architecture

Chapter 2: Instructions How we talk to the computer

Lecture 7 - Memory Hierarchy-II

Processor (I) - datapath & control. Hwansoo Han

Outline. What Makes a Good ISA? Programmability. Implementability

CS 152 Computer Architecture and Engineering. Lecture 2 - Simple Machine Implementations

CS 252 Graduate Computer Architecture. Lecture 4: Instruction-Level Parallelism

CS 152 Computer Architecture and Engineering. Lecture 22: Virtual Machines

Systems Architecture

Multicycle Approach. Designing MIPS Processor

CS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation

COMP3221: Microprocessors and. and Embedded Systems. Instruction Set Architecture (ISA) What makes an ISA? #1: Memory Models. What makes an ISA?

ECE232: Hardware Organization and Design

CS 152 Computer Architecture and Engineering. Lecture 8 - Memory Hierarchy-III

CS252 Spring 2017 Graduate Computer Architecture. Lecture 18: Virtual Machines

ECE369. Chapter 5 ECE369

Lecture 2 - Simple Machine Implementations, Microcode

General Purpose Processors

Design of Digital Circuits 2017 Srdjan Capkun Onur Mutlu (Guest starring: Frank K. Gürkaynak and Aanjhan Ranganathan)

From CISC to RISC. CISC Creates the Anti CISC Revolution. RISC "Philosophy" CISC Limitations

Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. See P&H Appendix , and 2.21

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

CSCI-564 Advanced Computer Architecture

CMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago

C 1. Last time. CSE 490/590 Computer Architecture. Virtual Machines I. Types of Virtual Machine (VM) Outline. User Virtual Machine = ISA + Environment

CS31001 COMPUTER ORGANIZATION AND ARCHITECTURE. Debdeep Mukhopadhyay, CSE, IIT Kharagpur. Instructions and Addressing

Computer Architecture

CS 152 Computer Architecture and Engineering. Lecture 18: Multithreading

CPE 335 Computer Organization. Basic MIPS Architecture Part I

Transcription:

CSE 49/59 Computer Architecture ISAs and MIPS Last Time Computer Architecture >> ISAs and RTL Comp. Arch. shaped by technology and applications Computer Architecture brings a quantitative approach to the table 5 quantitative principles of design The current performance trend sho that Latency lags behind bandwidth Steve Ko Computer Sciences and gineering University at Bualo CSE 49/59, Spring 211 CSE 49/59, Spring 211 2 Instruction Set Architecture (ISA) The contract beten software and hardware Typically described by giving all the programmervisible state (registers + memory) plus the semantics of the ructions that operate on that state IBM 36 was first line of machines to separate ISA from implementation (aka. microarchitecture) Many implementations possible for a given ISA E.g., today you can buy AMD or Intel processors that run the x86-64 ISA. E.g.2: many cellphones use the ARM ISA with implementations from many dierent companies including TI, Qualcomm, Samsung, Marvell, etc. E.g.3., the Soviets build code-compatible clones of the IBM36, as did Amdhal after he left IBM. ISA to Microarchitecture Mapping ISA often designed with particular microarchitectural style in mind, e.g., CISC microcoded RISC hardwired, pipelined VLIW fixed-latency in-order parallel pipelines JVM software interpretation But can be implemented with any microarchitectural style Intel Nehalem: hardwired pipelined CISC (x86) machine (with some microcode support) Intel could implement a dynamically scheduled outof-order VLIW Itanium (IA-64) processor ARM Jaelle: A hardware JVM processor CSE 49/59, Spring 211 3 CSE 49/59, Spring 211 4 Datapath vs Datapath signals ler Microcoded Microarchitecture (CISC) busy? ero? opcode µcontroller (ROM) holds fixed microcode ructions Datapath Points Data r Datapath: Storage, FU, interconnect suicient to perform the desired functions Inputs are Points Outputs are signals ler: State machine to orchestrate operation on the data path Based on desired function and signals CSE 49/59, Spring 211 5 holds user program written in macrocode ructions (e.g., MIPS, x86, etc.) (RAM) enmem MemWrt CSE 49/59, Spring 211 6 C 1

CISC 7 cycles Inst 1 Inst 2 5 cycles 1 cycles Time Variable cycles per ruction Variable ess modes reg-reg, reg-mem, mem-mem, etc. Convenient for programmers Support many ructions Diicult to predict completion time No good for pipelining Inst 3 Hardwired is pure Combinational Logic (RISC) op code ero? combinational logic BSrc Sel MemWrite WBSrc RegDst Src CSE 49/59, Spring 211 7 CSE 49/59, Spring 211 8 A "Typical" RISC ISA 32-bit fixed format ruction (3 formats) 32 32-bit GPR (R contains ero, DP take pair) 3-ess, reg-reg arithmetic ruction Single ess mode for load/store: base + displacement no indirection Simple branch conditions Delayed branch Designed for use by compilers & pipelining see: SPARC, MIPS, HP PA-Risc, DEC Alpha, IBM Por, CDC 66, CDC 76, Cray-1, Cray-2, Cray-3 Example: MIPS Register-Register 31 26 25 21 2 16 15 11 1 6 5 Rs1 Rs2 Register-ediate 31 26 25 21 2 16 15 Rs1 Rd immediate 31 26 25 target Rd x Branch 31 26 25 21 2 16 15 Rs1 Rs2/x immediate Jump / Call CSE 49/59, Spring 211 9 CSE 49/59, Spring 211 1 Hardware Elements Combinational circuits Mux, Decoder,, A A 1 A n-1. Sel lg(n) O Mux Synchronous state elements Flipflop, Register, Register file, SRAM, DRAM D Q D Q Edge-triggered: Data is sampled at the rising edge CSE 49/59, Spring 211 A lg(n) Decoder. O O 1 O n-1 A B Select -, Sub, - And, Or, Xor, Not, - GT, LT, EQ, Zero, Result Comp? Register Files ReadSel1 ReadSel2 WriteSel WriteData Reads are combinational D Q register D 1 Q 1 D 2 Q 2 Clock WE Register rd2 file 2R+1W wd D n-1 Q n-1 ReadData1 ReadData2 CSE 49/59, Spring 211 12 C 2

A Simple Model ress WriteData Writeable Clock MAGIC RAM ReadData Reads and writes are always completed in one cycle a Read can be done any time (i.e. combinational) a Write is performed at the rising clock edge if it is enabled the write ess and data must be stable at the clock edge CSE 49/59 Administrivia Please check the b page: http://www.cse.bualo.edu/~stevko/ courses/cse49/spring11! Don t forget Recitations start from this ek. Please purchase a BASYS2 board (1K) as soon as possible. Projects should be done individually. Please read the syllabus bpage. I have no idea how fast/slow I m going. Please stop me if too fast! CSE 49/59, Spring 211 13 CSE 49/59, Spring 211 14 Implementing MIPS: Single-cycle per ruction datapath & control logic CSE 49/59, Spring 211 15 The MIPS ISA Processor State 32 32-bit, R always contains a 32 single precision FPRs, may also be vied as 16 double precision FPRs FP status register, used for FP compares & exceptions, the program counter some other special registers Data types 8-bit byte, 16-bit half word 32-bit word for integers 32-bit word for single precision floating point 64-bit word for double precision floating point Load/Store style ruction set data essing modes- immediate & indexed branch essing modes- relative & register indirect Byte essable memory- big endian mode All ructions are 32 bits CSE 49/59, Spring 211 16 Example: MIPS Register-Register 31 26 25 21 2 16 15 11 1 6 5 Rs1 Rs2 Register-ediate 31 26 25 21 2 16 15 Rs1 Rd immediate 31 26 25 target Rd x Branch 31 26 25 21 2 16 15 Rs1 Rs2/x immediate Jump / Call Instruction Execution Execution of an ruction involves 1. ruction fetch 2. decode and register fetch 3. operation 4. memory operation (optional) 5. write back and the computation of the ess of the next ruction CSE 49/59, Spring 211 17 CSE 49/59, Spring 211 18 C 3

Datapath: Reg-Reg Instructions Datapath: Reg- Instructions x4 x4 <25:21> <2:16> <15:11> <25:21> <2:16> <15:> <5:> <31:26> Timing? rs rt rd func rd (rs) func (rt) 31 26 25 21 2 16 15 11 5 CSE 49/59, Spring 211 19 6 5 5 16 31 26 25 212 16 15 CSE 49/59, Spring 211 2 Conflicts in Merging Datapath Datapath for Instructions x4 <25:21> <2:16> <15:11> <15:> Introduce muxes x4 <25:21> <2:16> <15:11> <15:> <31:26> <5:> <31:26>, <5:> rs rt rd func rd (rs) func (rt) CSE 49/59, Spring 211 21 RegDst Sel BSrc rt / rd Reg / rs rt rd func rd (rs) func (rt) CSE 49/59, Spring 211 22 Datapath for Instructions Should program and data memory be separate? Harvard style: separate (Aiken and Mark 1 influence) - read-only program memory - read/write data memory - Note: Somehow there must be a way to load the program memory Princeton style: the same (von Neumann s influence) - single read/write memory for program and data Load/Store Instructions:Harvard Datapath x4 base disp MemWrite rdata Data wdata WBSrc / Mem - Note: A Load or Store ruction requires accessing the memory more than once during its execution CSE 49/59, Spring 211 23 RegDst Sel 6 5 5 16 essing mode opcode rs rt displacement (rs) + displacement 31 26 25 21 2 16 15 rs is the base register rt is the destination of a Load or the source for a Store CSE 49/59, Spring 211 24 BSrc C 4

Acknowledgements These slides heavily contain material developed and copyright by Krste Asanovic (MIT/UCB) David Patterson (UCB) And also by: Arvind (MIT) Joel Emer (Intel/MIT) James Hoe (CMU) John Kubiatowic (UCB) MIT material derived from course 6.823 UCB material derived from course CS252 CSE 49/59, Spring 211 25 C 5