EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

Similar documents
EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

EC-801 Advanced Computer Architecture

Lecture 4: Instruction Set Architecture

EITF20: Computer Architecture Part2.2.1: Pipeline-1

Instruction Set Principles and Examples. Appendix B

55:132/22C:160, HPCA Spring 2011

Instruction Set Architecture (ISA)

Page 1. Structure of von Nuemann machine. Instruction Set - the type of Instructions

Instruction Set Architectures

CS4617 Computer Architecture

Instruction Set Architecture. "Speaking with the computer"

Lecture 3 Machine Language. Instructions: Instruction Execution cycle. Speaking computer before voice recognition interfaces

Review of instruction set architectures

CSCE 5610: Computer Architecture

Reminder: tutorials start next week!

Instruction Set Design

CHAPTER 5 A Closer Look at Instruction Set Architectures

ISA and RISCV. CASS 2018 Lavanya Ramapantulu

From CISC to RISC. CISC Creates the Anti CISC Revolution. RISC "Philosophy" CISC Limitations

EC 413 Computer Organization

CPU Architecture and Instruction Sets Chapter 1

ECE 486/586. Computer Architecture. Lecture # 7

Lecture 4: Instruction Set Design/Pipelining

CHAPTER 5 A Closer Look at Instruction Set Architectures

Typical Processor Execution Cycle

Instruction Set Principles. (Appendix B)

Chapter 2. Instruction Set Design. Computer Architectures. software. hardware. Which is easier to change/design??? Tien-Fu Chen

EITF20: Computer Architecture Part2.2.1: Pipeline-1

CSCI 402: Computer Architectures. Instructions: Language of the Computer (1) Fengguang Song Department of Computer & Information Science IUPUI

Evolution of ISAs. Instruction set architectures have changed over computer generations with changes in the

Chapter 2: Instructions How we talk to the computer

Chapter 2. Instruction Set Principles and Examples. In-Cheol Park Dept. of EE, KAIST

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

Instruction Set Architecture

CHAPTER 5 A Closer Look at Instruction Set Architectures

Instruction Set II. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 7. Instruction Set. Quiz. What is an Instruction Set?

Chapter 2A Instructions: Language of the Computer

Math 230 Assembly Programming (AKA Computer Organization) Spring 2008

CENG3420 Lecture 03 Review

Graduate Computer Architecture. Chapter 2. Instruction Set Principles

ISA: The Hardware Software Interface

Instruction Set Architectures. Part 1

Lecture 4: RISC Computers

GRE Architecture Session

Computer Architecture

Computer Hardware Generations

CHAPTER 2: INSTRUCTION SET PRINCIPLES. Prepared by Mdm Rohaya binti Abu Hassan

CSEE 3827: Fundamentals of Computer Systems

Instruction Set Architectures. CS301 Prof. Szajda

CS/ECE/752 Chapter 2 Instruction Sets Instructor: Prof. Wood

Instruction Sets Ch 9-10

Instruction Sets Ch 9-10

ECE 486/586. Computer Architecture. Lecture # 8

Understand the factors involved in instruction set

Stored Program Concept. Instructions: Characteristics of Instruction Set. Architecture Specification. Example of multiple operands

Instruction Set. Instruction Sets Ch Instruction Representation. Machine Instruction. Instruction Set Design (5) Operation types

Chapter 1. Computer Abstractions and Technology. Lesson 3: Understanding Performance

Lecture Topics. Branch Condition Options. Branch Conditions ECE 486/586. Computer Architecture. Lecture # 8. Instruction Set Principles.

EITF20: Computer Architecture Part1.1.1: Introduction

COMP3221: Microprocessors and. and Embedded Systems. Instruction Set Architecture (ISA) What makes an ISA? #1: Memory Models. What makes an ISA?

Announcements HW1 is due on this Friday (Sept 12th) Appendix A is very helpful to HW1. Check out system calls

Chapter 5. A Closer Look at Instruction Set Architectures

Computer Architecture. MIPS Instruction Set Architecture

Lecture 5: Instruction Set Architectures II. Take QUIZ 2 before 11:59pm today over Chapter 1 Quiz 1: 100% - 29; 80% - 25; 60% - 17; 40% - 3

ECE232: Hardware Organization and Design

Instructions: Language of the Computer

Computer Organization CS 206 T Lec# 2: Instruction Sets

Review: What is a Computer?

Chapter 13 Reduced Instruction Set Computers

Math 230 Assembly Programming (AKA Computer Organization) Spring MIPS Intro

Computer Architecture

COSC 6385 Computer Architecture. Instruction Set Architectures

Multiple Issue ILP Processors. Summary of discussions

CSE 141 Computer Architecture Spring Lecture 3 Instruction Set Architecute. Course Schedule. Announcements

COMPUTER ORGANIZATION & ARCHITECTURE

ECE 154A Introduction to. Fall 2012

Computer Systems Architecture I. CSE 560M Lecture 3 Prof. Patrick Crowley

ENGN1640: Design of Computing Systems Topic 03: Instruction Set Architecture Design

CMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago

General Purpose Processors

Communicating with People (2.8)

1.3 Data processing; data storage; data movement; and control.

CSCI 402: Computer Architectures. Instructions: Language of the Computer (1) Fengguang Song Department of Computer & Information Science IUPUI

Instruction Set Architecture

Chapter 2. lw $s1,100($s2) $s1 = Memory[$s2+100] sw $s1,100($s2) Memory[$s2+100] = $s1

Virtual Machines and Dynamic Translation: Implementing ISAs in Software

EITF20: Computer Architecture Part2.2.1: Pipeline-1

17. Instruction Sets: Characteristics and Functions

CS222: Dr. A. Sahu. Indian Institute of Technology Guwahati

Topics Power tends to corrupt; absolute power corrupts absolutely. Computer Organization CS Data Representation

Instruction Set Architecture

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Chapter 5. A Closer Look at Instruction Set Architectures

ELEC / Computer Architecture and Design Fall 2013 Instruction Set Architecture (Chapter 2)

Chapter 5. A Closer Look at Instruction Set Architectures

ECE 4750 Computer Architecture, Fall 2014 T01 Single-Cycle Processors

Systems Architecture I

Lecture 4: MIPS Instruction Set

COS 140: Foundations of Computer Science

Chapter 2 Instruction: Language of the Computer

Transcription:

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1

Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2

Main Content Computer Architecture Performance Quantitative Principles 3

What Computer Architecture? Design ISA Orgnization (microarchitecture) Implementation To meet requirements of Functionality (application, standards ) Price Performance Power Reliability Compatability Dependability.. 4

Performance Time to complete a task (T exe ) Execution time, response time, latency Task per day, hour Total amount of tasks for given time Thoughput, bandwidth 5

Quantitative Principles This is intro to design and analysis Take advantage of parallelism ILP, DLP, TLP,... Principle of locality 90% of execution time in only 10% of the code Focus on the common case In makeing a design trade-off, favor the frquent case ove the infrequent case Amdahl s law The performance improvement gained from uisng faster mode is limited by the fraction of the time the faster mode can be used The processor performance equation 6

Amdahl s Law Best you could ever hope to do: Speedup maximum 1 1 - Fraction enhanced 7

Aspect of CPU performance 8

Instructions are not created equally 9

Average CPI: example Invest resources where time is spent! 10

Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 12

Instruction Set Serves as an interface between software and hardware Provides a mechanism by which the software tells the hardware what should be done High level language code : C, C++, Java, Fortran, compiler Assembly language code: architecture specific statements assembler Machine language code: architecture specific bit patterns software instruction set hardware 13

Interface Design A good interface Lasts through many implementations (portability, compatibility) Can be used in many different ways (generality) Provides sufficient functionality to higher levels Permits an efficient implementation at lower levels use use use Interface imp 1 imp 2 imp 3 14

ISA Classification What s needed in an instruction set? Addressing Operands Operations Control Flow Classification of instruction sets Register model The number of operands for instructions Addressing modes The operations provided in the instruction set Type and size of operands Control flow instructions Encoding 15

ISA Design Issues Where are operands stored? registers, memory, stack, accumulator How many explicit operands are there? 0, 1, 2, or 3 How is the operand location specified? register, immediate, indirect,... What type & size of operands are supported? byte, int, float, double, string, vector... What operations are supported? add, sub, mul, move, compare... How is the operation flow controlled? branches, jumps, procedure calls... What is the encoding format fixed, variable, hybrid... 16

ISA Classes: Where are operands stored 17

ISA Classes 18

Example: C=A+B Stack Accumulator Register (register-memory) Push A Load A Load R1, A Push B Add B Add R1, B Add Store C Store C, R1 Pop C Register (loadstore) Load R1,A Load R2, B Add R3, R1, R2 Store C, R3 memory memory acc = acc + mem[b] R1 = R1 + mem[b] R3 = R1 + R2 19

Evolution of ISA Single Accumulator (EDSAC 1950, Maurice Wilkes) Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953) Separation of Programming Model from Implementation Stack Architecture Concept of a Family (RR,RM,MM) (Burroughs Corporation B5000 1963) (IBM 360 1964) General Purpose Register Machines Complex Instruction Sets Load/Store Architecture (Vax, Intel 432 1977-80) (CDC 6600, Cray 1 1963-76) CISC Intel x86, Pentium RISC (MIPS,SUN-Sparc,HP-PA, IBM RS6000,PowerPC...1987) 21

GPR (General Purpose Register) Registers are much faster than memory (even cache) Register values are available immediately When memory isn t ready, processor must wait ( stall ) Registers are convenient for variable storage Compiler assigns some variables (especially frequently used ones) just to registers More compact code since small fields specify registers (compared to memory addresses) Disadvantages Higher instruction count Dependent on good compiler (Reg. assignment) Higher hardware cost (comparing to MEM) 22

word line Memory Architecture storage (RAM) cell 23

Register File Example: 4-word register file with 1 write port and two read ports Register array: 4*16bit registers Each register has an enable signal Write decoding circuit: 0000 if wr_en is 0 1 bit asserted according to w_addr if wr_en is 1 Read circuit: A mux for each read port 24

Reg v.s. Mem (65nm CMOS) Register Bank Memory Size 256*4Byte 1K*4Byte Area 0.14mm 2 0.04mm 2 Density 7KB/mm 2 100KB/mm 2 25

Example: RISC-CICS MULT 2:3, 5:2 LOAD A, 2:3 LOAD B, 5:2 PROD A, B STORE 2:3, A CISC Emphasis on hardware Includes multi-clock complex instructions Memory-to-memory: "LOAD" and "STORE" incorporated in instructions Small code sizes, high cycles per second Transistors used for storing complex instructions RISC Emphasis on software Single -clock, reduced instruction only Register to register: "LOAD" and "STORE" are independent instructions Low cycles per second, large code sizes Spends more transistors on memory registers 26

Example: RISC-CICS RISC (Reduced Instruction Set Computing) Simple instructions MIPS, ARM,... Easier to design, build Less power Larger code size (IC), but in total (byte)? Easier for compiler, but for optimization? CISC (Complex Instruction Set Computing) Complex instructions VAX, Intel 80x86 (now RISC-like internally),... http://cs.stanford.edu/people/eroberts/courses/soco/proj ects/risc/risccisc/ 27

Example: RISC-CICS 28

Stack v.s. GPR Advantages Very compact object code Push A Load R1,A Push B Load R2, B Add Add R3, R1, R2 Simple compilers (no reg. assignment) Minimal processor state (simple hardware) Disadvantages More memory references (if stack is implemented with MEM) X+1 load X, push to memory load 1, push to memory pop 2 values from memory, add, and push result to memory Factoring out common subexpressions has high cost Can add some registers (hybrid) 29

Memory Addressing A 32 bit (4Byte) integer variable (0x01234567) stored at address 0x100 Big Endian Least significant byte has highest address 0x100 0x101 0x102 0x103 01 23 45 67 Little Endian Least significant byte has lowest address 0x100 0x101 0x102 0x103 67 45 23 01 Important for exchange of data, (and strings) 30

Memory Addressing Memory is generally byte addressed and provides access for bytes (8 bits), half words (16 bits), words (32 bits), and double words(64 bits) Access to data-objects > 1 byte? An architecture may require that data is aligned: Adreess index is multiple of date type size (depending on memory implementation) byte always aligned half word (16 bits) aligned at byte offsets 0,2,4,6,... word (32 bits) aligned at byte offsets 0,4,8,12,... double word (64 bits) aligned at byte offsets 0,8,16,24,... 31

Memory Alignment 32

Memory Alignment 33

Memory Addressing Mode Addressing Mode Example Action 1. Register direct Add R4, R3 R4 <- R4 + R3 2. Immediate Add R4, #3 R4 <- R4 + 3 3. Displacement Add R4, 100(R1) R4 <- R4 + M[100 + R1] 4. Register indirect Add R4, (R1) R4 <- R4 + M[R1] 5. Indexed Add R4, (R1 + R2) R4 <- R4 + M[R1 + R2] 6. Direct Add R4, (1000) R4 <- R4 + M[1000] 7. Memory Indirect Add R4, @(R3) R4 <- R4 + M[M[R3]] 8. Auto-increment Add R4, (R2)+ R4 <- R4 + M[R2] R2 <- R2 + d 9. Auto-decrement Add R4, (R2)- R4 <- R4 + M[R2] R2 <- R2 - d 10. Scaled Add R4, 100(R2)[R3] R4 <- R4 + M[100 + R2 + R3*d] 34

Memory addressing mode (on VAX) Are all these addressing modes needed? 35

Memory addressing mode How many bits are needed for address displacement? 36

Memory addressing mode How important are immediats? 37

Memory addressing mode How many bits are needed for immediate? 38

ISA Classification What s needed in an instruction set? Addressing Operands Operations Control Flow 39

What does it mean? 40

What does it mean? 41

Types and sizes of operands integer floating point (single precision) character packed decimal... etc... 42

Floating point 43

Floating point v.s. fixed point 44

Different Data representation 6/8 = 0.75 Binary: 0.11 Stochastic : 10111011, p=p(x=1) 3/10 = 0.3 Binary:? Stochastic : 0010010001 46

Advantagies: Error tolerent Binary: 3/8: 0.011 7/8: 0.111 Stochastic: 3/8: 00011010 4/8: 10011010 47

Advantagies: Simple arithemetic Z = X 1 X 2 3/8 = 4/8 6/8 X 1 X 2 Z 4/8 6/8 3/8 48

ISA Classification What s needed in an instruction set? Addressing Operands Operations Control Flow 49

Types of operations Arithmetic and Logic: Data Transfer: Control System Floating Point Decimal String Graphics AND, ADD MOVE, LOAD, STORE BRANCH, JUMP, CALL OS CALL, VM ADDF, MULF, DIVF ADDD, CONVERT MOVE, COMPARE, SEARCH (DE)COMPRESS 50

Types of operations (frequency) 9 Rank Instruction Frequency 1 load 22% 2 branch 20% 3 compare 16% 4 store 12% 5 add 8% 6 and 6% 7 sub 5% 8 register move 4% 9 call 1% 10 return 1% Total 96% 80x86 Instruction Frequency 51

ISA Classification What s needed in an instruction set? Addressing Operands Operations Control Flow 52

Types of control instructions Conditional branches Unconditional branches (jumps) Procedure call/returns 53

Types of control instructions How large need the branch displacement be? 54

Instruction format Variable instruction format Compact code but the instruction decoding is more complex and thus slower Examples: VAX, Intel 80x86 (1-17 byte) Fixed instruction format Easy and fast to decode but gives large code size Examples: Alpha, ARM, MIPS (4byte), PowerPC, SPARC 55

Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 56

ISA and compiler Instruction set architecture is a compiler target By far most instructions executed are generated by a compiler (exception certain special purpose processors) Interaction compiler - ISA critical for overall performance Application Compiler Instr. Set Proc. Operating System I/O system Instruction Set Architecture Architecture Design Circuit Design 57

ISA and compiler: a good or bad? example 58

ISA and compiler: a good or bad? example 59

The role of compilers High Level Language Program Compiler Assembly Language Program Assembler Machine Language Program Machine Interpretation temp = v[k]; v[k] = v[k+1]; v[k+1] = temp; lw $15, 0($2) lw $16, 4($2) sw$16, 0($2) sw$15, 4($2) 0000 1001 1100 0110 1010 1111 0101 1000 1010 1111 0101 1000 0000 1001 1100 0110 1100 0110 1010 1111 0101 1000 0000 1001 0101 1000 0000 1001 1100 0110 1010 1111 Control Signal Specification 60

The structure of a compiler Any compiler must perform two major tasks Analysis of the source program Synthesis of a machine-language program Compiler Analysis Synthesis 61

The structure of a compiler 63

GCC optimization options O0 No optimizations performed O1 Local optimizations such as common subexpression elimination, copy propagation, deadcode elimination etc O2 Global optimization, aggressive instruction scheduling, global register allocation O3 Inlining of procedures 64

Example of compiler optimization Code improvements made by the compiler are called optimizations and can be classified: High-order transformations: procedure inlining Optimizations: dead code elimination Constant propagation Common sub-expression elimination Loop-unrolling Register allocation (almost most important) Machine-dependent optimizations: takes advantage of specific architectural features 66

Example of compiler optimization Procedure inlining Constant propagation Dead code elimination Common expression elimination 67

Example of compiler optimization Code improvements made by the compiler are called optimizations and can be classified: High-order transformations: procedure inlining Optimizations: dead code elimination Constant propagation Common sub-expression elimination Loop-unrolling Register allocation (almost most important) Machine-dependent optimizations: takes advantage of specific architectural features Almost all of these optimizations are easier to do if there are many general registers available! E.g., common sub/expression elimination stores temporary value into a register Loop-unrolling Procedure inlining 68

The impact of compiler optimization 69

How can you aid compiler Rules of thumb when designing an instruction set (for general purpose processor): Regularity operations, data types and addressing modes should be orthogonal Generality Provide primitives, not high-level constructs or solutions Complex instructions are often too specialized Simplify trade-offs among alternatives 70

Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 71

The MIPS64 architecture An architecture representative of modern ISA: 64 bit load/store GPR architecture 32 general integer registers (R0 = 0) and 32 floating point registers Supported data types: bytes, half word (16 bits), word (32 bits), double word (64 bits), single and double precision IEEE floating points Memory byte addressable with 64 bit addresses Addressing modes: immediate and displacement 72

MIPS instructions MIPS instructions fall into 5 classes: Arithmetic/logical/shift/comparison Control instructions (branch and jump) Load/store Other (exception, register movement to/from GP registers, etc.) 73

MIPS instruction example 74

MIPS instruction format 75

Summary The instruction set architecture have importance for the performance The important aspects of an ISA are: register model addressing modes types of operations data types encoding Benchmark measurements can reveal the most common case Interaction compiler - ISA important 76