CMSC 22200 Computer Architecture
Lecture 2: ISA
  Prof. Yanjing Li
  Department of Computer Science
  University of Chicago
Administrative Stuff
  Lab1 out tonight
    Due Thursday (10/18)
  Lab1 review session
    Tomorrow, 10/05, 4:30-5:30pm
    Location: JCL 346
Lecture Outline
  Introduction to ISA
  Case Study: ARMv8 / LEGv8
Review: Basic Concepts
  Basic concepts
    What is a computer?
    What is the von Neumann model?
    What is ISA?
    What is uarch (microarchitecture)?
  Design point
ISA
  Instructions
    Opcodes, Addressing Modes, Data Types
    Instruction Types and Formats
    Registers, Condition Codes
  Memory organization
    Address space, Addressability, Alignment
    Virtual memory management
  Call, Interrupt/Exception Handling
  Access Control, Priority/Privilege
  I/O: memory-mapped vs. instr.
  Task/thread Management
  Power and Thermal Management
  Multi-threading support, Multiprocessor support
ISA Element: Instruction
  Or machine code, consists of
    opcode: what the instruction does (add, sub, ...)
    operands: what it operates on (register, memory, immediate)
  Example (figure not recovered)
Instruction Classes
  Operate instructions
    Process data: arithmetic and logical operations
    Fetch operands, compute result, store result
    Implicit sequential control flow (e.g., PC <= PC + 4)
  Data movement instructions
    Move data between memory, registers, I/O devices
    Implicit sequential control flow
  Control flow instructions
    Change the sequence of instructions that are executed
Load/Store vs. Memory/Memory Architectures
  Load/store architecture: operate instructions operate only on registers
    E.g., MIPS, ARM and many RISC ISAs
  Memory/memory architecture: operate instructions can operate on memory locations
    E.g., x86
Data Types
  Representation of information for which there are instructions that operate on the representation
  ARMv8
    Integer (byte, half word, word, doubleword, quad word)
    Floating point (half-, single-, double-precision)
    Fixed point
    Vector formats
  Others
    E.g., strings in x86
Instruction Process Style
  Specifies the number of operands an instruction operates on and how it does so
  0, 1, 2, 3 address machines
    0-address: stack machine (op, push, pop)
    1-address: accumulator machine (e.g., add mem)
    2-address: 2-operand machine (op D, S; one is both source and dest)
    3-address: 3-operand machine (op D, S1, S2; source and dest separate)
  E.g., ARMv8 represents a 3-address machine
Instruction Addressing Modes
  Specifies how to obtain an operand of an instruction
    Register
    Immediate
    Memory (displacement, register indirect, indexed, absolute, memory indirect, autoincrement, autodecrement, ...)
Instruction Addressing Modes for Memory
  Specify how to obtain memory operands
  Absolute: LW Rt, 10000
    use immediate value as address
  Register Indirect: LW Rt, (r_base)
    use GPR[r_base] as address
  Displaced or based: LW Rt, offset(r_base)
    use offset+GPR[r_base] as address
  Indexed: LW Rt, (r_base, r_index)
    use GPR[r_base]+GPR[r_index] as address
  Memory Indirect: LW Rt, ((r_base))
    use value at M[GPR[r_base]] as address
  Auto inc/decrement: LW Rt, (r_base)
    use GPR[r_base] as address, but inc. or dec. GPR[r_base] each time
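The effective-address rules above can be sketched in C. This is only an illustration on a toy machine: the `Machine` struct, the word-addressed `mem` array, its size, and all function names are assumptions made for the example and are not part of any real ISA or of the lab code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine: 32 registers and a tiny word-addressed memory. */
enum { NUM_REGS = 32, MEM_WORDS = 64 };

typedef struct {
    uint64_t gpr[NUM_REGS];
    uint64_t mem[MEM_WORDS];   /* mem[a] holds the word stored at address a */
} Machine;

/* Absolute: the immediate itself is the address. */
uint64_t ea_absolute(uint64_t imm) { return imm; }

/* Register indirect: the base register holds the address. */
uint64_t ea_register_indirect(const Machine *m, int base) { return m->gpr[base]; }

/* Displaced (based): signed offset plus base register. */
uint64_t ea_displaced(const Machine *m, int base, int64_t offset) {
    return m->gpr[base] + (uint64_t)offset;
}

/* Indexed: base register plus index register. */
uint64_t ea_indexed(const Machine *m, int base, int index) {
    return m->gpr[base] + m->gpr[index];
}

/* Memory indirect: the word at M[GPR[base]] is the address. */
uint64_t ea_memory_indirect(const Machine *m, int base) {
    assert(m->gpr[base] < MEM_WORDS);   /* stay inside the toy memory */
    return m->mem[m->gpr[base]];
}

/* Autoincrement: use GPR[base] as the address, then bump the register. */
uint64_t ea_autoincrement(Machine *m, int base, uint64_t step) {
    uint64_t addr = m->gpr[base];
    m->gpr[base] += step;
    return addr;
}
```

Only the autoincrement mode writes back to the register file, which is why it alone takes a non-const `Machine *`.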
Instruction Length
  Fixed length: Length of all instructions the same
    + Easier to decode single instruction in hardware
    + Easier to decode multiple instructions concurrently (superscalar)
    -- Wasted bits in instructions (Why is this bad?)
    -- Harder-to-extend ISA (how to add new instructions?)
  Variable length: Length of instructions different
    + Compact encoding (Why is this good?)
    + Extensibility
    -- More logic to decode a single instruction
    -- Harder to decode multiple instructions concurrently
  Tradeoffs
    Code size (memory space, bandwidth, latency) vs. hardware complexity
    ISA extensibility and expressiveness vs. hardware complexity
    Performance/energy efficiency? Smaller code vs. ease of decode
Uniform/Non-uniform Decode of Inst
  Uniform decode: Same bits in each instruction correspond to the same meaning
    Opcode is always in the same location
    Ditto operand specifiers, immediate values, ...
    + Easier decode, simpler hardware
    + Enables parallelism: generate target address before knowing the instruction is a branch
    -- Restricts instruction format (fewer instructions?) or wastes space
  Non-uniform decode
    E.g., opcode can be the 1st-7th byte in x86
    + More compact and powerful instruction format
    -- More complex decode logic
  Uniform decode usually means fixed length as well
x86 vs. MIPS Instruction Formats
  x86: (figure not recovered)
  MIPS:
    R-type: opcode (6-bit) | rs (5-bit) | rt (5-bit) | rd (5-bit) | shamt (5-bit) | funct (6-bit)
    I-type: opcode (6-bit) | rs (5-bit) | rt (5-bit) | immediate (16-bit)
    J-type: opcode (6-bit) | immediate (26-bit)
ISA Element: Registers
  Fast storage
    How many?
    Size of each register?
    General purpose vs. special purpose?
  Why is having registers a good idea?
    Because programs exhibit a characteristic called data locality
    A recently produced/accessed value is likely to be used more than once (temporal locality)
    Storing that value in a register eliminates the need to go to memory each time that value is needed
  Compiler: Register optimization is important!
ISA Element: Memory Organization
  Address space: How many uniquely identifiable locations in memory
  Addressability: How much data does each uniquely identifiable location store
    Byte addressable: most ISAs
  Aligned/unaligned access
    MSB                                  LSB
    byte-3 | byte-2 | byte-1 | byte-0
    byte-7 | byte-6 | byte-5 | byte-4
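As a small illustration of aligned versus unaligned access on a byte-addressable machine, an access of `size` bytes is usually called aligned when its address is a multiple of `size`. The helper below is a sketch under that common definition; the function name `is_aligned` is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a `size`-byte access at `addr` is naturally aligned.
 * `size` must be a power of two (1, 2, 4, 8, ...). */
bool is_aligned(uint64_t addr, uint64_t size) {
    assert(size != 0 && (size & (size - 1)) == 0);  /* power-of-two check */
    return (addr & (size - 1)) == 0;                /* low bits must be zero */
}
```

For instance, address 0x1004 is aligned for a 4-byte word but not for an 8-byte doubleword.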
ISA Element: I/O
  How to interface with I/O devices
  Memory mapped I/O
    A region of memory is mapped to I/O devices
    I/O operations are loads and stores to those locations
  Special I/O instructions
    IN and OUT instructions in x86 deal with ports of the chip
  Tradeoffs?
    Which one is more general purpose?
Other ISA Elements
  Privilege modes
    User vs supervisor
    Who can execute what instructions?
    Who can access which registers?
  Exception and interrupt handling
    What procedure is followed when something goes wrong with an instruction?
    What procedure is followed when an external device requests the processor?
  Virtual memory
    Each program has the illusion of the entire memory space, which is greater than physical memory
  Access protection
Many Different ISAs
  x86
  ARM
  MIPS
  SPARC
  IBM 360
  What/why are the fundamental differences?
CISC vs. RISC
  CISC, Complex instruction set computer -> complex instructions
    Initially motivated by not-good-enough code generation
    Memory size/bandwidth considerations
  RISC, Reduced instruction set computer -> simple instructions
    Goal: enable better compiler control and optimization
    Motivated by
      Simplifying the hardware -> lower cost, higher frequency
      Enabling the compiler to optimize the code better
  Simple compiler, complex hardware vs. complex compiler, simple hardware
CISC vs. RISC
  Usually,
  RISC
    Simple instructions
    Fixed length
    Uniform decode
    Few addressing modes
  CISC
    Complex instructions
    Variable length
    Non-uniform decode
    Many addressing modes
ARMv8/LEGv8 Case Study
The ARMv8 ISA
  Commercialized by ARM Holdings (www.arm.com)
  Large share of embedded core market
    Applications in mobile, consumer electronics, network/storage equipment, cameras, printers, ...
  Typical of many modern ISAs
  Reference (5740 pages)
    https://developer.arm.com/docs/ddi0487/a/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
  **Based on original figure from [P&H CO&D, COPYRIGHT 2016 Elsevier. ALL RIGHTS RESERVED.]
ARMv8 Overview
  RISC, Load/store architecture, both 32- and 64-bit
  3-address machine
  32-bit instructions
  Simple datatypes
    int, floating point, fixed point, vector
  Addressing modes: reg, imm, simple mem addressing
    mem addr. calculated from reg and instruction contents only
  32 GPRs, PC, SP, ELR, 32 SIMD/FP registers
  Byte addressable
  You will implement ARMv8 in C (Lab1)
LEGv8
  A subset of ARMv8
    With some differences
  Reference
    Green card from textbook
    Also available online
      http://booksite.elsevier.com/9780128017333/arm_ref.php
Instruction Formats
R-format Instructions
  opcode  | Rm     | shamt  | Rn     | Rd
  11 bits | 5 bits | 6 bits | 5 bits | 5 bits
  Instruction fields
    opcode: operation code
    Rm: the second register source operand
    shamt: shift amount (only used for shift operations)
    Rn: the first register source operand
    Rd: the register destination
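The field layout above maps directly onto shifts and masks. The following is a minimal decoding sketch, assuming the fields are packed from bit 31 down to bit 0 in the order shown; the `RFormat` struct and the `bits` and `decode_r` helpers are names invented for this example, not part of the lab code.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded R-format fields (hypothetical struct for illustration). */
typedef struct {
    uint32_t opcode;  /* bits 31..21, 11 bits */
    uint32_t rm;      /* bits 20..16,  5 bits */
    uint32_t shamt;   /* bits 15..10,  6 bits */
    uint32_t rn;      /* bits  9..5,   5 bits */
    uint32_t rd;      /* bits  4..0,   5 bits */
} RFormat;

/* Extract bits hi..lo (inclusive) of x. */
static uint32_t bits(uint32_t x, unsigned hi, unsigned lo) {
    assert(lo <= hi && hi < 32);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (x >> lo) & mask;
}

/* Split a 32-bit instruction word into its R-format fields. */
RFormat decode_r(uint32_t inst) {
    RFormat f;
    f.opcode = bits(inst, 31, 21);
    f.rm     = bits(inst, 20, 16);
    f.shamt  = bits(inst, 15, 10);
    f.rn     = bits(inst, 9, 5);
    f.rd     = bits(inst, 4, 0);
    return f;
}
```

Decoding the word 0x8B150289 yields opcode 0x458, Rm 21, shamt 0, Rn 20, and Rd 9.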
R-format Example
  opcode  | Rm     | shamt  | Rn     | Rd
  11 bits | 5 bits | 6 bits | 5 bits | 5 bits
  ADD X9,X20,X21 // add the values in X20 and X21, and put the result in X9, or GPR[x9] = GPR[x20]+GPR[x21]
  10001011000 | 10101 | 000000 | 10100 | 01001 (binary)
  1000 1011 0001 0101 0000 0010 1000 1001 (binary) = 8B150289 (hex)
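The encoding of ADD X9,X20,X21 can be reproduced by packing the five fields with shifts and ORs. This is a sketch; `encode_r` is a hypothetical helper name, and the opcode value 0x458 is simply the slide's 10001011000 written in hex.

```c
#include <assert.h>
#include <stdint.h>

/* Pack R-format fields into a 32-bit instruction word. */
uint32_t encode_r(uint32_t opcode, uint32_t rm, uint32_t shamt,
                  uint32_t rn, uint32_t rd) {
    /* Each field must fit in its bit width. */
    assert(opcode < (1u << 11) && rm < 32 && shamt < 64 && rn < 32 && rd < 32);
    return (opcode << 21) | (rm << 16) | (shamt << 10) | (rn << 5) | rd;
}
```

Calling `encode_r(0x458, 21, 0, 20, 9)` returns 0x8B150289, matching the hand-assembled word.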
shamt in R-format instructions
  opcode  | Rm     | shamt  | Rn     | Rd
  11 bits | 5 bits | 6 bits | 5 bits | 5 bits
  shamt: how many positions to shift
  Shift left logical (LSL)
    R[Rd] <- R[Rn] << shamt // Shift left and fill with 0 bits
    LSL by i bits: multiplies by 2^i
  Shift right logical (LSR)
    R[Rd] <- R[Rn] >> shamt // Shift right and fill with 0 bits
    LSR by i bits: divides by 2^i (unsigned only)
  Note: the shifted-register versions of R instructions in ARMv8 support shift operations on the second operand before applying the operation specified in opcode
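A short C sketch of the two shifts, using 64-bit unsigned values to stand in for X registers. The function names `lsl` and `lsr` are chosen for the example. It also shows why the divide-by-2^i reading of LSR holds only for unsigned values.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift left: vacated low bits are filled with 0. */
uint64_t lsl(uint64_t value, unsigned shamt) {
    assert(shamt < 64);   /* shamt is a 6-bit field */
    return value << shamt;
}

/* Logical shift right: vacated high bits are filled with 0. */
uint64_t lsr(uint64_t value, unsigned shamt) {
    assert(shamt < 64);
    return value >> shamt;
}
```

`lsl(3, 4)` gives 48 (3 times 2^4) and `lsr(48, 4)` gives 3. Applying `lsr` to the bit pattern of -8 with a shift of 1 gives 0x7FFFFFFFFFFFFFFC rather than -4, because the sign bit is replaced by 0.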
C to Assembly 101
  C code:
    f = (g + h) - (i + j);
    f, ..., j in X19, X20, ..., X23
  Compiled into assembly:
    ADD X9, X20, X21
    ADD X10, X22, X23
    SUB X19, X9, X10
I-format Instructions
  opcode  | immediate | Rn     | Rd
  10 bits | 12 bits   | 5 bits | 5 bits
  Immediate instructions
    Rn: source register
    Rd: destination register
    Immediate field: constant data; zero-extended
  Example: ADDI X22, X22, #4
    What does the machine code look like for ADDI?
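One way to work out the machine code for ADDI X22, X22, #4 is to pack the I-format fields in C. The slide does not give the ADDI opcode; the value 1001000100 (0x244) used below is taken from the LEGv8 reference card and should be treated as an assumption here, as should the helper name `encode_i`.

```c
#include <assert.h>
#include <stdint.h>

/* Pack I-format fields: opcode (10) | immediate (12) | Rn (5) | Rd (5). */
uint32_t encode_i(uint32_t opcode, uint32_t imm12, uint32_t rn, uint32_t rd) {
    assert(opcode < (1u << 10) && imm12 < (1u << 12) && rn < 32 && rd < 32);
    return (opcode << 22) | (imm12 << 10) | (rn << 5) | rd;
}
```

With the assumed opcode, `encode_i(0x244, 4, 22, 22)` returns 0x910012D6, that is 1001000100 000000000100 10110 10110 in binary.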
D-format Instructions
  Load/store instructions
  opcode  | addoffset | op2    | Rn     | Rt
  11 bits | 9 bits    | 2 bits | 5 bits | 5 bits
    Rn: base register
    addoffset: constant offset from contents of base register (+/- 32 doublewords)
    op2: expands the opcode field
    Rt: destination (load) or source (store) register number
  Example: LDUR X9,[X22,#64]
    LDUR opcode: 11111000010 (binary); op2: 0
    X9 (Rt field), X22 (Rn field)
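The LDUR example can be assembled the same way as the other formats. The sketch below assumes the 9-bit offset is a signed byte offset stored in two's complement (consistent with the +/- 32 doubleword range stated on the slide); `encode_d` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Pack D-format fields: opcode (11) | offset (9, signed) | op2 (2) | Rn (5) | Rt (5). */
uint32_t encode_d(uint32_t opcode, int32_t offset, uint32_t op2,
                  uint32_t rn, uint32_t rt) {
    assert(opcode < (1u << 11) && op2 < 4 && rn < 32 && rt < 32);
    assert(offset >= -256 && offset <= 255);       /* 9-bit signed range */
    uint32_t field = (uint32_t)offset & 0x1FFu;    /* keep low 9 bits */
    return (opcode << 21) | (field << 12) | (op2 << 10) | (rn << 5) | rt;
}
```

`encode_d(0x7C2, 64, 0, 22, 9)` returns 0xF84402C9 for LDUR X9,[X22,#64], where 0x7C2 is the opcode 11111000010 in hex.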
C to Assembly 201
  C code:
    A[12] = h + A[8];
    h in X21, base address of A in X22, elements in A are 8 bytes
    X9 is used as a temp register
  Compiled code:
    LDUR X9,[X22,#64]
    ADD X9,X21,X9
    STUR X9,[X22,#96]
  Note: Index 8 requires offset of 64 (byte-addressed memory)
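To see why the offsets are 64 and 96, the three instructions can be simulated on a small byte-addressed memory: element index times 8 bytes gives the byte offset. This is a toy sketch; the memory size, the helper names `ldur`, `stur`, and `run_example`, and the use of host byte order through `memcpy` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_BYTES = 256 };   /* hypothetical toy memory size */

/* LDUR: load 8 bytes from base + offset (host byte order). */
uint64_t ldur(const uint8_t *mem, uint64_t base, int64_t offset) {
    uint64_t addr = base + (uint64_t)offset;
    uint64_t value;
    assert(addr + 8 <= MEM_BYTES);
    memcpy(&value, mem + addr, sizeof value);
    return value;
}

/* STUR: store 8 bytes to base + offset (host byte order). */
void stur(uint8_t *mem, uint64_t base, int64_t offset, uint64_t value) {
    uint64_t addr = base + (uint64_t)offset;
    assert(addr + 8 <= MEM_BYTES);
    memcpy(mem + addr, &value, sizeof value);
}

/* A[12] = h + A[8], with h in x21 and the base address of A in x22. */
void run_example(uint8_t *mem, uint64_t x22, uint64_t x21) {
    uint64_t x9 = ldur(mem, x22, 8 * 8);   /* LDUR X9,[X22,#64] */
    x9 = x21 + x9;                         /* ADD  X9,X21,X9    */
    stur(mem, x22, 12 * 8, x9);            /* STUR X9,[X22,#96] */
}
```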
B Format Instructions
  opcode | BR_address
  6 bits | 26 bits
  Example:
    B L1
    branch unconditionally to instruction labeled L1; B opcode: 000101 (binary) (ARMv8)
  Effect: PC = PC + BranchAddr
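The slide leaves BranchAddr undefined. The sketch below assumes the usual LEGv8/ARMv8 interpretation: the 26-bit BR_address field is sign-extended and multiplied by 4, since instructions are 4 bytes long. That interpretation and the function name `branch_target` are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the target of a B instruction located at address pc. */
uint64_t branch_target(uint64_t pc, uint32_t inst) {
    assert((inst >> 26) == 0x05);                /* B opcode 000101 */
    uint32_t field = inst & 0x03FFFFFFu;         /* BR_address, bits 25..0 */
    int64_t words = (int64_t)field;
    if (field & 0x02000000u) {
        words -= (int64_t)1 << 26;               /* sign-extend from 26 bits */
    }
    return pc + (uint64_t)(words * 4);           /* word offset -> byte offset */
}
```

Under these assumptions, the word 0x14000003 at PC 0x1000 branches forward three instructions to 0x100C, and 0x17FFFFFF branches back one instruction to 0xFFC.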
CB Format Instructions
  Branch to a labeled instruction if a condition is true
    Otherwise, continue sequentially
  opcode | COND_BR_address | Rt
  8 bits | 19 bits         | 5 bits
  Examples
    CBZ register, L1
      if (register == 0) branch to instruction labeled L1;
    CBNZ register, L1
      if (register != 0) branch to instruction labeled L1;
  Effect: if taken, PC = PC + CondBranchAddr; else PC = PC + 4
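The next-PC rule for CBZ and CBNZ can be sketched as follows. Several details are assumptions not stated on the slide: the opcodes 10110100 (CBZ) and 10110101 (CBNZ) come from the LEGv8 reference card, CondBranchAddr is taken to be the 19-bit field sign-extended and multiplied by 4, and register 31 is treated as an ordinary register for simplicity. The name `cb_next_pc` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the next PC after a CBZ/CBNZ instruction located at address pc. */
uint64_t cb_next_pc(uint64_t pc, uint32_t inst, const uint64_t *gpr) {
    uint32_t opcode = inst >> 24;                 /* bits 31..24 */
    assert(opcode == 0xB4 || opcode == 0xB5);     /* assumed CBZ / CBNZ opcodes */
    uint32_t rt = inst & 0x1Fu;                   /* bits 4..0 */
    uint32_t field = (inst >> 5) & 0x7FFFFu;      /* COND_BR_address, bits 23..5 */
    int64_t words = (int64_t)field;
    if (field & 0x40000u) {
        words -= (int64_t)1 << 19;                /* sign-extend from 19 bits */
    }
    int is_zero = (gpr[rt] == 0);
    int taken = (opcode == 0xB4) ? is_zero : !is_zero;
    return taken ? pc + (uint64_t)(words * 4) : pc + 4;
}
```

With X9 equal to 0, the word 0xB4000089 (CBZ X9, four instructions ahead) at PC 0x2000 yields 0x2010; with X9 nonzero the same instruction falls through to 0x2004.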