AIMS Embedded Systems Programming MT 2017 Micro Architectures


Micro Architectures
Daniel Kroening
University of Oxford, Computer Science Department
Version 1.0, 2014

Outline
X86/Y86
ARM
Pipelining
Memory

D. Kroening: AIMS Embedded Systems Programming MT 2017

High-Level View of Microarchitectures
[Figure: a CPU with registers (IP, eax, ebx, ecx, ..., ZF, ...), functional units (FUs, including float and memory units) and L1/L2 caches, connected through data, address and control lines to memory modules and I/O (USB, ...)]

CPUs
Process a sequential assembler program.
Data is held in registers.
The program controls which data is given to which FU, and where the result is stored.
The program controls the transfer of data between registers and memory.
Caches speed up access to frequently used memory cells.

Instruction Set Architectures
These summarise the behaviour of a CPU from the point of view of the programmer.
We will study two ISAs:
1. CISC: specifically the Y86 (an academic variant of Intel's x86)
2. RISC: specifically the ARM 32 architecture
An ISA describes what the CPU does, and ideally as little as possible about how the CPU does it.
One of the goals of this course is to understand the difference.

Visible Registers
RAM: contains data and the program
Data registers:
  Index  Name
  0      eax
  1      ecx
  2      edx
  3      ebx
  4      esp
  5      ebp
  6      esi
  7      edi
Instruction Pointer (IP): points to the address of the current instruction
Flag registers (ZF, ...): store flags for branches

Y86 Assembler
Subset of Intel's x86 assembler.
You can run a Y86 program on your x86 machine!
The reverse does not work in general, as too many instructions are missing (you are welcome to mend this).

Y86 Instructions
add/sub: addition/subtraction of the values in two registers; ZF is set appropriately
RRmov: copies the value of one register into another
RMmov: copies the value of a register into RAM
MRmov: copies a value from RAM into a register
jnz: jumps to a relative address if ZF = 0

Y86 Loads and Stores
Loads and stores have a displacement: ea = esi + Displacement
The displacement is included in the instruction word as an immediate constant.
The register esi is used as offset.

Y86 Instruction Formats
  Mnemonic  Semantics                    Encoding
  add       RD ← RD + RS                 01 RS RD
  sub       RD ← RD − RS                 29 RS RD
  jnz       if (¬ZF) IP ← IP + Distance  75 Distance
  RRmov     RD ← RS                      89 RS RD
  RMmov     MEM[ea] ← RS                 89 RS Displacement
  MRmov     RS ← MEM[ea]                 8b RS Displacement
  hlt                                    f4

Example
  add eax, edx
Intel convention: the target register is always on the left-hand side.
The target register is a source register, too!
Semantics: eax ← eax + edx
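The instruction formats above can be checked with a few lines of Python. This is only a sketch: it assumes that the Y86 subset reuses the standard x86 opcode bytes (01 for add, 29 for sub, 89 and 8b for the moves) and the x86 register numbering, and the helper names `encode_reg_reg` and `encode_load` are made up for this illustration.

```python
# Register indices, assumed to follow the x86 numbering (eax = 0 ... edi = 7).
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# Opcode bytes, assumed to be the standard x86 ones for this subset.
OPCODE = {"add": 0x01, "sub": 0x29, "RRmov": 0x89}


def encode_reg_reg(mnemonic: str, rd: str, rs: str) -> bytes:
    """Encode 'mnemonic rd, rs' as an opcode byte plus a register byte."""
    # Register byte: top bits 11 (register to register), then RS, then RD.
    register_byte = 0b11000000 | (REG[rs] << 3) | REG[rd]
    return bytes([OPCODE[mnemonic], register_byte])


def encode_load(rd: str, displacement: int) -> bytes:
    """Encode MRmov 'mov rd, [BYTE displacement+esi]' (opcode 8b)."""
    # Top bits 01 select an 8-bit displacement; the base register is esi.
    register_byte = 0b01000000 | (REG[rd] << 3) | REG["esi"]
    return bytes([0x8B, register_byte, displacement & 0xFF])


print(encode_reg_reg("sub", "esi", "esi").hex())  # 29f6
print(encode_reg_reg("add", "eax", "edx").hex())  # 01d0
print(encode_load("edx", 0x17).hex())             # 8b5617
```

The displacement 0x17 in the last call is just an example value; in a real program it is the address of the label being loaded.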

Example
  mov edx, [BYTE one+esi]
Machine code: 8B 56 17
  8B: opcode (MRmov)
  56: selects the register edx
  17: displacement (hexadecimal)
Semantics: edx ← MEM[esi + 17]

How do Branches Work?
  if (a==b) { T; } else { F; }

      mov eax, [BYTE a+esi]
      mov ebx, [BYTE b+esi]
      sub eax, ebx
      jnz f
      ; Code for T
      mov eax, [BYTE one+esi]
      add eax, eax              ; eax = 2, so ZF = 0 and the next jump is always taken
      jnz e
  f:  ; Code for F
  e:  ; ...

Assembler Example
  Address  Machine code  Assembler using mnemonics
  00       29F6              sub esi, esi
  02       29C0              sub eax, eax
  04       29DB              sub ebx, ebx
  06       8B5617        l:  mov edx, [BYTE one+esi]
  09       01D0              add eax, edx
  0B       01C3              add ebx, eax
  0D       89C1              mov ecx, eax
  0F       8B561B            mov edx, [BYTE ten+esi]
  12       29D1              sub ecx, edx
  14       75F0              jnz l
  16       F4                hlt
  17       01000000      one dd 1
  1B       0A000000      ten dd 10
The result is in ebx.

The NASM Assembler
Windows:
  nasm -f win32 my_test.asm
  link /subsystem:console /entry:start my_test.obj
Linux:
  nasm -f elf my_test.asm
  ld -s -o my_test my_test.o
MacOS:
  nasm -f macho my_test.asm
  ld -arch i386 -o my_test my_test.o

Inline Assembler with Visual Studio
  #include <stdio.h>

  int one = 1, ten = 10, result;

  int main() {
    __asm {
      sub esi, esi
      sub eax, eax
      sub ebx, ebx
    l:
      mov edx, [one+esi]
      add eax, edx
      add ebx, eax
      mov ecx, eax
      mov edx, [ten+esi]
      sub ecx, edx
      jnz l
      mov [result+esi], ebx
    }
    printf("Result: %d\n", result);
    return 0;
  }

Debugging with GDB (Part 1)
  run               Start execution
  x/[size] Label    Dump a region of the memory
  x/[size]i Label   Disassemble some memory region, e.g. x/i $pc
  info registers    Show the value of the registers
  step              Execute one instruction (stepi always steps exactly one machine instruction)
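To see why the result of the assembler example ends up in ebx, the loop between the label l and jnz l can be traced in Python. This is a sketch rather than a Y86 simulator: each line mirrors one instruction of the listing, the values one = 1 and ten = 10 are assumed from the names of the data words, and 32-bit wrap-around is ignored because the values stay small.

```python
def run_sum_loop(one: int = 1, ten: int = 10) -> int:
    """Mirror the assembler example register by register."""
    eax = 0                  # sub eax, eax
    ebx = 0                  # sub ebx, ebx
    while True:
        edx = one            # l: mov edx, [BYTE one+esi]
        eax += edx           # add eax, edx   (eax counts 1, 2, 3, ...)
        ebx += eax           # add ebx, eax   (ebx accumulates the sum)
        ecx = eax            # mov ecx, eax
        edx = ten            # mov edx, [BYTE ten+esi]
        ecx -= edx           # sub ecx, edx   (sets ZF when eax == ten)
        zero_flag = (ecx == 0)
        if zero_flag:        # jnz l falls through once ZF = 1
            break
    return ebx               # hlt; the result is in ebx


print(run_sum_loop())  # 55
```

With these values the program adds up 1 + 2 + ... + 10.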

Debugging with GDB (Part 2)
  break label                  set a breakpoint at label
  info break                   show the breakpoints
  delete breakpoints number    well, delete a breakpoint
  continue                     resume the execution after a breakpoint

Debugging with Visual Studio
[Figure: screenshot]

Debugging with XCode
[Figure: screenshot]

Extensions: Comparisons
We would love to have Y86 commands for
  if (a<b) { ... }
These obviously depend on the number representation. The same bit pattern 1001 is −7 with sign and 9 without sign:
  with sign:     1 > −7   twoc(0001) > twoc(1001)
  without sign:  1 < 9    bin(0001) < bin(1001)

Reminder: Number Interpretation
Binary representation:
  $\mathrm{bin}() : \{0,1\}^n \to \{0, \ldots, 2^n - 1\}$
  $\mathrm{bin}(x) = \sum_{i=0}^{n-1} x_i \cdot 2^i$
Two's complement:
  $\mathrm{twoc}() : \{0,1\}^n \to \{-2^{n-1}, \ldots, 2^{n-1} - 1\}$
  $\mathrm{twoc}(x) = -2^{n-1} \cdot x_{n-1} + \mathrm{bin}(x_{n-2}, \ldots, x_0)$

Comparing Unsigned Integers
Unsigned integers: $\mathrm{bin}(a) < \mathrm{bin}(b) \iff \mathrm{bin}(a) - \mathrm{bin}(b) < 0$
Recall: $-b = (\neg b) + 1$
We get the +1 for free by setting the carry-in of the adder.
Let's pretend we compute with one more bit ("zero extension"):
$$\begin{array}{rcccccl}
  & 0 & a_{n-1} & \ldots & a_1 & a_0 & \\
+ & 1 & \neg b_{n-1} & \ldots & \neg b_1 & \neg b_0 & \\
  & c_n & c_{n-1} & \ldots & c_1 & c_0 & \text{(carry bits)} \\
\hline
= & s_n & s_{n-1} & \ldots & s_1 & s_0 & \text{(sum)}
\end{array}$$
Thus: $\mathrm{bin}(a) - \mathrm{bin}(b) < 0 \iff s_n \iff \neg c_n$
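The two interpretations and the unsigned comparison can be tried out directly. The following sketch assumes bit vectors stored least-significant bit first (so `bits[i]` is x_i); the function names are chosen for this illustration only. `unsigned_less_than` performs the addition a + ¬b + 1 on n+1 bits, as in the adder layout above, and reads off the top sum bit s_n.

```python
def bin_value(bits):
    """bin(): unsigned value; bits[i] is x_i (least-significant bit first)."""
    return sum(bit << i for i, bit in enumerate(bits))


def twoc_value(bits):
    """twoc(): the top bit x_{n-1} has weight -2^(n-1)."""
    n = len(bits)
    return -(bits[n - 1] << (n - 1)) + bin_value(bits[:n - 1])


def to_bits(value, n):
    """n-bit pattern of a non-negative value, least-significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def unsigned_less_than(a_bits, b_bits):
    """Decide bin(a) < bin(b) with an (n+1)-bit adder computing a + not(b) + 1."""
    n = len(a_bits)
    a_ext = a_bits + [0]                          # zero extension of a
    not_b_ext = [1 - bit for bit in b_bits] + [1] # negation of zero-extended b
    carry = 1                                     # the carry-in supplies the +1
    sum_bits = []
    for x, y in zip(a_ext, not_b_ext):
        total = x + y + carry
        sum_bits.append(total & 1)
        carry = total >> 1
    return sum_bits[n] == 1                       # s_n


pattern = [1, 0, 0, 1]                            # the bit string 1001
print(bin_value(pattern), twoc_value(pattern))    # 9 -7
print(unsigned_less_than(to_bits(1, 4), to_bits(9, 4)))  # True
```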

Comparing Signed Integers
Two's complement: $\mathrm{twoc}(a) < \mathrm{twoc}(b) \iff \mathrm{twoc}(a) - \mathrm{twoc}(b) < 0$
Again, let's pretend we have an extra bit ("sign extension"):
$$\begin{array}{rcccccl}
  & a_{n-1} & a_{n-1} & \ldots & a_1 & a_0 & \\
+ & \neg b_{n-1} & \neg b_{n-1} & \ldots & \neg b_1 & \neg b_0 & \\
  & c_n & c_{n-1} & \ldots & c_1 & c_0 & \text{(carry bits)} \\
\hline
= & s_n & s_{n-1} & \ldots & s_1 & s_0 & \text{(sum)}
\end{array}$$
Thus: $\mathrm{twoc}(a) - \mathrm{twoc}(b) < 0 \iff s_n \iff a_{n-1} \oplus \neg b_{n-1} \oplus c_n \iff s_{n-1} \oplus c_{n-1} \oplus c_n$

New Flags: CF, SF, OF
We introduce three new flags for arithmetic operations:
CF: the carry flag ($c_n$ in case of additions, $\neg c_n$ in case of subtraction; meaning: Intel did so)
SF: the sign flag ($s_{n-1}$)
OF: the overflow flag ($c_n \oplus c_{n-1}$)

Examples (Part 1) and Examples (Part 2)
[Worked additions and subtractions, each followed by the resulting values of ZF, CF, SF and OF; the operand bit patterns and the flag values are not legible in the transcript]

Branching Instructions for Comparisons
  Instruction  Flags
  jz, je       $ZF$
  jnz, jne     $\neg ZF$
  jnae, jb     $CF$
  jae, jnb     $\neg CF$
  jna, jbe     $CF \lor ZF$
  ja, jnbe     $\neg(CF \lor ZF)$
  jnge, jl     $SF \oplus OF$
  jge, jnl     $\neg(SF \oplus OF)$
  jng, jle     $(SF \oplus OF) \lor ZF$
  jg, jnle     $\neg((SF \oplus OF) \lor ZF)$
  jmp near     unconditional
n = not, z = zero, e = equal, g = greater, l = less, a = above, b = below
i.e. jnbe = jump if not (below or equal)

Branching Instructions for Comparisons
      sub ax, bx
      Jxxx target
      ...
  target:

  branch if  with sign  without sign
  ax = bx    je         je
  ax ≠ bx    jne        jne
  ax > bx    jg         ja
  ax ≥ bx    jge        jae
  ax < bx    jl         jb
  ax ≤ bx    jle        jbe
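The flag definitions and the branch conditions above can be cross-checked against ordinary integer comparison. The sketch below models `sub a, b` on n-bit registers as a + ¬b + 1 and derives ZF, CF, SF and OF from the carries as defined in this part; `sub_flags`, `JUMPS` and `taken` are names invented for the illustration.

```python
def sub_flags(a: int, b: int, n: int = 8) -> dict:
    """Flags after 'sub a, b' on n-bit registers (a, b are bit patterns)."""
    mask = (1 << n) - 1
    low = mask >> 1                      # mask for the lower n-1 bits
    a, not_b = a & mask, ~b & mask
    total = a + not_b + 1                # a - b computed as a + (not b) + 1
    result = total & mask
    c_n = total >> n                     # carry out of the top bit
    c_n1 = ((a & low) + (not_b & low) + 1) >> (n - 1)  # carry into the top bit
    return {
        "ZF": int(result == 0),
        "CF": 1 - c_n,                   # inverted carry for subtraction
        "SF": result >> (n - 1),
        "OF": c_n ^ c_n1,
    }


# Branch conditions, one representative mnemonic per table row.
JUMPS = {
    "je":  lambda f: f["ZF"] == 1,
    "jne": lambda f: f["ZF"] == 0,
    "jb":  lambda f: f["CF"] == 1,
    "jae": lambda f: f["CF"] == 0,
    "jbe": lambda f: f["CF"] == 1 or f["ZF"] == 1,
    "ja":  lambda f: not (f["CF"] == 1 or f["ZF"] == 1),
    "jl":  lambda f: f["SF"] != f["OF"],
    "jge": lambda f: f["SF"] == f["OF"],
    "jle": lambda f: f["SF"] != f["OF"] or f["ZF"] == 1,
    "jg":  lambda f: not (f["SF"] != f["OF"] or f["ZF"] == 1),
}


def taken(jump: str, a: int, b: int, n: int = 8) -> bool:
    """Would 'sub a, b' followed by the given jump branch?"""
    return JUMPS[jump](sub_flags(a, b, n))


# 4-bit patterns 0001 and 1001: 1 < 9 without sign, but 1 > -7 with sign.
print(taken("jb", 1, 9, 4), taken("jl", 1, 9, 4))  # True False
```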

Example Branching Instructions
  start:  sub esi, esi                 ; array index
          mov edx, [BYTE Intmax+esi]   ; Minimum
          mov ecx, [BYTE Top+esi]      ; top index
          sub ebx, ebx                 ; counter
  L:      mov eax, ebx
          sub eax, ecx
          jae end                      ; counter ≥ Top?
          mov esi, ebx
          mov edi, [BYTE Array+esi]    ; edi := array[ebx]
          mov eax, edi
          sub eax, edx
          jge skip                     ; array[ebx] ≥ Minimum?

Example Branching Instructions (Part 2)
          mov edx, edi                 ; Minimum := array[ebx]
  skip:   sub esi, esi
          mov eax, [BYTE Four+esi]
          add ebx, eax                 ; counter += 4
          jmp near L
  end:    hlt

  Four    dd 4
  Top     dd 40
  Array   dd 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
  Intmax  dd 0x7fffffff

History ARM
1980s: Acorn Computers
1981: BBC Micro (8 bit)
1986: ARM development kit
1990: ARM, Advanced RISC Machines, founded; owners: Acorn Computers, Apple and VLSI Technology

ARM Today
Now primarily licensed as IP, with a focus on low-end embedded systems and phones (more than 90 % market share).
Built by Apple, Nvidia, Qualcomm, Samsung, TI.
2013: 37 billion ARM processors produced.
Early 64-bit prototypes for application in low-power servers.

Visible Data
RAM, organised in 32-bit words
Registers R0 to R15
R15 is a special case: this is the PC
R13 is the stack pointer (SP)
R14 is used for the return address for function calls (LR)
CPSR for various flags
(There is another register file for floating-point numbers)

Basic Instructions
  ADD Rd, Rn, Rm          Rd ← Rn + Rm
  SUB Rd, Rn, Rm          Rd ← Rn − Rm
  MUL Rd, Rm, Rs          Rd ← (Rm · Rs)[31:0]
  SMULL RdL, RdH, Rm, Rs  RdH, RdL ← Rm · Rs (signed)
  UMULL RdL, RdH, Rm, Rs  RdH, RdL ← Rm · Rs (unsigned)
  SDIV Rd, Rm, Rs         Rd ← Rm / Rs (signed)
  UDIV Rd, Rm, Rs         Rd ← Rm / Rs (unsigned)
  AND Rd, Rn, Rm          Rd ← Rn & Rm
  B label                 PC ← label
  BL label                LR ← PC + 4; PC ← label
  BX Rm                   PC ← Rm
Many variants!
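The difference between the three multiplications in the ARM instruction list is easiest to see on concrete bit patterns. A minimal sketch, assuming 32-bit registers held as Python integers and using ARM's names for the long multiplications (SMULL, UMULL); the functions return the pair (RdL, RdH) and are illustrative models, not an emulator.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement number."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul(rm: int, rs: int) -> int:
    """MUL: only the low 32 bits of the product are kept."""
    return (rm * rs) & MASK32


def umull(rm: int, rs: int):
    """UMULL: unsigned 64-bit product, returned as (RdL, RdH)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, (product >> 32) & MASK32


def smull(rm: int, rs: int):
    """SMULL: signed 64-bit product, returned as (RdL, RdH)."""
    product = (to_signed32(rm) * to_signed32(rs)) & ((1 << 64) - 1)
    return product & MASK32, product >> 32


# 0xFFFFFFFF is 4294967295 unsigned but -1 signed.
print(hex(mul(0xFFFFFFFF, 2)))                   # 0xfffffffe
print([hex(v) for v in umull(0xFFFFFFFF, 2)])    # ['0xfffffffe', '0x1']
print([hex(v) for v in smull(0xFFFFFFFF, 2)])    # ['0xfffffffe', '0xffffffff']
```

The low word is the same in all three cases; only the high word depends on whether the operands are read with or without sign.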

Setting Condition Flags
Most instructions can be given a suffix S.
In addition to the usual behaviour, the condition flags (in CPSR) are updated:
  N Z C V
N = negative, Z = zero, C = carry, V = overflow

Using Condition Flags
Most instructions can be given condition suffixes:
  EQ     equal
  NE     not equal
  CS/HS  carry set
  CC/LO  carry clear
  MI     negative
  PL     positive (or zero)
  VS     overflow
  VC     no overflow
  HI     higher
  LS     lower or same
  GE     greater or equal
  LT     less than
  GT     greater than
  LE     less than or equal
These use 4 bits in the instruction word.

ARM Instruction Formats
ARM uses a fixed-size instruction word:
  Cond | Opcode | S | Rn | Rd | Rm     data processing
  Cond | L | offset                    branch and branch&link

ARM Instruction Formats
There is a compressed version called the Thumb-2 Instruction Set.
The instructions have 16 bit.
Fewer options, conditions are a separate instruction.
Aimed at better I-Cache efficiency.

Sequential Processors with Pipeline
We will start with an implementation that has the form and shape of a pipeline, but
  processes one instruction at a time,
  processes the instructions in a fixed order of phases.
These aren't built, but only exist for illustrative purposes.
But: the step to a proper pipeline is minimal (will show!)

The Instruction Phases (Stages)
1. Instruction Fetch: the instruction is copied from the RAM into a register
2. Instruction Decode: loads the values of the operands from the register file into registers A and B; also increments the program counter
3. Execute: performs any operation (say add/sub), and the address arithmetic for load/store
4. Memory: RAM access for load/store
5. Write-Back: stores any result in the register file
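The condition suffixes listed in this part are predicates over the four flags N, Z, C and V. The sketch below spells them out and checks two of them after a flag-setting subtraction. It assumes the usual ARM convention that C after a subtraction is the plain carry out of a + ¬b + 1 (so C = 1 means "no borrow", the opposite of the x86 CF); `subs_flags` and `holds` are names made up for the illustration.

```python
def subs_flags(a: int, b: int, bits: int = 32):
    """N, Z, C, V after a flag-setting subtraction a - b."""
    mask = (1 << bits) - 1
    low = mask >> 1
    a, not_b = a & mask, ~b & mask
    total = a + not_b + 1
    result = total & mask
    carry_out = total >> bits
    carry_into_top = ((a & low) + (not_b & low) + 1) >> (bits - 1)
    n = result >> (bits - 1)
    z = int(result == 0)
    c = carry_out                       # not inverted, unlike the x86 CF
    v = carry_out ^ carry_into_top
    return n, z, c, v


CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,    # also written HS
    "CC": lambda n, z, c, v: c == 0,    # also written LO
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
}


def holds(condition: str, a: int, b: int, bits: int = 32) -> bool:
    """Does the condition hold after subtracting b from a with flag update?"""
    return CONDITIONS[condition](*subs_flags(a, b, bits))


# 4-bit patterns: 9 is higher than 1 without sign, but -7 is not greater than 1.
print(holds("HI", 9, 1, 4), holds("GT", 9, 1, 4))  # True False
```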

An Implementation: High-level View
[Figure: datapath of the implementation; the register labels that survive in the transcript are IP, AR, DRw and C]

Sequential Execution
We first implement a sequential machine:
The stages are processed one after the other, in the fixed order of the instruction phases.
We execute exactly one instruction at a time.
In contrast to multi-cycle designs: we stick to this even if an instruction doesn't actually use a particular stage.

Sequential Execution
Let $I_1, I_2, \ldots$ be the sequence of instructions in program order.
[Timing diagram: stages against time for $I_1$ and $I_2$]

Example: Processing add, frames (1) and (2)
[Figure frames: the datapath while an add instruction is processed; the program shown in the frames includes mov [+esi], edx]

Example: Processing add, frames (3) to (5)
[Figure frames: the add instruction in the remaining stages]

Example: Processing RMmov, frames (1) to (5)
[Figure frames: the datapath while the store mov [+esi], edx is processed stage by stage]

Example: Processing jnz, frames (1) and (2)
[Figure frames: the datapath while jnz l is processed; the frames annotate the jump distance]

Example: Processing jnz, frames (3) to (5)
[Figure frames: jnz l in the remaining stages]

Pipelining
Increases the performance using the assembly-line idea.
$$\text{performance} = \underbrace{\text{instructions per cycle}}_{IPC} \cdot \underbrace{\text{clock frequency}}_{1/\tau}$$
Standard technique in virtually all modern circuitry (not just CPUs, but also GPUs, video, networking, wireless, ...)

Pipelining
  time           1    2    3    4    5    6   ...
  Fetch          I1   I2   I3   I4   I5   I6
  Decode              I1   I2   I3   I4   I5
  Execute                  I1   I2   I3   I4
  Memory                        I1   I2   I3
  Write-Back                         I1   I2
Best case: one instruction per cycle!

Pipelining Performance
Performance: $\dfrac{IPC}{\tau}$ with $\tau = D_{FF} + \dfrac{D}{n}$
where:
  $IPC$: instructions per cycle
  $\tau$: cycle time
  $n$: number of stages
  $D$: combinational delay without the flip flops
  $D_{FF}$: delay of the flip flops
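The performance formulas can be turned into a small calculator. This sketch assumes the ideal case from the timing diagram (no stalls, one instruction entering per cycle) and a combinational delay that splits evenly over the stages; the delay values in the example are invented for illustration.

```python
def cycles_sequential(instructions: int, stages: int = 5) -> int:
    """Sequential machine: every instruction occupies all stages by itself."""
    return instructions * stages


def cycles_pipelined(instructions: int, stages: int = 5) -> int:
    """Ideal pipeline: fill it once, then one instruction finishes per cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


def cycle_time(d_comb: float, d_ff: float, stages: int) -> float:
    """tau = D_FF + D / n, assuming the logic is split evenly over n stages."""
    return d_ff + d_comb / stages


def performance(ipc: float, tau: float) -> float:
    """Instructions per unit of time: IPC / tau."""
    return ipc / tau


print(cycles_sequential(6), cycles_pipelined(6))   # 30 10

# Hypothetical delays in nanoseconds: D = 10.0, D_FF = 0.5.
print(cycle_time(10.0, 0.5, 1), cycle_time(10.0, 0.5, 5))  # 10.5 2.5
```

Because of the constant term for the flip flops, five stages make the cycle 4.2 times shorter rather than 5 times; adding more stages gives diminishing returns.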

Implementing the Pipeline: Roadmap
1. Resolving resource conflicts
2. Modifying the control
3. Dealing with data and control hazards

Resource Conflicts
Let's look at our sequential machine again:
[Figure: datapath of the sequential machine]
Consider the C register of an instruction that is followed by another instruction: what happens to it once the 2nd instruction is fetched?

Register Lifetime
[Figure: for each register of the datapath (among them IP, AR, DRw, C, the flags and eax, ...), the stages in which it is written (W) and read (R)]
Problem: … and C need to be remembered for multiple stages!

Register Lifetime
[Figure: datapath with replicated registers, e.g. C3 and C4]
We resolve by replication!

Resource Conflicts
Q: Which other resources are shared by stages?
A: The memory (shared by the instruction fetch and memory stages)!
Q: What do we do?
A: Most CPUs have an L1-cache that permits two (read-)accesses simultaneously.
(Really two L1 caches: an I- and a D-cache)

Example Pipeline
[Figure frame: the pipelined datapath with the modified program, which includes mov [+esi], ecx]

Example Pipeline, frames (1) to (6)
[Figure frames: the modified program, which includes mov [+esi], ecx, moving through the pipelined datapath; the add and RMmov instructions occupy successive stages]

Data and Control Dependencies
Example program with data dependency:
  add edx, ebx
  mov [+esi], edx
Execution in the pipeline: like that?
[Timing diagram: both instructions in consecutive pipeline slots]

Data and Control Dependencies
Example program with data dependency:
  add edx, ebx
  mov [+esi], edx
Execution in the pipeline:
[Timing diagram: a BUBBLE is inserted, which delays mov [+esi], edx]

Memory
ROM: read-only memory
RAM: random-access memory (but usually means random-access read and write memory)
SRAM: static RAM; stores state as long as power is supplied
DRAM: dynamic RAM; implemented using capacitors; the state is lost without periodic refresh

RAM in PCs
[Figure: RAM module form factors: SIMM, MicroDIMM, RAMBus RIMM, DIMM and SODIMM variants for SDRAM, DDR and DDR-2]

Addresses
[Figure: RAM chip with address pins, DATA pins and WE]
RAM/ROM-Chips store many (billions of) bits.
Distinguish using an address.
The address is given in binary.
Plus WE: read/write.
The data pins are used for reading as well as writing.

Structure
[Figure: cell matrix with a row decoder and a column decoder fed by the address]
RAM/ROM chips are a 2D matrix.
The address is split into a row and a column.
The binary encoding is turned into unary using a decoder.
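The data dependency in the example program earlier in this part (add writes edx, the following mov reads it) is what a pipeline has to detect before it can insert a bubble. The sketch below only does the detection step: it lists read-after-write pairs among instructions that are close enough to overlap in the pipeline. The `Instr` record, the `window` parameter and its default are assumptions of this illustration, and the sketch does not model how many bubbles a particular design needs.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    text: str
    writes: frozenset   # registers written by the instruction
    reads: frozenset    # registers read by the instruction


def raw_hazards(program, window: int = 3):
    """List (producer index, consumer index, register) for read-after-write
    dependencies between instructions at most `window` positions apart.
    Simplification: an intermediate overwrite of the register is ignored."""
    hazards = []
    for i, producer in enumerate(program):
        for j in range(i + 1, min(i + 1 + window, len(program))):
            for reg in sorted(producer.writes & program[j].reads):
                hazards.append((i, j, reg))
    return hazards


program = [
    # add edx, ebx: the target register is a source register, too.
    Instr("add edx, ebx", frozenset({"edx"}), frozenset({"edx", "ebx"})),
    # Store: reads the offset register and the value, writes no register.
    Instr("mov [+esi], edx", frozenset(), frozenset({"esi", "edx"})),
]

print(raw_hazards(program))  # [(0, 1, 'edx')]
```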

SRAM Cell with Two Inverters
[Figure: two cross-coupled inverters between the lines Data and ¬Data, selected by the Address Line]
Reading and writing:
Address line selects the cell.
State is held using the inverters (latch).
Read by comparing Data and ¬Data.

SRAM Cell in CMOS
[Figure: transistor-level cell between VDD and GND, with Address Line, Data and ¬Data]

DRAM
DRAM uses capacitors:
  more simplistic and easier to build than SRAM
  high density, low cost
But: slower!
  fast but expensive SRAM for caches (more on that later)
  slow but inexpensive DRAM for the main memory

Reminder: Capacitors
[Figure: charge over time while charging and discharging]
Store an electric charge, but only for a limited time.

DRAM Cell
[Figure: a capacitor connected to GND and, through a transistor controlled by the Address Line, to Data]
A bit is stored as the charge of a capacitor and has to be refreshed periodically.

Data Buses
Connecting multiple memory chips:
[Figure: a CPU with separate connections to each of three memory modules]
No! I/O pins are expensive!

Data Buses
Goal: effective use of the pricey wires.
Idea: share the wires for data and addresses among the RAM modules.
[Figure: a CPU and three memory modules sharing data, address and control lines]

Interface RAM Chips
Control signals:
  CS (Chip Select): activates a particular chip
  WE (Write Enable)
  OE (Output Enable)
Inactive chips have high-impedance outputs (Z).
Write by setting WE, read by setting OE.
Interface constraint: OE and WE are never both active.

Write Cycle
[Timing diagram: CS, WE, OE, Address, Data valid]

Read Cycle
[Timing diagram: CS, WE, OE, Address, Data valid]

Row and Column Address Strobes
Idea: save even more wires by sending the address in two (or more) steps.
Typical: row and column are sent separately.
RAS: Row Address Strobe, CAS: Column Address Strobe

RAS/CAS Write Cycle
[Timing diagram: Address (Row, then Col), RAS, CAS, WE, Data valid]
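Splitting an address into a row and a column, and turning each part into a unary select signal, can be sketched as follows. Which address bits form the row is an assumption here (the upper bits), as are the function names; the point is only that the same few address wires can carry the row first (with RAS) and the column second (with CAS).

```python
def decode_one_hot(value: int, width: int):
    """Binary-to-unary decoder: exactly one of 2**width select lines is 1."""
    return [1 if line == value else 0 for line in range(1 << width)]


def split_address(address: int, row_bits: int, col_bits: int):
    """Split an address into (row, column); the row is taken from the upper bits."""
    col = address & ((1 << col_bits) - 1)
    row = (address >> col_bits) & ((1 << row_bits) - 1)
    return row, col


def ras_cas_sequence(address: int, row_bits: int, col_bits: int):
    """The two steps in which the address travels over the shared address pins."""
    row, col = split_address(address, row_bits, col_bits)
    return [("RAS", row), ("CAS", col)]


# A 16-cell chip organised as a 4 x 4 matrix: address 1011 is row 10, column 11.
print(split_address(0b1011, 2, 2))       # (2, 3)
print(decode_one_hot(2, 2))              # [0, 0, 1, 0]
print(ras_cas_sequence(0b1011, 2, 2))    # [('RAS', 2), ('CAS', 3)]
```

With this split a chip with 2^(r+c) cells needs only max(r, c) address pins instead of r + c.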

RAS/CAS Read Cycle
[Timing diagram: Address (Row, then Col), RAS, CAS, WE, Data valid]

Bus-Bursts
RAM has long latency.
RAM is often accessed sequentially.
Caches therefore are arranged in lines: a sequence of consecutive bytes.
Bus-bursts: efficient transmission of an entire cache line.

Bus-Bursts
[Timing diagram: Address (Row, then Col), RAS, CAS, OE, DATA (D0, D1, D2, D3), CLK; the CAS Latency (CL) is the delay between CAS and the first data word]

Double Data Rate (DDR) RAM
[Timing diagram: Address (Row, then Col), RAS, CAS, OE, DATA (D0, D1, D2, D3), CLK, CAS Latency (CL)]

Timings
[Table: vendor listing of HyperX DDR and DDR3 DIMM modules and kits with part numbers; each entry states a CAS latency and a timing tuple, e.g. "Non-ECC Low-Latency CL7 (7-7-7-…)" or "Non-ECC CL9 (9-9-9)"; capacities, clock rates and part numbers are not legible in the transcript]

Timings
The entries of a timing tuple are, in the current standard order:
1. CAS Latency
2. RAS-to-CAS Delay
3. RAS Precharge
4. Act-to-Precharge Delay
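Timing values such as the CAS latency are given in clock cycles, so their meaning in nanoseconds depends on the clock rate. The sketch below converts between the two and estimates the length of a burst. It is a simplified model under stated assumptions: only the CAS latency is counted (no RAS-to-CAS delay or precharge), DDR transfers two data words per clock, and the numbers in the example (CL 9 at a hypothetical 800 MHz clock) are illustrative, not taken from the listing.

```python
def cycles_to_ns(cycles: float, clock_mhz: float) -> float:
    """Duration of a number of clock cycles in nanoseconds."""
    return cycles * 1000.0 / clock_mhz


def burst_cycles(cas_latency: int, burst_length: int,
                 transfers_per_clock: int = 2) -> float:
    """Cycles from CAS until the last data word of a burst has arrived.
    transfers_per_clock = 2 models double data rate, 1 models single data rate."""
    return cas_latency + burst_length / transfers_per_clock


print(cycles_to_ns(9, 800))                    # 11.25
print(burst_cycles(9, 4))                      # 11.0
print(cycles_to_ns(burst_cycles(9, 4), 800))   # 13.75
```

The example shows why bursts pay off: the first word costs 9 cycles, the remaining three words of the burst only 2 more.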

Caches
Recall: DRAM slow/cheap, SRAM fast/pricey.
Idea: use SRAM as a fast cache for lots of DRAM.
Hides the latency of the slow DRAM.
Usually good hit rates (more than 90 %).

Caches: Overview
[Figure: an address split into tag, index and offset; the index selects a cache line in the cache, which holds a copy of a line of the main memory]

Caches: Hashing
Q: How to map the address to a cache line?
Easiest answer: use the least-significant bits.
  address = tag | index | offset
tag: distinguishes lines with the same index
index: address in the cache
offset: distinguishes words in a cache line

Collisions
[Figure: two memory lines with different tags but the same index compete for one cache line]

Overview of Design Options for Caches
  size
  line size: number of bytes stored together
  allocation policy: when is a new entry created?
  associativity: length of the list in the hash table
  replacement policy: which entries to purge
  (sectoring)
  write policy: write through or write back
  split I/D cache or unified I/D cache
We will have more options once hierarchy is added.

Cache Size
Bigger cache, better hit rate.
Bigger caches are also more expensive and have longer paths.
Partially addressed by hierarchy (more on that later).
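The hashing scheme described in this part (address = tag | index | offset) is enough to build a toy cache with one entry per index. The following sketch is such a direct-mapped model; it tracks only tags and hit/miss counts, not the cached data, and the parameters in the example (64-byte lines, 16 lines) are assumptions chosen for illustration.

```python
def split(address: int, index_bits: int, offset_bits: int):
    """Split an address into (tag, index, offset), least-significant bits last."""
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset


class DirectMappedCache:
    """One cache line per index; a new tag simply replaces the old one."""

    def __init__(self, index_bits: int, offset_bits: int):
        self.index_bits = index_bits
        self.offset_bits = offset_bits
        self.tags = [None] * (1 << index_bits)
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, load the line."""
        tag, index, _ = split(address, self.index_bits, self.offset_bits)
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag
        self.misses += 1
        return False


# Sequential 4-byte accesses over 256 bytes with 64-byte lines:
# only the first access to each of the 4 lines misses.
cache = DirectMappedCache(index_bits=4, offset_bits=6)
for address in range(0, 256, 4):
    cache.access(address)
print(cache.hits, cache.misses)  # 60 4
```

The clustered access pattern gives a hit rate of 60/64, about 94 %, which is the effect the line size is meant to exploit.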

Line Size
Observation: memory accesses are clustered, i.e., subsequent accesses are often next to each other.
Cache entries have overhead: address bits plus flag bits.
Also remember the latency of memory!
Reduce the overhead by making the cache entry bigger.
Typical size: 64 bytes (512 bits)

Associativity
Also called ways.
An n-way cache can store n entries with the same address hash.
Think of the length of the list in a hash table.
This reduces the number of collisions.

Associativity
[Figure: a cache with several ways; the index selects one cache line with its tag in each way]

Cache Hierarchies
Recall that fast SRAM is expensive, and bigger caches have long paths.
Thus: build a cache for the cache.
L1: closest to the CPU
L2, L3, L4: cache the next level
Caches get bigger the closer they get to the memory.

Statistics
[Table: Model, Year, L1 Cache, L2 Cache, L3 Cache and L4 Cache for Intel processors from the 80486DX through Pentium, Pentium Pro, Pentium MMX, Pentium II, Xeon, Pentium III, Pentium 4, Itanium, Pentium M, Core Duo, Core i7, Core i5 and Core i3 to the Atom SoC; most years and sizes are not legible in the transcript]
Numbers are per core unless shared.
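The effect of associativity on collisions shows up as soon as two addresses with the same index are used alternately. The sketch below models an n-way cache as one small ordered list per index. The replacement policy is an assumption (least recently used); the text above only names replacement as a design option. The trace and the parameters are invented for the illustration.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """n-way cache: each index holds up to `ways` tags, replaced in LRU order."""

    def __init__(self, ways: int, index_bits: int, offset_bits: int):
        self.ways = ways
        self.index_bits = index_bits
        self.offset_bits = offset_bits
        self.sets = [OrderedDict() for _ in range(1 << index_bits)]

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, load the line (evicting if full)."""
        line = address >> self.offset_bits
        index = line & ((1 << self.index_bits) - 1)
        tag = line >> self.index_bits
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)        # mark as most recently used
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)     # evict the least recently used tag
        entries[tag] = True
        return False


def count_hits(ways: int, trace) -> int:
    cache = SetAssociativeCache(ways, index_bits=4, offset_bits=6)
    return sum(cache.access(address) for address in trace)


# 0x0000 and 0x1000 have the same index (0) but different tags.
trace = [0x0000, 0x1000] * 4
print(count_hits(1, trace), count_hits(2, trace))  # 0 6
```

With one way the two addresses keep evicting each other; with two ways both fit, and only the first access to each misses.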


More information

Assembly Language. Lecture 2 x86 Processor Architecture

Assembly Language. Lecture 2 x86 Processor Architecture Assembly Language Lecture 2 x86 Processor Architecture Ahmed Sallam Slides based on original lecture slides by Dr. Mahmoud Elgayyar Introduction to the course Outcomes of Lecture 1 Always check the course

More information

Computer Architecture..Second Year (Sem.2).Lecture(4) مدرس المادة : م. سندس العزاوي... قسم / الحاسبات

Computer Architecture..Second Year (Sem.2).Lecture(4) مدرس المادة : م. سندس العزاوي... قسم / الحاسبات مدرس المادة : م. سندس العزاوي... قسم / الحاسبات... - 26 27 Assembly Level Machine Organization Usage of AND, OR, XOR, NOT AND : X Y X AND Y USE : to chick any bit by change ( to ) or ( to ) EX : AX = FF5

More information
