Administrivia
  today: start assembly
    probably won't finish all these slides
  Assignment 4 due tomorrow
    any questions?
  exam on Wednesday
    today's material is not on the exam
Assembly
  assembly is the programming language closest to the actual CPU
    machine language is binary
    no one wants to write that
  I will start with a simplified, idealized assembly
  then look more closely at x86 assembly
Simple Assembly
  each line of assembly corresponds to 1 instruction
  typical instructions
    load    read data from memory into a register
    store   write data from a register into memory
    move    move data from one register to another
    add     add data from one register to another
Registers
  to be manipulated, data must be in registers
    use load to get it there if need be
  some registers have special purposes
    r0   traditionally used to return values
    bp   points to the stack frame (base pointer)
  other registers are general purpose
    just used for computing
Simple C code
    int foo(int x) {
        int y = x-3;
        return x+y-1;
    }
  first attempt, statement by statement
    int y = x-3;
        load x,y       ; load x into y
        sub $3,y       ; subtract 3 from y
    return x+y-1;
        load x,r0      ; load x into r0
        add y,r0       ; add y to r0
        sub $1,r0      ; sub 1 from r0
  but we can't use x and y in asm!
memory locations
  [figure: stack slots relative to BP: x at +8, "pixie dust" at +4 and +0, y at -4]
memory locations
  x is a parameter
    it is 8 above where the base pointer points
    x is 8(bp)
  y is a local
    it is 4 below where the base pointer points
    y is -4(bp)
using registers
  could just substitute for x,y
    but cannot use memory in most instructions
    must use a register
        load 8(bp), -4(bp)   ; load x into y
        sub $3, -4(bp)       ; subtract 3 from y
        load 8(bp), r0       ; load x into r0
        add -4(bp), r0       ; add y to r0
        sub $1, r0           ; sub 1 from r0
using registers
  must use registers for doing computations
        load 8(bp), r1       ; load x into a register
        sub $3, r1           ; subtract 3 from value
        store -4(bp), r1     ; store x-3 into y
        load 8(bp), r0       ; load x into r0
        load -4(bp), r1      ; load y into a register
        add r1, r0           ; add y to r0 (now x+y)
        sub $1, r0           ; sub 1 from r0
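The register version can be mirrored in C, one statement per instruction. This is only a sketch: the names foo_stepwise, r0, r1 and y_slot are made up for illustration, with y_slot standing in for the memory at -4(bp).

```c
/* Sketch: foo() written one "instruction" per statement.
 * r0 and r1 model registers; y_slot models the local y at -4(bp).
 * All names here are hypothetical. */
int foo_stepwise(int x)
{
    int y_slot;       /* the local y, living in memory */
    int r0, r1;       /* registers */

    r1 = x;           /* load 8(bp), r1   ; load x into a register */
    r1 -= 3;          /* sub $3, r1       ; subtract 3             */
    y_slot = r1;      /* store            ; store x-3 into y       */

    r0 = x;           /* load 8(bp), r0   ; load x into r0         */
    r1 = y_slot;      /* load -4(bp), r1  ; load y into a register */
    r0 += r1;         /* add r1, r0       ; now x+y                */
    r0 -= 1;          /* sub $1, r0       ; now x+y-1              */
    return r0;        /* r0 carries the return value */
}
```

Every computation happens on r0 or r1; memory is only touched by the load and store steps.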
PC
  PC is a very special register
    points to the address of the next instruction
        load 8(bp), r1       ; load x into register
        sub $3, r1           ; subtract 3 from value
  before executing either, PC points to the load instr
  after executing the load, PC points to the sub instr
CPU (logical)
  [figure: logical CPU with Decode, ALU, Mem Buffer, registers R0 R1 R2 R3 ... FP SP PC, and Memory]
  phases for add r0, r1 (R0 holds a, R1 holds b)
    FETCH       the instruction add r0, r1 is read from memory
    DECODE      decoded as R1 = R1 + R0
    OPFETCH     a and b are sent to the ALU
    EXECUTE     the ALU computes a+b
    WRITEBACK   a+b is written into R1
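The five phases can be sketched as a tiny C simulator that runs a single add. Everything here is an assumption made for illustration: the struct layout, the OP_ADD opcode, and a PC that counts instructions rather than bytes.

```c
#include <stdint.h>

enum { OP_ADD = 1 };                  /* hypothetical opcode */

struct instr { int op, src, dst; };   /* decoded form of "add src, dst" */

struct cpu {
    int32_t  r[4];                    /* R0..R3 */
    uint32_t pc;                      /* index of the next instruction */
};

/* Run one instruction from program memory and advance the PC. */
void step(struct cpu *c, const struct instr *memory)
{
    struct instr in = memory[c->pc];  /* FETCH: read the instruction at PC */
    c->pc += 1;                       /* PC now points to the next instruction */

    if (in.op == OP_ADD) {            /* DECODE: dst = dst + src */
        int32_t a = c->r[in.src];     /* OPFETCH: read both operands */
        int32_t b = c->r[in.dst];
        int32_t sum = a + b;          /* EXECUTE: the ALU adds */
        c->r[in.dst] = sum;           /* WRITEBACK: result goes into dst */
    }
}
```

With R0 = 5 and R1 = 7, one call to step on a program containing add r0, r1 leaves R0 unchanged, puts 12 in R1, and moves the PC forward by one.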
CPU (actual)
  [figure]
Processor Families
  many processors share the same assembly
    the same instructions, or at least a major overlap
  x86 architecture
    Intel (Core, i7, ...), AMD
    most computers
  PowerPC
    Xbox 360, Wii, PS3 (closely related)
    other high-end devices (printers, ...)
  Dragonball/ARM
    most PDAs, iPods, phones
x86 architecture
  the original Intel 8086 defined the x86 instructions
  used by all descendants
    including Pentiums and AMD processors
  the 32-bit version is also called IA32
    for Intel Architecture 32-bit
  what we will focus on for the next few weeks
x86 registers
  6 general purpose registers
    %eax %ebx %ecx %edx %esi %edi
  dedicated registers
    %esp   stack pointer
    %ebp   frame pointer (base pointer)
  without the e, gives the low-order 16 bits
    e.g. %ax is the low 16 bits of %eax
basic instructions
    mov s,d    moves s to d
    add s,d    d + s goes into d
    sub s,d    d - s goes into d
    inc d      increments d
    dec d      decrements d
    neg d      negates d
  s is a source, never modified
  d is a destination, generally modified
stack
  [figure: stack frame holding params, pixie dust, then locals; %ebp points into the pixie dust, %esp at the last local]
stack instructions
    push s     pushes s onto the stack
  example
    push r0    ; assume r0 has 7 in it
  first grow the stack by subtracting 4 from esp
  now put the value into the new space
  [figure: stack with params, pixie dust, locals; after the push, %esp points at a new slot holding 7]
stack instructions
    pop d      pops the top of the stack into d
  example
    pop r1     ; puts top of stack into r1
  first move 7 into r1
  now remove the extra space by adding 4 to esp
  [figure: stack with params, pixie dust, locals; after the pop, %esp is back at the last local]
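A toy model of push and pop in C, as a sketch of the two steps each one performs. The struct machine type, the 64-byte memory, and treating esp as an offset into that array are all assumptions made for the example, and there is no bounds checking.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

struct machine {
    uint8_t  mem[STACK_BYTES];   /* toy memory holding only the stack */
    uint32_t esp;                /* offset of the current top of stack */
};

/* push s: grow the stack, then store the value in the new space. */
void push(struct machine *m, int32_t s)
{
    m->esp -= 4;                              /* stack grows downward */
    memcpy(&m->mem[m->esp], &s, sizeof s);
}

/* pop d: read the top of the stack, then give the space back. */
int32_t pop(struct machine *m)
{
    int32_t d;
    memcpy(&d, &m->mem[m->esp], sizeof d);
    m->esp += 4;                              /* remove the extra space */
    return d;
}
```

Starting with esp equal to STACK_BYTES (an empty stack), pushing 7 lowers esp by 4, and the matching pop returns 7 and restores esp.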
instruction sizes
  data instructions can add a suffix: l, w, b
    l (long) does 32 bits
    w (word) does 16 bits
    b (byte) does 8 bits
  examples:
    movl %eax,%ebx     ; moves 32 bits from eax to ebx
    addw %ax,%bx       ; adds low 16 bits from ax to bx
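The difference between the l and w forms can be sketched in C by treating each register as a 32-bit value. The function names simply reuse the mnemonics; this is an illustration, not how an assembler or CPU is implemented.

```c
#include <stdint.h>

/* addl src,dst: a full 32-bit add that wraps on overflow. */
uint32_t addl(uint32_t src, uint32_t dst)
{
    return dst + src;
}

/* addw src,dst: only the low 16 bits take part, and the upper
 * 16 bits of the destination register are left unchanged. */
uint32_t addw(uint32_t src, uint32_t dst)
{
    uint16_t low = (uint16_t)((dst & 0xFFFFu) + (src & 0xFFFFu));
    return (dst & 0xFFFF0000u) | low;
}
```

For example, with 0x00010001 as the source and 0xAAAAFFFF as the destination, addw gives 0xAAAA0000 (the carry out of bit 15 is dropped), while addl gives 0xAAAC0000.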
addressing modes
  x86 operands can be more than just registers
    called addressing modes
    or operand specifiers in the text
  some simple ones we have already seen
    register     %eax       use contents of %eax
    immediate    $3         use value 3 (never a destination)
    memory       8(%ebp)    use memory at %ebp+8
more addressing modes
    absolute          0x2A004           use memory at address 0x2A004
    indexed           8(%eax,%ebp)      use memory at %eax+%ebp+8
    scaled indexed    8(%eax,%ebp,4)    use memory at %eax+%ebp*4+8
  the scale applies to the second (index) register
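All of the memory modes are special cases of one address calculation. In AT&T syntax an operand written disp(base,index,scale) names the address base + index*scale + disp. A minimal sketch in C, where the function name and parameter names are made up for the example:

```c
#include <stdint.h>

/* Address named by disp(base,index,scale).
 * Simpler modes fall out by passing index = 0 and scale = 1. */
uint32_t effective_address(int32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return base + index * scale + (uint32_t)disp;
}
```

So 8(%ebp) is effective_address(8, ebp, 0, 1), and 8(%eax,%ebp,4) is effective_address(8, eax, ebp, 4).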
memory instructions
  there is no load or store instruction
    just use mov with s or d as memory
  example:
    movl %eax,-8(%ebp)   ; stores eax into a local var
    movw (%ebx),%dx      ; loads a word into dx
example program
  consider the following C code
    void storesum(int a, int b, int *p) {
        *p = a + b;
    }
x86 code
    void storesum(int a, int b, int *p) {
        *p = a + b;
    }
  ignoring function setup/return, the x86 code is
    movl 12(%ebp),%eax   ; load b into eax
    movl 8(%ebp),%edx    ; load a into edx
    addl %eax,%edx       ; a+b into edx
    movl 16(%ebp),%eax   ; load p into eax
    movl %edx,(%eax)     ; store a+b into *p
lea instruction
  lea is a special memory instruction
    load effective address
    computes the address of a memory operand
    without actually referencing memory
    think of the & operator
  example:
    lea 8(%eax,%ebx),%ecx   ; puts 8+%eax+%ebx into %ecx
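The comparison with & can be made concrete in C. In this sketch (struct pair and both function names are invented for the example), the first function behaves like a mov from memory and the second like lea: same address calculation, but nothing is read.

```c
#include <stddef.h>

struct pair { int first; int second; };

/* Like mov from memory: computes the address of p->second and reads it. */
int load_second(const struct pair *p)
{
    return p->second;
}

/* Like lea: computes the same address but never touches memory. */
const int *address_of_second(const struct pair *p)
{
    return &p->second;
}
```

The address returned by address_of_second is p plus offsetof(struct pair, second), which is the kind of base-plus-displacement sum lea produces.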
unconditional jumps
  the x86 jmp instruction just jumps to the label
  example:
        movl $2,%eax
        jmp L2
        movl $3,%eax     ; never executed
    L2: movl %eax,%ebx
  results in 2 in %ebx
jump encoding
  target of a jump
    can be an absolute address
      set by the assembler/linker
    can be a PC-relative address
      set by the linker
    can be in a register or memory
      used for function pointers, for example
condition codes
  there are also conditional jumps
    most use one or both of the condition codes
    Z (or ZF): is the previous result 0
    N (or SF): is the previous result negative (signed)
  two instructions just set these codes
    cmp s,d     set codes for d-s
    test s,s    set codes for s itself (as s&s)
  these instructions also use l,w,b
conditional jumps
  numerous conditional jump instructions
    je     jump if the result was equal to 0
    jne    jump if the result was not equal to 0
    js     jump if the result was < 0
    jns    jump if the result was >= 0
    jg     jump if the result was > 0
    jge    jump if the result was >= 0
    jl     jump if the result was < 0
    jle    jump if the result was <= 0
conditional jumps
  the last 4 are signed comparisons
    there are also jumps for unsigned comparisons
  if the condition is not true
    then execute the next instruction
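A sketch in C of how cmp sets the codes and how a few jumps read them. One detail goes beyond the slides and is flagged as such: on real x86 the signed jumps also consult an overflow flag (OF), so that the answer stays right when d-s overflows; jl is taken when SF differs from OF, and jg when ZF is clear and SF equals OF. The struct flags type and the function names are invented for the example.

```c
#include <stdint.h>
#include <stdbool.h>

struct flags { bool zf, sf, of; };   /* of is not covered on the slides */

/* cmpl s,d: compute d - s, keep only the condition codes. */
struct flags cmpl(int32_t s, int32_t d)
{
    uint32_t us = (uint32_t)s, ud = (uint32_t)d;
    uint32_t r = ud - us;                    /* wraps like the hardware */
    struct flags f;
    f.zf = (r == 0);                         /* result was zero */
    f.sf = (r >> 31) != 0;                   /* result looks negative */
    /* signed overflow: d and s differ in sign, and the result's
     * sign differs from d's sign */
    f.of = (((ud ^ us) & (ud ^ r)) >> 31) != 0;
    return f;
}

bool je(struct flags f) { return f.zf; }                  /* d == s */
bool js(struct flags f) { return f.sf; }                  /* sign bit set */
bool jl(struct flags f) { return f.sf != f.of; }          /* d <  s, signed */
bool jg(struct flags f) { return !f.zf && f.sf == f.of; } /* d >  s, signed */
```

The overflow case is where js and jl part ways: comparing the most negative int against 1 leaves the sign flag clear, yet jl is still taken because OF is set.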
If example
  consider the C fragment
    void foo(int a) {
        int b;
        ...
        if (a < 3) b = 7;
        else b = -a;
If example
    if (a < 3) b = 7;
    else b = -a;

        cmpl $2,8(%ebp)       ; compare 2 and a
        jg L2                 ; jump if a > 2
        movl $7,-12(%ebp)     ; move 7 into b
        jmp L4                ; skip over else
    L2:                       ; else part
        movl 8(%ebp),%eax     ; load a
        negl %eax             ; compute -a
        movl %eax,-12(%ebp)   ; store -a into b
    L4:                       ; join back up
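The same control flow can be written in C with goto, which makes the shape of the assembly easier to see. This is a sketch: the function name is made up, and returning b is an addition so the result can be observed.

```c
/* The if/else from the slide, restructured the way the assembly does it:
 * test the opposite condition and jump over the "then" part. */
int if_example(int a)
{
    int b;

    if (a > 2) goto L2;   /* cmpl $2,a ; jg L2 */
    b = 7;                /* then part         */
    goto L4;              /* skip over else    */
L2:
    b = -a;               /* else part         */
L4:
    return b;             /* join back up      */
}
```

Note that the test is inverted: the C source asks a < 3, while the jump is taken when a > 2, so the fall-through path is the "then" branch.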
function setup
  remember what the stack frame looks like
    [figure: stack frame, higher addresses first]
      args         start at +8 from ebp
      return PC    at +4
      saved ebp    at +0, where ebp points
      locals       below ebp; esp points at the last one
    the return PC and saved ebp are the "pixie dust" from earlier
function setup
  when a function is called
    args are on the stack
    return PC is on the stack
    stack pointer (esp) points at the return PC slot
  the function must
    push ebp
    set ebp to point to the saved ebp
    move esp to leave room for locals
function setup
  the code to do this looks like
    pushl %ebp           ; save the current ebp
    movl %esp,%ebp       ; make ebp point to it
    subl $8,%esp         ; allocate locals
leave instruction
  the leave instruction undoes this
  same effect as
    movl %ebp,%esp       ; pop off locals
    popl %ebp            ; restore old ebp
call instruction
  the actual function call is done with call
  the C code foo(3) is implemented with the x86 code
    pushl $3             ; push the argument
    call _foo            ; save the next pc and
                         ; jump to the start of foo
ret instruction
  the ret instruction undoes a call
    it pops the return address off the stack
    and jumps to it
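The whole call, setup, leave, ret sequence can be traced with a small C model of the stack. This is a sketch under stated assumptions: struct machine, the 32-word memory, the made-up instruction addresses, and the helper name enter_frame are all hypothetical, and nothing is bounds checked.

```c
#include <stdint.h>

#define WORDS 32

struct machine {
    int32_t  mem[WORDS];   /* toy stack memory, one 4-byte word per slot */
    uint32_t esp, ebp;     /* byte addresses into mem */
    uint32_t pc;           /* made-up instruction address */
};

void push(struct machine *m, int32_t v)
{
    m->esp -= 4;
    m->mem[m->esp / 4] = v;
}

int32_t pop(struct machine *m)
{
    int32_t v = m->mem[m->esp / 4];
    m->esp += 4;
    return v;
}

/* call: save the address of the next instruction, then jump. */
void call(struct machine *m, uint32_t target, uint32_t next_pc)
{
    push(m, (int32_t)next_pc);
    m->pc = target;
}

/* pushl %ebp ; movl %esp,%ebp ; subl $local_bytes,%esp */
void enter_frame(struct machine *m, uint32_t local_bytes)
{
    push(m, (int32_t)m->ebp);   /* save the current ebp */
    m->ebp = m->esp;            /* ebp points at the saved ebp */
    m->esp -= local_bytes;      /* allocate locals */
}

/* leave: movl %ebp,%esp ; popl %ebp */
void leave(struct machine *m)
{
    m->esp = m->ebp;            /* pop off locals */
    m->ebp = (uint32_t)pop(m);  /* restore old ebp */
}

/* ret: pop the return address and jump to it. */
void ret(struct machine *m)
{
    m->pc = (uint32_t)pop(m);
}
```

After pushing one argument, calling, and setting up the frame, the argument sits at 8(ebp) and the return address at 4(ebp). After leave and ret, ebp and pc are restored, but esp still sits 4 below where it started because the argument is still on the stack.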
return value
  if the function returns a value
    the function puts the value into eax
    the calling code gets the value out of eax
  so consider the function
    int sum(int a, int b) {
        return a+b;
    }
sum code
    int sum(int a, int b) {
        return a+b;
    }
  the x86 code for this function is
    pushl %ebp
    movl %esp,%ebp
    movl 12(%ebp),%eax
    addl 8(%ebp),%eax
    leave
    ret
calling sum
  consider calling sum as
    c = sum(x,3);
  where c is a local and x is a parameter
  possible code for this is
    pushl $3              ; push second arg
    movl 8(%ebp),%eax     ; get x
    pushl %eax            ; push first arg
    call _sum             ; call sum
    movl %eax,-12(%ebp)   ; save return in c
what's wrong?
  this is bad code in one respect
  anyone see it?
creeping stack
  it leaves the arguments on the stack
    could insert addl $8,%esp after the call
  gcc actually pre-allocates arg space
    args are put on the stack with
        movl $3,4(%esp)
        movl 8(%ebp),%eax
        movl %eax,(%esp)
    no need to clean up after each call
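The creep is easy to see by tracking only esp across repeated calls. A minimal sketch, assuming each call pushes arg_bytes of arguments; the function name and parameters are invented for the example.

```c
#include <stdint.h>

/* Value of esp after `calls` calls that each push `arg_bytes` of
 * arguments. If caller_cleans_up is zero, the arguments are never
 * removed and the stack creeps downward. */
uint32_t esp_after_calls(uint32_t esp, unsigned calls,
                         uint32_t arg_bytes, int caller_cleans_up)
{
    for (unsigned i = 0; i < calls; i++) {
        esp -= arg_bytes;          /* pushl each argument */
        esp -= 4;                  /* call pushes the return address */
        esp += 4;                  /* ret pops it again */
        if (caller_cleans_up)
            esp += arg_bytes;      /* addl $arg_bytes,%esp */
    }
    return esp;
}
```

Ten calls to sum without cleanup move esp down by 80 bytes; with the cleanup after each call, esp ends where it started.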