Machine-Level Programming (2)

Similar documents
Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

Machine-Level Programming II: Control

Machine-Level Programming III: Procedures

Machine-Level Programming II: Control

Machine-level Programs Procedure

Machine- Level Representa2on: Procedure

Machine Level Programming: Control

CS429: Computer Organization and Architecture

2/12/2016. Today. Machine-Level Programming II: Control. Condition Codes (Implicit Setting) Processor State (x86-64, Partial)

CS429: Computer Organization and Architecture

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

CS367. Program Control

Outline. Review: Assembly/Machine Code View. Processor State (x86-64, Par2al) Condi2on Codes (Explicit Se^ng: Compare) Condi2on Codes (Implicit Se^ng)

Mechanisms in Procedures. CS429: Computer Organization and Architecture. x86-64 Stack. x86-64 Stack Pushing

Machine-Level Programming II: Control

Assembly Programming IV

L09: Assembly Programming III. Assembly Programming III. CSE 351 Spring Guest Lecturer: Justin Hsia. Instructor: Ruth Anderson

CS 33. Machine Programming (2) CS33 Intro to Computer Systems XII 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Assembly II: Control Flow

x86-64 Programming III

Procedures and the Call Stack

x86 Programming II CSE 351 Winter

Assembly III: Procedures. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

x86-64 Programming II

The Stack & Procedures

The Hardware/Software Interface CSE351 Spring 2013

Assembly Programming III

Assembly Programming IV

Control flow (1) Condition codes Conditional and unconditional jumps Loops Conditional moves Switch statements

CS 33. Machine Programming (2) CS33 Intro to Computer Systems XI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

Machine- Level Programming II: Arithme6c & Control

x86 Programming III CSE 351 Autumn 2016 Instructor: Justin Hsia

Credits to Randy Bryant & Dave O Hallaron

Lecture 4: x86_64 Assembly Language

The Stack & Procedures

Machine Language CS 3330 Samira Khan

void P() {... y = Q(x); print(y); return; } ... int Q(int t) { int v[10];... return v[t]; } Computer Systems: A Programmer s Perspective

x86-64 Programming III & The Stack

Machine-Level Programming II: Arithmetic & Control /18-243: Introduction to Computer Systems 6th Lecture, 5 June 2012

Stack Frame Components. Using the Stack (4) Stack Structure. Updated Stack Structure. Caller Frame Arguments 7+ Return Addr Old %rbp

%r8 %r8d. %r9 %r9d. %r10 %r10d. %r11 %r11d. %r12 %r12d. %r13 %r13d. %r14 %r14d %rbp. %r15 %r15d. Sean Barker

Foundations of Computer Systems

x86 64 Programming II

Controlling Program Flow

Machine Program: Procedure. Zhaoguo Wang

Machine Level Programming II: Arithmetic &Control

Machine-Level Programming II: Arithmetic & Control. Complete Memory Addressing Modes

Machine-Level Programming II: Control and Arithmetic

Roadmap. Java: Assembly language: OS: Machine code: Computer system:

Processor state. Information about currently executing program. %rax, %rdi, ... %r14 %r15. %rsp. %rip CF ZF SF OF. Temporary data

Sungkyunkwan University

Foundations of Computer Systems

Sungkyunkwan University

Machine-Level Programming II: Control Flow

CS 107. Lecture 13: Assembly Part III. Friday, November 10, Stack "bottom".. Earlier Frames. Frame for calling function P. Increasing address

Areas for growth: I love feedback

Roadmap. Java: Assembly language: OS: Machine code: Computer system:

CSCI 2021: x86-64 Control Flow

Where We Are. Optimizations. Assembly code. generation. Lexical, Syntax, and Semantic Analysis IR Generation. Low-level IR code.

Assembly Language II: Addressing Modes & Control Flow

ASSEMBLY II: CONTROL FLOW. Jo, Heeseung

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CS241 Computer Organization Spring Addresses & Pointers

x64 Cheat Sheet Fall 2014

How Software Executes

Compiling C Programs Into X86-64 Assembly Programs

Condition Codes The course that gives CMU its Zip! Machine-Level Programming II Control Flow Sept. 13, 2001 Topics

Condition Codes. Lecture 4B Machine-Level Programming II: Control Flow. Setting Condition Codes (cont.) Setting Condition Codes (cont.

CSE351 Spring 2018, Midterm Exam April 27, 2018

Intel x86-64 and Y86-64 Instruction Set Architecture

CF Carry Flag SF Sign Flag ZF Zero Flag OF Overflow Flag. ! CF set if carry out from most significant bit. "Used to detect unsigned overflow

Assembly Language: Function Calls

Systems Programming and Computer Architecture ( )

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 5

Princeton University Computer Science 217: Introduction to Programming Systems. Assembly Language: Function Calls

University of Washington

Machine- Level Programming II: Arithme c & Control

Giving credit where credit is due

CSC 252: Computer Organization Spring 2018: Lecture 6

Machine- Level Programming II: Arithme6c & Control

CISC 360. Machine-Level Programming II: Control Flow Sep 23, 2008

Page # CISC 360. Machine-Level Programming II: Control Flow Sep 23, Condition Codes. Setting Condition Codes (cont.)

Homework 0: Given: k-bit exponent, n-bit fraction Find: Exponent E, Significand M, Fraction f, Value V, Bit representation

Chapter 3 Machine-Level Programming II Control Flow

Lecture 3 CIS 341: COMPILERS

Computer Architecture I: Outline and Instruction Set Architecture. CENG331 - Computer Organization. Murat Manguoglu

CSE 351 Midterm Exam Spring 2016 May 2, 2015

Computer Organization: A Programmer's Perspective

Data Representa/ons: IA32 + x86-64

Control flow. Condition codes Conditional and unconditional jumps Loops Switch statements

Do not turn the page until 5:10.

System Programming and Computer Architecture (Fall 2009)

CSE 351 Midterm Exam

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 12

cmovxx ra, rb 2 fn ra rb irmovq V, rb 3 0 F rb V rmmovq ra, D(rB) 4 0 ra rb mrmovq D(rB), ra 5 0 ra rb OPq ra, rb 6 fn ra rb jxx Dest 7 fn Dest

The Hardware/Software Interface CSE351 Spring 2015

ASSEMBLY III: PROCEDURES. Jo, Heeseung

Review addressing modes

Assembly III: Procedures. Jo, Heeseung

Transcription:

Machine-Level Programming (2) Yanqiao ZHU Introduction to Computer Systems Project Future (Fall 2017) Google Camp, Tongji University

Outline Control Condition Codes Conditional Branches and Conditional Moves Loops Switch Statements Procedures Stack Structure Calling Conventions Passing Control Passing Data Managing Local Data

Conditional Branches Introduction to Computer Systems

Processor State (x86-64, Partial) Information about currently executing program: Temporary data (%rax, ) Location of runtime stack (%rsp) Location of current code control point (%rip, ) Status of recent tests (CF, ZF, SF, OF)

Condition Codes (Implicit Setting) Single bit registers CF Carry Flag (for unsigned) SF Sign Flag (for signed) ZF Zero Flag OF Overflow Flag (for signed) Implicitly set (as side effect) of arithmetic operations. Example: addq Src,Dest t = a+b CF set if carry out from most significant bit (unsigned overflow) ZF set if t == 0 SF set if t < 0 (as signed) OF set if two s-complement (signed) overflow (a>0 && b>0 && t<0) (a<0 && b<0 && t>=0) Not set by leaq instruction.

Condition Codes (Explicit Setting: Compare) Explicit setting by compare instruction: cmpq Src2, Src1 cmpq b,a like computing a-b without setting destination CF set if carry out from most significant bit (used for unsigned comparisons) ZF set if a == b SF set if (a-b) < 0 (as signed) OF set if two s-complement (signed) overflow (a>0 && b<0 && (a-b)<0) (a<0 && b>0 && (a-b)>0)

Condition Codes (Explicit Setting: Test) Explicit setting by test instruction: testq Src2, Src1 testq b,a like computing a&b without setting destination Sets condition codes based on value of Src1 & Src2. Useful to have one of the operands be a mask. ZF set when a&b == 0 SF set when a&b < 0

Reading Condition Codes setx instructions: Set low-order byte of destination to 0 or 1 based on combinations of condition codes. Does not alter remaining 7 bytes. setx Condition Description sete ZF Equal / Zero setne ~ZF Not Equal / Not Zero sets SF Negative setns ~SF Nonnegative setg ~(SF^OF)&~ZF Greater (Signed) setge ~(SF^OF) Greater or Equal (Signed) setl (SF^OF) Less (Signed) setle (SF^OF) ZF Less or Equal (Signed) seta ~CF&~ZF Above (unsigned) setb CF Below (unsigned)

Reading Condition Codes (cont.) setx instructions: Set single byte based on combination of condition codes. One of addressable byte registers: Does not alter remaining bytes. Typically use movzbl to finish job. 32-bit instructions also set upper 32 bits to 0.

Reading Condition Codes (cont.) Figure 3.1: Understanding gt().

Jumping jx Instructions: Jump to different part of code depending on condition codes. jx Condition Description jmp 1 Unconditional je ZF Equal / Zero jne ~ZF Not Equal / Not Zero js SF Negative jns ~SF Nonnegative jg ~(SF^OF)&~ZF Greater (Signed) jge ~(SF^OF) Greater or Equal (Signed) jl (SF^OF) Less (Signed) jle (SF^OF) ZF Less or Equal (Signed) ja ~CF&~ZF Above (unsigned) jb CF Below (unsigned)

Expressing with Goto Code C allows goto statement. Jump to position designated by label. long absdiff (long x, long y) { long result; if (x > y) result = x-y; else result = y-x; return result; } long absdiff_j (long x, long y) { long result; int ntest = x <= y; if (ntest) goto Else; result = x-y; goto Done; Else: result = y-x; Done: return result; }

Using Conditional Moves Conditional move instructions: Instruction supports: if (Test) Dest Src Supported in post-1995 x86 processors. GCC tries to use them, but only when known to be safe. Why? Branches are very disruptive to instruction flow through pipelines. Conditional moves do not require control transfer. // C version val = Test? Then_Expr : Else_Expr; // Goto version result = Then_Expr; eval = Else_Expr; nt =!Test; if (nt) result = eval; return result;

Using Conditional Moves (cont.) Figure 3.2: Conditional move example.

Bad Cases for Conditional Move Expensive computations Bad performance: both values get computed. Only makes sense when computations are very simple. Risky computations val = Test(x)? Hard1(x) : Hard2(x); val = p? *p : 0; Unsafe: both values get computed but may have undesirable effects. Computations with side effects Illegal: must be side-effect free. val = x > 0? x*=7 : x+=3;

Loops Introduction to Computer Systems

Do-While Loop Figure 3.3: Do-while loop example.

Do-While Loop (cont.) Figure 3.4: Do-while loop compilation.

General While Translation #1 Jump-to-middle translation Used with -Og While version while (Test) Body Goto Version goto test; loop: Body test: if (Test) goto loop; done:

General While Translation #2 Do-while conversion Used with -O1 While version while (Test) Body Goto Version if (!Test) goto done; loop: Body if (Test) goto loop; done: Do-While Version if (!Test) goto done; do Body while(test); done:

Switch Statements Introduction to Computer Systems

Jump Table Figure 3.5: Jump table structure.

Switch Statement Figure 3.6: Switch statement example.

Assembly Setup Explanation Table Structure Each target requires 8 bytes. Base address at.l4. Jumping Direct: jmp.l8 Jump target is denoted by label.l8. Indirect: jmp *.L4(,%rdi,8) Start of jump table:.l4. Must scale by factor of 8 (addresses are 8 bytes). Fetch target from effective Address.L4 + x * 8. Only for 0 x 6. Jump table.section.rodata.align 8.L4:.quad.L8 # x = 0.quad.L3 # x = 1.quad.L5 # x = 2.quad.L9 # x = 3.quad.L8 # x = 4.quad.L7 # x = 5.quad.L7 # x = 6

Wrapping Up Assembler Control Conditional jump Conditional move Indirect jump (via jump tables) Compiler generates code sequence to implement more complex control Standard Techniques Loops converted to do-while or jump-to-middle form Large switch statements use jump tables Sparse switch statements may use decision trees (if-elseif-elseif-else)

Procedures Introduction to Computer Systems

Mechanisms in Procedures (1/5) Passing control To beginning of procedure code Back to return point Passing data Procedure arguments Return value Memory management Allocate during procedure execution Deallocate upon return P() { y = Q(x); print(y) } int Q(int i) { int t = 3 * i; int v[10]; return v[t]; }

Mechanisms in Procedures (2/5) Passing control To beginning of procedure code Back to return point Passing data Procedure arguments Return value Memory management Allocate during procedure execution Deallocate upon return P() { y = Q(x); print(y) } int Q(int i) { int t = 3 * i; int v[10]; return v[t]; }

Mechanisms in Procedures (3/5) Passing control To beginning of procedure code Back to return point Passing data Procedure arguments Return value Memory management Allocate during procedure execution Deallocate upon return P() { y = Q(x); print(y) } int Q(int i) { int t = 3 * i; int v[10]; return v[t]; }

Mechanisms in Procedures (4/5) Passing control To beginning of procedure code Back to return point Passing data Procedure arguments Return value Memory management Allocate during procedure execution Deallocate upon return P() { y = Q(x); print(y) } int Q(int i) { int t = 3 * i; int v[10]; return v[t]; }

Mechanisms in Procedures (5/5) Mechanisms all implemented with machine instructions. x86-64 implementation of a procedure uses only those mechanisms require. Machine instructions implement the mechanisms, but the choices are determined by designers. These choices make up the Application Binary Interface (ABI).

x86-64 Stack (1/3) Region of memory managed with stack discipline. Memory viewed as array of bytes. Different regions have different purposes. Like ABI, a policy decision. stack code

x86-64 Stack (2/3) Region of memory managed with stack discipline. stack bottom stack code Stack Pointer: %rsp stack top

x86-64 Stack (3/3) Region of memory managed with stack discipline. stack bottom Grows toward lower addresses. Register %rsp contains lowest stack address. Address of top element Stack Pointer: %rsp stack top increasing addresses stack grows down

x86-64 Stack: Push pushq Src Fetch operand at Src val Decrement %rsp by 8 Write operand at address given by %rsp stack bottom increasing addresses Stack Pointer: %rsp stack top stack grows down

x86-64 Stack: Push pushq Src Fetch operand at Src val Decrement %rsp by 8 Write operand at address given by %rsp stack bottom increasing addresses Stack Pointer: %rsp 8 stack top stack grows down

x86-64 Stack: Push pushq Src Fetch operand at Src Decrement %rsp by 8 Write operand at address given by %rsp stack bottom increasing addresses Stack Pointer: %rsp 8 val stack grows down stack top

x86-64 Stack: Pop popq Dest Read value at address given by %rsp Increment %rsp by 8 Store value at Dest (usually a register) stack bottom increasing addresses stack grows down Stack Pointer: %rsp val stack top

x86-64 Stack: Pop popq Dest Read value at address given by %rsp Increment %rsp by 8 Store value at Dest (usually a register) stack bottom increasing addresses Stack Pointer: %rsp +8 val stack grows down stack top

x86-64 Stack: Pop popq Dest Read value at address given by %rsp val Increment %rsp by 8 Store value at Dest (usually a register) stack bottom increasing addresses The memory doesn t change, only the value of %rsp. Stack Pointer: %rsp val stack top stack grows down

Code Examples void multstore(long x, long y, long *dest) { long t = mult2(x, y); *dest = t; } 0000000000400540 <multstore>: 400540: push %rbx # Save %rbx 400541: mov %rdx, %rbx # Save dest 400544: callq 400550 <mult2> # mult2(x,y) 400549: mov %rax, (%rbx) # Save at dest 40054c: pop %rbx # Restore %rbx 40054d: retq # Return

Code Examples (cont.) long mult2(long a, long b) { long s = a * b; return s; } 0000000000400550 <mult2>: 400550: mov %rdi, %rax # a 400553: imul %rsi, %rax # a * b 400557: retq # Return

Procedure Control Flow Use stack to support procedure call and return. Procedure call: call label Push return address on stack Jump to label Return address: Address of the next instruction right after call Example from disassembly Procedure return: ret Pop address from stack Jump to address

Control Flow Example (1/4) 0x130 0x128 0x120 0000000000400540 <multstore>: 400544: callq 400550 <mult2> 400549: mov %rax, (%rbx) %rsp %rip 0x120 0x400544 0000000000400550 <mult2>: 400550: mov %rdi, %rax 400557: retq

Control Flow Example (2/4) 0x130 0x128 0x120 0x118 0x400549 0000000000400540 <multstore>: 400544: callq 400550 <mult2> 400549: mov %rax, (%rbx) %rsp %rip 0x118 0x400550 0000000000400550 <mult2>: 400550: mov %rdi, %rax 400557: retq

Control Flow Example (3/4) 0x130 0x128 0x120 0x118 0x400549 0000000000400540 <multstore>: 400544: callq 400550 <mult2> 400549: mov %rax, (%rbx) %rsp %rip 0x118 0x400557 0000000000400550 <mult2>: 400550: mov %rdi, %rax 400557: retq

Control Flow Example (4/4) 0x130 0x128 0x120 0000000000400540 <multstore>: 400544: callq 400550 <mult2> 400549: mov %rax, (%rbx) %rsp %rip 0x120 0x400549 0000000000400550 <mult2>: 400550: mov %rdi, %rax 400557: retq

Procedure Data Flow Registers First 6 arguments Return value %rdi %rsi %rdx %rcx %r8 %r9 %rax Stack Only allocate stack space when needed Arg n Arg 8 Arg 7

Recall: x86-64 Integer Registers

Data Flow Examples void multstore(long x, long y, long *dest) { long t = mult2(x, y); *dest = t; } 0000000000400540 <multstore>: # x in %rdi, y in %rsi, dest in %rdx 400541: mov %rdx,%rbx # Save dest 400544: callq 400550 <mult2> # mult2(x,y) # t in %rax 400549: mov %rax,(%rbx) # Save at dest

Data Flow Examples (cont.) long mult2(long a, long b) { long s = a * b; return s; } 0000000000400550 <mult2>: # a in %rdi, b in %rsi 400550: mov %rdi,%rax # a 400553: imul %rsi,%rax # a * b # s in %rax 400557: retq # Return

Stack-Based Languages Languages that support recursion E.g., C, Pascal, Java Code must be Reentrant. Multiple simultaneous instantiations of single procedure. Need some place to store state of each instantiation. Arguments Local variables Return pointer

Stack-Based Languages (cont.) Stack discipline State for given procedure needed for limited time. From when called to when return. Callee returns before caller does. Stack allocated in Frames. State for single procedure instantiation

Call Chain Example (1/10) Example Call Chain: yoo() { who(); } who() { ami(); ami(); } ami() { ami(); } yoo who ami ami ami Figure 3.7: Procedure ami() is recursive.

Stack Frames Contents Return information Local storage (if needed) Temporary space (if needed) Management Space allocated when enter procedure. Set-up code Includes push by call instruction Deallocated when return. Finish code Includes pop by ret instruction Frame Pointer: %rbp (optional) Stack Pointer: %rsp Previous Frame Frame for proc Stack Top

Call Chain Example (2/10) Stack yoo() { who(); } yoo who ami ami Previous yoo %rbp %rsp ami

Call Chain Example (3/10) Stack yoo() who() { { who(); ami(); } ami(); } yoo who ami ami ami Previous yoo who %rbp %rsp

Call Chain Example (4/10) Stack yoo() who() { { ami() who(); ami(); { } ami(); ami(); } } yoo who ami ami ami Previous yoo who ami %rbp %rsp

Call Chain Example (5/10) Stack yoo() yoo Previous who() { { who yoo ami() who(); ami(); { ami() ami ami who } { ami(); ami(); ami(); } ami ami } } ami %rbp %rsp

Call Chain Example (6/10) Stack yoo() who() { { ami() who(); ami(); { } ami(); ami(); } } yoo who ami ami ami Previous yoo who ami %rbp %rsp

Call Chain Example (7/10) Stack yoo() who() { { who(); ami(); } ami(); } yoo who ami ami ami Previous yoo who %rbp %rsp

Call Chain Example (8/10) Stack yoo() who() { { ami() who(); ami(); { } ami(); ami(); } } yoo who ami ami ami Previous yoo who ami %rbp %rsp

Call Chain Example (9/10) Stack yoo() who() { { who(); ami(); } ami(); } yoo who ami ami ami Previous yoo who %rbp %rsp

Call Chain Example (10/10) Stack yoo() { who(); } yoo who ami ami Previous yoo %rbp %rsp ami

x86-64/linux Stack Frame Current Stack Frame ( Top to Bottom) Argument build: Parameters for function about to call. Local variables: If can t keep in registers Saved register context Old frame pointer (optional) Caller Stack Frame Return address Pushed by call instruction Arguments for this call Arguments 7+ Return Address Old %rbp Saved Registers + Local Variables Argument Build (optional) Caller Frame Frame Pointer (optional) %rbp Stack Pointer %rsp Figure 3.8: x86-64/linux stack frame.

Example: incr long incr(long *p, long val) { long x = *p; long y = x + val; *p = y; return x; } incr: movq addq movq ret (%rdi), %rax %rax, %rsi %rsi, (%rdi) Register %rdi %rsi %rax Use(s) Argument p Argument val, y x, Return value

Example: Calling incr (1/6) long call_incr() { long v1 = 15213; long v2 = incr(&v1, 3000); return v1+v2; } Initial Stack Structure Rtn address %rsp Resulting Stack Structure call_incr: subq $16, %rsp movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq 8(%rsp), %rax addq $16, %rsp ret Rtn address 15213 Unused %rsp+8 %rsp

Example: Calling incr (2/6) long call_incr() { long v1 = 15213; long v2 = incr(&v1, 3000); return v1+v2; } call_incr: subq $16, %rsp movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq 8(%rsp), %rax addq $16, %rsp ret Stack Structure Rtn address Aside 1: movl $3000, %esi Remember, movl -> %exx zeros 15213 out high order %rsp+8 32 bits. Why use movl instead of movq? 1 byte shorter. Aside 2: leaq 8(%rsp), %rdi Unused %rsp Computes %rsp+8 Actually, used for what it is meant! Register %rdi Use(s) &v1 %rsi 3000

Example: Calling incr (3/6) long call_incr() { long v1 = 15213; long v2 = incr(&v1, 3000); return v1+v2; } Stack Structure Rtn address call_incr: subq $16, %rsp movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq 8(%rsp), %rax addq $16, %rsp ret 18213 Unused Register Use(s) %rdi &v1 %rsi 3000 %rsp+8 %rsp

Example: Calling incr (4/6) long call_incr() { long v1 = 15213; long v2 = incr(&v1, 3000); return v1+v2; } call_incr: subq $16, %rsp movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq 8(%rsp), %rax addq $16, %rsp ret Stack Structure Rtn address 18213 Unused Register Use(s) %rax Return value %rsp+8 %rsp

Stack Structure Example: Calling incr (5/6) long call_incr() { long v1 = 15213; long v2 = incr(&v1, 3000); return v1+v2; } Rtn address 18213 Unused %rsp+8 %rsp call_incr: subq $16, %rsp movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq 8(%rsp), %rax addq $16, %rsp ret Register Use(s) %rax Return value Updated Stack Structure Rtn address %rsp

Example: Calling incr (6/6) Updated Stack Structure long call_incr() { long v1 = 15213; long v2 = incr(&v1, 3000); return v1+v2; } Rtn address %rsp call_incr: subq $16, %rsp movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq 8(%rsp), %rax addq $16, %rsp ret Register Use(s) %rax Return value Final Stack Structure %rsp

Register Saving Conventions When procedure yoo calls who: yoo is the caller. who is the callee. Can register be used for temporary storage? Contents of register %rdx overwritten by who. This could be trouble something should be done! Need some coordination. yoo: movq $15213, %rdx call who addq %rdx, %rax ret who: subq $18213, %rdx ret

Register Saving Conventions (cont.) When procedure yoo calls who: yoo is the caller. who is the callee. Conventions Caller Saved Caller saves temporary values in its frame before the call. Callee Saved Callee saves temporary values in its frame before using. Callee restores them before returning to caller.

x86-64 Linux Register Usage %rax Return value, also caller-saved Can be modified by procedure %rdi,, %r9 Arguments, also caller-saved Can be modified by procedure %r10, %r11 Caller-saved Can be modified by procedure %rax %rdi %rsi %rdx %rcx %r8 %r9 %r10 %r11 Return value Arguments Caller-saved temporaries

x86-64 Linux Register Usage (cont.) %rbx, %r12, %r13, %r14 Callee-saved Callee must save and restore %rbp Callee-saved Callee must save and restore May be used as frame pointer Can mix and match %rsp Special form of callee save Restored to original value upon exit from procedure Special %rbx %r12 %r13 %r14 %rbp %rsp Callee-saved Temporaries

Callee-Saved Example (1/8) x comes in register %rdi. We need %rdi for the call to incr. Where should be put x, so we can use it after the call to incr? long call_incr2(long x) { long v1 = 15213; long v2 = incr(&v1, 3000); return x + v2; } Initial Stack Structure Rtn address %rsp

Callee-Saved Example (2/8) long call_incr2(long x) { long v1 = 15213; long v2 = incr(&v1, 3000); return x + v2; } call_incr2: pushq %rbx subq $16, %rsp movq %rdi, %rbx movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq %rbx, %rax addq $16, %rsp popq %rbx ret Initial Stack Structure Rtn address Resulting Stack Structure Rtn address Saved %rbx %rsp %rsp

Callee-Saved Example (3/8) Initial Stack Structure long call_incr2(long x) { long v1 = 15213; long v2 = incr(&v1, 3000); return x + v2; } Rtn address Saved %rbx %rsp call_incr2: pushq %rbx subq $16, %rsp movq %rdi, %rbx movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq %rbx, %rax addq $16, %rsp popq %rbx ret Resulting Stack Structure Rtn address Saved %rbx %rsp + 8 %rsp

Callee-Saved Example (4/8) long call_incr2(long x) { long v1 = 15213; long v2 = incr(&v1, 3000); return x + v2; } call_incr2: pushq %rbx subq $16, %rsp movq %rdi, %rbx movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq %rbx, %rax addq $16, %rsp popq %rbx ret x saved in %rbx A callee saved register. Stack Structure Rtn address Saved %rbx %rsp + 8 %rsp

Callee-Saved Example (5/8) long call_incr2(long x) { long v1 = 15213; long v2 = incr(&v1, 3000); return x + v2; } call_incr2: pushq %rbx subq $16, %rsp movq %rdi, %rbx movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq %rbx, %rax addq $16, %rsp popq %rbx ret Stack Structure Rtn address Saved %rbx 15213 Unused %rsp + 8 %rsp

Callee-Saved Example (6/8) long call_incr2(long x) { long v1 = 15213; long v2 = incr(&v1, 3000); return x + v2; } call_incr2: pushq %rbx subq $16, %rsp movq %rdi, %rbx movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq %rbx, %rax addq $16, %rsp popq %rbx ret x is safe in %rbx Return result in %rax Stack Structure Rtn address Saved %rbx 18213 Unused %rsp + 8 %rsp

Callee-Saved Example (7/8) long call_incr2(long x) { long v1 = 15213; long v2 = incr(&v1, 3000); return x + v2; } call_incr2: pushq %rbx subq $16, %rsp movq %rdi, %rbx movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq %rbx, %rax addq $16, %rsp popq %rbx ret Stack Structure Rtn address Saved %rbx 18213 Unused %rsp

Initial Stack Structure Callee-Saved Example (8/8) long call_incr2(long x) { long v1 = 15213; long v2 = incr(&v1, 3000); return x + v2; } Rtn address Saved %rbx 18213 Unused %rsp call_incr2: pushq %rbx subq $16, %rsp movq %rdi, %rbx movq $15213, 8(%rsp) movl $3000, %esi leaq 8(%rsp), %rdi call incr addq %rbx, %rax addq $16, %rsp popq %rbx ret Final Stack Structure Rtn address Saved %rbx 18213 Unused %rsp

Wrapping Up Important Points Stack is the right data structure for procedure call/return. If P calls Q, then Q returns before P. Recursion (& mutual recursion) handled by normal calling conventions. Can safely store values in local stack frame and in callee-saved registers. Put function arguments at top of stack. Result return in %rax. Pointers are addresses of values. On stack or global