CS 316: Procedure Calls/Pipelining

Kavita Bala, Fall 2007
Computer Science, Cornell University

Announcements
- PA 3 is out today
- Lectures on it this Fri and next Tue/Thu
- Due on the Friday after Fall break

Procedures
- Enable code to be reused by allowing code snippets to be invoked
- Will need a way to
  - call the routine
  - pass arguments to it (fixed length, variable length, recursive calls)
  - return a value to the caller
  - manage registers

Call Stacks
- A call stack contains activation records (aka stack frames)
- Each activation record contains
  - the return address for that invocation
  - the local variables for that procedure
- [Figure: stack drawn from high mem down to low mem, holding the saved return addresses Laftercall1 and Linside; sp marks the top of the stack]

Take 3: JAL/JR with Activation Records

    mult:
        addiu sp,sp,-4
        sw    $31, 0(sp)
        beq   $4, $0, Lout
        ...
        jal   mult
    Linside:
    Lout:
        lw    $31, 0(sp)
        addiu sp,sp,4
        jr    $31

    main:
        jal   mult
    Laftercall1:
        add   $1,$2,$3
        jal   mult
    Laftercall2:
        sub   $3,$4,$5

- Stack used to save and restore contents of $31

Simple Argument Passing

    main:
        li  a0, 6
        li  a1, 7
        jal min
        // result in v0

- First four arguments are passed in registers: specifically $4, $5, $6 and $7, aka a0, a1, a2, a3
- The returned result is passed back in a register: specifically $2, aka v0

Variable Length Arguments
- Best to use an (initially confusing but ultimately simpler) approach:
  - Pass the first four arguments in registers, as usual
  - Pass the rest on the stack
  - Reserve space on the stack for all arguments, including the first four
- Simplifies functions that use variable-length arguments
  - Store a0-a3 in the slots allocated on the stack, refer to all arguments through the stack

Register Layout on Stack

    main:
        li    a0, 0
        li    a1, 1
        li    a2, 2
        li    a3, 3
        addiu sp,sp,-24
        li    $8, 4
        sw    $8, 16(sp)
        li    $8, 5
        sw    $8, 20(sp)
        jal   subf
        // result in v0

    20(sp)  5
    16(sp)  4
    12(sp)  space for a3
     8(sp)  space for a2
     4(sp)  space for a1
     0(sp)  space for a0    <- sp

- First four arguments are in registers
- The rest are on the stack
- There is room on the stack for the first four arguments, just in case
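In C, variable-length argument lists surface as variadic functions. The following is a minimal sketch, assuming a hypothetical helper `sum_ints` (not from the lecture) whose first argument says how many integers follow. `va_arg` walks the remaining arguments one at a time, which is the kind of uniform access that reserving stack slots for a0-a3 makes simple for the callee.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up the `count` int arguments that follow it. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);            /* start just after the last named argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument as an int */
    va_end(ap);
    return total;
}
```

For example, `sum_ints(6, 0, 1, 2, 3, 4, 5)` passes the same six values as the `subf` call above and returns 15.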

Frame Layout on Stack

    blue() { pink(0,1,2,3,4,5); }
    pink() { ... }

    5
    4
    space for a3
    space for a2
    space for a1
    space for a0
    return address    <- sp

Frame Pointer
- It is sometimes cumbersome to keep track of the location of data on the stack
  - The offsets change as new values are pushed onto and popped off of the stack
- Keep a pointer to the top of the stack frame
  - Simplifies the task of referring to items on the stack
- A frame pointer, $30, aka fp
  - Value of sp upon procedure entry
  - Can be used to restore sp on exit

Frame Pointer
[Figure]

Register Usage
- Callee-save
  - Save it if you modify it
  - Assumes caller needs it
  - Save the previous contents of the register on procedure entry, restore just before procedure return
  - E.g. $31 (if you are a non-leaf; what is that?)
- Caller-save
  - Save it if you need it after the call
  - Assume callee can clobber any one of the registers
  - Save contents of the register before proc call
  - Restore after the call

Caller vs Callee tradeoff
- What is the tradeoff?
  - If all registers are caller-save, the saves are wasted whenever the callee never touches them
  - If all registers are callee-save, the saves are wasted whenever the caller does not need the values
- MIPS supports both
  - Callee-save regs: $16-$23 (s0-s7)
  - Caller-save regs: $8-$15, $24, $25 (t0-t9)

Leaf vs. non-leaf
- Leaf: simple, fast
  - Don't save registers

    int f(int x, int y) { return (x+y); }

    f:  add $v0, $a0, $a1   # add x and y
        jr  $ra             # return

Callee-Save

    mult:
        addiu sp,sp,-12
        sw    $31, 8(sp)
        sw    $17, 4(sp)
        sw    $16, 0(sp)
        [use $17 and $16]
        lw    $31, 8(sp)
        lw    $17, 4(sp)
        lw    $16, 0(sp)
        addiu sp,sp,12
        jr    $31

- Assume caller is using the registers
- Save on entry, restore on exit
- Pays off if caller is actually using the registers, else the save and restore are wasted

Caller-Save

    main:
        [use $9 & $8]
        addiu sp,sp,-8
        sw    $9, 4(sp)
        sw    $8, 0(sp)
        jal   mult
        lw    $9, 4(sp)
        lw    $8, 0(sp)
        addiu sp,sp,8
        [use $9 & $8]

- Assume registers are free for the taking
- But other subroutines will do the same
  - must protect values that will be used later
  - save and restore them before and after subroutine invocations
- Pays off if a routine makes few calls to other routines with values that need to be preserved

Frame Layout on Stack

    blue() { pink(0,1,2,3,4,5); }
    pink() { orange(10,11,12,13,14); }

[Figure: two stacked frames, each listed as saved regs, arguments, return address, local variables; sp marks the current frame]

Buffer Overflows

    blue() { pink(0,1,2,3,4,5); }
    pink() { orange(10,11,12,13,14); }
    orange() {
        char buf[100];
        gets(buf);   // read string, no check
    }

[Figure: the same frame layout as above]
- gets does not check the length of its input, so a string longer than buf can overwrite other items in the frame, including the saved return address
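The missing check can be supplied by a bounded read. The following is a minimal sketch, assuming a hypothetical helper `read_line_bounded` (not from the lecture) built on the standard `fgets`, which never stores more than the size it is given, including the terminating NUL.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf without ever writing more
 * than `size` bytes. Returns the stored length, or -1 on EOF, on error,
 * or if size is 0. Input beyond size-1 characters stays in the stream. */
int read_line_bounded(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return (int)strlen(buf);
}
```

In `orange()`, the call would become `read_line_bounded(stdin, buf, sizeof buf)`, so at most 99 characters plus the terminator ever land in `buf`.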

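Mult example

    main() {
        int res = mult(a, b);
    }

    int mult(int a, int b) {
        if (b == 0) {
            return 0;
        } else {
            int res = a + mult(a, b-1);
            return res;
        }
    }

Translates to

    Main:    move $a0, a              # a and b stand for wherever main keeps them
             move $a1, b
             jal  mult

    mult:    beq  $a1, $zero, Done    # b == 0: return 0
             addi $sp, $sp, -12       # push a 12-byte frame
    NotDone: sw   $ra, 8($sp)         # save return address
             sw   $a0, 4($sp)         # save a
             sw   $a1, 0($sp)         # save b
             move $a0, $a0            # first argument is still a (no-op)
             addi $a1, $a1, -1        # second argument is b-1
             jal  mult
             lw   $a0, 4($sp)         # restore a
             lw   $a1, 0($sp)         # restore b
             lw   $ra, 8($sp)         # restore return address
             addi $sp, $sp, 12        # pop the frame
             add  $v0, $a0, $v0       # res = a + mult(a, b-1)
             j    Exit
    Done:    move $v0, $zero          # return 0
    Exit:    jr   $ra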

Preserved vs. Not preserved
- Preserved (Callee Save)
  - $s0-$s7: save prior to use, restore before return
  - $sp, $fp, $gp, $ra
- Not preserved (Caller Save)
  - $t0-$t9, $a0-$a3, $v0, $v1
  - Saved by caller if needed after proc call

MIPS Register Recap
    Return address:         $31 (ra)
    Stack pointer:          $29 (sp)
    Frame pointer:          $30 (fp)
    First four arguments:   $4-$7 (a0-a3)
    Return result:          $2-$3 (v0-v1)
    Callee-save free regs:  $16-$23 (s0-s7)
    Caller-save free regs:  $8-$15, $24, $25 (t0-t9)
    Reserved:               $26, $27
    Global pointer:         $28 (gp)
    Assembler temporary:    $1 (at)
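The recap can be captured as a small lookup table. This is a sketch only: the function names `mips_reg_name` and `mips_is_preserved` are hypothetical, and the names `k0`/`k1` for the reserved registers $26 and $27 come from the usual MIPS convention rather than from the slide.

```c
#include <stdbool.h>

/* Conventional names for MIPS registers $0..$31. */
static const char *const MIPS_REG_NAMES[32] = {
    "zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
    "t0",   "t1", "t2", "t3", "t4", "t5", "t6", "t7",
    "s0",   "s1", "s2", "s3", "s4", "s5", "s6", "s7",
    "t8",   "t9", "k0", "k1", "gp", "sp", "fp", "ra"
};

/* Hypothetical helper: name of register number n, or "?" if out of range. */
const char *mips_reg_name(int n)
{
    return (n >= 0 && n < 32) ? MIPS_REG_NAMES[n] : "?";
}

/* True for the registers the recap lists as preserved across a call:
 * $s0-$s7 ($16-$23), $gp ($28), $sp ($29), $fp ($30), $ra ($31). */
bool mips_is_preserved(int n)
{
    return (n >= 16 && n <= 23) || (n >= 28 && n <= 31);
}
```

With this table, `mips_reg_name(31)` gives `"ra"`, and `mips_is_preserved(8)` is false because $8 (t0) is caller-save.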

What happens on a call?
- Caller
  - Save caller-saved registers $a0-$a3, $t0-$t9
  - Load arguments in $a0-$a3, rest passed on stack
  - Execute jal
- Callee Setup
  - Allocate memory for new frame ($sp = $sp - frame size)
  - Save callee-saved registers $s0-$s7, $fp, $ra
  - Set frame pointer ($fp = $sp + frame size - 4)
- Callee Return
  - Place return value in $v0 and $v1
  - Restore any callee-saved registers
  - Pop stack ($sp = $sp + frame size)
  - Return by jr $ra

Foo and Bar

    int foo (int num) { return bar(num+1); }
    int bar (int num) { return num+1; }

    foo:  addiu $sp, $sp, -32     # push frame
          sw    $ra, 28($sp)      # store $ra
          sw    $fp, 24($sp)      # store $fp
          addiu $fp, $sp, 28      # set new fp
          addiu $a0, $a0, 1       # num + 1
          jal   bar
          lw    $fp, 24($sp)      # load $fp
          lw    $ra, 28($sp)      # load $ra
          addiu $sp, $sp, 32      # pop frame
          jr    $ra

    bar:  addiu $v0, $a0, 1       # leaf procedure
          jr    $ra               # with no frame

Factorial

    int fact (int n) {
        if (n <= 1) return 1;
        return n*fact(n-1);
    }

    fact: slti  $t0, $a0, 2       # a0 < 2
          beq   $t0, $zero, skip  # goto skip if n >= 2
          ori   $v0, $zero, 1     # return 1
          jr    $ra
    skip: addiu $sp, $sp, -32     # $sp down 32
          sw    $ra, 28($sp)      # save $ra
          sw    $fp, 24($sp)      # save $fp
          addiu $fp, $sp, 28      # set up $fp
          sw    $a0, 32($sp)      # save n
          addiu $a0, $a0, -1      # n = n-1
          jal   fact
    link: lw    $a0, 32($sp)      # restore n
          mul   $v0, $v0, $a0     # n * fact(n-1)
          lw    $ra, 28($sp)      # load $ra
          lw    $fp, 24($sp)      # load $fp
          addiu $sp, $sp, 32      # pop stack
          jr    $ra               # return
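What the factorial frames accomplish can be imitated in C with an explicit stack. This is a sketch under stated assumptions, not the lecture's code: the name `fact_explicit_stack`, the array `saved_n` and the depth limit `MAX_DEPTH` are all hypothetical. Each array slot plays the role of the `sw $a0` / `lw $a0` pair that saves n before `jal fact` and restores it afterwards.

```c
#define MAX_DEPTH 32   /* assumed limit on nesting depth for this sketch */

/* Factorial with an explicit stack standing in for the runtime stack.
 * Returns 0 if n would need more than MAX_DEPTH saved values.
 * The result wraps around for n too large for unsigned long. */
unsigned long fact_explicit_stack(unsigned int n)
{
    unsigned int saved_n[MAX_DEPTH];  /* one slot per pending activation */
    int sp = 0;                       /* number of slots in use */
    unsigned long result;

    /* "Call" phase: like the path through skip:, save n and go on with n-1. */
    while (n >= 2) {
        if (sp == MAX_DEPTH)
            return 0;
        saved_n[sp++] = n;
        n--;
    }

    result = 1;                       /* base case: n <= 1 returns 1 */

    /* "Return" phase: like the code after jal fact, restore n and multiply. */
    while (sp > 0)
        result *= saved_n[--sp];

    return result;
}
```

For n = 5 the first loop saves 5, 4, 3, 2, and the second loop multiplies them back in reverse order, giving 120.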

Pipelined Architectures

[Figure: Alice and Bob] They don't like each other!

The Laundry
- Four sequential tasks

Laundry Room Design #1
- A large room with one entry door and one exit door
- First Alice owns the room
- Bob can enter as soon as she is done
- No possibility for Alice and Bob to fight

Laundry Room Design #1: Time
- Elapsed time for Alice: 4
- Elapsed time for Bob: 4
- Elapsed time for both: 8
- A better laundry room can improve utilization and speed up task completion

Laundry Room Design #2: Time
- Elapsed time for Alice: 4
- Elapsed time for Bob: 4
- Elapsed time for both: 5!!!
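The two totals follow from a simple count. The sketch below assumes an idealized model in which every stage takes exactly one time unit and a load never waits once it has started; the function names are hypothetical.

```c
/* Total time for `tasks` loads when only one load may be in the room at a
 * time (Design #1): each load runs all stages before the next one starts. */
unsigned sequential_time(unsigned tasks, unsigned stages)
{
    return tasks * stages;
}

/* Total time when each stage can hold a different load (Design #2): the
 * first load takes `stages` units, and each later load finishes one unit
 * after the load in front of it. */
unsigned pipelined_time(unsigned tasks, unsigned stages)
{
    if (tasks == 0)
        return 0;
    return stages + (tasks - 1);
}
```

With two loads and four stages this gives 8 for Design #1 and 5 for Design #2, matching the elapsed times above, while each individual load still takes 4.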

Laundry Room Design #2
- The room is partitioned into stages
- One person owns a stage at a time; the room can hold up to four people simultaneously
- [Figure sequence: Alice enters the first stage, then Bob, Charlie and Dave follow, each one stage behind the previous person]
- Throughput is good
- What about latency?

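Look at Real Possible Numbers
- Stage times: 15 min, 30 min, 45 min, 90 min
- Impact: the shorter stages stall while they wait for the 90 min stage (stall, stall, stall)
- Latency: 180 min
- Throughput: one batch every 90 min
- The 90 min stage is the bottleneck!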
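Both figures can be computed directly from the stage times. This is a rough sketch with hypothetical function names: it takes latency to mean the time for a batch that never has to wait (the sum of the stages), and the steady-state gap between finished batches to be set by the slowest stage.

```c
#include <stddef.h>

/* Latency of one batch that never waits: the sum of all stage times. */
unsigned batch_latency(const unsigned *stage_min, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += stage_min[i];
    return total;
}

/* Steady-state interval between finished batches: the slowest stage. */
unsigned batch_interval(const unsigned *stage_min, size_t n)
{
    unsigned slowest = 0;
    for (size_t i = 0; i < n; i++)
        if (stage_min[i] > slowest)
            slowest = stage_min[i];
    return slowest;
}
```

For the stage times {15, 30, 45, 90}, `batch_latency` gives 180 and `batch_interval` gives 90, the two numbers on the slide.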

- Latency: ?
- Throughput: one batch every 45 minutes

Implications
- Principle: latencies can be masked by running isolated operations in parallel
- Need mechanisms for isolation
- Need mechanisms for handling dependencies between tasks
- Let's apply this principle to processor design