Advanced Compilers Code Generation. Fall Chungnam National Univ. Eun-Sun Cho

Advanced Compilers: Code Generation. Fall 2017, Chungnam National Univ., Eun-Sun Cho

Backend of Compilers
  Machine-independent Optimization
  Virtual-to-physical Mapping / Machine-dependent Optimization
    - Instruction Selection
    - Instruction Scheduling
    - Register Allocation
  Machine Code Emission/Opti
Backend = Code generation + Optimization

Code Generation
  - Storage Management
  - Exception Handling
  - Instruction Selection
  - Register Allocation

Storage Management

Management of Storage
In compiler-generated machine code, the memory-management code plays a critical role.

    #include <stdio.h>
    void main() {
        int i;
        printf("Hello, CSE!\n");
    }

compiles to:

        .file   "s09.c"
        .section .rodata
    .LC0:
        .string "Hello, CSE!"
        .text
        .globl  main
        .type   main, @function
    main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $20, %esp        # allocate stack memory for int i and the string parameter
        movl    $.LC0, (%esp)
        call    puts
        addl    $20, %esp
        popl    %ebp
        ret
        .size   main, .-main

2 Classes of Storage in a Process
Registers
  - Fast access
  - Invisible to users (programmers) in most cases
  - NO indirect access is allowed
Memory
  - (Relatively) slow access; indirect accesses are allowed
  - Candidates: globals/statics, composite types (structs, arrays, ...), variables accessed via the & operator
* Whether a variable is translated to a register or to a memory variable should be determined in the middle of the HIR-to-LIR translation.

4 Categories of Memory
  - Code space: an area of memory for the instruction sequence; read-only, if possible
  - Static (or global): an area of memory for the variables with the same lifetime as the program
  - Stack: an area of memory for local variables (with block lifetime)
  - Heap: an area of memory allocated dynamically at run time (via malloc, new, etc., which obtain it through system calls)

Memory Organization
Layout as drawn (top to bottom): Stack | ... | Heap | Static Data | Code
  - Stack, heap: sizes vary at run time
      Stack: grows upward; Heap: grows downward
      (the relative positions of stack and heap might be switched)
  - Code, static data: fixed sizes (determined by the compiler)

Executable Formats
  - Windows: PE (Portable Executable)
  - ELF (Executable and Linkable Format)

Run-time Stack
  - A stack made of frames: one frame (or activation record) for each function call
  - Activation record: the execution environment for one execution of the corresponding function
  - Each call has its own frame, even for recursive calls
  - Contents: local variables, arguments, return values, other temporary storage, ...
Heap Allocation
  - A contiguous portion of the global area, obtained from the OS
  - Operations for requesting and returning memory during program execution are necessary; otherwise, garbage collection should be supported by the programming language
  - Available memory is kept categorized into a free section and an in-use section (see an OS textbook!)

Initial Stack Frame (startup state)
Holds the command line arguments (argc, argv) and the environment variables (env).

<Initial stack layout for ELF binaries>

    (bottom, higher addresses)
    NULL                          end of environment (integer)
    env[n] ... env[1], env[0]     environment variables (pointers)
    NULL                          end of args (integer)
    argv[argc-1] ... argv[1]      program args (pointers)
    argv[0]                       program name (pointer)
    argc                          argument counter (integer)
    (top; addresses decrease from bottom to top)

A figure from http://asm.sourceforge.net/articles/startup.html#st, used with some modification.

Runtime Layout (ELF)

    higher addr.
    stack            system env / argv / argc
                     stack frame for main()          <- ebp of main()
                                                     <- stack pointer: esp
                     (available for stack growth)
    shared library   malloc.o (lib*.so), printf.o (lib*.so)
                     library functions (dynamically linked)
                     (available for heap)
    heap             heap (malloc(), calloc(), new)
    data             int x; (global var)   int y = 100; (global var)
    text             xx.o (lib*.a), xxx.o (lib*.a): library functions (statically linked)
                     file.o, main.o [func(72,73);], crt0.o (startup routine): (a.out)
    lower addr.

    data and text are what existed before loading (.text, .data, ...)

What if main() calls function func(72, 73)?

Runtime Layout (ELF), while executing func()

    higher addr.
    stack            system env / argv / argc
                     stack frame for main()          <- ebp of main()
                     stack frame for func()          <- ebp of func()
                                                     <- stack pointer: esp
                     (available for stack growth)
    shared library   malloc.o (lib*.so), printf.o (lib*.so)
                     library functions (dynamically linked)
                     (available for heap)
    heap             heap (malloc(), calloc(), new)
    data             int x; (global var)   int y = 100; (global var)
    text             xx.o (lib*.a), xxx.o (lib*.a): library functions (statically linked)
                     file.o, main.o [func(72,73);], crt0.o (startup routine): (a.out)
    lower addr.

    data and text are what existed before loading (.text, .data, ...)

What if main() calls function func(72, 73)? A new frame for func() is pushed below the frame for main().

Functions and Run-time Stacks
Call/return of a function and run-time stack operations
  - When f is called, push f's frame onto the run-time (RT) stack
  - When f returns, pop f's frame from the RT stack
  - Top frame = frame of the function currently being executed
How to access the top frame?
  - Stack pointer (esp): top position of the frame
  - Base pointer (ebp): base position of the frame
  - A local variable is accessed via its offset from FP (or SP)
Role of the compiler: to generate code that enforces the discipline above
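A minimal C sketch of this frame discipline: every active call owns a frame, so the same local variable has a different address at each recursion depth. The names record_frames and frames_are_distinct are hypothetical helpers introduced only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: stores the address of this call's local
   variable in addrs[depth], then recurses until max_depth. */
void record_frames(int depth, int max_depth, uintptr_t *addrs) {
    volatile int local = depth;          /* lives in this call's frame */
    addrs[depth] = (uintptr_t)&local;
    if (depth + 1 < max_depth)
        record_frames(depth + 1, max_depth, addrs);
    local = local + 1;                   /* used after the call, so the frame stays live */
}

/* Returns 1 if all n recorded addresses are pairwise different. */
int frames_are_distinct(const uintptr_t *addrs, int n) {
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (addrs[i] == addrs[j])
                return 0;
    return 1;
}
```

Calling record_frames(0, 4, addrs) and then comparing neighbouring entries also shows the growth direction on a given machine; on x86 the deeper calls are expected to have the lower addresses, but that part is platform-dependent.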

When main() calls func(72, 73)

    func(int x, int y)
    {
        int a;
        int b[3];
        ...
    }

    stack (higher addresses first)
    offset        content    meaning
                             system env / argv / argc
    mpf ->                   main()'s local variables    (frame for main())
    +12           73         y
    +8            72         x
    +4            ra         return address
    ebp ->  0     mpf        caller's frame pointer      (frame for func())
    -4            garbage    a
    -8            garbage    b[2]
    -12           garbage    b[1]
    esp -> -16    garbage    b[0]
                             (available for stack growth)

Variables and Arguments: func(72, 73)
With the same frame layout as on the previous slide, func() addresses everything relative to ebp:

    [ebp+12] : 73, that is, y
    [ebp+8]  : 72, that is, x
    [ebp+4]  : return address
    [ebp]    : main()'s ebp (mpf)
    [ebp-4]  : a
    [ebp-12] : b[1]

Variables and Arguments: func(72, 73)
With the same frame layout, main() sets up the call by pushing the arguments from right to left:

    push 73      ; y
    push 72      ; x
    call func    ; pushes the return address (ra)

Alignment
In most cases, a variable is aligned based on its size.
e.g. C/C++: char is byte aligned, short is halfword aligned, int is word aligned.
e.g.
    char w;
    int x[3];
    char y;
    short z;
  - char w: 1 byte
  - x[3]: 12 bytes, starting at a word-aligned address (3 empty bytes between w and x)
  - char y: 1 byte, starting at any address
  - short z: 2 bytes, starting at a halfword-aligned address (1 empty byte between y and z)
Total size = 20 bytes!

Alignment of Structures

    struct {
        char w;
        int x[3];
        char y;
        short z;
    }

  - Fields in a struct: the struct is aligned to the largest field size, e.g. here the largest field type is int (4 bytes)
  - Size of the struct: a multiple of 4
  - Starting address of the struct: also a multiple of 4 (word aligned)
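The layout above can be checked with offsetof and sizeof. This is a small sketch, assuming a typical ABI with a 4-byte int and a 2-byte short; the struct tag Example and the function print_layout are names made up for the illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Same fields as on the slide; the tag "Example" is hypothetical. */
struct Example {
    char  w;     /* 1 byte, followed by padding up to the next word boundary */
    int   x[3];  /* 12 bytes, word aligned */
    char  y;     /* 1 byte, followed by 1 byte of padding */
    short z;     /* 2 bytes, halfword aligned */
};

/* Prints the offset of every field and the total size. */
void print_layout(void) {
    printf("w at %zu, x at %zu, y at %zu, z at %zu, size %zu\n",
           offsetof(struct Example, w), offsetof(struct Example, x),
           offsetof(struct Example, y), offsetof(struct Example, z),
           sizeof(struct Example));
}
```

On common x86 ABIs the reported offsets are expected to be 0, 4, 16 and 18 with a size of 20, which matches the padding counted on the slide; other ABIs may differ.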

Example: GCC-x86 4.7.x
The 20 bytes reserved by subl are 16 for i and 4 for the string pointer; gcc-x86 uses 16 for i because of its own alignment policy.

    #include <stdio.h>
    void main() {
        int i;
        printf("Hello, CSE!\n");
    }

compiles to:

        .file   "s09.c"
        .section .rodata
    .LC0:
        .string "Hello, CSE!"
        .text
        .globl  main
        .type   main, @function
    main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $20, %esp        # allocate stack memory for int i and the string parameter
        movl    $.LC0, (%esp)
        call    puts
        addl    $20, %esp
        popl    %ebp
        ret
        .size   main, .-main

Exception Handling Code

Exceptions
An exception is for error handling:
  - invalid input
  - invalid resource state: file does not exist, network error, ...
  - erroneous execution condition: divide-by-zero, ...
In real production code, error-handling code may be a large part (30%-50% or more).

C++

    #include <iostream>
    #include <fstream>
    using namespace std;

    int main() {
        ifstream file;
        // Set the state flags for which a failure exception is thrown.
        file.exceptions(ifstream::failbit | ifstream::badbit);
        try {
            file.open("test.txt");
            while (!file.eof()) file.get();
        } catch (const ifstream::failure& e) {
            cout << "Exception opening/reading file";
        }
        file.close();
        return 0;
    }

    class ios_base::failure : public exception {
        // the exceptions thrown by the elements of
        // the standard input/output library
    public:
        explicit failure(const string& msg);
        virtual ~failure();
        virtual const char* what() const noexcept;
    };

Flag values of std::ios_base::iostate: eofbit, failbit, badbit, goodbit

Java

    InputStream input = null;
    try {
        input = new FileInputStream("c:\\data\\input-text.txt");
        int data = input.read();
        while (data != -1) {
            // do something with data...
            doSomethingWithData(data);
            data = input.read();
        }
    } catch (IOException e) {
        // do something with e... log, perhaps rethrow etc.
    } finally {
        if (input != null) input.close();
    }

Note: C++ does not support 'finally' blocks.

Throw

    #include <iostream>
    using namespace std;

    int main() {
        try {
            throw 20;
        } catch (int e) {
            cout << "An exception occurred. Exception Nr. " << e << endl;
        }
        return 0;
    }

Output:

    An exception occurred. Exception Nr. 20

Chaining

    InputStream input = null;
    try {
        input = new FileInputStream("c:\\data\\input-text.txt");
        int data = input.read();
        while (data != -1) {
            // do something with data...
            doSomethingWithData(data);
            data = input.read();
        }
    } catch (IOException e) {
        throw new MyException();
    }

What Should Be Done When an Exception Occurs

    try {
        f(1);        // when an exception occurs here: goto handler A
        Object x;
        g(2);        // when an exception occurs here: destroy x + goto handler A
    } catch (Exc) {
        // handler A: catches type Exc
    }

Note: try can be nested, so the handlers are organized in a stack.

Basic Exception Handling Mechanisms
  1. Setjmp/longjmp-based: a global goto; C's primitive exception mechanism
  2. Table-driven method: more complex and uses more space, but faster

1. Setjmp/longjmp

    #include <setjmp.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        jmp_buf env;
        int i;

        i = setjmp(env);
        printf("i = %d\n", i);
        if (i != 0) exit(0);
        longjmp(env, 2);
        printf("get printed?\n");
    }

  - setjmp(): saves the contents of the registers
  - longjmp(): restores them later; it "returns" to the state of the program when setjmp() was called

    $ sj1
    i = 0
    i = 2
    $ _

First, we call setjmp(), and it returns 0. Then we call longjmp() with a value of 2, which causes the code to return from setjmp() with a value of 2. That value is printed out, and the code exits. ("get printed?" will not be printed.)

Setjmp/longjmp Approach

    struct context {
        int ebx; int edi; int esi;
        int ebp; int esp; int eip;
    };
    typedef struct context buffer[1];

    buffer buf;
    void f() { if (0 == setjmp(buf)) g(); }
    void g() { h(); }
    void h() { longjmp(buf, 1); }

Setjmp/longjmp Approach (cont'd)

    buffer buf;
    void f() {
        if (0 == setjmp(buf))
            g();
        else
            k();
    }
    void g() { h(); }
    void h() { longjmp(buf, 1); }

try..catch (in f):
  - Save the context before the try block; this saved context also serves as the handler
  - Handle the exception with k()
throw (in h):
  - Fetch the handler, restore the machine state, and jump to the handler's code

2. Table-Driven Approach
  - Table 1: maps each throw point, that is, the program counter (PC) at the point where the exception is thrown, to an action table
  - Table 2 (action table): performs the various operations required for exception processing, e.g.
      - invoking destructors
      - adjusting the stack
      - matching the exception type to the address of an exception handler
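The two tables can be sketched in C as a PC-range table whose entries point to action records. This is only an illustration under simplifying assumptions: PCs are plain integers, the ranges 10..20 and 20..30 stand for "before x is constructed" and "after x is constructed" in the earlier f(1) / Object x / g(2) example, and all type and function names (ActionTable, ThrowRange, find_actions) are hypothetical rather than any real compiler's format.

```c
#include <stddef.h>

/* Table 2: what has to happen when an exception leaves a region. */
typedef struct {
    int destructors_to_run;   /* objects to destroy before jumping */
    int caught_type;          /* exception type the handler matches */
    int handler_pc;           /* address of the handler code */
} ActionTable;

/* Table 1: maps a half-open PC range [pc_start, pc_end) to its actions. */
typedef struct {
    int pc_start;
    int pc_end;
    const ActionTable *actions;
} ThrowRange;

enum { TYPE_EXC = 1, HANDLER_A = 100 };   /* made-up encodings */

static const ActionTable before_x = { 0, TYPE_EXC, HANDLER_A };  /* throw inside f(1) */
static const ActionTable after_x  = { 1, TYPE_EXC, HANDLER_A };  /* throw inside g(2): destroy x */

static const ThrowRange throw_table[] = {
    { 10, 20, &before_x },
    { 20, 30, &after_x  },
};

/* Looks up the action table for the PC at which the exception was thrown.
   Returns NULL when the PC is not inside any try region. */
const ActionTable *find_actions(int pc) {
    size_t n = sizeof throw_table / sizeof throw_table[0];
    for (size_t i = 0; i < n; i++)
        if (pc >= throw_table[i].pc_start && pc < throw_table[i].pc_end)
            return throw_table[i].actions;
    return NULL;
}
```

Nothing in this scheme runs on the normal path: the tables are consulted only after a throw, which is why the approach costs space rather than time.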

Discussions
All variables that are declared outside the try block have to be restored to their initial values.

    Lecture s = new Lecture();
    // s.lecturer is assumed initially null
    try {
        s.lecturer = new ThatMan();
        new FileInputStream(...);   // exception!
        // s.lecturer (in memory) should be restored
        ...
    } catch (IOException e) {...}

Discussions: setjmp/longjmp approach
  - setjmp should be called at the beginning of every try block, even if no exception is ever thrown
  - A list of bufs must be maintained
  - A list of objects on the stack must be maintained (in C++)
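A minimal C sketch of the "list of bufs": a stack of jmp_buf slots, pushed on entry to every try block and popped either on normal exit or by the throw. All names here (handlers, throw_exc, checked_div, nested_div) and the exception codes are assumptions made for the example; destructor bookkeeping is left out.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_HANDLERS 16

static jmp_buf handlers[MAX_HANDLERS];   /* the list of bufs, used as a stack */
static int n_handlers = 0;
static int thrown_code = 0;              /* code of the exception in flight */

/* Reserves the next slot; the caller then calls setjmp on it. */
static void push_handler_slot(void) {
    if (n_handlers == MAX_HANDLERS) abort();
    n_handlers++;
}

/* throw: pop the innermost handler and jump to it. */
static void throw_exc(int code) {
    if (n_handlers == 0) {
        fprintf(stderr, "uncaught exception %d\n", code);
        abort();
    }
    thrown_code = code;
    n_handlers--;
    longjmp(handlers[n_handlers], 1);
}

static int checked_div(int a, int b) {
    if (b == 0) throw_exc(1);            /* 1: hypothetical divide-by-zero code */
    return a / b;
}

/* Two nested try blocks: the inner handler rethrows with code + 100,
   the outer handler returns the negated code. */
int nested_div(int a, int b) {
    push_handler_slot();                              /* outer try */
    if (setjmp(handlers[n_handlers - 1]) == 0) {
        push_handler_slot();                          /* inner try */
        if (setjmp(handlers[n_handlers - 1]) == 0) {
            int q = checked_div(a, b);
            n_handlers -= 2;                          /* normal exit: pop both */
            return q;
        }
        throw_exc(thrown_code + 100);                 /* inner handler: chain */
    }
    return -thrown_code;                              /* outer handler */
}
```

Even when b is nonzero, nested_div pays for two setjmp calls and two pops, which is the overhead the first bullet above refers to.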

Discussions (cont'd): table-driven approach
  - Mostly used
  - Significantly more efficient than the setjmp/longjmp approach
  - The tables themselves have to encode a lot of possible actions
      - Space problem
      - Reorganizing the code implies reorganizing the tables accordingly
      - Vulnerable to attack
  - Compiler optimization should not be allowed in some cases, e.g.:

    void f() {
        int x = 0;   // dead code, but cannot be optimized out
        try { x = f1(x); }
        catch (...) { cout << ...; }
    }

Exception Handling in GIMPLE
throw is NOT supported directly; it is expressed by function calls.

invoking destructors and adjusting the stack

for throwing an exception (ref)

Instruction Selection

Low-level, Tree-based Intermediate Representation
  - Tree-based IR with abstract machine instructions, used in machine code generation
  - e.g. (from the Tiger book; cf. RTL): MEM(BINOP(PLUS, e, CONST c)), the memory word at address e + c

Tree-based Intermediate Representation (from the Tiger book)
  - MEM(e): the value of one word of memory starting at address e. When used as the left-hand side of MOVE it is interpreted as a store; otherwise it means a fetch operation.
  - TEMP(t): register t
  - SEQ(s1, s2): after evaluation of statement s1, statement s2 is evaluated
  - ESEQ(s, e): statement s is evaluated for side effects, and then e is evaluated for a result
  - BINOP(o, e1, e2): o is a binary operator such as PLUS or MINUS. The result is the evaluation of o with e1 and e2 as operands. This result is saved in memory and the address is returned.
  - CONST(i): integer constant i
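To make the expression nodes concrete, here is a small C sketch of CONST, TEMP, BINOP and MEM (fetch only) with a toy evaluator. It assumes that BINOP simply yields a value, as in the Tiger book, that memory is a small word array indexed by byte address / 4, and that register 0 plays the role of fp; MOVE, SEQ and ESEQ are omitted, and every identifier is made up for the example.

```c
#include <stddef.h>

typedef enum { EXP_CONST, EXP_TEMP, EXP_BINOP, EXP_MEM } ExpKind;
typedef enum { OP_PLUS, OP_MINUS, OP_MULT } BinOp;

typedef struct Exp {
    ExpKind kind;
    BinOp op;                 /* BINOP only */
    int value;                /* CONST: the constant; TEMP: register index */
    const struct Exp *left;   /* BINOP: first operand; MEM: address expression */
    const struct Exp *right;  /* BINOP: second operand */
} Exp;

#define NUM_TEMPS 8
#define MEM_WORDS 64
int temps[NUM_TEMPS];         /* toy register file */
int memory[MEM_WORDS];        /* toy memory, one entry per 4-byte word */

/* Evaluates an expression tree; addresses are assumed valid and word aligned. */
int eval(const Exp *e) {
    switch (e->kind) {
    case EXP_CONST: return e->value;
    case EXP_TEMP:  return temps[e->value];
    case EXP_BINOP: {
        int l = eval(e->left), r = eval(e->right);
        switch (e->op) {
        case OP_PLUS:  return l + r;
        case OP_MINUS: return l - r;
        case OP_MULT:  return l * r;
        }
        return 0;
    }
    case EXP_MEM:   return memory[eval(e->left) / 4];   /* fetch */
    }
    return 0;
}

/* Builds MEM(BINOP(PLUS, TEMP fp, CONST offset)) and evaluates it. */
int load_local(int fp, int offset) {
    Exp fp_temp = { EXP_TEMP,  OP_PLUS, 0,      NULL,     NULL };
    Exp off     = { EXP_CONST, OP_PLUS, offset, NULL,     NULL };
    Exp sum     = { EXP_BINOP, OP_PLUS, 0,      &fp_temp, &off };
    Exp mem     = { EXP_MEM,   OP_PLUS, 0,      &sum,     NULL };
    temps[0] = fp;
    return eval(&mem);
}
```

For instance, with memory[3] set to 42, load_local(4, 8) fetches the word at byte address 12.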

Simple Equivalence Relationships
We can choose one among the sub-trees with the same semantics:

    ESEQ(s1, ESEQ(s2, e))  =  ESEQ(SEQ(s1, s2), e)

    BINOP(op, e1, ESEQ(s, e2))  =  ESEQ(MOVE(TEMP t, e1), ESEQ(s, BINOP(op, TEMP t, e2)))

    BINOP(op, ESEQ(s, e1), e2)  =  ESEQ(s, BINOP(op, e1, e2))
    BINOP(op, e1, ESEQ(s, e2))  =  ESEQ(s, BINOP(op, e1, e2))    (only if s and e1 commute)

More Instruction Selection (Option 1)
Tree to be covered with instruction tiles:

    MOVE(MEM(BINOP(PLUS,
                   MEM(BINOP(PLUS, TEMP fp, CONST a)),
                   BINOP(MULT, TEMP i, CONST 4))),
         MEM(BINOP(PLUS, TEMP fp, CONST x)))

More Instruction Selection (Option 2)
The same tree, covered with a different choice of tiles:

    MOVE(MEM(BINOP(PLUS,
                   MEM(BINOP(PLUS, TEMP fp, CONST a)),
                   BINOP(MULT, TEMP i, CONST 4))),
         MEM(BINOP(PLUS, TEMP fp, CONST x)))

Equivalence of the Machine Codes

First sequence:

    LOAD  r1 <- M[fp + a]
    ADDI  r2 <- r0 + 4
    MUL   r2 <- ri * r2
    ADD   r1 <- r1 + r2
    LOAD  r2 <- M[fp + x]
    STORE M[r1 + 0] <- r2

Second sequence:

    LOAD  r1 <- M[fp + a]
    ADDI  r2 <- r0 + 4
    MUL   r2 <- ri * r2
    ADD   r1 <- r1 + r2
    LOAD  r2 <- fp + x
    STORE M[r1] <- M[r2]
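One common way to pick tiles is to match the largest tile at the root and recurse (the "maximal munch" strategy from the Tiger book). The C sketch below applies it to the address sub-tree MEM(fp + a) + i * 4 and counts the emitted instructions. The tile set, the fresh register numbering starting at r100, the register numbers standing for fp and ri, and the offset value 8 for a are all assumptions chosen for the illustration.

```c
#include <stdio.h>

typedef enum { T_CONST, T_TEMP, T_BINOP, T_MEM } Kind;
typedef enum { OP_PLUS, OP_MULT } Op;

typedef struct Exp {
    Kind kind;
    Op op;                        /* T_BINOP only */
    int value;                    /* T_CONST: constant; T_TEMP: register number */
    const struct Exp *left, *right;
} Exp;

static int next_reg = 100;        /* fresh virtual registers: r100, r101, ... */
static int emitted = 0;           /* number of instructions printed so far */

/* True for the shape BINOP(PLUS, e, CONST c). */
static int is_plus_const(const Exp *e) {
    return e->kind == T_BINOP && e->op == OP_PLUS && e->right->kind == T_CONST;
}

/* Emits code for e, largest matching tile first; returns the result register. */
int munch(const Exp *e) {
    int r, a, b;
    switch (e->kind) {
    case T_TEMP:
        return e->value;                          /* already in a register */
    case T_CONST:
        r = next_reg++;
        printf("ADDI r%d <- r0 + %d\n", r, e->value);
        emitted++;
        return r;
    case T_MEM:
        if (is_plus_const(e->left)) {             /* tile: MEM(e + c) */
            a = munch(e->left->left);
            r = next_reg++;
            printf("LOAD r%d <- M[r%d + %d]\n", r, a, e->left->right->value);
        } else {                                  /* tile: MEM(e) */
            a = munch(e->left);
            r = next_reg++;
            printf("LOAD r%d <- M[r%d + 0]\n", r, a);
        }
        emitted++;
        return r;
    case T_BINOP:
        if (is_plus_const(e)) {                   /* tile: e + c */
            a = munch(e->left);
            r = next_reg++;
            printf("ADDI r%d <- r%d + %d\n", r, a, e->right->value);
        } else {                                  /* tile: e1 op e2 */
            a = munch(e->left);
            b = munch(e->right);
            r = next_reg++;
            printf("%s r%d <- r%d, r%d\n", e->op == OP_PLUS ? "ADD" : "MUL", r, a, b);
        }
        emitted++;
        return r;
    }
    return -1;
}

/* Tiles MEM(fp + a) + i * 4 and returns how many instructions were emitted. */
int munch_example(void) {
    static const Exp fp   = { T_TEMP,  OP_PLUS, 30, NULL,  NULL };  /* r30 stands for fp */
    static const Exp i    = { T_TEMP,  OP_PLUS, 1,  NULL,  NULL };  /* r1 stands for ri */
    static const Exp a    = { T_CONST, OP_PLUS, 8,  NULL,  NULL };  /* offset a, assumed 8 */
    static const Exp four = { T_CONST, OP_PLUS, 4,  NULL,  NULL };
    static const Exp fp_a = { T_BINOP, OP_PLUS, 0,  &fp,   &a };
    static const Exp base = { T_MEM,   OP_PLUS, 0,  &fp_a, NULL };
    static const Exp idx  = { T_BINOP, OP_MULT, 0,  &i,    &four };
    static const Exp addr = { T_BINOP, OP_PLUS, 0,  &base, &idx };
    emitted = 0;
    munch(&addr);
    return emitted;
}
```

With this tile set the address computation takes one LOAD, one ADDI, one MUL and one ADD, the same instruction mix as the first four lines of both sequences above; the two sequences differ only in how the final MOVE is tiled.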

Register Allocation

Operands in Low-Level IR (Review)
Operands:
  - Virtual registers: we assume infinitely many virtual registers
  - Special registers: stack pointer, pc, ...
  - Literals: we assume there is no limit on the values of literals
  - Symbolic names: in most cases, labels

Register Allocation: Motivation
  - Virtual registers (VRs): we assume infinitely many of them, but the number of actual registers is finite and varies from machine to machine
  - Register allocation: put as many VRs as possible into physical registers, and allocate the remaining VRs to memory
  - Optimization for the best performance: put frequently used VRs into physical registers
  - Spilling: allocating virtual registers to memory when this cannot be avoided

Interference
  - Interference: two different definitions have a common operation in their live ranges
  - Live range: generated from liveness analysis and reaching-definition analysis
  - Interference graph: nodes = variables; two nodes are linked by an edge if they interfere with each other

Example:

    1: a = 0
    2: b = a
    3: b*b
    4: c = 2
    5: a*c+3

    Live ranges: def 1 (a) = {1,2,3,4,5}; def 2 (b) = {2,3}; def 4 (c) = {4,5}
    Interference graph: b --- a --- c

(Examples and materials from Princeton Univ.)
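The interference test on this example can be sketched in C by storing each live range as a bitmask over program points: two definitions interfere when their masks share a bit. The type LiveRange and both function names are hypothetical, and the sketch assumes fewer than 32 program points and contiguous live ranges.

```c
/* Bit p is set when the definition is live at program point p (p < 32). */
typedef unsigned int LiveRange;

/* Builds the live range {first, ..., last}. */
LiveRange range_of(int first, int last) {
    LiveRange r = 0;
    for (int p = first; p <= last; p++)
        r |= 1u << p;
    return r;
}

/* Two live ranges interfere if they have a program point in common. */
int interferes(LiveRange x, LiveRange y) {
    return (x & y) != 0;
}
```

With a = range_of(1, 5), b = range_of(2, 3) and c = range_of(4, 5), only the pairs (a, b) and (a, c) interfere, which gives exactly the edges of the graph b --- a --- c.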

Graph Coloring
  - Used to allocate virtual registers (that is, variables) to physical registers
  - Linked nodes should be painted in different colors

Simple example: two registers, so 2-coloring (two colors)

    1: a = 0
    2: b = a
    3: b*b
    4: c = 2
    5: a*c+3

    Interference graph: b --- a --- c
    Colors (registers): eax, ebx

K-Graph Coloring Algorithm
Kempe's algorithm [1879] (an old problem)
  - Step 1 (simplify): find a node with fewer than k edges, remove it from the graph together with its edges, and push them onto a stack.
  - Step 2 (color): once the remaining graph is simple enough to be k-colored, pop a node (and all the edges pushed with it) from the stack, and give the node a color different from those of all its neighbors.
  - Step 3 (spill), optional: used when the steps above fail.
      - Steps 1 and 2 are not applicable in many cases
      - Graph coloring is an NP-complete problem
      - Solution: select several (victim) variables and allocate them to memory
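Steps 1 and 2 can be sketched in C over an adjacency matrix. This is a simplified illustration, not a production allocator: the function name kempe_color, the MAX_NODES limit and the return convention (0 meaning "simplify got stuck, so Step 3 is needed") are assumptions, and it expects a symmetric matrix with no self-edges and k <= MAX_NODES.

```c
#define MAX_NODES 16

/* Tries to k-color a graph with n nodes.
   Returns 1 and fills color[] with values 0..k-1 on success,
   or 0 when every remaining node has at least k neighbours. */
int kempe_color(int n, int adj[MAX_NODES][MAX_NODES], int k, int color[MAX_NODES]) {
    int removed[MAX_NODES] = {0};
    int stack[MAX_NODES];
    int top = 0;

    /* Step 1 (simplify): repeatedly remove a node with fewer than k edges. */
    while (top < n) {
        int picked = -1;
        for (int v = 0; v < n && picked < 0; v++) {
            if (removed[v]) continue;
            int degree = 0;
            for (int u = 0; u < n; u++)
                if (!removed[u] && adj[v][u]) degree++;
            if (degree < k) picked = v;
        }
        if (picked < 0) return 0;            /* stuck: a spill is needed */
        removed[picked] = 1;
        stack[top++] = picked;
    }

    /* Step 2 (color): pop nodes and give each the lowest free color. */
    for (int v = 0; v < n; v++) color[v] = -1;
    while (top > 0) {
        int v = stack[--top];
        int used[MAX_NODES] = {0};
        for (int u = 0; u < n; u++)
            if (adj[v][u] && color[u] >= 0) used[color[u]] = 1;
        int c = 0;
        while (used[c]) c++;                 /* c < k: v had fewer than k neighbours when removed */
        color[v] = c;
    }
    return 1;
}
```

For the graph b --- a --- c with k = 2 the function succeeds and gives a a different color from b and c; for a triangle with k = 2 it returns 0, the situation that Step 3 handles.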

Step 1 (simplify)
[Figure: nodes of the example graph (a, b, c, d, e) are removed one at a time and pushed onto the stack, together with their edges.]

Step 2 (color)
[Figure: nodes are popped from the stack one at a time, put back into the example graph (a, b, c, d, e), and colored.]

Case of Step 3 (1)
Some lucky cases!
[Figure: graph with nodes a, b, c, d, e and two colors (registers eax, ebx); stack: d. All nodes have 2 neighbours!]

Case of Step 3 (2)
But there exist graphs where coloring with only k colors is impossible: spilling!
[Figure: graph with nodes a, b, c, d, e; no colors left for e or a!]

Spilling Code
Code rewriting: introduce a new temporary and rewrite the code.
e.g. assuming that t2 is to be spilled, "add t1, t2" is rewritten as follows:
  - define a memory area bound to the to-be-spilled variable (here, t2), e.g. [ebp-24] in the run-time stack
  - introduce a new temporary variable (here, t35)

        mov t35, [ebp-24]
        add t1, t35

Note: t35's live range is very short (one or two instructions), so the possibility of interference is very low (much lower than for t2).
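The rewriting step can be sketched in C for the one case shown on the slide: a two-operand instruction whose source operand is the spilled register. The Instr struct, the function rewrite_spilled_use and the textual output format are assumptions for the example; a spilled destination would additionally need a store back to the slot, which is not covered here.

```c
#include <stdio.h>

/* A two-operand instruction "op t<dst>, t<src>" over virtual registers. */
typedef struct {
    char op[8];
    int dst;
    int src;
} Instr;

/* Rewrites one instruction for a spilled source register.
   spilled:     number of the spilled virtual register (e.g. 2 for t2)
   slot_offset: its stack slot is [ebp-slot_offset]
   fresh_temp:  number of the new short-lived temporary (e.g. 35 for t35)
   Writes one or two lines into out and returns how many were written. */
int rewrite_spilled_use(Instr in, int spilled, int slot_offset,
                        int fresh_temp, char out[2][32]) {
    if (in.src != spilled) {                 /* nothing to rewrite */
        snprintf(out[0], 32, "%s t%d, t%d", in.op, in.dst, in.src);
        return 1;
    }
    /* Load the spilled value into the fresh temporary, then use that. */
    snprintf(out[0], 32, "mov t%d, [ebp-%d]", fresh_temp, slot_offset);
    snprintf(out[1], 32, "%s t%d, t%d", in.op, in.dst, fresh_temp);
    return 2;
}
```

Called on add t1, t2 with spilled = 2, slot_offset = 24 and fresh_temp = 35, this produces the two instructions from the slide.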