Generation. representation to the machine

Similar documents
Unoptimized Code Generation

6.035 Project 3: Unoptimized Code Generation. Jason Ansel MIT - CSAIL

Machine Program: Procedure. Zhaoguo Wang

Princeton University Computer Science 217: Introduction to Programming Systems. Assembly Language: Function Calls

Machine-Level Programming (2)

CS429: Computer Organization and Architecture

Assembly III: Procedures. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

How Software Executes

Assembly Language: Function Calls

Assembly Language II: Addressing Modes & Control Flow

Machine Language CS 3330 Samira Khan

Machine-level Programs Procedure

Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition. Carnegie Mellon

University of Washington

Machine-Level Programming III: Procedures

Machine Programming 3: Procedures

CSE351 Spring 2018, Midterm Exam April 27, 2018

The Hardware/Software Interface CSE351 Spring 2013

CSCI 2021: x86-64 Control Flow

Areas for growth: I love feedback

Procedures and the Call Stack

Assembly Programming IV

Functions. Ray Seyfarth. August 4, Bit Intel Assembly Language c 2011 Ray Seyfarth

C to Assembly SPEED LIMIT LECTURE Performance Engineering of Software Systems. I-Ting Angelina Lee. September 13, 2012

Assembly II: Control Flow

Function Calls COS 217. Reading: Chapter 4 of Programming From the Ground Up (available online from the course Web site)

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Where We Are. Optimizations. Assembly code. generation. Lexical, Syntax, and Semantic Analysis IR Generation. Low-level IR code.

Credits and Disclaimers

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 5

Do not turn the page until 5:10.

CS-220 Spring 2018 Test 2 Version Practice Apr. 23, Name:

CSE 351 Section 4 GDB and x86-64 Assembly Hi there! Welcome back to section, we re happy that you re here

How Software Executes

CSE 351 Midterm - Winter 2015 Solutions

Machine- Level Representa2on: Procedure

x86-64 Programming II

Lecture 3 CIS 341: COMPILERS

1 Number Representation(10 points)

Binghamton University. CS-220 Spring X86 Debug. Computer Systems Section 3.11

Mechanisms in Procedures. CS429: Computer Organization and Architecture. x86-64 Stack. x86-64 Stack Pushing

CS429: Computer Organization and Architecture

Computer Organization: A Programmer's Perspective

CSE 351 Midterm - Winter 2015

What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope. C Flow Control.

Instruction Set Architectures

CS 107. Lecture 13: Assembly Part III. Friday, November 10, Stack "bottom".. Earlier Frames. Frame for calling function P. Increasing address

CS 261 Fall Machine and Assembly Code. Data Movement and Arithmetic. Mike Lam, Professor

Review addressing modes

x86-64 Programming III & The Stack

The Stack & Procedures

6.1. CS356 Unit 6. x86 Procedures Basic Stack Frames

Machine-Level Programming II: Control

Assembly Programming III

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 4

Machine-Level Programming II: Control

CSE P 501 Exam Sample Solution 12/1/11

The Stack & Procedures

Intel x86-64 and Y86-64 Instruction Set Architecture

CS153: Compilers Lecture 2: Assembly

Practical Malware Analysis

Process Layout and Function Calls

x64 Cheat Sheet Fall 2014

x86 Programming II CSE 351 Winter

x86 Programming I CSE 351 Winter

Computer Architecture and System Programming Laboratory. TA Session 3

Changelog. Assembly (part 1) logistics note: lab due times. last time: C hodgepodge

x86-64 Programming III

Corrections made in this version not seen in first lecture:

Machine/Assembler Language Putting It All Together

How Software Executes

Low-Level Essentials for Understanding Security Problems Aurélien Francillon

%r8 %r8d. %r9 %r9d. %r10 %r10d. %r11 %r11d. %r12 %r12d. %r13 %r13d. %r14 %r14d %rbp. %r15 %r15d. Sean Barker

CSE 351 Midterm Exam

CSC 252: Computer Organization Spring 2018: Lecture 11

Lecture 4 CIS 341: COMPILERS

Data Representa/ons: IA32 + x86-64

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 7

Second Part of the Course

EXAMINATIONS 2014 TRIMESTER 1 SWEN 430. Compiler Engineering. This examination will be marked out of 180 marks.

Credits and Disclaimers

CS356: Discussion #15 Review for Final Exam. Marco Paolieri Illustrations from CS:APP3e textbook

CSE 351 Midterm - Winter 2017

Registers. Ray Seyfarth. September 8, Bit Intel Assembly Language c 2011 Ray Seyfarth

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 12

Buffer Overflow Attack (AskCypert CLaaS)

CS61 Section Solutions 3

a translator to convert your AST representation to a TAC intermediate representation; and

An Overview to Compiler Design. 2008/2/14 \course\cpeg421-08s\topic-1a.ppt 1

CS 261 Fall Mike Lam, Professor. x86-64 Control Flow

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 6

Instructions and Instruction Set Architecture

Function Calls and Stack

Computer Architecture I: Outline and Instruction Set Architecture. CENG331 - Computer Organization. Murat Manguoglu

Computer Processors. Part 2. Components of a Processor. Execution Unit The ALU. Execution Unit. The Brains of the Box. Processors. Execution Unit (EU)

Question Points Score Total: 100

Princeton University Computer Science 217: Introduction to Programming Systems Exceptions and Processes

SOEN228, Winter Revision 1.2 Date: October 25,

CSC 2400: Computing Systems. X86 Assembly: Function Calls"

Transcription:

Unoptimized i Code Generation From the intermediate representation to the machine code

5 Outline Introduction Machine Language Overview of a modern processor Memory Layout Procedure Abstraction Procedure Linkage Guidelines in Creating a Code Generator

Anatomy of a compiler Program (character stream) Lexical Analyzer (Scanner) Token Stream Syntax Analyzer (Parser) Parse Tree Semantic Analyzer Intermediate Representation Intermediate Code Optimizer Optimized Intermediate Representation Code Generator Assembly code

Anatomy of a compiler Program (character stream) Lexical Analyzer (Scanner) Token Stream Syntax Analyzer (Parser) Parse Tree Semantic Analyzer High-level IR Low-level IR Intermediate Representation Code Generator Assembly code

Components of a High Level Language CODE Procedures Control Flow Statements Data Access DATA Global Static Variables Global Dynamic Data Local Variables Temporaries Parameter Passing Read-only Data

Machine Code Generator Should... Translate all the instructions in the intermediate representation to assembly language Allocate space for the variables, arrays etc. Adhere to calling conventions Create the necessary symbolic information

7 Outline Introduction Machine Language Overview of a modern processor Memory Layout Procedure Abstraction Procedure Linkage Guidelines in Creating a Code Generator

Machines understand... LOCATION DATA 0046 8B45FC 0049 4863F0 004c 8B45FC 004f 4863D0 0052 8B45FC 0055 4898 0057 8B048500 000000 005e 8B149500 000000 0065 01C2 0067 8B45FC 006a 4898 006c 89D7 006e 033C8500 000000 0075 8B45FC 0078 4863C8 007b 8B45F8 007e 4898 0080 8B148500 000000

Machines understand... LOCATION DATA ASSEMBLY INSTRUCTION 0046 8B45FC -4(%rbp), %eax 0049 4863F0 %eax,%rsi 004c 8B45FC -4(%rbp), %eax 004f 4863D0 %eax,%rdx 0052 8B45FC movl -4(%rbp), %eax 0055 4898 0057 8B048500 B(,%rax,4), %eax 000000 movl 005e 8B149500 A(,%rdx,4), %edx movslq 000000 movl 0065 01C2 %eax, %edx movslq 0067 8B45FC -4(%rbp), %eax 006a 4898 movl 006c 89D7 addl, %edi cltq 006e 033C8500 C(,%rax,4), %edi movl %edx 000000 0075 8B45FC movl -4(%rbp), %eax 0078 4863C8 %eax,%rcx 007b 8B45F8 movl - 8(%rbp), %eax movl 007e 4898 0080 8B148500 cltqaddl B(,%rax,4), %edx movl

Program (character stream) Lexical Analyzer (Scanner) Token Stream Syntax Analyzer (Parser) Parse Tree Intermediate Code Generator High-level IR Low-level IR Intermediate Representation Code Generator Assembly code

Program (character stream) Lexical Analyzer (Scanner) Token Stream Syntax Analyzer (Parser) Parse Tree Intermediate Code Generator High-level IR Low-level IR Intermediate Representation Code Generator Assembly code Assembler & linker Binary executable Processor

Advantages Assembly language g Simplifies code generation due to use of symbolic instructions and symbolic names Logical abstraction layer Multiple Architectures can describe by a single assembly language can modify the implementation macro assembly instructions Disadvantages Additional process of assembling and linking Assembler adds overhead

Assembly language g Relocatable machine language (object modules) all locations(addresses) represented by symbols Mapped to memory addresses at link and load time Flexibility of separate compilation Absolute machine language addresses are hard-codedd simple and straightforward implementation inflexible -- hard to reload generated code Used in interrupt handlers and device drivers

Assembly example.section.lc0: 0000 6572726F7200.string "error".text.globl fact.rodata fact: 0000 55 pushq %rbp 0001 4889E5 %rsp, %rbp 0004 4883EC10 $16, %rsp 0008 897DFC %edi, -4(%rbp) 000b 837DFC00 000f 7911 0011 BF00000000 movl $0, $.LC0, -4(%rbp) %edi 0016 B800000000 001b E800000000.L2 0020 EB22 jns $0, %eax movq subq.l2: printf 0022 837DFC00 0026 7509.L3 movl cmpl jne L4 0028 C745F801000000 jmp $0, -4(%rbp) 002f EB13 movl. call.l4: $1, -8(%rbp) 0031 8B7DFC.L3-4(%rbp), %edi 0034 FFCF movl 0036 E800000000 jmp ll fact 003b 0FAF45FC cmpl %edi -4(%rbp), %eax 003f 8945F8 %eax, -8(%rbp) 0042 EB00 decl.l3: 0044 8B45F8.L1-8(%rbp), %eax 0047 C9 ca 0048 C3 jmp movl imull 11 movl

Composition of an Object File We use the ELF file format The object file has: Multiple Segments Symbol Information Relocation Information Segments Global Offset Table Procedure Linkage Table Text (code) Data Read Only Data.file.LC0: "test2.c".section.string "error %d".rodata.section.text.globl fact fact: pushq %rbp movq %rsp, %rbp subq $16, %rsp movl -8(%rbp), %eax leave ret..comm bar,4,4.comm a,1,1.comm b,1,1.section.long.lecie1-.lscie1.long 0x0.eh_frame,"a",@progbits.byte 0x1.string "".uleb128 0x1

11 Outline Introduction Machine Language Overview of a modern processor Memory Layout Procedure Abstraction Procedure Linkage Guidelines in Creating a Code Generator

Overview of a modern processor ALU Control Memory Memory Registers ALU Registers Control

Arithmetic and Logic Unit Performs most of the data operations Has the form: OP <oprnd 1 >, <oprnd 2 > <oprnd 2 > = <oprnd 1 > OP <oprnd 2 > Or OP <oprnd 1 > Operands are: Immediate Value $25 Register %rax Memory 4(%rbp) Operations are: Arithmetic operations (add, sub, imul) Logical operations (and, sal) Unitary operations (inc, dec) Memory Registers ALU Control

Arithmetic and Logic Unit Many arithmetic operations can cause an exception overflow and underflow Can operate on different data types addb 8 bits addw 16 bits addl 32 bits addq 64 bits (Decaf is all 64 bit) signed and unsigned arithmetic Floating-point operations (separate ALU) Memory Registers ALU Control

Control Handles the instruction sequencing Executing instructions All instructions are in memory Fetch the instruction pointed by the PC and execute it For general instructions, increment the PC to point to the next location in memory Memory Registers ALU Control

Control Unconditional Branches Fetch the next instruction ti from a different location Unconditional jump to an address jmp.l32 Unconditional jump to an address in a register jmp %rax To handle procedure calls call fact call %r11 Memory Registers ALU Control

Control All arithmetic operations update the condition codes (rflags) Compare explicitly sets the rflags cmp $0, %rax Conditional jumps on the rflags Jxx.L32 Jxx 4(%rbp) Examples: JO Jump Overflow JC Jump Carry JAE Jump if above or equal JZ Jump is Zero JNE Jump if not equal Memory Registers ALU Control

Control Control transfer in special (rare) cases traps and exceptions Mechanism Save the next(or current) instruction location find the address to jump to (from an exception vector) jump to that location Memory Registers ALU Control

18 When to use what? Give an example where each of the branch instructions can be used 1. jmp L0 2. call L1 3. jmp %rax 4. jz -4(%rbp) 5. jne L1

Memory Flat Address Space composed of words byte addressable Need to store Program Local variables Global variables and data Stack Heap Memory Registers ALU Control

Memory Dynamic Heap Memory 0x800 0000 0000 Stack Registers ALU Data Text Globals/ Read-only data Program Control 0x40 0000 0x0 Unmapped

Registers Instructions allow only limited memory operations add -4(%rbp), -8(%rbp) add %r10, -8(%rbp) Important for performance limited in number Memory Registers ALU Control Special registers %rbp base pointer %rsp stack pointer

Moving Data mov source dest Moves data from one register to another from registers to memory from memory to registers push source Pushes data into the stack pop dest Pops data from the stack to dest Memory Registers ALU Control

Other interactions Other operations Input/Output Privilege / secure operations Handling special hardware TLBs, Caches etc. Memory Registers ALU Control Mostly via system calls hand-coded in assembly compiler can treat them as a normal function call

22 Outline Introduction Machine Language Overview of a modern processor Memory Layout Procedure Abstraction Procedure Linkage Guidelines in Creating a Code Generator

Components of a High Level Language CODE Control Flow Procedures Statements Data Access DATA Global Static Variables Global Dynamic Data Local Variables Temporaries Parameter Passing Read-only Data

Memory Layout Heap management free lists starting location in the text segment 0x800 0000 0000 Dynamic Stack Data Heap Globals/ Read-only data CODE Control Flow Procedures Statements Data Access DATA Global Static Variables Global Dynamic Data Local Variables Temporaries Parameter Passing Read-only Data 0x40 0000 0x0 Text Unmapped Program

Allocating Read-Only Data All Read-Only data in the text segment Integers use load immediate Strings use the.string macro.section.text.globl main main: enter $0, $0 movq $5, x(%rip) push x(%rip) push $.msg call printf_035 add $16, %rsp leave ret CODE Control Flow Procedures Statements Data Access DATA Global Static Variables Global Dynamic Data Local Variables Temporaries Parameter Passing Read-only Data.msg:.string "Five: %d\n"

Global Variables Allocation: Use the assembler's.comm directive Use PC relative addressing %rip is the current instruction address X(%rip) will add the offset from the current instruction location to the space for x in the data segment to %rip Creates easily recolatable binaries CODE Control Flow Procedures Statements Data Access DATA Global Static Variables Global Dynamic Data Local Variables Temporaries Parameter Passing Read-only Data.section.text.globl main main: enter $0, $0 movq $5, x(%rip) push x(%rip) call printf_035 add $16, %rsp leave ret.comm x, 8.comm name, size, alignment The.comm directive i allocates storage in the data section. The storage is referenced by the identifier name. Size is measured in bytes and must be a positive integer. Name cannot be predefined. Alignment is optional. If alignment is specified, the address of name is aligned to a multiple of alignment

22 Outline Introduction Machine Language Overview of a modern processor Memory Layout Procedure Abstraction Procedure Linkage Guidelines in Creating a Code Generator

Procedure Abstraction Requires system-wide compact Broad agreement on memory layout, protection, resource allocation calling sequences, & error handling Must involve architecture (ISA), OS, & compiler Provides shared access to system-wide facilities Storage management, flow of control, interrupts Interface to input/output p devices, protection facilities, timers, synchronization flags, counters, Establishes the need for a private context Create private storage for each procedure invocation i Encapsulate information about control flow & data abstractions The procedure abstraction is a social contract (Rousseau)

Procedure Abstraction In practical terms it leads to... multiple procedures library calls compiled by many compilers, written in different languages, hand-written assembly For the project, we need to worry about Parameter passing Registers Stack Calling convention

Parameter passing disciplines Many different methods call by reference call by value call by value-result (copy-in/copy-out)

Parameter Passing Disciplines Program { int A; foo(int B) { B = B + 1 B = B + A } Main() { A = 10; foo(a); } } Call by value A is??? Call by reference A is??? Call by value-result A is??? 25

Parameter Passing Disciplines Program { int A; foo(int B) { B = B + 1 B = B + A } Main() { A = 10; foo(a); } } Call by value A is 10 Call by reference A is 22 Call by value-result A is 21 25

Parameter passing disciplines Many different methods call by reference call by value call by value-result How do you pass the parameters? via. the stack via. the registers or a combination In the Decaf ca lling convention, the fi first 6 parameters are passed in registers. The rest are passed in the stack

Registers What to do with live registers across a procedure call? Caller Saved Calliee Saved

30 Question: What are the advantages/disadvantages g of: Calliee saving of registers? Caller saving of registers? What registers should be used at the caller and calliee if half is caller-saved and the other half is calliee-saved?

Registers What to do with live registers across a procedure call? Caller Saved Calliee Saved In this segment, use registers only as short-lived temporaries mov -4(%rbp), %r10 mov -8(%rbp), %r11 add %r10, %r11 mov %r11, -8(%rbp) Should not be live across procedure calls Will start keeping data in the registers for performance in Segment V

The Stack Arguments 0 to 6 are in: %rdi, %rsi, %rdx, %rcx, %r8 and %r9 8*n+16(%rbp) 16(%rbp) 8(%rbp) 0(%rbp) -8(%rbp) -8*m-8(%rbp) 0(%rsp) argument n argument 7 Return address Previous %rbp local 0 local m Pre vious Cu urrent Variable size

33 Question: Why use a stack? Why not use the heap or y y p pre-allocated in the data segment?

34 Outline Introduction Machine Language Overview of a modern processor Memory Layout Procedure Abstraction Procedure Linkage Guidelines in Creating a Code Generator

Procedure Linkages Standard procedure linkage procedure p prolog procedure q prolog Procedure has standard prolog standard epilog pre-call post-return epilog Each call involves a pre-call sequence post-return sequence epilog

Calling: Caller Assume %rcx is live and is caller save Call foo(a, B, C, D, E, F, G, H, I) Stack A to I are at -8(%rbp) to -72(%rbp) push push push push %rcx -72(%rbp) -64(%rbp) -56(%rbp) mov -48(%rbp), %r9 mov -40(%rbp), %r8 mov mov mov mov call -32(%rbp), %rcx -24(%rbp), %rdx -16(%rbp), %rsi -8(%rbp), %rdi foo return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area caller saved registers argument 9 argument 8 argument 7 return address rbp rsp

Calling: Calliee Stack Assume %rbx is used in the function and is calliee save Assume 40 bytes are required for locals foo: push %rbp enter $48, $0 mov sub mov %rsp, %rbp $48, %rsp %rbx, -8(%rbp) return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area caller saved registers argument 9 argument 8 argument 7 return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area rbp rsp

Arguments Call foo(a, B, C, D, E, F, G, H, I) CODE Control Flow Procedures Statements Data Access Passed in by pushing before the call push -72(%rbp) push -64(%rbp) push -56(%rbp) mov -48(%rbp), %r9 mov -40(%rbp), %r8 mov -32(%rbp), %rcx mov -24(%rbp), %rdx mov -16(%rbp), %rsi mov -8(%rbp), %rdi call foo Access A to F via registers or put them in local memory Access rest using 16+xx(%rbp) mov Stack 16(%rbp), %rax mov 24(%rbp), %r10 DATA Global Static Variables Global Dynamic Data Local Variables Temporaries Parameter Passing Read-only Data return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area caller saved registers argument 9 argument 8 argument 7 return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area rbp rsp

Stack Locals and Temporaries Calculate the size and allocate space on the stack sub $48, %rsp return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area or enter $48, 0 caller saved registers argument 9 argument 8 argument 7 Access using -8-xx(%rbp) return address mov -28(%rbp), %r10 previous frame pointer mov %r11, -20(%rbp) calliee saved registers CODE Control Flow Procedures Statements Data Access DATA Global Static Variables Global Dynamic Data Local Variables Temporaries Parameter Passing Read-only Data local variables stack temporaries dynamic area rbp rsp

Returning Calliee Stack Assume the return value is the first temporary Restore the caller saved register Put the return value in %rax Tear-down the call stack mov -8(%rbp), %rbx mov -16(%rbp), %rax mov pop ret leave %rbp, %rsp %rbp return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area caller saved registers argument 9 argument 8 argument 7 return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area rbp rsp

Returning Caller Stack Assume the return value goes to the first temporary Restore the stack to reclaim the argument space Restore the caller save registers return address previous frame pointer calliee saved registers local variables stack temporaries dynamic area caller saved registers argument 9 argument 8 argument 7 rbp rsp call foo CODE DATA add $24, %rsp pop %rcx mov %rax, 8(%rbp) Procedures Global Static Variables Global Dynamic Data Control Flow Local Variables Statements Temporaries Parameter Passing Data Access Read-only Data

47 Question: Do you need the $rbp? What are the advantages and disadvantages of having $rbp?

49 Outline Introduction Machine Language Overview of a modern processor Memory Layout Procedure Abstraction Procedure Linkage Guidelines in Creating a Code Generator

What We Covered Today.. CODE Procedures Control Flow Statements Data Access DATA Global Static Variables Global Dynamic Data Local Variables Temporaries Parameter Passing Read-only Data

Guidelines for the code generator Lower the abstraction level slowly Do many passes, that do few things (or one thing) Easier to break the project down, generate and debug Keep the abstracti tion level cons istent IR should have correct semantics at all time At least you should know the semantics You may want to run some of the optimizations between the passes. Use assertions liberally Use an assertion to check your assumption

Guidelines for the code generator Do the simplest but dumb thing it is ok to generate 0 + 1*x + 0*y Code is painful to look at, but will help optimizations Make sure you know want can be done at Compile time in the compiler Runtime using generated code

Guidelines for the code generator Remember that optimizations will come later Let the optimizer do the optimizations Think about what optimizer will need and structure your code accordingly Example: Register allocation, algebraic simplification, constant propagation Setup a good testing infrastructure regression tests If a input program creates a bug, use it as a regression test Learn good bug hunting procedures Example: binary search

MIT OpenCourseWare http://ocw.mit.edu 6.035 Computer Language Engineering Spring 2010 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.