CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

CS 412/413 Spring 2008 Introduction to Compilers

Simple Code Generation

- Three-address code makes it easy to generate assembly:
  - Complex expressions in the input program have already been lowered to sequences of simple IR instructions.
  - Each low-level IR instruction just needs to be translated into a sequence of assembly instructions.
- E.g., a = p+q:

      mov 16(%ebp), %ecx
      add 8(%ebp), %ecx
      mov %ecx, -8(%ebp)

- Need to consider many language constructs:
  - Operations: arithmetic, logic, comparisons
  - Accesses to local variables, global variables
  - Array accesses, field accesses
  - Control flow: conditional and unconditional jumps
  - Method calls, dynamic dispatch
  - Dynamic allocation (new)
  - Run-time checks

x86 Quick Overview

- Registers:
  - General purpose, 32-bit: eax, ebx, ecx, edx, esi, edi
  - Also 16-bit: ax, bx, etc., and 8-bit: al, ah, bl, bh, etc.
  - Stack registers: esp, ebp
- Instructions:
  - Arithmetic: add, sub, inc, idiv, imul, etc. (there is no separate mod instruction; idiv leaves the remainder in edx)
  - Logic: and, or, not, xor
  - Comparison: cmp, test
  - Control flow: jmp, jcc (conditional jumps such as je, jne, jl), jecxz
  - Function calls: call, ret
  - Data movement: mov (many variants)
  - Stack manipulations: push, pop
  - Other: lea

Big Picture of Program Memory

- Stack variables: Param n ... Param 1, Return address, Previous fp, Local 1 ... Local n
- Heap variables
- Global (static) variables: Global 1 ... Global n

Memory Layout

From low to high addresses:

  Region        Contents
  (low)         Code
  Static area   Globals, static data
  Heap          Object fields, arrays
  Stack         Locals, parameters
  (high)

Accessing Stack Variables

- To access stack variables: use offsets from ebp
- Example:
  - 8(%ebp) = parameter 1
  - 12(%ebp) = parameter 2
  - -4(%ebp) = local 1

[Figure: stack frame layout listing Param n ... Param 1, Return address, Previous fp, Local 1 ... Local n, then Param 1 ... Param n, with address labels ebp+..., ebp+8 (Param 1), ebp (Previous fp), ebp-4 (Local 1), and esp.]

Accessing Stack Variables

- Translate accesses to variables:
  - For parameters, compute the offset from %ebp using:
    - the parameter number
    - the sizes of the other parameters
  - For local variables, decide on the data layout and assign an offset from the frame pointer to each local.
  - Store the offsets in the symbol table.
  - Keep track of the high-water mark for frame allocation.
- Example:
  - a: local, offset -8
  - p: parameter, offset +16; q: parameter, offset +8
  - The assignment a = p + q becomes equivalent to: -8(%ebp) = 16(%ebp) + 8(%ebp)
  - How to write this in assembly?
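The offset assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function `layout_frame`, the variable names, and the assumption that every variable occupies 4 bytes are all hypothetical, and the "high-water mark" is simplified to the total space needed for the locals of one procedure.

```python
# Hypothetical helper: assign %ebp-relative offsets to parameters and locals.
# Assumes every variable is 4 bytes wide and that parameter 1 sits at
# 8(%ebp), as on the slide. Names are illustrative only.
WORD = 4

def layout_frame(params, local_vars):
    """Return (symbol_table, frame_size) for one procedure."""
    table = {}
    offset = 8                      # parameter 1 lives at 8(%ebp)
    for name in params:
        table[name] = offset
        offset += WORD              # next parameter is one word higher
    offset = 0
    for name in local_vars:
        offset -= WORD              # local 1 lives at -4(%ebp)
        table[name] = offset
    frame_size = -offset            # bytes to reserve below %ebp
    return table, frame_size

table, frame_size = layout_frame(["x", "y", "z"], ["u", "v"])
print(table)        # offsets keyed by variable name
print(frame_size)   # space needed for the locals
```

Parameters get positive offsets and locals negative ones, so the sign of a symbol-table entry already tells the code generator which side of the frame pointer a variable lives on.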

Arithmetic

- How to translate p+q?
  - Assume p and q are locals or parameters.
  - Determine the offsets for p and q.
  - Perform the arithmetic operation.
- Problem: the ADD instruction in x86 cannot take both operands from memory. Notation for possible operands:
  - r/m32: register or memory, 32-bit (similarly r/m8, r/m16)
  - reg32: register, 32-bit (similarly reg8, reg16)
  - imm32: immediate, 32-bit (similarly imm8, imm16)
  - At most one operand can be in memory!
- Translation requires using an extra register:
  - Place p into a register (e.g., %ecx): mov 16(%ebp), %ecx
  - Perform the addition of q and %ecx: add 8(%ebp), %ecx

Data Movement

- Translate a = p+q:
  - Load memory location (p) into a register (%ecx) using a move instruction.
  - Perform the addition.
  - Store the result from the register into memory location (a):

      mov 16(%ebp), %ecx     (load)
      add 8(%ebp), %ecx      (arithmetic)
      mov %ecx, -8(%ebp)     (store)

- Move instructions cannot have two memory operands.
- Therefore, copy instructions must be translated using an extra register:
  a = p

      mov 16(%ebp), %ecx
      mov %ecx, -8(%ebp)

- However, loading constants doesn't require extra registers (the l suffix gives the operand size, which the assembler cannot infer when no register operand is present):
  a = 12

      movl $12, -8(%ebp)
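The three translation patterns on this slide (add, copy, constant store) could be emitted by a code generator along the following lines. This is only a sketch: the symbol table `OFFSETS` and the functions `gen_add`, `gen_copy`, and `gen_const` are hypothetical names, all variables are assumed to be 4-byte stack variables, and %ecx is assumed to be free as a scratch register.

```python
# Hypothetical symbol table: variable name -> offset from %ebp
# (the offsets used in the slide's example).
OFFSETS = {"a": -8, "p": 16, "q": 8}

def operand(name):
    """AT&T memory operand for a stack variable."""
    return f"{OFFSETS[name]}(%ebp)"

def gen_add(dst, src1, src2, scratch="%ecx"):
    """dst = src1 + src2: load, add, store through one scratch register."""
    return [
        f"movl {operand(src1)}, {scratch}",   # load
        f"addl {operand(src2)}, {scratch}",   # arithmetic
        f"movl {scratch}, {operand(dst)}",    # store
    ]

def gen_copy(dst, src, scratch="%ecx"):
    """dst = src: mov cannot take two memory operands, so go via a register."""
    return [
        f"movl {operand(src)}, {scratch}",
        f"movl {scratch}, {operand(dst)}",
    ]

def gen_const(dst, value):
    """dst = constant: an immediate can be stored straight to memory."""
    return [f"movl ${value}, {operand(dst)}"]

for line in gen_add("a", "p", "q") + gen_copy("a", "p") + gen_const("a", 12):
    print(line)
```

The generated instructions carry an explicit l suffix so that the operand size is unambiguous even when no register appears in the instruction.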

Accessing Global Variables

- Global (static) variables and constants are not stack allocated:
  - They have fixed addresses throughout the execution of the program.
  - Their addresses are known at compile time (relative to the base address where the program is loaded).
  - Hence, the generated assembly code can refer to these addresses directly using symbolic names.
- Example: string constants

      str: .string "Hello world!"

  - The string will be allocated in the static area of the program.
  - Here, str is a label representing the address of the string.
  - Can use $str as a constant in other instructions:

      push $str
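As a small illustration of the string-constant example, a code generator might emit the label definition and a use of its address as follows. The function names `gen_string_constant` and `gen_push_address` are hypothetical, and the escaping handles only backslashes and double quotes.

```python
def gen_string_constant(label, text):
    """Emit a labelled string constant for the static area."""
    # Escape backslashes first, then double quotes, for the assembler.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return [f'{label}: .string "{escaped}"']

def gen_push_address(label):
    """Push the address of a label; $label is an immediate in AT&T syntax."""
    return [f"pushl ${label}"]

for line in gen_string_constant("str", "Hello world!") + gen_push_address("str"):
    print(line)
```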

Accessing Heap Data

- Heap data is allocated with new (Java) or malloc (C/C++):
  - Such allocation routines return the address of the allocated data.
  - References to the data are stored in local variables.
  - Heap data is accessed through these references.
- Array accesses in a language with dynamic array sizes: the access a[i] requires:
  - Computing the address of the element: a + i * size
  - Accessing memory at that address
- Can use indexed memory accesses to compute addresses.
- Example: assume the size of array elements is 4 bytes, and a, i are local variables (offsets -4, -8):
  a[i] = 1

      mov -4(%ebp), %ebx          (load a)
      mov -8(%ebp), %ecx          (load i)
      movl $1, (%ebx,%ecx,4)      (store into the heap)
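The address computation a + i * size and the corresponding indexed store can be sketched as below. The names `element_address` and `gen_array_store_const` are hypothetical, the array reference and the index are assumed to be 4-byte locals, and %ebx and %ecx are assumed to be free.

```python
# Operand-size suffix for the store, keyed by element size in bytes.
SUFFIX = {1: "b", 2: "w", 4: "l"}

def element_address(base, index, elem_size=4):
    """Address of a[index] when a starts at address base."""
    return base + index * elem_size

def gen_array_store_const(array_off, index_off, value, elem_size=4):
    """a[i] = value, with a and i at the given %ebp offsets."""
    if elem_size not in SUFFIX:
        raise ValueError("this sketch supports element sizes 1, 2 and 4 only")
    return [
        f"movl {array_off}(%ebp), %ebx",                            # load a
        f"movl {index_off}(%ebp), %ecx",                            # load i
        f"mov{SUFFIX[elem_size]} ${value}, (%ebx,%ecx,{elem_size})",  # store
    ]

print(hex(element_address(0x1000, 3)))
for line in gen_array_store_const(-4, -8, 1):
    print(line)
```

The scale factor in the indexed operand lets the hardware perform the multiplication by the element size, so no separate imul or shift is needed.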

Control-Flow

- Label instructions:
  - Simply translated as labels in the assembly code
  - E.g., label2: mov $2, %ebx
- Unconditional jumps:
  - Use the jump instruction, with a label argument
  - E.g., jmp label2
- Conditional jumps:
  - Translate conditional jumps using test/cmp instructions
  - E.g., tjump b L:

      cmp $0, %ecx
      jnz L

    where %ecx holds the value of b, and we assume booleans are represented as 0=false, 1=true
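A dispatcher over the three control-flow IR forms might look like this. It is a sketch under stated assumptions: the tuple encoding of IR instructions and the function `gen_control` are hypothetical, and the condition value is assumed to have been loaded into a register already.

```python
def gen_control(instr, cond_reg="%ecx"):
    """Translate one control-flow IR instruction (a tuple) to assembly lines."""
    kind = instr[0]
    if kind == "label":                 # ("label", name)
        return [f"{instr[1]}:"]
    if kind == "jump":                  # ("jump", target)
        return [f"jmp {instr[1]}"]
    if kind == "tjump":                 # ("tjump", variable, target)
        # The variable's value is assumed to be in cond_reg already;
        # booleans are 0 = false, 1 = true, so jump when non-zero.
        return [f"cmpl $0, {cond_reg}", f"jnz {instr[2]}"]
    raise ValueError(f"not a control-flow instruction: {instr!r}")

program = [("tjump", "b", "L"), ("jump", "done"), ("label", "L"), ("label", "done")]
for ir in program:
    for line in gen_control(ir):
        print(line)
```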

Run-time Checks

- Run-time checks:
  - Check if array/object references are non-null
  - Check if the array index is within bounds
- Example: array bounds checks:
  - If v holds the address of an array, insert array bounds checking code for v before each load (... = v[i]) or store (v[i] = ...)
  - Assume the array length is stored just before the array elements:

      cmpl $0, -12(%ebp)          (compare i to 0)
      jl ArrayBoundsError         (test lower bound)
      mov -8(%ebp), %ecx          (load v into %ecx)
      mov -4(%ecx), %ecx          (load array length into %ecx)
      cmp -12(%ebp), %ecx         (compare i to array length)
      jle ArrayBoundsError        (test upper bound)
      ...
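The bounds-check sequence can be generated from the offsets of v and i, as in the following sketch. The function names are hypothetical; the code assumes that v and i are 4-byte locals (here at -8(%ebp) and -12(%ebp)), that the array length is stored in the word just before the first element, and that v has already been checked to be non-null. The small `check_passes` function models the condition the two jumps test.

```python
def gen_bounds_check(array_off, index_off,
                     error_label="ArrayBoundsError", scratch="%ecx"):
    """Emit a bounds check for v[i], with v and i at the given %ebp offsets."""
    return [
        f"cmpl $0, {index_off}(%ebp)",          # compare i to 0
        f"jl {error_label}",                    # i < 0: lower bound violated
        f"movl {array_off}(%ebp), {scratch}",   # load v
        f"movl -4({scratch}), {scratch}",       # load length stored before elements
        f"cmpl {index_off}(%ebp), {scratch}",   # compare i to length
        f"jle {error_label}",                   # length <= i: upper bound violated
    ]

def check_passes(i, length):
    """True when neither jump in the generated sequence would be taken."""
    return not (i < 0) and not (length <= i)

for line in gen_bounds_check(-8, -12):
    print(line)
```

In AT&T syntax `cmp a, b` sets the flags for b - a, so the final `jle` is taken exactly when the length is less than or equal to the index.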

X86 Assembly Syntax

- Two different notations for assembly syntax: AT&T syntax and Intel syntax
- In the examples: AT&T syntax
- Summary of differences:

                            AT&T                           Intel
  Order of operands         op a, b : b is destination     op a, b : a is destination
  Memory addressing         disp(base,offset,scale)        [base + offset*scale + disp]
  Size of memory operands   instruction suffixes (b,w,l)   operand prefixes
                            (e.g., movb, movw, movl)       (byte ptr, word ptr, dword ptr)
  Registers                 %eax, %ebx, etc.               eax, ebx, etc.
  Constants                 $4, $foo, etc.                 4, foo, etc.
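To make the memory-addressing row concrete, here is a small sketch that rewrites an AT&T memory operand of the form disp(base,offset,scale) into the Intel form. The function `att_mem_to_intel` and its regular expression are hypothetical and cover only the simple operand shapes used in this lecture, not the full assembler syntax.

```python
import re

# disp(base,index,scale); every component except the parentheses is optional.
_MEM = re.compile(r"^(-?\w*)\((%\w+)?(?:,(%\w+)(?:,(\d))?)?\)$")

def att_mem_to_intel(operand):
    """Rewrite an AT&T memory operand as an Intel-style [ ... ] operand."""
    match = _MEM.match(operand.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognised memory operand: {operand!r}")
    disp, base, index, scale = match.groups()
    parts = []
    if base:
        parts.append(base.lstrip("%"))          # registers lose the % prefix
    if index:
        reg = index.lstrip("%")
        parts.append(f"{reg}*{scale}" if scale else reg)
    text = " + ".join(parts)
    if disp.startswith("-") and text:
        text += " - " + disp[1:]                # negative displacement
    elif disp:
        text = f"{text} + {disp}" if text else disp
    return f"[{text}]"

for op in ["16(%ebp)", "-8(%ebp)", "(%ebx,%ecx,4)"]:
    print(op, "->", att_mem_to_intel(op))
```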