CSE P 501 Compilers. x86 Lite for Compiler Writers Hal Perkins Autumn /25/ Hal Perkins & UW CSE J-1

Similar documents
CSE 401/M501 Compilers

Program Exploitation Intro

CSE 401 Compilers. x86-64 Lite for Compiler Writers A quick (a) introduccon or (b) review. Hal Perkins Winter UW CSE 401 Winter 2017 J-1

Winter Compiler Construction T11 Activation records + Introduction to x86 assembly. Today. Tips for PA4. Today:

CSC 2400: Computer Systems. Towards the Hardware: Machine-Level Representation of Programs

Practical Malware Analysis

CSC 8400: Computer Systems. Machine-Level Representation of Programs

X86 Addressing Modes Chapter 3" Review: Instructions to Recognize"

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

Reverse Engineering Low Level Software. CS5375 Software Reverse Engineering Dr. Jaime C. Acosta

An Introduction to x86 ASM

Compiler Construction D7011E

What is a Compiler? Compiler Construction SMD163. Why Translation is Needed: Know your Target: Lecture 8: Introduction to code generation

Computer Systems Organization V Fall 2009

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 21: Generating Pentium Code 10 March 08

CPS104 Recitation: Assembly Programming

W4118: PC Hardware and x86. Junfeng Yang

SOEN228, Winter Revision 1.2 Date: October 25,

Lecture 2 Assembly Language

CSE351 Spring 2018, Midterm Exam April 27, 2018

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2016 Lecture 12

x86 assembly CS449 Fall 2017

Stack -- Memory which holds register contents. Will keep the EIP of the next address after the call

Second Part of the Course

Function Calls COS 217. Reading: Chapter 4 of Programming From the Ground Up (available online from the course Web site)

CSE P 501 Compilers. x86-64 Lite for Compiler Writers A quick (a) introduceon (b) review. Hal Perkins Winter UW CSE P 501 Winter 2016 J-1

Basic Pentium Instructions. October 18

Low-Level Essentials for Understanding Security Problems Aurélien Francillon

Towards the Hardware"

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 5

CNIT 127: Exploit Development. Ch 1: Before you begin. Updated

Lab 3. The Art of Assembly Language (II)

Compiler construction. x86 architecture. This lecture. Lecture 6: Code generation for x86. x86: assembly for a real machine.

EECE.3170: Microprocessor Systems Design I Summer 2017 Homework 4 Solution

Credits and Disclaimers

16.317: Microprocessor Systems Design I Fall 2014

x64 Cheat Sheet Fall 2014

CS61 Section Solutions 3

Digital Forensics Lecture 3 - Reverse Engineering

Computer Architecture and Assembly Language. Practical Session 3

Process Layout and Function Calls

complement) Multiply Unsigned: MUL (all operands are nonnegative) AX = BH * AL IMUL BH IMUL CX (DX,AX) = CX * AX Arithmetic MUL DWORD PTR [0x10]

CMSC 313 Lecture 07. Short vs Near Jumps Logical (bit manipulation) Instructions AND, OR, NOT, SHL, SHR, SAL, SAR, ROL, ROR, RCL, RCR

16.317: Microprocessor Systems Design I Fall 2015

Scott M. Lewandowski CS295-2: Advanced Topics in Debugging September 21, 1998

Computer Systems C S Cynthia Lee

Computer Architecture and System Programming Laboratory. TA Session 3

CMSC 313 Lecture 12. Project 3 Questions. How C functions pass parameters. UMBC, CMSC313, Richard Chang

CS 33: Week 3 Discussion. x86 Assembly (v1.0) Section 1G

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College February 9, 2016

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2018 Lecture 4

THEORY OF COMPILATION

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College September 25, 2018

Sample Exam I PAC II ANSWERS

ASSEMBLY II: CONTROL FLOW. Jo, Heeseung

Assembly Language: Function Calls" Goals of this Lecture"

How Software Executes

Chapter 4 Processor Architecture: Y86 (Sections 4.1 & 4.3) with material from Dr. Bin Ren, College of William & Mary

Lecture 15 Intel Manual, Vol. 1, Chapter 3. Fri, Mar 6, Hampden-Sydney College. The x86 Architecture. Robb T. Koether. Overview of the x86

Sistemi Operativi. Lez. 16 Elementi del linguaggio Assembler AT&T

Subprograms: Arguments

CSE P 501 Exam 8/5/04 Sample Solution. 1. (10 points) Write a regular expression or regular expressions that generate the following sets of strings.

Basic Assembly Instructions

Assembly Language: IA-32 Instructions

x86 Assembly Crash Course Don Porter

CSE 582 Autumn 2002 Exam Sample Solution

Assembly Language: Function Calls

Assembly Language: Function Calls" Goals of this Lecture"

A CRASH COURSE IN X86 DISASSEMBLY

CSC 8400: Computer Systems. Using the Stack for Function Calls

3.1 DATA MOVEMENT INSTRUCTIONS 45

Assembly II: Control Flow. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Reverse Engineering II: The Basics

CSC 2400: Computer Systems. Using the Stack for Function Calls

Machine-level Representation of Programs. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

The x86 Architecture

x86 assembly CS449 Spring 2016

Module 3 Instruction Set Architecture (ISA)

CS165 Computer Security. Understanding low-level program execution Oct 1 st, 2015

Language of x86 processor family

Intro to x86 Binaries. From ASM to exploit

22 Assembly Language for Intel-Based Computers, 4th Edition. 3. Each edge is a transition from one state to another, caused by some input.

CS241 Computer Organization Spring 2015 IA

CSCI 2121 Computer Organization and Assembly Language PRACTICE QUESTION BANK

Assembly Programmer s View Lecture 4A Machine-Level Programming I: Introduction

CSIS1120A. 10. Instruction Set & Addressing Mode. CSIS1120A 10. Instruction Set & Addressing Mode 1

CS429: Computer Organization and Architecture

Chapter 2. lw $s1,100($s2) $s1 = Memory[$s2+100] sw $s1,100($s2) Memory[$s2+100] = $s1

Reverse Engineering II: Basics. Gergely Erdélyi Senior Antivirus Researcher

Assembly Language Programming: Procedures. EECE416 uc. Charles Kim Howard University. Fall

Inline Assembler. Willi-Hans Steeb and Yorick Hardy. International School for Scientific Computing

Real instruction set architectures. Part 2: a representative sample

CO Computer Architecture and Programming Languages CAPL. Lecture 13 & 14

Computer System Architecture

UNIT II. 2 Marks. 2. List out the various register of IA- 32 Register structure. 3. Write about General purpose registers

CMSC 313 Lecture 05 [draft]

Instruction Set Architecture

Instruction Set Architecture

Machine-Level Programming II: Control and Arithmetic

Introduction to Reverse Engineering. Alan Padilla, Ricardo Alanis, Stephen Ballenger, Luke Castro, Jake Rawlins

Transcription:

CSE P 501 Compilers x86 Lite for Compiler Writers Hal Perkins Autumn 2011 10/25/2011 2002-11 Hal Perkins & UW CSE J-1

Agenda Learn/review x86 architecture Core 32-bit part only for now Ignore crufty, backward-compatible things Look at x86-64 extensions later Suggest either 32- or 64-bit x86 as compiler target for project (tradeoffs either way) If you want to try something else (ARM, MIPS,?), let s talk After we ve reviewed the x86 we ll look at how to map language constructs to code 10/25/2011 2002-11 Hal Perkins & UW CSE J-2

x86 Selected History 30 Years of x86 1978: 8086 16-bit processor, segmentation 1982: 80286 protected mode, floating point 1985: 80386 32-bit architecture, general-purpose register set, VM 1993: Pentium mmx 1999: Pentium III SSE 2000: Pentium IV SSE2, SSE3, HT, virtualization 2006: Core Duo, Core 2 Multicore, SSE4+, x86-64 2008: Atom, i7, Many internal implementation changes, pipelining, concurrency, &c 10/25/2011 2002-11 Hal Perkins & UW CSE J-3

And It s Backward-Compatible! Current processors can run 8086 code (!) (You can get VisiCalc 1.0 on the web!) The Intel descriptions of the architecture are loaded down with modes and flags that obscure the modern, fairly simple 32-bit and 64-bit processor models Modern processors have a RISC-like core Simple, register-register & load/store architecture Simple x86 instructions preferred; complex CICS instructions supported We ll focus on the basic 32-bit core instructions for now 10/25/2011 2002-11 Hal Perkins & UW CSE J-4

x86 Assembler The nice thing about standards Two main assembler languages for x86: Intel/Microsoft what s in the documentation GNU / AT&T syntax (Linux, OS X) Use gcc S to generate examples from C/C++ code You can use either for your project Slides use Intel descriptions Brief information later on differences Main changes: dst,src reversed, data types in gnu opcodes, various syntactic annoyances 10/25/2011 2002-11 Hal Perkins & UW CSE J-5

Intel ASM Statements Format is optlabel: opcode operands ; comment optlabel is an optional label opcode and operands make up the assembly language instruction Anything following a ; is a comment Language is very free-form Comments and labels may appear on separate lines by themselves (we ll take advantage of this) 10/25/2011 2002-11 Hal Perkins & UW CSE J-6

x86 Memory Model 8-bit bytes, byte addressable 16-, 32-, 64-bit words, doublewords, and quadwords Data should almost always be aligned on natural boundaries; huge performance penalty on modern processors if it isn t Little-endian address of a 4-byte integer is address of low-order byte 10/25/2011 2002-11 Hal Perkins & UW CSE J-7

x86 Processor Registers 8 32-bit, mostly general purpose registers eax, ebx, ecx, edx, esi, edi, ebp (base pointer), esp (stack pointer) Other registers, not directly addressable 32-bit eflags register Holds condition codes, processor state, etc. 32-bit instruction pointer eip Holds address of first byte of next instruction to execute 10/25/2011 2002-11 Hal Perkins & UW CSE J-8

Processor Fetch-Execute Cycle Basic cycle (same as every processor you ve ever seen) while (running) { } fetch instruction beginning at eip address eip <- eip + instruction length execute instruction Sequential execution unless a jump stores a new next instruction address in eip 10/25/2011 2002-11 Hal Perkins & UW CSE J-9

Instruction Format Typical data manipulation instruction opcode dst,src Meaning is dst <- dst op src Normally, one operand is a register, the other is a register, memory location, or integer constant Can t have both operands in memory can t encode two separate memory addresses 10/25/2011 2002-11 Hal Perkins & UW CSE J-10

x86 Memory Stack Register esp points to the top of stack Dedicated for this use; don t use otherwise Points to the last 32-bit doubleword pushed onto the stack (not next free one) Should always be doubleword aligned It will start out this way, and will stay aligned unless your code does something bad Stack grows down 10/25/2011 2002-11 Hal Perkins & UW CSE J-11

Stack Instructions push src esp <- esp 4; memory[esp] <- src (e.g., push src onto the stack) pop dst dst <- memory[esp]; esp <- esp + 4 (e.g., pop top of stack into dst and logically remove it from the stack) These are highly optimized and heavily used 32-bit function call protocol is stack-based 32-bit x86 doesn t have enough registers, so the stack is frequently used for temporary space 10/25/2011 2002-11 Hal Perkins & UW CSE J-12

Stack Frames When a method is called, a stack frame is traditionally allocated on the top of the stack to hold its local variables Frame is popped on method return By convention, ebp (base pointer) points to a known offset into the stack frame Local variables referenced relative to ebp (This is often optimized to use esp-relative references instead. Frees up ebp, needs additional bookkeeping at compile time.) 10/25/2011 2002-11 Hal Perkins & UW CSE J-13

Operand Address Modes (1) These should cover most of what we ll need mov eax,17 mov eax,ecx mov eax,[ebp-12] mov [ebp+8],eax ; store 17 in eax ; copy ecx to eax ; copy memory to eax ; copy eax to memory References to object fields work similarly put the object s memory address in a register and use that address plus field offset 10/25/2011 2002-11 Hal Perkins & UW CSE J-14

Operand Address Modes (2) In full generality, a memory address can combine the contents of two registers (with one being scaled) plus a constant displacement: [basereg + index*scale + constant] Scale can be 2, 4, 8 Main use is for array subscripting Example: suppose Array of 4-byte ints, address of the array A is in ecx, subscript i is in eax Code to store edx in A[i] mov [ecx+eax*4],edx ;; and we didn t even use the offset! 10/25/2011 2002-11 Hal Perkins & UW CSE J-15

dword ptr Intel assembler Obscure, but sometimes necessary If the assembler can t figure out the size of the operands to move, you can explicitly tell it to move 32 bits with the qualifier dword ptr mov dword ptr [eax+16],[ebp-8] Use this if the assembler complains; otherwise ignore Not an issue in GNU asembler different opcode mnemonics for different operand sizes 10/25/2011 2002-11 Hal Perkins & UW CSE J-16

Basic Data Movement and Arithmetic Instructions mov dst,src dst <- src add dst,src dst <- dst + src sub dst,src dst <- dst src inc dst dst <- dst + 1 dec dst dst <- dst - 1 neg dst dst <- - dst (2 s complement arithmetic negation) 10/25/2011 2002-11 Hal Perkins & UW CSE J-17

Integer Multiply and Divide imul dst,src dst <- dst * src 32-bit product dst must be a register imul dst,src,imm8 dst <- dst*src*imm8 imm8 8 bit constant Obscure, but useful for optimizing array refs There are other mul instructions see docs idiv src cdq Divide edx:eax by src (edx:eax holds signextended 64-bit value; cannot use other registers for division) eax <- quotient edx <- remainder edx:eax <- 64-bit sign extended copy of eax 10/25/2011 2002-11 Hal Perkins & UW CSE J-18

Bitwise Operations and dst,src dst <- dst & src or dst,src dst <- dst src xor dst,src dst <- dst ^ src not dst dst <- ~ dst (logical or 1 s complement) 10/25/2011 2002-11 Hal Perkins & UW CSE J-19

Shifts and Rotates shl dst,count dst shifted left count bits shr dst,count dst <- dst shifted right count bits (0 fill) sar dst,count dst <- dst shifted right count bits (sign bit fill) rol dst,count dst <- dst rotated left count bits ror dst,count dst <- dst rotated right count bits 10/25/2011 2002-11 Hal Perkins & UW CSE J-20

Uses for Shifts and Rotates Can often be used to optimize multiplication and division by small constants If you re interested, look at Hacker s Delight by Henry Warren, A-W, 2003 Lots of very cool bit fiddling and other algorithms But be careful be sure semantics are OK There are additional instructions that shift and rotate double words, use a calculated shift amount instead of a constant, etc. 10/25/2011 2002-11 Hal Perkins & UW CSE J-21

Load Effective Address The unary & operator in C lea dst,src ; dst <- address of src dst must be a register Address of src includes any address arithmetic or indexing Useful to capture addresses for pointers, reference parameters, etc. Also useful for computing arithmetic expressions that match r1+scale*r2+const 10/25/2011 2002-11 Hal Perkins & UW CSE J-22

Control Flow - GOTO At this level, all we have is goto and conditional goto Loops and conditional statements are synthesized from these Optimization note: random jumps play havoc with pipeline efficiency; much work is done in modern compilers and processors to minimize this impact 10/25/2011 2002-11 Hal Perkins & UW CSE J-23

Unconditional Jumps jmp dst eip <- address of dst Assembly language notes: dst will be a label Can have multiple labels on separate lines preceding an instruction Convenient in compiler-generated asm lang. 10/25/2011 2002-11 Hal Perkins & UW CSE J-24

Conditional Jumps Most arithmetic instructions set condition code bits in eflags to record information about the result (zero, non-zero, positive, etc.) True of add, sub, and, or; but not imul or idiv Other instructions that set eflags cmp dst,src test dst,src ; compare dst to src ; calculate dst & src (logical ; and); doesn t change either 10/25/2011 2002-11 Hal Perkins & UW CSE J-25

Conditional Jumps Following Arithmetic Operations jz label ; jump if result == 0 jnz label ; jump if result!= 0 jg label ; jump if result > 0 jng label ; jump if result <= 0 jge label ; jump if result >= 0 jnge label ; jump if result < 0 jl label ; jump if result < 0 jnl label ; jump if result >= 0 jle label ; jump if result <= 0 jnle label ; jump if result > 0 Obviously, the assembler is providing multiple opcode mnemonics for several of these instructions 10/25/2011 2002-11 Hal Perkins & UW CSE J-26

Compare and Jump Conditionally Want: compare two operands and jump if a relationship holds between them Would like to do this jmp cond op1,op2,label but can t, because 3-address instructions can t be encoded in x86 10/25/2011 2002-11 Hal Perkins & UW CSE J-27

cmp and jcc Instead, use a 2-instruction sequence cmp jcc op1,op2 label where jcc is a conditional jump that is taken if the result of the comparison matches the condition cc 10/25/2011 2002-11 Hal Perkins & UW CSE J-28

Conditional Jumps Following Arithmetic Operations je label ; jump if op1 == op2 jne label ; jump if op1!= op2 jg label ; jump if op1 > op2 jng label ; jump if op1 <= op2 jge label ; jump if op1 >= op2 jnge label ; jump if op1 < op2 jl label ; jump if op1 < op2 jnl label ; jump if op1 >= op2 jle label ; jump if op1 <= op2 jnle label ; jump if op1 > op2 Again, the assembler is mapping more than one mnemonic to some machine instructions 10/25/2011 2002-11 Hal Perkins & UW CSE J-29

Function Call and Return The x86 instruction set itself only provides for transfer of control (jump) and return Stack is used to capture return address and recover it Everything else parameter passing, stack frame organization, register usage is a matter of convention and not defined by the hardware 10/25/2011 2002-11 Hal Perkins & UW CSE J-30

call and ret Instructions call label ret Push address of next instruction and jump esp <- esp 4; memory[esp] <- eip eip <- address of label Pop address from top of stack and jump eip <- memory[esp]; esp <- esp + 4 WARNING! The word on the top of the stack had better be an address, not some leftover data 10/25/2011 2002-11 Hal Perkins & UW CSE J-31

enter and leave Complex instructions for languages with nested procedures enter can be slow on current CPUs best avoided i.e., don t use it in your project leave is equivalent to mov esp,ebp pop ebp and is generated by many compilers. Fits in 1 byte, saves space. Not clear if it s any faster. 10/25/2011 2002-10 Hal Perkins & UW CSE J-32

Win 32 C Function Call Conventions Wintel code obeys the following conventions for C programs Note: calling conventions normally designed very early in the instruction set/ basic software design. Hard (e.g., basically impossible) to change later. C++ augments these conventions to include the this pointer We ll use these conventions in our code 10/25/2011 2002-11 Hal Perkins & UW CSE J-33

Win32 C Register Conventions These registers must be restored to their original values before a function returns, if they are altered during execution esp, ebp, ebx, esi, edi Traditional: push/pop from stack to save/restore A function may use the other registers (eax, ecx, edx) without having to save/restore A 32-bit function result is expected to be in eax when the function returns 10/25/2011 2002-11 Hal Perkins & UW CSE J-34

Call Site Caller is responsible for Pushing arguments on the stack from right to left (allows implementation of varargs) Execute call instruction Pop arguments from stack after return For us, this means add 4*(# arguments) to esp after the return, since everything is either a 32- bit variable (int, bool), or a reference (pointer) 10/25/2011 2002-11 Hal Perkins & UW CSE J-35

Call Example n = sumof(17,42) push 42 ; push args push 17 call sumof ; jump & ; push addr add esp,8 ; pop args mov [ebp+offset n ],eax ; store result 10/25/2011 2002-11 Hal Perkins & UW CSE J-36

Callee Called function must do the following Save registers if necessary Allocate stack frame for local variables Execute function body Ensure result of non-void function is in eax Restore any required registers if necessary Pop the stack frame Return to caller 10/25/2011 2002-11 Hal Perkins & UW CSE J-37

Win32 Function Prologue The code that needs to be executed before the statements in the body of the function are executed is referred to as the prologue For a Win32 function f, it looks like this: f: push ebp ; save old frame pointer mov ebp,esp ; new frame ptr is top of ; stack after arguments and sub ; return address are pushed esp, # bytes needed ; allocate stack frame (size ; must be multiple of 4) 10/25/2011 2002-11 Hal Perkins & UW CSE J-38

Win32 Function Epilogue The epilogue is the code that is executed for a return statement (or if execution falls off the bottom of a void function) For a Win32 function, it looks like this: mov eax, function result ; put result in eax if not already ; there (if non-void function) mov esp,ebp ; restore esp to old value ; before stack frame allocated pop ebp ; restore ebp to caller s value ret ; return to caller 10/25/2011 2002-11 Hal Perkins & UW CSE J-39

Example Function Source code int sumof(int x, int y) { int a, int b; a = x; b = a + y; return b; } 10/25/2011 2002-11 Hal Perkins & UW CSE J-40

Stack Frame for sumof int sumof(int x, int y) { int a, int b; a = x; b = a + y; return b; } 10/25/2011 2002-11 Hal Perkins & UW CSE J-41

Assembly Language Version ;; int sumof(int x, int y) { ;; int a, int b; sumof: push ebp ; prologue mov ebp,esp sub esp, 8 ;; a = x; mov eax,[ebp+8] mov [ebp-4],eax ;; b = a + y; mov eax,[ebp-4] add eax,[ebp+12] mov [ebp-8],eax ;; return b; mov eax,[ebp-8] mov esp,ebp pop ebp ret ;; } 10/25/2011 2002-11 Hal Perkins & UW CSE J-42

Coming Attractions Now that we ve got a basic idea of the x86 instruction set, we need to map language constructs to x86 Code Shape Then on to basic code generation and execution And later, optimizations 10/25/2011 2002-11 Hal Perkins & UW CSE J-43