Be a Binary Rockstar: An Introduction to Program Analysis with Binary Ninja

1 Be a Binary Rockstar: An Introduction to Program Analysis with Binary Ninja

2 Agenda
  Motivation
  Current State of Program Analysis
  Design Goals of Binja Program Analysis
  Building Tools

3 Motivation

4 Tooling - Concrete -> Symbolic
  Increase speed & effectiveness of RE / VR
  Make Program Analysis more accessible & useful

5 Foundations
  Need to understand code semantics
  Could be done directly on the assembly
  An Intermediate Language (IL) is needed

6 Why IL?
  Architecture Abstraction
  Smaller number of instructions

7 Easy to lift
  Simple flags calculation
  As close to native instructions as possible
  Typeless - types inferred later

8 Easy to read
  Intuitive to read
  Tree-based infix notation
  No register abstraction
  Flags calculation only when necessary
  Avoid excessive temporaries

9 [Figure: IL instruction set size, ranging between "Easier to analyze" and "Easier to lift"]

10 The Options

11 Existing Options for IL
  BAP
  VEX
  REIL
  LLVM
  IDA

12 BAP
  Tree-based :)
  Flags are explicit and inhibit readability :(
  Written in OCaml :(

13 add ebx, eax
   shl ebx, cl

    addr add %eax,%ebx
    t:u32 = REBX:u32
    REBX:u32 = REBX:u32 + REAX:u32
    RCF:bool = REBX:u32 < t:u32
    addr shl %cl,%ebx
    t1:u32 = REBX:u32 >> 0x20:u32 - (RECX:u32 & 0x1f:u32)
    RCF:bool = ((RECX:u32 & 0x1f:u32) = 0:u32) & RCF:bool | ~((RECX:u32 & 0x1f:u32) = 0:u32) & low:bool(t1:u32)

14 VEX
  Register names are abstracted :(
  Single assignment :(
  Over 1000 instructions! :(
  Yet they call it RISC-like
  Even Angr is planning a move away from it

15 subs R2, R2, #8

    t0 = GET:I32(16)
    t1 = 0x8:I32
    t3 = Sub32(t0,t1)
    PUT(16) = t3
    PUT(68) = 0x59FC8:I32

16 REIL
  Tiny instruction set
  Horrible readability
  Makes abstractions nearly impossible
  Flags are explicit and inhibit readability :(

17 test eax, eax

    STR R_EAX:32,, V_00:
    STR 0:1,, R_CF:
    AND V_00:32, ff:8, V_01:
    SHR V_01:8, 7:8, V_02:
    SHR V_01:8, 6:8, V_03:
    XOR V_02:8, V_03:8, V_04:
    SHR V_01:8, 5:8, V_05:
    SHR V_01:8, 4:8, V_06:
    XOR V_05:8, V_06:8, V_07:
    XOR V_04:8, V_07:8, V_08:
    SHR V_01:8, 3:8, V_09:
    SHR V_01:8, 2:8, V_10:
    XOR V_09:8, V_10:8, V_11:
    SHR V_01:8, 1:8, V_12:
    XOR V_12:8, V_01:8, V_13:
    XOR V_11:8, V_13:8, V_14:
    XOR V_08:8, V_14:8, V_15:
    AND V_15:8, 1:1, V_16:
    NOT V_16:1,, R_PF:
    STR 0:1,, R_AF:
    EQ V_00:32, 0:32, R_ZF:
    SHR V_00:32, 1f:32, V_17:
    AND 1:32, V_17:32, V_18:
    EQ 1:32, V_18:32, R_SF:
    STR 0:1,, R_OF:1

18 LLVM
  Easy to analyze and has great tools already available
  It's a compiler! Reversers want a decompiler.
  Cannot be the only goal

19 LLVM Challenges
  Hard to lift well from compiled binaries
  Designed for compiler output
  Expects type information in the instructions
  SSA form - assembly is not
  Stack in assembly looks like a structure, but structures lose many advantages of SSA

20 IDA?

21 Binary Ninja's Answer
  Binary Ninja Intermediate Language (BNIL)

22 IL Goals & Design

23 Why Another IL?
  Popular existing ILs for compiled binaries are not very human readable. They are extremely low level and verbose.
  Existing ILs are single stage. Heavyweight analysis must be performed to get anywhere close to decompiled output.
  Writing a lifter for a new architecture is usually very time consuming.

24 Binary Ninja IL
  Create a family of ILs with multiple stages of analysis
  Lowest level is close to assembly
  After analysis and transformations, higher levels are closer to decompiled output and would be much easier to translate to good LLVM code
  Analysis involved in each transformation is easy to understand, fast, and directly aids further analysis

25 IL Design Goals
  Human readable
  Computer understandable (SSA, 3AF, etc.)
  Plugin understandable
  Easy to lift native architectures
  Translation to other ILs such as LLVM

26 Human Readable
  Reads like pseudocode, even in lowest level form
  Flags are resolved into readable expressions

27 Low Level IL Example
    x86-64 Assembly            Low Level IL
    lea   rax, [0x201047]      rax = 0x
    lea   rdi, [0x201040]      rdi = 0x
    push  rbp                  push(rbp)
    sub   rax, rdi             rax = rax - rdi
    mov   rbp, rsp             rbp = rsp
    cmp   rax, 0xe
    ja    0x68d                if (rax u> 0xe) then 0x68d else 0x68b

28 Low Level IL Example
    MIPS Assembly              Low Level IL
    addiu $sp, $sp, -0x18      $sp = $sp - 0x18
    sw    $ra, 0x14($sp)       [$sp + 0x14].d = $ra
    lw    $a0, ($a1)           $a0 = [$a1].d
    jal   atoi                 call(atoi)
    nop
    sltiu $at, $v0, 0x20       $at = $v0 u< 0x20 ? 1 : 0
    beqz  $at, 0x4002d8        if ($at == 0) then 0x4002d8 else 0x
    nop

29 Computer Understandable
  Multiple IL forms
  Pick the right IL for the task at hand

30 IL Forms
  Lifted IL: ASM -> IL
  Low Level IL: Flags use resolved; SSA / 3AF
  Medium Level IL: Stack usage resolved; Type propagation; SSA / 3AF
  High Level IL: Calls in high level form; Expression folding; Like decompiled output

31 Plugin Understandable
  All IL forms directly accessible from API
  Analysis performed on IL also accessible by API

32 Easy to Lift
  Expression tree
  Designed for quick, modular lifter implementations
  Semantic flags ease the burden of describing flag effects during lifting
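The expression-tree idea can be illustrated with a small sketch. The class and function names below (Reg, Const, BinOp, SetReg, lift_sub) are hypothetical and are not part of the Binary Ninja API; the sketch only assumes that a lifter emits one tree per instruction and that the tree prints in infix form.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Reg:
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Const:
    value: int

    def __str__(self) -> str:
        return hex(self.value)


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"

    def __str__(self) -> str:
        # Infix rendering keeps the IL readable like pseudocode.
        return f"{self.left} {self.op} {self.right}"


Expr = Union[Reg, Const, BinOp]


@dataclass(frozen=True)
class SetReg:
    dest: Reg
    src: Expr

    def __str__(self) -> str:
        return f"{self.dest} = {self.src}"


def lift_sub(dest: str, src: str) -> SetReg:
    """Lift a two-operand `sub dest, src` into a single expression tree."""
    return SetReg(Reg(dest), BinOp("-", Reg(dest), Reg(src)))


il = lift_sub("rax", "rdi")
print(il)  # rax = rax - rdi
```

Because each instruction becomes one small tree, a lifter written this way stays modular: supporting a new instruction means adding one function that builds a tree, with no changes to the printer.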

33 Semantic Flags
  Architecture plugins define the set of flags and their semantic roles
  Instructions can define a set of flags they write
  Data flow analysis is performed to link flag uses to flag writes

34 Semantic Flags
  In most compiled code, flags are resolved to simple comparison expressions with no effort from the architecture plugin
  Special cases fall back to emitting concrete flag write expressions

35 Semantic Flags Example
  Writes to all ALU flags:
    sub.q{*}(rax, 0xe)
  Flag state representing unsigned greater than:
    if (u>) then else
  Folded expression describing use of flags:
    if (rax u> 0xe) then else
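A toy version of the folding step shown above might look like the following. This is a sketch under simplifying assumptions: it handles straight-line code only, whereas the slides describe a real data flow analysis, and the names FlagWrite, FlagCondUse and resolve_condition are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class FlagWrite:
    """An instruction that writes flags, e.g. sub.q{*}(rax, 0xe)."""
    op: str      # operation that produced the flags
    left: str    # operand values before the operation
    right: str


@dataclass
class FlagCondUse:
    """A branch on a semantic flag condition, e.g. if (u>)."""
    cond: str    # "u>", "u<", "==", ...


Instr = Union[FlagWrite, FlagCondUse]


def resolve_condition(instrs: List[Instr], use_index: int) -> Optional[str]:
    """Link a flag use to the nearest earlier flag write and fold the pair."""
    use = instrs[use_index]
    for instr in reversed(instrs[:use_index]):
        if isinstance(instr, FlagWrite):
            if instr.op in ("sub", "cmp"):
                # Subtraction-style flags fold into a plain comparison.
                return f"{instr.left} {use.cond} {instr.right}"
            # Special case: the caller would fall back to concrete flag writes.
            return None
    return None


code = [FlagWrite("sub", "rax", "0xe"), FlagCondUse("u>")]
print(resolve_condition(code, 1))  # rax u> 0xe
```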

36 Translating Upwards
  Semantic flags analysis gives Low Level IL with flag usage fully resolved
  Stack is represented as memory accesses, so data flow can be difficult to compute on stack variables in Low Level IL
  Need to analyze and translate to Medium Level IL

37 Low Level IL to Medium Level IL
  Low Level IL is translated to SSA form
  Use implicit data flow from SSA to resolve stack layout
  Data flow based stack layout resolution avoids problems with nonstandard frame pointer behavior
  Translate loads and stores on stack to stack variable uses and assignments

38 Medium Level IL Example
    push(ebp)
    ebp = esp
    esp = esp - 0x18
    eax = [ebp + 8].d
    [esp].d = eax
    call(free)
    esp = ebp
    ebp = pop
    <return> jump(pop)

  Medium Level IL
    var_4 = ebp
    eax = arg_4
    var_1c = eax
    free(var_1c)
    ebp = var_4
    return
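The first lines of the example above can be reproduced with a small sketch that tracks each register's offset from the stack pointer at function entry. This is not Binary Ninja's implementation; the tuple encoding of instructions, the 4-byte word size, and the naming scheme in name_for are assumptions made for illustration.

```python
WORD = 4  # 32-bit x86 in this sketch


def name_for(offset: int) -> str:
    """Name a stack slot by its offset from the entry stack pointer."""
    if offset == 0:
        return "return_addr"  # hypothetical name for the return address slot
    return f"arg_{offset:x}" if offset > 0 else f"var_{-offset:x}"


def resolve_stack(instrs):
    offsets = {"esp": 0}  # register -> offset from the entry stack pointer
    out = []
    for ins in instrs:
        kind = ins[0]
        if kind == "push":
            offsets["esp"] -= WORD
            out.append(f"{name_for(offsets['esp'])} = {ins[1]}")
        elif kind == "mov":
            _, dst, src = ins
            if src in offsets:
                offsets[dst] = offsets[src]  # pure stack bookkeeping
            else:
                offsets.pop(dst, None)
                out.append(f"{dst} = {src}")
        elif kind == "sub":
            _, reg, imm = ins
            if reg in offsets:
                offsets[reg] -= imm
        elif kind == "load":
            _, dst, base, disp = ins
            out.append(f"{dst} = {name_for(offsets[base] + disp)}")
            offsets.pop(dst, None)  # dst no longer tracks the stack
        elif kind == "store":
            _, base, disp, src = ins
            out.append(f"{name_for(offsets[base] + disp)} = {src}")
    return out


prologue = [
    ("push", "ebp"),
    ("mov", "ebp", "esp"),
    ("sub", "esp", 0x18),
    ("load", "eax", "ebp", 8),
    ("store", "esp", 0, "eax"),
]
for line in resolve_stack(prologue):
    print(line)
```

Tracking offsets by data flow instead of assuming that ebp is always the frame pointer is what lets this approach tolerate nonstandard frame pointer behavior.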

39 Medium Level IL
  Registers and stack usage are now both treated as variables
  Stack variables no longer use explicit memory access
  Translate to SSA form to obtain implicit data flow on both registers and stack variables
  Type propagation is performed on SSA form

40 Using Medium Level IL - Jump Tables

41 Using Medium Level IL - Jump Tables
  Jump table resolution based on path-sensitive data flow
  SSA conversion process also tracks control flow dependence for every block
  Data flow computations allow disjoint sets of possible values
  Reads from memory are simulated
  At jump site, possible values are the possible jump targets

42 Jump Table Example (Medium Level IL SSA Form)
    x8#1 = zx.q(x0#2.d)
    if (x0#2.d u> 0x1f) then else
    x8#2 = sx.q([table + (x8#1 << 2)].d)
    x8#3 = x8#2 + table
    jump(x8#3)        <- Solve for this to get jump targets

43 Jump Table Example: Track flow backwards with SSA to find definitions

44 Jump Table Example: Memory read depends on value of x8#1

45 Jump Table Example: Value used in branch comparison

46 Jump Table Example: Branch condition must be false to reach jump site

47 Jump Table Example: When false, we know that x0#2.d is between 0 and 0x1f inclusive

48 Jump Table Example: Resolve forward to obtain possible jump targets. The set of possible values at the jump site is the set of jump targets
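The forward resolution step can be sketched as follows. The table address, its contents, and the helper names are made up for illustration, and a four-entry table stands in for the 0x20 entries implied by the slides. The sketch assumes the index bound has already been recovered from the false branch condition.

```python
import struct

TABLE = 0x1000                       # hypothetical table address
OFFSETS = [0x40, 0x58, -0x20, 0x40]  # hypothetical signed 32-bit entries
TABLE_BYTES = struct.pack("<4i", *OFFSETS)


def read_i32(addr: int) -> int:
    """Simulated memory read: signed 32-bit little-endian load from the table."""
    return struct.unpack_from("<i", TABLE_BYTES, addr - TABLE)[0]


def resolve_jump_table(table_base: int, read, max_index: int) -> set:
    """Enumerate targets of: jump(table + sx.q([table + (i << 2)].d))."""
    targets = set()
    # The branch condition is false on this path, so 0 <= i <= max_index.
    for i in range(max_index + 1):
        entry = read(table_base + (i << 2))
        targets.add((table_base + entry) & 0xFFFFFFFFFFFFFFFF)
    return targets


targets = resolve_jump_table(TABLE, read_i32, 3)
print(sorted(hex(t) for t in targets))
```

Nothing in the loop is architecture-specific, which matches the slides' claim that one algorithm covers every architecture once the IL is in SSA form.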

49 Using Medium Level IL - Jump Tables
  More complex idioms need to combine multiple sources of information
  Value through SSA ϕ-functions is the set union of the inputs
  Value of a specific SSA variable is the set intersection of the information found in the definition and all uses of the variable
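The union and intersection rules above can be written down directly with Python sets. This is a minimal sketch with explicit finite sets; the function names phi and refine and the example values are invented, and a real analysis would use a more compact value-set representation.

```python
def phi(*inputs):
    """Value through an SSA phi-function: the union of the incoming value sets."""
    result = set()
    for values in inputs:
        result |= set(values)
    return result


def refine(definition, *use_constraints):
    """Value of one SSA variable: its definition intersected with every use constraint."""
    result = set(definition)
    for constraint in use_constraints:
        result &= set(constraint)
    return result


x1 = {0, 1, 2}      # value arriving from one predecessor
x2 = {8, 9}         # value arriving from another predecessor
x3 = phi(x1, x2)    # x#3 = phi(x#1, x#2)

# On a path guarded by "x#3 u< 5", the branch contributes the constraint 0..4.
x3_on_path = refine(x3, range(0, 5))
print(sorted(x3), sorted(x3_on_path))
```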

50 Leveraging the Jump Table Algorithm
  Single jump table algorithm works on all architectures with no additional effort from architecture plugin
  Control flow dependence information accessible from API
  Queries for set of possible values accessible from API

51 The Final Forms
  Medium Level IL has the type information, stack knowledge, and SSA form to translate easily into LLVM IR
  Further analysis can be performed to translate to High Level IL, the Binary Ninja IL that will be used to create its decompiler
  All aspects of every IL form are plugin accessible, so translating to other representations is straightforward

52 Binary Ninja for Profit

53 Binja API
  Python, C and C++ APIs (headless)
  Branches: basic block / function edges (incoming & outgoing)
  Get the register states, some naive range analysis
  api.binary.ninja/search.html

54 binja_memcpy.py: IL /bin/bash

55 binja_memcpy.py: IL /bin/bash

56 binja_memcpy.py: API

57 binja_memcpy.py: API

58 binja_memcpy.py: API

59 binja_memcpy.py: API

60 binja_memcpy.py: Output

61 SSA: Uninitialized variable
    for func in bv.functions:
        for block in func.medium_level_il.ssa_form:
            for instr in block:
                visit_instr(instr)

62 SSA: Uninitialized variable
    # Attribute names are kept as shown on the slides.
    def visit_instr(instr):
        # Read of variable
        if instr.operation == MLIL_VAR_SSA:
            # Not written (SSA index 0 means no definition reaches this read)
            if instr.index == 0:
                if instr.src.type == StackVariableSourceType:
                    # Local variables live at negative stack offsets
                    if instr.src.identifier < 0:
                        print("Uninitialized stack variable reference at " + hex(instr.address))

63 SSA: Uninitialized variable
        else:
            # Recurse into sub-expressions
            for op in instr.operands:
                if isinstance(op, MediumLevelILInstruction):
                    visit_instr(op)
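The same check can be tried without Binary Ninja on a toy expression tree. The VarSSA and Node classes below are stand-ins invented for this sketch and do not mirror the real API. The only idea carried over from the slides is that an SSA version of 0 on a local stack variable means no definition reaches the read.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VarSSA:
    name: str
    version: int       # SSA version; 0 means "never written on this path"
    stack_offset: int  # negative for locals in this sketch


@dataclass
class Node:
    address: int
    operands: List[object] = field(default_factory=list)


def find_uninitialized(node: Node, hits=None):
    """Recursively collect reads of version-0 local stack variables."""
    hits = [] if hits is None else hits
    for op in node.operands:
        if isinstance(op, VarSSA):
            if op.version == 0 and op.stack_offset < 0:
                hits.append((node.address, op.name))
        elif isinstance(op, Node):
            find_uninitialized(op, hits)
    return hits


tree = Node(0x401000, [
    VarSSA("var_8", 0, -8),            # local read before any write
    Node(0x401000, [
        VarSSA("var_c", 1, -0xC),      # local with a reaching definition
        VarSSA("arg_4", 0, 4),         # argument: version 0 is expected
    ]),
])
for address, name in find_uninitialized(tree):
    print("Uninitialized stack variable reference at " + hex(address), name)
```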

64 Symbolic Execution
  Very accurate
  Takes time, data, and memory, often not feasible
  IDEA! Reason only about what we care about.
  Apply complex data to abstract domains!
  Domains: type, sign, range, color, etc.

65 Abstract Interpretation
  Sets of concrete values are abstracted imprecisely
  Galois Connection formalizes Concrete <-> Abstract

66 Abstract Interpretation
    int x;
    int[] a = new int[10];
    a[2 * x] = 3;
  x's value is imprecise
  Compilers perform imprecise abstraction
  1. Add precision, i.e. declare abstract value [0, 9]
  2. Symbolically execute with abstract domain / values
  Requires control-flow analysis
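A range (interval) domain makes the two steps above concrete. The slide does not say which quantity the abstract value [0, 9] is attached to. The sketch below assumes it describes x, and the Interval class is a hypothetical helper, not something from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: int
    hi: int

    def scale(self, k: int) -> "Interval":
        """Abstract multiplication by a constant."""
        a, b = self.lo * k, self.hi * k
        return Interval(min(a, b), max(a, b))

    def within(self, other: "Interval") -> bool:
        """True if every value in self also lies in other."""
        return other.lo <= self.lo and self.hi <= other.hi


x = Interval(0, 9)       # step 1: declared abstract value for x (assumption)
index = x.scale(2)       # step 2: abstractly execute the index expression 2 * x
bounds = Interval(0, 9)  # valid indices of an int[10]

if not index.within(bounds):
    print(f"a[2 * x] may be out of bounds: index in [{index.lo}, {index.hi}]")
```

Under that assumption the abstract index is [0, 18], so the analysis reports a possible out-of-bounds write without enumerating any concrete value of x.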

67 Abstract Domains & Sign Analysis
    int a, b, c;
    a = 42;
    b = 87;
    if (input) {
        c = a + b;
    } else {
        c = a - b;
    }
  Map variables to an abstract value
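The program above can be run through a tiny sign domain. This is a sketch of the general technique, not the plugin from the talk. The four abstract values and the helper names are chosen for illustration, and the lattice is simplified by omitting a bottom element.

```python
POS, NEG, ZERO, TOP = "+", "-", "0", "T"  # TOP means "sign unknown"


def sign_of(n: int) -> str:
    """Abstraction of a concrete integer."""
    return POS if n > 0 else NEG if n < 0 else ZERO


def negate(a: str) -> str:
    return {POS: NEG, NEG: POS}.get(a, a)


def add(a: str, b: str) -> str:
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else TOP  # mixed signs (or TOP) give an unknown sign


def sub(a: str, b: str) -> str:
    return add(a, negate(b))


def join(a: str, b: str) -> str:
    """Merge abstract values where control flow paths meet."""
    return a if a == b else TOP


state = {"a": sign_of(42), "b": sign_of(87)}
c_then = add(state["a"], state["b"])  # c = a + b
c_else = sub(state["a"], state["b"])  # c = a - b
state["c"] = join(c_then, c_else)
print(state)
```

Concretely a - b is -45, but the sign domain only knows that it subtracts a positive from a positive, so the else branch yields TOP. That loss is the imprecision that abstract interpretation trades for speed.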

68 Abstract Domains & Sign Analysis
  Binary Ninja plugin
  Under-approximate
  Path sensitive - construct lattices of abstract values
  One abstract state per CFG node
  Avoid loss in precision for fractions.

69 Demo!
  Analyze example program
  PHP CVE

70 UAF Analysis: PointsTo for Binja IL
  Before: Allocation -> Write
  UAF Analysis: Allocation -> Free -> Use
  Key Idea: Data flow graph, assignments, copies, dereferences, and frees of pointers
  Context and path sensitive (path API == soon!)
  blog.trailofbits.com/2016/03/09/the-problem-with-dynamic-program-analysis/
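The Allocation -> Free -> Use pattern can be sketched over a flat list of pointer events. This toy is neither context nor path sensitive and is not the analysis from the talk. The event encoding and the find_uaf helper are assumptions made for illustration.

```python
def find_uaf(events):
    """Report uses of pointers whose allocation has already been freed."""
    points_to = {}  # pointer variable -> allocation id
    freed = set()   # allocation ids that have been freed
    reports = []
    next_id = 0
    for index, event in enumerate(events):
        kind = event[0]
        if kind == "alloc":
            points_to[event[1]] = next_id
            next_id += 1
        elif kind == "copy":
            _, dst, src = event
            if src in points_to:
                points_to[dst] = points_to[src]  # dst now aliases src
            else:
                points_to.pop(dst, None)
        elif kind == "free":
            if event[1] in points_to:
                freed.add(points_to[event[1]])
        elif kind == "use":
            if points_to.get(event[1]) in freed:
                reports.append((index, event[1]))
    return reports


trace = [
    ("alloc", "p"),
    ("copy", "q", "p"),  # q aliases p
    ("free", "p"),
    ("use", "q"),        # use after free through the alias
]
print(find_uaf(trace))
```

Tracking allocation ids instead of variable names is what catches the use through the alias q, which is the reason the slide lists copies and assignments as part of the key idea.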

71 Devirtualizing C++
  VTable Function Call Example: mov eax, [ecx]; call [eax + 4]
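When the object's contents are known, the two-instruction pattern above resolves to a concrete target with two memory reads. The addresses and the dictionary-based memory model below are hypothetical, and resolve_virtual_call is a name invented for this sketch.

```python
def resolve_virtual_call(memory: dict, this_ptr: int, slot_offset: int) -> int:
    """Resolve `mov eax, [ecx]; call [eax + slot_offset]` to a function address."""
    vtable = memory[this_ptr]            # mov eax, [ecx]: load the vtable pointer
    return memory[vtable + slot_offset]  # call [eax + 4]: load the slot's target


# Hypothetical 32-bit memory image: address -> dword value.
memory = {
    0x5000: 0x8000,    # object: first dword is the vtable pointer
    0x8000: 0x401000,  # vtable slot 0
    0x8004: 0x401020,  # vtable slot 1
}

target = resolve_virtual_call(memory, this_ptr=0x5000, slot_offset=4)
print(hex(target))
```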

72 Devirtualizing C++

73 Devirtualizing C++

74 Playing with Scripts!
  memcpy, headless Python API script
  Depth-first-search, path sensitive CFG template
  trailofbits/binjascripts
  Sign analysis, abstract domain plugin, CFG traversal script
  And much much more.

75 Conclusion: Resources
  binary.ninja/
  Abstract Interpretation talk: santos.cs.ksu.edu/schmidt/escuela03/wssa/talk1p.pdf
  Static Program Analysis Book! cs.au.dk/~amoeller/spa/spa.pdf

76 Conclusion: Binary Ninja

77 Contact Us
  Sophia d'Antoine - sophia@trailofbits.com
  Peter LaFosse, Rusty Wagner - binaryninja@vector35.com


More information

Representation of Information

Representation of Information Representation of Information CS61, Lecture 2 Prof. Stephen Chong September 6, 2011 Announcements Assignment 1 released Posted on http://cs61.seas.harvard.edu/ Due one week from today, Tuesday 13 Sept

More information

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions? administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions? exam on Wednesday today s material not on the exam 1 Assembly Assembly is programming

More information

Chapter 2. lw $s1,100($s2) $s1 = Memory[$s2+100] sw $s1,100($s2) Memory[$s2+100] = $s1

Chapter 2. lw $s1,100($s2) $s1 = Memory[$s2+100] sw $s1,100($s2) Memory[$s2+100] = $s1 Chapter 2 1 MIPS Instructions Instruction Meaning add $s1,$s2,$s3 $s1 = $s2 + $s3 sub $s1,$s2,$s3 $s1 = $s2 $s3 addi $s1,$s2,4 $s1 = $s2 + 4 ori $s1,$s2,4 $s2 = $s2 4 lw $s1,100($s2) $s1 = Memory[$s2+100]

More information

CS241 Computer Organization Spring 2015 IA

CS241 Computer Organization Spring 2015 IA CS241 Computer Organization Spring 2015 IA-32 2-10 2015 Outline! Review HW#3 and Quiz#1! More on Assembly (IA32) move instruction (mov) memory address computation arithmetic & logic instructions (add,

More information

C++ to assembly to machine code

C++ to assembly to machine code IBCM 1 C++ to assembly to machine code hello.cpp #include int main() { std::cout

More information

Unoptimized Code Generation

Unoptimized Code Generation Unoptimized Code Generation Last time we left off on the procedure abstraction Saman Amarasinghe 2 6.035 MIT Fall 1998 The Stack Arguments 0 to 6 are in: %b %rbp %rsp %rdi, %rsi, %rdx, %rcx, %r8 and %r9

More information

Overview of Compiler. A. Introduction

Overview of Compiler. A. Introduction CMPSC 470 Lecture 01 Topics: Overview of compiler Compiling process Structure of compiler Programming language basics Overview of Compiler A. Introduction What is compiler? What is interpreter? A very

More information

Intermediate representation

Intermediate representation Intermediate representation Goals: encode knowledge about the program facilitate analysis facilitate retargeting facilitate optimization scanning parsing HIR semantic analysis HIR intermediate code gen.

More information

Ramblr. Making Reassembly Great Again

Ramblr. Making Reassembly Great Again Ramblr Making Reassembly Great Again Ruoyu Fish Wang, Yan Shoshitaishvili, Antonio Bianchi, Aravind Machiry, John Grosen, Paul Grosen, Christopher Kruegel, Giovanni Vigna Motivation Available Solutions

More information

Lectures 3-4: MIPS instructions

Lectures 3-4: MIPS instructions Lectures 3-4: MIPS instructions Motivation Learn how a processor s native language looks like Discover the most important software-hardware interface MIPS Microprocessor without Interlocked Pipeline Stages

More information

Digital Forensics Lecture 3 - Reverse Engineering

Digital Forensics Lecture 3 - Reverse Engineering Digital Forensics Lecture 3 - Reverse Engineering Low-Level Software Akbar S. Namin Texas Tech University Spring 2017 Reverse Engineering High-Level Software Low-level aspects of software are often the

More information

CSE 351 Midterm - Winter 2015

CSE 351 Midterm - Winter 2015 CSE 351 Midterm - Winter 2015 February 09, 2015 Please read through the entire examination first! We designed this exam so that it can be completed in 50 minutes and, hopefully, this estimate will prove

More information

Code Generation. Lecture 19

Code Generation. Lecture 19 Code Generation Lecture 19 Lecture Outline Topic 1: Basic Code Generation The MIPS assembly language A simple source language Stack-machine implementation of the simple language Topic 2: Code Generation

More information

x86 assembly CS449 Fall 2017

x86 assembly CS449 Fall 2017 x86 assembly CS449 Fall 2017 x86 is a CISC CISC (Complex Instruction Set Computer) e.g. x86 Hundreds of (complex) instructions Only a handful of registers RISC (Reduced Instruction Set Computer) e.g. MIPS

More information

The plot thickens. Some MIPS instructions you can write cannot be translated to a 32-bit number

The plot thickens. Some MIPS instructions you can write cannot be translated to a 32-bit number The plot thickens Some MIPS instructions you can write cannot be translated to a 32-bit number some reasons why 1) constants are too big 2) relative addresses are too big 3) absolute addresses are outside

More information

CSCI 2121 Computer Organization and Assembly Language PRACTICE QUESTION BANK

CSCI 2121 Computer Organization and Assembly Language PRACTICE QUESTION BANK CSCI 2121 Computer Organization and Assembly Language PRACTICE QUESTION BANK Question 1: Choose the most appropriate answer 1. In which of the following gates the output is 1 if and only if all the inputs

More information

Instruction Set Architectures

Instruction Set Architectures Instruction Set Architectures ISAs Brief history of processors and architectures C, assembly, machine code Assembly basics: registers, operands, move instructions 1 What should the HW/SW interface contain?

More information

Module 3 Instruction Set Architecture (ISA)

Module 3 Instruction Set Architecture (ISA) Module 3 Instruction Set Architecture (ISA) I S A L E V E L E L E M E N T S O F I N S T R U C T I O N S I N S T R U C T I O N S T Y P E S N U M B E R O F A D D R E S S E S R E G I S T E R S T Y P E S O

More information

Instruction Set Architectures

Instruction Set Architectures Instruction Set Architectures Computer Systems: Section 4.1 Suppose you built a computer What Building Blocks would you use? Arithmetic Logic Unit (ALU) OP1 OP2 OPERATION ALU RES ALU + Registers R0: 0x0000

More information

Assignment 11: functions, calling conventions, and the stack

Assignment 11: functions, calling conventions, and the stack Assignment 11: functions, calling conventions, and the stack ECEN 4553 & 5013, CSCI 4555 & 5525 Prof. Jeremy G. Siek December 5, 2008 The goal of this week s assignment is to remove function definitions

More information

Anne Bracy CS 3410 Computer Science Cornell University

Anne Bracy CS 3410 Computer Science Cornell University Anne Bracy CS 3410 Computer Science Cornell University The slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, McKee, and Sirer. See P&H 2.8 and 2.12, and

More information

Computer Systems Organization V Fall 2009

Computer Systems Organization V Fall 2009 Computer Systems Organization V22.0201 Fall 2009 Sample Midterm Exam ANSWERS 1. True/False. Circle the appropriate choice. (a) T (b) F At most one operand of an x86 assembly instruction can be an memory

More information

Intermediate Representations

Intermediate Representations Intermediate Representations Intermediate Representations (EaC Chaper 5) Source Code Front End IR Middle End IR Back End Target Code Front end - produces an intermediate representation (IR) Middle end

More information

Towards the Hardware"

Towards the Hardware CSC 2400: Computer Systems Towards the Hardware Chapter 2 Towards the Hardware High-level language (Java) High-level language (C) assembly language machine language (IA-32) 1 High-Level Language Make programming

More information

6.1. CS356 Unit 6. x86 Procedures Basic Stack Frames

6.1. CS356 Unit 6. x86 Procedures Basic Stack Frames 6.1 CS356 Unit 6 x86 Procedures Basic Stack Frames 6.2 Review of Program Counter (Instruc. Pointer) PC/IP is used to fetch an instruction PC/IP contains the address of the next instruction The value in

More information

Let s Break Modern Binary Code Obfuscation

Let s Break Modern Binary Code Obfuscation Let s Break Modern Binary Code Obfuscation 34 th Chaos Communication Congress, Leipzig December 27, 2017 Tim Blazytko @mr_phrazer http://synthesis.to Moritz Contag @dwuid https://dwuid.com Chair for Systems

More information

Machine-level Representation of Programs. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Machine-level Representation of Programs. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Machine-level Representation of Programs Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Program? 짬뽕라면 준비시간 :10 분, 조리시간 :10 분 재료라면 1개, 스프 1봉지, 오징어

More information

The plot thickens. Some MIPS instructions you can write cannot be translated to a 32-bit number

The plot thickens. Some MIPS instructions you can write cannot be translated to a 32-bit number The plot thickens Some MIPS instructions you can write cannot be translated to a 32-bit number some reasons why 1) constants are too big 2) relative addresses are too big 3) absolute addresses are outside

More information

UW CSE 351, Winter 2013 Midterm Exam

UW CSE 351, Winter 2013 Midterm Exam Full Name: Student ID: UW CSE 351, Winter 2013 Midterm Exam February 15, 2013 Instructions: Make sure that your exam is not missing any of the 9 pages, then write your full name and UW student ID on the

More information

CS 61C: Great Ideas in Computer Architecture. MIPS Instruction Formats

CS 61C: Great Ideas in Computer Architecture. MIPS Instruction Formats CS 61C: Great Ideas in Computer Architecture MIPS Instruction Formats Instructor: Justin Hsia 6/27/2012 Summer 2012 Lecture #7 1 Review of Last Lecture New registers: $a0-$a3, $v0-$v1, $ra, $sp Also: $at,

More information

Computer Architecture Instruction Set Architecture part 2. Mehran Rezaei

Computer Architecture Instruction Set Architecture part 2. Mehran Rezaei Computer Architecture Instruction Set Architecture part 2 Mehran Rezaei Review Execution Cycle Levels of Computer Languages Stored Program Computer/Instruction Execution Cycle SPIM, a MIPS Interpreter

More information

CS Bootcamp x86-64 Autumn 2015

CS Bootcamp x86-64 Autumn 2015 The x86-64 instruction set architecture (ISA) is used by most laptop and desktop processors. We will be embedding assembly into some of our C++ code to explore programming in assembly language. Depending

More information

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei

Instruction Set Architecture part 1 (Introduction) Mehran Rezaei Instruction Set Architecture part 1 (Introduction) Mehran Rezaei Overview Last Lecture s Review Execution Cycle Levels of Computer Languages Stored Program Computer/Instruction Execution Cycle SPIM, a

More information

Instruction Selection. Problems. DAG Tiling. Pentium ISA. Example Tiling CS412/CS413. Introduction to Compilers Tim Teitelbaum

Instruction Selection. Problems. DAG Tiling. Pentium ISA. Example Tiling CS412/CS413. Introduction to Compilers Tim Teitelbaum Instruction Selection CS42/CS43 Introduction to Compilers Tim Teitelbaum Lecture 32: More Instruction Selection 20 Apr 05. Translate low-level IR code into DAG representation 2. Then find a good tiling

More information

MACHINE-LEVEL PROGRAMMING I: BASICS COMPUTER ARCHITECTURE AND ORGANIZATION

MACHINE-LEVEL PROGRAMMING I: BASICS COMPUTER ARCHITECTURE AND ORGANIZATION MACHINE-LEVEL PROGRAMMING I: BASICS COMPUTER ARCHITECTURE AND ORGANIZATION Today: Machine Programming I: Basics History of Intel processors and architectures C, assembly, machine code Assembly Basics:

More information

6.035 Project 3: Unoptimized Code Generation. Jason Ansel MIT - CSAIL

6.035 Project 3: Unoptimized Code Generation. Jason Ansel MIT - CSAIL 6.035 Project 3: Unoptimized Code Generation Jason Ansel MIT - CSAIL Quiz Monday 50 minute quiz Monday Covers everything up to yesterdays lecture Lexical Analysis (REs, DFAs, NFAs) Syntax Analysis (CFGs,

More information

T Jarkko Turkulainen, F-Secure Corporation

T Jarkko Turkulainen, F-Secure Corporation T-110.6220 2010 Emulators and disassemblers Jarkko Turkulainen, F-Secure Corporation Agenda Disassemblers What is disassembly? What makes up an instruction? How disassemblers work Use of disassembly In

More information

Do You Trust a Mutated Binary? Drew Bernat Correct Relocation

Do You Trust a Mutated Binary? Drew Bernat Correct Relocation Correct Relocation: Do You Trust a Mutated Binary? Drew Bernat bernat@cs.wisc.edu April 30, 2007 Correct Relocation Binary Manipulation We want to: Insert new code Modify or delete code These operations

More information

Compiler construction. x86 architecture. This lecture. Lecture 6: Code generation for x86. x86: assembly for a real machine.

Compiler construction. x86 architecture. This lecture. Lecture 6: Code generation for x86. x86: assembly for a real machine. This lecture Compiler construction Lecture 6: Code generation for x86 Magnus Myreen Spring 2018 Chalmers University of Technology Gothenburg University x86 architecture s Some x86 instructions From LLVM

More information