Abstract Interpretation
- Andrea Janis Allen
1 Abstract Interpretation: Mathematical Program Checking
2 Overview
- High-level mathematical tools
- Originally conceived to help give a theoretical grounding to program analysis
- Useful for other kinds of analyses too
- Can also be viewed as a very general algorithmic tool for ensuring termination of algorithms
- Its discoverer (Cousot) is somewhat famous for defining every other analysis or algorithm as an instantiation of abstract interpretation
3 Goal
- Programs run, but sometimes they do the wrong thing
- Can we write a program and know before we run it if it will do the right thing?
- Can we try to answer this question for every possible execution of a program?
- Can we construct algorithms to help us efficiently answer these questions?
- The answer to all is: sometimes
4 Mathematical properties
- Concrete vs abstract interpretation of a program
- Concrete is easy
  - Write an interpreter
  - Supply concrete values for input / unbound variables
  - Evaluate the program
- Abstract is harder
  - Concrete semantics are obvious
  - What to do with abstract semantics?
  - How do you know when to stop executing?
5 Abstract interpretation
- Map program semantics to an abstract machine
- Evaluate those semantics to produce approximations of all possible program input states
- Now we can check all possible program states for errors
  - Perhaps
- For example, abstract integer variables can be represented by ranges instead of integers
- Abstract integer operations (negation, addition) operate on ranges
- Now we can prove whether or not a program will divide by 0
  - With some false positives
- Type checking is abstract interpretation
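The range idea on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the deck's implementation: the `Interval` class, `may_be_zero`, and `check_division` are hypothetical names, and only negation and addition are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract integer: stands for every concrete value in [lo, hi]."""
    lo: int
    hi: int

    def __neg__(self):
        # Negation mirrors the range around zero.
        return Interval(-self.hi, -self.lo)

    def __add__(self, other):
        # Addition adds the bounds pairwise.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def may_be_zero(self):
        return self.lo <= 0 <= self.hi


def check_division(divisor):
    """Classify a division by an abstract divisor."""
    return "possible divide by zero" if divisor.may_be_zero() else "safe"


x = Interval(1, 10)   # hypothetical input range
one = Interval(1, 1)

print(check_division(x + one))         # divisor lies in [2, 11]
print(check_division(x + (-x) + one))  # divisor lies in [-8, 10]
```

The second divisor is always 1 in any concrete run, but the ranges forget that both occurrences of `x` are the same value, so 0 falls inside the computed range. That is the kind of false positive the slide mentions.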
6 Abstract Interpretation to build tools
- Building tools on top of abstract interpretation (AI) has benefits
  - You can be sure that your tool's analysis will finish and produce some answer
  - You can be sure that this answer is sound and as accurate as can be expected
  - There are cases where analysis produces no good answers, of course
- The construction of the tool is a modular composition
  - Semantics
  - Data types
  - Standard algorithms
7 Program facts and lattices
- We want to represent that a variable could contain some combination of values
- Each value was produced previously in the program
- However, it could be any combination, including none: all values, some set of values, one value, or no values
- We can define a partial ordering between these sets (set inclusion)
- Program facts can be mapped onto lattices
- The utility of this will be apparent later
8 Example
    fun foo(a, b) {
      var j = a + 2;
      if (b % 2 == 0) {
        j = j + b + 1;
      }
      return j;
    }
[Figure: lattice whose nodes are sets of the expressions a + 2, b % 2, and j + b + 1]
- At each fact point in the program we assign a set variable that corresponds to one of the sets within the lattice
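One way to make the lattice of expression sets concrete is to enumerate it. The sketch below assumes the lattice is the powerset of the three expressions in `foo`, ordered by set inclusion, with the empty set as bottom and the full set as top; that orientation and the helper names (`powerset`, `leq`, `join`, `meet`) are choices made for illustration.

```python
from itertools import combinations

# The three expressions that appear in the example function.
EXPRS = ("a + 2", "b % 2", "j + b + 1")


def powerset(items):
    """Every subset of items, as frozensets: the elements of the lattice."""
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


LATTICE = powerset(EXPRS)
BOTTOM = frozenset()     # no expressions (assumed orientation)
TOP = frozenset(EXPRS)   # all expressions


def leq(x, y):
    """Partial order: x is below y when x is a subset of y."""
    return x <= y


def join(x, y):
    """Least upper bound: set union."""
    return x | y


def meet(x, y):
    """Greatest lower bound: set intersection."""
    return x & y


p = frozenset({"a + 2"})
q = frozenset({"b % 2"})
print(len(LATTICE))           # number of lattice elements
print(leq(p, q), leq(q, p))   # the two singletons are incomparable
print(sorted(join(p, q)))
print(meet(p, q) == BOTTOM)
```

The two singletons are incomparable, which is why the ordering is partial rather than total, yet every pair of elements still has a join and a meet.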
9 Data Flow Analysis
- DFA frameworks can construct facts about abstract program states
  - Well understood algorithms and approaches
  - Some implementations for binary programs (REIL)
- In DFA, analyses produce specific kinds of facts
  - Available expressions
  - Live variables
- Each fact has an abstraction into semantics
- Each fact's semantics can be applied algorithmically to the program to discover facts about the program
- At each step, we can compute a gen and a kill set for facts
  - gen: fact generated
  - kill: fact destroyed
10 High-level DFA algorithm (backward may)
1. Initialize a work list of statements to all statements in the program
2. Set all known facts to empty
3. Iterate over the work list while the work list is not empty
4. Compute gen and kill sets for current statement
5. Are gen and kill for current statement identical to current known facts?
   - If no, update current facts with newly generated facts
   - Add current statement to work list
- An advantage of mapping facts onto lattices becomes apparent
- Our analysis algorithm must terminate; the worst case is we drive all facts to bottom
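The steps above can be sketched as a small worklist solver for live variables, a backward may analysis. This is an illustration under assumptions rather than the deck's algorithm verbatim: the `liveness` function and its input encoding are hypothetical, and when a statement's facts change the sketch re-queues that statement's predecessors, since those are the statements whose facts depend on it.

```python
def liveness(stmts, succs):
    """Compute the set of variables live on entry to each statement.

    stmts: dict mapping statement id -> (gen set, kill set)
    succs: dict mapping statement id -> list of successor ids
    """
    preds = {s: [] for s in stmts}
    for s, targets in succs.items():
        for t in targets:
            preds[t].append(s)

    live_in = {s: frozenset() for s in stmts}   # step 2: all facts empty
    worklist = list(stmts)                      # step 1: every statement

    while worklist:                             # step 3
        s = worklist.pop()
        gen, kill = stmts[s]                    # step 4
        # May analysis: union the facts flowing back from all successors.
        live_out = frozenset().union(*(live_in[t] for t in succs.get(s, ())))
        new_in = gen | (live_out - kill)
        if new_in != live_in[s]:                # step 5
            live_in[s] = new_in
            worklist.extend(preds[s])
    return live_in


# The function foo from the earlier example, one entry per statement.
STMTS = {
    1: (frozenset({"a"}), frozenset({"j"})),        # j = a + 2
    2: (frozenset({"b"}), frozenset()),             # if (b % 2 == 0)
    3: (frozenset({"j", "b"}), frozenset({"j"})),   # j = j + b + 1
    4: (frozenset({"j"}), frozenset()),             # return j
}
SUCCS = {1: [2], 2: [3, 4], 3: [4], 4: []}

for stmt, live in sorted(liveness(STMTS, SUCCS).items()):
    print(stmt, sorted(live))
```

Facts only ever grow, and the set of variables is finite, so each statement can change a bounded number of times. That is the termination argument the lattice gives us.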
11 Example: Lua bytecode
- Lua has a bytecode that the language is compiled to
- The Lua bytecode interpreter decodes and dispatches
- Some example bytecode semantics
  - R(1) := 4
    Assign the constant 4 to register 1
  - R(1) := R(2) + R(0)
    Read from register 2 and register 0, add those values, and store the result in register 1
  - R(0) := R(1)
    Read from register 1, assign result to register 0
12 Abstract semantics (gen/kill) for liveness
- In OCaml
    (* [dst, src] *)
    Move (a, b) ->
      (* gen b *)
      let gens = insert b bvs in
      (* kill a *)
      let kills = insert a bvs in
      (gens, kills)
- After a statement with a move, the source of the move is still live, having been read, but the destination of the move is dead, because it was not read
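The same gen/kill idea can be written out for all three example bytecode forms. The Python below is a sketch under assumptions: the instruction classes `LoadK`, `Add`, and `Move` and the helper `live_before_each` are hypothetical names, and the program is treated as straight-line code with register 0 assumed live at exit.

```python
from typing import NamedTuple


class LoadK(NamedTuple):   # R(dst) := constant
    dst: int
    value: int


class Add(NamedTuple):     # R(dst) := R(lhs) + R(rhs)
    dst: int
    lhs: int
    rhs: int


class Move(NamedTuple):    # R(dst) := R(src)
    dst: int
    src: int


def gen_kill(insn):
    """Return (gen, kill): registers read and registers overwritten."""
    if isinstance(insn, LoadK):
        return frozenset(), frozenset({insn.dst})
    if isinstance(insn, Add):
        return frozenset({insn.lhs, insn.rhs}), frozenset({insn.dst})
    if isinstance(insn, Move):
        return frozenset({insn.src}), frozenset({insn.dst})
    raise TypeError(f"unknown instruction: {insn!r}")


def live_before_each(program, live_at_exit):
    """Walk the program backwards, returning the live set before each instruction."""
    live = frozenset(live_at_exit)
    result = []
    for insn in reversed(program):
        gen, kill = gen_kill(insn)
        live = gen | (live - kill)
        result.append(live)
    result.reverse()
    return result


PROGRAM = [LoadK(1, 4), Add(1, 2, 0), Move(0, 1)]
for insn, live in zip(PROGRAM, live_before_each(PROGRAM, {0})):
    print(insn, sorted(live))
```

Register 1 is not live immediately after the first instruction, because the add overwrites it before anything reads it. A liveness client could use that to flag the constant load as a dead store.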
13 DFA frameworks on binary code
- BinNavi has a DFA framework that can be applied to x86 programs
- Internally it defines core data structures
  - Users define facts in a lattice
  - Users provide gen/kill transfer functions
  - Framework iteratively applies user-supplied transfer functions until lattice facts converge
- Allows you to do neat things like liveness analysis on registers
- In practice, without a mature view of memory, this is limited
- Developing a mature view of memory is an open research problem
14 Generic DFA shortcomings
- Generally, DFA has been considered to operate on source languages that have concepts of variables
- When your only variables are registers, but you have memory locations to load/store, this is a problem
- Generic DFA considers load/store as killing all statements in the program, sending them to top
- An alternative requires a whole-program points-to analysis, and this is hard
- Extraction of variables from programs aims to resolve this somewhat
- Programs that make heavy use of memory are still problematic for traditional DFA on source code
15 Creating an Intermediate Representation: Principled Program Analysis
16 Goals
- As we've seen before, analysis can be principled when the actions of a program are defined as operations on an abstract or virtual machine
  - This is useful for both source and binary analysis
- Compiler intermediate languages refine input languages into an unambiguous syntax and semantics
- Compiler intermediate languages allow for optimizations that remove and simplify code
- Both of these are advantages to binary analysis
- There are precision tradeoffs, usually in how much of the abstract machine we define
- We want
  - An expressive IR
  - A virtual machine specification expressive enough to talk about programs we care about
  - A translator or frontend that produces our IR
17 Know the Domain
- What does our real CPU look like?
- x86 ABI
  - It's not the ABI we need, but the ABI we deserve
- Integer-valued general purpose registers
- Floating point
- Memory: a key-value store
  - Fixed-width integer keys mapping to variable-width integer values
  - Program code also stored in memory
  - Memory has page permissions
- Segment selectors
- Sometimes there are interrupts
18 Abstract Binary Domain
- This gets really complicated when we consider the full spectrum of x86 behavior
  - What about multi-threaded memory semantics w.r.t. the cache? Or HTM instructions?
  - What about self-modifying code?
- Pick some subsets of the domain and implement them first, and work incrementally
  - BAP has been under development for almost a decade, still has no floating point support
  - Some domains don't allow the expression of self-modifying code, through immutable code
- These are choices you make at the language and CPU abstraction level to decide how accurate your translations are
- Compromises admit some programs, disallow others
- If admitted programs are programs you are interested in, you win
19 Runtime and Operating System Domain
- Knowing stuff about the CPU is great, what about the OS?
- What are the semantics of system/library calls?
  - malloc returns a pointer to new memory
  - mprotect changes page permissions
- These platform semantics can be just as important as instruction semantics
- This kind of platform modeling is a problem when doing source code analysis
20 Language
- Languages define semantics to map to
- We create pure operators that define the bit-wise operations performed at the ISA level
  - xor, shift, sub, etc.
- The language operates on values defined within your CPU or abstract domain
- It also operates on memory locations defined in your CPU or abstract domain
- You can give properties to your language that make analysis easier or harder
  - SSA
  - Side-effect free
  - Exceptions or lack thereof
21 Translators
- Produce language from native instructions
- When abstractions really leak, they leak here first
- Translating basic operations like add and sub is easy
- What about instructions like iret, sysenter, or in/out?
  - This depends on the way you have defined the abstract machine
- In good designs, a translator just applies a mapping from native instructions into a language previously defined
- Usually, translators apply heuristics and hacks to make things make sense
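A translator in the "good design" sense is a mapping from native instructions to a previously defined language of pure operators. The sketch below invents a tiny IR for illustration: the micro-operation tuples, the temporary `t0`, and the functions `translate` and `evaluate` are all hypothetical, operands are restricted to registers, and only the zero flag is modeled even though a real x86 add updates several flags.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers


def translate(mnemonic, operands):
    """Map one native instruction to a list of (op, dst, sources) micro-operations."""
    if mnemonic == "mov":
        dst, src = operands
        return [("copy", dst, (src,))]
    if mnemonic in ("add", "sub", "xor"):
        dst, src = operands
        return [
            (mnemonic, "t0", (dst, src)),   # pure arithmetic into a temporary
            ("is_zero", "ZF", ("t0",)),     # the flag side effect made explicit
            ("copy", dst, ("t0",)),         # write the result back
        ]
    # Instructions such as iret or sysenter need machine state this toy lacks.
    raise NotImplementedError(f"no translation for {mnemonic}")


# Each IR operator is a pure function of its inputs.
PURE_OPS = {
    "copy": lambda a: a,
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "xor": lambda a, b: a ^ b,
    "is_zero": lambda a: int(a == 0),
}


def evaluate(ir, state):
    """Run micro-operations over a register-state dict, returning a new dict."""
    state = dict(state)
    for op, dst, srcs in ir:
        state[dst] = PURE_OPS[op](*(state[s] for s in srcs))
    return state


ir = translate("add", ("eax", "ebx"))
final = evaluate(ir, {"eax": 0xFFFFFFFF, "ebx": 1})
print(final["eax"], final["ZF"])
```

Making the flag update its own micro-operation is what lets later analyses treat every operator as side-effect free. The `NotImplementedError` branch marks where the abstract machine definition, not the translator, has to grow.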
22 Previous Intermediate Representations
- Compiler hackers and PL scientists approaching the binary analysis problem have had a lot of thoughts about this before
- Many existing tools for representing binary code as an intermediate language exist
- There are two big undertakings
  - Defining a virtual machine with a language that programs the virtual machine
  - Translating binary code into the language for this virtual machine
- Existing tools tend to do both
  - BAP
  - INSIGHT/BINCOA
  - REIL/RREIL
  - TSL/MLTK
23 BAP BIL
- Dawn Song and David Brumley, University of California, Berkeley and Carnegie Mellon University
- OCaml
- Translates native code into two different types of IL
  - One is SSA, the other is not
- Analysis and optimization passes written on top of BIL
- Symbolic execution system written on top of BIL
24 INSIGHT
- Emmanuel Fleury at LaBRI in Bordeaux
- C++
- Translates x86 and ARM into an intermediate representation
- On the backend, uses libopcodes and YACC to produce IR
- Performs control flow recovery using recursive descent
25 REIL/RREIL
- Zynamics / Google
- Bindings in Java and Python
- Produces variables
- Micro-instructions
- Uses IDA for CFG recovery
- Extended into RREIL by Simon
- RREIL produces 1-bit flag variables
26 TSL/MLTK
- TSL: Transformer Specification Language, Tom Reps, only described in a tech report
- MLTK: open source, Axel Simon, domain-specific languages to describe semantics of native code abstractly
- MLTK also includes a blueprint for decoding instructions
27 Static Single Assignment (SSA)
- A code representation strategy where there are no variables
- Each value produced by computation is a defined-once value
- Values are never re-defined
- Usually, values are unique per function
- SSA still permits a view of memory into which you can load and store
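For straight-line code, the defined-once property can be obtained by giving each assignment a fresh version of its destination. The sketch below is a minimal illustration, not a full SSA construction: `to_ssa`, the `(dst, op, args)` statement encoding, and the `name.version` naming scheme are assumptions, and control flow is deliberately left out.

```python
def to_ssa(stmts):
    """Rename straight-line (dst, op, args) statements so each name is defined once.

    Arguments are variable names (str) or integer literals. Version 0 of a
    name stands for its value on entry to the code, before any assignment.
    """
    version = {}
    out = []

    def use(arg):
        if isinstance(arg, int):
            return arg                          # literals need no renaming
        return f"{arg}.{version.get(arg, 0)}"   # latest definition of arg

    for dst, op, args in stmts:
        renamed = tuple(use(a) for a in args)   # read old versions first
        version[dst] = version.get(dst, 0) + 1  # then define a fresh value
        out.append((f"{dst}.{version[dst]}", op, renamed))
    return out


CODE = [
    ("j", "add", ("a", 2)),     # j = a + 2
    ("j", "add", ("j", "b")),   # j = j + b
    ("j", "add", ("j", 1)),     # j = j + 1
]
for stmt in to_ssa(CODE):
    print(stmt)
```

After renaming, the three assignments to `j` define three distinct values, so a use refers unambiguously to exactly one definition.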
28 SSA and Loops: Φ-nodes
- If values are defined-once, how do we represent variables in loops?
- We create an instruction called Φ
  - This instruction defines a value based on control flow
  - If the previous block was A, Φ produces the value that arrives from A
- SSA, with Φ-nodes, allows for some immediate and trivial analysis to determine loop-carried values
- SSA can also approximate data flow analysis
  - SSA value def-use chains can answer questions about live variables and available expressions
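How a Φ-node picks its value from the previous block can be shown with a toy interpreter. Everything here is a hypothetical encoding chosen for the sketch: each block is a `(phis, body, terminator)` triple, a Φ-node is a destination plus a map from predecessor label to source value, and the program sums the integers below `n`.

```python
import operator


def run(blocks, entry, inputs):
    """Interpret a tiny SSA program; blocks maps label -> (phis, body, terminator)."""
    env = dict(inputs)
    prev, label = None, entry
    while True:
        phis, body, term = blocks[label]
        # All phis read the values that arrive on the edge from `prev`.
        incoming = {dst: env[choices[prev]] for dst, choices in phis}
        env.update(incoming)
        for dst, fn, args in body:
            env[dst] = fn(*(env[a] for a in args))
        if term[0] == "ret":
            return env[term[1]]
        prev = label
        if term[0] == "jmp":
            label = term[1]
        else:  # ("br", condition, target_if_true, target_if_false)
            label = term[2] if env[term[1]] else term[3]


BLOCKS = {
    "entry": ([],
              [("i0", lambda: 0, ()), ("s0", lambda: 0, ()), ("one", lambda: 1, ())],
              ("jmp", "loop")),
    "loop": ([("i1", {"entry": "i0", "body": "i2"}),   # loop counter
              ("s1", {"entry": "s0", "body": "s2"})],  # running sum
             [("c", operator.lt, ("i1", "n"))],
             ("br", "c", "body", "exit")),
    "body": ([],
             [("s2", operator.add, ("s1", "i1")), ("i2", operator.add, ("i1", "one"))],
             ("jmp", "loop")),
    "exit": ([], [], ("ret", "s1")),
}

print(run(BLOCKS, "entry", {"n": 4}))  # sums 0 + 1 + 2 + 3
```

The loop-carried values are exactly the Φ destinations in the loop header that have an incoming entry from `body`, which is why finding them in SSA form needs no separate analysis.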
29 LLVM
- A compiler mid-end
- Allows parsers to produce LLVM assembly code from an AST
- Analyzes and optimizes this assembly code in a pluggable and configurable system
- Supports code generation of the assembly code to native machine code
- Lots of canned analysis algorithms that you can throw at code
  - Alias analysis
  - Loop information
  - Dominance and post-dominance frontiers
30 Representation Challenges
- A straight decoding will give an analysis some new information about machine code
  - Registers used
  - Control flow
  - Memory cells accessed
- This is one step closer to a de-compilation, but there are some pieces of information missing
  - Full control flow recovery is difficult in the general case
  - Promoting memory cell accesses into variables is difficult
    - Involves answering aliasing questions
- Your representation has some design choices
  - Faithful to an evaluation at the expense of readability
  - More expressive of inferred intent at the expense of soundness
31 Why be sound anyway?
- Soundness could be overrated; it depends on your goals
- What if your goal is program re-writing to remove bugs?
  - Then soundness is very good and a lack of it is very disturbing
  - You could re-write programs incorrectly
  - On a good day, this means that you don't remove a bug from a program when you want to
  - On a bad day, this means that the program you re-write no longer works
- Soundness errors in program reconstruction result in nonworking programs
32 What about bug identification?
- Very different
- An unsound program representation could tell you false things
  - It could also tell you true things
  - The true things it tells you could be information about vulnerabilities in the program
  - If what the representation tells you is a lie, though, this vulnerability is a false alarm
- What to do about false alarm vulnerabilities?
  - Can always test the faulting input and see if it's real
  - False positives are annoying, but can be verified or checked after the fact
  - Many static analysis systems for security permit false positives
- Worst case scenario
  - Abstraction lies about detail, creates false negatives, results in finding no bugs
33 Variable Recovery
- Some current best work
  - Anand, EUROSYS
  - Balakrishnan: Value-Set Analysis (VSA)
- Linear Stack Accesses
  - Implemented in IDA/HexRays
  - Doesn't always work totally right
- Some inherent problems
  - Array accesses
  - Compilers can re-use stack locations
34 Type Inference and Recovery
- Closely related to variable recovery is type recovery
- Area of active research interest
- Some cases seem easy
  - Common idiom: moving a low-valued, non-zero integer into a register means that register has type integer
  - Testing a register against 0 could mean that register is a pointer
- Some actually are easy
  - If a register is used in a memory expression alone, it is a pointer type
  - What about the combination of two registers? Which is the base and which is the offset?
- Nobody has a good system that works all the time on this
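The idioms listed on this slide can be written down as a small hint collector. This is only a sketch of the heuristics as stated, with invented details: the instruction tuples, the `infer_hints` function, the hint labels, and the `SMALL` threshold for a "low-valued" integer are all assumptions, and no real tool's behavior is implied.

```python
from collections import defaultdict

SMALL = 0x1000  # assumed cutoff for a "low-valued" immediate


def infer_hints(insns):
    """Collect type hints per register from a list of simplified instructions.

    Recognized forms (hypothetical encoding):
      ("mov_imm", reg, value)   move an immediate into a register
      ("cmp_zero", reg)         test a register against 0
      ("deref", (reg, ...))     registers used to form a memory address
    """
    hints = defaultdict(set)
    for insn in insns:
        kind = insn[0]
        if kind == "mov_imm":
            _, reg, value = insn
            if 0 < value < SMALL:
                hints[reg].add("integer")           # seems easy
        elif kind == "cmp_zero":
            hints[insn[1]].add("maybe-pointer")     # could be a null check
        elif kind == "deref":
            regs = insn[1]
            if len(regs) == 1:
                hints[regs[0]].add("pointer")       # actually easy
            else:
                for reg in regs:                    # base or offset? unknown
                    hints[reg].add("pointer-or-offset")
    return dict(hints)


TRACE = [
    ("mov_imm", "ecx", 4),
    ("cmp_zero", "eax"),
    ("deref", ("esi",)),
    ("deref", ("ebx", "edx")),
]
for reg, found in sorted(infer_hints(TRACE).items()):
    print(reg, sorted(found))
```

The two-register case is left as an unresolved hint on purpose, mirroring the slide's point that deciding which register is the base and which is the offset has no reliable general answer.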
35 Some Demos
More informationHeap Management. Heap Allocation
Heap Management Heap Allocation A very flexible storage allocation mechanism is heap allocation. Any number of data objects can be allocated and freed in a memory pool, called a heap. Heap allocation is
More informationSemantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End
Outline Semantic Analysis The role of semantic analysis in a compiler A laundry list of tasks Scope Static vs. Dynamic scoping Implementation: symbol tables Types Static analyses that detect type errors
More informationLECTURE 3. Compiler Phases
LECTURE 3 Compiler Phases COMPILER PHASES Compilation of a program proceeds through a fixed series of phases. Each phase uses an (intermediate) form of the program produced by an earlier phase. Subsequent
More informationDeallocation Mechanisms. User-controlled Deallocation. Automatic Garbage Collection
Deallocation Mechanisms User-controlled Deallocation Allocating heap space is fairly easy. But how do we deallocate heap memory no longer in use? Sometimes we may never need to deallocate! If heaps objects
More informationEfficient JIT to 32-bit Arches
Efficient JIT to 32-bit Arches Jiong Wang Linux Plumbers Conference Vancouver, Nov, 2018 1 Background ISA specification and impact on JIT compiler Default code-gen use 64-bit register, ALU64, JMP64 test_l4lb_noinline.c
More informationAdministration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator
CS 412/413 Introduction to Compilers and Translators Andrew Myers Cornell University Administration Design reports due Friday Current demo schedule on web page send mail with preferred times if you haven
More informationParsing III. CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones
Parsing III (Top-down parsing: recursive descent & LL(1) ) (Bottom-up parsing) CS434 Lecture 8 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper,
More informationWhere We Are. Lexical Analysis. Syntax Analysis. IR Generation. IR Optimization. Code Generation. Machine Code. Optimization.
Where We Are Source Code Lexical Analysis Syntax Analysis Semantic Analysis IR Generation IR Optimization Code Generation Optimization Machine Code Where We Are Source Code Lexical Analysis Syntax Analysis
More informationAdvanced C Programming
Advanced C Programming Compilers Sebastian Hack hack@cs.uni-sb.de Christoph Weidenbach weidenbach@mpi-inf.mpg.de 20.01.2009 saarland university computer science 1 Contents Overview Optimizations Program
More informationECE 498 Linux Assembly Language Lecture 1
ECE 498 Linux Assembly Language Lecture 1 Vince Weaver http://www.eece.maine.edu/ vweaver vincent.weaver@maine.edu 13 November 2012 Assembly Language: What s it good for? Understanding at a low-level what
More informationCS321 Languages and Compiler Design I. Winter 2012 Lecture 1
CS321 Languages and Compiler Design I Winter 2012 Lecture 1 1 COURSE GOALS Improve understanding of languages and machines. Learn practicalities of translation. Learn anatomy of programming languages.
More information1: Introduction to Object (1)
1: Introduction to Object (1) 김동원 2003.01.20 Overview (1) The progress of abstraction Smalltalk Class & Object Interface The hidden implementation Reusing the implementation Inheritance: Reusing the interface
More informationAn Introduction to Python (TEJ3M & TEJ4M)
An Introduction to Python (TEJ3M & TEJ4M) What is a Programming Language? A high-level language is a programming language that enables a programmer to write programs that are more or less independent of
More informationReversing. Time to get with the program
Reversing Time to get with the program This guide is a brief introduction to C, Assembly Language, and Python that will be helpful for solving Reversing challenges. Writing a C Program C is one of the
More informationThe Environment Model. Nate Foster Spring 2018
The Environment Model Nate Foster Spring 2018 Review Previously in 3110: Interpreters: ASTs, evaluation, parsing Formal syntax: BNF Formal semantics: dynamic: small-step substitution model static semantics
More informationChapter 5. A Closer Look at Instruction Set Architectures
Chapter 5 A Closer Look at Instruction Set Architectures Chapter 5 Objectives Understand the factors involved in instruction set architecture design. Gain familiarity with memory addressing modes. Understand
More informationCompilers and Code Optimization EDOARDO FUSELLA
Compilers and Code Optimization EDOARDO FUSELLA The course covers Compiler architecture Pre-requisite Front-end Strong programming background in C, C++ Back-end LLVM Code optimization A case study: nu+
More informationCOMP1730/COMP6730 Programming for Scientists. Testing and Debugging.
COMP1730/COMP6730 Programming for Scientists Testing and Debugging. Overview * Testing * Debugging * Defensive Programming Overview of testing * There are many different types of testing - load testing,
More informationFaculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology
Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology exam Compiler Construction in4303 April 9, 2010 14.00-15.30 This exam (6 pages) consists of 52 True/False
More informationPrinciples of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore
(Refer Slide Time: 00:27) Principles of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Lecture - 1 An Overview of a Compiler Welcome
More informationProgramming Languages Third Edition. Chapter 7 Basic Semantics
Programming Languages Third Edition Chapter 7 Basic Semantics Objectives Understand attributes, binding, and semantic functions Understand declarations, blocks, and scope Learn how to construct a symbol
More informationWhat is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573
What is a compiler? Xiaokang Qiu Purdue University ECE 573 August 21, 2017 What is a compiler? What is a compiler? Traditionally: Program that analyzes and translates from a high level language (e.g.,
More informationOptiCode: Machine Code Deobfuscation for Malware Analysis
OptiCode: Machine Code Deobfuscation for Malware Analysis NGUYEN Anh Quynh, COSEINC CONFidence, Krakow - Poland 2013, May 28th 1 / 47 Agenda 1 Obfuscation problem in malware analysis
More informationAnalysis Tool Project
Tool Overview The tool we chose to analyze was the Java static analysis tool FindBugs (http://findbugs.sourceforge.net/). FindBugs is A framework for writing static analyses Developed at the University
More informationSlide Set 5. for ENCM 369 Winter 2014 Lecture Section 01. Steve Norman, PhD, PEng
Slide Set 5 for ENCM 369 Winter 2014 Lecture Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Winter Term, 2014 ENCM 369 W14 Section
More informationCompiler Construction
Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ws-1819/cc/ Generation of Intermediate Code Outline of Lecture 15
More informationECE 154A Introduction to. Fall 2012
ECE 154A Introduction to Computer Architecture Fall 2012 Dmitri Strukov Lecture 4: Arithmetic and Data Transfer Instructions Agenda Review of last lecture Logic and shift instructions Load/store instructionsi
More informationProcessor. Lecture #2 Number Rep & Intro to C classic components of all computers Control Datapath Memory Input Output
CS61C L2 Number Representation & Introduction to C (1) insteecsberkeleyedu/~cs61c CS61C : Machine Structures Lecture #2 Number Rep & Intro to C Scott Beamer Instructor 2007-06-26 Review Continued rapid
More information