Any quesnons? Homework 1. Dynamic Analysis. Today s plan. RaNonal Purify. The Problem (for Purify to solve) 2/1/17

Similar documents
301AA - Advanced Programming [AP-2017]

n Specifying what each method does q Specify it in a comment before method's header n Precondition q Caller obligation n Postcondition

Heap Arrays and Linked Lists. Steven R. Bagley

CS61C : Machine Structures

CS61C : Machine Structures

Lecture Notes on Memory Layout

Lecture 10 Notes Linked Lists

CS 3 Midterm 1 Review

Computer Science 50: Introduction to Computer Science I Harvard College Fall Quiz 0 Solutions. Answers other than the below may be possible.

Heap Arrays. Steven R. Bagley

Lecture 10 Notes Linked Lists

Data Structure Series

Run-time Environments - 3

Class Information ANNOUCEMENTS

CS354 gdb Tutorial Written by Chris Feilbach

Clustering Documents. Document Retrieval. Case Study 2: Document Retrieval

CS113: Lecture 7. Topics: The C Preprocessor. I/O, Streams, Files

Lecture 14. No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions?

Post Experiment Interview Questions

APPENDIX B. Fortran Hints

COSC Software Engineering. Lecture 16: Managing Memory Managers

CS61C Machine Structures. Lecture 3 Introduction to the C Programming Language. 1/23/2006 John Wawrzynek. www-inst.eecs.berkeley.

CS510 Advanced Topics in Concurrency. Jonathan Walpole

Run-time Environments -Part 3

Lecture 1: Overview

Reliable programming

Exception Namespaces C Interoperability Templates. More C++ David Chisnall. March 17, 2011

The Generational Hypothesis

CIS 551 / TCOM 401 Computer and Network Security. Spring 2007 Lecture 2

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Building Memory

5 R1 The one green in the same place so either of these could be green.

ques4ons? Midterm Projects, etc. Path- Based Sta4c Analysis Sta4c analysis we know Example 11/20/12

Under the Hood: Data Representations, Memory and Bit Operations. Computer Science 104 Lecture 3

C++ Support Classes (Data and Variables)

The Specification Phase

Synchronization. CS61, Lecture 18. Prof. Stephen Chong November 3, 2011

Recall: Address Space Map. 13: Memory Management. Let s be reasonable. Processes Address Space. Send it to disk. Freeing up System Memory

Stanford University Computer Science Department CS 295 midterm. May 14, (45 points) (30 points) total

CA31-1K DIS. Pointers. TA: You Lu

Pointers in C/C++ 1 Memory Addresses 2

COS 320. Compiling Techniques

Last time. Reasoning about programs. Coming up. Project Final Presentations. This Thursday, Nov 30: 4 th in-class exercise

Reasoning about programs

Semantic Analysis. Lecture 9. February 7, 2018

6. Pointers, Structs, and Arrays. 1. Juli 2011

CS61, Fall 2012 Section 2 Notes

Comp 11 Lectures. Mike Shah. June 26, Tufts University. Mike Shah (Tufts University) Comp 11 Lectures June 26, / 57

Page 1. Logistics. Introduction to Embedded Systems. Last Time. ES Software Design. Labs start Wed CS/ECE 6780/5780. Al Davis

CS 241 Honors Memory

Type Checking and Type Equality

Reversed Buffer Overflow Cross Stack Attacks. Kris Kaspersky Endeavor Security, Inc.

CS1 Lecture 4 Jan. 23, 2019

CSCI-1200 Data Structures Spring 2016 Lecture 6 Pointers & Dynamic Memory

CSE 374 Programming Concepts & Tools. Brandon Myers Winter 2015 Lecture 11 gdb and Debugging (Thanks to Hal Perkins)

6. Pointers, Structs, and Arrays. March 14 & 15, 2011

CS 31: Intro to Systems Pointers and Memory. Kevin Webb Swarthmore College October 2, 2018

CS61C : Machine Structures

ANITA S SUPER AWESOME RECITATION SLIDES

CS-XXX: Graduate Programming Languages. Lecture 9 Simply Typed Lambda Calculus. Dan Grossman 2012

Pointers and Memory 1

Quiz 0 Review Session. October 13th, 2014

What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope. C Flow Control.

Chapter 1 Getting Started

Unit Testing as Hypothesis Testing

18-642: Code Style for Compilers

CSE 20 DISCRETE MATH WINTER

Assertions, pre/postconditions

School of Computer & Communication Sciences École Polytechnique Fédérale de Lausanne

Intermediate Programming, Spring 2017*

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #29 Arrays in C

CS162 Operating Systems and Systems Programming Lecture 11 Page Allocation and Replacement"

Unit 9 Tech savvy? Tech support. 1 I have no idea why... Lesson A. A Unscramble the questions. Do you know which battery I should buy?

Lecture 20: templates

Testing. UW CSE 160 Winter 2016

CS 61C: Great Ideas in Computer Architecture C Pointers. Instructors: Vladimir Stojanovic & Nicholas Weaver

Structure of Programming Languages Lecture 10

Programming with Math and Logic

XP: Backup Your Important Files for Safety

Dynamic Storage Allocation

CS 1110, LAB 3: MODULES AND TESTING First Name: Last Name: NetID:

4.1 COMPUTATIONAL THINKING AND PROBLEM-SOLVING

Testing. Prof. Clarkson Fall Today s music: Wrecking Ball by Miley Cyrus

Agenda. Peer Instruction Question 1. Peer Instruction Answer 1. Peer Instruction Question 2 6/22/2011

(Provisional) Lecture 22: Rackette Overview, Binary Tree Analysis 10:00 AM, Oct 27, 2017

Hanoi, Counting and Sierpinski s triangle Infinite complexity in finite area

CS61C Machine Structures. Lecture 4 C Pointers and Arrays. 1/25/2006 John Wawrzynek. www-inst.eecs.berkeley.edu/~cs61c/

Oracle Developer Studio Code Analyzer

Process Address Spaces and Binary Formats

CS 61C: Great Ideas in Computer Architecture Introduc)on to C, Part III

(Refer Slide Time 3:31)

Final Examination CS 111, Fall 2016 UCLA. Name:

Control Structures: The IF statement!

Na.ve POSIX Threading Library (NPTL)

Heap Management. Heap Allocation

Topic Notes: Building Memory

MARKING KEY The University of British Columbia MARKING KEY Computer Science 260 Midterm #2 Examination 12:30 noon, Thursday, March 15, 2012

Design and Debug: Essen.al Concepts Numerical Conversions CS 16: Solving Problems with Computers Lecture #7

Structured Data. CIS 15 : Spring 2007

Princeton University. Computer Science 217: Introduction to Programming Systems. Dynamic Memory Management

Performance Measurement

Transcription:

Homework 1 Dynamic Analysis Due Thursday Feb 16, 9 AM on moodle On dynamic analysis (today s topic) Install and use an open- source tool: Daikon Add a very useful tool to your toolbox Understand how dynamic analysis works RunNme monitoring RaNonal Purify Today s plan Any quesnons? Dynamic invariant detecnon Daikon RaNonal Purify UNICOM first RaNonal, then bought by IBM, then sold to UNICOM debugging unininalized memory access buffer overflow improper freeing of memory leak detecnon memory blocks that no longer have a valid pointer The Problem (for Purify to solve) C/C++ are not type safe The compiler does not enforce type abstracnons Does the runnme system? no hzps://teamblue.unicomsi.com/products/purifyplus 1

Object Object int[] What happens if we write here? The Problem (for Purify to solve) C/C++ are not type safe The compiler does not enforce type abstracnons Does the runnme system? no Possible to read or write outside of your intended data structure and many undesirable behaviors What can we do? Track each memory locanon One of three states: Unallocated: cannot be read or wrizen Allocated but unininalized: cannot be read Object Object int[] Allocated and ininalized: can be read or wrizen What happens if we write here? Represent each byte s state with a machine Unallocated malloc Anything missing? free UniniNalized write IniNalized How do we implement this? Keep a machine for each byte On each access check the state of each byte update the machine state Can instrument the binary (no need for source code) Add code before each load and store Represent the machines as a giant array How many bits needed per byte of system memory? What s the overhead? Catches byte- level, but not bit- level errors RunNme CPU efficiency? very slow, but worth it 2

Can we do bezer? Object Object int[] We can detect unallocated or unininalized accesses. Can we force all accesses to be that way? We can detect errors in the blue areas. We can detect some errors in the green areas. But there are many others we cannot. Padding between objects Let memory age Object Object If we disallow adjacent objects in memory (pad them), then all accesses past the end of an array access a blue zone int[] Do not allow reallocanon of freed memory for some Nme Prevents errors caused by dangling pointers Both this and padding can be easily implemented in the malloc library Garbage collecnon Instead of bits, keep track of pointers to memory When no pointers are leg, free the memory Where have we seen this before? In PracNce These ideas work prezy well and are widely used. Ogen, it is OK to pay very high performance price to get system correctness. Dynamic analysis instruments the program, can maintain propernes at runnme. 3

RunNme monitoring RaNonal Purify Today s plan à Dynamic invariant detecnon Daikon What is a program supposed to do? How do we know the program s specificanon? Maybe the developers wrote it down. but ogen, that has errors Without a specificanon or some way to tell if behavior is correct, we cannot test! What is a specificanon? The documentanon can be the specificanon Informal May contain mistakes Can be hard to parse The program itself is a specificanon TesNng becomes a tautology But is there some kind of tesnng this can facilitate? Regression tesnng Also great for program understanding, reasoning, etc. Use the program to find likely invariants Hypothesize an invariant for example, square(x) > 0 Run the program on many test inputs (without needing to know the outputs) If square(x) > 0 in all the execunons, it s a likely invariant. Example: funny_sqrt(int x) bool positive = (x>0); if (positive) j = sqrt(x); else j = sqrt(-x); return j; What is an invariant? (j 0 is) Invariants hold at a program point before a statement executes ager a statement executes or maybe at all program points Invariants cannot reference variables out of scope Is j < abs(y)? Test for - 100 < x < 100 j 0 4

Can execunons ever prove a property? Can show that a property holds in many execunons. But can this method show that a property always holds? What can execunons prove? They can disprove invariants by finding an execunon in which an invariant holds. Example: Is j always 0? funny_sqrt(int x) bool positive = (x>0); if (positive) j = sqrt(x); else j = sqrt(-x); return j; No, when x is - 100, j is 10. Test for - 100 < x < 100 j can be between 0 and 10 How do we know if an invariant is likely? Thesis: Hypothesize i is an invariant at a program point. If many test cases do not disprove the hypothesis, conclude that i likely is an invariant. This doesn t quite work Example: funny_sqrt(int x) bool positive = (x>0); if (positive) j = sqrt(x); else j = sqrt(-x); return j; x j Test for 0 < x < 100 posinve = true What went wrong? We had many test cases none disproved the invariant But the hypothesis is not disproved because we didn t even execute the relevant line of code. Example: funny_sqrt(int x) bool positive = (x>0); if (positive) j = sqrt(x); else j = sqrt(-x); return j; x j Test for 0 < x < 100 posinve = true 5

SoluNon: Use stansncs! An invariant is only likely if the observanons do not disprove it AND the relevant observanons are stansncally significant How to compute stansncal significance? For a hypothesized invariant P(x,y) What are the chances P(x,y) is sansfied under a random choice of x and y? Assume 0 x, y < 1000 P(x == y) P(x < y) P(x!= y).001.5.999 What to compute We want a high confidence that invariants are not observed by chance The number of samples we need varies with the invariant predicates have widely varying chances of being accidentally sansfied What can we do with unlikely invariants? If it is likely that a [non]invariant is an accident, don t report it. Give the user control of the confidence threshold. An invariant may be true, but not be stansncally significant when examined under some (all?) test suites. Which invariants do we check? Given a possible invariant, we can check if it is likely. But which possible invariants do we check? How many are there? How many are there? Ordering relanonships over two variables: x < y, x == y, x > y, x y, x y, x y No problem. Just a finite number If a program has n variables, how many possible such relanonships are there? Θ(n 2 ) 6

What about other types? x = c, for some constant c Use the computer for what it s good at Guess a HUGE number of possible invariants Check them all Only those that are likely true will survive Computers are great at this! many: 2 64 for ints on a 64- bit machine = 18,446,744,073,709,551,616 So what do we do about the 18,446,744,073,709,551,616 ints? Possible invariant: x=c Don t store any at first. First Nme you see x assigned to some c, remember that c. Then check if x=c in all later execunons. For others too Same idea for more- complex invariant types: For example: ax + b = y Two observanons of (x,y) is sufficient to solve for the only possible (a,b). And others snll We can do: min(array) max(array) sum(array) etc. These expressions can be like variables: x = min(array z) Review Guess lots of invariants Check which ones hold Keep stansncs to check for stansncal significance Those guesses that survived all the execunons and are stansncally significant are likely true. Does not need expected execunon outputs 7

In pracnce This works! Finds interesnng invariants for complex programs. Gives concise specificanons Needs fewer execunons than you d think. surviving potennal invariants False invariants die quickly number of execunons Daikon Implements dynamic invariant detecnon Open source, free to use, Highly robust and customizable Takes some Nme to master but very powerful You ll see it on homework 1 hzp://groups.csail.mit.edu/pag/daikon 8