C++ Undefined Behavior

Similar documents
C++ Undefined Behavior What is it, and why should I care?

Memory Corruption 101 From Primitives to Exploit

Page 1. Today. Last Time. Is the assembly code right? Is the assembly code right? Which compiler is right? Compiler requirements CPP Volatile

Undefined Behaviour in C

Important From Last Time

Page 1. Today. Important From Last Time. Is the assembly code right? Is the assembly code right? Which compiler is right?

Important From Last Time

AGENDA Binary Operations CS 3330 Samira Khan

Deep C (and C++) by Olve Maudal

UNDEFINED BEHAVIOR IS AWESOME

Programming in C and C++

Deep C by Olve Maudal

11 'e' 'x' 'e' 'm' 'p' 'l' 'i' 'f' 'i' 'e' 'd' bool equal(const unsigned char pstr[], const char *cstr) {

Pointers, Dynamic Data, and Reference Types

Important From Last Time

Lecture 05 Integer overflow. Stephen Checkoway University of Illinois at Chicago

CS 261 Fall C Introduction. Variables, Memory Model, Pointers, and Debugging. Mike Lam, Professor

Hacking in C. Pointers. Radboud University, Nijmegen, The Netherlands. Spring 2019

11. Reference Types. Swap! Reference Types. Reference Types: Definition

Page 1. Stuff. Last Time. Today. Safety-Critical Systems MISRA-C. Terminology. Interrupts Inline assembly Intrinsics

Lecture 07 Debugging Programs with GDB

CSCI-243 Exam 1 Review February 22, 2015 Presented by the RIT Computer Science Community

CSC C69: OPERATING SYSTEMS

10. Reference Types. Swap! Reference Types: Definition. Reference Types

Undefinedness and Non-determinism in C

Announcements. assign0 due tonight. Labs start this week. No late submissions. Very helpful for assign1

A brief introduction to C programming for Java programmers

Understanding Undefined Behavior

Dynamic memory. EECS 211 Winter 2019

CS 220: Introduction to Parallel Computing. Arrays. Lecture 4

MISRA-C. Subset of the C language for critical systems

Hardening LLVM with Random Testing

C Introduction. Comparison w/ Java, Memory Model, and Pointers

The issues. Programming in C++ Common storage modes. Static storage in C++ Session 8 Memory Management

CS24 Week 2 Lecture 1

18-642: Code Style for Compilers

CSCI 350: Getting Started with C Written by: Stephen Tsung-Han Sher June 12, 2016

QUIZ. What is wrong with this code that uses default arguments?

Recitation #11 Malloc Lab. November 7th, 2017

// Initially NULL, points to the dynamically allocated array of bytes. uint8_t *data;

Pierce Ch. 3, 8, 11, 15. Type Systems

Pointers. 10/5/07 Pointers 1

CS 11 C track: lecture 5

Limitations of the stack

Bruce Merry. IOI Training Dec 2013

COMP 2355 Introduction to Systems Programming

377 Student Guide to C++

An Operational and Axiomatic Semantics for Non-determinism and Sequence Points in C

Week 9 Part 1. Kyle Dewey. Tuesday, August 28, 12

Please refer to the turn-in procedure document on the website for instructions on the turn-in procedure.

Structures, Unions Alignment, Padding, Bit Fields Access, Initialization Compound Literals Opaque Structures Summary. Structures

CS2141 Software Development using C/C++ Debugging

CPSC 427: Object-Oriented Programming

Jagannath Institute of Management Sciences Lajpat Nagar. BCA II Sem. C Programming

Data Representation and Storage

CSCE 548 Building Secure Software Integers & Integer-related Attacks & Format String Attacks. Professor Lisa Luo Spring 2018

CYSE 411/AIT681 Secure Software Engineering Topic #10. Secure Coding: Integer Security

Pointers and Memory Management

2/9/18. Readings. CYSE 411/AIT681 Secure Software Engineering. Introductory Example. Secure Coding. Vulnerability. Introductory Example.

2/9/18. CYSE 411/AIT681 Secure Software Engineering. Readings. Secure Coding. This lecture: String management Pointer Subterfuge

High Performance Computing in C and C++

Fast Introduction to Object Oriented Programming and C++

CSCI-1200 Data Structures Fall 2017 Lecture 5 Pointers, Arrays, & Pointer Arithmetic

Intermediate Programming, Spring 2017*

Modern and Lucid C++ Advanced for Professional Programmers. Part 3 Move Semantics. Department I - C Plus Plus Advanced

Administrivia. Introduction to Computer Systems. Pointers, cont. Pointer example, again POINTERS. Project 2 posted, due October 6

Pointers and References

Ch. 3: The C in C++ - Continued -

Malloc Lab & Midterm Solutions. Recitation 11: Tuesday: 11/08/2016

Data Representation and Storage. Some definitions (in C)

Writing Program in C Expressions and Control Structures (Selection Statements and Loops)

Part I Part 1 Expressions

11. Reference Types. Swap! Reference Types: Definition. Reference Types

ECE 250 / CS 250 Computer Architecture. C to Binary: Memory & Data Representations. Benjamin Lee

CS107 Handout 08 Spring 2007 April 9, 2007 The Ins and Outs of C Arrays

15. Pointers, Algorithms, Iterators and Containers II

Contents of Lecture 3

Automatic program generation for detecting vulnerabilities and errors in compilers and interpreters

Programming in C. Lecture 9: Tooling. Dr Neel Krishnaswami. Michaelmas Term

CSE 1001 Fundamentals of Software Development 1. Identifiers, Variables, and Data Types Dr. H. Crawford Fall 2018

Programming Language Implementation

18-642: Code Style for Compilers

CS113: Lecture 3. Topics: Variables. Data types. Arithmetic and Bitwise Operators. Order of Evaluation

Simple Data Types in C. Alan L. Cox

C PROGRAMMING LANGUAGE. POINTERS, ARRAYS, OPERATORS AND LOOP. CAAM 519, CHAPTER5

CSCI 2212: Intermediate Programming / C Review, Chapters 10 and 11

THE GOOD, BAD AND UGLY ABOUT POINTERS. Problem Solving with Computers-I

CA31-1K DIS. Pointers. TA: You Lu

Arithmetic type issues

Principles of Programming Pointers, Dynamic Memory Allocation, Character Arrays, and Buffer Overruns

Fall 2018 Discussion 2: September 3, 2018

Lecture 04 Introduction to pointers

CS31 Discussion. Jie(Jay) Wang Week8 Nov.18

CS 31: Intro to Systems C Programming. Kevin Webb Swarthmore College September 13, 2018

Memory, Arrays & Pointers

Memory Organization. The machine code and data associated with it are in the code segment

CS113: Lecture 9. Topics: Dynamic Allocation. Dynamic Data Structures

Programming with MPI

Slide Set 8. for ENCM 339 Fall 2017 Section 01. Steve Norman, PhD, PEng

C++ Programming Lecture 4 Software Engineering Group

Transcription:

C++ Undefined Behavior What is it, and why should I care? A presentation originally by Marshal Clow Original: https://www.youtube.com/watch?v=uhclkb1vkay Original Slides: https://github.com/boostcon/cppnow_presentations_2014/blob/master/files/undefined-behavior.pdf All errors in this slides or presentation are mine

What is Undefined Behavior? 1.3.24 [defns.undefined] undefined behavior behavior for which this International Standard imposes no requirements [ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed.

No requirements

Some examples what can happen Program crashes Program gives unexpected results Computer catches fire Daemons fly out of your nose Program appears to work fine There are no wrong answers!

Example #1 No Wrong Answers! #include <iostream> /* what does this program print? */ int main() { int arr[] = { 0, 2, 4, 6, 8 }; int i = 1; std::cout << i + arr[++i] + arr[i++] << \n ; }

How can I get UB? (1) Signed integer overflow (but not unsigned!) Dereferencing nullptr (NULL) Dereferencing result of malloc(0); Shift greater than (or equal to) the width of the operand Reading from uninitializer variables Modifying a variable more than once in an expression Buffer overflow Comparing pointers to different data structures

How can I get UB? (2) Pointer overflow Modifying a const object Modifying a string literal Negating INT_MIN Mismatch between new and delete ([]) Calling a library routine without fulfilling the prerequisites (z.b. memcpy with overlapping buffers) Data races

Example #2 #include <new> class Foo { // some complicated class }; int main() { Foo *p = new Foo[4]; // lots of stuff here delete p; }

atomic_is_lock_free 20.8.2.5 shared_ptr atomic access [util.smartptr.shared.atomic]... template<class T> bool atomic_is_lock_free(const shared_ptr<t>* p); Requires: p shall not be null....

Arithmetic Operations 5 Expressions [expr]... If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined....

Example #3 No Wrong Answers! #include <stdio.h> #include <stdbool.h> int main() { bool b; } if (b) printf( true\n ); If (!b) printf( false\n );

Why do C and C++ do this? It gives the compiler leeway to generate smaller code, by omitting checks By assuming no UB the compiler can generate simpler, faster and smaller code Different hardware reacts different

Why is this important? Because compilers know it and optimizers take advantage of it It is perfectly legal to transform a program exhibiting UB into any other program Remember: in UB there are no wrong answers

Different kinds of routines Type 1: no UB, no matter what the inputs Type 2: UB for some subset of all possible inputs Type 3: UB, no matter what the inputs John Regehr, University of Utah

int *do_something( int *p) { log( do_something %d, *p); if (!p) { p = malloc(...);... } return p; } Example #4

Beispiel #5 #include <stdio.h> int main() { int i = 0x10000000; int c = 0; do { c++; i += i; printf( %d\n, i); } while (I > 0); printf( %d iterations\n, c); }

Why do we care? (1) It is surprisingly easy to write code with undefined behavior http://code.google.com/p/nativeclient/issues/detail?id=245 UB Code may work for a while, and the break when optimization level is increased or the compiler is upgraded ( optimizationunstable code - STACK)

Why do we care? (2) UB shows up in tricky code; frequently code that is attempting security checks http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475 Bugs that STACK found in Postgres

Example #6 // Checks are hard to write correctly // From Apple Secure Coding Guidelines (second edition) void xxxx( int n, int m) { size_t bytes = n * m; if (n > 0 && m > 0 && SIZE_MAX / n >= m) { /* allocate bytes space */ } }

Beispiel #7 Aliasing struct Foo { int a; }; struct Bar { int a; int b; }; Foo f{3}; Bar *p = (Bar *)&f; p->a = 4; std::cout << f.a << \n ;

What can I do about UB? Be aware of UB Don t blame the compiler ( don't shoot the messenger ) When you do something tricky think about UB Build your code with several compilers and optimization levels

Impact If you use Undefined Behavior you con no longer reason about what your program does

It is too late You can not check if Undefined Behavior has already happened

Example #8 // wrong bool will_this_overflow(int a) { return a + 100 < a; } // what the compiler can/will generate: bool will_this_overflow(int a) { return false; } // correct bool will_this_overflow(int a) { return a < (INT_MAX - 100); }

Example #9 #include <sstream> void xxxx( { } int fh, int i) std::stringstream ss; ss << I << \n ; write(fh, ss.str().c_str(), ss.str().size());

Tools Clang: -fsanitize=undefined John Regehr's Integer Overflow Checker STACK

Quiz // Optimize this code void contains_null_check( int *p) { int dead = *p; if (p == nullptr) return; *p = 4; }

References A Guide to Undefined Behavior in C and C++, Part 1 http://blog.regehr.org/archives/213 (links zu Teil 2 und 3) Towards optimization-safe systems http://pdos.csail.mit.edu/papers/stack:sosp13.pdf What every C programmer should know about undefined behavior http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html It's time to get serious about exploiting Undefined Behavior http://blog.regehr.org/archives/761 Finding Undefined Behavior by finding dead code http://blog.regehr.org/archives/970 About unspecified and undefined behavior in C (ACCU 2013) http://www.pvv.org/~oma/unspecifiedandundefined_accu_apr2013.pdf