C++ Undefined Behavior What is it, and why should I care?

Similar documents
C++ Undefined Behavior

Page 1. Today. Last Time. Is the assembly code right? Is the assembly code right? Which compiler is right? Compiler requirements CPP Volatile

Undefined Behaviour in C

Memory Corruption 101 From Primitives to Exploit

Pointers, Dynamic Data, and Reference Types

AGENDA Binary Operations CS 3330 Samira Khan

CS2141 Software Development using C/C++ C++ Basics

CSCI-243 Exam 1 Review February 22, 2015 Presented by the RIT Computer Science Community

Important From Last Time

Page 1. Today. Important From Last Time. Is the assembly code right? Is the assembly code right? Which compiler is right?

Important From Last Time

Programming in C and C++

Deep C (and C++) by Olve Maudal

UNDEFINED BEHAVIOR IS AWESOME

Deep C by Olve Maudal

CHAPTER 3 BASIC INSTRUCTION OF C++

377 Student Guide to C++

Important From Last Time

PIC 10A Pointers, Arrays, and Dynamic Memory Allocation. Ernest Ryu UCLA Mathematics

Lecture 07 Debugging Programs with GDB

a data type is Types

THE INTEGER DATA TYPES. Laura Marik Spring 2012 C++ Course Notes (Provided by Jason Minski)

Common Misunderstandings from Exam 1 Material

Welcome Back. CSCI 262 Data Structures. Hello, Let s Review. Hello, Let s Review. How to Review 8/19/ Review. Here s a simple C++ program:

Ch. 3: The C in C++ - Continued -

Exercise 1.1 Hello world

CS 261 Fall C Introduction. Variables, Memory Model, Pointers, and Debugging. Mike Lam, Professor

Bruce Merry. IOI Training Dec 2013

Pointers and References

Fast Introduction to Object Oriented Programming and C++

Lecture 05 Integer overflow. Stephen Checkoway University of Illinois at Chicago

CS113: Lecture 3. Topics: Variables. Data types. Arithmetic and Bitwise Operators. Order of Evaluation

Hacking in C. Pointers. Radboud University, Nijmegen, The Netherlands. Spring 2019

CSE 333 Autumn 2013 Midterm

Understanding Undefined Behavior

There are, of course, many other possible solutions and, if done correctly, those received full credit.

MISRA-C. Subset of the C language for critical systems

Programming in C++ Prof. Partha Pratim Das Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

LECTURE 02 INTRODUCTION TO C++

Hardening LLVM with Random Testing

Pierce Ch. 3, 8, 11, 15. Type Systems

QUIZ. 1. Explain the meaning of the angle brackets in the declaration of v below:

Review: Exam 1. Your First C++ Program. Declaration Statements. Tells the compiler. Examples of declaration statements

Page 1. Stuff. Last Time. Today. Safety-Critical Systems MISRA-C. Terminology. Interrupts Inline assembly Intrinsics

C Introduction. Comparison w/ Java, Memory Model, and Pointers

The C++ Language. Arizona State University 1

Character Set. The character set of C represents alphabet, digit or any symbol used to represent information. Digits 0, 1, 2, 3, 9

! A program is a set of instructions that the. ! It must be translated. ! Variable: portion of memory that stores a value. char

11 'e' 'x' 'e' 'm' 'p' 'l' 'i' 'f' 'i' 'e' 'd' bool equal(const unsigned char pstr[], const char *cstr) {

Programming Language Implementation

CS2141 Software Development using C/C++ Debugging

6.096 Introduction to C++ January (IAP) 2009

Simple Data Types in C. Alan L. Cox

Arithmetic type issues

CE221 Programming in C++ Part 2 References and Pointers, Arrays and Strings

CE221 Programming in C++ Part 1 Introduction

Expressions and Casting

Basic program The following is a basic program in C++; Basic C++ Source Code Compiler Object Code Linker (with libraries) Executable

Chapter 2: Introduction to C++

Lecture 3: C Programm

Introduction to Programming using C++

Writing Program in C Expressions and Control Structures (Selection Statements and Loops)

Lab#5 Due Wednesday, February 25, at the start of class. Purpose: To develop familiarity with C++ pointer variables

Part I Part 1 Expressions

Chapter 2. Procedural Programming

Discussion 1H Notes (Week 3, April 14) TA: Brian Choi Section Webpage:

Summary of basic C++-commands

Chapter 2: Special Characters. Parts of a C++ Program. Introduction to C++ Displays output on the computer screen

Lecture 2, September 4

Programming in C++ 5. Integral data types

COMP 2355 Introduction to Systems Programming

C++ PROGRAMMING. For Industrial And Electrical Engineering Instructor: Ruba A. Salamh

Basic C Programming (2) Bin Li Assistant Professor Dept. of Electrical, Computer and Biomedical Engineering University of Rhode Island

ECE 250 / CS 250 Computer Architecture. C to Binary: Memory & Data Representations. Benjamin Lee

CYSE 411/AIT681 Secure Software Engineering Topic #10. Secure Coding: Integer Security

Linked List using a Sentinel

COMP322 - Introduction to C++ Lecture 01 - Introduction

2/9/18. Readings. CYSE 411/AIT681 Secure Software Engineering. Introductory Example. Secure Coding. Vulnerability. Introductory Example.

2/9/18. CYSE 411/AIT681 Secure Software Engineering. Readings. Secure Coding. This lecture: String management Pointer Subterfuge

Welcome Back. CSCI 262 Data Structures. Hello, Let s Review. Hello, Let s Review. How to Review 1/9/ Review. Here s a simple C++ program:

2.1. Chapter 2: Parts of a C++ Program. Parts of a C++ Program. Introduction to C++ Parts of a C++ Program

CS3157: Advanced Programming. Outline

C++ Programming Lecture 1 Software Engineering Group

CS 376b Computer Vision

Pointers and Arrays CS 201. This slide set covers pointers and arrays in C++. You should read Chapter 8 from your Deitel & Deitel book.

Lecture 04 Introduction to pointers

CSE 1001 Fundamentals of Software Development 1. Identifiers, Variables, and Data Types Dr. H. Crawford Fall 2018

Agenda. The main body and cout. Fundamental data types. Declarations and definitions. Control structures

PROGRAMMING IN C++ KAUSIK DATTA 18-Oct-2017

377 Student Guide to C++

C++ Basics. Data Processing Course, I. Hrivnacova, IPN Orsay

Please refer to the turn-in procedure document on the website for instructions on the turn-in procedure.

Static Analysis in C/C++ code with Polyspace

G52CPP C++ Programming Lecture 20

Generic Programming in C++: A modest example

CS Introduction to Programming Midterm Exam #2 - Prof. Reed Fall 2015

Part I Part 1 Expressions

Exceptions, Case Study-Exception handling in C++.

Types, Values, Variables & Assignment. EECS 211 Winter 2018

Programming. C++ Basics

Transcription:

C++ Undefined Behavior What is it, and why should I care? Marshall Clow Qualcomm marshall@idio.com http://cplusplusmusings.wordpress.com (intermittent) Twitter: @mclow ACCU 2014 April 2014

What is Undefined Behavior? (from N3691 - C++) 1.3.24: undefined behavior Behavior for which this International Standard imposes no requirements. [ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. end note ]

Some examples of UB Program crashes Program gives unexpected results Computer catches fire Cat gets pregnant Program appears to work fine There are no wrong answers!

Example #1: No Wrong Answers! #include <iostream> // g++ sequence_points.cpp &&./a.out --> 10 // clang++ sequence_points.cpp &&./a.out --> 9 using namespace std; int main () {! int arr [] = { 0, 2, 4, 6, 8 };! int i = 1;! cout << i + arr[++i] + arr[i++] << endl;! }

How can I get UB? (I) Signed integer overflow (but not unsigned!) Dereferencing NULL pointer or result of malloc(0) Shift greater than (or equal to) the width of the operand Reading from uninitialized variables Modifying a variable more than once in an expression

How can I get UB? (II) Buffer overflow Comparing pointers into two different data structures Pointer overflow Modifying a const object (C++) or a string literal

How can I get UB? (III) Negating INT_MIN Data races Mismatch between new and delete Calling a library routine w/o fulfilling the prerequisites memcpy with overlapping buffers

Yikes! #include <new> class Foo {}; // complicated class int main ( int argc, char *argv[] ) {! int *p = new Foo [4]; //..much later.. delete p;! return 0; }

atomic_is_lock_free 20.9.2.5 shared_ptr atomic access template<class T> bool atomic_is_lock_free (const shared_ptr<t>* p); Requires: p shall not be null.

Arithmetic Operations 3.9.1.4: If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined. [Note: most existing implementations of C++ ignore integer overflows. Treatment of division by zero, forming a remainder using a zero divisor, and all floating point exceptions vary among machines, and is usually adjustable by a library function. endnote]

Example #2: No Wrong Answers! #include <stdio.h> #include <stdbool.h> int main ( int argc, char *argv[] ) {! bool b;! if ( b ) printf ( "true\n" );! if (!b ) printf ( "false\n" );! return 0; }

Why does C and C++ do this? It gives the compiler leeway to generate smaller code, by omitting checks By assuming no UB, the compiler can generate simpler, faster, smaller code.

Why is this important? Because compilers know it - and optimizers take advantage of it. It is perfectly legal to transform a program exhibiting UB into any other program. Remember - in UB, there are no wrong answers!

Different kinds of routines Type 1 - no UB, no matter what the inputs Type 2 - UB for some subset of all possible inputs. Type 3 - UB every time, no matter what the inputs. John Regehr, University of Utah

Example #3! int * do_something ( int *p )! {!! log ( "do_something %d", *p );!! if (!p )!! {!!! // code here p = malloc (... ); // more code here!! }! return p; }

Example #4 #include <stdio.h> int main () {! int i = 0x10000000;! int c = 0;! do! {!! c++;!! i += i;!! printf ("%d\n", i);! } while (i > 0);! printf ("%d iterations\n", c); }

Why do we care? (I) It's surprisingly easy to write code with undefined behavior. http://code.google.com/p/nativeclient/issues/detail? id=245 UB code may work for a while, and then break when the optimization level is increased or the compiler is upgraded. This is what the STACK people call optimizationunstable code. (remember, no wrong answers!)

Why do we care? (II) UB shows up in tricky code; frequently code that is attempting security checks. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475 Bugs that STACK found in Postgres

Don t be this guy! https://www.youtube.com/watch?v=hrj-vlehcjg

What can I do about UB? Be aware of UB. Don t blame the compiler (AKA don t shoot the messenger ) if you're doing something tricky think about UB. Build your code with several compilers/ different optimization levels.

You can t check for UB after the fact It s too late The damage has already been done

If you write this: bool WillThisOverflow ( int a ) { return a + 100 < a; } the compiler can/may/will optimize it to: bool WillThisOverflow ( int a ) { return false; } // Why? Instead, you should write: bool WillThisOverflow ( int a ) { return a < ( INT_MAX - 100 ); }

Are there any tools to help detect UB? Tools are starting to appear clang has -fsanitize=undefined See http://blog.llvm.org/2013/04/testinglibc-with-fsanitizeundefined.html John Regehr's Integer Overflow Checker STACK (this past summer from MIT)

Quiz // Optimize this code void contains_null_check(int *P) { int dead = *P; if (P == 0) return; *P = 4; }

Quiz // Optimize this code void contains_null_check(int *P) { int dead = *P; if (P == 0) return; *P = 4; }

Questions?

References A Guide to Undefined Behavior in C and C++, Part I http://blog.regehr.org/archives/213 (links to II and III) Towards optimization-safe systems http:// pdos.csail.mit.edu/papers/stack:sosp13.pdf http://clang.llvm.org/docs/ UsersManual.html#controlling-code-generation What every C programmer should know about undefined behavior http://blog.llvm.org/2011/05/whatevery-c-programmer-should-know.html (with link to parts II and III)

References It s Time to Get Serious About Exploiting Undefined Behavior http://blog.regehr.org/archives/761 Finding Undefined Behavior Bugs by Finding Dead Code http://blog.regehr.org/archives/970 About unspecified and undefined behavior in C (ACCU 2013) http://www.pvv.org/~oma/ UnspecifiedAndUndefined_ACCU_Apr2013.pdf