UNDEFINED BEHAVIOR IS AWESOME

Similar documents
Deep C by Olve Maudal

DEVIRTUALIZATION IN LLVM

CSCI-243 Exam 1 Review February 22, 2015 Presented by the RIT Computer Science Community

Deep C (and C++) by Olve Maudal

Lecture 04 Introduction to pointers

Memory Corruption 101 From Primitives to Exploit

C++ Undefined Behavior What is it, and why should I care?

C++ Undefined Behavior

Intermediate Programming, Spring 2017*

Arrays and Pointers in C. Alan L. Cox

Agenda. Peer Instruction Question 1. Peer Instruction Answer 1. Peer Instruction Question 2 6/22/2011

CSCI-1200 Data Structures Fall 2017 Lecture 5 Pointers, Arrays, & Pointer Arithmetic

Pointers and References

Common Misunderstandings from Exam 1 Material

Basic Types, Variables, Literals, Constants

CS 61C: Great Ideas in Computer Architecture. C Arrays, Strings, More Pointers

CS 261 Fall C Introduction. Variables, Memory Model, Pointers, and Debugging. Mike Lam, Professor

Procedures, Parameters, Values and Variables. Steven R. Bagley

CS31 Discussion. Jie(Jay) Wang Week8 Nov.18

CS 0449 Sample Midterm

CS31 Discussion 1E Spring 17 : week 08

A brief introduction to C programming for Java programmers

High-performance computing and programming Intro to C on Unix/Linux. Uppsala universitet

So far, system calls have had easy syntax. Integer, character string, and structure arguments.

C Introduction. Comparison w/ Java, Memory Model, and Pointers

CS61, Fall 2012 Section 2 Notes

Arrays and Memory Management

Pointers, Dynamic Data, and Reference Types

Lecture 2, September 4

COMP6771 Advanced C++ Programming

Memory and Addresses. Pointers in C. Memory is just a sequence of byte-sized storage devices.

PIC 10A Pointers, Arrays, and Dynamic Memory Allocation. Ernest Ryu UCLA Mathematics

EMBEDDED SYSTEMS PROGRAMMING Language Basics

377 Student Guide to C++

Arrays. Returning arrays Pointers Dynamic arrays Smart pointers Vectors

377 Student Guide to C++

CSE 374 Programming Concepts & Tools. Hal Perkins Spring 2010

Fall 2018 Discussion 2: September 3, 2018

CMPSC 311- Introduction to Systems Programming Module: Arrays and Pointers

CS 31: Intro to Systems Pointers and Memory. Martin Gagne Swarthmore College February 16, 2016

ELEC 377 C Programming Tutorial. ELEC Operating Systems

From Java to C. Thanks to Randal E. Bryant and David R. O'Hallaron (Carnegie-Mellon University) for providing the basis for these slides

Structures, Unions Alignment, Padding, Bit Fields Access, Initialization Compound Literals Opaque Structures Summary. Structures

Bruce Merry. IOI Training Dec 2013

CSE 333 Lecture 2 Memory

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT

Fundamental of Programming (C)

CSE 333. Lecture 10 - references, const, classes. Hal Perkins Paul G. Allen School of Computer Science & Engineering University of Washington

Agenda. The main body and cout. Fundamental data types. Declarations and definitions. Control structures

CSci 4061 Introduction to Operating Systems. Programs in C/Unix

Q1: /8 Q2: /30 Q3: /30 Q4: /32. Total: /100

CSE 333 Final Exam June 6, 2017 Sample Solution

CS24 Week 3 Lecture 1

Introduction to C++ Systems Programming

Systems Programming and Computer Architecture ( )

Understanding Pointers

a data type is Types

primitive arrays v. vectors (1)

Introduction to C++ Professor Hugh C. Lauer CS-2303, System Programming Concepts

Kurt Schmidt. October 30, 2018

Review: C Strings. A string in C is just an array of characters. Lecture #4 C Strings, Arrays, & Malloc

Week 9 Part 1. Kyle Dewey. Tuesday, August 28, 12

Object-Oriented Programming for Scientific Computing

QUIZ. What is wrong with this code that uses default arguments?

Hacking in C. Pointers. Radboud University, Nijmegen, The Netherlands. Spring 2019

CSE 333 Lecture 3 - arrays, memory, pointers

C++ Programming Lecture 4 Software Engineering Group

CS 61c: Great Ideas in Computer Architecture

Lecture 03 Bits, Bytes and Data Types

Q1: /20 Q2: /30 Q3: /24 Q4: /26. Total: /100

CE221 Programming in C++ Part 2 References and Pointers, Arrays and Strings

COMP s1 Lecture 1

SYSC 2006 C Winter 2012

CSCI 104 Memory Allocation. Mark Redekopp David Kempe

CSE 333 Lecture 2 - arrays, memory, pointers

CSE 303: Concepts and Tools for Software Development

Wednesday, October 15, 14. Functions

Memory (Stack and Heap)

Vector and Free Store (Vectors and Arrays)

05-01 Discussion Notes

CSE 333 Midterm Exam Sample Solution 7/28/14

CSE 333 Lecture 9 - intro to C++

Object-Oriented Programming, Iouliia Skliarova

Pointers. Pointer Variables. Chapter 11. Pointer Variables. Pointer Variables. Pointer Variables. Declaring Pointer Variables

CSCI-1200 Data Structures Fall 2018 Lecture 5 Pointers, Arrays, & Pointer Arithmetic

CSE 333 Autumn 2013 Midterm

QStringView. everywhere. Marc Mutz, Senior Software Engineer at KDAB

Recitation: C Review. TA s 20 Feb 2017

CS107, Lecture 9 C Generics Function Pointers

Pointers, Arrays, Memory: AKA the cause of those Segfaults

CS 61C: Great Ideas in Computer Architecture. Lecture 3: Pointers. Bernhard Boser & Randy Katz

Lecture 9. Pointer, linked node, linked list TDDD86: DALP. Content. Contents Pointer and linked node. 1.1 Introduction. 1.

CS102 Software Engineering Principles

CS 61C: Great Ideas in Computer Architecture. Memory Management and Usage

Modern C++ for Computer Vision and Image Processing. Igor Bogoslavskyi

Pointers and scanf() Steven R. Bagley

First of all, it is a variable, just like other variables you studied

Agenda. Components of a Computer. Computer Memory Type Name Addr Value. Pointer Type. Pointers. CS 61C: Great Ideas in Computer Architecture

In Java we have the keyword null, which is the value of an uninitialized reference type

1d: tests knowing about bitwise fields and union/struct differences.

Transcription:

UNDEFINED BEHAVIOR IS AWESOME Piotr Padlewski piotr.padlewski@gmail.com, @PiotrPadlewski

ABOUT MYSELF Currently working in IIIT developing C++ tooling like clang-tidy and studying on University of Warsaw. Worked on optimizations in clang and LLVM - devirtualization and ThinLTO at Google

IMPLEMENTATION DEFINED BEHAVIOUR (IB) The behavior of the program varies between implementations, and the conforming implementation must document the effects of each behavior. Examples: Type of std::size_t Number of bits in long Number of bits in a byte

UNSPECIFIED BEHAVIOR The behavior of the program varies between implementations and the conforming implementation is not required to document the effects of each behavior.[0] Examples: Order of evaluation: foo(bar(), baz()); Identical string literals address: foo == foo

UNDEFINED BEHAVIOR (UB) There are no restrictions on the behavior of the program. We can treat it as a promise to the compiler that something won t happen.

WHAT CAN HAPPEN AFTER HITTING UB?

UNDEFINED BEHAVIOR (UB) In theory your program can do anything in practice the odds of formatting your hard drive are

BORING UBS Naming variable starting with underscore Defining functions in namespace std Specializing non-user defined types in namespace std (can t specialize std::hash<std::pair<int, int>>) Calling delete/free after new[]

MORE INTERESTING UBS Dereferencing nullptr Using uninitialized values Integers overflows

SIMPLE OVERFLOW int foo(int x) { return x+1 > x; int foo(int) { return true;

LOOPS for (int i = 0; i < n; i+=2) { A[i] = B[i] + C[i]; A[i+1] = B[i+1] + C[i+1]; Loop will terminate assert(n >= i); safe to wider i to uint64_t = VECTORIZATION AND UNROLLING

TASTY UBS buffer overflow using pointer to object of ended lifetime violating strict-aliasing const_casting const

BUFFER OVERFLOW int table[4]; bool exists_in_table(int v) { for (int i = 0; i <= 4; i++) { if (table[i] == v) return true; return false;

BUFFER OVERFLOW int table[4]; bool exists_in_table(int v) { for (int i = 0; i <= 4; i++) { if (table[i] == v) return true; return false;

BUFFER OVERFLOW int table[4]; bool exists_in_table(int v) { for (int i = 0; i <= 4; i++) { if (table[i] == v) return true; return false;

BUFFER OVERFLOW int table[4]; bool exists_in_table(int v) { return true;

LIFETIME AND POINTERS #include <stdio.h> #include <stdlib.h> int main() { int *p = (int*)malloc(sizeof(int)); int *q = (int*)realloc(p, sizeof(int)); if (p == q) { *p = 1; *q = 2; printf("%d %d\n", *p, *q); Compiled with clang produce: 1 2

LIFETIME AND POINTERS vector<int> v; v.reserve(2); v.push_back(4); v.push_back(2); auto& ref = v[0]; v.push_back(42); IS THIS VALID? IS THIS UB/IB? valid, because std::allocator only calls new/delete if (&v[0] == &ref) ref = 42;

WHEN SOMETHING IS GOOD CANDIDATE TO BE UB? When occurred situation is considered a bug and defining it s behavior would be a performance loss.

STACK OVERFLOW Why I can t get a nice error message saying I got stack overflow?

OUT OF MEMORY Ok, at least we get std::bad_alloc or nullptr when heap allocation fails.

WHY PEOPLE HATE EXCEPTIONS With enabled exceptions every call generate branch Compiling with -fno-exceptions changes every throw to call of std::abort() Other solution is to mark almost every function with noexcept

LET S TALK ABOUT CONST struct A { void foo() const; int b = 0; ; struct A { void foo() const; int b = 0; ; void bar(int); void bar(int); int main() { A a; bar(a.b); a.foo(); bar(a.b): int main() { A a; bar(0); a.foo(); bar(0):

LET S TALK ABOUT CONST Illegal to do the optimization because foo can use const_cast on b const_cast on a const reference to non-const variable is OK const_cast on a memory declared const is UB

CONST PROPAGATION Every time we call external const method, or function having const reference parameters, the compiler have to assume the worst - const_cast This really sucks, because const propagation is awesome If only const would have non-mutable guarantee

IMAGINE THERE IS NO CONST_CAST IMAGINE IF CONST_CASTING WOULD

CONST PROPAGATION WITH STRICT CONST void foo(const int &a) { bar(a); bar(a); void bar(const int &b);

CONST PROPAGATION WITH STRICT CONST void foo(const int &a) { int global; const int temp = a; bar(temp); bar(temp); void caller() { foo(global); void bar(const int &b); void bar(const int &b) { global++;

WHAT ABOUT MUTABLE? class A { ; ;;; class A { ; mutable X; void caller() { A a; foo(a); void caller() { A a; foo(a); void foo(const A &a); void foo(const A &a);

DOES THE REAL CONST EXIST? Kinda. Compilers mark functions as const (e.g. readonly in llvm) if they don t modify memory Maybe problem is somewhere else?

BUT IS THE CONST THE REAL PROBLEM?

THE SOLUTION Use Link Time Optimizations! 32GB of RAM? Then use ThinLTO/LIPO

VIRTUAL FUNCTIONS Is there a difference between C++ virtual functions and hand written 'virtual' functions in C? C++ standard doesn t explicitly mention UB with virtual functions But it say much about object lifetime

VIRTUAL FUNCTIONS int test(base *a) { int sum = 0; sum += a->foo(); sum += a->foo(); // Is it the same foo()? return sum; int Base::foo() { new (this) Derived; return 1;

CALLING MAIN int main(int argc, const char* argv[]) { if (argc == 0) return 0; printf("%s ", argv[0]); return main(argc - 1, argv + 1);

int main() { auto p = std::make_unique<int>(42); std::unique_ptr<int> p2 = std::move(p); *p = 42; std::cout << *p << std::endl;

int main() { auto p = std::make_unique<int>(42); std::unique_ptr<int> p2 = std::move(p);

int main() { auto p = std::make_unique<int>(42); std::move(p);

int main() { std::make_unique<int>(42);

int main() {

void fun(int *p, int *z) { *p = 42; if (!p) { *z = 54;

void fun(int *p, int *z) { *p = 42; /* if (!p) { *z = 54; */

void fun(int *p, int *z) { if (!p) { *z = 54; *p = 42;

void fun(int *p, int *z) { /* if (!p) { *z = 54; */ *p = 42;

QUESTIONS!