Checked C. Michael Hicks The University of Maryland joint work with David Tarditi (MSR), Andrew Ruef (UMD), Sam Elliott (UW)

Size: px
Start display at page:

Download "Checked C. Michael Hicks The University of Maryland joint work with David Tarditi (MSR), Andrew Ruef (UMD), Sam Elliott (UW)"

Transcription

1 Checked C Michael Hicks The University of Maryland joint work with David Tarditi (MSR), Andrew Ruef (UMD), Sam Elliott (UW) UM

2 Motivation - Lots of C/C++ code out there. - One open source code indexer (openhub.net) found 8.5 billion lines of C code. - Huge investment in existing (often unsafe) code - Many approaches saving C have been proposed: - Static/dynamic analysis (ASAN, Softbound), fuzzing, OS/HWlevel mitigations - Rewrite the code in a type-safe language (Java/Rust ) - New dialects of C (Cyclone, CCured, Deputy, ) 2

3 Checked C Checked C is another take at making system software written in C more reliable and secure. Approach: - Extend C so that type-safe programs can be written in it. - A new language risks introducing unnecessary differences. - Backward compatible, support incremental conversion of code to be type-safe. - Implement in widely used C compilers (Clang/LLVM) - Create automated conversion tools - Hypothesis: most source code behaves in a type-safe fashion. - Evaluate and get real-world use. Iterate based on experience

4 Making code type safe Design Static/dynamic bounds checking for pointers and arrays Initialization of data, type casts, explicit memory management No temporal safety enforcement at present (use GC) Backward compatibility All pointers binary-compatible (no fat pointers) Strict extension (parsing) of C Like Cyclone Unlike Cyclone Unchecked and checked code can co-exist (e.g., in a function) The former can use checked types in a looser manner Can prove a blame property that characterizes isolation Can work with improving hardware designs 4

5 Clang & LLVM Checked C's Clang Frontend Analyses LLVM IR Generator LLVM Analyses Optimizations Code Generators x86 / x64 / ARM / Other Assembly 5

6 Pointers T* val = malloc(sizeof(t)); Singleton Pointer ptr<t> val = malloc(sizeof(t)); T* vals = calloc(n, sizeof(t)); T vals[n] = {... ; Array Pointer array_ptr<t> vals = calloc(n, sizeof(t)); T vals checked[n] = {... ; 6

7 Bounds Declarations Declaration array_ptr<t> p : bounds(l, u) Invariant l p < u array_ptr<t> p : count(n) p p < p+n array_ptr<t> p : byte_count(n) p p < (char*)p + n Expressions in bounds(l, u) must be non-modifying - No Assignments or Increments/Decrements - No Calls 7

8 NUL Terminated Pointers char* str = calloc(n, sizeof(char)); char str[n] = {... ; NT Array Pointer nt_array_ptr<char> str : count(n-1) = calloc(n, sizeof(char)); char str checked[n+1] = {, \0 ; nt_array_ptr<t> p : bounds(l, u) l p u && c u. *c == \0 can read *u or write \0 to it 8

9 Bounds Expansion size_t my_strlcpy( nt_array_ptr<char> dst: count(dst_sz - 1), nt_array_ptr<char> src, size_t dst_sz) { size_t i = 0; nt_array_ptr<char> s : count(i) = src; while (s[i]!= \0 && i < dst_sz - 1) { //bounds on s may expand by 1 dst[i] = s[i]; ++i; dst[i] = \0 ; //ok to write to upper bound return i; 9

10 Where Dynamic Checks Occur int i = *p; *p = 0; Pointer Dereference Assignment p[n] p->field *p += 1; Compound Assignment (*p)++; Increment/Decrement Elided if the compiler can prove the access is safe 10

11 bool echo( int16_t user_length, size_t user_payload_len, char *user_payload, resp_t *resp) { char *resp_data = malloc(user_length); resp->payload_buf = resp_data; resp->payload_buf_len = user_length; Copy data from user_payload into new buffer in resp object Example inspired by Heartbleed error // memcpy(resp->payload_buf, user_payload_buf, user_length) for (int i = 0; i < user_length; i++) { resp->payload_buf[i] = user_payload_buf[i]; return true; user_length is provided by user user_payload_len is from the parser typedef struct { size_t payload_len; char *payload; //... resp_t; 11

12 bool echo( int16_t user_length, size_t user_payload_len, char *user_payload, resp_t *resp) { char *resp_data = malloc(user_length); resp->payload = resp_data; resp->payload_len = user_length; Copy data from user_payload into new buffer in resp object // memcpy(resp->payload_buf, user_payload_buf, user_length) for (int i = 0; i malloc < user_length; could fail i++) { resp->payload_buf[i] = user_payload_buf[i]; return true; user_length is provided by user user_payload_len is from the parser typedef struct { size_t payload_len; char *payload; //... resp_t; 12

13 bool echo( int16_t user_length, size_t user_payload_len, char *user_payload, resp_t *resp) { char *resp_data = malloc(user_length); resp->payload = resp_data; resp->payload_len = user_length; Copy data from user_payload into new buffer in resp object // memcpy(resp->payload, user_payload, user_length) for (size_t i = 0; i < user_length; i++) { resp->payload[i] = user_payload[i]; return true; user_length is user_payload_len is user_length provided by could from the be parser larger than user_payload_len typedef struct { size_t payload_len; char *payload; //... resp_t; 13

14 bool echo( int16_t user_length, size_t user_payload_len, array_ptr<char> user_payload, ptr<resp_t> resp) { array_ptr<char> resp_data = malloc(user_length); resp->payload = resp_data; resp->payload_len = user_length; Step 1: Manually Convert to Checked Types // memcpy(resp->payload, user_payload, user_length) for (size_t i = 0; i < user_length; i++) { resp->payload[i] = user_payload[i]; return true; typedef struct { size_t payload_len; array_ptr<char> payload; //... resp_t; 14

15 bool echo( int16_t user_length, size_t user_payload_len, array_ptr<char> user_payload : count(user_payload_len), ptr<resp_t> resp) { array_ptr<char> resp_data : count(user_length) = malloc(user_length) resp->payload = resp_data; resp->payload_len = user_length; Step 2: Manually // memcpy(resp->payload, user_payload, user_length) for (size_t i = 0; i < user_length; Add i++) Bounds { resp->payload[i] = user_payload[i]; Declarations return true; typedef struct { size_t payload_len; array_ptr<char> payload : count(payload_len); //... resp_t; 15

16 bool echo( int16_t user_length, Step 3: size_t user_payload_len, Compiler Inserts array_ptr<char> user_payload : count(user_payload_len), ptr<resp_t> resp) { Checks Automatically array_ptr<char> resp_data : count(user_length) = malloc(user_length) dynamic_check(resp!= NULL); resp->payload = resp_data; resp->payload_len = user_length; malloc now checked // memcpy(resp->payload, user_payload, user_length) for (size_t i = 0; i < user_length; i++) { dynamic_check(user_payload!= NULL); dynamic_check(user_payload <= &user_payload[i]); dynamic_check(&user_payload[i] < user_payload + user_payload_len); dynamic_check(resp->payload!= NULL); dynamic_check(resp->payload <= &resp->payload[i]); dynamic_check(&resp->payload[i] < resp->payload + resp->payload_le resp->payload[i] = user_payload[i]; No Memory Disclosure return true; 16

17 bool echo( int16_t user_length, Step 3: size_t user_payload_len, Compiler Inserts array_ptr<char> user_payload : count(user_payload_len), ptr<resp_t> resp) { Checks Automatically array_ptr<char> resp_data : count(user_length) = malloc(user_length) Code Not Bug-Free: Will signal run-time error if either dynamic_check(resp!= NULL); resp->payload = resp_data; resp->payload_len = user_length; - malloc(user_length) fails malloc now checked // memcpy(resp->payload, user_payload, user_length) for (size_t - user_length i = 0; i < user_length; > user_payload_len i++) { dynamic_check(user_payload!= NULL); dynamic_check(user_payload <= &user_payload[i]); dynamic_check(&user_payload[i] But: Vulnerable Executions < user_payload Prevented + user_payload_len); dynamic_check(resp->payload!= NULL); dynamic_check(resp->payload <= &resp->payload[i]); dynamic_check(&resp->payload[i] < resp->payload + resp->payload_le resp->payload[i] = user_payload[i]; No Memory Disclosure return true; 17

18 Step 4: bool echo( int16_t user_length, Restrictions on bounds size_t user_payload_len, array_ptr<char> user_payload : count(user_payload_len), ptr<resp_t> resp) { expressions may allow removal array_ptr<char> resp_data : count(user_length) = malloc(user_length) dynamic_check(resp!= NULL); resp->payload = resp_data; resp->payload_len = user_length; malloc still checked dynamic_check(user_payload!= NULL); dynamic_check(resp->payload!= NULL); // memcpy(resp->payload, user_payload, user_length) for (size_t i = 0; i < user_length; i++) { dynamic_check(i <= user_payload_len); resp->payload[i] = user_payload[i]; return true; No Memory Disclosure 18

19 Partially Converted Code - We may not want (or be able) to port all at once - Can use checked types wherever we want - Allows some hard-to-prove-safe idioms - But adds risk void more(int *b, int idx, ptr<int *>out) { int oldidx = idx, c; do { c = readvalue(); b[idx++] = c; //could overflow b? while (c!= 0); *out = b+idx-oldidx; //bad if out corrupted 19

20 Checked Regions - Checked regions are scopes (or functions or files) that are internally safe. Made so by disallowing - Explicit casts to checked pointers - Reads/writes via unchecked pointers - Use of varargs, or K&R prototypes - Calls to unchecked functions - Keep in mind checked pointers can appear anywhere - Different than prior systems which strongly partition safe and unsafe code 20

21 Checked Regions void foo(int *out) { _Ptr<int> ptrout; if (out!= (int *)0) { ptrout = (_Ptr<int>)out; // cast OK else { return; checked { int b checked[5][5]; for (int i = 0; i < 5; i++) { for (int j = 0; j < 5; j++) { b[i][j] = -1; // access safe *ptrout = b[0][0]; unchecked code checked code 21

22 Blame What assurance do checked regions give us? 1. Assuming accessible pointers well-formed, checked region pointer accesses are safe 2. Checked regions preserve well-formedness 3. Hence: Checked code cannot be blamed Proved this property in a simple formalization of the Checked C type system 22

23 Checked Interfaces Annotate Unchecked Pointers with Bounds, in code and in library function prototypes T* val : itype(ptr<t>) = malloc(sizeof(t)); T* vals : count(n) = calloc(n, sizeof(t)); size_t fwrite(void *p : byte_count(s*n), size_t s, size_t n, FILE *st : itype(_ptr<file>)); 23

24 Implementation 24

25 Implementation - Extended C grammar recognized by Clang (not easy!) - Guts of the implementation includes type checking, bounds inference, subsumption checking, dynamic check insertion - Dynamic checks placed so that subsequent LLVM optimization passes can eliminate redundant ones - Put serious effort into this, to get good performance 25

26 Preliminary Evaluation - How much code must be changed to port? - How hard is it to do? - What is the overhead on the converted code? - Running time, compilation time, executable size 26

27 Experimental Setup - Olden and Ptrdist Suites: 15 Programs - Used by CCured, Deputy, others - Converted by hand - 2 Conversions Incomplete - 12 Core Intel Xeon X5650, 2.66GHz, 32GB RAM 27

28 Benchmarks Benchmark LoC Description Barnes & Hut N-body force computation Olden: bh 1,162 Olden: bisort 263 Forward & Backward Bitonic Sort Olden: em3d 478 3D Electromagnetic Wave Propagation Olden: health 389 Columbian Health Care Simulation Olden: mst 328 Minimum Spanning Tree Olden: perimeter 399 Perimeters of Regions on Images Olden: power 458 Power Pricing Optimisation Solver Olden: treeadd 180 Recursive Sum over Tree Olden: tsp 420 Travelling Salesman Problem Computes voronoi diagram of a set of points Olden: voronoi 814 Ptrdist: anagram 362 Finding Anagrams from a Dictionary Arbitrary precision calculator Minimum Spanning Tree using Fibonacci heaps Ptrdist: bc 5,194 Ptrdist: ft 893 Ptrdist: ks 552 Schweikert-Kernighan Graph Partitioning Ptrdist: yacr2 2,529 VSLI Channel Router 28

29 Code Modifications Benchmark bh bisort em3d health mst perimeter power treadd tsp anagram ft ks yacr2 14.6% 0% 10% 20% 30% Lines Modified (%) Benchmark bh bisort em3d health mst perimeter power treadd tsp anagram ft ks yacr2 10.6% bh bisort em3d health mst perimeter power treadd tsp anagram ft ks yacr2 83.9% 0% 20% 40% 60% 80% 100% Easy Modifications (%) Suite Olden Ptrdist 0% 10% 20% 30% Lines Unchecked (%) 29

30 Performance Overhead Benchmark bh bisort em3d health mst perimeter power treadd tsp anagram ft ks yacr % 20% 0% + 20% + 40% + 60% Runtime Slowdown (±%) Benchmark bh bisort em3d health mst perimeter power treadd tsp anagram ft ks yacr % bh bisort em3d health mst perimeter power treadd tsp anagram ft ks yacr2 20% 0% + 20% + 40% + 60% Executable Size Change (±%) % 25% 0% + 25% + 50% + 75% Compile Time Slowdown (±%) Suite Olden Ptrdist 30

31 In Progress - Full implementation of NUL-terminated arrays, general pointer arithmetic - Need data flow analysis to handle changing bounds via pointer arithmetic (flow-sensitive types) - Automated conversion of C code - Partially complete inference of ptr<t> types (based on CCured) - Working on inference of bounds expressions via abstract interpretation framework - Smaller stuff - Definite initialization, better subsumption checks, - More experience (and then some) - Temporal safety (longer term) 31

32 Summary - Checked C is a new effort to make C safe. Key design choices - Binary compatible with C pointers annotated with bounds declarations, not made fat - An extension, not a replacement, with a design to help reason about mixed code - Part of LLVM, to encourage production use - Focus on spatial safety for now; temporal safety via GC in meantime

Putting the Checks into Checked C. Archibald Samuel Elliott Quals - 31st Oct 2017

Putting the Checks into Checked C. Archibald Samuel Elliott Quals - 31st Oct 2017 Putting the Checks into Checked C Archibald Samuel Elliott Quals - 31st Oct 2017 Added Runtime Bounds Checks to the Checked C Compiler 2 C Extension for Spatial Memory Safety Added Runtime Bounds Checks

More information

Putting the Checks into Checked C

Putting the Checks into Checked C Putting the Checks into Checked C Checked C Technical Report Number 2 31 October 2017 Archibald Samuel Elliott Paul G. Allen School, University of Washington Summary Checked C is an extension to C that

More information

APPENDIX Summary of Benchmarks

APPENDIX Summary of Benchmarks 158 APPENDIX Summary of Benchmarks The experimental results presented throughout this thesis use programs from four benchmark suites: Cyclone benchmarks (available from [Cyc]): programs used to evaluate

More information

Static Transformation for Heap Layout Using Memory Access Patterns

Static Transformation for Heap Layout Using Memory Access Patterns Static Transformation for Heap Layout Using Memory Access Patterns Jinseong Jeon Computer Science, KAIST Static Transformation computing machine compiler user + static transformation Compilers can transform

More information

Baggy bounds checking. Periklis Akri5dis, Manuel Costa, Miguel Castro, Steven Hand

Baggy bounds checking. Periklis Akri5dis, Manuel Costa, Miguel Castro, Steven Hand Baggy bounds checking Periklis Akri5dis, Manuel Costa, Miguel Castro, Steven Hand C/C++ programs are vulnerable Lots of exis5ng code in C and C++ More being wrieen every day C/C++ programs are prone to

More information

Ironclad C++ A Library-Augmented Type-Safe Subset of C++

Ironclad C++ A Library-Augmented Type-Safe Subset of C++ Ironclad C++ A Library-Augmented Type-Safe Subset of C++ Christian DeLozier, Richard Eisenberg, Peter-Michael Osera, Santosh Nagarakatte*, Milo M. K. Martin, and Steve Zdancewic October 30, 2013 University

More information

CS527 Software Security

CS527 Software Security Security Policies Purdue University, Spring 2018 Security Policies A policy is a deliberate system of principles to guide decisions and achieve rational outcomes. A policy is a statement of intent, and

More information

HardBound: Architectural Support for Spatial Safety of the C Programming Language

HardBound: Architectural Support for Spatial Safety of the C Programming Language HardBound: Architectural Support for Spatial Safety of the C Programming Language Joe Devietti *, Colin Blundell, Milo Martin, Steve Zdancewic * University of Washington, University of Pennsylvania devietti@cs.washington.edu,

More information

CCured. One-Slide Summary. Lecture Outline. Type-Safe Retrofitting of C Programs

CCured. One-Slide Summary. Lecture Outline. Type-Safe Retrofitting of C Programs CCured Type-Safe Retrofitting of C Programs [Necula, McPeak,, Weimer, Condit, Harren] #1 One-Slide Summary CCured enforces memory safety and type safety in legacy C programs. CCured analyzes how you use

More information

Memory Safety for Embedded Devices with nescheck

Memory Safety for Embedded Devices with nescheck Memory Safety for Embedded Devices with nescheck Daniele MIDI, Mathias PAYER, Elisa BERTINO Purdue University AsiaCCS 2017 Ubiquitous Computing and Security Sensors and WSNs are pervasive Small + cheap

More information

Transparent Pointer Compression for Linked Data Structures

Transparent Pointer Compression for Linked Data Structures Transparent Pointer Compression for Linked Data Structures lattner@cs.uiuc.edu Vikram Adve vadve@cs.uiuc.edu June 12, 2005 MSP 2005 http://llvm.cs.uiuc.edu llvm.cs.uiuc.edu/ Growth of 64-bit computing

More information

A program execution is memory safe so long as memory access errors never occur:

A program execution is memory safe so long as memory access errors never occur: A program execution is memory safe so long as memory access errors never occur: Buffer overflows, null pointer dereference, use after free, use of uninitialized memory, illegal free Memory safety categories

More information

Low level security. Andrew Ruef

Low level security. Andrew Ruef Low level security Andrew Ruef What s going on Stuff is getting hacked all the time We re writing tons of software Often with little regard to reliability let alone security The regulatory environment

More information

Compressing Heap Data for Improved Memory Performance

Compressing Heap Data for Improved Memory Performance Compressing Heap Data for Improved Memory Performance Youtao Zhang Rajiv Gupta Department of Computer Science Department of Computer Science The University of Texas at Dallas The University of Arizona

More information

CSE409, Rob Johnson, Alin Tomescu, November 11 th, 2011 Buffer overflow defenses

CSE409, Rob Johnson,   Alin Tomescu, November 11 th, 2011 Buffer overflow defenses Buffer overflow defenses There are two categories of buffer-overflow defenses: - Make it hard for the attacker to exploit buffer overflow o Address space layout randomization o Model checking to catch

More information

Short Notes of CS201

Short Notes of CS201 #includes: Short Notes of CS201 The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with < and > if the file is a system

More information

CS201 - Introduction to Programming Glossary By

CS201 - Introduction to Programming Glossary By CS201 - Introduction to Programming Glossary By #include : The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with

More information

CS-527 Software Security

CS-527 Software Security CS-527 Software Security Memory Safety Asst. Prof. Mathias Payer Department of Computer Science Purdue University TA: Kyriakos Ispoglou https://nebelwelt.net/teaching/17-527-softsec/ Spring 2017 Eternal

More information

SoK: Eternal War in Memory Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song In: Oakland 14

SoK: Eternal War in Memory Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song In: Oakland 14 SoK: Eternal War in Memory Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song In: Oakland 14 Presenter: Mathias Payer, EPFL http://hexhive.github.io 1 Memory attacks: an ongoing war Vulnerability classes

More information

Chapter 10. Improving the Runtime Type Checker Type-Flow Analysis

Chapter 10. Improving the Runtime Type Checker Type-Flow Analysis 122 Chapter 10 Improving the Runtime Type Checker The runtime overhead of the unoptimized RTC is quite high, because it instruments every use of a memory location in the program and tags every user-defined

More information

COS 320. Compiling Techniques

COS 320. Compiling Techniques Topic 5: Types COS 320 Compiling Techniques Princeton University Spring 2016 Lennart Beringer 1 Types: potential benefits (I) 2 For programmers: help to eliminate common programming mistakes, particularly

More information

in memory: an evolution of attacks Mathias Payer Purdue University

in memory: an evolution of attacks Mathias Payer Purdue University in memory: an evolution of attacks Mathias Payer Purdue University Images (c) MGM, WarGames, 1983 Memory attacks: an ongoing war Vulnerability classes according to CVE Memory

More information

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1 CSE 401 Compilers Static Semantics Hal Perkins Winter 2009 2/3/2009 2002-09 Hal Perkins & UW CSE I-1 Agenda Static semantics Types Symbol tables General ideas for now; details later for MiniJava project

More information

Homework 3 CS161 Computer Security, Fall 2008 Assigned 10/07/08 Due 10/13/08

Homework 3 CS161 Computer Security, Fall 2008 Assigned 10/07/08 Due 10/13/08 Homework 3 CS161 Computer Security, Fall 2008 Assigned 10/07/08 Due 10/13/08 For your solutions you should submit a hard copy; either hand written pages stapled together or a print out of a typeset document

More information

CIS 551 / TCOM 401 Computer and Network Security. Spring 2007 Lecture 2

CIS 551 / TCOM 401 Computer and Network Security. Spring 2007 Lecture 2 CIS 551 / TCOM 401 Computer and Network Security Spring 2007 Lecture 2 Announcements First project is on the web Due: Feb. 1st at midnight Form groups of 2 or 3 people If you need help finding a group,

More information

SoftBound: Highly Compatible and Complete Spatial Memory Safety for C

SoftBound: Highly Compatible and Complete Spatial Memory Safety for C University of Pennsylvania ScholarlyCommons Technical Reports (CIS) Department of Computer & Information Science January 2009 SoftBound: Highly Compatible and Complete Spatial Memory Safety for C Santosh

More information

15. Pointers, Algorithms, Iterators and Containers II

15. Pointers, Algorithms, Iterators and Containers II 498 Recall: Pointers running over the Array 499 Beispiel 15. Pointers, Algorithms, Iterators and Containers II int a[5] = 3, 4, 6, 1, 2; for (int p = a; p < a+5; ++p) std::cout

More information

Pointers, Dynamic Data, and Reference Types

Pointers, Dynamic Data, and Reference Types Pointers, Dynamic Data, and Reference Types Review on Pointers Reference Variables Dynamic Memory Allocation The new operator The delete operator Dynamic Memory Allocation for Arrays 1 C++ Data Types simple

More information

PtrSplit: Supporting General Pointers in Automatic Program Partitioning

PtrSplit: Supporting General Pointers in Automatic Program Partitioning PtrSplit: Supporting General Pointers in Automatic Program Partitioning Shen Liu Gang Tan Trent Jaeger Computer Science and Engineering Department The Pennsylvania State University 11/02/2017 Motivation

More information

Inference of Memory Bounds

Inference of Memory Bounds Research Review 2017 Will Klieber, software security researcher Joint work with Will Snavely public release and unlimited distribution. 1 Copyright 2017 Carnegie Mellon University. All Rights Reserved.

More information

The CHERI capability model Revisiting RISC in an age of risk

The CHERI capability model Revisiting RISC in an age of risk CTS CTSRDCRASH-worthy Trustworthy Systems Research and Development The CHERI capability model Revisiting RISC in an age of risk Jonathan Woodruff, Robert N. M. Watson, David Chisnall, Simon W. Moore, Jonathan

More information

An Efficient and Backwards-Compatible Transformation to Ensure Memory Safety of C Programs

An Efficient and Backwards-Compatible Transformation to Ensure Memory Safety of C Programs An Efficient and Backwards-Compatible Transformation to Ensure Memory Safety of C Programs Wei Xu, Daniel C. DuVarney, and R. Sekar Department of Computer Science, Stony Brook University Stony Brook, NY

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Winter 2017 Lecture 4a Andrew Tolmach Portland State University 1994-2017 Semantics and Erroneous Programs Important part of language specification is distinguishing valid from

More information

SoftBound: Highly Compatible and Complete Spatial Safety for C

SoftBound: Highly Compatible and Complete Spatial Safety for C SoftBound: Highly Compatible and Complete Spatial Safety for C Santosh Nagarakatte, Jianzhou Zhao, Milo Martin, Steve Zdancewic University of Pennsylvania {santoshn, jianzhou, milom, stevez}@cis.upenn.edu

More information

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus Types Type checking What is a type? The notion varies from language to language Consensus A set of values A set of operations on those values Classes are one instantiation of the modern notion of type

More information

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui Projects 1 Information flow analysis for mobile applications 2 2 Machine-learning-guide typestate analysis for UAF vulnerabilities 3 3 Preventing

More information

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result. Every program uses data, either explicitly or implicitly to arrive at a result. Data in a program is collected into data structures, and is manipulated by algorithms. Algorithms + Data Structures = Programs

More information

Unleashing D* on Android Kernel Drivers. Aravind Machiry

Unleashing D* on Android Kernel Drivers. Aravind Machiry Unleashing D* on Android Kernel Drivers Aravind Machiry (@machiry_msidc) $ whoami Fourth year P.h.D Student at University of California, Santa Barbara. Vulnerability Detection in System software. machiry.github.io

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Fall 2016 Lecture 3a Andrew Tolmach Portland State University 1994-2016 Formal Semantics Goal: rigorous and unambiguous definition in terms of a wellunderstood formalism (e.g.

More information

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline CS 0 Lecture 8 Chapter 5 Louden Outline The symbol table Static scoping vs dynamic scoping Symbol table Dictionary associates names to attributes In general: hash tables, tree and lists (assignment ) can

More information

18-642: Code Style for Compilers

18-642: Code Style for Compilers 18-642: Code Style for Compilers 9/25/2017 1 Anti-Patterns: Coding Style: Language Use Code compiles with warnings Warnings are turned off or over-ridden Insufficient warning level set Language safety

More information

QUIZ. 1. Explain the meaning of the angle brackets in the declaration of v below:

QUIZ. 1. Explain the meaning of the angle brackets in the declaration of v below: QUIZ 1. Explain the meaning of the angle brackets in the declaration of v below: This is a template, used for generic programming! QUIZ 2. Why is the vector class called a container? 3. Explain how the

More information

Stack Bounds Protection with Low Fat Pointers

Stack Bounds Protection with Low Fat Pointers Stack Bounds Protection with Low Fat Pointers Gregory J. Duck, Roland H.C. Yap, and Lorenzo Cavallaro NDSS 2017 Overview Heap Bounds Protection with Low Fat Pointers, CC 2016 Stack New method for detecting

More information

Baggy bounds with LLVM

Baggy bounds with LLVM Baggy bounds with LLVM Anton Anastasov Chirantan Ekbote Travis Hance 6.858 Project Final Report 1 Introduction Buffer overflows are a well-known security problem; a simple buffer-overflow bug can often

More information

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

SEMANTIC ANALYSIS TYPES AND DECLARATIONS SEMANTIC ANALYSIS CS 403: Type Checking Stefan D. Bruda Winter 2015 Parsing only verifies that the program consists of tokens arranged in a syntactically valid combination now we move to check whether

More information

18-642: Code Style for Compilers

18-642: Code Style for Compilers 18-642: Code Style for Compilers 9/6/2018 2017-2018 Philip Koopman Programming can be fun, so can cryptography; however they should not be combined. Kreitzberg and Shneiderman 2017-2018 Philip Koopman

More information

IR Optimization. May 15th, Tuesday, May 14, 13

IR Optimization. May 15th, Tuesday, May 14, 13 IR Optimization May 15th, 2013 Tuesday, May 14, 13 But before we talk about IR optimization... Wrapping up IR generation Tuesday, May 14, 13 Three-Address Code Or TAC The IR that you will be using for

More information

Buffer overflow background

Buffer overflow background and heap buffer background Comp Sci 3600 Security Heap Outline and heap buffer Heap 1 and heap 2 3 buffer 4 5 Heap Outline and heap buffer Heap 1 and heap 2 3 buffer 4 5 Heap Address Space and heap buffer

More information

HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer

HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer HexType: Efficient Detection of Type Confusion Errors for C++ Yuseok Jeon Priyam Biswas Scott A. Carr Byoungyoung Lee Mathias Payer Motivation C++ is a popular programming language Google Chrome, Firefox,

More information

Secure Virtual Architecture: Using LLVM to Provide Memory Safety to the Entire Software Stack

Secure Virtual Architecture: Using LLVM to Provide Memory Safety to the Entire Software Stack Secure Virtual Architecture: Using LLVM to Provide Memory Safety to the Entire Software Stack John Criswell, University of Illinois Andrew Lenharth, University of Illinois Dinakar Dhurjati, DoCoMo Communications

More information

September 10,

September 10, September 10, 2013 1 Bjarne Stroustrup, AT&T Bell Labs, early 80s cfront original C++ to C translator Difficult to debug Potentially inefficient Many native compilers exist today C++ is mostly upward compatible

More information

CS 161 Computer Security

CS 161 Computer Security Wagner Spring 2014 CS 161 Computer Security 1/27 Reasoning About Code Often functions make certain assumptions about their arguments, and it is the caller s responsibility to make sure those assumptions

More information

CS558 Programming Languages

CS558 Programming Languages CS558 Programming Languages Fall 2017 Lecture 3a Andrew Tolmach Portland State University 1994-2017 Binding, Scope, Storage Part of being a high-level language is letting the programmer name things: variables

More information

Programming refresher and intro to C programming

Programming refresher and intro to C programming Applied mechatronics Programming refresher and intro to C programming Sven Gestegård Robertz sven.robertz@cs.lth.se Department of Computer Science, Lund University 2018 Outline 1 C programming intro 2

More information

Tutorial 1: Introduction to C Computer Architecture and Systems Programming ( )

Tutorial 1: Introduction to C Computer Architecture and Systems Programming ( ) Systems Group Department of Computer Science ETH Zürich Tutorial 1: Introduction to C Computer Architecture and Systems Programming (252-0061-00) Herbstsemester 2012 Goal Quick introduction to C Enough

More information

Lecture Notes on Types in C

Lecture Notes on Types in C Lecture Notes on Types in C 15-122: Principles of Imperative Computation Frank Pfenning, Rob Simmons Lecture 20 April 2, 2013 1 Introduction In lecture 18, we emphasized the things we lost by going to

More information

Cling: A Memory Allocator to Mitigate Dangling Pointers. Periklis Akritidis

Cling: A Memory Allocator to Mitigate Dangling Pointers. Periklis Akritidis Cling: A Memory Allocator to Mitigate Dangling Pointers Periklis Akritidis --2010 Use-after-free Vulnerabilities Accessing Memory Through Dangling Pointers Techniques : Heap Spraying, Feng Shui Manual

More information

Types and Type Inference

Types and Type Inference CS 242 2012 Types and Type Inference Notes modified from John Mitchell and Kathleen Fisher Reading: Concepts in Programming Languages, Revised Chapter 6 - handout on Web!! Outline General discussion of

More information

SAFECODE: A PLATFORM FOR DEVELOPING RELIABLE SOFTWARE IN UNSAFE LANGUAGES DINAKAR DHURJATI

SAFECODE: A PLATFORM FOR DEVELOPING RELIABLE SOFTWARE IN UNSAFE LANGUAGES DINAKAR DHURJATI SAFECODE: A PLATFORM FOR DEVELOPING RELIABLE SOFTWARE IN UNSAFE LANGUAGES BY DINAKAR DHURJATI B.Tech., Indian Institute of Technology - Delhi, 2000 DISSERTATION Submitted in partial fulfillment of the

More information

Nail. A Practical Tool for Parsing and Generating Data Formats. Julian Bangert, Nickolai Zeldovich MIT CSAIL. OSDI 14 October 2014

Nail. A Practical Tool for Parsing and Generating Data Formats. Julian Bangert, Nickolai Zeldovich MIT CSAIL. OSDI 14 October 2014 Nail A Practical Tool for Parsing and Generating Data Formats Julian Bangert, Nickolai Zeldovich MIT CSAIL OSDI 14 October 2014 1 / 12 Motivation Parsing Vulnerabilities hand-crafted input parsing and

More information

14. Pointers, Algorithms, Iterators and Containers II

14. Pointers, Algorithms, Iterators and Containers II Recall: Pointers running over the Array Beispiel 14. Pointers, Algorithms, Iterators and Containers II Iterations with Pointers, Arrays: Indices vs. Pointers, Arrays and Functions, Pointers and const,

More information

CSCI 171 Chapter Outlines

CSCI 171 Chapter Outlines Contents CSCI 171 Chapter 1 Overview... 2 CSCI 171 Chapter 2 Programming Components... 3 CSCI 171 Chapter 3 (Sections 1 4) Selection Structures... 5 CSCI 171 Chapter 3 (Sections 5 & 6) Iteration Structures

More information

COMPUTER SCIENCE TRIPOS

COMPUTER SCIENCE TRIPOS CST1.2018.4.1 COMPUTER SCIENCE TRIPOS Part IB Monday 4 June 2018 1.30 to 4.30 COMPUTER SCIENCE Paper 4 Answer five questions: up to four questions from Section A, and at least one question from Section

More information

Language Security. Lecture 40

Language Security. Lecture 40 Language Security Lecture 40 (from notes by G. Necula) Prof. Hilfinger CS 164 Lecture 40 1 Lecture Outline Beyond compilers Looking at other issues in programming language design and tools C Arrays Exploiting

More information

Fall 2018 Discussion 2: September 3, 2018

Fall 2018 Discussion 2: September 3, 2018 CS 61C C Basics Fall 2018 Discussion 2: September 3, 2018 1 C C is syntactically similar to Java, but there are a few key differences: 1. C is function-oriented, not object-oriented; there are no objects.

More information

Safe arrays and pointers for C

Safe arrays and pointers for C Safe arrays and pointers for C John Nagle July, 2012 Introduction Buffer overflows continue to plague C and C++ programs. This is a draft proposal for dealing with that problem. The basis of this proposal

More information

Enforcing Textual Alignment of

Enforcing Textual Alignment of Parallel Hardware Parallel Applications IT industry (Silicon Valley) Parallel Software Users Enforcing Textual Alignment of Collectives using Dynamic Checks and Katherine Yelick UC Berkeley Parallel Computing

More information

CS321 Languages and Compiler Design I Winter 2012 Lecture 13

CS321 Languages and Compiler Design I Winter 2012 Lecture 13 STATIC SEMANTICS Static Semantics are those aspects of a program s meaning that can be studied at at compile time (i.e., without running the program). Contrasts with Dynamic Semantics, which describe how

More information

Pointer Ownership Model

Pointer Ownership Model Pointer Ownership Model Abstract Pointers are a dangerous feature provided by C/C++, and incorrect use of pointers is a common source of bugs and vulnerabilities. Most new languages lack pointers or severely

More information

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited CMSC 330: Organization of Programming Languages Type Systems, Names & Binding Topics Covered Thus Far Programming languages Syntax specification Regular expressions Context free grammars Implementation

More information

Today Program Analysis for finding bugs, especially security bugs problem specification motivation approaches remaining issues

Today Program Analysis for finding bugs, especially security bugs problem specification motivation approaches remaining issues Finding Bugs Last time Run-time reordering transformations Today Program Analysis for finding bugs, especially security bugs problem specification motivation approaches remaining issues CS553 Lecture Finding

More information

Programming in C and C++

Programming in C and C++ Programming in C and C++ Types, Variables, Expressions and Statements Neel Krishnaswami and Alan Mycroft Course Structure Basics of C: Types, variables, expressions and statements Functions, compilation

More information

Motivation was to facilitate development of systems software, especially OS development.

Motivation was to facilitate development of systems software, especially OS development. A History Lesson C Basics 1 Development of language by Dennis Ritchie at Bell Labs culminated in the C language in 1972. Motivation was to facilitate development of systems software, especially OS development.

More information

Advances in Programming Languages: Regions

Advances in Programming Languages: Regions Advances in Programming Languages: Regions Allan Clark and Stephen Gilmore The University of Edinburgh February 22, 2007 Introduction The design decision that memory will be managed on a per-language basis

More information

Secure Low-Level Programming via Hardware-Assisted Memory-Safe C

Secure Low-Level Programming via Hardware-Assisted Memory-Safe C Secure Low-Level Programming via Hardware-Assisted Memory-Safe C Prof. Milo Martin Santosh Nagarakatte, Joe Devietti, Jianzhou Zhao, Colin Blundell Prof. Steve Zdancewic University of Pennsylvania Please

More information

Secure Programming. An introduction to Splint. Informatics and Mathematical Modelling Technical University of Denmark E

Secure Programming. An introduction to Splint. Informatics and Mathematical Modelling Technical University of Denmark E Secure Programming An introduction to Splint Christian D. Jensen René Rydhof Hansen Informatics and Mathematical Modelling Technical University of Denmark E05-02230 CDJ/RRH (IMM/DTU) Secure Programming

More information

Recitation #11 Malloc Lab. November 7th, 2017

Recitation #11 Malloc Lab. November 7th, 2017 18-600 Recitation #11 Malloc Lab November 7th, 2017 1 2 Important Notes about Malloc Lab Malloc lab has been updated from previous years Supports a full 64 bit address space rather than 32 bit Encourages

More information

Programming. Pointers, Multi-dimensional Arrays and Memory Management

Programming. Pointers, Multi-dimensional Arrays and Memory Management Programming Pointers, Multi-dimensional Arrays and Memory Management Summary } Computer Memory } Pointers } Declaration, assignment, arithmetic and operators } Casting and printing pointers } Relationship

More information

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms SE352b Software Engineering Design Tools W3: Programming Paradigms Feb. 3, 2005 SE352b, ECE,UWO, Hamada Ghenniwa SE352b: Roadmap CASE Tools: Introduction System Programming Tools Programming Paradigms

More information

Lecture 27. Pros and Cons of Pointers. Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis

Lecture 27. Pros and Cons of Pointers. Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis Pros and Cons of Pointers Lecture 27 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis Many procedural languages have pointers

More information

Lecture 14 Pointer Analysis

Lecture 14 Pointer Analysis Lecture 14 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis [ALSU 12.4, 12.6-12.7] Phillip B. Gibbons 15-745: Pointer Analysis

More information

Understanding Undefined Behavior

Understanding Undefined Behavior Session Developer Tools #WWDC17 Understanding Undefined Behavior 407 Fred Riss, Clang Team Ryan Govostes, Security Engineering and Architecture Team Anna Zaks, Program Analysis Team 2017 Apple Inc. All

More information

Lecture 20 Pointer Analysis

Lecture 20 Pointer Analysis Lecture 20 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis (Slide content courtesy of Greg Steffan, U. of Toronto) 15-745:

More information

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam Compilers Type checking Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Summary of parsing Parsing A solid foundation: context-free grammars A simple parser: LL(1) A more powerful parser:

More information

logistics: ROP assignment

logistics: ROP assignment bug-finding 1 logistics: ROP assignment 2 2013 memory safety landscape 3 2013 memory safety landscape 4 different design points memory safety most extreme disallow out of bounds usually even making out-of-bounds

More information

Programming Languages Third Edition. Chapter 7 Basic Semantics

Programming Languages Third Edition. Chapter 7 Basic Semantics Programming Languages Third Edition Chapter 7 Basic Semantics Objectives Understand attributes, binding, and semantic functions Understand declarations, blocks, and scope Learn how to construct a symbol

More information

Hyperkernel: Push-Button Verification of an OS Kernel

Hyperkernel: Push-Button Verification of an OS Kernel Hyperkernel: Push-Button Verification of an OS Kernel Luke Nelson, Helgi Sigurbjarnarson, Kaiyuan Zhang, Dylan Johnson, James Bornholt, Emina Torlak, and Xi Wang The OS Kernel is a critical component Essential

More information

Lecture Notes on Polymorphism

Lecture Notes on Polymorphism Lecture Notes on Polymorphism 15-122: Principles of Imperative Computation Frank Pfenning Lecture 21 November 9, 2010 1 Introduction In this lecture we will start the transition from C0 to C. In some ways,

More information

Overview AEG Conclusion CS 6V Automatic Exploit Generation (AEG) Matthew Stephen. Department of Computer Science University of Texas at Dallas

Overview AEG Conclusion CS 6V Automatic Exploit Generation (AEG) Matthew Stephen. Department of Computer Science University of Texas at Dallas CS 6V81.005 Automatic Exploit Generation (AEG) Matthew Stephen Department of Computer Science University of Texas at Dallas February 20 th, 2012 Outline 1 Overview Introduction Considerations 2 AEG Challenges

More information

Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs. {livshits,

Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs. {livshits, Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs {livshits, lam}@cs.stanford.edu 2 Background Software systems are getting bigger Harder to develop Harder to modify Harder

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Type Systems, Names and Binding CMSC 330 - Spring 2013 1 Topics Covered Thus Far! Programming languages Ruby OCaml! Syntax specification Regular expressions

More information

CCured: Type-Safe Retrofitting of Legacy Code

CCured: Type-Safe Retrofitting of Legacy Code Published in the Proceedings of the Principles of Programming Languages, 2002, pages 128 139 CCured: Type-Safe Retrofitting of Legacy Code George C. Necula Scott McPeak Westley Weimer University of California,

More information

Data-Centric Locality in Chapel

Data-Centric Locality in Chapel Data-Centric Locality in Chapel Ben Harshbarger Cray Inc. CHIUW 2015 1 Safe Harbor Statement This presentation may contain forward-looking statements that are based on our current expectations. Forward

More information

Important From Last Time

Important From Last Time Important From Last Time Embedded C Pros and cons Macros and how to avoid them Intrinsics Interrupt syntax Inline assembly Today Advanced C What C programs mean How to create C programs that mean nothing

More information

Region-Based Memory Management in Cyclone

Region-Based Memory Management in Cyclone Region-Based Memory Management in Cyclone Dan Grossman Cornell University June 2002 Joint work with: Greg Morrisett, Trevor Jim (AT&T), Michael Hicks, James Cheney, Yanling Wang Cyclone A safe C-level

More information

System Assertions. Andreas Zeller

System Assertions. Andreas Zeller System Assertions Andreas Zeller System Invariants Some properties of a program must hold over the entire run: must not access data of other processes must handle mathematical exceptions must not exceed

More information

Compiling Techniques

Compiling Techniques Lecture 2: The view from 35000 feet 19 September 2017 Table of contents 1 2 Passes Representations 3 Instruction Selection Register Allocation Instruction Scheduling 4 of a compiler Source Compiler Machine

More information

The Low-Level Bounded Model Checker LLBMC

The Low-Level Bounded Model Checker LLBMC The Low-Level Bounded Model Checker LLBMC A Precise Memory Model for LLBMC Carsten Sinz Stephan Falke Florian Merz October 7, 2010 VERIFICATION MEETS ALGORITHM ENGINEERING KIT University of the State of

More information

Ivy: Modernizing C. David Gay, Intel Labs Berkeley

Ivy: Modernizing C. David Gay, Intel Labs Berkeley Ivy: Modernizing C David Gay, Intel Labs Berkeley Joint work between UCB and Intel Berkeley: George Necula, Eric Brewer, Jeremy Condit, Zachary Anderson, Matt Harren, Feng Zhou, Bill McCloskey, Robert

More information

Motivation was to facilitate development of systems software, especially OS development.

Motivation was to facilitate development of systems software, especially OS development. A History Lesson C Basics 1 Development of language by Dennis Ritchie at Bell Labs culminated in the C language in 1972. Motivation was to facilitate development of systems software, especially OS development.

More information