Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 22nd September 2017

Similar documents
Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 14th June 2017

Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) 1

GNAT Pro Innovations for High-Integrity Development

Using Ada in non-ada systems

SPARKSkein A Formal and Fast Reference Implementation of Skein

A student was asked to point out interface elements in this code: Answer: cout. What is wrong?

AdaCore Vendor Presentation

SPARK 2014 User s Guide

Dynamic Control Hazard Avoidance

Ch. 11: References & the Copy-Constructor. - continued -

Spark verification features

Page 1. Last Time. Today. Embedded Compilers. Compiler Requirements. What We Get. What We Want

18-642: Code Style for Compilers

CS 31: Intro to Systems Caching. Martin Gagne Swarthmore College March 23, 2017

Page 1. Stuff. Last Time. Today. Safety-Critical Systems MISRA-C. Terminology. Interrupts Inline assembly Intrinsics

QUIZ. What are 3 differences between C and C++ const variables?

QUIZ How do we implement run-time constants and. compile-time constants inside classes?

Hybrid Verification in SPARK 2014: Combining Formal Methods with Testing

Implementation Guidance for the Adoption of SPARK

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY

explicit class and default definitions revision of SC22/WG21/N1582 =

The SECURE Project and GCC

Ada and Real-Time. Prof. Lars Asplund. Mälardalen University, Computer Science

Background. Let s see what we prescribed.

Program Verification (6EC version only)

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

CITS5501 Software Testing and Quality Assurance Formal methods

CS 31: Intro to Systems Virtual Memory. Kevin Webb Swarthmore College November 15, 2018

SECUDRIVE Sanitizer Portable User Guide

Exception Handling. Sometimes when the computer tries to execute a statement something goes wrong:

CSE 333 Lecture 1 - Systems programming

Lecture 1: Perfect Security

Exception Handling. Run-time Errors. Methods Failure. Sometimes when the computer tries to execute a statement something goes wrong:

Formal Verification Techniques for GPU Kernels Lecture 1

Pipelining - Part 2. CSE Computer Systems October 19, 2001

Formal Verification in Aeronautics: Current Practice and Upcoming Standard. Yannick Moy, AdaCore ACSL Workshop, Fraunhofer FIRST

All code must follow best practices. Part (but not all) of this is adhering to the following guidelines:

Functions in MIPS. Functions in MIPS 1

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2015 Lecture 25

CSE 12 Abstract Syntax Trees

In examining performance Interested in several things Exact times if computable Bounded times if exact not computable Can be measured

Green Hills Software, Inc.

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology

Software Engineering 2 A practical course in software engineering. Ekkart Kindler

How much is a mechanized proof worth, certification-wise?

Vector and Free Store (Vectors and Arrays)

Performance analysis basics

CSI31 Introduction to Computer Programming I. Dr. Sharon Persinger Fall

Reliable programming

1 Parsing (25 pts, 5 each)

PROCESS VIRTUAL MEMORY. CS124 Operating Systems Winter , Lecture 18

Unit 2. Chapter 4 Cache Memory

William Stallings Computer Organization and Architecture 8th Edition. Cache Memory

Register Allocation. Lecture 16

1: Introduction to Object (1)

AdaCore technologies

#include <stdio.h> int main() { printf ("hello class\n"); return 0; }

Lecture 5. Announcements: Today: Finish up functions in MIPS

Labels and Information Flow

The name of our class will be Yo. Type that in where it says Class Name. Don t hit the OK button yet.

Lecture 7: Data Abstractions

Homework #2 Think in C, Write in Assembly

Today. Putting it all together

Software Design and Analysis for Engineers

Secure Programming Techniques

High Performance Computing and Programming, Lecture 3

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Lecture 13. Notation. The rules. Evaluation Rules So Far

Important From Last Time

CIS 890: Safety Critical Systems

Formal semantics and other research around Spark2014

Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world.

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds.

Module Outline. CPU Memory interaction Organization of memory modules Cache memory Mapping and replacement policies.

Architecture as Interface

CSc 520 Principles of Programming Languages

Memory Management: Virtual Memory and Paging CS 111. Operating Systems Peter Reiher

Introduction to Lab 2

ANITA S SUPER AWESOME RECITATION SLIDES

Memory Management. Kevin Webb Swarthmore College February 27, 2018

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

assembler Machine Code Object Files linker Executable File

Chapter 3:: Names, Scopes, and Bindings (cont.)

Lectures 5. Announcements: Today: Oops in Strings/pointers (example from last time) Functions in MIPS

QUIZ. Write the following for the class Bar: Default constructor Constructor Copy-constructor Overloaded assignment oper. Is a destructor needed?

ECE 485/585 Microprocessor System Design

John Wawrzynek & Nick Weaver

Storage Management 1

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING LECTURE 27, SPRING 2013

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1

Virtual Memory. ICS332 Operating Systems

Programming Language Vulnerabilities within the ISO/IEC Standardization Community

Improving Software Testability

Types, Values, Variables & Assignment. EECS 211 Winter 2018

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

Topics. Computer Organization CS Improving Performance. Opportunity for (Easy) Points. Three Generic Data Hazards

Shared Memory Programming With OpenMP Exercise Instructions

Scalability in Compiler Development How to Get Testing and Optimization Done in a Reasonable Time. Jeremy Bennett

MISRA-C. Subset of the C language for critical systems

Virtual Memory. 11/8/16 (election day) Vote!

Transcription:

Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 22nd September 2017

Contents The problem Technical issues Design goals Language support A policy for sanitization Related and further work

Contents The problem Technical issues Design goals Language support A policy for sanitization Related and further work

The problem Secure coding standards call for sanitization of sensitive data after it has been used. What does this actually mean? How do you do it? How do you know you ve got it right? Oh.. and we had to do this for a client project, to meet GCHQ evaluation standards

The problem Standards survey: GCHQ IA Developers Note 6: Coding Requirements and Guidance CERT Coding Standards ISO SC22/WG23 Technical Report 24772 Common Weakness Enumeration (CWE) Cryptography Coding Standard

The problem GCHQ IA Developers Note 6: Coding Requirements and Guidance Sanitise all variables that contain sensitive data (such as cryptovariables and unencrypted data) sanitisation may require multiple overwrites or verification, or both. if a variable can be shown to be overwritten shortly afterwards, it may be acceptable not to sanitise it, provided it is sanitised when it is no longer needed. Shortly is not defined more precisely, since it will depend on the situation

The problem GCHQ IA Developers Note 6: Coding Requirements and Guidance Sanitise all variables that contain sensitive data (such as cryptovariables and unencrypted data) sanitisation may require multiple overwrites or verification, or both. if a variable can be shown to be overwritten shortly afterwards, it may be acceptable not to sanitise it, provided it is sanitised when it is no longer needed. Shortly is not defined more precisely, What since does it will depend this on the situation actually mean?

Contents The problem Technical issues Design goals Language support A policy for sanitization Related and further work

Technical issues So why not just write zeroes to the variable? declare T : Word32; begin -- Do stuff with T; end; -- Now sanitize T T := 0;

Technical issues So why not just write zeroes to the variable? T is local, therefore final assignment is dead in information-flow terms. Optimizing compilers can remove the final assignment. Oops! Modern compilers try very hard to remove redundant loads and stores. All zeroes might not be a valid value, so can a legal assignment statement be written at all?

Technical issues Derived values and copies If A and B are sensitive, then C := A op B; Is C also sensitive? Does C require sanitization? What about copies of sensitive date in compilergenerated local variables, CPU registers, data cache? How do you sanitize those (without assembly language programming )?

Technical issues By-Copy parameter passing See above copies are bad! How do you sanitize a By-Copy in (aka const ) parameter anyway???

Technical issues CPU data caching and memory hierarchy. Memory subsystem in a modern CPU is really really complicated! Multiple levels of cache. Instruction re-ordering and write coalescing etc. etc. Register renaming (more copies!) Operating system paging and virtual memory? Has a copy of my sensitive data been written to disk?!?!

Contents The problem Technical issues Design goals Language support A policy for sanitization Related and further work

Design goals For a recent project design constraints and goals: Sanitization code in source Ada (using SPARK subset) no assembly language required. Portable no use of non-portable or implementation-defined language features. No chance of changing the compiler in the scope of this project. Bare metal embedded target, so compatibility with GNAT Pro Zero-FootPrint (ZFP) runtime library. Compatible with both SPARK 2005 and SPARK 2014 languages and verification tools. Just works (with confidence) at any compiler optimization level.

Contents The problem Technical issues Design goals Language support A policy for sanitization Related and further work

Ada support Various language mechanisms were investigated, including Volatile aspect Limited and By-Reference types Inspection_Point pragma

Volatile Easy! Just mark a sensitive object as Volatile, and the compiler will respect the reads and writes exactly as indicated in the source code Better still use Volatile Types for all sensitive data. Looks good, but

Volatile Problems with Volatile 1. It is a blunt instrument it preserves all reads and writes of an object, not just the last one, so some performance penalty 2. Compilers don t always get it right anyway

Paper from EMSOFT 2008. Have compilers improved? Not sure Are commercial compilers better than open source (e.g. GNAT Pro vs FSF GCC vs LLVM)? Don t know

Limited types A very useful mechanism in Ada No assignment by default. Good! Passed By-Reference, so no copies. Good!

By-Reference types Another useful mechanism in Ada, and useful where limited types are not appropriate. Some types are defined to be By Reference, so avoids copying of sensitive parameters, for example Tagged ( OO ) types (RM 6.2(5)) Record with Volatile component (RM C.6(18)

Inspection_Point Pragma added in Ada 95, but little used (or understood?) Very useful requirement in RM H 3.2(9): The implementation is not allowed to perform dead store elimination on the last assignment to a variable prior to a point where the variable is inspectable. Thus an inspection point has the effect of an implicit read of each of its inspectable objects. Exactly what we want! But

Inspection_Point How does it work? GCC sources gcc/ada/gcc-interface/trans.c, in function Pragma_to_gnu () tree gnu_expr = gnat_to_gnu (gnat_expr);... ASM_VOLATILE_P (gnu_expr) = 1; So see concerns above over correct compilation of Volatile.

Pattern 1 Combining these ideas yields a pattern for a sanitized abstract data type: package Sensitive is type T is limited private; -- so no assignment procedure Sanitize (X : out T); pragma No_Inline (Sanitize); private type T is limited record -- so by-reference F : -- and so on end record; end Sensitive;

Pattern 1 To allow for alternative implementations (i.e. for different targets/operating systems), the body of Sanitize is supplied as a separate subunit. For a ZFP/Bare-Metal target, we might write: separate (Sensitive) procedure Sanitize (X : out T) is begin X.F := 0; -- or other valid value pragma Inspection_Point (X); end Sanitize;

SPARK In both SPARK 2005 and SPARK 2014, a sanitizing assignment is reported as Ineffective by information-flow analysis. So... expect this, and justify: pragma Warnings (Off, unused assignment, Reason => Sanitization ); T := 0; pragma Inspection_Point (T);

SPARK What about proof? For non-limited types, we could declare a constant for the sanitized value, and use that in postconditions and/or assertions. Note, though, that the value of the constant must be valid, so a value with representation 2#0000_0000_ # might not be OK. For limited types, we could declare a Booleanvalued Is_Sanitized function, thus:

SPARK package Sensitive is type T is limited private; -- so no assignment function Is_Sanitized (X : in T) return Boolean with Ghost; -- no implementation procedure Sanitize (X : out T) with Post => Is_Sanitized (X); pragma No_Inline (Sanitize); private -- As before end Sensitive;

Contents The problem Technical issues Design goals Language support A policy for sanitization Related and further work

Policy Identification and Naming Project must clearly define what is sensitive. Consider global and local variables carefully May also depend on physical characteristics of memory (e.g. Stack might be in secure RAM, but global data isn t )

Policy Identification and Naming Sensitive constants are not permitted. Define a naming convention for sensitive types, variables and formal parameters. Choose convention to facilitate automated search of compiler and tool output.

Policy Types and Patterns Use by-reference types for sensitive data. Use limited types as per Pattern 1 above where possible. Use pragma Warnings to suppress SPARK flow error.

Contents The problem Technical issues Design goals Language support A policy for sanitization Related and further work

Related and Further Work (1) Special compiler switch to automatically sanitize local data? -ferase-stack perhaps? Actually, this has already been done at least twice: Laurent Simon at Cambridge see his paper Erasing Secrets from RAM from RWC 2017 The team at www.embecosm.com in LLVM. Will it work with Ada?

Related and Further Work (2) What about a new language-defined Aspect? Key : Word32 with Sensitive; --??? Then compiler takes care of it?

Related and Further Work (3) What is the impact of Link-Time Optimization (LTO)? No idea Can we use information-flow analysis to track values derived from sensitive data? Like taint analysis in other languages

Questions