Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 14th June 2017

Similar documents
Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 22nd September 2017

Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) 1

GNAT Pro Innovations for High-Integrity Development

Using Ada in non-ada systems

SPARKSkein A Formal and Fast Reference Implementation of Skein

AdaCore Vendor Presentation

Page 1. Last Time. Today. Embedded Compilers. Compiler Requirements. What We Get. What We Want

SPARK 2014 User s Guide

Spark verification features

CSI31 Introduction to Computer Programming I. Dr. Sharon Persinger Fall

Exposing Uninitialized Variables: Strengthening and Extending Run-Time Checks in Ada

A student was asked to point out interface elements in this code: Answer: cout. What is wrong?

GNAT Pro User s Guide

AdaCore technologies

Memory Consistency Models

How much is a mechanized proof worth, certification-wise?

CS 31: Intro to Systems Caching. Martin Gagne Swarthmore College March 23, 2017

Ada and Real-Time. Prof. Lars Asplund. Mälardalen University, Computer Science

Computer Systems A Programmer s Perspective 1 (Beta Draft)

18-642: Code Style for Compilers

Hybrid Verification in SPARK 2014: Combining Formal Methods with Testing

Compilers. Lecture 2 Overview. (original slides by Sam

Formal Verification Techniques for GPU Kernels Lecture 1

Architecture as Interface

Institut Supérieur de l Aéronautique et de l Espace Ocarina: update and future directions

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2015 Lecture 25

I/O Devices. Nima Honarmand (Based on slides by Prof. Andrea Arpaci-Dusseau)

On 17 June 2006, the editor provided the following list via an to the convener:

Performance analysis basics

Compiler Design Spring 2018

Page 1. Stuff. Last Time. Today. Safety-Critical Systems MISRA-C. Terminology. Interrupts Inline assembly Intrinsics

CSE 333 Lecture 1 - Systems programming

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Lecture 13. Notation. The rules. Evaluation Rules So Far

QUIZ. What is wrong with this code that uses default arguments?

Secure Programming Techniques

Current Developments in Fortran Standards. David Muxworthy 15 June 2012

Elaboration The execution of declarations

Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world.

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

QUIZ. Write the following for the class Bar: Default constructor Constructor Copy-constructor Overloaded assignment oper. Is a destructor needed?

Exception Handling. Sometimes when the computer tries to execute a statement something goes wrong:

Formal Verification in Aeronautics: Current Practice and Upcoming Standard. Yannick Moy, AdaCore ACSL Workshop, Fraunhofer FIRST

All code must follow best practices. Part (but not all) of this is adhering to the following guidelines:

Dynamic Control Hazard Avoidance

Exception Handling. Run-time Errors. Methods Failure. Sometimes when the computer tries to execute a statement something goes wrong:

Typed Lambda Calculus. Chapter 9 Benjamin Pierce Types and Programming Languages

COSC 2P91. Bringing it all together... Week 4b. Brock University. Brock University (Week 4b) Bringing it all together... 1 / 22

A program execution is memory safe so long as memory access errors never occur:

CS533 Concepts of Operating Systems. Jonathan Walpole

In examining performance Interested in several things Exact times if computable Bounded times if exact not computable Can be measured

Implementation Guidance for the Adoption of SPARK

CSC313 High Integrity Systems/CSCM13 Critical Systems. CSC313/CSCM13 Chapter 2 1/ 221

Software Engineering 2 A practical course in software engineering. Ekkart Kindler

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

CIS 890: Safety Critical Systems

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Type checking. Jianguo Lu. November 27, slides adapted from Sean Treichler and Alex Aiken s. Jianguo Lu November 27, / 39

GCC and LLVM collaboration

Program Verification (6EC version only)

The SECURE Project and GCC

A3. Programming Languages for Writing Safety-Critical Software

PROCESS VIRTUAL MEMORY. CS124 Operating Systems Winter , Lecture 18

CS510 Advanced Topics in Concurrency. Jonathan Walpole

#include <stdio.h> int main() { printf ("hello class\n"); return 0; }

Register Allocation. Lecture 16

Operational Semantics. One-Slide Summary. Lecture Outline

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY

CS 31: Intro to Systems Virtual Memory. Kevin Webb Swarthmore College November 15, 2018

The name of our class will be Yo. Type that in where it says Class Name. Don t hit the OK button yet.

Kampala August, Agner Fog

Labels and Information Flow

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Notation. The rules. Evaluation Rules So Far.

CSc 520 Principles of Programming Languages

Ch. 11: References & the Copy-Constructor. - continued -

Register Allocation. Global Register Allocation Webs and Graph Coloring Node Splitting and Other Transformations

ANITA S SUPER AWESOME RECITATION SLIDES

Gain Processing Efficiency of C Programming through avoiding Data Types Declaration.

QUIZ. What are 3 differences between C and C++ const variables?

Definition Checklist for Source Statement Counts

GNAT Reference Manual

Hello, World! in C. Johann Myrkraverk Oskarsson October 23, The Quintessential Example Program 1. I Printing Text 2. II The Main Function 3

Introduction to Stephe s Ada Library

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

SECUDRIVE Sanitizer Portable User Guide

explicit class and default definitions revision of SC22/WG21/N1582 =

Job Posting (Aug. 19) ECE 425. ARM7 Block Diagram. ARM Programming. Assembly Language Programming. ARM Architecture 9/7/2017. Microprocessor Systems

assembler Machine Code Object Files linker Executable File

CSc 520 Principles of Programming Languages

Assembly Language: Overview!

John Wawrzynek & Nick Weaver

Chapter 3:: Names, Scopes, and Bindings (cont.)

Memory Management: Virtual Memory and Paging CS 111. Operating Systems Peter Reiher

Object-Oriented Software Engineering Practical Software Development using UML and Java

Using Weakly Ordered C++ Atomics Correctly. Hans-J. Boehm

Programming Language Vulnerabilities within the ISO/IEC Standardization Community

Combining program verification with component-based architectures. Alexander Senier BOB 2018 Berlin, February 23rd, 2018

Assertions and Exceptions Lecture 11 Fall 2005

GNATprove a Spark2014 verifying compiler Florian Schanda, Altran UK

ECE 485/585 Microprocessor System Design

Transcription:

Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 14th June 2017

Contents The problem Technical issues Design goals Ada language support A policy for sanitization Related and further work

Contents The problem Technical issues Design goals Ada language support A policy for sanitization Related and further work

The problem Secure coding standards call for sanitization of sensitive data after it has been used. What does this actually mean? How do you do it? How do you know you ve got it right? Oh.. and we had to do this for a client project, to meet GCHQ evaluation standards

The problem Standards survey: GCHQ IA Developers Note 6: Coding Requirements and Guidance CERT Coding Standards ISO SC22/WG23 Technical Report 24772 Common Weakness Enumeration (CWE) Cryptography Coding Standard

The problem GCHQ IA Developers Note 6: Coding Requirements and Guidance Sanitise all variables that contain sensitive data (such as cryptovariables and unencrypted data) sanitisation may require multiple overwrites or verification, or both. if a variable can be shown to be overwritten shortly afterwards, it may be acceptable not to sanitise it, provided it is sanitised when it is no longer needed. Shortly is not defined more precisely, since it will depend on the situation

The problem GCHQ IA Developers Note 6: Coding Requirements and Guidance Sanitise all variables that contain sensitive data (such as cryptovariables and unencrypted data) sanitisation may require multiple overwrites or verification, or both. if a variable can be shown to be overwritten shortly afterwards, it may be acceptable not to sanitise it, provided it is sanitised when it is no longer needed. Shortly is not defined more precisely, What since does it will depend this on the situation actually mean?

Contents The problem Technical issues Design goals Ada language support A policy for sanitization Related and further work

Technical issues So why not just write zeroes to the variable? declare T : Word32; begin -- Do stuff with T; end; -- Now sanitize T T := 0;

Technical issues So why not just write zeroes to the variable? T is local, therefore final assignment is dead in information-flow terms. Optimizing compilers can remove the final assignment. Oops! Modern compilers try very hard to remove redundant loads and stores. All zeroes might not be a valid value, so can a legal assignment statement be written at all?

Technical issues Derived values and copies If A and B are sensitive, then C := A op B; Is C also sensitive? Does C require sanitization? What about copies of sensitive date in compilergenerated local variables, CPU registers, data cache? How do you sanitize those (without assembly language programming )?

Technical issues By-Copy parameter passing See above copies are bad! How do you sanitize a By-Copy in parameter anyway???

Technical issues CPU data caching and memory hierarchy. Memory subsystem in a modern CPU is really really complicated! Multiple levels of cache. Instruction re-ordering and write coalescing etc. etc. Register renaming (more copies!) Operating system paging and virtual memory? Has a copy of my sensitive data been written to disk?!?!

Contents The problem Technical issues Design goals Ada language support A policy for sanitization Related and further work

Design goals For a recent project design constraints and goals: Sanitization code in source Ada and/or SPARK no assembly language required. Portable no use of non-portable or implementation-defined language features. Bare metal embedded target, so compatibility with GNAT Pro Zero-FootPrint (ZFP) runtime library. Compatible with both SPARK 2005 and SPARK 2014 languages and verification tools. Just works (with confidence) at any compiler optimization level.

Contents The problem Technical issues Design goals Ada language support A policy for sanitization Related and further work

Ada support Various language mechanisms were investigated, including Controlled Types Volatile aspect Limited and By-Reference types Inspection_Point pragma No_Inline aspect

Controlled Types Define a Finalize procedure that does the sanitization? Tempting, but Significant support from the runtime library required, so no chance of this working with ZFP. Not allowed by SPARK anyway. Complex (i.e. not very well understood) semantics and implementation issues. Therefore rejected!

Volatile Easy! Just mark a sensitive object as Volatile, and the compiler will respect the reads and writes exactly as indicated in the source code Better still use Volatile Types for all sensitive data. Looks good, but

Volatile Problems with Volatile 1. It is a blunt instrument it preserves all reads and writes of an object, not just the last one, so some performance penalty 2. Compilers don t always get it right anyway

Paper from EMSOFT 2008. Have compilers improved? Not sure Are commercial compilers better than open source (e.g. GNAT Pro vs FSF GCC vs LLVM)? Don t know

Limited types A very useful mechanism in Ada No assignment by default. Good! Passed By-Reference, so no copies. Good!

By-Reference types Another useful mechanism in Ada, and useful where limited types are not appropriate. Some types are defined to be By Reference, so avoids copying of sensitive parameters, for example Tagged types (RM 6.2(5)) Record with Volatile component (RM C.6(18)) GNAT gnatrm flag can be used to verify passing mechanism chosen for each parameters of each subprogram.

Inspection_Point Added to Annex H in Ada 95, but little used (or understood?) Very useful requirement in RM H 3.2(9): The implementation is not allowed to perform dead store elimination on the last assignment to a variable prior to a point where the variable is inspectable. Thus an inspection point has the effect of an implicit read of each of its inspectable objects. Exactly what we want! But

Inspection_Point How does it work? GNAT sources ada/gcc-interface/trans.c, in function Pragma_to_gnu () tree gnu_expr = gnat_to_gnu (gnat_expr);... ASM_VOLATILE_P (gnu_expr) = 1; So see concerns above over correct compilation of Volatile.

No_Inline Regehr recommends using a subprogram to perform the santization, and making sure that subprogram can never be inlined. This is intended to prevent inlining and subsequent optimization of the sanitizing assignment.

Pattern 1 Combining these ideas yields a pattern for a sanitized abstract data type: package Sensitive is type T is limited private; -- so no assignment procedure Sanitize (X : out T); pragma No_Inline (Sanitize); private type T is limited record -- so by-reference F : -- and so on end record; end Sensitive;

Pattern 1 To allow for alternative implementations (i.e. for different targets/operating systems), the body of Sanitize is supplied as a separate subunit. For a ZFP/Bare-Metal target, we might write: separate (Sensitive) procedure Sanitize (X : out T) is begin X.F := 0; -- or other valid value pragma Inspection_Point (X); end Sanitize;

SPARK In both SPARK 2005 and SPARK 2014, a sanitizing assignment is reported as Ineffective by information-flow analysis. So... expect this, and justify: pragma Warnings (Off, unused assignment, Reason => Sanitization ); T := 0; pragma Inspection_Point (T);

SPARK What about proof? For non-limited types, we could declare a constant for the sanitized value, and use that in postconditions and/or assertions. Note, though, that the value of the constant must be valid, so a value with representation 2#0000_0000_ # might not be OK. For limited types, we could declare a Booleanvalued function, thus:

SPARK package Sensitive is type T is limited private; -- so no assignment function Is_Sanitized (X : in T) return Boolean; procedure Sanitize (X : out T) with Post => Is_Sanitized (X); pragma No_Inline (Sanitize); private -- As before end Sensitive;

Contents The problem Technical issues Design goals Ada language support A policy for sanitization Related and further work

Policy Identification and Naming Project must clearly define what is sensitive. Consider global and local variables carefully May also depend on physical characteristics of memory (e.g. Stack might be in secure RAM, but library-level data isn t )

Policy Identification and Naming Sensitive constants are not permitted. Define a naming convention for sensitive types, variables and formal parameters. Choose convention to facilitate automated search of compiler and tool output.

Policy Types and Patterns Use by-reference types for sensitive data. Use limited types as per Pattern 1 above where possible. Use pragma Warnings to suppress SPARK flow error.

Policy Compiler switches Use gnatwa to get useless assignment warning enabled. Use gnatrm to verify passing mechanism. Use g and fverbose-asm to check generated code if necessary.

Contents The problem Technical issues Design goals Ada language support A policy for sanitization Related and further work

Related and Further Work (1) Special compiler switch to automatically sanitize local data? -ferase-stack perhaps? Actually, this has already been done by the team at www.embecosm.com in GCC and LLVM. Will it work with Ada?

Related and Further Work (2) What about a new language-defined Aspect? Key : Word32 with Sensitive; --??? Then compiler takes care of it?

Related and Further Work (3) A binding to C11 stdatomic library would be good Provides portable access to memory fence instructions and so on

Related and Further Work (4) What is the impact of Link-Time Optimization (LTO)? No idea Can we use information-flow analysis to track values derived from sensitive data? Like taint analysis in other languages

Homework

Questions