Quantifying Information Leaks in Software

Similar documents
BITCOIN MINING IN A SAT FRAMEWORK

Seminar in Software Engineering Presented by Dima Pavlov, November 2010

Introduction to CBMC. Software Engineering Institute Carnegie Mellon University Pittsburgh, PA Arie Gurfinkel December 5, 2011

Bounded Model Checking Of C Programs: CBMC Tool Overview

Application of Propositional Logic II - How to Test/Verify my C program? Moonzoo Kim

Introduction to CBMC: Part 1

CS 267: Automated Verification. Lecture 13: Bounded Model Checking. Instructor: Tevfik Bultan

Program Security and Vulnerabilities Class 2

Software Model Checking. Xiangyu Zhang

CSC2108: Automated Verification Assignment 3. Due: November 14, classtime.

Applications of Logic in Software Engineering. CS402, Spring 2016 Shin Yoo

Constant-time programming in C

MEMORY SAFETY ATTACKS & DEFENSES

SMT-Based Bounded Model Checking for Embedded ANSI-C Software. Lucas Cordeiro, Bernd Fischer, Joao Marques-Silva

Copyright 2008 CS655 System Modeling and Analysis. Korea Advanced Institute of Science and Technology

Formal Verification of Embedded Software in Medical Devices Considering Stringent Hardware Constraints

Automated Test Generation using CBMC

Model Checking Embedded C Software using k-induction and Invariants

C Language, Token, Keywords, Constant, variable

C Bounded Model Checker

On Reasoning about Finite Sets in Software Checking

Verifying Temporal Properties via Dynamic Program Execution. Zhenhua Duan Xidian University, China

CS 161 Computer Security

CSE 509: Computer Security

Fundamentals of Computer Security

CS 161 Computer Security

UNDERSTANDING PROGRAMMING BUGS IN ANSI-C SOFTWARE USING BOUNDED MODEL CHECKING COUNTER-EXAMPLES

Handling Loops in Bounded Model Checking of C Programs via k-induction

Homework 3 CS161 Computer Security, Fall 2008 Assigned 10/07/08 Due 10/13/08

Software Security: Buffer Overflow Attacks

Linux Device Drivers Interrupt Requests

CS 361S - Network Security and Privacy Spring Homework #2

: A Bounded Model Checking Tool to Verify Qt Applications

Overflows, Injection, & Memory Safety

Brave New 64-Bit World. An MWR InfoSecurity Whitepaper. 2 nd June Page 1 of 12 MWR InfoSecurity Brave New 64-Bit World

Short Notes of CS201

CSE 333 Section 8 - Client-Side Networking

CSE 565 Computer Security Fall 2018

CS201 - Introduction to Programming Glossary By

Model Checking and Its Applications

NETWORK PROGRAMMING. Instructor: Junaid Tariq, Lecturer, Department of Computer Science

Software verification for ubiquitous computing

SpiNNaker Application Programming Interface (API)

This time. Defenses and other memory safety vulnerabilities. Everything you ve always wanted to know about gdb but were too afraid to ask

Undefinedness and Non-determinism in C

Verifying C & C++ with ESBMC

ESBMC 1.22 (Competition Contribution) Jeremy Morse, Mikhail Ramalho, Lucas Cordeiro, Denis Nicole, Bernd Fischer

Lecture 2: Symbolic Model Checking With SAT

Verification & Validation of Open Source

The Low-Level Bounded Model Checker LLBMC

Bug Finding with Under-approximating Static Analyses. Daniel Kroening, Matt Lewis, Georg Weissenbacher

CSE 333 SECTION 3. POSIX I/O Functions

Topics in Software Security Vulnerability

CS 161 Computer Security

Making C Less Dangerous

CS504 4 th Quiz solved by MCS GROUP

Crit-bit Trees. Adam Langley (Version )

Information Flow Analysis and Type Systems for Secure C Language (VITC Project) Jun FURUSE. The University of Tokyo

Introduction to Computer Systems , fall th Lecture, Sep. 28 th

CS 33. Intro to Storage Allocation. CS33 Intro to Computer Systems XXVI 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.

CptS 360 (System Programming) Unit 2: Introduction to UNIX and Linux

CMPS 105 Systems Programming. Prof. Darrell Long E2.371

Software Engineering using Formal Methods

Frama-C Value Analysis

File Descriptors and Piping

Execution path timing analysis of UNIX daemons

Structures, Unions Alignment, Padding, Bit Fields Access, Initialization Compound Literals Opaque Structures Summary. Structures

Automatic Software Verification

Integers. Today. Next time. ! Numeric Encodings! Programming Implications! Basic operations. ! Floats

CS240: Programming in C

Distributed Systems. How do regular procedure calls work in programming languages? Problems with sockets RPC. Regular procedure calls

SAT-based Model Checking for C programs

CSci 4061 Introduction to Operating Systems. Input/Output: High-level

Crit-bit Trees. Adam Langley (Version )

VS 3 : SMT Solvers for Program Verification

A brief introduction to C programming for Java programmers

Structures and Unions in C

Lecture 10. Pointless Tainting? Evaluating the Practicality of Pointer Tainting. Asia Slowinska, Herbert Bos. Advanced Operating Systems

Systems Programming. 09. Filesystem in USErspace (FUSE) Alexander Holupirek

ECS 153 Discussion Section. April 6, 2015

Programming refresher and intro to C programming

No model may be available. Software Abstractions. Recap on Model Checking. Model Checking for SW Verif. More on the big picture. Abst -> MC -> Refine

Version:1.1. Overview of speculation-based cache timing side-channels

CSC209H Lecture 7. Dan Zingaro. February 25, 2015

SMT-Based Bounded Model Checking for Embedded ANSI-C Software

MEMORY MANAGEMENT TEST-CASE GENERATION OF C PROGRAMS USING BOUNDED MODEL CHECKING

The Spin Model Checker : Part I/II

On Limitations of Designing LRPS: Attacks, Principles and Usability

Computer Systems CEN591(502) Fall 2011

CSC 252: Computer Organization Spring 2018: Lecture 9

CS 261 Fall Mike Lam, Professor. Structs and I/O

CSC209H Lecture 9. Dan Zingaro. March 11, 2015

Formal Methods for Software Development

CS 2505 Computer Organization I Test 2. Do not start the test until instructed to do so! printed

typedef void (*type_fp)(void); int a(char *s) { type_fp hf = (type_fp)(&happy_function); char buf[16]; strncpy(buf, s, 18); (*hf)(); return 0; }

Baggy bounds with LLVM

Dissecting a 17-year-old kernel bug

Secure Software Development: Theory and Practice

Outline. OS Interface to Devices. System Input/Output. CSCI 4061 Introduction to Operating Systems. System I/O and Files. Instructor: Abhishek Chandra

Transcription:

Quantifying Information Leaks in Software Jonathan Heusser, Pasquale Malacaria Queen Mary University of London 11. 10. 2016

Introduction High complexity associated with quantifying precise leakage quantities Technique to decide if a program conforms to a quantitative policy Applied to a number of officially reported information leak vulnerabilities in Linux Kernel and authentication routines in SRP and IMSP When is there an unacceptable leakage? and Does the applied software patch solve it? First demonstration of QIF addressing real world industrial programs

Tools able to quantify leakage of confidential information Example: if(password==guess) access=1 else access=0 Unavoidable leakage - Attacker observing value of access If leakage is unavoidable, the real question is not whether or not the programs leak, but How much? For example, how much information about password can be obtained by the attacker who can read/write guess If the amount leaked is very small, the program might as well be considered secure

A precise QIF analysis for secret > few bits is computationally infeasible Involves computation of entropy of a random variable whose complexity is the same as computing all possible runs of the program Even when abstraction techniques and statistical sampling are used, useful analysis of real code through this method is problematic Hence to address computational feasibility, shift the focus from How much does it leak? to Does it leak more than k? Off-the-shelf symbolic model checkers like CBMC are able to efficiently answer the second question

CBMC Makes it easy to parse and analyse large ANSI C based projects It models bit vector semantics of C accurately - detect arithmetic overflows Nondeterministic choice functions to model user input - efficient solving due to the symbolic nature Though bounded, can check whether enough unwinding of the transition system was done - no deeper counterexamples

Quantification + nature of leak Counter examples causes of leak For example, we can extract a public user input from the counter example that triggers a violation Prove whether official patch eliminated the information leak Four main technical contributions

Model of Programs and Distinctions C function where inputs are formal arguments and outputs are either return values or pointer arguments P is the function taking inputs h, l Consider o=(h%4)+l h 4 bits, l 1 bit, observable o takes values from 0..4 and l is the low input P is modelled as transition system TS=(S,T,I,F)

Successor function for s S : Post(s) = {s S (s, s ) T } (1) A state s is in F if Post(s) = A path is a finite sequence of states π = s 0 s 1 s 2...s n where s 0 I and s n F A state is a tuple S = S H S L Input/output pairs of states of a path denoted as (h, l), o where o is produced by final state drawn from some output alphabet O

A distinction on the confidential input through observations O exists when at least two paths through P, that leads to different o for different h but constant l An equivalence relation P,l on the values of the high variables is defined as follows: h P,l h iff : if (h, l), o, (h, l), o are input/output pairs in P, then o = o That is, two high values are equivalent if they cannot be distinguished by any observable For the modulo program example, and equivalence class in P,l would be {1, 5, 9, 13}

Let I(X) be the set of all possible equivalence relations on a set X Define on I(X) the order: s 1, s 2 (s 1 s 2 s 1 s 2 ) (2), I(X) s 1, s 2 X defines a complete lattice over X (Lattice of Information)

Characterization of Non-Leaking Programs PROPOSITION 1: P is non-interfering iff for all l, P,l is the least element in I(S H ) PROPOSITION 2: p p iff for all probability distributions H(R P ) H(R P ) PROPOSITION 3: 1. P is non-interfering iff log 2 ( P ) = 0 2. The channel capacity of P is log 2 ( P ) 3. If for all probability distributions H(R P ) H(R P ) then P P

Encoding Distinction-Based Policies A program violates a policy if it makes more distinctions than what is allowed by the policy Use assume-guarantee reasoning to encode such a policy in driver function Triggers violation producing a counterexample of the policy int h1,h2,h3; int o1,o2,o3; h1=input(); h2=input(); h3=input(); o1=func(h1); o2=func(h2); assume(o1!=o2); //(A) o3=func(h3); assert(o3 == o1 o3 == o2); //(B)

Bounded Model Checking ANSI-C program into propositional formula Tool can check if unwinding bound is sufficient and ensure that no longer counterexample exists C P where C is constraint and P is accumulation of assumptions If E 1 and E 2 are two assume statements and Q is expression of assert statement, then P is P E 1 E 2 = Q

Driver Template to syntactically generate a driver for N distinction policy has been given If the driver template is successfully verified upto bound k, then func does not make more than N distinctions on the output within k It implies the validity of the following implication: o 1 o 2 o 1 o 3... o n 1 o n = o n+1 = o 1... o n+1 = o n Three claims on the result of model checking process

Checking Quantitative Policies: 4 steps Modelling Low Input typedef long long loff_t; typedef unsigned int size_t; int underflow(int h, loff_t ppos) { int bufsz; size_t nbytes; bufsz=1024; nbytes=20; if(ppos + nbytes > bufsz) //(A) nbytes = bufsz - ppos; //(B) if(ppos + nbytes > bufsz) { return h; //(C) } else{ return 0; } }

Environment Library functions or data structures that have no implementation, need to be modelled in a way for the property to be verified CBMC replaces function calls with no implementations with non-deterministic values Example: strcmp and memcmp returning 0 or non-zero int memcmp(char *s1, char *s2, unsigned int n){ int i; for(i=0;i<n;i++){ if(s1[i]!= s2[i]) return -1; } return 0; }

Experimental Results

Linux Kernel Parts of kernel memory gets mistakenly copied to user space Kernel memory modelled as non-deterministic values Syscalls - arguments and return value (Data structure and single values) AppleTalk struct sockaddr_at { u_char sat_len, sat_family, sat_port; struct at_addr sat_addr; union{ struct netrange r_netrange; char r_zero[8]; }sat_range; }; #define sat_zero sat_range.r_zero

int atalk_getname(struct socket *sock, struct sockaddr *uaddr, int *uaddr_len, int peer) { struct sockaddr_at sat; //Official Patch. Comment out to trigger leak //memset(&sat.sat_zero, 0, sizeof(sat.sat_zero));.. //sat structure gets filled memcpy(uaddr, &sat,sizeof(sat)); return 0; }

tcf_fill_node: struct tcmsg *tcm;... nlh=nlmsg_new(skb, pid, seq, event, sizeof(*tcm), flags); tcm=nlmsg_data(nlh); tcm->tcm_familu = AF_UNSPEC; tcm->tcm pad1 = 0; tcm->tcm pad1 = 0; // typo, should be tcm pad2 instead.

sigaltstack. Structure with padding: typedef struct sigaltstack{ void user *ss_sp; int ss_flags; //4 bytes padding on 64-bit size_t ss_size; } stack_t;

Copying whole structures: int do_sigaltstack (const stack_t user *uss, stack_t user *uoss, unsigned long sp){ stack_t oss;... // oss fields get filled if (copy_to_user(uoss, &oss, sizeof(oss))) goto out;... Calculation: pad = ALIGN - (sizeof(oss) % ALIGN); if(pad==align) padding=0; else padding = ((unsigned int) nondet_int())% (1 << (pad*8))

cpuset. if (*ppos + nbytes > ctr->bufsz) nbytes = ctr->bufsz - *ppos; if (copy_to_user(buf, ctr->buf + *ppos, nbytes)) return -EFAULT; Way out of actual buffer and thus disclose kernel memory Requires too much manual intervention Modify CBMC to return non-deterministic values for out-of-bound memory accesses

Authentication Checks SRP _TYPE( int ) t_getpass (char* buf, unsigned maxlen, const char* prompt) { DWORD mode; GetConsoleMode( handle, &mode ); SetConsoleMode( handle, mode & ~ENABLE_ECHO_INPUT ); if(fputs(prompt, stdout) == EOF fgets(buf, maxlen, stdin) == NULL) { SetConsoleMode(handle,mode); return -1; }...

IMPSD int login_plaintext(char* user, char* pass, char* reply struct passwd* pwd = getpwname(user); if (!pwd) return 1; if (strcmp(pwd->pw_passwd, crypt(pass, pwd->pw_passwd))!=0){ *reply = "wrong password" return 1; } return 0;

Description CVEBulletin LOC k Proof log 2 (N) Time appletalk 2009-3002 237 64 >6bit 1.39h tcffillnode 2009-3612 146 64 >6bit 3.34m sigaltstack 2009-3612 199 128 >7bit 49.5m cpuset 2007-2875 63 64 x >6bit 1.32m SRP - 93 8 1bit 0.128s login_unix - 128 8-2 bit 8.364s

Conclusion Combined model checking with theoretical work on Quantitative Information Flow Proof for whether official patches fix the problem Leaks are not synonymous with security breach Quantitative is better equipped than qualitative