Don t let data Go astray

Similar documents
The Go Programming Language. Frank Roberts

Let s Go! Akim D le, Etienne Renault, Roland Levillain. June 8, TYLA Let s Go! June 8, / 58

New Parallel Programming Languages for Optimization Research

Introduzione a Go e RPC in Go

Go for Java Developers

05-concurrency.txt Tue Sep 11 10:05: , Fall 2012, Class 05, Sept. 11, 2012 Randal E. Bryant

CS558 Programming Languages

Go Cheat Sheet. Operators. Go in a Nutshell. Declarations. Basic Syntax. Hello World. Functions. Comparison. Arithmetic. Credits

A Deterministic Concurrent Language for Embedded Systems

The Awesomeness of Go. Igor Lankin DevFest Karlsruhe, Nov 2016

What Came First? The Ordering of Events in

Static Deadlock Detection for Go by Global Session Graph Synthesis. Nicholas Ng & Nobuko Yoshida Department of Computing Imperial College London

Go Tutorial. Arjun Roy CSE 223B, Spring 2017

Go Forth and Code. Jonathan Gertig. CSC 415: Programing Languages. Dr. Lyle

Static Program Analysis Part 1 the TIP language

Go Tutorial. To do. A brief, gentle intro to Go. Next Networking. q Today

go get my/vulnerabilities Green threads are not eco friendly threads

A Deterministic Concurrent Language for Embedded Systems

Compiler Structure. Data Flow Analysis. Control-Flow Graph. Available Expressions. Data Flow Facts

Concurrency in Go 9/22/17

Erlang and Go (CS262a, Berkeley Fall 2016) Philipp Moritz

Field Analysis. Last time Exploit encapsulation to improve memory system performance


developed ~2007 by Robert Griesemer, Rob Pike, Ken Thompson open source

Ripple: Reflection Analysis for Android Apps in Incomplete Information Environments

William Kennedy. Brian Ketelsen Erik St. Martin Steve Francia FOREWORD BY MANNING WITH

CS558 Programming Languages

Advanced Compiler Design. CSE 231 Instructor: Sorin Lerner

Wanted: Students to participate in a user study

CYSE 411/AIT681 Secure Software Engineering Topic #16. Static Analysis

4/9/18. CYSE 411/AIT681 Secure Software Engineering. Static Analysis for Secure Development. Current Practice for Software Assurance

A Deterministic Concurrent Language for Embedded Systems

Simply-Typed Lambda Calculus

CS 5523 Operating Systems: Midterm II - reivew Instructor: Dr. Tongping Liu Department Computer Science The University of Texas at San Antonio

C#: framework overview and in-the-small features

Detection of Bugs and Code Smells through Static Analysis of Go Source Code

Recap: Functions as first-class values

Optimizing for Bugs Fixed

A Gentle Introduction to Program Analysis

Runtime Checking for Program Verification Systems

Simple Overflow. #include <stdio.h> int main(void){ unsigned int num = 0xffffffff;

Secure Programming Lecture 15: Information Leakage

Advanced Programming Methods. Introduction in program analysis

Static Program Analysis Part 9 pointer analysis. Anders Møller & Michael I. Schwartzbach Computer Science, Aarhus University

COMP-520 GoLite Tutorial

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

CS3733: Operating Systems

GO SHORT INTERVIEW QUESTIONS EXPLAINED IN COLOR

CSE 403: Software Engineering, Fall courses.cs.washington.edu/courses/cse403/16au/ Static Analysis. Emina Torlak

Static Program Analysis Part 7 interprocedural analysis

Static Analysis. Current Practice (cont d) 4/9/18. If You re Worried about Security. CYSE 411/AIT681 Secure Software Engineering. What more can we do?

Advances in Programming Languages

Code Reviews. James Walden Northern Kentucky University

CS558 Programming Languages

CS 4120 Lecture 31 Interprocedural analysis, fixed-point algorithms 9 November 2011 Lecturer: Andrew Myers

CSCI-GA Scripting Languages

Inheritance STL. Entity Component Systems. Scene Graphs. Event Systems

Static Vulnerability Analysis

9/21/17. Outline. Expression Evaluation and Control Flow. Arithmetic Expressions. Operators. Operators. Notation & Placement

PHP Personal Home Page PHP: Hypertext Preprocessor (Lecture 35-37)

Scientific GPU computing with Go A novel approach to highly reliable CUDA HPC 1 February 2014

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

C++ Concurrency in Action

High Performance Computing Course Notes Shared Memory Parallel Programming

Securing Software Applications Using Dynamic Dataflow Analysis. OWASP June 16, The OWASP Foundation

Functional Programming Patterns And Their Role Instructions

CS558 Programming Languages

GO CONCURRENCY BASICS AND PATTERNS EXPLAINED IN COLOR

Multiprocessors 2007/2008

Review: Easy Piece 1

6.858 Quiz 2 Review. Android Security. Haogang Chen Nov 24, 2014

func findbestfold(seq string, energyfunction func(fold)float64)) Fold {

Symbolic Computation and Common Lisp

Finding Vulnerabilities in Web Applications

Distributed Systems. 02r. Go Programming. Paul Krzyzanowski. TA: Yuanzhen Gu. Rutgers University. Fall 2015

DEBUGGING: DYNAMIC PROGRAM ANALYSIS

GO IDIOMATIC CONVENTIONS EXPLAINED IN COLOR

C and C++ Secure Coding 4-day course. Syllabus

Using the Go Programming Language in Practice

Cut. it s not just for Google. Eleanor McHugh.

Marwan Burelle. Parallel and Concurrent Programming

Marwan Burelle. Parallel and Concurrent Programming

Homework I - Solution

QUIZ on Ch.5. Why is it sometimes not a good idea to place the private part of the interface in a header file?

Jython. An introduction by Thinh Le

Using Positive Tainting and Syntax-Aware Evaluation to Counter SQL Injection Attacks

MOC 6232A: Implementing a Microsoft SQL Server 2008 Database

A Fast Review of C Essentials Part I

Binghamton University. CS-211 Fall Syntax. What the Compiler needs to understand your program

ECE 650 Systems Programming & Engineering. Spring 2018

COMP520 - GoLite Type Checking Specification

Chapter 6: Process Synchronization

Static Conflict Analysis for Multi-Threaded Object Oriented Programs

Pointer Analysis in the Presence of Dynamic Class Loading. Hind Presented by Brian Russell

Short Notes of CS201

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

Advanced Compiler Design. CSE 231 Instructor: Sorin Lerner

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

CSE 451 Midterm 1. Name:

CS 376b Computer Vision

Transcription:

Don t let data Go astray A Context-Sensitive Taint Analysis for Concurrent Programs in Go Volker Stolz Bergen University College, Norway & University of Oslo, Norway 28 th Nordic Workshop on Programming Theory (NWPT 16) 1 st November 2016 Supported by the bilateral Norwegian/German project GoRETech Go Runtime Enforcement Techniques & EU COST Action IC1402 ARVI Applied Runtime Verification

Don t let data Go astray Violet Ka I Pun Martin Steffen Volker Stolz Anna-Katharina Wickert Eric Bodden Michael Eichberg

Motivation Taint analysis: data flow analysis Secure information flow Prevent untrusted/sensitive data from reaching sensitive locations Examples (the usual suspects): SQL injection (user input flows unfiltered into SQL query) leaks (clear-text password ending up in log/debugging output) Volker Stolz Don t let data Go astray NWPT 16 1 / 20

The Go language I Backed by Google I imperative (C-programmers should be able to read it) I object-oriented (maybe... ) I concurrent (goroutines) I structurally typed I garbage collected; dynamic race checker I higher-order functions and closures Volker Stolz Don t let data Go astray NWPT 16 2 / 20

What are methods? procedures functions methods methods are specific functions Example: add1 / add2 not that much different from each other type Number struct { n int } func add1 (x Number, y Number) func (x Number) add2 (y Number) int { return x.n+y.n } method function with special first argument f (o, v) vs. o.f (v) elsewhere often: special keyword for first argument: this (or self) Volker Stolz Don t let data Go astray NWPT 16 3 / 20

Higher-order functions known from functional languages languages with higher-order functions functions as first-class data func add (x int) (func (int) int) { return func ( y int ) ( int ) { return y + x } } add : int (int int) = λx :int.λy :int.x + y Volker Stolz Don t let data Go astray NWPT 16 4 / 20

Deferred functions Each function/method can be called: 1 conventionally 2 deferred 3 asynchronously (see later) Also in Apple s Swift language func main () { defer fmt. Println ( 1 ) fmt. Println ( 2 ) } Deferred call (guaranteed to be) executed when the surrounding function body returns Eval d for side-effect only, returned value irrelevant Deferred calls can be nested, too Volker Stolz Don t let data Go astray NWPT 16 5 / 20

Concurrency in Go Go s concurrency mantra Don t communicate by sharing memory, share memory by communicating! Go concurrency goroutines + channels claimed to be easy first-class, typed channels Volker Stolz Don t let data Go astray NWPT 16 6 / 20

Channels named pipes FIFO, bounded, non-lossy communication crucial data type with synchronization power taking a back-seat: locks mutexes monitors semaphores... channels: first-class data channels can send (references to) channels inspired by CSP (and CCS, and, actually π) directed channels Volker Stolz Don t let data Go astray NWPT 16 7 / 20

Channels Simple example: synchronized channel for strings package main import fmt func main () { messages := make ( chan string, 0 ) } go func () { messages ping } () msg := messages fmt. Println (msg) Volker Stolz Don t let data Go astray NWPT 16 8 / 20

Back to Taint Analysis... Identifies flows of private information to untrusted places Terminology: Source (produce tainted data) Sink (consumer of data) Flow (from source to sink) Both given as user-defined sets of methods (Q: How to untaint data? Sanitizers, future work) Volker Stolz Don t let data Go astray NWPT 16 9 / 20

Abstract syntax Abstract syntax captures essence of Go Gory details in SSA representation in implementation s ::= x := e x.f := e if v then s else s defer((λx.s) v) go s x y x y return v s ; s e ::= v v v makechan v ::= x x.f () true false λx.s Volker Stolz Don t let data Go astray NWPT 16 10 / 20

Lattice for Taint Analysis Simple lattice: uninitialized / untainted / tainted / both 1 0 1 0 1 1 1 0 0 0 1 0 Volker Stolz Don t let data Go astray NWPT 16 11 / 20

Standing on the Shoulders... Existing Go compiler infrastructure: SSA representation https://godoc.org/golang.org/x/tools/go/ssa Basic blocks / call-graph https://godoc.org/golang.org/x/tools/go/callgraph Points-to analysis à la Andersen https://godoc.org/golang.org/x/tools/go/pointer Interprocedural, context-sensitive analysis: Padhye / Khedker, SOAP@PLDI 2013 standard worklist algorithm w/ calling context & value Volker Stolz Don t let data Go astray NWPT 16 12 / 20

Standing on the Shoulders... Existing Go compiler infrastructure: SSA representation https://godoc.org/golang.org/x/tools/go/ssa Basic blocks / call-graph https://godoc.org/golang.org/x/tools/go/callgraph Points-to analysis à la Andersen https://godoc.org/golang.org/x/tools/go/pointer Interprocedural, context-sensitive analysis: Padhye / Khedker, SOAP@PLDI 2013 standard worklist algorithm w/ calling context & value... or their toes? Volker Stolz Don t let data Go astray NWPT 16 12 / 20

Our analysis example main() h(f *os.file) (c string, r int) n 1 c 1 n 3 a := Hello World b := g(a) fmt.print(b) // sink f, := os.openfile(./pw.txt ) n 4 n 5 b := make([]byte, 8) r, = f.read(b) c = string(b[:]) return c 2 s, n := h(f) n 6 fmt.print(s) for n > 0 // sink c 3 s, n = h(f) c 4 t := g(s) n 8 fmt.print(t) // sink g(a string) (b string) n 9 exit n 2 n 7 b = a return Volker Stolz Don t let data Go astray NWPT 16 13 / 20

Our analysis example main() h(f *os.file) (c string, r int) n 1 c 1 n 3 a := Hello World b := g(a) fmt.print(b) // sink f, := os.openfile(./pw.txt ) n 4 n 5 b := make([]byte, 8) r, = f.read(b) c = string(b[:]) return c 2 s, n := h(f) n 6 c 3 fmt.print(s) for n > 0 s, n = h(f) // sink Untainted c 4 t := g(s) n 8 fmt.print(t) // sink g(a string) (b string) n 9 exit n 2 n 7 b = a return Volker Stolz Don t let data Go astray NWPT 16 13 / 20

Our analysis example main() h(f *os.file) (c string, r int) n 1 c 1 n 3 a := Hello World b := g(a) fmt.print(b) // sink f, := os.openfile(./pw.txt ) n 4 n 5 b := make([]byte, 8) r, = f.read(b) c = string(b[:]) return c 2 n 6 c 3 s, n := h(f) fmt.print(s) for n > 0 s, n = h(f) // sink Tainted Untainted c 4 t := g(s) n 8 fmt.print(t) // sink g(a string) (b string) n 9 exit n 2 n 7 b = a return Volker Stolz Don t let data Go astray NWPT 16 13 / 20

Channel Handling Q: Can we handle flow through channels in the same framework? n 1 x := Hello World n 2 ch := make(chan, string) c 1 go f(ch) fn 1 func f(ch chan string) n 3 sink(x) fn 2 y := ch n 4 x = tainted fn 3 sink(y) n 5 ch x n 6... Volker Stolz Don t let data Go astray NWPT 16 14 / 20

Channel Handling Q: Can we handle flow through channels in the same framework? n 1 x := Hello World n 2 ch := make(chan, string) c 1 go f(ch) fn 1 func f(ch chan string) TA(n 5 ) ch = [ch S(x)] n 3 sink(x) fn 2 y := ch n 4 x = tainted fn 3 sink(y) n 5 ch x n 6... A: Add feedback from writes to reads Overapproximation; only feed back taint value of channel along this edge Volker Stolz Don t let data Go astray NWPT 16 14 / 20

Our analysis Nielson-style data-flow analysis (monotone framework) Analysis info of a node = transfer function of node applied to (union over all incoming flows) TA(l) = Φ(S, N l ) where S = {TA(l ) (l, l) flow (P)} and N l nodes(p). Volker Stolz Don t let data Go astray NWPT 16 15 / 20

Our analysis Assignments, from/to struct members, calls: Φ(S, [x := e] l ) = φ(s, x, e, l) Volker Stolz Don t let data Go astray NWPT 16 16 / 20

Our analysis Assignments, from/to struct members, calls: Φ(S, [x := e] l ) = φ(s, x, e, l) φ(s, x, y.f, l) = S[x {TA(l ) y [y := e] l [y ch] l f.a. (l : y ) aliases(l : y)}] φ(s, x.f, y.f, l) = S[x.f {TA(l ) [y := e] l [y ch] l y f.a. (l : y ) aliases(l : y)}] φ(s, x, v 1 v 2, l) = { S[x 1] if v1 is a source S[x Φ v 1 exit ] otherwise aliases(l : u) produced by existing context-insensitive pta. tainting a struct-member taints the entire struct same for slices & built-in key/value map data structure Volker Stolz Don t let data Go astray NWPT 16 16 / 20

Our analysis Assignments, from/to struct members, calls: Φ(S, [x := e] l ) = φ(s, x, e, l) φ(s, x, y.f, l) = S[x {TA(l ) y [y := e] l [y ch] l f.a. (l : y ) aliases(l : y)}] φ(s, x, v 1 v 2, l) = { S[x 1] if v1 is a source S[x Φ v 1 exit ] otherwise aliases(l : u) produced by existing context-insensitive pta. tainting a struct-member taints the entire struct same for slices & built-in key/value map data structure Volker Stolz Don t let data Go astray NWPT 16 16 / 20

Channel Handling Φ(S, [x ch] l ) = S[ch S(x)] Φ(S, [x ch] l ) = S[x A] where A = {TA(l ) ch [x ch ] l } f.a. (l : ch ) aliases(l : ch) existing pta knows about channels we lose information about ordering of messages in channels (even in obvious cases) Volker Stolz Don t let data Go astray NWPT 16 17 / 20

Sanitizers & Monitoring Taint analysis with only sources and sinks too restrictive Certain operations untaint data: SQL injections: filter out dangerous characters Password example: data flowing through hash-function sanitized Natural extension through third set of operations: sanitizers Interesting questions: Where to report tainted flow at runtime (early/late) Minimize taint-tagging at runtime Automated placement of sanitizers to repair programs (for C : Livshits, POPL 13) Volker Stolz Don t let data Go astray NWPT 16 18 / 20

Practical Evaluation Extensive list of sources & sinks from Eric s security projects Test suite with hand-crafted examples: Runtime bounded by lattice height, calling contexts Real-world case study: well... fine-grained analysis descends into imported packages (fmt.print) case study requires domain specific properties (and by now everyone is avoiding SQL injections... ) SSA-based infrastructure requires setup that actually compiles (git clone github.com/random/goproject not enough) High-profile targets: Docker, Dropbox, Android development Volker Stolz Don t let data Go astray NWPT 16 19 / 20

Summary: Challenge in Go Traditional data-flow analysis, for a newer language Information flow: sources to sinks (via sanitizers) mixes problems from C (structs, pointers) with problems from π-calculus (channels) combination of off-the-shelf components (SSA representation & pta from Go compiler infrastructure, vasco interprocedural analysis) 1 More research is needed on... : fine tune white-list/black-list for imports or: precompute & cache standard libraries? collect (performance) data on real-world Go code 1 E. Bodden, K. I Pun, M. Steffen, V. Stolz, A.-K. Wickert Information Flow Analysis for Go, ISoLA 2016. Volker Stolz Don t let data Go astray NWPT 16 20 / 20

Summary: Challenge in Go Traditional data-flow analysis, for a newer language Information flow: sources to sinks (via sanitizers) mixes problems from C (structs, pointers) with problems from π-calculus (channels) combination of off-the-shelf components (SSA representation & pta from Go compiler infrastructure, vasco interprocedural analysis) 1 More research is needed on... : fine tune white-list/black-list for imports or: precompute & cache standard libraries? collect (performance) data on real-world Go code 1 E. Bodden, K. I Pun, M. Steffen, V. Stolz, A.-K. Wickert Information Flow Analysis for Go, ISoLA 2016. Volker Stolz Don t let data Go astray NWPT 16 20 / 20