Security in Data Science

Similar documents
SiTP Dec Cryptography for IoT. Dan Boneh Stanford University

Security and Privacy through Modern Cryptography

Iron: Functional encryption using Intel SGX

Prio: Private, Robust, and Scalable Computation of Aggregate Statistics

Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware

Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware

Securing Distributed Computation via Trusted Quorums. Yan Michalevsky, Valeria Nikolaenko, Dan Boneh

Multi-Theorem Preprocessing NIZKs from Lattices

US Census Bureau Workshop on Multi-party Computing. David W. Archer, PhD 16-Nov-2017

Town Crier. Authenticated Data Feeds For Smart Contracts. CS5437 Lecture by Kyle Croman and Fan Zhang Mar 18, 2016

Safely Measuring Tor. Rob Jansen U.S. Naval Research Laboratory Center for High Assurance Computer Systems

Privacy-Preserving Distributed Linear Regression on High-Dimensional Data

Atom. Horizontally Scaling Strong Anonymity. Albert Kwon Henry Corrigan-Gibbs 10/30/17, SOSP 17

Safely Measuring Tor. Rob Jansen U.S. Naval Research Laboratory Center for High Assurance Computer Systems

Faster Private Set Intersection based on OT Extension

The Exact Round Complexity of Secure Computation

Secure Multiparty Computation: Introduction. Ran Cohen (Tel Aviv University)

Blind Machine Learning

Jian Liu, Sara Ramezanian

DISSENT: Accountable, Anonymous Communication

Secure Multiparty Computation

Research Statement. Yehuda Lindell. Dept. of Computer Science Bar-Ilan University, Israel.

Systems Novelties Seminar (2/24/10)

PrivCount: A Distributed System for Safely Measuring Tor

Secure Multi-party Computation

Secure Multiparty Computation Introduction to Privacy Preserving Distributed Data Mining

Privacy Preserving Collaborative Filtering

CS573 Data Privacy and Security. Cryptographic Primitives and Secure Multiparty Computation. Li Xiong

Secure Multi-Party Sorting and Applications

Spy vs. spy: Anonymous messaging over networks. Giulia Fanti, Peter Kairouz, Sewoong Oh, Kannan Ramchandran, Pramod Viswanath

Securing services running over untrusted clouds : the two-tiered trust model

Security & Privacy. Web Architecture and Information Management [./] Spring 2009 INFO (CCN 42509) Contents. Erik Wilde, UC Berkeley School of

Application to More Efficient Obfuscation

SMART DEVICES: DO THEY RESPECT YOUR PRIVACY?

Accountability in Privacy-Preserving Data Mining

Crypto Background & Concepts SGX Software Attestation

ESP Egocentric Social Platform

BUILDING SECURE (CLOUD) APPLICATIONS USING INTEL S SGX

CRYPTOGRAPHIC PROTOCOLS: PRACTICAL REVOCATION AND KEY ROTATION

Auditing IoT Communications with TLS-RaR

- Presentation 25 minutes + 5 minutes for questions. - Presentation is on Wednesday, 11:30-12:00 in B05-B06

A Machine Learning Approach to Privacy-Preserving Data Mining Using Homomorphic Encryption

And Then There Were More:

How I Learned to Stop Worrying and Love the Internet of Things

SCALABLE MPC WITH STATIC ADVERSARY. Mahnush Movahedi, Jared Saia, Valerie King, Varsha Dani University of New Mexico University of Victoria

ENEE 459-C Computer Security. Security protocols

MTAT Research Seminar in Cryptography Building a secure aggregation database

Towards Practical Differential Privacy for SQL Queries. Noah Johnson, Joseph P. Near, Dawn Song UC Berkeley

Key Management & Protection: Evaluation of Hardware, Tokens, TEEs and MPC

CIS 551 / TCOM 401 Computer and Network Security. Spring 2008 Lecture 23

Implementing Fully Key-Homomorphic Encryption in Haskell. Maurice Shih CS 240h

Scalable privacy-enhanced traffic monitoring in vehicular ad hoc networks

How to (not) Share a Password:

Plaintext Awareness via Key Registration

CPSC 467b: Cryptography and Computer Security

SGX Enclave Life Cycle Tracking TLB Flushes Security Guarantees

Secure Set Intersection with Untrusted Hardware Tokens

Cryptography & Data Privacy Research in the NSRC

Privacy in Statistical Databases

Cooperative Private Searching in Clouds

Trojan-tolerant Hardware

Lecture 44 Blockchain Security I (Overview)

Design and Analysis of Efficient Anonymous Communication Protocols

Query Auditing for Protecting Sensitive Attributes in Statistical Databases

CSE 484 / CSE M 584: Computer Security and Privacy. Web Security. Autumn Tadayoshi (Yoshi) Kohno

Direct Anonymous Attestation

Indrajit Roy, Srinath T.V. Setty, Ann Kilzer, Vitaly Shmatikov, Emmett Witchel The University of Texas at Austin

Lecture 15 PKI & Authenticated Key Exchange. COSC-260 Codes and Ciphers Adam O Neill Adapted from

More crypto and security

Privacy, Discovery, and Authentication for the Internet of Things

Whitewash: Outsourcing Garbled Circuit Generation for Mobile Devices

Privacy, Discovery, and Authentication for the Internet of Things

Vlad Kolesnikov Bell Labs

CIS 4360 Secure Computer Systems SGX

1 A Tale of Two Lovers

Crypto meets Web Security: Certificates and SSL/TLS

anonymous routing and mix nets (Tor) Yongdae Kim

Securing Cloud-assisted Services

Implementation of Decentralized Access Control with Anonymous Authentication in Cloud

Recommendations for TEEP Support of Intel SGX Technology

ENEE 459-C Computer Security. Security protocols (continued)

Nigori: Storing Secrets in the Cloud. Ben Laurie

Notes for Lecture 14

Department of Computer Science Institute for System Architecture, Operating Systems Group TRUSTED COMPUTING CARSTEN WEINHOLD

Mapping Internet Sensors with Probe Response Attacks

Computer Security Course. Public Key Crypto. Slides credit: Dan Boneh

CSE 484 / CSE M 584: Computer Security and Privacy. Anonymity Mobile. Autumn Tadayoshi (Yoshi) Kohno

from circuits to RAM programs in malicious-2pc

Software Security and Intro to Cryptography

Privacy-Preserving Machine Learning

Round-Optimal Secure Multi-Party Computation

Memory Defenses. The Elevation from Obscurity to Headlines. Rajeev Balasubramonian School of Computing, University of Utah

Zero Knowledge Protocol

TRUSTED COMPUTING TECHNOLOGIES

Private Database Queries Using Somewhat Homomorphic Encryption. Dan Boneh, Craig Gentry, Shai Halevi, Frank Wang, David J. Wu

CSC 5930/9010 Cloud S & P: Cloud Primitives

Structure-Preserving Certificateless Encryption and Its Application

Introduction to Cryptoeconomics

Privacy Protected Spatial Query Processing

Secure Multi-Party Computation. Lecture 13

Transcription:

SDSI Nov. 2017 Security in Data Science Dan Boneh Stanford University

Private genomic data analysis [Jagadeesh, Wu, Birgmeier, Boneh, Bejerano, Science, 2017] What genes causes a specific disorder? 2 v 1 : 6 v 3 : 4 0 1 0 2 0 1 1 0 1 2 0 1 2 0 0 2 1 1 0 0 1 2 0 1 3 7 5 People with Kabuki syndrome Each has 211 to 374 rare genes out of 20,000 genes Patient i: vector v i of dim 20,000 that is 0 for normal genes

MPC for genomic data analysis [Jagadeesh, Wu, Birgmeier, Boneh, Bejerano, Science, 2017] MPC protocol r 1, r r 2, r 3, 1 r 2 r 3 v 1 -r 1 v 1 -r 2 -r 1, v 2 -r 2 v 2, 3 -r v 3 -r 3, 3 People with Kabuki syndrome Each has 211 to 374 rare genes out of 20,000 genes Patient i: vector v i of dim 20,000 that is 0 for normal genes

MPC for genomic data analysis [Jagadeesh, Wu, Birgmeier, Boneh, Bejerano, Science, 2017] MPC protocol r 1, r 2, r 3, v 1 -r 1, v 2 -r 2, v 3 -r 3, People with Kabuki syndrome most common rare genes KMT2D, COL6A1 Nothing else is revealed about the individual genomes!!

How does it work? See paper

Can we do this with Intel s SGX? Enclave Application Remote Attestation Enclave Source: ISCA 2015 tutorial slides for Intel SGX

Iron: Functional encryption and obfuscation using Intel SGX Ben Fisch, Dhinakaran Vinayagamurthy, Dan Boneh, Sergey Gorbunov In ACM CCS 2017

... and now for something completely different: Prio: Private, Robust, and Efficient Computation of Aggregate Statistics Joint work with Henry Corrigan-Gibbs

Today: Non-private aggregation StressTracker Blood pressure Every user has a private data point Twitter usage

Today: Non-private aggregation StressTracker Blood pressure Twitter usage

Today: Non-private aggregation StressTracker Blood pressure The app provider learns more than it needs Twitter usage

Prio: Private aggregation Clients send one share of their data to each aggregator App store StressTracker Blood pressure Twitter usage

Prio: Private aggregation App store StressTracker Blood pressure Aggregator learns nothing else Twitter usage

100,000,000 THE PROBLEM App store StressTracker 200 Blood pressure Twitter usage

Private aggregation x1 x2 x3 xn f(x1,, xn) Exact correctness: if all servers are honest they learn f(x 1,,x n ) Privacy: if one server is honest they learn only f(x 1,,x n ) Robustness: Scalable: malicious clients have bounded influence no public-key crypto (other than TLS)

Achieves all four goals Prio contributions 1. Robustness using secret-shared non-interactive proofs (SNIPs) Every client efficiently proves to servers that its submission is well formed Takes advantage of non-colluding servers (verifiers) 2. Aggregatable encodings Compute sums privately compute f( ) privately for many f s of interest

Existing approaches Additively homomorphic encryption P4P (2010), Private stream aggregation (2011), Grid aggregation (2011), PDDP (2012), SplitX (2013), PrivEx (2014), PrivCount (2016), Succinct sketches (2016), Multi-party computation [GMW87], [BGW88] FairPlay (2004), Brickell-Shmatikov (2006), FairplayMP (2008), SEPIA (2010), Private matrix factorization (2013), JustGarble (2013), Anonymous credentials/tokens VPriv (2009), PrivStats (2011), ANONIZE (2014), Randomized response [W65], [DMNS06], [D06], RAPPOR (2014, 2016)

Private aggregation: needed in many settings Private client value (xi) Aggregate f(x1,, xn) Location data (phones/cars) Number of devices in location L Ten most popular locations Locations with weakest signal strength Web browsing history Most common websites Websites with TLS certificate errors Health information Min, max, avg, stddev heart rate ML model relating BP to Twitter usage Text messages Min, max, average number per day ML model relating time of day to emotion

Prio: how does it work? See paper

THE END