Efficient Private Information Retrieval

Konstantinos F. Nikolopoulos
The Graduate Center, City University of New York
knikolopoulos@gradcenter.cuny.edu
Committee: Spiridon Bakiras (Mentor), Abdullah Uz Tansel, Stathis Zachos

Introduction
- Database queries can reveal sensitive information about an individual.
- Privacy-preserving query processing is therefore an emerging research area in database systems.
- Objective: hide the content of the query from the database server.
- Existing solutions:
  - Database encryption
  - Query obfuscation
  - Private information retrieval (PIR) (our work)

Introduction
- Database encryption
  - Design encryption methods that support query processing on the encrypted data (e.g., OPES).
  - Queries are also encrypted, so query privacy is inherent.
  - Limitation: access pattern attacks
- Query obfuscation
  - Hide the actual query in a pool of decoy queries.
  - Limitation: weak privacy
- PIR
  - Offers perfect privacy
  - Limitation: computationally expensive

Private information retrieval
- Consider an n-bit database DB = {x_1, x_2, ..., x_n}.
- A PIR protocol allows a client to retrieve bit x_i while keeping the index i secret from the server.
- Can easily be adapted to retrieve blocks of data.
- Objective: hide the content of the query; the database itself need not be encrypted.
  - Encryption is orthogonal and can be applied independently.
- There are three main categories of PIR protocols:
  - Information-theoretic PIR
  - Computational PIR
  - Secure hardware based PIR (our work)

Information-theoretic PIR [Chor et al., FOCS 95]
- Ensures that the query discloses no information about the retrieved bit, even if the server has unbounded computational power.
- With a single server and information-theoretic privacy, all n bits of the database must be transmitted to the client.
- For sublinear communication cost, the database must be replicated across k non-colluding servers.
- Not practical: the servers must be trusted not to collude with each other.
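To illustrate why replication across non-colluding servers helps, here is a minimal sketch of the classic two-server XOR scheme. It is the simplest information-theoretic construction, not the sublinear protocol of Chor et al.: each query is still n bits long, and only the answers shrink to one bit. The function names are hypothetical and chosen for this example.

```python
import secrets

def make_queries(n, i):
    """Client: two selection vectors that differ only at index i.

    Each vector on its own is uniformly random, so a single server
    learns nothing about i (as long as the servers do not collude).
    """
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = q1.copy()
    q2[i] ^= 1
    return q1, q2

def server_answer(db, selection):
    """Server: XOR of the database bits selected by the query vector."""
    acc = 0
    for bit, selected in zip(db, selection):
        acc ^= bit & selected
    return acc

def retrieve(db, i):
    """Client: the two one-bit answers differ exactly by x_i."""
    q1, q2 = make_queries(len(db), i)
    return server_answer(db, q1) ^ server_answer(db, q2)
```

Both servers hold the same copy of `db`; in a real deployment `server_answer` would run on two separate machines, and the privacy guarantee disappears if they compare the queries they received.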

Computational PIR
- These protocols work with a single server.
- They employ well-known cryptographic primitives that guarantee query privacy against a computationally bounded server:
  - Quadratic residuosity assumption [Kushilevitz and Ostrovsky, FOCS 97]
  - φ-hiding assumption [Gentry and Ramzan, ICALP 05]
- Extremely expensive: they need at least one modular multiplication for every bit of the database.

Secure hardware PIR
- Implements the PIR functionality through specialized hardware called a secure co-processor (SCOP).
  - This is a tamper-resistant CPU that can be trusted to carry out its processing unmolested and unobserved, even if the adversary has physical access to it.
- [Wang et al., ESORICS 06]
  - The database consists of n pages.
  - The SCOP initially encrypts and obliviously permutes the database pages.
  - The SCOP's cache can hold k out of n pages.
  - Typical cache size is 64MB.

Secure hardware PIR [Wang et al., ESORICS 06] (cont'd)
- Clients communicate with the SCOP through a secure SSL connection.
- Each client requests a page, which the SCOP retrieves and returns to the client through the SSL connection.
- For every request, the SCOP inserts a new page into the cache (even on a cache hit).
- When the storage capacity is reached, the database is re-shuffled.
- The amortized computational cost of this approach is O(n/k).

Secure hardware PIR [Williams and Sion, NDSS 08 and Williams et al., CCS 08]
- Amortized O(log^2 n) and O(log n log log n) computational costs, respectively (state of the art).
- Implement the Oblivious RAM model on a SCOP.
- The SCOP maintains a pyramid data structure with log_4 n levels on the server's disk, where level i contains 4^i buckets.
- For every request, the SCOP accesses one bucket per level.
- Every requested page is inserted at the top of the pyramid structure.
- When a level becomes full, the SCOP empties it into the next level after re-encrypting the pages and obliviously re-permuting them.

Motivation
- Secure hardware PIR is currently the only practical approach for acceptable privacy.
- Existing solutions feature amortized costs.
  - Some queries may incur prohibitively large response times (100s or 1000s of seconds).
  - This is due to the page re-shuffling process.
- Consider a database of n pages.
  - After re-shuffling the database, any page has an equal probability (= 1/n) of being stored in any of the n disk locations.

Motivation
- Some applications may not require such stringent privacy guarantees.
- What if we allow pages to land in different disk locations according to a nonuniform distribution?
- Our goals:
  - Reasonable page retrieval times
  - Constant cost
  - Adjustable privacy level

Our approach
- Every requested page is temporarily stored inside the SCOP's cache.
- Continuous re-shuffling of the database:
  - During each request, one previously retrieved page is relocated to a new position on the disk.
- We introduce the notion of c-approximate PIR.
  - A scheme provides c-approximate PIR if, after moving a single page p to a new location on the disk, for any pair of disk locations l_i, l_j, the probability of p landing in location l_i is at most c times the probability of p landing in location l_j.
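The definition of c-approximate PIR can be restated symbolically. The random variable L(p), denoting the disk location that page p is moved to, is notation introduced here for illustration and does not appear on the slides.

```latex
\[
  \Pr[\,L(p) = l_i\,] \;\le\; c \cdot \Pr[\,L(p) = l_j\,]
  \qquad \text{for all disk locations } l_i,\, l_j .
\]
```

For c = 1 the inequality holds in both directions for every pair, so all n locations must be equally likely, each with probability 1/n. Larger values of c permit a more skewed relocation distribution.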

Architecture
- Adversary
  - Polynomial time
  - Can query the database
  - Can see the accessed locations on the disk
  - Knows all algorithms implemented inside the secure hardware
  - Curious but not malicious

Page retrieval algorithm
- During each page request, the SCOP:
  - Retrieves a block of k contiguous pages (in a round-robin manner); k is the security parameter.
  - Retrieves the requested page, or a random one in the case of a cache hit.
  - Moves a random page from the cache to one of the k locations corresponding to the retrieved block.
  - Stores the retrieved page into the cache.
- Database updates are handled similarly.
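The steps of the page retrieval algorithm can be sketched as a small simulation. This is one plausible reading of the slide, not the authors' implementation. Two details are assumptions made here: the disk holds n pages while the cache holds m further pages, and the page displaced from the block is written into the location vacated by the fetched page. The class and method names are hypothetical, and encryption is omitted.

```python
import random

class ScopSim:
    """Toy model: page ids only, no encryption, no real disk I/O."""

    def __init__(self, n, m, k, seed=0):
        assert n % k == 0, "assume k divides n so blocks tile the disk"
        self.rng = random.Random(seed)
        self.n, self.k = n, k
        pages = list(range(n + m))       # assumption: n pages on disk + m cached
        self.rng.shuffle(pages)          # stands in for the initial oblivious permutation
        self.disk = pages[:n]            # disk[location] -> page id
        self.cache = pages[n:]           # page ids held inside the SCOP
        self.loc = {p: i for i, p in enumerate(self.disk)}  # look-up table
        self.next_block = 0

    def request(self, page):
        """Serve one request; return the disk locations the server observes."""
        start = self.next_block * self.k
        block = range(start, start + self.k)   # round-robin block of k pages
        self.next_block = (self.next_block + 1) % (self.n // self.k)

        # Fetch the requested page, or a random disk page on a cache hit.
        target = self.loc.get(page)
        if target is None:
            target = self.rng.randrange(self.n)
        fetched = self.disk[target]

        # Evict a random cached page into a random slot of the block.
        victim = self.cache.pop(self.rng.randrange(len(self.cache)))
        slot = self.rng.choice(block)
        displaced = self.disk[slot]

        del self.loc[fetched]
        if slot != target:               # assumption: displaced page fills the hole
            self.disk[target] = displaced
            self.loc[displaced] = target
        self.disk[slot] = victim
        self.loc[victim] = slot
        self.cache.append(fetched)       # the retrieved page now lives in the cache
        return list(block) + [target]
```

Each call touches k+1 disk locations, which matches the transfer cost of k+1 pages quoted for the deployment. The block part of the access pattern is deterministic, so the only request-dependent access is the single extra location.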

Security analysis
- What should the value of k be to satisfy the privacy parameter c?
- Assume a cache size of m pages.
- Consider a page that has just been stored in the cache.
  - The probability that it is relocated on the subsequent block is [formula not preserved in transcript]
  - The probability that it is relocated on the block before that is [formula not preserved in transcript]

Security analysis
- To satisfy the privacy parameter c, we want [formula not preserved in transcript]
- c = 1 corresponds to the trivial case of PIR (k = n).
- Given c and n, the value of the security parameter k is determined by the available cache capacity.
- Page retrieval times increase linearly with k.
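The formulas on these slides did not survive transcription, so the following is only a reconstruction under stated assumptions, not the authors' derivation. Assume each request evicts a uniformly random one of the m cached pages, so a cached page leaves with probability 1/m per request, and assume the n/k blocks are visited in a fixed cycle. Under that model, the most likely destination block is the next one and the least likely is the one visited n/k - 1 requests later, giving a probability ratio of (1 - 1/m)^-(n/k - 1). The sketch below finds the smallest k that keeps this ratio at or below c. The function names are hypothetical.

```python
import math

def skew(n, m, k):
    """Ratio between the most and least likely destination block
    under the assumed uniform-eviction model."""
    return (1 - 1 / m) ** -(n / k - 1)

def min_block_size(n, m, c):
    """Smallest block size k with skew(n, m, k) <= c (assumed model)."""
    if m < 2 or c < 1:
        raise ValueError("need a cache of at least 2 pages and c >= 1")
    # skew <= c  <=>  n/k - 1 <= ln(c) / -ln(1 - 1/m)
    max_period = 1 + math.log(c) / -math.log(1 - 1 / m)
    return min(n, math.ceil(n / max_period))
```

The model agrees with the two qualitative statements on the slide: c = 1 forces k = n, and for fixed c and n a larger cache m allows a smaller k.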

Secure hardware deployment
- Storage cost
  - Cached pages
  - Look-up table (location of each page on the disk or in the cache)
- Page retrieval cost
  - Transfer cost between the SCOP and the server (k+1 pages)
  - Encryption/decryption cost inside the SCOP (bottleneck)

Secure hardware deployment
- Typical values used in our results [table not preserved in transcript]

[Figure: Response time for 1KB pages (c=2)]

[Figure: Response time vs. c = 1+ε]

[Figure: Software implementation (1TB, c=2)]

Back to Computational PIR
- Traditional computational PIR relies on a single server.
- Today's infrastructure:
  - Cloud computing
  - High-performance clusters
  - Parallel programming
  - Amazon EC2, Microsoft Azure, ...
- Motivation: employ today's increasingly available parallel computational resources.
- Goal: improve the efficiency of computational PIR.

Parallel Computing
- Communication
  - Shared memory
  - Message passing
- Computation
  - Data parallelism: Single Instruction, Multiple Data (SIMD)
  - Operation parallelism: Multiple Instruction, Multiple Data (MIMD)
  - Single Program, Multiple Data (SPMD)

Message Passing
- Process
  - A processor running a program.
  - Uses its own program counter and address space.
- Inter-process communication
  - Move data from one process's local address space to another's.
- Cooperation
  - Messages have to be sent and received.

Message Passing Interface (MPI)
- What is MPI?
  - A message-passing library specification
  - Many implementations: MPICH, OpenMPI, ...
  - MIMD/SPMD parallelism
- Provides access to parallel hardware while abstracting away its underlying details.
- Portable: implemented on many platforms.
- Languages: mainly C and Fortran.
- MPI Standard: http://www.mpi-forum.org

MPI Basics
- Datatypes
- Point-to-point communication
- Collective communication
- Process topologies
- Parallel I/O

[Figure: Communicators, Groups, Processes]

[Figure: Our proposed architecture]

Gentry-Ramzan PIR
- O(w + B) communication complexity
  - w: security parameter, based on the database size n and B
  - B: length of the database blocks (in bits)
- Relies on the φ-hiding assumption
- Server operations
  - Chinese Remainder Theorem
  - Modular multiplications
- Client operations
  - Discrete logarithm (Pohlig-Hellman), efficient due to the small primes used

Gentry-Ramzan Parallel Version: Database Initialization
- Each process independently handles a part of the database.
- Operate on pages, not blocks.
  - Page size: k = xB, for an integer x
- N available processes; process rank r with 0 ≤ r ≤ N-1
- Blocks are associated with prime powers π_i
  - π_i = p_i^(c_i), where the p_i are small, distinct primes and c_i = ceil(l / log_2 p_i)

[Figure: Gentry-Ramzan Parallel Version, Database Initialization Example]

Gentry-Ramzan Parallel Version: Query Generation & Extraction
- Client generates a pair (G, g)
  - G: cyclic group of order |G| = q·π_i, for an integer q
  - g: generator of G
- Receives g_e = g^e ∈ G from the server.
- Computes h_e = g_e^q.
- Computes log_h h_e within the subgroup H ⊂ G of order π_i (h = g^q generates H).
  - Pohlig-Hellman
  - Efficient if the primes p_i are small.

Gentry-Ramzan Parallel Version: Query Response
- Server receives g and the modulus m.
- Distributes the values to the processes.
- Gathers g_e from all processes.
- Sends the result back to the client.
- Many MPI details are hidden behind these operations, partly due to the length of the numbers.
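The Gentry-Ramzan flow (CRT on the server, one modular exponentiation, Pohlig-Hellman on the client) can be sketched end to end on toy numbers. This is a single-process illustration under assumptions made here: 8-bit blocks, four tiny primes, a modulus of only a few digits found by trial division, and a cofactor derived from lcm(q0-1, q1-1) standing in for q. It offers no security at this size and does not reproduce the parallel MPI implementation. All function names are hypothetical.

```python
from math import gcd

BLOCK_BITS = 8                  # l: bits per database block (toy value)
SMALL_PRIMES = [3, 5, 7, 11]    # p_i: one small prime per block

def is_prime(n):
    """Trial division; adequate only for the tiny numbers used here."""
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True

def exponent_for(p):
    """Smallest c with p**c >= 2**BLOCK_BITS, i.e. ceil(l / log2 p)."""
    c = 1
    while p ** c < 2 ** BLOCK_BITS:
        c += 1
    return c

def server_exponent(db):
    """CRT: the integer e with e = db[i] (mod pi_i) for every block i."""
    e, modulus = 0, 1
    for block, p in zip(db, SMALL_PRIMES):
        pi = p ** exponent_for(p)
        e += modulus * ((block - e) * pow(modulus, -1, pi) % pi)
        modulus *= pi
    return e

def make_query(i):
    """Client: modulus m = q0*q1 with pi_i dividing q0 - 1, and a base g."""
    p = SMALL_PRIMES[i]
    c = exponent_for(p)
    pi = p ** c
    t = 2
    while not is_prime(t * pi + 1):
        t += 2
    q0 = t * pi + 1
    q1 = q0 + 2
    while not is_prime(q1):
        q1 += 2
    m = q0 * q1
    lam = (q0 - 1) * (q1 - 1) // gcd(q0 - 1, q1 - 1)
    cofactor = lam // pi              # plays the role of q on the slide
    g = 2
    while gcd(g, m) != 1 or pow(g, lam // p, m) == 1:
        g += 1                        # need g**cofactor to have order exactly pi
    return (m, g), (p, c, cofactor)

def server_answer(db, query):
    """Server: g_e = g**e mod m."""
    m, g = query
    return pow(g, server_exponent(db), m)

def extract(answer, query, secret):
    """Client: Pohlig-Hellman discrete log in the subgroup of order p**c."""
    (m, g), (p, c, cofactor) = query, secret
    h = pow(g, cofactor, m)           # generator of H, order p**c
    h_e = pow(answer, cofactor, m)    # equals h**(e mod p**c)
    gamma = pow(h, p ** (c - 1), m)   # element of order p
    x = 0
    for j in range(c):                # recover the base-p digits one by one
        y = pow(h_e * pow(h, -x, m) % m, p ** (c - 1 - j), m)
        digit = next(d for d in range(p) if pow(gamma, d, m) == y)
        x += digit * p ** j
    return x

db = [17, 200, 42, 255]
query, secret = make_query(2)
print(extract(server_answer(db, query), query, secret))  # recovers db[2]
```

The server's work is dominated by `server_exponent` and the single `pow`, which are the steps a parallel version would split across processes. The client's discrete logarithm stays cheap because each Pohlig-Hellman digit search runs over at most p_i candidates. The negative exponent in `pow(h, -x, m)` requires Python 3.8 or later.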

[Figure: Gentry-Ramzan Parallel Version, Overview]

Kushilevitz-Ostrovsky PIR
- The first computational PIR scheme; it eliminated database replication.
- Relies on the Quadratic Residuosity Assumption.
  - If p_1, p_2 are large primes, determining whether a number is a quadratic residue (QR) or a quadratic non-residue (QNR) modulo p_1 p_2 is computationally hard without knowing p_1 and p_2.
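A minimal sketch of the basic Kushilevitz-Ostrovsky protocol on a bit matrix shows how quadratic residuosity hides the query. It makes several assumptions for illustration: the two primes are small, fixed constants (65537 and 104729), the database is a tiny matrix, and the client asks for one column and reads off one row. None of the later optimizations are included, and the function names are hypothetical.

```python
import random
from math import gcd

P1, P2 = 65537, 104729      # toy primes; real deployments use large secret primes
N = P1 * P2

def is_qr(z, p):
    """Euler's criterion: z is a quadratic residue modulo the odd prime p."""
    return pow(z, (p - 1) // 2, p) == 1

def client_query(col, num_cols, rng):
    """One value per column: a QNR for the wanted column, QRs elsewhere.

    The QNR is a non-residue modulo both primes, so without the
    factorization it is assumed to be indistinguishable from the QRs.
    """
    ys = []
    for j in range(num_cols):
        while True:
            x = rng.randrange(2, N)
            if gcd(x, N) != 1:
                continue
            if j != col:
                ys.append(x * x % N)
                break
            if not is_qr(x, P1) and not is_qr(x, P2):
                ys.append(x)
                break
    return ys

def server_answer(matrix, ys):
    """Per row: multiply y_j for 1-bits and y_j**2 for 0-bits (mod N)."""
    answers = []
    for row in matrix:
        z = 1
        for bit, y in zip(row, ys):
            z = z * (y if bit else y * y) % N
        answers.append(z)
    return answers

def client_decode(z):
    """The product is a QR exactly when the hidden bit is 0."""
    return 0 if is_qr(z, P1) else 1

matrix = [[0, 1, 1, 0],
          [1, 0, 0, 1],
          [1, 1, 0, 0]]
rng = random.Random(0)
answers = server_answer(matrix, client_query(2, 4, rng))
print([client_decode(z) for z in answers])  # column 2 of the matrix, one bit per row
```

The server performs one modular multiplication per database bit, which is the cost the computational PIR slide calls extremely expensive. The client receives one answer per row and keeps only the row it wants.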

Kushilevitz-Ostrovsky Example
- Example adapted from Tan's presentation [figure not preserved in transcript]

Kushilevitz-Ostrovsky Improvements
- Read chunks of bits and store them in cache.
- Keep all 256 possible combinations for a byte.

Further Work
- Additional parallelization, e.g., GPUs for modular multiplications
- Add more PIR schemes to the framework
- Collective run times of experiments
- Develop a complete application

References
- B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan: Private information retrieval. In Proc. IEEE FOCS, 1995.
- E. Kushilevitz and R. Ostrovsky: Replication is NOT needed: Single database, computationally-private information retrieval. In Proc. IEEE FOCS, 1997.
- C. Gentry and Z. Ramzan: Single-database private information retrieval with constant communication rate. In Proc. ICALP, 2005.
- S. Wang, X. Ding, R. H. Deng, and F. Bao: Private information retrieval using trusted hardware. In Proc. ESORICS, 2006.
- P. Williams and R. Sion: Usable PIR. In Proc. NDSS, 2008.
- P. Williams, R. Sion, and B. Carbunar: Building castles out of mud: Practical access pattern privacy and correctness on untrusted storage. In Proc. ACM CCS, 2008.
- MPI Standard: http://www.mpi-forum.org