Private Information Retrieval (PIR)


Levente Buttyán

Problem formulation
- Alice wants to obtain information from a database, but she does not want the database to learn which information she wanted
  - e.g., Alice is an investor querying a stock-market database
  - e.g., Alice is a company querying a patent database
- a trivial solution is for Alice to download the entire database
- can the problem be solved with less communication?
- typical model:
  - the database is an n-bit string $X = x_1 x_2 \ldots x_n$
  - Alice is interested in $x_i$
  - the database should not be able to learn $i$

Some negative results
- if Alice uses a deterministic scheme, then n bits must be transferred (even if there are multiple non-communicating copies of the database)
  - Alice should use coin flips (a randomized algorithm)
- if the database has unlimited computational power and there is only a single copy of the database, then n bits must be transferred
  - there is hope if the database can only perform efficient computations (i.e., it is computationally bounded)
  - there is hope if the database has unlimited computational power but there are multiple non-communicating copies of the database

An example PIR protocol
- assume that there are 4 copies of the database
- the bits of X are arranged in an $n^{1/2} \times n^{1/2}$ matrix
- Alice wants to retrieve $x_{ij}$ ($1 \le i, j \le n^{1/2}$)
- protocol:
  - Alice generates two random bit strings s and t of length $n^{1/2}$
  - let s' be the same as s but with the i-th bit flipped, and let t' be the same as t but with the j-th bit flipped
  - Alice sends
    - s and t to DB1
    - s' and t to DB2
    - s and t' to DB3
    - s' and t' to DB4
  - each DB returns a single bit computed as the XOR of the bits $x_{ab}$ where the a-th bit of s (or s') and the b-th bit of t (or t') are both equal to 1
  - Alice XORs the received bits, and the result gives $x_{ij}$
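The four-database protocol above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names (`random_bits`, `flip`, `server_answer`, `retrieve`) are hypothetical, indices are 0-based, and the four "databases" are simulated by calling the same function on the same matrix.

```python
import secrets

def random_bits(length):
    """Uniformly random bit string of the given length."""
    return [secrets.randbelow(2) for _ in range(length)]

def flip(bits, pos):
    """Copy of bits with the bit at position pos flipped."""
    out = list(bits)
    out[pos] ^= 1
    return out

def server_answer(matrix, row_sel, col_sel):
    """One database: XOR of all x_ab with row_sel[a] == 1 and col_sel[b] == 1."""
    bit = 0
    for a, row in enumerate(matrix):
        if row_sel[a]:
            for b, x in enumerate(row):
                if col_sel[b]:
                    bit ^= x
    return bit

def retrieve(matrix, i, j):
    """Alice: query four replicas and XOR the four answer bits (0-based i, j)."""
    side = len(matrix)
    s, t = random_bits(side), random_bits(side)
    s2, t2 = flip(s, i), flip(t, j)          # s' and t' in the slides
    queries = [(s, t), (s2, t), (s, t2), (s2, t2)]
    result = 0
    for row_sel, col_sel in queries:         # one query pair per database
        result ^= server_answer(matrix, row_sel, col_sel)
    return result

# Example: a 9-bit database arranged as a 3 x 3 matrix.
X = [[1, 0, 1],
     [0, 1, 1],
     [0, 0, 1]]
recovered = [[retrieve(X, i, j) for j in range(3)] for i in range(3)]
```

Each database sees only one pair of uniformly random strings, and Alice sends $2 n^{1/2}$ bits to each replica instead of downloading all n bits.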

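An example PIR protocol: why does it work?
[Figure: the matrix X overlaid with the four selections (s, t), (s', t), (s, t'), (s', t'); each entry is labelled with the number of answers that can include it: 4 outside row i and column j, 2 on the rest of row i and column j, and 1 for $x_{ij}$.]
- an entry $x_{ab}$ with $a \ne i$ and $b \ne j$ is selected either by all four databases or by none, so it cancels in the final XOR
- an entry on row i or column j (other than $x_{ij}$) is selected by exactly two databases or by none, so it cancels as well
- $x_{ij}$ is selected by exactly one database, so it is the only bit that survives

An example PIR protocol: why is it private?
- each database receives two random vectors that are independent of i and j
- hence no information on i and j is leaked to the database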

Information theoretic vs. computational PIR
- information theoretic PIR protocols leak no information (in the information theoretic sense) about the index requested by Alice
  - they withstand attacks even from a database with unlimited computational power
- computational PIR (CPIR) protocols provide weaker guarantees: they ensure only that the database cannot get any information unless it solves a computationally hard problem (reduction)
- information theoretic PIR protocols require more than one non-communicating copy of the database, while CPIR protocols with low communication overhead exist even for the single database case

An example CPIR protocol
- preliminaries
  - let m be a positive integer
  - a number a is a quadratic residue (QR) mod m if there is an integer x such that $x^2 \bmod m = a$
  - otherwise a is a quadratic non-residue (QNR) mod m
  - it is computationally hard to distinguish numbers that are QRs mod m from numbers that are QNRs mod m, unless one knows the factorization of m
- setup
  - the bits of X are arranged in an $n^{1/2} \times n^{1/2}$ matrix
  - Alice wants to retrieve $x_{ij}$ ($1 \le i, j \le n^{1/2}$)
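To illustrate why knowing the factorization of m makes the QR/QNR decision easy, here is a small sketch. It assumes m = p·q for two distinct odd primes (an assumption made for the example; the slides only say that m is a large integer with known factorization) and uses Euler's criterion modulo each prime. The function names and the toy modulus 77 are hypothetical.

```python
from math import gcd

def is_qr_mod_prime(a, p):
    """Euler's criterion: for an odd prime p and a not divisible by p,
    a is a QR mod p exactly when a^((p-1)/2) mod p == 1."""
    return pow(a, (p - 1) // 2, p) == 1

def is_qr(a, p, q):
    """For a coprime to m = p*q: a is a QR mod m iff it is a QR mod p and mod q."""
    return is_qr_mod_prime(a % p, p) and is_qr_mod_prime(a % q, q)

# Toy modulus, small enough to check against brute force.
p, q = 7, 11
m = p * q
units = [a for a in range(1, m) if gcd(a, m) == 1]

# Definition from the slides: a is a QR if some x satisfies x^2 mod m == a.
brute = {a for a in units if any(x * x % m == a for x in range(1, m))}
# Same set, computed with the factorization and two modular exponentiations.
fast = {a for a in units if is_qr(a, p, q)}
```

Without p and q, no efficient method is known for this decision on general inputs; that assumption is what the CPIR protocol relies on.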

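An example CPIR protocol
- protocol:
  - Alice chooses at random a large integer m (together with its factorization)
  - she generates $n^{1/2} - 1$ random QRs mod m: $a_1, a_2, \ldots, a_{i-1}, a_{i+1}, \ldots, a_{n^{1/2}}$
  - she generates a random QNR mod m: b
  - Alice sends $a_1, a_2, \ldots, a_{i-1}, b, a_{i+1}, \ldots, a_{n^{1/2}}$ to the database
  - the server cannot tell QRs from QNRs mod m, so from the server's point of view, the received vector is just an array of random numbers: $u_1, u_2, \ldots, u_{n^{1/2}}$
  - for each column c of X, the database computes $v_c = u_1^{x_{1c}} u_2^{x_{2c}} \cdots u_{n^{1/2}}^{x_{n^{1/2} c}} \bmod m$
  - the database responds with $v_1, v_2, \ldots, v_{n^{1/2}}$
  - Alice checks whether $v_j$ is a QR or a QNR mod m
    - if QR, then $x_{ij} = 0$
    - if QNR, then $x_{ij} = 1$

An example CPIR protocol: why does it work?
- column j of X is $(x_{1j}, \ldots, x_{ij}, \ldots, x_{n^{1/2} j})$ and the query is $U = (a_1, \ldots, a_{i-1}, b, a_{i+1}, \ldots, a_{n^{1/2}})$
- $v_j = a_1^{x_{1j}} \cdots b^{x_{ij}} \cdots a_{n^{1/2}}^{x_{n^{1/2} j}} \bmod m$
- if $x_{ij} = 0$ then only QRs are multiplied, otherwise QRs are multiplied with a single QNR
- it is known that QR x QR = QR and QR x QNR = QNR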
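The single-database protocol above can be sketched as follows. This is a toy illustration under stated assumptions, not a secure implementation: the modulus is the product of two small hard-coded primes (1009 and 1013), the QNR is chosen to be a non-residue modulo both primes so that it is not trivially distinguishable, indices are 0-based, and all function names are hypothetical.

```python
import secrets
from math import gcd

P, Q = 1009, 1013        # toy primes; real use would need large secret primes
M = P * Q                # the modulus m sent to the database

def is_qr(a):
    """Alice's test, using the factorization: QR mod M iff QR mod P and mod Q."""
    return pow(a % P, (P - 1) // 2, P) == 1 and pow(a % Q, (Q - 1) // 2, Q) == 1

def random_unit():
    """Random element of {2, ..., M-1} that is coprime to M."""
    while True:
        r = secrets.randbelow(M - 2) + 2
        if gcd(r, M) == 1:
            return r

def random_qr():
    """Squaring a random unit always yields a QR."""
    return pow(random_unit(), 2, M)

def random_qnr():
    """Random unit that is a non-residue mod both P and Q."""
    while True:
        r = random_unit()
        if pow(r % P, (P - 1) // 2, P) == P - 1 and pow(r % Q, (Q - 1) // 2, Q) == Q - 1:
            return r

def make_query(side, i):
    """Alice: QRs everywhere except a QNR at the row of interest."""
    u = [random_qr() for _ in range(side)]
    u[i] = random_qnr()
    return u

def server_answer(matrix, u):
    """Database: v_c = product over rows r of u[r]^x_rc mod M, for each column c."""
    answers = []
    for c in range(len(matrix[0])):
        v = 1
        for r in range(len(matrix)):
            if matrix[r][c]:
                v = v * u[r] % M
        answers.append(v)
    return answers

def retrieve(matrix, i, j):
    """Alice: x_ij is 0 if v_j is a QR and 1 if it is a QNR."""
    v = server_answer(matrix, make_query(len(matrix), i))
    return 0 if is_qr(v[j]) else 1

X = [[1, 0, 1],
     [0, 1, 1],
     [0, 0, 1]]
recovered = [[retrieve(X, i, j) for j in range(3)] for i in range(3)]
```

Alice sends $n^{1/2}$ numbers and receives $n^{1/2}$ numbers, each the size of m, and learns every bit of row i at once.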

State of the art
- the best known information theoretic PIR protocol is based on representing the database as a polynomial, and requires the transmission of $n^{O(\log\log k / (k \log k))}$ bits (where k is the number of copies of the database)
- CPIR schemes have been constructed based on the difficulty of the Quadratic Residue Problem ($O(n^{\varepsilon})$) and the φ-hiding problem ($O((\log n)^a)$), and based on one-way permutations ($n - o(n)$)
- connections of CPIR to oblivious transfer, collision resistant hash functions, function hiding public key crypto, and complexity theory in general have been studied

Variants of PIR and CPIR
- block PIR
  - what if Alice wants a block of bits (of size m)? can we do better than invoking a PIR protocol m times?
- robust PIR
  - what if some of the database copies break down or return false answers (Byzantine failure model)?
- t-private PIR
  - how to ensure that even t colluding databases cannot figure out which bit Alice is interested in?
- symmetric PIR
  - how to prevent Alice from learning more than just the bit she is interested in?
- PIR with preprocessing
  - the database usually has to do O(n) computations; can this be cut down?

Locally decodable codes (LDCs)
- error correcting codes
  - add redundancy to a message to obtain a codeword
  - send the codeword over a noisy channel
  - recover the message even if some fraction of the codeword bits are corrupted
- in practice, longer messages are partitioned into smaller blocks and each block is coded separately
  - this allows efficient random access to message bits (one must decode only a fraction of the received codewords)
  - however, if even a single codeword is lost (unrecoverable), then the message cannot be recovered
- if the entire message were encoded as a single large block
  - this would improve robustness
  - but random access would require decoding the entire message (typically prohibitively expensive)

Locally decodable codes (LDCs)
- LDCs simultaneously provide random access retrieval and high noise resistance
- this is achieved by allowing the reliable reconstruction of any bit of the message from a small number of randomly chosen codeword bits
- definition: a (k, δ, ε)-LDC encodes n-bit messages into N-bit codewords such that every bit $x_i$ of the message can be recovered with probability $1 - \varepsilon$ by a randomized decoding procedure that reads only k codeword bits, even if at most $\delta N$ bits of the codeword are corrupted
- local decodability comes at the price of a loss in code efficiency (N >> n)
- finding more efficient (optimal) LDCs is an active research area and a major challenge

An LDC example
- the (2, δ, 2δ)-Hadamard code encodes n-bit messages into $2^n$-bit codewords
- let H be a binary matrix that contains in its columns all the possible n-bit vectors (H is an $n \times 2^n$ matrix)
- encoding: $y = C(x) = xH$
- decoding (of the i-th bit of x):
  - pick a random n-bit vector t, and let t' be the same as t but with the i-th bit flipped
  - $x_i = y_t \oplus y_{t'}$
- probability of successful decoding
  - at most $\delta N$ bits of y are corrupted (with $N = 2^n$) ~ each bit in y is corrupted with probability δ (independently of the other bits)
  - the probability that $y_t$ or $y_{t'}$ is corrupted is at most 2δ
  - the probability that both $y_t$ and $y_{t'}$ are intact (and hence the decoding of $x_i$ is successful) is at least $1 - 2\delta$

LDCs and the PIR problem
- LDCs yield efficient PIR schemes and vice versa
- all recent constructions of information theoretic PIR schemes work by first constructing LDCs and then converting them into PIR protocols
- general procedure to obtain a k-server PIR scheme from a (perfectly smooth) k-query LDC:
  - each of the k database servers encodes the database X with the LDC and stores C(X)
  - if Alice is interested in $x_i$, she generates k random queries $q_1, q_2, \ldots, q_k$ such that $x_i$ can be recovered from $C(X)_{q_1}, \ldots, C(X)_{q_k}$, and sends $q_j$ to DB$_j$
  - each server DB$_j$ responds with one bit, $C(X)_{q_j}$
  - Alice combines the responses to obtain $x_i$
- privacy
  - perfect smoothness of the LDC means that individual queries are distributed perfectly uniformly over the codeword bits
  - thus, in the PIR scheme, every query $q_j$ is independent of i, and hence reveals no information on i
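The Hadamard code and its use as a 2-server PIR scheme can be sketched as below. This is a minimal illustration with hypothetical function names: codeword positions are indexed by the integer whose binary digits form the n-bit vector t, message bits are 0-based, and the two servers are simulated by two copies of the codeword.

```python
import secrets

def hadamard_encode(x):
    """Codeword y with y[t] = <x, t> mod 2 for every n-bit vector t.
    The vector t is represented by an integer in range(2**n)."""
    n = len(x)
    x_int = sum(bit << k for k, bit in enumerate(x))
    return [bin(x_int & t).count("1") % 2 for t in range(2 ** n)]

def local_queries(n, i):
    """Two codeword positions: a random t, and t with the i-th bit flipped."""
    t = secrets.randbelow(2 ** n)
    return t, t ^ (1 << i)

def decode_bit(y, n, i):
    """Local decoder: reads only 2 codeword bits, x_i = y_t XOR y_t'."""
    t, t2 = local_queries(n, i)
    return y[t] ^ y[t2]

def pir_retrieve(x, i):
    """2-server PIR from the 2-query LDC: each server stores C(X) and is asked
    for one position, which on its own is uniformly distributed."""
    n = len(x)
    server1 = hadamard_encode(x)   # copy held by DB1
    server2 = hadamard_encode(x)   # copy held by DB2
    q1, q2 = local_queries(n, i)
    return server1[q1] ^ server2[q2]

x = [1, 0, 1, 1, 0]
y = hadamard_encode(x)
decoded = [decode_bit(y, len(x), i) for i in range(len(x))]
fetched = [pir_retrieve(x, i) for i in range(len(x))]
```

Decoding works because $\langle x, t \rangle \oplus \langle x, t' \rangle = x_i$ when t and t' differ only in position i. With this particular code each query is itself n bits long, so the sketch illustrates the LDC-to-PIR reduction rather than a communication-efficient scheme.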

Further readings
- S. Yekhanin, Private Information Retrieval, Communications of the ACM, Vol. 53 No. 4, April 2010.
  - good intro + connection with LDCs
  - see also his recent PhD thesis (done at MIT)
- W. Gasarch, A survey on Private Information Retrieval, on-line
  - contains open problems and a lot of references
- R. Ostrovsky, W. Skeith, A survey of single-database PIR: techniques and applications, on-line
  - some constructions in detail + relation to other problems
- + references in the above papers