CS573 Data Privacy and Security. Differential Privacy. Li Xiong
2 Outline Differential Privacy Definition Basic techniques Composition theorems
4 Statistical Data Privacy. Non-interactive vs interactive setting. Privacy goal: each individual is protected. Utility goal: statistical information remains useful for analysis. [Diagram: the data curator applies a privacy mechanism to the original data; the data analyst poses queries and receives statistics or synthetic data.]
5 Recap. Anonymization or de-identification (input perturbation): linkage attacks, homogeneity attacks. Query auditing/restriction: query denial is itself disclosive, and auditing is computationally infeasible. Summary statistics: differencing attacks.
6 Differential Privacy. Promise: an individual will not be affected, adversely or otherwise, by allowing his/her data to be used in any study or analysis, no matter what other studies, datasets, or information sources are available. Paradox: learning nothing about an individual while learning useful statistical information about a population.
7 Differential Privacy. The statistical outcome is indistinguishable regardless of whether a particular user (record) is included in the data.
9 Differential privacy: an example Original records Original histogram Perturbed histogram with differential privacy
15 Differential Privacy: Some Qualitative Properties Protection against presence/participation of a single record Quantification of privacy loss Composition Post-processing
16 Differential Privacy: Additional Remarks. Correlations between records. Granularity of a single record (the difference between neighboring databases): group privacy; graph databases (e.g., social networks): node vs edge; movie rating databases: user vs event (movie).
17 Outline Differential Privacy Definition Basic techniques Laplace mechanism Exponential mechanism Randomized Response Composition theorems
19 Can deterministic algorithms satisfy differential privacy? Module 2 Tutorial: Differential Privacy in the Wild 19
20 Non-trivial deterministic algorithms do not satisfy differential privacy. Consider the space of all inputs and the space of all outputs (at least 2 distinct outputs).
21 Non-trivial deterministic algorithms do not satisfy differential privacy. Each input is mapped to a distinct output.
22 Then there exist two inputs that differ in one entry but are mapped to different outputs: for such an output O, Pr[A(D) = O] > 0 while Pr[A(D') = O] = 0, so the probability ratio is unbounded.
23 Output Randomization. The database adds noise to the true query answer such that: (1) each answer does not leak too much information about the database; (2) noisy answers are close to the original answers.
24 Laplace Mechanism [DMNS 06]. The researcher issues query q; the database computes the true answer q(D) and returns q(D) + η, where the noise η is drawn from the Laplace distribution Lap(S/ε).
25 Laplace Distribution. PDF: f(x) = (1/(2b)) · exp(−|x − u|/b), denoted Lap(b) when u = 0. Mean: u. Variance: 2b².
26 How much noise for privacy? [Dwork et al., TCC 2006] Sensitivity: consider a query q: D → R. S(q) is the smallest number such that for any neighboring tables D, D', |q(D) − q(D')| ≤ S(q). Theorem: if the sensitivity of the query is S(q), then the algorithm A(D) = q(D) + Lap(S(q)/ε) guarantees ε-differential privacy.
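The theorem above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name and the toy disease data are made up for the example.

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Release true_answer + Lap(sensitivity/epsilon) noise (epsilon-DP)."""
    rng = rng or np.random.default_rng()
    return true_answer + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# COUNT query: "how many records have Disease = Y?" has sensitivity 1,
# since adding or removing one record changes the count by at most 1.
data = ["Y", "Y", "N", "Y", "N", "N"]
true_count = sum(1 for v in data if v == "Y")
noisy_count = laplace_mechanism(true_count, sensitivity=1, epsilon=0.5)
```

A smaller ε forces a larger noise scale S(q)/ε, trading accuracy for privacy.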
27 Example: COUNT query. Number of people having the disease: for a Disease (Y/N) column with values Y, Y, N, Y, N, N, the true count is 3. Sensitivity = 1. Solution: release 3 + η, where η is drawn from Lap(1/ε) (mean 0, variance 2/ε²).
28 Example: SUM query. Suppose all values x are in [a, b] with 0 ≤ a ≤ b. Sensitivity = b.
29 Privacy of Laplace Mechanism. Consider neighboring databases D and D' and some output O.
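The ratio argument the slide sets up can be written out as follows; this is a sketch using the Laplace density with scale b = S(q)/ε:

\[
\frac{\Pr[A(D)=O]}{\Pr[A(D')=O]}
= \frac{\exp\bigl(-\epsilon\,|q(D)-O|/S(q)\bigr)}{\exp\bigl(-\epsilon\,|q(D')-O|/S(q)\bigr)}
\le \exp\!\Bigl(\frac{\epsilon\,|q(D)-q(D')|}{S(q)}\Bigr)
\le \exp(\epsilon),
\]

where the first inequality is the triangle inequality and the second uses |q(D) − q(D')| ≤ S(q).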
30 Utility of Laplace Mechanism. The Laplace mechanism works for any function that returns a real number. Error: E[(true answer − noisy answer)²] = Var(Lap(S(q)/ε)) = 2·S(q)²/ε². Error bound: the probability that the error exceeds (S(q)/ε)·ln(1/δ) is at most δ (Roth book, Theorem 3.8).
31 Outline Differential Privacy Definition Basic techniques Laplace mechanism Exponential mechanism Randomized Response Composition theorems
32 Exponential Mechanism. For functions that do not return a real number (e.g., "what is the most common nationality in this room?": Chinese/Indian/American), when perturbation leads to invalid outputs, or to ensure integrality/non-negativity of the output.
33 Exponential Mechanism [MT 07]. Consider some function f (deterministic or probabilistic) from inputs to outputs. How do we construct a differentially private version of f?
34 Exponential Mechanism Theorem. For a database D, output space R, and a utility score function u: D × R → ℝ, the algorithm A with Pr[A(D) = r] ∝ exp(ε·u(D, r)/(2Δu)) satisfies ε-differential privacy, where Δu is the sensitivity of the utility score: Δu = max over r and neighboring D, D' of |u(D, r) − u(D', r)|.
35 Example: Exponential Mechanism. D: nationalities of a set of people. f(D): the most frequent nationality in D. Utility score u(D, O) = #(D, O), the number of people in D with nationality O.
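The nationality example can be sketched as follows. This is an illustrative implementation, not from the slides; the sampling is done directly over unnormalized weights, and Δu = 1 because adding or removing one person changes any count by at most 1.

```python
import math
import random
from collections import Counter

def exponential_mechanism(data, outputs, utility, sensitivity, epsilon, rng=random):
    """Sample output r with probability proportional to exp(eps * u(D, r) / (2 * Δu))."""
    weights = [math.exp(epsilon * utility(data, r) / (2 * sensitivity)) for r in outputs]
    return rng.choices(outputs, weights=weights, k=1)[0]

# Most frequent nationality: u(D, r) = number of people in D with nationality r.
data = ["Chinese", "Indian", "Chinese", "American", "Chinese", "Indian"]
outputs = ["Chinese", "Indian", "American"]
utility = lambda d, r: Counter(d)[r]
winner = exponential_mechanism(data, outputs, utility, sensitivity=1, epsilon=2.0)
```

High-utility outputs ("Chinese" here) are exponentially more likely to be chosen, but every output has nonzero probability, which is what makes the mechanism private.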
36 Privacy of Exponential Mechanism. The exponential mechanism outputs an element r with probability Pr[A(D) = r] ∝ exp(ε·u(D, r)/(2Δu)), where Δu = max over r and neighboring D, D' of |u(D, r) − u(D', r)|. Approximately, Pr[A(D) = r] / Pr[A(D') = r] ≤ exp(ε) (exact proof with the normalization factor: Roth book, page 39).
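Ignoring the normalization factor, as the slide does, the ratio bound follows in one line:

\[
\frac{\Pr[A(D)=r]}{\Pr[A(D')=r]}
\approx \exp\!\Bigl(\frac{\epsilon\,(u(D,r)-u(D',r))}{2\Delta u}\Bigr)
\le \exp(\epsilon/2),
\]

since u(D, r) − u(D', r) ≤ Δu. The normalization constants of the two distributions differ by at most another factor of exp(ε/2), giving exp(ε) overall.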
38 Utility of Exponential Mechanism. Can give strong utility guarantees, since it discounts outcomes exponentially based on their utility score. It is highly unlikely that the returned element r has a utility score below max over r of u(D, r) by more than an additive factor of O((Δu/ε)·log|R|) (Roth book, Theorem 3.11).
39 Outline Differential Privacy Definition Basic techniques Laplace mechanism Exponential mechanism Randomized Response Composition theorems
40 Randomized Response (a.k.a. local randomization) [W 65]. Each respondent perturbs their own record: with probability p, report the true value; with probability 1 − p, report the flipped value. (Example: each entry of a Disease (Y/N) column is randomized independently before being reported.)
41 Differential Privacy Analysis. Consider two databases D, D' (of size M) that differ in the j-th value: D[j] ≠ D'[j], but D[i] = D'[i] for all i ≠ j. Consider some output O.
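The analysis the slide sets up can be completed as follows. Since the entries are randomized independently and D, D' agree everywhere except position j, every factor in the two output probabilities cancels except the j-th:

\[
\frac{\Pr[O \mid D]}{\Pr[O \mid D']}
= \frac{\Pr[O_j \mid D[j]]}{\Pr[O_j \mid D'[j]]}
\le \frac{p}{1-p},
\]

so for p > 1/2, randomized response satisfies ε-differential privacy with ε = ln(p/(1 − p)).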
42 Utility Analysis. Suppose n1 out of n people replied "yes" and the rest said "no". What is the best estimate for π, the true fraction of people with Disease = Y? π̂ = (n1/n − (1 − p)) / (2p − 1). Then E(π̂) = π, and Var(π̂) = π(1 − π)/n + p(1 − p)/(n(2p − 1)²): the first term is the sampling variance, the second the variance due to the coin flips.
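The randomization step and the unbiased estimator above can be sketched together. This is an illustrative simulation with made-up numbers (30% disease rate, p = 0.75); the function names are not from the slides.

```python
import random

def randomized_response(value, p, rng=random):
    """Report the true bit with probability p, the flipped bit otherwise."""
    return value if rng.random() < p else 1 - value

def estimate_fraction(reports, p):
    """Unbiased estimate of the true fraction of 1s: (n1/n - (1-p)) / (2p - 1)."""
    n1 = sum(reports)
    n = len(reports)
    return (n1 / n - (1 - p)) / (2 * p - 1)

# Simulated survey: 30% of 100000 people have the disease, truth-telling prob p = 0.75.
rng = random.Random(0)
truth = [1] * 30000 + [0] * 70000
reports = [randomized_response(v, 0.75, rng) for v in truth]
pi_hat = estimate_fraction(reports, 0.75)
```

With n = 100000 the estimate lands close to the true 0.3; the (2p − 1) in the denominator shows why utility degrades as p approaches 1/2 (stronger privacy).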
43 Laplace Mechanism vs Randomized Response: Privacy. Both provide the same ε-differential privacy guarantee. The Laplace mechanism assumes the data collector is trusted; randomized response does not require a trusted collector. It is also called a local algorithm, since each record is perturbed individually.
44 Laplace Mechanism vs Randomized Response: Utility. Suppose a database with N records, μN of which have Disease = Y. Query: number of rows with Disease = Y. Std dev of the Laplace mechanism answer: O(1/ε). Std dev of the randomized response answer: O(√N).
45 Outline Differential Privacy Basic Algorithms Laplace Exponential Mechanism Randomized Response Composition Theorems
46 Why Composition? Reasoning about the privacy of a complex algorithm is hard. Composition helps software design: if the building blocks are proven private, it becomes easy to reason about the privacy of a complex algorithm built entirely from those blocks.
47 A bound on the number of queries. To ensure utility, a statistical database must leak some information about each individual; we can only hope to bound the amount of disclosure. Hence there is a limit on the number of queries that can be answered.
48 Composition theorems. Sequential composition: Σ_i ε_i-differential privacy. Parallel composition: max_i(ε_i)-differential privacy.
49 Sequential Composition. If M_1, M_2, ..., M_k are algorithms that access a private database D such that each M_i satisfies ε_i-differential privacy, then the combination of their outputs satisfies ε-differential privacy with ε = ε_1 + ... + ε_k.
50 Parallel Composition. If M_1, M_2, ..., M_k are algorithms that access disjoint databases D_1, D_2, ..., D_k such that each M_i satisfies ε_i-differential privacy, then the combination of their outputs satisfies ε-differential privacy with ε = max{ε_1, ..., ε_k}.
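A noisy histogram is the classic illustration of parallel composition: the bins partition the data into disjoint pieces, so each bin's count can be perturbed with the full budget ε rather than ε/k. A minimal sketch (the function name and age bins are made up for the example):

```python
import numpy as np

def dp_histogram(values, bins, epsilon, rng=None):
    """Noisy histogram. The bins partition the data, so by parallel composition
    each count can use the full budget epsilon (each count has sensitivity 1)."""
    rng = rng or np.random.default_rng()
    counts = {b: 0 for b in bins}
    for v in values:
        counts[v] += 1
    return {b: c + rng.laplace(scale=1.0 / epsilon) for b, c in counts.items()}

ages = ["0-20", "21-40", "21-40", "41-60", "0-20", "21-40"]
hist = dp_histogram(ages, ["0-20", "21-40", "41-60"], epsilon=1.0)
```

Contrast with sequential composition: if the k counts overlapped (e.g. cumulative ranges), each would have to use ε/k, adding k times as much noise.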
51 Post-processing. If M_1 is an ε-differentially private algorithm that accesses a private database D, then outputting M_2(M_1(D)) also satisfies ε-differential privacy.
52 Summary. Differential privacy ensures an attacker cannot infer the presence or absence of a single record in the input from any output. Building blocks: Laplace mechanism, exponential mechanism, (local) randomized response. Composition rules help build complex algorithms from these building blocks.
53 Case Study: K-means Clustering
54 K-means. Partition a set of points x_1, x_2, ..., x_n into k clusters S_1, S_2, ..., S_k such that Σ_i Σ_{x ∈ S_i} ‖x − μ_i‖² is minimized, where μ_i is the mean of cluster S_i.
55 K-means Algorithm: initialize a set of k centers; repeat (assign each point to its nearest center; recompute the set of centers) until convergence; output the final set of k centers.
56 Differentially Private K-means [BDMN 05]. Suppose we fix the number of iterations to T. In each iteration (given a set of centers): 1. Assign each point to its nearest center to form clusters. 2. Noisily compute the size of each cluster. 3. Compute noisy sums of the points in each cluster.
57 Differentially Private K-means. Suppose we fix the number of iterations to T. Each iteration uses ε/T privacy budget, so the total privacy loss is ε. In each iteration (given a set of centers): 1. Assign each point to its nearest center to form clusters. 2. Noisily compute the size of each cluster. 3. Compute noisy sums of the points in each cluster.
58 Differentially Private K-means. Suppose we fix the number of iterations to T. Exercise: which of these steps expends privacy budget? In each iteration (given a set of centers): 1. Assign each point to its nearest center to form clusters: NO. 2. Noisily compute the size of each cluster: YES. 3. Compute noisy sums of the points in each cluster: YES.
60 Differentially Private K-means. Suppose we fix the number of iterations to T. What is the sensitivity of each step? In each iteration (given a set of centers): 1. Assign each point to its nearest center to form clusters. 2. Noisily compute the size of each cluster: sensitivity 1. 3. Compute noisy sums of the points in each cluster: sensitivity proportional to the domain size |dom|.
61 Differentially Private K-means. Suppose we fix the number of iterations to T. Each iteration uses ε/T privacy budget, so the total privacy loss is ε. In each iteration (given a set of centers): 1. Assign each point to its nearest center to form clusters. 2. Noisily compute the size of each cluster: add Lap(2T/ε) noise. 3. Compute noisy sums of the points in each cluster: add Lap(2T·|dom|/ε) noise.
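The scheme above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the [BDMN 05] code: points are assumed to lie in [0,1]^d, so each per-dimension sum has sensitivity 1 and the sum vector has L1 sensitivity d (playing the role of |dom|); the per-iteration budget ε/T is split evenly between the size and sum queries, matching the Lap(2T/ε) and Lap(2T·|dom|/ε) scales on the slide.

```python
import numpy as np

def dp_kmeans(points, k, T, epsilon, rng=None):
    """Differentially private k-means sketch for points in [0,1]^d.
    Each of T iterations spends epsilon/T, split between noisy cluster
    sizes (sensitivity 1) and noisy per-dimension sums (sensitivity d)."""
    rng = rng or np.random.default_rng()
    n, d = points.shape
    centers = rng.random((k, d))          # random initialization in [0,1]^d
    eps_step = epsilon / (2 * T)          # half the iteration budget per query
    for _ in range(T):
        # Assignment step: post-processing of already-released centers,
        # so it expends no privacy budget.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            members = points[labels == j]
            noisy_size = len(members) + rng.laplace(scale=1.0 / eps_step)
            noisy_sum = members.sum(axis=0) + rng.laplace(scale=d / eps_step, size=d)
            if noisy_size > 1:
                centers[j] = np.clip(noisy_sum / noisy_size, 0.0, 1.0)
    return centers

pts = np.random.default_rng(1).random((200, 2))
centers = dp_kmeans(pts, k=3, T=5, epsilon=1.0, rng=np.random.default_rng(2))
```

By sequential composition across the 2T noisy queries, the total privacy loss is 2T · ε/(2T) = ε.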
62 Results (T = 10 iterations, random initialization): original k-means algorithm vs Laplace k-means algorithm. Even though the centers are computed noisily, Laplace k-means can distinguish clusters that are far apart. Since we add noise to the sums with sensitivity proportional to |dom|, Laplace k-means cannot distinguish small clusters that are close together.
63 Privacy as Constrained Optimization. Three axes: privacy, error, and the queries that can be answered. E.g.: given a fixed set of queries and privacy budget ε, what is the minimum error that can be achieved? E.g.: given a task and privacy budget ε, how do we design a set of queries (functions) and allocate the budget so that the error is minimized?
64 References [W65] Warner, Randomized Response, JASA 1965. [DN03] Dinur, Nissim, Revealing Information While Preserving Privacy, PODS 2003. [BDMN05] Blum, Dwork, McSherry, Nissim, Practical Privacy: the SuLQ Framework, PODS 2005. [D06] Dwork, Differential Privacy, ICALP 2006. [DMNS06] Dwork, McSherry, Nissim, Smith, Calibrating Noise to Sensitivity in Private Data Analysis, TCC 2006. [MT07] McSherry, Talwar, Mechanism Design via Differential Privacy, FOCS 2007.
More informationSum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 15, 2015
Sum-Product Networks STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 15, 2015 Introduction Outline What is a Sum-Product Network? Inference Applications In more depth
More informationEvent Detection through Differential Pattern Mining in Internet of Things
Event Detection through Differential Pattern Mining in Internet of Things Authors: Md Zakirul Alam Bhuiyan and Jie Wu IEEE MASS 2016 The 13th IEEE International Conference on Mobile Ad hoc and Sensor Systems
More informationSecure Multiparty Computation Multi-round protocols
Secure Multiparty Computation Multi-round protocols Li Xiong CS573 Data Privacy and Security Secure multiparty computation General circuit based secure multiparty computation methods Specialized secure
More informationMatrix Mechanism and Data Dependent algorithms
Matrix Mechanism and Data Dependent algorithms CompSci 590.03 Instructor: Ashwin Machanavajjhala Lecture 9 : 590.03 Fall 16 1 Recap: Constrained Inference Lecture 9 : 590.03 Fall 16 2 Constrained Inference
More informationAuthenticating Pervasive Devices with Human Protocols
Authenticating Pervasive Devices with Human Protocols Presented by Xiaokun Mu Paper Authors: Ari Juels RSA Laboratories Stephen A. Weis Massachusetts Institute of Technology Authentication Problems It
More informationHW Graph Theory SOLUTIONS (hbovik) - Q
1, Diestel 9.3: An arithmetic progression is an increasing sequence of numbers of the form a, a+d, a+ d, a + 3d.... Van der Waerden s theorem says that no matter how we partition the natural numbers into
More informationCryptography CS 555. Topic 11: Encryption Modes and CCA Security. CS555 Spring 2012/Topic 11 1
Cryptography CS 555 Topic 11: Encryption Modes and CCA Security CS555 Spring 2012/Topic 11 1 Outline and Readings Outline Encryption modes CCA security Readings: Katz and Lindell: 3.6.4, 3.7 CS555 Spring
More informationInferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles
Supporting Information to Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles Ali Shojaie,#, Alexandra Jauhiainen 2,#, Michael Kallitsis 3,#, George
More informationAn iteration of the simplex method (a pivot )
Recap, and outline of Lecture 13 Previously Developed and justified all the steps in a typical iteration ( pivot ) of the Simplex Method (see next page). Today Simplex Method Initialization Start with
More informationPrivacy-Preserving. Introduction to. Data Publishing. Concepts and Techniques. Benjamin C. M. Fung, Ke Wang, Chapman & Hall/CRC. S.
Chapman & Hall/CRC Data Mining and Knowledge Discovery Series Introduction to Privacy-Preserving Data Publishing Concepts and Techniques Benjamin C M Fung, Ke Wang, Ada Wai-Chee Fu, and Philip S Yu CRC
More informationMechanism Design in Large Congestion Games
Mechanism Design in Large Congestion Games Ryan Rogers, Aaron Roth, Jonathan Ullman, and Steven Wu July 22, 2015 Routing Game l e (y) Routing Game Routing Game A routing game G is defined by Routing Game
More informationDifferentially Private Spatial Decompositions
Differentially Private Spatial Decompositions Graham Cormode Cecilia Procopiuc Divesh Srivastava AT&T Labs Research {graham, magda, divesh}@research.att.com Entong Shen Ting Yu North Carolina State University
More informationTopic 5 - Joint distributions and the CLT
Topic 5 - Joint distributions and the CLT Joint distributions Calculation of probabilities, mean and variance Expectations of functions based on joint distributions Central Limit Theorem Sampling distributions
More informationExpectation Maximization (EM) and Gaussian Mixture Models
Expectation Maximization (EM) and Gaussian Mixture Models Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 2 3 4 5 6 7 8 Unsupervised Learning Motivation
More informationHomework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in class hard-copy please)
Virginia Tech. Computer Science CS 5614 (Big) Data Management Systems Fall 2014, Prakash Homework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in
More informationSecure Multiparty Computation: Introduction. Ran Cohen (Tel Aviv University)
Secure Multiparty Computation: Introduction Ran Cohen (Tel Aviv University) Scenario 1: Private Dating Alice and Bob meet at a pub If both of them want to date together they will find out If Alice doesn
More informationGraph Theory and Optimization Approximation Algorithms
Graph Theory and Optimization Approximation Algorithms Nicolas Nisse Université Côte d Azur, Inria, CNRS, I3S, France October 2018 Thank you to F. Giroire for some of the slides N. Nisse Graph Theory and
More informationUnsupervised Learning : Clustering
Unsupervised Learning : Clustering Things to be Addressed Traditional Learning Models. Cluster Analysis K-means Clustering Algorithm Drawbacks of traditional clustering algorithms. Clustering as a complex
More informationTime Distortion Anonymization for the Publication of Mobility Data with High Utility
Time Distortion Anonymization for the Publication of Mobility Data with High Utility Vincent Primault, Sonia Ben Mokhtar, Cédric Lauradoux and Lionel Brunie Mobility data usefulness Real-time traffic,
More informationAnonymization Algorithms - Microaggregation and Clustering
Anonymization Algorithms - Microaggregation and Clustering Li Xiong CS573 Data Privacy and Anonymity Anonymization using Microaggregation or Clustering Practical Data-Oriented Microaggregation for Statistical
More informationClimbing the Kaggle Leaderboard by Exploiting the Log-Loss Oracle
Climbing the Kaggle Leaderboard by Exploiting the Log-Loss Oracle Jacob Whitehill jrwhitehill@wpi.edu Worcester Polytechnic Institute https://arxiv.org/abs/1707.01825 Machine learning competitions Data-mining
More informationImproving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique
Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique P.Nithya 1, V.Karpagam 2 PG Scholar, Department of Software Engineering, Sri Ramakrishna Engineering College,
More informationThe Applicability of the Perturbation Model-based Privacy Preserving Data Mining for Real-world Data
The Applicability of the Perturbation Model-based Privacy Preserving Data Mining for Real-world Data Li Liu, Murat Kantarcioglu and Bhavani Thuraisingham Computer Science Department University of Texas
More informationApproximation Algorithms for k-anonymity 1
Journal of Privacy Technology 20051120001 Approximation Algorithms for k-anonymity 1 Gagan Aggarwal 2 Google Inc., Mountain View, CA Tomas Feder 268 Waverley St., Palo Alto, CA Krishnaram Kenthapadi 3
More informationCPSC 536N: Randomized Algorithms Term 2. Lecture 10
CPSC 536N: Randomized Algorithms 011-1 Term Prof. Nick Harvey Lecture 10 University of British Columbia In the first lecture we discussed the Max Cut problem, which is NP-complete, and we presented a very
More informationCS573 Data Privacy and Security. Cryptographic Primitives and Secure Multiparty Computation. Li Xiong
CS573 Data Privacy and Security Cryptographic Primitives and Secure Multiparty Computation Li Xiong Outline Cryptographic primitives Symmetric Encryption Public Key Encryption Secure Multiparty Computation
More informationSecure Multiparty Computation
Secure Multiparty Computation Li Xiong CS573 Data Privacy and Security Outline Secure multiparty computation Problem and security definitions Basic cryptographic tools and general constructions Yao s Millionnare
More informationDifferential Privacy and Its Application
Differential Privacy and Its Application By Tianqing Zhu Submitted in fulfilment of the requirements for the degree of Doctor of Philosophy Deakin University March 2014 To my Son... iii Table of Contents
More informationRelationship Privacy: Output Perturbation for Queries with Joins
Relationship Privacy: Output Perturbation for Queries with Joins ABSTRACT Vibhor Rastogi CSE, University of Washington Seattle, WA, USA, 98105-2350 vibhor@cs.washington.edu Gerome Miklau CS, UMass Amherst
More informationCHAPTER 4: CLUSTER ANALYSIS
CHAPTER 4: CLUSTER ANALYSIS WHAT IS CLUSTER ANALYSIS? A cluster is a collection of data-objects similar to one another within the same group & dissimilar to the objects in other groups. Cluster analysis
More information