Pufferfish: A Semantic Approach to Customizable Privacy

1 Pufferfish: A Semantic Approach to Customizable Privacy
Ashwin Machanavajjhala (ashwin AT cs.duke.edu)
Collaborators: Daniel Kifer (Penn State), Bolin Ding (UIUC, Microsoft Research)
idash Privacy Workshop, 9/29/2012

2 Outline
Background
How to define privacy?
Correlations: A case for customizable privacy [Kifer-M SIGMOD 11]
Pufferfish Privacy Framework [Kifer-M PODS 12]
Case Study [Kifer-M PODS 12 & Ding-M 12]: Privately handling correlations due to exact release of counts
Open Challenges

3 Data Privacy Problem
[Figure: individuals 1, ..., N each contribute a record r_1, ..., r_N to a server holding the database D_B]
Utility:
Privacy: No breach about any individual

4 Data Privacy in the real world
Application | Data Collector | Third Party (adversary) | Private Information | Function (utility)
Medical | Hospital | Epidemiologist | Disease | Correlation between disease and geography
Genome analysis | Hospital | Statistician/Researcher | Genome | Correlation between genome and disease
Advertising | Google/FB/Y! | Advertiser | Clicks/Browsing | Number of clicks on an ad by age/region/gender
Social Recommendations | Facebook | Another user | Friend links / profile | Recommend other users or ads to users based on social network

5 Many definitions & several attacks
Definitions: K-Anonymity [Sweeney et al. IJUFKS 02]; L-diversity [Machanavajjhala et al. TKDD 07]; T-closeness [Li et al. ICDE 07]; E-Privacy [Machanavajjhala et al. VLDB 09]; Differential Privacy [Dwork et al. ICALP 06]
Attacks: Linkage attack; Background knowledge attack; Minimality/Reconstruction attack; de Finetti attack; Composition attack

6 Differential Privacy
Domain-independent privacy definition that is independent of the attacker.
Tolerates many attacks that other definitions are susceptible to.
  Avoids composition attacks.
Claimed to be tolerant against adversaries with arbitrary background knowledge.
Allows simple, efficient and useful privacy mechanisms.
  Used in a live US Census product [M et al ICDE 08]

7 No Free Lunch Theorem
It is not possible to guarantee any utility in addition to privacy without making assumptions about:
  the data generating distribution [Kifer-Machanavajjhala SIGMOD 11]
  the background knowledge available to an adversary [Dwork-Naor JPC 10]

9 Contingency tables
Each tuple takes one of k = 4 different values.
[Figure: table D and its contingency table, with one Count(·, ·) cell per value]

10 Contingency tables
Want to release counts privately.
[Figure: table D and its contingency table, with each Count(·, ·) cell shown as "?"]

11 Laplace Mechanism
Released counts for table D: 2 + Lap(1/ε), 2 + Lap(1/ε), 2 + Lap(1/ε), 8 + Lap(1/ε)
For the last cell: Mean: 8, Variance: $2/\epsilon^2$
Guarantees differential privacy.
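As a rough illustration of the mechanism on this slide, the Python sketch below adds independent Lap(1/ε) noise to each of the four counts. The function names, the fixed seed, and the choice to sample Laplace noise as the difference of two exponential draws are assumptions made for this example; they are not taken from the talk.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(counts, epsilon, rng):
    # Add independent Lap(1/epsilon) noise to every count, as on the slide.
    return [c + laplace_noise(1.0 / epsilon, rng) for c in counts]


rng = random.Random(0)       # fixed seed, only for reproducibility
true_counts = [2, 2, 2, 8]   # the four cell counts shown for table D
noisy_counts = laplace_mechanism(true_counts, epsilon=1.0, rng=rng)
print(noisy_counts)
```

Each released value is centered on the true count, and a single Lap(1/ε) draw has variance 2/ε², which matches the mean and variance quoted for the cell with count 8.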

12 Marginal counts
Noisy cell counts for table D: 2 + Lap(1/ε), 2 + Lap(1/ε), 2 + Lap(1/ε), 8 + Lap(1/ε)
Auxiliary marginals are published for the following reasons:
1. Legal: 2002 Supreme Court case Utah v. Evans
2. Contractual: Advertisers must know exact demographics at coarse granularities
Does the Laplace mechanism still guarantee privacy?

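13 Marginal counts
Noisy cell counts for table D: 2 + Lap(1/ε), 2 + Lap(1/ε), 2 + Lap(1/ε), 8 + Lap(1/ε)
Combining each noisy cell with the exact marginals gives several estimates of the same count:
Count(·, ·) = 8 + Lap(1/ε)
Count(·, ·) = 8 - Lap(1/ε)
Count(·, ·) = 8 - Lap(1/ε)
Count(·, ·) = 8 + Lap(1/ε)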

14 Marginal counts
Noisy cell counts for table D: 2 + Lap(1/ε), 2 + Lap(1/ε), 2 + Lap(1/ε), 8 + Lap(1/ε)
Averaging the k estimates of the count: Mean: 8, Variance: $2/(k\epsilon^2)$
An adversary can reconstruct the table with high precision for large k.
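The variance reduction can be checked with a small simulation. The sketch below assumes the counts 2, 2, 2, 8 are laid out as a 2x2 table whose row and column sums are released exactly; that layout, the function names, and the trial count are assumptions for illustration only. Each noisy cell, combined with the exact marginals, yields one estimate of the cell whose true count is 8, and the four estimates are averaged.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def estimates_of_last_cell(table, epsilon, rng):
    (a, b), (c, d) = table
    row2, col1, col2 = c + d, a + c, b + d      # exact marginals
    scale = 1.0 / epsilon
    na, nb, nc, nd = (x + laplace_noise(scale, rng) for x in (a, b, c, d))
    return [
        nd,                   # d + noise
        row2 - nc,            # d - noise
        col2 - nb,            # d - noise
        na + (row2 - col1),   # d + noise
    ]


def simulate(trials=20000, epsilon=1.0, seed=0):
    rng = random.Random(seed)
    table = ((2, 2), (2, 8))  # assumed layout of the four counts
    single, averaged = [], []
    for _ in range(trials):
        est = estimates_of_last_cell(table, epsilon, rng)
        single.append(est[0])
        averaged.append(sum(est) / len(est))
    return statistics.pvariance(single), statistics.pvariance(averaged)


var_single, var_averaged = simulate()
print(var_single, var_averaged)
```

With ε = 1 and k = 4 estimates, the first variance should land near 2/ε² = 2 and the second near 2/(kε²) = 0.5, so the exact marginals let an observer undo a large share of the noise.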

15 Customizing Privacy
For handling correlations
  Social networks [Kifer-M SIGMOD 11]
Utility driven applications
  Realistic vs worst-case adversaries [M et al PVLDB 09]
Dealing with aggregate secrets [Kifer-M PODS 12]
Qn: How to design principled privacy definitions customized to such scenarios?

17 Pufferfish Semantics
What is being kept secret?
Who are the adversaries?
How is information disclosure bounded?

18 Sensitive Information
Secrets: S is a set of potentially sensitive statements, e.g.
  "individual j's record is in the data, and j has Cancer"
  "individual j's record is not in the data"
Discriminative Pairs: Spairs is a subset of S x S, consisting of mutually exclusive pairs of secrets, e.g.
  ("Bob is in the table", "Bob is not in the table")
  ("Bob has cancer", "Bob has diabetes")

19 Adversaries
An adversary can be completely characterized by his/her prior information about the data.
  We do not assume computational limits.
Data Evolution Scenarios: the set of all probability distributions that could have generated the data.
  No assumptions: All probability distributions over data instances are possible.
  I.I.D.: The set of all f such that $P(\text{data} = \{r_1, r_2, \ldots, r_k\}) = f(r_1) \times f(r_2) \times \cdots \times f(r_k)$

20 Information Disclosure
Mechanism M satisfies ε-Pufferfish(S, Spairs, D) if for every $w \in \text{Range}(M)$, every $(s_i, s_j) \in \text{Spairs}$, and every $\theta \in D$ such that $P(s_i \mid \theta) \neq 0$ and $P(s_j \mid \theta) \neq 0$:
$P(M(\text{data}) = w \mid s_i, \theta) \le e^{\epsilon} \, P(M(\text{data}) = w \mid s_j, \theta)$
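For small finite cases the condition can be checked by brute force. The sketch below is only an illustration under several assumptions: the data is a single bit, each θ is a dictionary mapping data values to prior probabilities, secrets are predicates on the data, and the mechanism is randomized response that keeps the bit with probability 0.75. None of these names or choices come from the talk.

```python
import math


def output_dist_given_secret(mechanism, theta, secret):
    # P(M(data) = w | secret, theta); None when P(secret | theta) = 0.
    mass = sum(p for data, p in theta.items() if secret(data))
    if mass == 0:
        return None
    dist = {}
    for data, p in theta.items():
        if secret(data):
            for w, q in mechanism(data).items():
                dist[w] = dist.get(w, 0.0) + p * q / mass
    return dist


def satisfies_pufferfish(mechanism, spairs, thetas, epsilon, tol=1e-9):
    bound = math.exp(epsilon)
    for theta in thetas:
        for s_i, s_j in spairs:
            p_i = output_dist_given_secret(mechanism, theta, s_i)
            p_j = output_dist_given_secret(mechanism, theta, s_j)
            if p_i is None or p_j is None:
                continue  # the definition skips zero-probability secrets
            for w, prob_i in p_i.items():
                if prob_i > bound * p_j.get(w, 0.0) + tol:
                    return False
    return True


def randomized_response(bit, keep=0.75):
    # Output distribution of the mechanism on a one-bit dataset.
    return {bit: keep, 1 - bit: 1.0 - keep}


def is_one(bit):
    return bit == 1


def is_zero(bit):
    return bit == 0


spairs = [(is_one, is_zero), (is_zero, is_one)]
thetas = [{0: 0.5, 1: 0.5}, {0: 0.9, 1: 0.1}]
ok_at_ln3 = satisfies_pufferfish(randomized_response, spairs, thetas, math.log(3))
ok_at_one = satisfies_pufferfish(randomized_response, spairs, thetas, 1.0)
print(ok_at_ln3, ok_at_one)
```

The worst-case ratio for this mechanism is 0.75/0.25 = 3, so the check passes at ε = ln 3 and fails at ε = 1. Enumerating every θ is only feasible for toy examples like this one.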

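21 Pufferfish Semantic Guarantee
$\frac{P(s_i \mid M(\text{data}) = w, \theta) \,/\, P(s_j \mid M(\text{data}) = w, \theta)}{P(s_i \mid \theta) \,/\, P(s_j \mid \theta)} \le e^{\epsilon}$
Numerator: posterior odds of s_i vs s_j
Denominator: prior odds of s_i vs s_j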

22 Applying Pufferfish to Differential Privacy
Spairs:
  "record j is in the table" vs "record j is not in the table"
  "record j is in the table with value x" vs "record j is not in the table"
Data evolution:
  Probability that record j is in the table: $\pi_j$
  Probability distribution over values of record j: $f_j$
  For all $\theta = [f_1, f_2, f_3, \ldots, f_k, \pi_1, \pi_2, \ldots, \pi_k]$:
  $P[\text{Data} = D \mid \theta] = \prod_{r_j \notin D} (1 - \pi_j) \times \prod_{r_j \in D} \pi_j \, f_j(r_j)$

23 Applying Pufferfish to Differential Privacy
Spairs:
  "record j is in the table" vs "record j is not in the table"
  "record j is in the table with value x" vs "record j is not in the table"
Data evolution:
  For all $\theta = [f_1, f_2, f_3, \ldots, f_k, \pi_1, \pi_2, \ldots, \pi_k]$:
  $P[\text{Data} = D \mid \theta] = \prod_{r_j \notin D} (1 - \pi_j) \times \prod_{r_j \in D} \pi_j \, f_j(r_j)$
A mechanism M satisfies differential privacy if and only if it satisfies Pufferfish instantiated using Spairs and {θ} (as defined above).
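The data-evolution formula can be made concrete with a short sketch. Here a dataset is represented as a dictionary from individual index j to that individual's value (absent individuals are simply missing), `pis` holds the π_j, and `fs` holds each f_j as a dictionary of value probabilities. The representation, the example numbers, and the value names are assumptions for illustration.

```python
from itertools import product


def dataset_probability(dataset, pis, fs):
    # P[Data = D | theta] for theta = [f_1..f_k, pi_1..pi_k]:
    # (1 - pi_j) for each absent record, pi_j * f_j(r_j) for each present one.
    prob = 1.0
    for j, (pi_j, f_j) in enumerate(zip(pis, fs)):
        if j in dataset:
            prob *= pi_j * f_j.get(dataset[j], 0.0)
        else:
            prob *= 1.0 - pi_j
    return prob


pis = [0.8, 0.3]
fs = [
    {"cancer": 0.1, "healthy": 0.9},
    {"cancer": 0.5, "healthy": 0.5},
]

# Individual 0 present with value "cancer", individual 1 absent.
p_example = dataset_probability({0: "cancer"}, pis, fs)

# Sanity check: the probabilities of all possible datasets sum to one.
total = 0.0
for combo in product([None, "cancer", "healthy"], repeat=2):
    dataset = {j: v for j, v in enumerate(combo) if v is not None}
    total += dataset_probability(dataset, pis, fs)
print(p_example, total)
```

Because each individual's presence and value are drawn independently, this family of distributions encodes an adversary who assumes no correlation between records.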

24 Differential Privacy
Sensitive information: All pairs of secrets "individual j is in the table with value x" vs "individual j is not in the table"
Adversary: Adversaries who believe the data is generated using any probability distribution that is independent across individuals
Disclosure: The ratio of the prior and posterior odds of the adversary is bounded by $e^{\epsilon}$

25 Characterizing good privacy definition
Any privacy definition that can be phrased as follows composes with itself.
[Formula missing in source]
where I is the set of all tables.

27 Induced Neighbor Privacy
Differential Privacy: Neighboring tables differ in one value.
Induced neighbors: tables whose shortest path does not contain another table that is consistent with the marginals (see figure).
[Figure: example tables D illustrating neighbors and induced neighbors]

28 Induced Neighbor Privacy
For every pair of induced neighbors $D_1$, $D_2$ and for every output O, the adversary should not be able to distinguish between $D_1$ and $D_2$ based on O:
$\log \frac{\Pr[A(D_1) = O]}{\Pr[A(D_2) = O]} < \epsilon \quad (\epsilon > 0)$

29 Induced Neighbors Privacy and Pufferfish
Given a set of count constraints Q,
Spairs:
  "record j is in the table" vs "record j is not in the table"
  "record j is in the table with value x" vs "record j is not in the table"
Data evolution:
  For all $\theta = [f_1, f_2, f_3, \ldots, f_k, \pi_1, \pi_2, \ldots, \pi_k]$:
  $P[\text{Data} = D \mid \theta] \propto \prod_{r_j \notin D} (1 - \pi_j) \times \prod_{r_j \in D} \pi_j \, f_j(r_j)$, if D satisfies Q
  $P[\text{Data} = D \mid \theta] = 0$, if D does not satisfy Q
A mechanism M satisfies induced neighbors privacy if and only if it satisfies Pufferfish instantiated using Spairs and {θ}.
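The constrained data-evolution scenario amounts to conditioning the independent-records distribution on the constraints Q. The sketch below does this by enumeration for a toy universe; the tuple representation of datasets (with `None` marking an absent record), the example constraint, and all names are assumptions for illustration, and enumeration is not feasible for realistic sizes.

```python
from itertools import product


def unconstrained_weight(dataset, pis, fs):
    # Product form: (1 - pi_j) if record j is absent, pi_j * f_j(r_j) if present.
    weight = 1.0
    for j, (pi_j, f_j) in enumerate(zip(pis, fs)):
        if j in dataset:
            weight *= pi_j * f_j.get(dataset[j], 0.0)
        else:
            weight *= 1.0 - pi_j
    return weight


def constrained_distribution(pis, fs, satisfies_q):
    # Keep only datasets that satisfy Q, then renormalize (the "proportional to").
    options = [[None] + list(f_j) for f_j in fs]
    weights = {}
    for combo in product(*options):
        dataset = {j: v for j, v in enumerate(combo) if v is not None}
        if satisfies_q(dataset):
            weights[combo] = unconstrained_weight(dataset, pis, fs)
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no dataset with positive probability satisfies Q")
    return {combo: w / total for combo, w in weights.items()}


def exactly_one_cancer(dataset):
    # Hypothetical count constraint: the released count of "cancer" is 1.
    return sum(1 for v in dataset.values() if v == "cancer") == 1


pis = [0.8, 0.3]
fs = [
    {"cancer": 0.1, "healthy": 0.9},
    {"cancer": 0.5, "healthy": 0.5},
]
dist = constrained_distribution(pis, fs, exactly_one_cancer)
print(dist)
```

Datasets that violate Q do not appear in the result, which corresponds to probability zero. After conditioning, records are no longer independent: knowing that one individual has the value "cancer" rules it out for the other.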

30 Computing induced sensitivity
2D case:
  $q_{all}$: outputs all the counts in a 2-D contingency table.
  Marginals: row and column sums.
  The induced sensitivity of $q_{all}$ = min(2r, 2c).
General case:
  Deciding whether $S_{in}(q) > 0$ is NP-hard.
  Conjecture: Computing $S_{in}(q)$ is hard (and complete) for the second level of the polynomial hierarchy.

32 Open Questions
Privacy of social networks
  Adversaries may use social network evolution models to infer sensitive information about edges in a network [Kifer-M SIGMOD 11]
  Can correlations in a social network be generatively described?
Characterize necessary and sufficient conditions for resilience against classes of attacks.
  Sufficient conditions for composition [Kifer-M PODS 12]
Attack-resilient relaxations of differential privacy
  Existing privacy definitions do not provide sufficient utility for certain applications (e.g., social recommendations [M et al PVLDB 11])

33 Thank you
[M et al PVLDB 11] A. Machanavajjhala, A. Korolova, A. Das Sarma, Personalized Social Recommendations - Accurate or Private?, PVLDB 4(7), 2011
[Kifer-M SIGMOD 11] D. Kifer, A. Machanavajjhala, No Free Lunch in Data Privacy, SIGMOD 2011
[Kifer-M PODS 12] D. Kifer, A. Machanavajjhala, A Rigorous and Customizable Framework for Privacy, PODS 2012
[Ding-M 12] B. Ding, A. Machanavajjhala, Induced Neighbors Privacy (work in progress), 2012

Privacy Preserving Data Publishing: From k-anonymity to Differential Privacy. Xiaokui Xiao Nanyang Technological University

Privacy Preserving Data Publishing: From k-anonymity to Differential Privacy. Xiaokui Xiao Nanyang Technological University Privacy Preserving Data Publishing: From k-anonymity to Differential Privacy Xiaokui Xiao Nanyang Technological University Outline Privacy preserving data publishing: What and Why Examples of privacy attacks

More information

CS573 Data Privacy and Security. Differential Privacy. Li Xiong

CS573 Data Privacy and Security. Differential Privacy. Li Xiong CS573 Data Privacy and Security Differential Privacy Li Xiong Outline Differential Privacy Definition Basic techniques Composition theorems Statistical Data Privacy Non-interactive vs interactive Privacy

More information

Data Anonymization. Graham Cormode.

Data Anonymization. Graham Cormode. Data Anonymization Graham Cormode graham@research.att.com 1 Why Anonymize? For Data Sharing Give real(istic) data to others to study without compromising privacy of individuals in the data Allows third-parties

More information

Parallel Composition Revisited

Parallel Composition Revisited Parallel Composition Revisited Chris Clifton 23 October 2017 This is joint work with Keith Merrill and Shawn Merrill This work supported by the U.S. Census Bureau under Cooperative Agreement CB16ADR0160002

More information

0x1A Great Papers in Computer Security

0x1A Great Papers in Computer Security CS 380S 0x1A Great Papers in Computer Security Vitaly Shmatikov http://www.cs.utexas.edu/~shmat/courses/cs380s/ C. Dwork Differential Privacy (ICALP 2006 and many other papers) Basic Setting DB= x 1 x

More information

Data Security and Privacy. Topic 18: k-anonymity, l-diversity, and t-closeness

Data Security and Privacy. Topic 18: k-anonymity, l-diversity, and t-closeness Data Security and Privacy Topic 18: k-anonymity, l-diversity, and t-closeness 1 Optional Readings for This Lecture t-closeness: Privacy Beyond k-anonymity and l-diversity. Ninghui Li, Tiancheng Li, and

More information

Secured Medical Data Publication & Measure the Privacy Closeness Using Earth Mover Distance (EMD)

Secured Medical Data Publication & Measure the Privacy Closeness Using Earth Mover Distance (EMD) Vol.2, Issue.1, Jan-Feb 2012 pp-208-212 ISSN: 2249-6645 Secured Medical Data Publication & Measure the Privacy Closeness Using Earth Mover Distance (EMD) Krishna.V #, Santhana Lakshmi. S * # PG Student,

More information

Demonstration of Damson: Differential Privacy for Analysis of Large Data

Demonstration of Damson: Differential Privacy for Analysis of Large Data Demonstration of Damson: Differential Privacy for Analysis of Large Data Marianne Winslett 1,2, Yin Yang 1,2, Zhenjie Zhang 1 1 Advanced Digital Sciences Center, Singapore {yin.yang, zhenjie}@adsc.com.sg

More information

Auditing a Batch of SQL Queries

Auditing a Batch of SQL Queries Auditing a Batch of SQL Queries Rajeev Motwani, Shubha U. Nabar, Dilys Thomas Department of Computer Science, Stanford University Abstract. In this paper, we study the problem of auditing a batch of SQL

More information

Differential Privacy. CPSC 457/557, Fall 13 10/31/13 Hushiyang Liu

Differential Privacy. CPSC 457/557, Fall 13 10/31/13 Hushiyang Liu Differential Privacy CPSC 457/557, Fall 13 10/31/13 Hushiyang Liu Era of big data Motivation: Utility vs. Privacy large-size database automatized data analysis Utility "analyze and extract knowledge from

More information

Survey Result on Privacy Preserving Techniques in Data Publishing

Survey Result on Privacy Preserving Techniques in Data Publishing Survey Result on Privacy Preserving Techniques in Data Publishing S.Deebika PG Student, Computer Science and Engineering, Vivekananda College of Engineering for Women, Namakkal India A.Sathyapriya Assistant

More information

m-privacy for Collaborative Data Publishing

m-privacy for Collaborative Data Publishing m-privacy for Collaborative Data Publishing Slawomir Goryczka Emory University Email: sgorycz@emory.edu Li Xiong Emory University Email: lxiong@emory.edu Benjamin C. M. Fung Concordia University Email:

More information

Composition Attacks and Auxiliary Information in Data Privacy

Composition Attacks and Auxiliary Information in Data Privacy Composition Attacks and Auxiliary Information in Data Privacy Srivatsava Ranjit Ganta Pennsylvania State University University Park, PA 1682 ranjit@cse.psu.edu Shiva Prasad Kasiviswanathan Pennsylvania

More information

K ANONYMITY. Xiaoyong Zhou

K ANONYMITY. Xiaoyong Zhou K ANONYMITY LATANYA SWEENEY Xiaoyong Zhou DATA releasing: Privacy vs. Utility Society is experiencing exponential growth in the number and variety of data collections containing person specific specific

More information

Cryptography & Data Privacy Research in the NSRC

Cryptography & Data Privacy Research in the NSRC Cryptography & Data Privacy Research in the NSRC Adam Smith Assistant Professor Computer Science and Engineering 1 Cryptography & Data Privacy @ CSE NSRC SIIS Algorithms & Complexity Group Cryptography

More information

Crowd-Blending Privacy

Crowd-Blending Privacy Crowd-Blending Privacy Johannes Gehrke, Michael Hay, Edward Lui, and Rafael Pass Department of Computer Science, Cornell University {johannes,mhay,luied,rafael}@cs.cornell.edu Abstract. We introduce a

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Privacy preserving data mining Li Xiong Slides credits: Chris Clifton Agrawal and Srikant 4/3/2011 1 Privacy Preserving Data Mining Privacy concerns about personal data AOL

More information

Differential Privacy. Cynthia Dwork. Mamadou H. Diallo

Differential Privacy. Cynthia Dwork. Mamadou H. Diallo Differential Privacy Cynthia Dwork Mamadou H. Diallo 1 Focus Overview Privacy preservation in statistical databases Goal: to enable the user to learn properties of the population as a whole, while protecting

More information

Differential Privacy. Seminar: Robust Data Mining Techniques. Thomas Edlich. July 16, 2017

Differential Privacy. Seminar: Robust Data Mining Techniques. Thomas Edlich. July 16, 2017 Differential Privacy Seminar: Robust Techniques Thomas Edlich Technische Universität München Department of Informatics kdd.in.tum.de July 16, 2017 Outline 1. Introduction 2. Definition and Features of

More information

Security Control Methods for Statistical Database

Security Control Methods for Statistical Database Security Control Methods for Statistical Database Li Xiong CS573 Data Privacy and Security Statistical Database A statistical database is a database which provides statistics on subsets of records OLAP

More information

Privacy-Preserving Machine Learning

Privacy-Preserving Machine Learning Privacy-Preserving Machine Learning CS 760: Machine Learning Spring 2018 Mark Craven and David Page www.biostat.wisc.edu/~craven/cs760 1 Goals for the Lecture You should understand the following concepts:

More information

A Review of Privacy Preserving Data Publishing Technique

A Review of Privacy Preserving Data Publishing Technique A Review of Privacy Preserving Data Publishing Technique Abstract:- Amar Paul Singh School of CSE Bahra University Shimla Hills, India Ms. Dhanshri Parihar Asst. Prof (School of CSE) Bahra University Shimla

More information

Co-clustering for differentially private synthetic data generation

Co-clustering for differentially private synthetic data generation Co-clustering for differentially private synthetic data generation Tarek Benkhelif, Françoise Fessant, Fabrice Clérot and Guillaume Raschia January 23, 2018 Orange Labs & LS2N Journée thématique EGC &

More information

Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust

Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust Accumulative Privacy Preserving Data Mining Using Gaussian Noise Data Perturbation at Multi Level Trust G.Mareeswari 1, V.Anusuya 2 ME, Department of CSE, PSR Engineering College, Sivakasi, Tamilnadu,

More information

Microdata Publishing with Algorithmic Privacy Guarantees

Microdata Publishing with Algorithmic Privacy Guarantees Microdata Publishing with Algorithmic Privacy Guarantees Tiancheng Li and Ninghui Li Department of Computer Science, Purdue University 35 N. University Street West Lafayette, IN 4797-217 {li83,ninghui}@cs.purdue.edu

More information

An Ad Omnia Approach to Defining and Achiev ing Private Data Analysis

An Ad Omnia Approach to Defining and Achiev ing Private Data Analysis An Ad Omnia Approach to Defining and Achiev ing Private Data Analysis Mohammad Hammoud CS3525 Dept. of Computer Science University of Pittsburgh Introduction This paper addresses the problem of defining

More information

SIMPLE AND EFFECTIVE METHOD FOR SELECTING QUASI-IDENTIFIER

SIMPLE AND EFFECTIVE METHOD FOR SELECTING QUASI-IDENTIFIER 31 st July 216. Vol.89. No.2 25-216 JATIT & LLS. All rights reserved. SIMPLE AND EFFECTIVE METHOD FOR SELECTING QUASI-IDENTIFIER 1 AMANI MAHAGOUB OMER, 2 MOHD MURTADHA BIN MOHAMAD 1 Faculty of Computing,

More information

Differentially Private H-Tree

Differentially Private H-Tree GeoPrivacy: 2 nd Workshop on Privacy in Geographic Information Collection and Analysis Differentially Private H-Tree Hien To, Liyue Fan, Cyrus Shahabi Integrated Media System Center University of Southern

More information

Partition Based Perturbation for Privacy Preserving Distributed Data Mining

Partition Based Perturbation for Privacy Preserving Distributed Data Mining BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 17, No 2 Sofia 2017 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.1515/cait-2017-0015 Partition Based Perturbation

More information

CS573 Data Privacy and Security. Li Xiong

CS573 Data Privacy and Security. Li Xiong CS573 Data Privacy and Security Anonymizationmethods Li Xiong Today Clustering based anonymization(cont) Permutation based anonymization Other privacy principles Microaggregation/Clustering Two steps:

More information

The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks

The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks Knowl Inf Syst DOI 10.1007/s10115-010-0311-2 REGULAR PAPER The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks Bin Zhou Jian Pei Received:

More information

Implementation of Privacy Mechanism using Curve Fitting Method for Data Publishing in Health Care Domain

Implementation of Privacy Mechanism using Curve Fitting Method for Data Publishing in Health Care Domain Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 5, May 2014, pg.1105

More information

A Case Study: Privacy Preserving Release of Spa9o- temporal Density in Paris

A Case Study: Privacy Preserving Release of Spa9o- temporal Density in Paris A Case Study: Privacy Preserving Release of Spa9o- temporal Density in Paris Gergely Acs (INRIA) gergely.acs@inria.fr!! Claude Castelluccia (INRIA) claude.castelluccia@inria.fr! Outline 2! Dataset descrip9on!

More information

Semantic Security: Privacy Definitions Revisited

Semantic Security: Privacy Definitions Revisited 185 198 Semantic Security: Privacy Definitions Revisited Jinfei Liu, Li Xiong, Jun Luo Department of Mathematics & Computer Science, Emory University, Atlanta, USA Huawei Noah s Ark Laboratory, HongKong

More information

Achieving k-anonmity* Privacy Protection Using Generalization and Suppression

Achieving k-anonmity* Privacy Protection Using Generalization and Suppression UT DALLAS Erik Jonsson School of Engineering & Computer Science Achieving k-anonmity* Privacy Protection Using Generalization and Suppression Murat Kantarcioglu Based on Sweeney 2002 paper Releasing Private

More information

CS573 Data Privacy and Security. Differential Privacy tabular data and range queries. Li Xiong

CS573 Data Privacy and Security. Differential Privacy tabular data and range queries. Li Xiong CS573 Data Privacy and Security Differential Privacy tabular data and range queries Li Xiong Outline Tabular data and histogram/range queries Algorithms for low dimensional data Algorithms for high dimensional

More information

Privacy-preserving machine learning. Bo Liu, the HKUST March, 1st, 2015.

Privacy-preserving machine learning. Bo Liu, the HKUST March, 1st, 2015. Privacy-preserving machine learning Bo Liu, the HKUST March, 1st, 2015. 1 Some slides extracted from Wang Yuxiang, Differential Privacy: a short tutorial. Cynthia Dwork, The Promise of Differential Privacy.

More information

De#anonymizing,Social,Networks, and,inferring,private,attributes, Using,Knowledge,Graphs,

De#anonymizing,Social,Networks, and,inferring,private,attributes, Using,Knowledge,Graphs, De#anonymizing,Social,Networks, and,inferring,private,attributes, Using,Knowledge,Graphs, Jianwei Qian Illinois Tech Chunhong Zhang BUPT Xiang#Yang Li USTC,/Illinois Tech Linlin Chen Illinois Tech Outline

More information

Alpha Anonymization in Social Networks using the Lossy-Join Approach

Alpha Anonymization in Social Networks using the Lossy-Join Approach TRANSACTIONS ON DATA PRIVACY 11 (2018) 1 22 Alpha Anonymization in Social Networks using the Lossy-Join Kiran Baktha*, B K Tripathy** * Department of Electronics and Communication Engineering, VIT University,

More information

Distributed Data Anonymization with Hiding Sensitive Node Labels

Distributed Data Anonymization with Hiding Sensitive Node Labels Distributed Data Anonymization with Hiding Sensitive Node Labels C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan University,Trichy

More information

Cryptography & Data Privacy Research in the NSRC

Cryptography & Data Privacy Research in the NSRC Cryptography & Data Privacy Research in the NSRC Adam Smith Assistant Professor Computer Science and Engineering 1 Cryptography & Data Privacy @ CSE NSRC SIIS Algorithms & Complexity Group Cryptography

More information

On Private Supervised Distributed Learning: Weakly Labeled and without Entity Resolution

On Private Supervised Distributed Learning: Weakly Labeled and without Entity Resolution On Private Supervised Distributed Learning: Weakly Labeled and without Entity Resolution Stephen Hardy Wilko Henecka Richard Nock, Data61; The Australian National University & the University of Sydney,

More information

Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method

Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Automated Information Retrieval System Using Correlation Based Multi- Document Summarization Method Dr.K.P.Kaliyamurthie HOD, Department of CSE, Bharath University, Tamilnadu, India ABSTRACT: Automated

More information

Prajapati Het I., Patel Shivani M., Prof. Ketan J. Sarvakar IT Department, U. V. Patel college of Engineering Ganapat University, Gujarat

Prajapati Het I., Patel Shivani M., Prof. Ketan J. Sarvakar IT Department, U. V. Patel college of Engineering Ganapat University, Gujarat Security and Privacy with Perturbation Based Encryption Technique in Big Data Prajapati Het I., Patel Shivani M., Prof. Ketan J. Sarvakar IT Department, U. V. Patel college of Engineering Ganapat University,

More information

Ambiguity: Hide the Presence of Individuals and Their Privacy with Low Information Loss

Ambiguity: Hide the Presence of Individuals and Their Privacy with Low Information Loss : Hide the Presence of Individuals and Their Privacy with Low Information Loss Hui (Wendy) Wang Department of Computer Science Stevens Institute of Technology Hoboken, NJ, USA hwang@cs.stevens.edu Abstract

More information

Privacy Preserving Machine Learning: A Theoretically Sound App

Privacy Preserving Machine Learning: A Theoretically Sound App Privacy Preserving Machine Learning: A Theoretically Sound Approach Outline 1 2 3 4 5 6 Privacy Leakage Events AOL search data leak: New York Times journalist was able to identify users from the anonymous

More information

On Syntactic Anonymity and Differential Privacy

On Syntactic Anonymity and Differential Privacy 161 183 On Syntactic Anonymity and Differential Privacy Chris Clifton 1, Tamir Tassa 2 1 Department of Computer Science/CERIAS, Purdue University, West Lafayette, IN 47907-2107 USA. 2 The Department of

More information

Matrix Mechanism and Data Dependent algorithms

Matrix Mechanism and Data Dependent algorithms Matrix Mechanism and Data Dependent algorithms CompSci 590.03 Instructor: Ashwin Machanavajjhala Lecture 9 : 590.03 Fall 16 1 Recap: Constrained Inference Lecture 9 : 590.03 Fall 16 2 Constrained Inference

More information

Privacy Preserved Data Publishing Techniques for Tabular Data

Privacy Preserved Data Publishing Techniques for Tabular Data Privacy Preserved Data Publishing Techniques for Tabular Data Keerthy C. College of Engineering Trivandrum Sabitha S. College of Engineering Trivandrum ABSTRACT Almost all countries have imposed strict

More information

Data attribute security and privacy in Collaborative distributed database Publishing

Data attribute security and privacy in Collaborative distributed database Publishing International Journal of Engineering Inventions e-issn: 2278-7461, p-issn: 2319-6491 Volume 3, Issue 12 (July 2014) PP: 60-65 Data attribute security and privacy in Collaborative distributed database Publishing

More information

Emerging Measures in Preserving Privacy for Publishing The Data

Emerging Measures in Preserving Privacy for Publishing The Data Emerging Measures in Preserving Privacy for Publishing The Data K.SIVARAMAN 1 Assistant Professor, Dept. of Computer Science, BIST, Bharath University, Chennai -600073 1 ABSTRACT: The information in the

More information

Efficient Algorithms for Masking and Finding Quasi-Identifiers

Efficient Algorithms for Masking and Finding Quasi-Identifiers Efficient Algorithms for Masking and Finding Quasi-Identifiers Rajeev Motwani Ying Xu Abstract A quasi-identifier refers to a subset of attributes that can uniquely identify most tuples in a table. Incautious

More information

A Theory of Privacy and Utility for Data Sources

A Theory of Privacy and Utility for Data Sources A Theory of Privacy and Utility for Data Sources Lalitha Sankar Princeton University 7/26/2011 Lalitha Sankar (PU) Privacy and Utility 1 Electronic Data Repositories Technological leaps in information

More information

Data Anonymization - Generalization Algorithms

Data Anonymization - Generalization Algorithms Data Anonymization - Generalization Algorithms Li Xiong CS573 Data Privacy and Anonymity Generalization and Suppression Z2 = {410**} Z1 = {4107*. 4109*} Generalization Replace the value with a less specific

More information

Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique

Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique Improving Privacy And Data Utility For High- Dimensional Data By Using Anonymization Technique P.Nithya 1, V.Karpagam 2 PG Scholar, Department of Software Engineering, Sri Ramakrishna Engineering College,

More information

Detection of Conflicts and Inconsistencies in Taxonomy-based Authorization Policies

Detection of Conflicts and Inconsistencies in Taxonomy-based Authorization Policies Detection of Conflicts and Inconsistencies in Taxonomy-based Authorization Policies Apurva Mohan, Douglas M. Blough School of Electrical and Computer Engineering Georgia Institute of Technology Atlanta,

More information

SMMCOA: Maintaining Multiple Correlations between Overlapped Attributes Using Slicing Technique

SMMCOA: Maintaining Multiple Correlations between Overlapped Attributes Using Slicing Technique SMMCOA: Maintaining Multiple Correlations between Overlapped Attributes Using Slicing Technique Sumit Jain 1, Abhishek Raghuvanshi 1, Department of information Technology, MIT, Ujjain Abstract--Knowledge

More information

Private Context-aware Recommendation of Points of Interest: An Initial Investigation

Private Context-aware Recommendation of Points of Interest: An Initial Investigation Private Context-aware Recommation of Points of Interest: An Initial Investigation Daniele Riboni Claudio Bettini Universita degli Studi di Milano, D.I.Co., EveryWare Lab via Comelico, 39, I-20135 Milano,

More information

Comparison and Analysis of Anonymization Techniques for Preserving Privacy in Big Data

Comparison and Analysis of Anonymization Techniques for Preserving Privacy in Big Data Advances in Computational Sciences and Technology ISSN 0973-6107 Volume 10, Number 2 (2017) pp. 247-253 Research India Publications http://www.ripublication.com Comparison and Analysis of Anonymization

More information

Attacks on Privacy and definetti s Theorem

Daniel Kifer Penn State University ABSTRACT In this paper we present a method for reasoning about privacy using the concepts of exchangeability and de Finetti

ADDITIVE GAUSSIAN NOISE BASED DATA PERTURBATION IN MULTI-LEVEL TRUST PRIVACY PRESERVING DATA MINING

R.Kalaivani #1, S.Chidambaram #2 # Department of Information Technology, National Engineering College,

K-Anonymity and Other Cluster-Based Methods. Ge Ruan Oct. 11, 2007

Data Publishing and Data Privacy Society is experiencing exponential growth in the number and variety of data collections containing person-specific

Incognito: Efficient Full Domain K Anonymity

Kristen LeFevre David J. DeWitt Raghu Ramakrishnan University of Wisconsin Madison 1210 West Dayton St. Madison, WI 53706 Talk Prepared By Parul Halwe (05305002)

Secure Multiparty Computation: Introduction. Ran Cohen (Tel Aviv University)

Scenario 1: Private Dating Alice and Bob meet at a pub If both of them want to date together they will find out If Alice doesn

SFI Talk. Lev Reyzin Yahoo! Research (work done while at Yale University) talk based on 2 papers, both with Dana Angluin and James Aspnes

Reconstructing Evolutionary Trees via Distance Experiments Learning

Privacy Preserving in Knowledge Discovery and Data Publishing

B.Lakshmana Rao 1, G.V Konda Reddy 2, G.Yedukondalu 3 Abstract Knowledge Discovery is

(δ,l)-diversity: Privacy Preservation for Publication Numerical Sensitive Data

Mohammad-Reza Zare-Mirakabad Department of Computer Engineering School of Electrical and Computer Yazd University, Iran mzare@yazduni.ac.ir

Clustering with Diversity

Jian Li 1, Ke Yi 2, and Qin Zhang 2 1 University of Maryland, College Park, MD, USA. E-mail: lijian@cs.umd.edu 2 Hong Kong University of Science and Technology, Hong Kong, China

Classifying Online Social Network Users Through the Social Graph

Cristina Pérez-Solà and Jordi Herrera-Joancomartí Departament d'Enginyeria de la Informació i les Comunicacions Universitat Autònoma de

Graph Symmetry and Social Network Anonymization

Yanghua XIAO (肖仰华) School of computer science Fudan University For more information, please visit http://gdm.fudan.edu.cn Graph isomorphism determination

Dynamic and Historical Shortest-Path Distance Queries on Large Evolving Networks by Pruned Landmark Labeling

2014/04/09 @ WWW 14 Takuya Akiba (U Tokyo) Yoichi Iwata (U Tokyo) Yuichi Yoshida (NII & PFI)

Large Scale Graph Algorithms

A Guide to Web Research: Lecture 2 Yury Lifshits Steklov Institute of Mathematics at St.Petersburg Stuttgart, Spring 2007 Talk Objective To pose an abstract computational

An (Almost) Constant-Effort Solution-Verification. Proof-of-Work Protocol based on Merkle Trees. Fabien Coelho

Composed with LaTeX, revision 841 Proof of Work? economic measure to deter DOS attacks Crypto 92 Cynthia Dwork and Moni Naor Pricing via processing or

Spy vs. Spy: Rumor Source Obfuscation

Peter Kairouz University of Illinois at Urbana-Champaign Joint work with Giulia Fanti, Sewoong Oh, and Pramod Viswanath Some people have important, sensitive things

Distributed Data Mining with Differential Privacy

Ning Zhang, Ming Li, Wenjing Lou Department of Electrical and Computer Engineering, Worcester Polytechnic Institute, MA Email: {ning, mingli}@wpi.edu,

Differential Privacy

CPSC 426/526 Differential Privacy Ennan Zhai Computer Science Department Yale University Recall: Lec-11 In lec-11, we learned: - Cryptographic basics - Symmetric key cryptography - Public key cryptography

Traveling Salesman Problem (TSP) Input: undirected graph G=(V,E), c: E → R+ Goal: find a tour (Hamiltonian cycle) of minimum cost


Private & Anonymous Communication. Peter Kairouz ECE Department University of Illinois at Urbana-Champaign

Communication Bob Alice transfer of information from one point in space-time to the other Wireless

Privacy Preserving Data Sharing in Data Mining Environment

PH.D DISSERTATION BY SUN, XIAOXUN A DISSERTATION SUBMITTED TO THE UNIVERSITY OF SOUTHERN QUEENSLAND IN FULFILLMENT OF THE REQUIREMENTS FOR THE

Privacy in Statistical Databases

CSE 598D/STAT 598B Fall 2007 Lecture 2, 9/13/2007 Aleksandra Slavkovic Office hours: MW 3:30-4:30 Office: Thomas 412 Phone: x3-4918 Adam Smith Office hours: Mondays 3-5pm

Secure Multiparty Computation

CS573 Data Privacy and Security Secure Multiparty Computation Problem and security definitions Li Xiong Outline Cryptographic primitives Symmetric Encryption Public Key Encryption Secure Multiparty Computation

Preserving Data Mining through Data Perturbation

Mr. Swapnil Kadam, Prof. Navnath Pokale Abstract Data perturbation, a widely employed and accepted Privacy Preserving Data Mining (PPDM) approach, tacitly

Effective Keyword Search over (Semi)-Structured Big Data Mehdi Kargar

School of Computer Science Faculty of Science University of Windsor How Big is this Big Data? 40 Billion Instagram Photos 300 Hours

Maximizing the Spread of Influence through a Social Network. David Kempe, Jon Kleinberg and Eva Tardos

Group 9 Lauren Thomas, Ryan Lieblein, Joshua Hammock and Mary Hanvey Introduction In a social network,

Steps Towards Location Privacy

Subhasish Mazumdar New Mexico Institute of Mining & Technology Socorro, NM 87801, USA. DataSys 2018 Subhasish.Mazumdar@nmt.edu Census A census is vital

Bounded-Concurrent Secure Two-Party Computation Without Setup Assumptions

Yehuda Lindell IBM T.J.Watson Research 19 Skyline Drive, Hawthorne New York 10532, USA lindell@us.ibm.com ABSTRACT In this paper

Optimal k-anonymity with Flexible Generalization Schemes through Bottom-up Searching

Tiancheng Li Ninghui Li CERIAS and Department of Computer Science, Purdue University 250 N. University Street, West

Learning Network Graph of SIR Epidemic Cascades Using Minimal Hitting Set based Approach

Zhuozhao Li and Haiying Shen Dept. of Electrical and Computer Engineering Clemson University, SC, USA Kang Chen

Survey of Anonymity Techniques for Privacy Preserving

2009 International Symposium on Computing, Communication, and Control (ISCCC 2009) Proc. of CSIT vol.1 (2011) (2011) IACSIT Press, Singapore Luo Yongcheng

ADVANCES in NATURAL and APPLIED SCIENCES

ISSN: 1995-0772 Published BY AENSI Publication EISSN: 1998-1090 http://www.aensiweb.com/anas 2017 May 11(7): pages 585-591 Open Access Journal A Privacy Preserving

Privately Solving Linear Programs

Justin Hsu 1 Aaron Roth 1 Tim Roughgarden 2 Jonathan Ullman 3 1 University of Pennsylvania 2 Stanford University 3 Harvard University July 8th, 2014 A motivating example

Data Privacy in Big Data Applications. Sreagni Banerjee CS-846

Outline: Motivation; Goal and Approach; Introduction to Big Data Privacy; Privacy preserving methods in Big Data Application; Progress; Next

Keyword search in relational databases. By SO Tsz Yan Amanda & HON Ka Lam Ethan

1 Introduction Ubiquitous relational databases Need to know SQL and database structure Hard to define an object 2 Query representation

CLUSTER BASED ANONYMIZATION FOR PRIVACY PRESERVATION IN SOCIAL NETWORK DATA COMMUNITY

1 V.VIJEYA KAVERI, 2 Dr.V.MAHESWARI 1 Research Scholar, Sathyabama University, Chennai 2 Prof., Department of Master

val(y, I) α (9.0.2) α (9.0.3)

CS787: Advanced Algorithms Lecture 9: Approximation Algorithms In this lecture we will discuss some NP-complete optimization problems and give algorithms for solving them that produce a nearly optimal,

A generic and distributed privacy preserving classification method with a worst-case privacy guarantee

Distrib Parallel Databases (2014) 32:5-35 DOI 10.1007/s10619-013-7126-6 Madhushri Banerjee Zhiyuan

Differentially Private Multi-Dimensional Time Series Release for Traffic Monitoring

DBSec 13 Liyue Fan, Li Xiong, Vaidy Sunderam Department of Math & Computer Science Emory University 9/4/2013 DBSec'13:

Exploring re-identification risks in public domains

2012 Tenth Annual International Conference on Privacy, Security and Trust Aditi Ramachandran Georgetown University ar372@georgetown.edu Lisa Singh Georgetown

Injector: Mining Background Knowledge for Data Anonymization

Tiancheng Li, Ninghui Li Department of Computer Science, Purdue University 35 N. University Street, West Lafayette, IN 4797, USA {li83,ninghui}@cs.purdue.edu

Sanitization of call detail records via differentially-private Bloom filters

Mohammad Alaggan Helwan University Joint work with Sébastien Gambs (Université de Rennes 1 - Inria / IRISA), Stan Matwin and