Differential Privacy. Seminar: Robust Data Mining Techniques. Thomas Edlich. July 16, 2017

1 Differential Privacy Seminar: Robust Data Mining Techniques Thomas Edlich Technische Universität München Department of Informatics kdd.in.tum.de July 16, 2017

2 Outline 1. Introduction 2. Definition and Features of Differential Privacy 3. Techniques 4. Practical Issues and Limitations 5. Differential Privacy in Machine Learning & Data Mining

3 Introduction

5 Privacy Protection Anonymization: removal of identifying attributes such as names or social security numbers. Often considered enough to protect privacy. However: the Netflix Prize dataset [9] was de-anonymized through a linkage attack that matched anonymized Netflix ratings against public IMDb ratings, and medical records have been re-identified using publicly available voting records [12].

7 Privacy Protection What is privacy protection? "Nothing about an individual should be learnable from the database that cannot be learned without access to the database." [2] This guarantee has been proven to be impossible to achieve if the privacy mechanism is useful, the reason being auxiliary information. [3]

8 Definition and Features of Differential Privacy

9 Differential Privacy Differential Privacy is a mathematical definition of privacy which bounds the privacy risk for any participant in a database. It makes it possible to learn properties of a population while protecting the privacy of individuals.

11 Formalizing Differential Privacy [11]
ε-Differential Privacy: An algorithm $A_{priv}$ with $A_{priv}(D) \in T$ provides ε-differential privacy if
$$\Pr[A_{priv}(D) \in S] \leq e^{\varepsilon} \cdot \Pr[A_{priv}(D') \in S] \qquad (1)$$
for all $S \subseteq T$ and all datasets $D, D'$ differing in only a single entry.
(ε, δ)-Differential Privacy: An algorithm $A_{priv}$ with $A_{priv}(D) \in T$ provides (ε, δ)-differential privacy if
$$\Pr[A_{priv}(D) \in S] \leq e^{\varepsilon} \cdot \Pr[A_{priv}(D') \in S] + \delta \qquad (2)$$
for all $S \subseteq T$ and all datasets $D, D'$ differing in only a single entry.
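
To make Definition (1) concrete, here is a small numeric sketch (not from the slides; names and values are illustrative). It adds Laplace(1/ε) noise to a counting query, as introduced later on slide 21, runs the mechanism on two neighboring databases, and checks that the estimated probabilities of the events S = (−∞, t] never differ by more than a factor of e^ε.

```python
import numpy as np

eps = 0.5
rng = np.random.default_rng(0)

def noisy_count(true_count, n_samples):
    # counting query with Laplace(1/eps) noise; a count has l1-sensitivity 1
    return true_count + rng.laplace(scale=1.0 / eps, size=n_samples)

samples_d = noisy_count(100, 1_000_000)    # outputs of A_priv on database D  (count 100)
samples_dp = noisy_count(101, 1_000_000)   # outputs of A_priv on neighbor D' (count 101)

for t in [98, 100, 102, 104]:
    p = (samples_d <= t).mean()            # estimate of Pr[A_priv(D)  in S], S = (-inf, t]
    q = (samples_dp <= t).mean()           # estimate of Pr[A_priv(D') in S]
    print(f"t={t}: ratio={p / q:.3f}, bound e^eps={np.exp(eps):.3f}")
```

The binding direction here is p/q, since lowering the count shifts probability mass into every interval (−∞, t]; the symmetric ratio q/p stays below 1 and is therefore trivially within the bound.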

12 Resilience to Arbitrary Auxiliary Information [4] Differential Privacy provides plausible deniability to each participant, since the same outcome could have been produced from a dataset without him or her. The definition is independent of available side information. Furthermore, differential privacy holds regardless of what auxiliary information is available now or becomes available in the future.

13 Postprocessing Postprocessing [4] Let $A$ be an (ε, δ)-differentially private mechanism and $f$ an arbitrary mapping. Then the composition $f \circ A$ is (ε, δ)-differentially private.
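
As a minimal illustration of postprocessing (assuming a Laplace-noised count as on slide 21; the values below are made up), clamping and rounding the released value is just an arbitrary mapping f and therefore costs no additional privacy budget.

```python
import numpy as np

rng = np.random.default_rng()
eps = 0.5
noisy_count = 42 + rng.laplace(scale=1.0 / eps)     # an eps-differentially private release
published = max(0, int(round(float(noisy_count))))  # f: clamp to a non-negative integer
print(published)                                    # still eps-differentially private
```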

14 Composition Composition [11] Let $A^1_{priv}$ and $A^2_{priv}$ be algorithms with privacy guarantees $\varepsilon_1$ and $\varepsilon_2$. Then applying both algorithms to the same data has a privacy risk of at most $\varepsilon_1 + \varepsilon_2$.
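
In practice the composition theorem is used for privacy-budget accounting: every query charges its ε against a fixed total. The sketch below (not from the slides; class and method names are illustrative) shows the idea.

```python
class PrivacyBudget:
    """Tracks sequential composition: total loss is the sum of the epsilons spent."""

    def __init__(self, total_eps: float):
        self.total_eps = total_eps
        self.spent = 0.0

    def charge(self, eps: float) -> None:
        # refuse the query if it would push the cumulative loss over the budget
        if self.spent + eps > self.total_eps:
            raise RuntimeError("privacy budget exhausted")
        self.spent += eps

budget = PrivacyBudget(total_eps=1.0)
budget.charge(0.4)   # first eps_1-differentially private query
budget.charge(0.4)   # second query; composition gives eps_1 + eps_2 = 0.8
print(f"spent {budget.spent} of {budget.total_eps}")
```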

15 Techniques

16 Techniques Approaches: Input Perturbation, Output Perturbation, Algorithm Perturbation

17 Input Perturbation [11] Add noise directly to the database D. The perturbed dataset can then be published and guarantees differential privacy for any subsequent algorithm. Example: Randomized Response.

18 Input Perturbation [4] [13]
Randomized Response. Question: Have you ever committed a crime?
Randomization process: 1. Flip a coin. 2. If tails: answer truthfully. 3. If heads: flip the coin again; tails: say "no", heads: say "yes".
This gives plausible deniability to the individual, yet the true distribution can still be estimated. With $p$ the fraction of people who have committed a crime and $y$ the fraction of people who said "yes":
$$E(y) = 0.5\,p + 0.25\,p + 0.25\,(1 - p) = 0.5\,p + 0.25, \qquad p = 2\,y - 0.5$$
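
A minimal simulation of this randomized-response survey (the population and its true proportion are made up for illustration): each respondent follows the coin protocol, and the analyst recovers p from the observed fraction of "yes" answers via p ≈ 2y − 0.5.

```python
import random

def randomized_response(truth: bool) -> bool:
    # first coin: tails -> answer truthfully; heads -> answer according to a second coin
    if random.random() < 0.5:
        return truth
    return random.random() < 0.5

# simulated population in which 30% have committed a crime
true_answers = [random.random() < 0.3 for _ in range(100_000)]
reported = [randomized_response(t) for t in true_answers]

y = sum(reported) / len(reported)             # observed fraction of "yes" answers
print(f"estimated p = {2 * y - 0.5:.3f} (true p = 0.30)")
```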

19 Input Perturbation: Pros & Cons Pro: Results can be reproduced. Privacy does not depend on a specific algorithm. Contra: Determining the amount of noise needed, and therefore determining ε, is not trivial. Privacy guarantees might be worse than for algorithm-specific techniques.

20 Output Perturbation [4][7][11] Add noise to the results of $A_{nonpriv}$. Only publish the perturbed results. Destroy the original data.

21 Output Perturbation
l1-Sensitivity: the maximum difference of the function over all pairs of databases $D$ and $D'$ differing in a single record:
$$S(A) = \max_{D, D'} \lVert A(D) - A(D') \rVert_1 \qquad (3)$$
Laplace Mechanism: given an algorithm $A_{nonpriv}: \mathcal{D} \to \mathbb{R}^k$, the Laplace mechanism adds Laplacian noise to the result of $A_{nonpriv}$ [7]:
$$A_{priv}(x, \varepsilon) = A_{nonpriv}(x) + (Z_1, \ldots, Z_k) \qquad (4)$$
where the $Z_i$ are i.i.d. random variables with $Z_i \sim \mathrm{Lap}(S(A_{nonpriv}) / \varepsilon)$.
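
A minimal sketch of the Laplace mechanism of Equation (4), assuming the l1-sensitivity of the non-private query is known; the query and the data are illustrative, not taken from [7] or [11].

```python
import numpy as np

def laplace_mechanism(nonpriv_result, sensitivity, eps, rng=None):
    # adds i.i.d. Lap(S(A_nonpriv)/eps) noise to every coordinate of the result
    rng = rng or np.random.default_rng()
    result = np.asarray(nonpriv_result, dtype=float)
    return result + rng.laplace(scale=sensitivity / eps, size=result.shape)

# Example: "how many people in D are older than 40?" has l1-sensitivity 1,
# because changing a single record changes the count by at most 1.
ages = np.array([23, 45, 31, 52, 60, 18, 44, 39])
true_count = float((ages > 40).sum())
print(laplace_mechanism([true_count], sensitivity=1.0, eps=0.5))
```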

22 Output Perturbation: Pros & Cons Pro: Better privacy guarantees than input perturbation. Easier to add noise and to control the privacy level. Contra: Results cannot be reproduced.

23 Exponential Mechanism [7] [11] Sometimes adding noise to the input or output is not possible, because even a small perturbation can destroy the utility of the result. Example [4]: Items for sale: A: $1.00, B: $1.00, C: $3.01. Best price: $3.01, revenue: $3.01. Second-best price: $1.00, revenue: $3.00. Revenue for price $3.02: $0. Revenue for price $1.01: $1.01.

24 Exponential Mechanism [7]
Construct a utility measure over the dataset $D$ and all possible outputs $k$:
$$q(D, k) = u, \quad u \in \mathbb{R} \qquad (5)$$
The sensitivity of $q$ is:
$$S(q) = \max_{k, D, D'} \lvert q(D, k) - q(D', k) \rvert \qquad (6)$$
The exponential mechanism picks a random value for $k$ with distribution:
$$p(k) \propto \exp\!\left(\frac{\varepsilon \, q(D, k)}{2 S(q)}\right) \qquad (7)$$
Therefore the exponential mechanism is biased towards values of $k$ with higher utility.
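
A minimal sketch of the exponential mechanism of Equation (7), applied to the pricing example from slide 23; the candidate grid and the sensitivity bound are assumptions made for this illustration.

```python
import numpy as np

def exponential_mechanism(data, candidates, utility, sensitivity, eps, rng=None):
    # samples candidate k with probability proportional to exp(eps * q(D, k) / (2 * S(q)))
    rng = rng or np.random.default_rng()
    scores = np.array([utility(data, k) for k in candidates], dtype=float)
    # subtracting the max score rescales all weights equally and leaves the distribution unchanged
    weights = np.exp(eps * (scores - scores.max()) / (2.0 * sensitivity))
    probs = weights / weights.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

def revenue(bids, price):
    # q(D, k): revenue when every buyer whose bid is at least the price buys at that price
    return price * sum(1 for b in bids if b >= price)

bids = [1.00, 1.00, 3.01]                               # willingness-to-pay of A, B, C
prices = [round(0.01 * c, 2) for c in range(1, 402)]    # candidate prices $0.01 ... $4.01
# Changing one bid changes the revenue at a fixed price by at most that price,
# so the largest candidate price (4.01) is a valid, if loose, sensitivity bound here.
print(exponential_mechanism(bids, prices, revenue, sensitivity=4.01, eps=1.0))
```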

25 Exponential Mechanism: Pros & Cons Pro: Biased towards the more useful values. Contra: Computationally expensive. Requires modification of existing algorithms.

26 Practical Issues and Limitations

27 Practical Issues and Limitations Some solutions for achieving differential privacy still rely on technical assumptions about the data (e.g. discrete vs. continuous data). How to choose ε and δ? Rule of thumb: δ should be at most 1/|D|. The lower, the better; but what is low enough? There is a trade-off between privacy and utility. Figure: the Lap(1/ε) density for different values of ε.
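
The trade-off can be made tangible by looking at the spread of Lap(1/ε) noise for a sensitivity-1 query at different values of ε (a small illustrative sketch, not from the slides):

```python
import math

for eps in [0.01, 0.1, 0.5, 1.0, 2.0]:
    scale = 1.0 / eps                    # scale parameter b of Lap(1/eps)
    std = math.sqrt(2.0) * scale         # the standard deviation of Lap(b) is sqrt(2) * b
    print(f"eps={eps:<5} -> noise scale b={scale:<7.1f} std dev={std:.2f}")
```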

28 Differential Privacy in Machine Learning & Data Mining

29 Differential Privacy in Machine Learning & Data Mining How to achieve differential privacy in machine learning and data mining: Input and Output Perturbation: these techniques let us use standard machine learning algorithms while achieving differential privacy at the same time. Algorithm Perturbation: this requires the machine learning technique itself to be modified.

30 Differentially Private Graph Clustering [8] Motivation: Suppose there exists a graph consisting of the users of a social network and their relationships. We consider detecting a connection between two users a privacy violation. Publishing the original graph as a whole is a clear privacy breach. Even if only the community structure of the graph is revealed, an attacker might be able to infer the existence (or non-existence) of edges between nodes.

31 Differentially Private Graph Clustering [8] Graph Perturbation (PIG) [8]: Privacy-Integrated Graph clustering. Guarantees edge-differential privacy. Perturbs the input graph; the perturbed graph guarantees privacy independently of the clustering algorithm.

32 Differentially Private Graph Clustering [8]
Algorithm 1: Graph Perturbation Algorithm PIG
function PerturbGraph(adjacency matrix A, privacy parameter s)
    for all a_ij ∈ A with i < j do
        if preservation is chosen (with probability 1 − s) then
            continue
        else (the value of a_ij is randomized)
            if 0 is chosen (with probability 1/2) then
                a_ij = a_ji = 0
            else
                a_ij = a_ji = 1
            end if
        end if
    end for
    return A
end function
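
A runnable sketch of this perturbation step (assuming the undirected graph is given as a symmetric 0/1 numpy adjacency matrix; this is an illustrative re-implementation, not the authors' code):

```python
import numpy as np

def perturb_graph(adj: np.ndarray, s: float, rng=None) -> np.ndarray:
    """Keeps each entry a_ij (i < j) with probability 1 - s; otherwise redraws it
    uniformly from {0, 1}. Per slide 33 this yields edge-differential privacy."""
    rng = rng or np.random.default_rng()
    a = adj.copy()
    n = a.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < 1.0 - s:          # preservation is chosen
                continue
            a[i, j] = a[j, i] = 1 if rng.random() < 0.5 else 0
    return a

adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])
print(perturb_graph(adjacency, s=0.25))
```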

33 Differentially Private Graph Clustering [8] Evaluation: It can be proven that a PIG-perturbed graph guarantees edge-differential privacy for ε ≥ ln(2/s − 1). Figure: Clustering quality using the algorithm SCAN. [8]

34 Other DM/ML techniques using Differential Privacy [8] Community Detection (Input Perturbation, Algorithm Perturbation) [10], Deep Learning (Algorithm Perturbation) [1], Decision Trees (Output Perturbation) [6]

35 Questions?

36 References I
[1] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS '16, pages 308–318, New York, NY, USA, 2016. ACM.
[2] T. Dalenius. Towards a methodology for statistical disclosure control. Statistisk Tidskrift, 15:429–444, 1977.
[3] C. Dwork. Differential privacy. In 33rd International Colloquium on Automata, Languages and Programming, part II (ICALP 2006), volume 4052, pages 1–12, Venice, Italy, July 2006. Springer Verlag.

37 References II
[4] C. Dwork and A. Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211–407, 2014.
[5] A. Gupta, K. Ligett, F. McSherry, A. Roth, and K. Talwar. Differentially private approximation algorithms. Aug. 2009.
[6] G. Jagannathan, K. Pillaipakkamnatt, and R. N. Wright. A practical differentially private random decision tree classifier. In 2009 IEEE International Conference on Data Mining Workshops, Dec. 2009.
[7] Z. Ji, Z. C. Lipton, and C. Elkan. Differential privacy and machine learning: A survey and review. ArXiv e-prints, Dec. 2014.

38 References III
[8] Y. Mülle, C. Clifton, and K. Böhm. Privacy-integrated graph clustering through differential privacy. In EDBT/ICDT Workshops, 2015.
[9] A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In Proceedings of the 2008 IEEE Symposium on Security and Privacy, SP '08, pages 111–125, Washington, DC, USA, 2008. IEEE Computer Society.
[10] H. H. Nguyen, A. Imine, and M. Rusinowitch. Detecting communities under differential privacy. CoRR, 2016.
[11] A. D. Sarwate and K. Chaudhuri. Signal processing and machine learning with differential privacy: Algorithms and challenges for continuous data. IEEE Signal Processing Magazine, 30(5):86–94, Sept. 2013.

39 References IV
[12] L. Sweeney. k-Anonymity: A model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl.-Based Syst., 10(5):557–570, Oct. 2002.
[13] S. L. Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63–69, 1965.
