Differentially Private Algorithm and Auction Configuration

1 Differentially Private Algorithm and Auction Configuration. Ellen Vitercik, CMU Theory Lunch, October 11, 2017. Joint work with Nina Balcan and Travis Dick.

2 $1

3 Prices learned from purchase histories can reveal information about individual purchases. $100

4 We need a way to privately set prices and design auctions based on purchase histories.

5

6 Dear Lab Technician, Please update your default CPLEX parameters to

7 Suppose a parameter correlates with a certain disease. An attacker can infer information about medical records used to tune those parameters.

8 We need a way to privately configure algorithms.

9 Many works have shown that it is possible to invert a machine learning model to infer sensitive information about its training set.

10 By observing a series of recommendations from websites such as Amazon, an adversary can infer individual users' purchases. [Calandrino et al. 2011]

11 It is possible to extract images of training subjects from facial recognition models. The attacker has only the person's name and access to a facial recognition system that returns a class confidence score. [Fredrikson et al. 2015]

12 It is possible to invert a machine learning model to learn sensitive genomic information about individuals. [Fredrikson et al. 2014]

13 In response, computer scientists have developed private machine learning algorithms.

14 Existing private ML algorithms apply to optimization problems defined by well-studied, well-understood functions.

15 What if the objective is nonconvex and not differentiable?

16 We provide a private algorithm for maximizing data-dependent piecewise Lipschitz functions. Applications in: algorithm configuration and mechanism design. Pricing mechanism and auction design reduce to maximizing data-dependent piecewise Lipschitz functions.

17 Introduction. Setup. Overview: algorithm configuration and auction design. Differential privacy. Private pricing design. The algorithm. Privacy guarantees. Utility guarantees. Example: private multi-item pricing. Summary.

18 Learning-based algorithm configuration: Tune algorithm parameters to achieve high performance over a specific application domain. Led to breakthroughs in: combinatorial auctions [Leyton-Brown et al., 2009], scientific computing [Demmel et al., 2005], vehicle routing [Caseau et al., 1999], SAT [Xu et al., 2008].

19 Learning-based algorithm configuration. [Diagram: an application-specific distribution supplies sample problem instances to the algorithm designer, who chooses an algorithm.] How can I use the set of samples to find an algorithm that's best for my application domain?

20 We show that our private algorithm has strong utility guarantees for many algorithm configuration problems. Greedy algorithm configuration: Hard combinatorial problems show up in diverse domains where privacy preservation is crucial.

21 Greedy algorithm configuration: These are often solved by greedy algorithms where elements are iteratively added to a solution set according to a heuristic. E.g., in knapsack: $\frac{\text{value of item } i}{\text{size of item } i}$.

22 Greedy algorithm configuration: Gupta and Roughgarden [2017] proposed an infinite family of greedy heuristics for the knapsack and max weight independent set problems. E.g., in knapsack: $\frac{\text{value of item } i}{(\text{size of item } i)^{\rho}}$.

23 Greedy algorithm configuration: Our private algorithm uses sample problem instances to find a nearly optimal greedy heuristic for the specific application domain. No sensitive information about the training set is revealed.

24 We show that our private algorithm has strong utility guarantees for many algorithm configuration problems. Integer quadratic programming algorithm configuration: IQPs are used in many applications where privacy preservation is essential, such as financial portfolio optimization. [Figure labels: $x_i$, $x_j$.]

25 Integer quadratic programming algorithm configuration: IQPs are often approximated by solving a semi-definite program and rounding the vectors to integer values. There are many different rounding schemes with varying quality.

26 Integer quadratic programming algorithm configuration: Our private algorithm uses sample IQP instances to find a nearly optimal rounding scheme for the specific application domain. No sensitive information about the training set is revealed.

27 Learning-based mechanism design: Use information about past consumers to design mechanisms that extract high revenue from future consumers. Employed throughout industry. Garnered significant attention in TCS. [Elkind, 2007, Cole and Roughgarden, 2014, Huang et al., 2015, Medina and Mohri, 2014, Morgenstern and Roughgarden, 2015, Devanur et al., 2016, etc.]

29 An algorithm, given an input dataset D, is differentially private if the following holds: The output reveals (almost) nothing more about a record in D than the output would have revealed if the record weren't contained in D.

30, 31 [Diagram: a dataset with the records Alice, Bob, Claire, and David is given as input to the algorithm.]

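32 An algorithm $A$ is $(\varepsilon, \delta)$-differentially private if for all pairs of neighboring datasets $D$, $D'$ and all sets $O$ of outputs, $\Pr[A(D) \in O] \le e^{\varepsilon} \Pr[A(D') \in O] + \delta$. (Note that $e^{\varepsilon} \approx 1 + \varepsilon$.)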
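To make the inequality concrete, here is a minimal sketch that checks it for one pair of neighboring inputs when the output space is finite. The mechanism used, randomized response on a single bit, is an illustrative assumption and is not part of the talk; the function names are hypothetical.

```python
import math

def dp_violation(p, q, eps):
    """Smallest delta for which P[A(D) in O] <= e^eps * P[A(D') in O] + delta
    holds for every output set O, given finite output distributions p and q."""
    # The worst-case set O collects every output where p exceeds e^eps * q.
    return sum(max(0.0, pi - math.exp(eps) * qi) for pi, qi in zip(p, q))

def is_dp_pair(p, q, eps, delta, tol=1e-12):
    """Check the (eps, delta) inequality in both directions for one neighboring pair."""
    return (dp_violation(p, q, eps) <= delta + tol
            and dp_violation(q, p, eps) <= delta + tol)

def randomized_response(bit, eps):
    """Output distribution [P(report 0), P(report 1)] when the true bit is
    reported with probability e^eps / (1 + e^eps)."""
    keep = math.exp(eps) / (1.0 + math.exp(eps))
    return [keep, 1.0 - keep] if bit == 0 else [1.0 - keep, keep]

eps = 0.5
p = randomized_response(0, eps)   # output distribution on input bit 0
q = randomized_response(1, eps)   # output distribution on the neighboring input
print(is_dp_pair(p, q, eps, delta=0.0))       # the inequality holds at eps
print(is_dp_pair(p, q, eps / 2, delta=0.0))   # it fails at the smaller eps / 2
```

A full proof of differential privacy has to cover every neighboring pair; this check only covers the single pair it is given.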

34 Single-item pricing problem: One good for sale

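35 Single-item pricing problem. Distribution over buyers: $b \sim D$. A buyer's value for the good is denoted $\mathrm{value}(b)$, and $\mathrm{Revenue}(b, \rho) = \rho \cdot \mathbb{1}[\mathrm{value}(b) \ge \rho]$. The pricing algorithm receives a set $S = \{b_1, \dots, b_N\} \sim D^N$. Goal: maximize $\frac{1}{N}\left(\mathrm{Revenue}(b_1, \rho) + \dots + \mathrm{Revenue}(b_N, \rho)\right)$. Learning theory tells us that this value of $\rho$ approximately maximizes $\mathbb{E}_{b \sim D}[\mathrm{Revenue}(b, \rho)]$.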
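The objective above can be sketched in a few lines. This is a non-private illustration only; the function names are hypothetical, and the sample values 3 and 4 are the ones used in the talk's running example.

```python
def revenue(value, price):
    """Revenue from one buyer: the price if the buyer's value covers it, else 0."""
    return price if value >= price else 0.0

def average_revenue(values, price):
    """Average revenue over the sampled buyers' values at a given price."""
    return sum(revenue(v, price) for v in values) / len(values)

def best_empirical_price(values):
    """Non-private maximizer. Revenue increases with the price until a buyer
    drops out, so some sampled value is always optimal."""
    return max(values, key=lambda p: average_revenue(values, p))

sample = [3.0, 4.0]
print(average_revenue(sample, 3.0))   # 3.0: both buyers purchase
print(average_revenue(sample, 4.0))   # 2.0: only the second buyer purchases
print(best_empirical_price(sample))   # 3.0
```

Publishing the exact maximizer reveals a buyer's value directly, which is the privacy problem the rest of the talk addresses.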

36 [Plot: average revenue versus price for a sample $S$ of one buyer with value 3.]

37, 38 [Plot: average revenue versus price for a sample $S$ of two buyers with values 3 and 4.]

39 We want to write an algorithm that gets average revenue as close to $3 as possible while preserving differential privacy. [Plot: average revenue versus price.]

40 Notation for the same plot: the average revenue is the average utility $U_S(\rho)$, and the price is the parameter $\rho$.

41 General problem: Given a piecewise utility function $U_S(\rho)$, privately find a parameter $\rho$ that approximately maximizes $U_S(\rho)$. [Diagram: the sample $S$ determines $U_S$, which the algorithm takes as input to output $\rho$.]

42 [Diagram: the same pipeline, shown for the sample $S$ and for a second, smaller sample with its own utility function.]

44 Ideally, we want to use the exponential mechanism [McSherry and Talwar 2007]: Pick a parameter vector $\rho$ with probability proportional to $\exp(c \, \varepsilon \, U_S(\rho))$.
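As a rough illustration, the exponential mechanism is easy to run when the candidate set is finite. The sketch below assumes a discretized grid of prices (the talk works with continuous parameter spaces) and takes $c = 1/(2\Delta)$, the standard choice, where $\Delta$ is the sensitivity of the utility. The sample, the bound `H`, and the grid are hypothetical.

```python
import math
import random

def exponential_mechanism(candidates, utility, eps, sensitivity, rng=random):
    """Sample a candidate with probability proportional to
    exp(eps * utility(candidate) / (2 * sensitivity))."""
    scores = [eps * utility(c) / (2.0 * sensitivity) for c in candidates]
    top = max(scores)  # subtract the max before exponentiating for numerical stability
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]

def average_revenue(values, price):
    """Average revenue: each buyer whose value covers the price pays it."""
    return sum(price for v in values if v >= price) / len(values)

values = [3.0, 4.0] * 50                 # hypothetical sample of 100 buyers
H = 4.0                                  # assumed upper bound on any buyer's value
grid = [i / 10 for i in range(1, 41)]    # candidate prices 0.1, 0.2, ..., 4.0

# Replacing one buyer changes the average revenue by at most H / N.
price = exponential_mechanism(
    grid, lambda p: average_revenue(values, p),
    eps=1.0, sensitivity=H / len(values), rng=random.Random(0))
print(price)
```

The grid sidesteps the sampling difficulty discussed next, at the cost of a discretization error that depends on how quickly the utility changes between grid points.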

45 Sampling from $\exp(c \, \varepsilon \, U_S(\rho))$ is easy when we can calculate its antiderivative and the parameter space is one-dimensional. [Plot: average revenue versus price.]

46 What about when the parameter space is multi-dimensional?

47 Generally, sampling from a multi-dimensional distribution is hard.

48 Existing efficient sampling techniques work for log-concave distributions. [Lovász and Vempala 2006, 2007, Bassily et al. 2014] When $U_S(\rho)$ is piecewise concave, $\exp(c \, \varepsilon \, U_S(\rho))$ is log-concave on each piece.

49 Theorem (part 1). Suppose $U_S(\rho)$ is piecewise $L$-Lipschitz and concave over convex regions in $\mathbb{R}^d$. There is an $(\varepsilon, \delta)$-DP algorithm that approximately maximizes $U_S(\rho)$. Its running time is polynomial in $d$, $L$, $\varepsilon^{-1}$, $\log \delta^{-1}$, and $|S|$.

50 Step 1: Estimate the probability mass that the exponential mechanism places on each piecewise portion of the domain. This requires an integral estimation algorithm. [Lovász and Vempala 07]

51 Step 2: Sample a portion of the domain according to that probability.

52 Step 3: Sample a point in that portion of the domain with probability approximately according to the exponential mechanism, $\exp(c \, \varepsilon \, U_S(\rho))$. This requires approximate sampling. [Bassily et al. 2014]
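The three steps can be sketched in one dimension, where every integral has a closed form. This is a simplified illustration under stated assumptions, not the algorithm from the talk: it uses single-item average revenue as the utility, takes $c = 1/(2\Delta)$ with sensitivity $\Delta = H/N$, assumes all values lie in $[0, H]$, and replaces the cited approximate integration and sampling routines with exact formulas. The function name is hypothetical.

```python
import math
import random

def sample_price_1d(values, H, eps, rng):
    """Sample rho in [0, H] with density proportional to exp(c * eps * U_S(rho)),
    where U_S is average revenue and c = N / (2 * H)."""
    cuts = sorted(set([0.0, H] + [v for v in values if 0.0 < v < H]))
    pieces = []
    for a, b in zip(cuts, cuts[1:]):
        buyers = sum(1 for v in values if v >= b)  # buyers who purchase at any price in (a, b]
        slope = eps * buyers / (2.0 * H)           # the exponent equals slope * rho on this piece
        # Step 1: exact mass of the piece from the antiderivative of exp(slope * rho).
        if slope > 0:
            mass = (math.exp(slope * b) - math.exp(slope * a)) / slope
        else:
            mass = b - a
        pieces.append((a, b, slope, mass))
    # Step 2: pick a piece with probability proportional to its mass.
    a, b, slope, _ = rng.choices(pieces, weights=[p[3] for p in pieces], k=1)[0]
    # Step 3: invert the cumulative distribution function inside the chosen piece.
    u = rng.random()
    if slope == 0:
        return a + u * (b - a)
    lo, hi = math.exp(slope * a), math.exp(slope * b)
    return math.log(lo + u * (hi - lo)) / slope

values = [3.0, 4.0] * 50   # hypothetical sample of 100 buyers
print(sample_price_1d(values, H=4.0, eps=1.0, rng=random.Random(0)))
```

The exponentials are computed directly, so this sketch overflows for large `eps * N`; a careful implementation would work in log space.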

53 Proof sketch: Show that the output distribution is almost $\exp(c \, \varepsilon \, U_S(\rho))$. Lemma [Bassily et al. 2014]: If an algorithm $A$'s output distribution is $\tilde{\varepsilon}$-close to that of an $(\varepsilon, 0)$-differentially private algorithm for every input dataset $S$, then $A$ is $(2\tilde{\varepsilon} + \varepsilon, 0)$-differentially private. (Close under the metric $D(\mu_f, \mu_g) = \sup_x \left|\log \frac{f(x)}{g(x)}\right|$.) Since the exponential mechanism is $(\varepsilon, 0)$-differentially private, our algorithm is $(O(\varepsilon), \delta)$-differentially private.
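The lemma follows from a three-factor chain, sketched here under the assumption that the distributions have densities. The symbols $f_S$ (density of $A$'s output on $S$) and $g_S$ (density of the private algorithm's output on $S$) are introduced only for this illustration.

```latex
% For neighboring datasets S, S' and any output x:
\frac{f_S(x)}{f_{S'}(x)}
  = \frac{f_S(x)}{g_S(x)} \cdot \frac{g_S(x)}{g_{S'}(x)} \cdot \frac{g_{S'}(x)}{f_{S'}(x)}
  \le e^{\tilde{\varepsilon}} \cdot e^{\varepsilon} \cdot e^{\tilde{\varepsilon}}
  = e^{2\tilde{\varepsilon} + \varepsilon}.
% The outer factors use closeness on S and on S'; the middle factor uses
% (\varepsilon, 0)-differential privacy. Integrating over any output set O:
\Pr[A(S) \in O] \le e^{2\tilde{\varepsilon} + \varepsilon} \, \Pr[A(S') \in O].
```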

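57 Theorem (part 2). Suppose $U_S(\rho)$ is $(w, k)$-dispersed and piecewise $L$-Lipschitz, and the parameter space is contained in a ball of radius $R$. Let $\hat{\rho} \in \mathbb{R}^d$ be the parameter vector output by our $(\varepsilon, \delta)$-DP algorithm. With probability at least $1 - \gamma$, $U_S(\hat{\rho}) \ge \max_{\rho} U_S(\rho) - O\left(\frac{H}{|S| \varepsilon}\left(d \log \frac{R}{w} + \log \frac{1}{\gamma}\right) + Lw + \frac{Hk}{|S|}\right)$. ($U_S(\rho)$ is an average of functions with range in $[0, H]$.)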

58 Every private algorithm needs to add some noise to the parameter it outputs.

59 Therefore, nearby parameters should have similar utilities.

60 We often guarantee that nearby parameters have similar utilities by assuming the utility function is nice.

61, 62 What does it mean for a piecewise function to be nice? [Plot: $U_S(\rho)$ versus $\rho$.]

63 Let $\mathcal{P} = \{P_1, \dots, P_t\}$ be a partition of the parameter space such that $U_S(\rho)$ is $L$-Lipschitz within each $P_i$. $U_S$ is $(w, k)$-dispersed if for any ball $B$ of radius $w$, the number of sets in $\mathcal{P}$ that intersect $B$ is at most $k$.

64 [Plot: $U_S(\rho)$ versus $\rho$, with the pieces $P_1, \dots, P_7$ marked along the $\rho$-axis.]

65 [Plot: the same function with a ball of radius 1 marked; the function is (1, 3)-dispersed.]
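In one dimension the definition can be checked directly, since a ball of radius $w$ is an interval of length $2w$. The sketch below is an illustration with hypothetical boundary points, not the ones from the slides. It counts a boundary that lies exactly on the edge of the ball as hit, so the count is conservative, and it ignores clipping at the ends of the parameter space.

```python
def max_pieces_hit(boundaries, w):
    """Largest number of pieces of a 1-D partition that a ball (interval) of
    radius w can intersect, given the boundary points between pieces."""
    pts = sorted(boundaries)
    best, left = 0, 0
    for right, b in enumerate(pts):
        # Shrink the window until its boundaries fit in an interval of length 2w.
        while b - pts[left] > 2 * w:
            left += 1
        best = max(best, right - left + 1)
    return best + 1  # an interval containing j boundaries touches j + 1 pieces

def is_dispersed(boundaries, w, k):
    """True if every ball of radius w intersects at most k pieces."""
    return max_pieces_hit(boundaries, w) <= k

boundaries = [1, 2, 4, 7, 8, 9]              # hypothetical: 6 boundaries, 7 pieces
print(max_pieces_hit(boundaries, 0.5))       # 3
print(is_dispersed(boundaries, 1, 4))        # True
print(is_dispersed(boundaries, 1, 3))        # False
```

Growing $w$ can only increase the number of pieces a ball touches, which is the trade-off between the $Lw$ and $Hk/|S|$ terms in the utility bound.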

68 [Plot: $U_S(\rho)$ versus $\rho$ with a ball of radius 2 marked; the function is (2, 4)-dispersed, meaning any ball of radius 2 intersects at most 4 sets of the partition.]

71 Multi-item pricing problem: Multiple goods for sale

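72 Multi-item pricing problem. Distribution over buyers: $b \sim D$. A buyer's values for the two items are denoted $\mathrm{value}(b, 1)$ and $\mathrm{value}(b, 2)$, and $\mathrm{Revenue}(b, \rho) = \rho_1 \cdot \mathbb{1}[\mathrm{value}(b, 1) \ge \rho_1] + \rho_2 \cdot \mathbb{1}[\mathrm{value}(b, 2) \ge \rho_2]$. The pricing algorithm receives a set $S = \{b_1, \dots, b_N\} \sim D^N$. Goal: maximize $\frac{1}{N}\left(\mathrm{Revenue}(b_1, \rho) + \dots + \mathrm{Revenue}(b_N, \rho)\right)$.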
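The multi-item objective extends the single-item one item by item. A minimal sketch with hypothetical buyer values, where each buyer is a tuple of per-item values and purchases every item priced at or below its value:

```python
def revenue(buyer_values, prices):
    """Revenue from one buyer: the sum of the prices of all items whose
    value to the buyer covers the price."""
    return sum(p for v, p in zip(buyer_values, prices) if v >= p)

def average_revenue(sample, prices):
    """Average revenue over the sampled buyers for a price vector."""
    return sum(revenue(b, prices) for b in sample) / len(sample)

sample = [(3.0, 1.0), (4.0, 2.0)]            # hypothetical (item 1, item 2) values
print(average_revenue(sample, (3.0, 2.0)))   # 4.0: revenues 3 and 5
print(average_revenue(sample, (4.0, 1.0)))   # 3.0: revenues 1 and 5
```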

73 The average revenue over $S = \{b_1, \dots, b_N\}$ of a pricing mechanism is piecewise linear and 1-Lipschitz. [Plot: $U_S(\rho)$ = average revenue as a function of $\rho_1$ = price of item 1 and $\rho_2$ = price of item 2.]

74 How dispersed is $U_S(\rho)$? Let $f_i(v)$ be the density of the distribution over the typical buyer's value for item $i$. Suppose there are $m$ items and $\max_{i,v} f_i(v) \le \kappa$. With probability at least $1 - \gamma$, $U_S(\rho)$ is $(w, k)$-dispersed with $w = \frac{\gamma}{|S| \kappa m}$ and $k = m$. [Plot: $U_S(\rho)$ = average revenue over $\rho_1$ and $\rho_2$.]

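75 Let $f_i(v)$ be the density of the distribution over the typical buyer's value for item $i$. Suppose there are $m$ items and $\max_{i,v} f_i(v) \le \kappa$. Let $\hat{\rho} \in \mathbb{R}^m$ be the price vector output by our DP algorithm. With probability $1 - \gamma$, $U_S(\hat{\rho}) \ge \max_{\rho} U_S(\rho) - O\left(\frac{Hm}{|S| \varepsilon} + \frac{\gamma}{|S| \kappa m}\right)$. [Plot: $U_S(\rho)$ = average revenue over $\rho_1$ and $\rho_2$.]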
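One way to see where this bound comes from is to substitute the pricing parameters into the general utility guarantee of Theorem (part 2). The sketch below reads the dispersion parameters as $w = \gamma / (|S| \kappa m)$ and $k = m$, uses $d = m$ and $L = 1$, and assumes that $\varepsilon \le 1$ and that the $O(\cdot)$ on the slide hides logarithmic factors; these readings are assumptions.

```latex
% General bound: O( H/(|S| eps) * (d log(R/w) + log(1/gamma)) + L w + H k / |S| ).
% With d = m, L = 1, k = m, w = \gamma / (|S| \kappa m):
\frac{H}{|S| \varepsilon}\left(m \log \frac{R \, |S| \, \kappa \, m}{\gamma} + \log \frac{1}{\gamma}\right)
  + \frac{\gamma}{|S| \kappa m}
  + \frac{H m}{|S|}.
% For \varepsilon \le 1 the last term is at most Hm / (|S| \varepsilon), so up to
% logarithmic factors the loss is
\frac{H m}{|S| \varepsilon} + \frac{\gamma}{|S| \kappa m}.
```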

76 We also show that our DP algorithm has strong utility guarantees for many mechanism design problems: multiple bidders; buyers with unit-demand valuations; buyers with general valuations; 2nd-price auctions with reserve prices.

78 We provide a private algorithm for maximizing data-dependent piecewise Lipschitz functions.
Utility function | Parameter dimension | Privacy guarantee | Efficiency
Piecewise concave and Lipschitz | Multiple | (ε, δ) |
Piecewise Lipschitz | Single | (ε, 0) |
Piecewise Lipschitz | Multiple | (ε, 0) |

79 Questions?


More information

Algorithms for Nearest Neighbors

Algorithms for Nearest Neighbors Algorithms for Nearest Neighbors Classic Ideas, New Ideas Yury Lifshits Steklov Institute of Mathematics at St.Petersburg http://logic.pdmi.ras.ru/~yura University of Toronto, July 2007 1 / 39 Outline

More information

MATH 829: Introduction to Data Mining and Analysis Clustering I

MATH 829: Introduction to Data Mining and Analysis Clustering I 1/13 MATH 829: Introduction to Data Mining and Analysis Clustering I Dominique Guillot Departments of Mathematical Sciences University of Delaware April 25, 2016 Supervised and unsupervised learning 2/13

More information

CS573 Data Privacy and Security. Cryptographic Primitives and Secure Multiparty Computation. Li Xiong

CS573 Data Privacy and Security. Cryptographic Primitives and Secure Multiparty Computation. Li Xiong CS573 Data Privacy and Security Cryptographic Primitives and Secure Multiparty Computation Li Xiong Outline Cryptographic primitives Symmetric Encryption Public Key Encryption Secure Multiparty Computation

More information

A Non-exact Approach and Experiment Studies on the Combinatorial Auction Problem

A Non-exact Approach and Experiment Studies on the Combinatorial Auction Problem A Non-exact Approach and Experiment Studies on the Combinatorial Auction Problem Y. Guo 1, A. Lim 2, B. Rodrigues 3 and Y. Zhu 2 1 Department of Computer Science, National University of Singapore 3 Science

More information

ON WEIGHTED RECTANGLE PACKING WITH LARGE RESOURCES*

ON WEIGHTED RECTANGLE PACKING WITH LARGE RESOURCES* ON WEIGHTED RECTANGLE PACKING WITH LARGE RESOURCES* Aleksei V. Fishkin, 1 Olga Gerber, 1 Klaus Jansen 1 1 University of Kiel Olshausenstr. 40, 24118 Kiel, Germany {avf,oge,kj}@informatik.uni-kiel.de Abstract

More information

Integer Programming Theory

Integer Programming Theory Integer Programming Theory Laura Galli October 24, 2016 In the following we assume all functions are linear, hence we often drop the term linear. In discrete optimization, we seek to find a solution x

More information

Optimisation While Streaming

Optimisation While Streaming Optimisation While Streaming Amit Chakrabarti Dartmouth College Joint work with S. Kale, A. Wirth DIMACS Workshop on Big Data Through the Lens of Sublinear Algorithms, Aug 2015 Combinatorial Optimisation

More information

CS522: Advanced Algorithms

CS522: Advanced Algorithms Lecture 1 CS5: Advanced Algorithms October 4, 004 Lecturer: Kamal Jain Notes: Chris Re 1.1 Plan for the week Figure 1.1: Plan for the week The underlined tools, weak duality theorem and complimentary slackness,

More information

Linear methods for supervised learning

Linear methods for supervised learning Linear methods for supervised learning LDA Logistic regression Naïve Bayes PLA Maximum margin hyperplanes Soft-margin hyperplanes Least squares resgression Ridge regression Nonlinear feature maps Sometimes

More information

Locality- Sensitive Hashing Random Projections for NN Search

Locality- Sensitive Hashing Random Projections for NN Search Case Study 2: Document Retrieval Locality- Sensitive Hashing Random Projections for NN Search Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade April 18, 2017 Sham Kakade

More information

Greedy Algorithms CLRS Laura Toma, csci2200, Bowdoin College

Greedy Algorithms CLRS Laura Toma, csci2200, Bowdoin College Greedy Algorithms CLRS 16.1-16.2 Laura Toma, csci2200, Bowdoin College Overview. Sometimes we can solve optimization problems with a technique called greedy. A greedy algorithm picks the option that looks

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 13 UNSUPERVISED LEARNING If you have access to labeled training data, you know what to do. This is the supervised setting, in which you have a teacher telling

More information

Approximation Algorithms and Online Mechanisms for Item Pricing

Approximation Algorithms and Online Mechanisms for Item Pricing Approximation Algorithms and Online Mechanisms for Item Pricing Maria-Florina Balcan March 2006 CMU-CS-05-176R Avrim Blum School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 School

More information

Karaoke. Distributed Private Messaging Immune to Passive Traffic Analysis. David Lazar, Yossi Gilad, Nickolai Zeldovich

Karaoke. Distributed Private Messaging Immune to Passive Traffic Analysis. David Lazar, Yossi Gilad, Nickolai Zeldovich Karaoke Distributed Private Messaging Immune to Passive Traffic Analysis David Lazar, Yossi Gilad, Nickolai Zeldovich 1 Motivation: Report a crime without getting fired You re Fired if you talk to the

More information

Greedy Routing with Guaranteed Delivery Using Ricci Flow

Greedy Routing with Guaranteed Delivery Using Ricci Flow Greedy Routing with Guaranteed Delivery Using Ricci Flow Jie Gao Stony Brook University Joint work with Rik Sarkar, Xiaotian Yin, Wei Zeng, Feng Luo, Xianfeng David Gu Greedy Routing Assign coordinatesto

More information

SIMULATED ANNEALING WITH AN EFFICIENT UNIVERSAL BARRIER

SIMULATED ANNEALING WITH AN EFFICIENT UNIVERSAL BARRIER JACOB ABERNETHY UNIVERSITY OF MICHIGAN (JOINT WORK WITH ELAD HAZAN PRINCETON) 1 FASTER CONVEX OPTIMIZATION SIMULATED ANNEALING WITH AN EFFICIENT UNIVERSAL BARRIER 2 THIS TALK OUTLINE 1. The goal of Convex

More information

Lecture Notes 2: The Simplex Algorithm

Lecture Notes 2: The Simplex Algorithm Algorithmic Methods 25/10/2010 Lecture Notes 2: The Simplex Algorithm Professor: Yossi Azar Scribe:Kiril Solovey 1 Introduction In this lecture we will present the Simplex algorithm, finish some unresolved

More information

Math 734 Aug 22, Differential Geometry Fall 2002, USC

Math 734 Aug 22, Differential Geometry Fall 2002, USC Math 734 Aug 22, 2002 1 Differential Geometry Fall 2002, USC Lecture Notes 1 1 Topological Manifolds The basic objects of study in this class are manifolds. Roughly speaking, these are objects which locally

More information

Property Testing for Sparse Graphs: Structural graph theory meets Property testing

Property Testing for Sparse Graphs: Structural graph theory meets Property testing Property Testing for Sparse Graphs: Structural graph theory meets Property testing Ken-ichi Kawarabayashi National Institute of Informatics(NII)& JST, ERATO, Kawarabayashi Large Graph Project Joint work

More information

On Covering a Graph Optimally with Induced Subgraphs

On Covering a Graph Optimally with Induced Subgraphs On Covering a Graph Optimally with Induced Subgraphs Shripad Thite April 1, 006 Abstract We consider the problem of covering a graph with a given number of induced subgraphs so that the maximum number

More information

Comprehensive Practice Handout MATH 1325 entire semester

Comprehensive Practice Handout MATH 1325 entire semester 1 Comprehensive Practice Handout MATH 1325 entire semester Test 1 material Use the graph of f(x) below to answer the following 6 questions. 7 1. Find the value of lim x + f(x) 2. Find the value of lim

More information

I How does the formulation (5) serve the purpose of the composite parameterization

I How does the formulation (5) serve the purpose of the composite parameterization Supplemental Material to Identifying Alzheimer s Disease-Related Brain Regions from Multi-Modality Neuroimaging Data using Sparse Composite Linear Discrimination Analysis I How does the formulation (5)

More information

Decision trees. Decision trees are useful to a large degree because of their simplicity and interpretability

Decision trees. Decision trees are useful to a large degree because of their simplicity and interpretability Decision trees A decision tree is a method for classification/regression that aims to ask a few relatively simple questions about an input and then predicts the associated output Decision trees are useful

More information

Submodular Optimization

Submodular Optimization Submodular Optimization Nathaniel Grammel N. Grammel Submodular Optimization 1 / 28 Submodularity Captures the notion of Diminishing Returns Definition Suppose U is a set. A set function f :2 U! R is submodular

More information

Polynomial-Time Approximation Algorithms

Polynomial-Time Approximation Algorithms 6.854 Advanced Algorithms Lecture 20: 10/27/2006 Lecturer: David Karger Scribes: Matt Doherty, John Nham, Sergiy Sidenko, David Schultz Polynomial-Time Approximation Algorithms NP-hard problems are a vast

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Privacy preserving data mining Li Xiong Slides credits: Chris Clifton Agrawal and Srikant 4/3/2011 1 Privacy Preserving Data Mining Privacy concerns about personal data AOL

More information

Week 5. Convex Optimization

Week 5. Convex Optimization Week 5. Convex Optimization Lecturer: Prof. Santosh Vempala Scribe: Xin Wang, Zihao Li Feb. 9 and, 206 Week 5. Convex Optimization. The convex optimization formulation A general optimization problem is

More information

Differentially Private H-Tree

Differentially Private H-Tree GeoPrivacy: 2 nd Workshop on Privacy in Geographic Information Collection and Analysis Differentially Private H-Tree Hien To, Liyue Fan, Cyrus Shahabi Integrated Media System Center University of Southern

More information

On the Robustness of Distributed Computing Networks

On the Robustness of Distributed Computing Networks 1 On the Robustness of Distributed Computing Networks Jianan Zhang, Hyang-Won Lee, and Eytan Modiano Lab for Information and Decision Systems, Massachusetts Institute of Technology, USA Dept. of Software,

More information

Applying the weighted barycentre method to interactive graph visualization

Applying the weighted barycentre method to interactive graph visualization Applying the weighted barycentre method to interactive graph visualization Peter Eades University of Sydney Thanks for some software: Hooman Reisi Dekhordi Patrick Eades Graphs and Graph Drawings What

More information

Independent dominating sets in graphs of girth five via the semi-random method

Independent dominating sets in graphs of girth five via the semi-random method Independent dominating sets in graphs of girth five via the semi-random method Ararat Harutyunyan (Oxford), Paul Horn (Harvard), Jacques Verstraete (UCSD) March 12, 2014 Introduction: The semi-random method

More information

Large Scale Data Analysis Using Deep Learning

Large Scale Data Analysis Using Deep Learning Large Scale Data Analysis Using Deep Learning Machine Learning Basics - 1 U Kang Seoul National University U Kang 1 In This Lecture Overview of Machine Learning Capacity, overfitting, and underfitting

More information

Nearest Neighbor with KD Trees

Nearest Neighbor with KD Trees Case Study 2: Document Retrieval Finding Similar Documents Using Nearest Neighbors Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Emily Fox January 22 nd, 2013 1 Nearest

More information

A BGP-Based Mechanism for Lowest-Cost Routing

A BGP-Based Mechanism for Lowest-Cost Routing A BGP-Based Mechanism for Lowest-Cost Routing Joan Feigenbaum, Christos Papadimitriou, Rahul Sami, Scott Shenker Presented by: Tony Z.C Huang Theoretical Motivation Internet is comprised of separate administrative

More information

On the Robustness of Distributed Computing Networks

On the Robustness of Distributed Computing Networks 1 On the Robustness of Distributed Computing Networks Jianan Zhang, Hyang-Won Lee, and Eytan Modiano Lab for Information and Decision Systems, Massachusetts Institute of Technology, USA Dept. of Software,

More information

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition.

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition. 18.433 Combinatorial Optimization Matching Algorithms September 9,14,16 Lecturer: Santosh Vempala Given a graph G = (V, E), a matching M is a set of edges with the property that no two of the edges have

More information

Network monitoring: detecting node failures

Network monitoring: detecting node failures Network monitoring: detecting node failures 1 Monitoring failures in (communication) DS A major activity in DS consists of monitoring whether all the system components work properly To our scopes, we will

More information

Outline Introduction Problem Formulation Proposed Solution Applications Conclusion. Compressed Sensing. David L Donoho Presented by: Nitesh Shroff

Outline Introduction Problem Formulation Proposed Solution Applications Conclusion. Compressed Sensing. David L Donoho Presented by: Nitesh Shroff Compressed Sensing David L Donoho Presented by: Nitesh Shroff University of Maryland Outline 1 Introduction Compressed Sensing 2 Problem Formulation Sparse Signal Problem Statement 3 Proposed Solution

More information

Beyond Worst-Case Analysis in Combinatorial Optimization

Beyond Worst-Case Analysis in Combinatorial Optimization Proposal Draft Beyond Worst-Case Analysis in Combinatorial Optimization Colin White January 10, 2018 Computer Science Department School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213

More information

Differentially Private Histogram Publication

Differentially Private Histogram Publication Noname manuscript No. will be inserted by the editor) Differentially Private Histogram Publication Jia Xu Zhenjie Zhang Xiaokui Xiao Yin Yang Ge Yu Marianne Winslett Received: date / Accepted: date Abstract

More information

Differential Privacy. Cynthia Dwork. Mamadou H. Diallo

Differential Privacy. Cynthia Dwork. Mamadou H. Diallo Differential Privacy Cynthia Dwork Mamadou H. Diallo 1 Focus Overview Privacy preservation in statistical databases Goal: to enable the user to learn properties of the population as a whole, while protecting

More information

Mathematics 6 12 Section 26

Mathematics 6 12 Section 26 Mathematics 6 12 Section 26 1 Knowledge of algebra 1. Apply the properties of real numbers: closure, commutative, associative, distributive, transitive, identities, and inverses. 2. Solve linear equations

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

Combinatorial Selection and Least Absolute Shrinkage via The CLASH Operator

Combinatorial Selection and Least Absolute Shrinkage via The CLASH Operator Combinatorial Selection and Least Absolute Shrinkage via The CLASH Operator Volkan Cevher Laboratory for Information and Inference Systems LIONS / EPFL http://lions.epfl.ch & Idiap Research Institute joint

More information

Diversity Coloring for Distributed Storage in Mobile Networks

Diversity Coloring for Distributed Storage in Mobile Networks Diversity Coloring for Distributed Storage in Mobile Networks Anxiao (Andrew) Jiang and Jehoshua Bruck California Institute of Technology Abstract: Storing multiple copies of files is crucial for ensuring

More information

CMU-Q Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization. Teacher: Gianni A. Di Caro

CMU-Q Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization. Teacher: Gianni A. Di Caro CMU-Q 15-381 Lecture 9: Optimization II: Constrained,Unconstrained Optimization Convex optimization Teacher: Gianni A. Di Caro GLOBAL FUNCTION OPTIMIZATION Find the global maximum of the function f x (and

More information

Approximate Algorithms for Touring a Sequence of Polygons

Approximate Algorithms for Touring a Sequence of Polygons Approximate Algorithms for Touring a Sequence of Polygons Fajie Li 1 and Reinhard Klette 2 1 Institute for Mathematics and Computing Science, University of Groningen P.O. Box 800, 9700 AV Groningen, The

More information

John Oliver from The Daily Show. Supporting worthy causes at the G20 Pittsburgh Summit: Bayesians Against Discrimination. Ban Genetic Algorithms

John Oliver from The Daily Show. Supporting worthy causes at the G20 Pittsburgh Summit: Bayesians Against Discrimination. Ban Genetic Algorithms John Oliver from The Daily Show Supporting worthy causes at the G20 Pittsburgh Summit: Bayesians Against Discrimination Ban Genetic Algorithms Support Vector Machines Watch out for the protests tonight

More information

Approximation Basics

Approximation Basics Milestones, Concepts, and Examples Xiaofeng Gao Department of Computer Science and Engineering Shanghai Jiao Tong University, P.R.China Spring 2015 Spring, 2015 Xiaofeng Gao 1/53 Outline History NP Optimization

More information