Mondrian Multidimensional K-Anonymity


1 Mondrian Multidimensional K-Anonymity Kristen LeFevre, David J. DeWitt, and Raghu Ramakrishnan George W. Boulos gwf5@pi3.edu October

2 Table Linking

3 Overview: Motivation & contributions; Terminology; Quality metrics; Multidimensional k-anonymization; Greedy partitioning algorithm; Performance experiments

4 Motivation Protect the data owners' privacy using k-anonymous tables. Achieve higher quality of anonymized data. Provide an algorithm for anonymizing tables. The primary goal of k-anonymization is to protect the privacy of the individuals to whom the data pertains. However, subject to this constraint, it is important that the released data remain as useful as possible.

5 Contributions Introducing multidimensional k-anonymization. Introducing a greedy algorithm for k-anonymization: more efficient than proposed optimal k-anonymization algorithms for single-dimensional models; complexity O(n log n), compared to exponential. The greedy multidimensional algorithm often produces higher-quality results than optimal single-dimensional algorithms. More targeted notion of quality measurement.

6 Terminology Quasi-Identifier: Minimal set of attributes X1, ..., Xd in table T that can be joined with external information to re-identify individual records. Equivalence class: the set of all tuples in T containing identical values (x1, ..., xd) for X1, ..., Xd. K-Anonymity Property: Table T is k-anonymous with respect to attributes X1, ..., Xd if every unique tuple (x1, ..., xd) in the (multiset) projection of T on X1, ..., Xd occurs at least k times. K-Anonymization: A view V of relation T is said to be a k-anonymization if the view modifies or generalizes the data of T according to some model such that V is k-anonymous with respect to the quasi-identifier.
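The k-anonymity property defined above can be checked mechanically. The following is a minimal sketch, assuming a table represented as a list of dictionaries; the function name `is_k_anonymous`, the column names, and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifier, k):
    """Return True if every quasi-identifier value combination that
    occurs in rows occurs at least k times (multiset projection)."""
    counts = Counter(tuple(row[a] for a in quasi_identifier) for row in rows)
    return all(c >= k for c in counts.values())

# Hypothetical, already generalized table (illustrative data only).
patients = [
    {"age": "[25-28]", "sex": "male", "zip": "5371*", "disease": "flu"},
    {"age": "[25-28]", "sex": "male", "zip": "5371*", "disease": "hepatitis"},
    {"age": "[25-28]", "sex": "female", "zip": "5370*", "disease": "flu"},
    {"age": "[25-28]", "sex": "female", "zip": "5370*", "disease": "bronchitis"},
]

print(is_k_anonymous(patients, ["age", "sex", "zip"], k=2))  # True
print(is_k_anonymous(patients, ["age", "sex", "zip"], k=3))  # False
```

Each key of the counter corresponds to one equivalence class, and its count is the size of that class.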

7 General Quality Metrics Discernability Metric: Normalized Average Equivalence:
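The formulas for the two metrics are not shown in the text above. The sketch below assumes the definitions commonly used in the k-anonymity literature: the discernability metric as the sum of squared equivalence-class sizes, and the normalized average equivalence class size as (total records / number of classes) / k. The function names are hypothetical.

```python
def discernability(class_sizes):
    """Sum of squared equivalence-class sizes (lower is better)."""
    return sum(size * size for size in class_sizes)

def normalized_avg_class_size(class_sizes, k):
    """Average class size divided by k; 1.0 is the best possible value."""
    return (sum(class_sizes) / len(class_sizes)) / k

sizes = [2, 3, 5]                 # hypothetical equivalence-class sizes
print(discernability(sizes))      # 4 + 9 + 25 = 38
print(normalized_avg_class_size([2, 2, 2], k=2))  # 1.0
```

Under these assumed definitions, both metrics penalize large equivalence classes, so a partitioning that keeps classes close to size k scores better.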

8 K-Anonymization Global recoding: achieves anonymity by mapping the domains of the quasi-identifier attributes to generalized or altered values.

9 Single- vs. Multidimensional K-Anonymization Single-dimensional: A single-dimensional partitioning defines, for each Xi, a set of non-overlapping single-dimensional intervals that cover D_Xi. φi maps each x ∈ D_Xi to some summary statistic. Multidimensional: A global recoding achieves anonymity by mapping the domains of the quasi-identifier attributes to generalized or altered values. φ : D_X1 × ... × D_Xd → D'

10 Single- vs. Multidimensional K-Anonymization (Cont.)

11 Single-dimensional Partitioning A single-dimensional partitioning defines, for each Xi, a set of non-overlapping single-dimensional intervals that cover D_Xi. φi maps each x ∈ D_Xi to some summary statistic for the interval in which it is contained.
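As an illustration of a single-dimensional recoding function φi, the sketch below maps each value of one attribute to the interval that contains it. The helper `make_phi`, the choice of an interval label as the summary statistic, and the age intervals are assumptions made for this example.

```python
def make_phi(intervals):
    """Build a recoding function for one attribute from a list of
    non-overlapping, inclusive (low, high) intervals covering its domain."""
    def phi(x):
        for low, high in intervals:
            if low <= x <= high:
                return f"[{low}-{high}]"   # summary statistic: the range itself
        raise ValueError(f"{x} is not covered by any interval")
    return phi

# Hypothetical partitioning of an Age domain {25, ..., 28}.
phi_age = make_phi([(25, 26), (27, 28)])
print([phi_age(a) for a in (25, 26, 27, 28)])
# ['[25-26]', '[25-26]', '[27-28]', '[27-28]']
```

Because φi looks only at its own attribute, every record with the same age is recoded identically, whatever its other quasi-identifier values are.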

12 Strict Multidimensional Partitioning A strict multidimensional partitioning defines a set of non-overlapping multidimensional regions that cover D_X1 × ... × D_Xd. φ maps each tuple (x1, ..., xd) ∈ D_X1 × ... × D_Xd to a summary statistic for the region in which it is contained. Proposition 1: Every single-dimensional partitioning for quasi-identifier attributes X1, ..., Xd can be expressed as a strict multidimensional partitioning.

13 Strict Multidimensional Partitioning (Cont.) NP-hard

14 Single-dimensional vs. multidimensional partitioning Proposition 1: Every single-dimensional partitioning for quasi-identifier attributes X1, ..., Xd can be expressed as a strict multidimensional partitioning. However, when d >= 2 and, for all i, |D_Xi| >= 2, there exists a strict multidimensional partitioning that cannot be expressed as a single-dimensional partitioning.

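15 Decisional K-Anonymous Multidimensional Partitioning Given a set P of unique (point, count) pairs, with points in d-dimensional space, is there a strict multidimensional partitioning for P such that, for every resulting multidimensional region Ri, the counts of the points in Ri sum to at least k or sum to 0? NP-complete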

16 Allowable Cut Multidimensional: A cut perpendicular to axis Xi at xi is allowable if and only if Count(P.Xi > xi) >= k and Count(P.Xi <= xi) >= k. Single-dimensional: A single-dimensional cut perpendicular to Xi at xi is allowable, given S, if
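The multidimensional allowable-cut test can be sketched in a few lines. This example assumes points are tuples of numbers and that points equal to the cut value fall on the lower side, so that every point lands on exactly one side; the function name and sample points are hypothetical.

```python
def is_allowable_cut(points, dim, value, k):
    """A cut perpendicular to axis `dim` at `value` is allowable if both
    resulting sides keep at least k points."""
    lower = sum(1 for p in points if p[dim] <= value)  # assumption: ties go low
    upper = sum(1 for p in points if p[dim] > value)
    return lower >= k and upper >= k

# Hypothetical (age, zipcode) points.
points = [(25, 53711), (25, 53712), (26, 53711),
          (27, 53710), (27, 53712), (28, 53711)]

print(is_allowable_cut(points, dim=0, value=26, k=2))  # True: 3 and 3 points
print(is_allowable_cut(points, dim=0, value=27, k=3))  # False: 5 and 1 points
```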

17 Minimal Partitioning Minimal Strict Multidimensional Partitioning: Let R1, ..., Rn denote a set of regions induced by a strict multidimensional partitioning, and let each region Ri contain multiset Pi of points. This multidimensional partitioning is minimal if, for all i, |Pi| >= k and there exists no allowable multidimensional cut for Pi. Minimal Single-dimensional Partitioning: A set S of allowable single-dimensional cuts is a minimal single-dimensional partitioning for multiset P of points if there does not exist an allowable single-dimensional cut for P given S.

18 Bounds on Partition Size in Multidimensional K-Anonymization

19 Bounds on Partition Size in Single-dimensional K-Anonymization <= 2k - 1

20 Relaxed Multidimensional Partitioning A relaxed multidimensional partitioning for relation T defines a set of (potentially overlapping) distinct multidimensional regions that cover D_X1 × ... × D_Xd. Local recoding function φ maps each tuple (x1, ..., xd) ∈ T to a summary statistic for one of the regions in which it is contained. Proposition 2: Every strict multidimensional partitioning can be expressed as a relaxed multidimensional partitioning. However, if there are at least two tuples in table T having the same vector of quasi-identifier values, there exists a relaxed multidimensional partitioning that cannot be expressed as a strict multidimensional partitioning.

21 Greedy Partitioning Algorithm Choose the dimension with the widest range of values
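A minimal sketch of a greedy strict partitioning in this spirit follows. Only the "widest range" heuristic comes from the text above; splitting at the median, sending ties to the lower side, falling back to the next-widest dimension when a cut is not allowable, and using unnormalized ranges are all assumptions of this example, as are the function names.

```python
def greedy_partition(points, k):
    """Recursively split a list of numeric tuples into regions of >= k points."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not points:
        return []
    dims = range(len(points[0]))
    # Try dimensions from widest to narrowest (unnormalized) range.
    by_width = sorted(
        dims,
        key=lambda i: max(p[i] for p in points) - min(p[i] for p in points),
        reverse=True,
    )
    for dim in by_width:
        values = sorted(p[dim] for p in points)
        split = values[(len(values) - 1) // 2]           # lower median
        lower = [p for p in points if p[dim] <= split]
        upper = [p for p in points if p[dim] > split]
        if len(lower) >= k and len(upper) >= k:           # allowable cut
            return greedy_partition(lower, k) + greedy_partition(upper, k)
    return [points]                                       # no allowable cut left

def summarize(region):
    """Summary statistic for a region: the (min, max) range per dimension."""
    return [(min(p[i] for p in region), max(p[i] for p in region))
            for i in range(len(region[0]))]

points = [(25, 53711), (25, 53712), (26, 53711),
          (27, 53710), (27, 53712), (28, 53711)]
for region in greedy_partition(points, k=2):
    print(len(region), summarize(region))
```

Every returned region holds at least k points, so replacing each point by its region's summary yields a k-anonymous view of the quasi-identifier columns.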

22 Bounds on Quality

23 Scalability Problem The table may be too large to fit in the available memory. Calculate the frequency set of the attributes and load only the frequency set into memory.
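One way to picture the frequency-set idea is a single streaming pass that keeps only distinct quasi-identifier combinations and their counts. The sketch below assumes the table arrives as an iterator of dictionaries (here a CSV reader over an in-memory string); the function name and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter

def frequency_set(row_iter, attributes):
    """Count occurrences of each distinct value combination of `attributes`,
    reading rows one at a time so the full table is never held in memory."""
    freq = Counter()
    for row in row_iter:
        freq[tuple(row[a] for a in attributes)] += 1
    return freq

# Stand-in for a large file on disk (illustrative data only).
data = io.StringIO(
    "age,sex,disease\n"
    "25,male,flu\n"
    "25,male,hepatitis\n"
    "26,female,flu\n"
)
fs = frequency_set(csv.DictReader(data), ["age", "sex"])
print(fs)  # Counter({('25', 'male'): 2, ('26', 'female'): 1})
```

The memory footprint then depends on the number of distinct quasi-identifier combinations rather than on the number of rows.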

24 Workload-Driven Quality Mean statistics: SELECT AVG(Age) FROM Patients WHERE Sex = 'male'. Range statistics: SELECT COUNT(*) FROM Patients WHERE Sex = 'male' AND Age <= 26. It is impossible to answer the second query precisely using the single-dimensional recoding.
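To see why the range query is hard to answer from coarse intervals, the sketch below computes the tightest lower and upper bounds on the COUNT(*) result from released age ranges. The group layout, the function name `count_bounds`, and both example releases are hypothetical and only meant to contrast a single-dimensional release with a multidimensional one.

```python
def count_bounds(groups, sex, max_age):
    """groups: list of (age_low, age_high, sex, size) equivalence classes.
    Returns (lower, upper) bounds on COUNT(*) WHERE sex = sex AND age <= max_age."""
    lower = upper = 0
    for age_low, age_high, group_sex, size in groups:
        if group_sex != sex:
            continue
        if age_high <= max_age:      # whole class satisfies the predicate
            lower += size
            upper += size
        elif age_low <= max_age:     # class straddles the boundary
            upper += size
    return lower, upper

# Hypothetical releases of the same eight records.
single_dim = [(25, 28, "male", 4), (25, 28, "female", 4)]
multi_dim = [(25, 26, "male", 2), (27, 28, "male", 2), (25, 28, "female", 4)]

print(count_bounds(single_dim, "male", 26))  # (0, 4): answer is uncertain
print(count_bounds(multi_dim, "male", 26))   # (2, 2): answer is exact
```

In the single-dimensional release the age interval [25-28] must be shared by all records, so the boundary at 26 falls inside it; the multidimensional release can split ages for males only.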

25 Experimental Evaluation Used a synthetic data generator to produce two discrete joint distributions: discrete uniform and discrete normal. Also tested on the Adults database.

26 Experimental Evaluation for Synthetic Data

27 Experimental Evaluation for Adults Database

28 Optimal single-dimensional vs. greedy strict multidimensional partitioning

29 Strengths vs. Weaknesses Defines the process of k-anonymity within a broader and more accurate framework. The multidimensional approach keeps the number of points in each partition minimal, so the output data is of better quality. Any weaknesses?

30 Q & A Thank you


: Advanced Compiler Design. 8.0 Instruc?on scheduling

: Advanced Compiler Design. 8.0 Instruc?on scheduling 6-80: Advanced Compiler Design 8.0 Instruc?on scheduling Thomas R. Gross Computer Science Department ETH Zurich, Switzerland Overview 8. Instruc?on scheduling basics 8. Scheduling for ILP processors 8.

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Privacy preserving data mining Li Xiong Slides credits: Chris Clifton Agrawal and Srikant 4/3/2011 1 Privacy Preserving Data Mining Privacy concerns about personal data AOL

More information

If ( ) is approximated by a left sum using three inscribed rectangles of equal width on the x-axis, then the approximation is

If ( ) is approximated by a left sum using three inscribed rectangles of equal width on the x-axis, then the approximation is More Integration Page 1 Directions: Solve the following problems using the available space for scratchwork. Indicate your answers on the front page. Do not spend too much time on any one problem. Note:

More information

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #2: The Rela0onal Model, and SQL/Rela0onal Algebra

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #2: The Rela0onal Model, and SQL/Rela0onal Algebra CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #2: The Rela0onal Model, and SQL/Rela0onal Algebra Data Model A Data Model is a nota0on for describing data or informa0on. Structure of

More information

Today s Objec2ves. Kerberos. Kerberos Peer To Peer Overlay Networks Final Projects

Today s Objec2ves. Kerberos. Kerberos Peer To Peer Overlay Networks Final Projects Today s Objec2ves Kerberos Peer To Peer Overlay Networks Final Projects Nov 27, 2017 Sprenkle - CSCI325 1 Kerberos Trusted third party, runs by default on port 88 Security objects: Ø Ticket: token, verifying

More information

Survey of k-anonymity

Survey of k-anonymity NATIONAL INSTITUTE OF TECHNOLOGY ROURKELA Survey of k-anonymity by Ankit Saroha A thesis submitted in partial fulfillment for the degree of Bachelor of Technology under the guidance of Dr. K. S. Babu Department

More information

Using Sequen+al Run+me Distribu+ons for the Parallel Speedup Predic+on of SAT Local Search

Using Sequen+al Run+me Distribu+ons for the Parallel Speedup Predic+on of SAT Local Search Using Sequen+al Run+me Distribu+ons for the Parallel Speedup Predic+on of SAT Local Search Alejandro Arbelaez - CharloBe Truchet - Philippe Codognet JFLI University of Tokyo LINA, UMR 6241 University of

More information

TOPOLOGY, DR. BLOCK, FALL 2015, NOTES, PART 3.

TOPOLOGY, DR. BLOCK, FALL 2015, NOTES, PART 3. TOPOLOGY, DR. BLOCK, FALL 2015, NOTES, PART 3. 301. Definition. Let m be a positive integer, and let X be a set. An m-tuple of elements of X is a function x : {1,..., m} X. We sometimes use x i instead

More information

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #21: Data Mining and Warehousing

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #21: Data Mining and Warehousing CS 4604: Introduc0on to Database Management Systems B. Aditya Prakash Lecture #21: Data Mining and Warehousing Overview Tradi8onal database systems are tuned to many, small, simple queries. New applica8ons

More information

Ar#ficial Intelligence

Ar#ficial Intelligence Ar#ficial Intelligence Advanced Searching Prof Alexiei Dingli Gene#c Algorithms Charles Darwin Genetic Algorithms are good at taking large, potentially huge search spaces and navigating them, looking for

More information

Dr. Ulas Bagci

Dr. Ulas Bagci CAP- Computer Vision Lecture - Image Segmenta;on as an Op;miza;on Problem Dr. Ulas Bagci bagci@ucf.edu Reminders Oct Guest Lecture: SVM by Dr. Gong Oct 8 Guest Lecture: Camera Models by Dr. Shah PA# October

More information

Detec%ng the Temporal Context of Queries. Oliver Kennedy, Ying Yang, Jan Chomicki, Ronny Fehling, Zhen Hua Liu, and Dieter Gawlick 09/01/2014

Detec%ng the Temporal Context of Queries. Oliver Kennedy, Ying Yang, Jan Chomicki, Ronny Fehling, Zhen Hua Liu, and Dieter Gawlick 09/01/2014 Detec%ng the Temporal Context of Queries Oliver Kennedy, Ying Yang, Jan Chomicki, Ronny Fehling, Zhen Hua Liu, and Dieter Gawlick 09/01/2014 Outline Mo.va.on Contextual Analysis Prac.cal Temporal Dependency

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! h0p://www.cs.toronto.edu/~rsalakhu/ Lecture 3 Parametric Distribu>ons We want model the probability

More information

Efficient and Scalable Socware Detec2on in Online Social Networks

Efficient and Scalable Socware Detec2on in Online Social Networks Efficient and Scalable Socware Detec2on in Online Social Networks Md Sazzadur Rahman, Ting- Kai Huang, Harsha Madhyastha, Michalis Faloutsos University of California, Riverside Problem Statement Social

More information

Review. Objec,ves. Example Students Table. Database Overview 3/8/17. PostgreSQL DB Elas,csearch. Databases

Review. Objec,ves. Example Students Table. Database Overview 3/8/17. PostgreSQL DB Elas,csearch. Databases Objec,ves PostgreSQL DB Elas,csearch Review Databases Ø What language do we use to query databases? March 8, 2017 Sprenkle - CSCI397 1 March 8, 2017 Sprenkle - CSCI397 2 Database Overview Store data in

More information

Chapter 10 Advanced topics in relational databases

Chapter 10 Advanced topics in relational databases Chapter 10 Advanced topics in relational databases Security and user authorization in SQL Recursion in SQL Object-relational model 1. User-defined types in SQL 2. Operations on object-relational data Online

More information

Scalable Package Queries in Rela2onal Database Systems. Ma9eo Brucato Juan F. Beltran Azza Abouzied Alexandra Meliou

Scalable Package Queries in Rela2onal Database Systems. Ma9eo Brucato Juan F. Beltran Azza Abouzied Alexandra Meliou Scalable Package Queries in Rela2onal Database Systems Ma9eo Brucato Juan F. Beltran Azza Abouzied Alexandra Meliou Package Queries An important class of combinatorial op-miza-on queries Largely unsupported

More information

SQL- Updates, Asser0ons and Views

SQL- Updates, Asser0ons and Views SQL- Updates, Asser0ons and Views Data Defini0on, Constraints, and Schema Changes Used to CREATE, DROP, and ALTER the descrip0ons of the tables (rela0ons) of a database CREATE TABLE In SQL2, can use the

More information