Differentially Private Multi-Dimensional Time Series Release for Traffic Monitoring

1 DBSec'13
Differentially Private Multi-Dimensional Time Series Release for Traffic Monitoring
Liyue Fan, Li Xiong, Vaidy Sunderam
Department of Math & Computer Science, Emory University

2 9/4/2013 DBSec'13: Privacy Preserving Traffic Monitoring
Outline
Traffic Monitoring
User Privacy
Challenges
Proposed Solutions: Temporal Estimation, Spatial Estimation
Empirical Evaluation

3 Monitoring
Traffic congestion, trending places, everyday life
How many cars are there? Where are they?
(Images: Monital Metropol, Brazil; Google Traffic View)

4 Traffic Monitoring
Real-time GPS data
At any timestamp: traffic histogram
Real-time user locations are aggregated into a 2D histogram.

5 User Privacy
User privacy should be protected when releasing their data!
Real-time location data is sensitive (pleaserobme.com).
GPS traces are identifying:
"We study fifteen months of human mobility data for one and a half million individuals and find that human mobility traces are highly unique. ... in a dataset where the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier's antennas, four spatio-temporal points are enough to uniquely identify 95% of the individuals."
De Montjoye, Yves-Alexandre, Cesar A. Hidalgo, Michel Verleysen, and Vincent D. Blondel. "Unique in the Crowd: The Privacy Bounds of Human Mobility." Scientific Reports 3 (2013).

6 Differentially Private Data Sharing

7 Differential Privacy (in a nutshell)
Rigorous definition
Doesn't stipulate the prior knowledge of the attacker
Upon seeing the published data, an attacker should gain little knowledge about any specific individual.
α-differential privacy [BLR08]: smaller α values (α < 1) indicate a stronger privacy guarantee.
α is the privacy budget.

8 Static α-differential Privacy
Laplace perturbation: for a dataset D and a query f, release $A(D) = f(D) + \mathrm{Lap}(\Delta f / \alpha)^d$
Global sensitivity: $\Delta f = \max_{D, D'} \| f(D) - f(D') \|_1$
Strong privacy means high perturbation noise.
Example with $\Delta f = 1$:
f(D): c1: 2, c2: 1, c3: 3, c4: 4
Laplace perturbation: $\tilde{c}_i = c_i + \mathrm{Lap}(1/\alpha)$
A(D): c1: 1, c2: 0, c3: 5, c4: 3
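The Laplace perturbation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `laplace_perturb` and the choice α = 1 are assumptions made here, while the four cell counts are taken from the slide's example.

```python
import numpy as np

def laplace_perturb(counts, alpha, sensitivity=1.0, rng=None):
    """Add i.i.d. Laplace noise with scale sensitivity/alpha to every count.

    `laplace_perturb` is a hypothetical helper name used for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    scale = sensitivity / alpha  # smaller alpha (stronger privacy) -> larger noise
    return counts + rng.laplace(loc=0.0, scale=scale, size=counts.shape)

# Cell counts c1..c4 from the slide; alpha = 1 is an assumed budget.
hist = [2, 1, 3, 4]
noisy_hist = laplace_perturb(hist, alpha=1.0, rng=np.random.default_rng(0))
```

The released values are real-valued and can be negative; the slide shows rounded non-negative counts, which would be a post-processing step not sketched here.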

9 Composability of Differential Privacy
Sequential Composition [McSherry10]: Let $A_k$ each provide $\alpha_k$-differential privacy. A sequence of $A_k(D)$ over dataset D provides $\sum_k \alpha_k$-differential privacy.
Timestamps $k = 0, \dots, T-1$
$f_k(D)$: 2D cell histogram at time k
$A_k(D)$: released 2D histogram that satisfies $\frac{\alpha}{T}$-DP
$A_0(D), \dots, A_{T-1}(D)$ satisfies α-DP
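As a worked step, the uniform budget split can be written out explicitly. The numbers in the last line are illustrative: T = 100 matches the number of timestamps in the evaluation, while α = 1 is an assumed value.

```latex
\[
\sum_{k=0}^{T-1} \alpha_k \;=\; \sum_{k=0}^{T-1} \frac{\alpha}{T} \;=\; \alpha ,
\qquad
\text{noise scale per timestamp} \;=\; \frac{\Delta f}{\alpha / T} \;=\; \frac{T}{\alpha}
\quad (\Delta f = 1).
\]
\[
\text{For } T = 100,\ \alpha = 1: \quad \frac{T}{\alpha} = 100 .
\]
```

The per-timestamp noise scale therefore grows linearly with the length of the series.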

10 Baseline Solution: LPA
Laplace Perturbation Algorithm
For each timestamp k: release $A_k(D) = f_k(D) + \mathrm{Lap}(T/\alpha)^d$
High perturbation noise for long time series, i.e. when T is large
Low utility output since data is sparse
Example: true counts c1: 2, c2: 1, c3: 3, c4: 4; released counts c1: 1, c2: 0, c3: 5, c4: 3
Relative error: c1: 50%, c2: 100%
Fact: location data is VERY sparse.
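A minimal sketch of the LPA baseline, assuming the series is stored as a `(T, d)` array of cell counts. The name `lpa_release` is hypothetical; the last lines simply redo the slide's relative-error arithmetic for c1 and c2.

```python
import numpy as np

def lpa_release(series, alpha, rng=None):
    """Baseline LPA sketch: perturb every timestamp with Laplace(T / alpha) noise.

    `series` is assumed to have shape (T, d): T timestamps, d cells.
    """
    rng = np.random.default_rng() if rng is None else rng
    series = np.asarray(series, dtype=float)
    T = series.shape[0]
    # Each timestamp gets budget alpha / T, so the noise scale is T / alpha.
    return series + rng.laplace(loc=0.0, scale=T / alpha, size=series.shape)

# Relative errors for c1 and c2 in the slide's example.
slide_true = np.array([2.0, 1.0])
slide_released = np.array([1.0, 0.0])
slide_rel_err = np.abs(slide_released - slide_true) / slide_true  # 0.5 and 1.0
```

With sparse cells, whose true counts are close to zero, noise of scale T/α easily dominates the signal, which is the utility problem the slide points out.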

11 Two Proposed Solutions
Temporal Estimation for each cell: utilize a time series model and posterior estimation to reduce perturbation error.
Spatial Estimation within each partition: group similar cells together to overcome data sparsity.
(Figure: cells c1, c2, c3, c4)

12 Framework
Domain knowledge: known Sparse or Dense label for each cell.
Diagram components: Raw Series, Modeling/Partitioning, Differentially Private Series, Laplace Perturbation, Estimation
Doesn't incur extra differential privacy cost

13 Temporal Estimation
For each cell, its count series $\{x_k\}$, $k = 0, \dots, T-1$, e.g. {3, 3, 4, 5, 4, 3, 2, ...}
Process model: $x_{k+1} = x_k + \omega$, $\omega \sim N(0, Q)$
Measurement model: $z_k = x_k + \nu$, $\nu \sim \mathrm{Lap}(T/\alpha)$
Q: small value for Sparse cells; large value for Dense cells.
Goal: given $z_k$ and the above models, estimate $x_k$.

14 Temporal Estimation (cont.)
Estimation algorithm based on the Kalman filter
Gaussian approximation: $\nu \sim N(0, R)$, $R \propto T^2/\alpha^2$
O(1) computation per timestamp
Model-based prediction, then posterior estimate/output: linearly combine prediction and measurement
Fan and Xiong, CIKM'12, TKDE'13
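The prediction and posterior steps can be sketched as a scalar Kalman filter for one cell. This is a generic textbook filter under the random-walk process model, not the authors' code: the function name, the initialisation (`x0`, `P0`), the value Q = 1 and the choice R = 2(T/α)² (the variance of Laplace noise with scale T/α) are all assumptions made for illustration.

```python
import numpy as np

def kalman_filter_series(z, Q, R, x0=None, P0=None):
    """Scalar Kalman filter for x_{k+1} = x_k + w, z_k = x_k + v.

    Q and R are the process and measurement noise variances. Defaults
    for x0 and P0 are illustrative choices, not taken from the slides.
    """
    z = np.asarray(z, dtype=float)
    x_post = z[0] if x0 is None else float(x0)
    P_post = float(R) if P0 is None else float(P0)
    out = np.empty_like(z)
    for k, zk in enumerate(z):
        # Prediction from the process model.
        x_prior = x_post
        P_prior = P_post + Q
        # Correction: linear combination of prediction and measurement.
        K = P_prior / (P_prior + R)
        x_post = x_prior + K * (zk - x_prior)
        P_post = (1.0 - K) * P_prior
        out[k] = x_post
    return out

# Hypothetical cell with a constant true count, perturbed as in LPA.
rng = np.random.default_rng(0)
T, alpha = 100, 1.0
true = np.full(T, 500.0)
z = true + rng.laplace(0.0, T / alpha, size=T)
est = kalman_filter_series(z, Q=1.0, R=2.0 * (T / alpha) ** 2)
```

Each timestamp costs a constant number of arithmetic operations, in line with the O(1) claim on the slide. Because the filter only reads the already-perturbed values `z`, it is post-processing of the released data.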

15 Temporal Estimation Example
For cell c, at time k:
Suppose the true count is $x_k = 4$
Prediction $\hat{x}_k^{-}$, e.g. 2
Measurement (Laplace-perturbed value) $z_k$, e.g. 8
Posterior estimate $\hat{x}_k$, e.g. 3
The impact of perturbation noise is reduced by taking into account the process model and the prediction!
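The example numbers can be checked against the standard Kalman update, which combines prediction and measurement linearly. The gain K is not stated on the slide; the value below is only what the three example numbers imply.

```latex
\[
\hat{x}_k = \hat{x}_k^{-} + K \,\bigl(z_k - \hat{x}_k^{-}\bigr)
\quad\Longrightarrow\quad
3 = 2 + K\,(8 - 2)
\quad\Longrightarrow\quad
K = \tfrac{1}{6}.
\]
\[
|z_k - x_k| = |8 - 4| = 4,
\qquad
|\hat{x}_k - x_k| = |3 - 4| = 1 .
\]
```

The absolute error drops from 4 for the raw perturbed value to 1 for the posterior estimate.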

16 Spatial Estimation
Goal: group cells to overcome data sparsity.
First partition the space until each partition contains Sparse or Dense cells only
Top-down algorithm based on QuadTree
Data independency and efficiency
For each timestamp k:
$\bar{f}_k(D)$: partition counts, with $\Delta \bar{f}_k = 1$
$A_k(D) = \bar{f}_k(D) + \mathrm{Lap}(T/\alpha)^d$
Release $\hat{f}_k(D)$, the cell histogram estimated from $A_k(D)$
Each cell is visited O(1) times at each timestamp.
(Figure: grid of cells labeled S (Sparse) or D (Dense); 14 cells are S and 2 are D)
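The top-down partitioning can be sketched as a recursive quadtree split over the Sparse/Dense labels. This is an illustrative reading of the slide, not the authors' algorithm: the function name is hypothetical, the grid side is assumed to be a power of two, and the 4 by 4 layout below assumes the figure's 16 labels are listed row by row.

```python
import numpy as np

def quadtree_partition(labels, r0=0, c0=0, size=None):
    """Split a square label grid until every block is all 'S' or all 'D'.

    Returns a list of (row, col, size) blocks. Assumes the grid side is a
    power of two; the function name is hypothetical.
    """
    labels = np.asarray(labels)
    if size is None:
        size = labels.shape[0]
    block = labels[r0:r0 + size, c0:c0 + size]
    if size == 1 or np.all(block == block[0, 0]):
        return [(r0, c0, size)]
    half = size // 2
    parts = []
    for dr in (0, half):
        for dc in (0, half):
            parts.extend(quadtree_partition(labels, r0 + dr, c0 + dc, half))
    return parts

# 14 Sparse and 2 Dense cells, laid out row by row (assumed layout).
grid = [
    ["S", "S", "S", "S"],
    ["S", "S", "S", "S"],
    ["S", "S", "S", "S"],
    ["S", "S", "D", "D"],
]
partitions = quadtree_partition(grid)
```

The split uses only the labels, never the counts, which is one way to read "data independency": the partitioning itself touches no private data.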

17 Spatial Estimation Example
At time k:
Original cell histogram $f_k(D)$
Partition histogram $\bar{f}_k(D)$
Laplace-perturbed partition histogram $A_k(D)$
Estimated cell histogram $\hat{f}_k(D)$
Perturbation noise is evenly distributed to every cell within the partition.
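The four histograms in this example can be sketched as one function: sum the cells of each partition, perturb the partition count, then spread the noisy count evenly over the partition's cells. The function name and the `(row, col, size)` block format are assumptions for illustration, and dividing the noisy count equally is one reading of "evenly distributed".

```python
import numpy as np

def spatial_estimate(cell_hist, partitions, alpha, T, rng=None):
    """Estimate a cell histogram from Laplace-perturbed partition counts.

    `partitions` is a list of (row, col, size) square blocks covering the
    grid (an assumed format). Each partition count gets Laplace(T / alpha)
    noise, and the noisy count is divided equally among its cells.
    """
    rng = np.random.default_rng() if rng is None else rng
    cell_hist = np.asarray(cell_hist, dtype=float)
    est = np.zeros_like(cell_hist)
    for r, c, s in partitions:
        true_count = cell_hist[r:r + s, c:c + s].sum()         # partition histogram
        noisy_count = true_count + rng.laplace(0.0, T / alpha)  # Laplace perturbed
        est[r:r + s, c:c + s] = noisy_count / (s * s)           # estimated cells
    return est

# Hypothetical 2x2 grid treated as a single sparse partition.
cells = [[1, 3], [0, 0]]
estimate = spatial_estimate(cells, [(0, 0, 2)], alpha=1.0, T=100,
                            rng=np.random.default_rng(0))
```

A partition of n cells receives one noise draw instead of n, so each cell carries 1/n of that draw, at the cost of assuming counts are uniform within the partition.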

18 Evaluation: Data
Generated moving objects on a road network
City of Oldenburg, Germany
500K objects at the beginning
25K new objects at every timestamp
Total time: 100 timestamps
Two-dimensional 1024 by 1024 grid over the city map
Each cell represents 400 m²
Record object locations at cell resolution
95% of cells are labeled Sparse!

19 Temporal Estimation
(Figure: cell count over time for the original, Laplace, and Kalman series)

20 Spatial Partitions
(Figures: Oldenburg road network; partitions by QuadTree)

21 Evaluation: Utility vs. Privacy
Utility of each cell: Average Relative Error of released series
For each α value, the median utility within each class is plotted
DFT: Rastogi and Nath, SIGMOD'10
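A possible sketch of this utility metric is shown below. The slides do not give the exact formula, so the `floor` used to avoid division by zero in empty cells, the function names and the example labels are all assumptions; only the counts reuse the earlier four-cell example.

```python
import numpy as np

def average_relative_error(true_series, released_series, floor=1.0):
    """Per-cell average relative error over a series of shape (T, n_cells).

    `floor` bounds the denominator for empty cells; it is an assumption,
    not something stated in the slides.
    """
    true_series = np.asarray(true_series, dtype=float)
    released_series = np.asarray(released_series, dtype=float)
    rel = np.abs(released_series - true_series) / np.maximum(true_series, floor)
    return rel.mean(axis=0)

def median_by_class(errors, labels):
    """Median error within each class label (e.g. 'S' and 'D')."""
    errors = np.asarray(errors, dtype=float)
    labels = np.asarray(labels)
    return {str(c): float(np.median(errors[labels == c])) for c in np.unique(labels)}

# One timestamp, four cells; the class labels are hypothetical.
errs = average_relative_error([[2, 1, 3, 4]], [[1, 0, 5, 3]])
medians = median_by_class(errs, ["S", "S", "D", "D"])
```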

22 Evaluation: Range Queries
How many objects are in an area of m by m cells at every timestamp?
For each m, 100 areas are randomly selected and evaluated.
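The range-query evaluation could be sketched as follows for a single timestamp. The slide does not say which error measure is used, so relative error with a denominator `floor` is an assumption, as are the function name and the uniform sampling of area positions.

```python
import numpy as np

def range_query_errors(true_hist, released_hist, m, n_areas=100, floor=1.0, rng=None):
    """Relative error of m-by-m range counts over randomly placed areas.

    Works on one timestamp's 2D histograms. The error measure and the
    sampling scheme are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    true_hist = np.asarray(true_hist, dtype=float)
    released_hist = np.asarray(released_hist, dtype=float)
    rows, cols = true_hist.shape
    errs = np.empty(n_areas)
    for i in range(n_areas):
        r = rng.integers(0, rows - m + 1)  # top-left corner of the area
        c = rng.integers(0, cols - m + 1)
        true_count = true_hist[r:r + m, c:c + m].sum()
        released_count = released_hist[r:r + m, c:c + m].sum()
        errs[i] = abs(released_count - true_count) / max(true_count, floor)
    return errs

# Hypothetical 8x8 histogram where every released cell is off by one.
truth = np.ones((8, 8))
query_errs = range_query_errors(truth, truth + 1.0, m=2, rng=np.random.default_rng(0))
```

Repeating this for every timestamp and averaging would give a per-m curve of the kind the slide describes.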

23 Evaluation: Runtime
Overall runtime is plotted in milliseconds.

24 Conclusion
Difficult when the time series is long and the data is sparse!
Domain knowledge can be used for temporal modeling as well as spatial partitioning.
Output utility is improved with the same privacy guarantee.
We don't observe extra time cost from our solutions.
Ongoing work: utilize rich information in spatio-temporal data; model learning and parameter learning.
Contact: liyue.fan@emory.edu
AIMS Group:

25 Q&A


More information

Visual Traffic Jam Analysis based on Trajectory Data

Visual Traffic Jam Analysis based on Trajectory Data Visualization Workshop 13 Visual Traffic Jam Analysis based on Trajectory Data Zuchao Wang 1, Min Lu 1, Xiaoru Yuan 1, 2, Junping Zhang 3, Huub van de Wetering 4 1) Key Laboratory of Machine Perception

More information

COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA

COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA Vincent W. Zheng, Yu Zheng, Xing Xie, Qiang Yang Hong Kong University of Science and Technology Microsoft Research Asia WWW 2010

More information

Location Based Advertising and Location k- Anonymity

Location Based Advertising and Location k- Anonymity Location Based Advertising and Location k- Anonymity How can our location information be kept safe? Matthew Gaba Protecting Location Privacy with Personalized k-anonymity: Architecture and Algorithms Bugra

More information

Differentially private Bayesian learning on distributed data

Differentially private Bayesian learning on distributed data Differentially private Bayesian learning on distributed data Mikko Heikkilä 1 mikko.a.heikkila@helsinki.fi Samuel Kaski 3 samuel.kaski@aalto.fi Sasu Tarkoma 2 sasu.tarkoma@helsinki.fi Eemil Lagerspetz

More information

Semantic Website Clustering

Semantic Website Clustering Semantic Website Clustering I-Hsuan Yang, Yu-tsun Huang, Yen-Ling Huang 1. Abstract We propose a new approach to cluster the web pages. Utilizing an iterative reinforced algorithm, the model extracts semantic

More information

Measuring the World: Designing Robust Vehicle Localization for Autonomous Driving. Frank Schuster, Dr. Martin Haueis

Measuring the World: Designing Robust Vehicle Localization for Autonomous Driving. Frank Schuster, Dr. Martin Haueis Measuring the World: Designing Robust Vehicle Localization for Autonomous Driving Frank Schuster, Dr. Martin Haueis Agenda Motivation: Why measure the world for autonomous driving? Map Content: What do

More information

Vision and Image Processing Lab., CRV Tutorial day- May 30, 2010 Ottawa, Canada

Vision and Image Processing Lab., CRV Tutorial day- May 30, 2010 Ottawa, Canada Spatio-Temporal Salient Features Amir H. Shabani Vision and Image Processing Lab., University of Waterloo, ON CRV Tutorial day- May 30, 2010 Ottawa, Canada 1 Applications Automated surveillance for scene

More information

A System of Image Matching and 3D Reconstruction

A System of Image Matching and 3D Reconstruction A System of Image Matching and 3D Reconstruction CS231A Project Report 1. Introduction Xianfeng Rui Given thousands of unordered images of photos with a variety of scenes in your gallery, you will find

More information

D-Grid: An In-Memory Dual Space Grid Index for Moving Object Databases

D-Grid: An In-Memory Dual Space Grid Index for Moving Object Databases D-Grid: An In-Memory Dual Space Grid Index for Moving Object Databases Xiaofeng Xu Department of Math/CS Emory University Atlanta, Georgia 0 xxu7@emory.edu Li Xiong Department of Math/CS Emory University

More information

Beyond Sliding Windows: Object Localization by Efficient Subwindow Search

Beyond Sliding Windows: Object Localization by Efficient Subwindow Search Beyond Sliding Windows: Object Localization by Efficient Subwindow Search Christoph H. Lampert, Matthew B. Blaschko, & Thomas Hofmann Max Planck Institute for Biological Cybernetics Tübingen, Germany Google,

More information

Christian Doppler Laboratory for Dependable Wireless Connectivity for the Society in Motion Three-Dimensional Beamforming

Christian Doppler Laboratory for Dependable Wireless Connectivity for the Society in Motion Three-Dimensional Beamforming Christian Doppler Laboratory for Three-Dimensional Beamforming Fjolla Ademaj 15.11.216 Studying 3D channel models Channel models on system-level tools commonly 2-dimensional (2D) 3GPP Spatial Channel Model

More information

Continuous Density Queries for Moving Objects

Continuous Density Queries for Moving Objects Continuous Density Queries for Moving Objects Xing Hao School of Information Renmin University of China haoxing@ruc.edu.cn Xiaofeng Meng School of Information Renmin University of China xfmeng@ruc.edu.cn

More information

Introduction to Trajectory Clustering. By YONGLI ZHANG

Introduction to Trajectory Clustering. By YONGLI ZHANG Introduction to Trajectory Clustering By YONGLI ZHANG Outline 1. Problem Definition 2. Clustering Methods for Trajectory data 3. Model-based Trajectory Clustering 4. Applications 5. Conclusions 1 Problem

More information

An efficient approach for continuous density queries

An efficient approach for continuous density queries Front. Comput. Sci. DOI RESEARCH ARTICLE An efficient approach for continuous density queries Jie WEN, Xiaofeng MENG, Xing HAO, Jianliang XU School of Information, Renmin University of China, Beijing 7,

More information

Distributed Data Mining with Differential Privacy

Distributed Data Mining with Differential Privacy Distributed Data Mining with Differential Privacy Ning Zhang, Ming Li, Wenjing Lou Department of Electrical and Computer Engineering, Worcester Polytechnic Institute, MA Email: {ning, mingli}@wpi.edu,

More information

Differentially Private Histogram Publication

Differentially Private Histogram Publication Noname manuscript No. will be inserted by the editor) Differentially Private Histogram Publication Jia Xu Zhenjie Zhang Xiaokui Xiao Yin Yang Ge Yu Marianne Winslett Received: date / Accepted: date Abstract

More information

FMRI Pre-Processing and Model- Based Statistics

FMRI Pre-Processing and Model- Based Statistics FMRI Pre-Processing and Model- Based Statistics Brief intro to FMRI experiments and analysis FMRI pre-stats image processing Simple Single-Subject Statistics Multi-Level FMRI Analysis Advanced FMRI Analysis

More information

Differential Privacy. CPSC 457/557, Fall 13 10/31/13 Hushiyang Liu

Differential Privacy. CPSC 457/557, Fall 13 10/31/13 Hushiyang Liu Differential Privacy CPSC 457/557, Fall 13 10/31/13 Hushiyang Liu Era of big data Motivation: Utility vs. Privacy large-size database automatized data analysis Utility "analyze and extract knowledge from

More information

Leveraging Textural Features for Recognizing Actions in Low Quality Videos

Leveraging Textural Features for Recognizing Actions in Low Quality Videos Leveraging Textural Features for Recognizing Actions in Low Quality Videos Saimunur Rahman, John See, Chiung Ching Ho Centre of Visual Computing, Faculty of Computing and Informatics Multimedia University,

More information

Beyond Mere Pixels: How Can Computers Interpret and Compare Digital Images? Nicholas R. Howe Cornell University

Beyond Mere Pixels: How Can Computers Interpret and Compare Digital Images? Nicholas R. Howe Cornell University Beyond Mere Pixels: How Can Computers Interpret and Compare Digital Images? Nicholas R. Howe Cornell University Why Image Retrieval? World Wide Web: Millions of hosts Billions of images Growth of video

More information

Crowd-Blending Privacy

Crowd-Blending Privacy Crowd-Blending Privacy Johannes Gehrke, Michael Hay, Edward Lui, and Rafael Pass Department of Computer Science, Cornell University {johannes,mhay,luied,rafael}@cs.cornell.edu Abstract. We introduce a

More information

Spatial and multi-scale data assimilation in EO-LDAS. Technical Note for EO-LDAS project/nceo. P. Lewis, UCL NERC NCEO

Spatial and multi-scale data assimilation in EO-LDAS. Technical Note for EO-LDAS project/nceo. P. Lewis, UCL NERC NCEO Spatial and multi-scale data assimilation in EO-LDAS Technical Note for EO-LDAS project/nceo P. Lewis, UCL NERC NCEO Abstract Email: p.lewis@ucl.ac.uk 2 May 2012 In this technical note, spatial data assimilation

More information

From Neural Re-Ranking to Neural Ranking:

From Neural Re-Ranking to Neural Ranking: From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing Hamed Zamani (1), Mostafa Dehghani (2), W. Bruce Croft (1), Erik Learned-Miller (1), and Jaap Kamps (2)

More information

A Correlation Test: What were the interferometric observation conditions?

A Correlation Test: What were the interferometric observation conditions? A Correlation Test: What were the interferometric observation conditions? Correlation in Practical Systems For Single-Pass Two-Aperture Interferometer Systems System noise and baseline/volumetric decorrelation

More information

Demonstration of Damson: Differential Privacy for Analysis of Large Data

Demonstration of Damson: Differential Privacy for Analysis of Large Data Demonstration of Damson: Differential Privacy for Analysis of Large Data Marianne Winslett 1,2, Yin Yang 1,2, Zhenjie Zhang 1 1 Advanced Digital Sciences Center, Singapore {yin.yang, zhenjie}@adsc.com.sg

More information

The Confounding Problem of Private Data Release

The Confounding Problem of Private Data Release The Confounding Problem of Private Data Release Divesh Srivastava AT&T Labs-Research Acknowledgments: Ramón, Graham, Colin, Xi, Ashwin, Magda This material represents the views of the individual contributors

More information