Outsourcing Privacy-Preserving Social Networks to a Cloud

IEEE INFOCOM 2013, April 14-19, Turin, Italy
Outsourcing Privacy-Preserving Social Networks to a Cloud
Guojun Wang (a), Qin Liu (a), Feng Li (c), Shuhui Yang (d), and Jie Wu (b)
(a) Central South University, China; (b) Temple University, USA; (c) Indiana University-Purdue University Indianapolis, USA; (d) Purdue University Calumet, USA
2013-4-18

Introduction

Cloud Computing Model
Cloud computing, as a new commercial paradigm, enables organizations that host social network data to outsource a portion of their data to a cloud.
Publisher: outsources anonymized social network data to the cloud service provider.
User: asks aggregate queries, e.g., "What is the average number of co-authors?" (Answer: 10).
Attacker: analyzes the social network to re-identify a target.
How can individuals' identities be protected?

1-Neighborhood Attack
Anonymization cannot resist the 1-neighborhood attack, where the attacker is assumed to know the target's 1-neighborhood graph.
[Figure: (a) Initial social network; (b) 1-neighborhood of Bob; (c) Anonymized social network; (d) 2-anonymity social network]
Existing work makes each node's 1-neighborhood graph isomorphic to the 1-neighborhood graphs of at least k - 1 other nodes by adding noise edges (k-anonymity).
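The attacker's background knowledge in this attack can be made concrete with a minimal sketch. It assumes the graph is stored as a dictionary mapping each node to the set of its neighbors; the function name `one_neighborhood` and the toy graph are hypothetical and do not reproduce the slide's figure.

```python
def one_neighborhood(adj, u):
    """Return the subgraph induced by u and its one-hop neighbors."""
    nodes = {u} | adj[u]
    # Keep only the edges whose two endpoints both lie inside the neighborhood.
    return {v: adj[v] & nodes for v in nodes}


# Toy undirected graph (illustrative names and edges, not the slide's figure).
adj = {
    "Alice": {"Bob", "Clark"},
    "Bob": {"Alice", "Clark", "Donald"},
    "Clark": {"Alice", "Bob"},
    "Donald": {"Bob", "Ed"},
    "Ed": {"Donald"},
}

bob_view = one_neighborhood(adj, "Bob")
for node in sorted(bob_view):
    print(node, sorted(bob_view[node]))
```

Note that the edge between Donald and Ed is dropped because Ed is not a one-hop neighbor of Bob.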

Challenges
k-anonymity cannot resist the 1*-neighborhood attack, where the attacker is assumed to know the degrees of the target's one-hop neighbors in addition to the 1-neighborhood graph.
[Figure: (a) Initial social network; (b) 1-neighborhood of Bob; (c) Anonymized social network; (d) 2-anonymity social network]
To resist this attack, existing work would have to add even more edges so that the degrees of the k isomorphic graphs are the same, which reduces the utility of the graph.
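A small sketch may help show why the extra degree knowledge matters. It assumes an adjacency-set dictionary and uses the sorted degree sequence inside the 1-neighborhood as a cheap stand-in for a real isomorphism check; the function `one_star_knowledge` and the toy graph are hypothetical.

```python
def one_star_knowledge(adj, u):
    """Return (degree sequence inside u's 1-neighborhood, true degrees of u's neighbors)."""
    nodes = {u} | adj[u]
    # Degrees restricted to the 1-neighborhood: a rough proxy for its shape
    # (assumption: this is not a full isomorphism test).
    inner_degrees = sorted(len(adj[v] & nodes) for v in nodes)
    # Degrees of the one-hop neighbors in the whole graph: the extra 1* knowledge.
    neighbor_degrees = sorted(len(adj[v]) for v in adj[u])
    return inner_degrees, neighbor_degrees


# u and v have 1-neighborhoods of the same shape, but x1 has extra outside links.
adj = {
    "u": {"x1", "x2"}, "x1": {"u", "p", "q"}, "x2": {"u"},
    "p": {"x1"}, "q": {"x1"},
    "v": {"y1", "y2"}, "y1": {"v"}, "y2": {"v"},
}

shape_u, degrees_u = one_star_knowledge(adj, "u")
shape_v, degrees_v = one_star_knowledge(adj, "v")
print(shape_u == shape_v)      # same 1-neighborhood shape
print(degrees_u, degrees_v)    # neighbor degrees tell the two nodes apart
```

In this toy case the two nodes are indistinguishable under the 1-neighborhood attack but distinguishable once neighbor degrees are known.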

Our Contributions
We identify a novel attack, the 1*-neighborhood attack, for outsourcing social networks to a cloud.
We define the probabilistic indistinguishability property for an outsourced social network, and propose a heuristic indistinguishable group anonymization (HIGA) scheme to generate social networks with this privacy property.
We conduct experiments on both synthetic and real data sets to verify the effectiveness of the proposed scheme.

Preliminaries

System Model
Publisher (e.g., Facebook or Twitter).
User (interested in aggregate queries).
Cloud service provider (e.g., Google or Amazon).
Attacker (has the target's 1*-neighborhood graph as background knowledge and wants to re-identify the target from the social network graph).
Privacy goal. Given any target's 1*-neighborhood graph, the attacker cannot re-identify the target from an anonymized social network with confidence higher than a threshold.
Utility goal. The anonymized social network can be used to answer aggregate queries with high accuracy.

Problem Formulation
The problem of generating a social network with the above three properties is NP-hard.

Definitions
Let $G^*_u$ and $G'^*_u$ denote the 1*-neighborhood graph of node $u$ in the original social network $G$ and in the anonymized social network $G'$, respectively.

Scheme Description

Intuition
The heuristic indistinguishable group anonymization (HIGA) scheme consists of 4 steps: grouping, testing, anonymization, and randomization.

Intuition
Grouping classifies nodes whose 1*-neighborhood graphs satisfy certain metrics into groups, where the size of each group is at least k.
[Figure (A): Grouping, showing groups (1) to (4)]

Intuition
Testing uses random walk (RW) to test whether the 1-neighborhood graphs of the nodes in a group approximately match.
Anonymization uses a heuristic anonymization algorithm to make the 1-neighborhood graphs of the nodes in each group approximately match.
[Figure (B): Testing and anonymization on Group (2), first round]

Intuition
[Figure (B): Testing and anonymization on Group (2), second round]

Intuition
Randomization randomly modifies the graph with a certain probability, so that each node's 1*-neighborhood graph is changed with a certain probability.
[Figure (C): Randomization, showing groups (1) to (4)]

Step 1: Grouping
A social network is modeled as an undirected and unlabeled graph $G = (V(G), E(G))$, where $V(G)$ is a set of nodes and $E(G) \subseteq V(G) \times V(G)$ is a set of edges.

Step 1: Grouping
We group nodes by using the following metrics: number of one-hop neighbors, in-degree sequence, out-degree sequence, total number of edges, and betweenness.
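The grouping step can be illustrated with a rough sketch. The slides do not define the metrics for an undirected graph, so the code makes loud assumptions: a neighbor's "in-degree" is taken to be its number of links inside the 1-neighborhood, its "out-degree" is its number of links leaving the 1-neighborhood, betweenness is omitted, and nodes are grouped only when their keys are identical (which does not by itself guarantee groups of size at least k). The names `grouping_key` and `group_nodes` are hypothetical.

```python
from collections import defaultdict


def grouping_key(adj, u):
    """Build a comparable tuple of grouping metrics for node u (assumed definitions)."""
    hood = {u} | adj[u]
    # Assumed "in-degree" sequence: neighbor links that stay inside the 1-neighborhood.
    in_seq = tuple(sorted(len(adj[v] & hood) for v in adj[u]))
    # Assumed "out-degree" sequence: neighbor links that leave the 1-neighborhood.
    out_seq = tuple(sorted(len(adj[v] - hood) for v in adj[u]))
    # Total number of edges inside the 1-neighborhood (each edge is seen twice).
    total_edges = sum(len(adj[v] & hood) for v in hood) // 2
    return (len(adj[u]), in_seq, out_seq, total_edges)


def group_nodes(adj):
    """Put nodes with identical metric tuples into the same group."""
    groups = defaultdict(list)
    for u in adj:
        groups[grouping_key(adj, u)].append(u)
    return dict(groups)


# A 4-cycle: every node looks the same under these metrics.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
groups = group_nodes(cycle)
print(groups)
```

A practical version would also need a policy for merging undersized groups so that each group holds at least k nodes.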

Step 2: Testing
We analyze each pair of nodes u and v by computing the steady states of their 1-neighborhood graphs $G_u$ and $G_v$ with RW.
Eq. 1 calculates the probability that the walk is located at node $u_j$ at time t.
Eq. 2 calculates the probability distribution over all nodes in the graph.
Eq. 3 calculates the steady state of Eq. 2.
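Eqs. 1 to 3 are not reproduced in this transcript, so the following is only a sketch of one standard way to obtain a random-walk steady state, not the authors' exact formulation. It assumes a connected 1-neighborhood graph stored as adjacency sets and uses a lazy walk (the walker stays put with probability 1/2) so that the iteration also converges on bipartite graphs such as stars; the function name `steady_state` is hypothetical.

```python
def steady_state(adj, iterations=200):
    """Approximate the stationary distribution of a lazy random walk by power iteration."""
    nodes = sorted(adj)
    prob = {v: 1.0 / len(nodes) for v in nodes}  # start from the uniform distribution
    for _ in range(iterations):
        # With probability 1/2 the walker stays at its current node.
        nxt = {v: 0.5 * prob[v] for v in nodes}
        for v in nodes:
            # Otherwise it moves to a uniformly chosen neighbor.
            share = 0.5 * prob[v] / len(adj[v])
            for w in adj[v]:
                nxt[w] += share
        prob = nxt
    return prob


# Star-shaped 1-neighborhood: center "b" with three neighbors.
star = {"b": {"a", "c", "d"}, "a": {"b"}, "c": {"b"}, "d": {"b"}}
pi = steady_state(star)
print({v: round(p, 4) for v, p in pi.items()})
```

For a connected undirected graph this distribution is proportional to node degree, so the center of the star ends up with half of the probability mass.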

Step 2: Testing
We use Eq. 4 to calculate the Euclidean distance between the topological signatures of the nodes.
The cost of matching two 1-neighborhood graphs is calculated with Eq. 5.
The key problem is how to decide the threshold α.
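Eqs. 4 and 5 are not shown in this transcript. As a hedged illustration only, the sketch below assumes that a graph's topological signature is its list of steady-state probabilities, that signatures are sorted and zero-padded before comparison, and that two graphs "approximately match" when the distance does not exceed α. The functions `signature_distance` and `approximately_match` are hypothetical.

```python
import math


def signature_distance(sig_u, sig_v):
    """Euclidean distance between two signatures, sorted and zero-padded (assumption)."""
    a = sorted(sig_u, reverse=True)
    b = sorted(sig_v, reverse=True)
    size = max(len(a), len(b))
    a += [0.0] * (size - len(a))
    b += [0.0] * (size - len(b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def approximately_match(sig_u, sig_v, alpha):
    """Treat two 1-neighborhood graphs as matching when the distance is at most alpha."""
    return signature_distance(sig_u, sig_v) <= alpha


# Degree-proportional steady states of a 4-node star and a 3-node path.
star_sig = [0.5, 1 / 6, 1 / 6, 1 / 6]
path_sig = [0.5, 0.25, 0.25]
distance = signature_distance(star_sig, path_sig)
print(distance, approximately_match(star_sig, path_sig, alpha=0.1))
```

A larger α accepts more pairs as matching and so requires fewer modifications, at the price of weaker indistinguishability within a group.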

Step 3: Anonymization

Step 4: Randomization
Given a randomization probability p, we first randomly remove $p \cdot |E(G)|$ edges from the graph, and then, for each pair of nodes that are not linked, we add an edge with probability p.
The key problem lies in determining the value of p used to randomize the graph.
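The two randomization moves can be sketched as follows. The slide leaves some details open, so this code assumes that the number of removed edges is rounded to the nearest integer and that "not linked" is judged after the removal step (so a removed edge may be re-added). The function name `randomize` and the `seed` parameter are hypothetical.

```python
import random
from itertools import combinations


def randomize(adj, p, seed=None):
    """Remove about p*|E| random edges, then link each unlinked pair with probability p."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    edges = [(u, v) for u, v in combinations(nodes, 2) if v in adj[u]]
    # Step 1: remove p * |E(G)| edges chosen uniformly at random (rounded).
    removed = set(rng.sample(edges, round(p * len(edges))))
    kept = set(edges) - removed
    result = {v: set() for v in nodes}
    for u, v in combinations(nodes, 2):
        # Step 2: each pair that is not linked after step 1 gains an edge with probability p.
        if (u, v) in kept or rng.random() < p:
            result[u].add(v)
            result[v].add(u)
    return result


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
unchanged = randomize(path, 0.0, seed=1)
complete = randomize(path, 1.0, seed=1)
slightly_changed = randomize(path, 0.2, seed=1)
print(unchanged == path, sorted(complete["a"]))
```

The extreme values behave as expected: p = 0 returns the input graph, while p = 1 removes every original edge and then links every pair.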

Evaluation

Parameter Setting
Threshold value α.
First, randomly generate a 1-neighborhood graph with N nodes.
Then, generate a similar graph by randomly modifying a fraction p* of its edges.

Parameter Setting
Randomization probability p.
First, randomly generate a graph with N nodes and M edges.
Then, randomize the graph with different p values, and calculate the percentage P of 1*-neighborhood graphs that are changed in the randomized graph.
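The percentage P can be measured with a short sketch once an original and a randomized graph are available. It assumes that both graphs share the same node identifiers and that a 1*-neighborhood counts as changed when its edge set or its neighbor degrees differ (a labeled comparison rather than an isomorphism test). The helper names `one_star_view` and `changed_percentage` are hypothetical.

```python
def one_star_view(adj, u):
    """Edges inside u's 1-neighborhood plus the full-graph degree of each neighbor."""
    hood = {u} | adj[u]
    inner_edges = frozenset(frozenset((v, w)) for v in hood for w in adj[v] & hood)
    neighbor_degrees = frozenset((v, len(adj[v])) for v in adj[u])
    return inner_edges, neighbor_degrees


def changed_percentage(original, randomized):
    """Percentage P of nodes whose 1*-neighborhood differs between the two graphs."""
    changed = sum(
        one_star_view(original, u) != one_star_view(randomized, u) for u in original
    )
    return 100.0 * changed / len(original)


# Path a-b-c-d, and the same graph after edge c-d has been removed.
original = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
randomized = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": set()}
P = changed_percentage(original, randomized)
print(P)
```

Here a single removed edge changes the view of b, c, and d but not of a, which illustrates how a small p can still alter a large share of 1*-neighborhoods.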

Synthetic Data Set
We use the Barabási-Albert algorithm (B-A algorithm) to generate synthetic data sets.
First, generate a network of a small size (5 nodes), and then use that network as a seed to build a larger network (1,000, 2,000, 3,000, 4,000, and 5,000 nodes).
[Figure: Number of modified edges on synthetic data sets, p* = 0.1]
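A minimal preferential-attachment generator in the spirit of the B-A algorithm is sketched below. The slides give only the seed size and the target sizes, so the ring-shaped seed, the number of links per new node (`m=2`), and the function name `barabasi_albert` are assumptions made for illustration.

```python
import random


def barabasi_albert(n, m=2, seed_size=5, seed=None):
    """Grow a graph to n nodes; each new node links to m existing nodes, favoring high degree."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(seed_size)}
    # Assumed seed network: a ring over the initial nodes.
    for v in range(seed_size):
        w = (v + 1) % seed_size
        adj[v].add(w)
        adj[w].add(v)
    # Every node appears once per incident edge, so sampling from this list is
    # sampling proportionally to degree.
    endpoints = [v for v in adj for _ in adj[v]]
    for new in range(seed_size, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


graph = barabasi_albert(1000, seed=42)
edge_count = sum(len(nbrs) for nbrs in graph.values()) // 2
print(len(graph), edge_count)
```

With these assumed settings the edge count is fixed by construction: 5 seed edges plus 2 per added node.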

Real Data Set
The real social network is the Astro Physics collaboration network, which contains 18,772 nodes and 396,160 edges.
If author i co-authored a paper with author j, the graph contains an undirected edge between i and j.
[Figure: (a) Number of modified edges; (b) Comparison with existing work]

Real Data Set
Reported metrics: the maximal node degree (MAX), the minimal node degree (MIN), the average node degree (AVE), and the error rate for answering shortest distance queries (Error Rate).
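These utility metrics can be computed with a brief sketch. The slides do not define the error rate, so the code assumes it is the mean relative error of the shortest distance over a set of node pairs that are connected in both graphs. The function names `degree_stats`, `bfs_distances`, and `distance_error_rate` are hypothetical.

```python
from collections import deque
from itertools import combinations


def degree_stats(adj):
    """Return (MAX, MIN, AVE) of the node degrees."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    return max(degrees), min(degrees), sum(degrees) / len(degrees)


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def distance_error_rate(original, anonymized, pairs):
    """Mean relative error of shortest distances (assumed definition of Error Rate)."""
    errors = []
    for u, v in pairs:
        d = bfs_distances(original, u).get(v)
        d_anon = bfs_distances(anonymized, u).get(v)
        if d and d_anon is not None:  # skip pairs that are unreachable or identical
            errors.append(abs(d_anon - d) / d)
    return sum(errors) / len(errors)


# Path a-b-c-d and a version with one noise edge a-c added.
original = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
anonymized = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
stats = degree_stats(original)
error = distance_error_rate(original, anonymized, list(combinations(sorted(original), 2)))
print(stats, error)
```

On a graph the size of the real data set, sampling pairs and reusing one BFS per source would be preferable to recomputing distances for every pair.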

Conclusion
We identify a novel 1*-neighborhood attack for publishing a social network graph to a cloud.
We define a key property, probabilistic indistinguishability, for anonymizing outsourced social networks.
We propose a heuristic anonymization scheme to anonymize social networks with this property.

Thank you!