Large-scale Graph Mining @ Google NY. Vahab Mirrokni, Google Research, New York, NY. DIMACS Workshop.

Large-scale graph mining. Many applications: friend suggestions, recommendation systems, security, advertising. Benefits: big data available, rich structured information. New challenges: processing data efficiently, privacy limitations.

Google NYC: large-scale graph mining. Develop a general-purpose library of graph mining tools for XXXB nodes and XT edges via MapReduce+DHT (Flume), Pregel, and ASYMP. Goals: develop scalable tools (ranking, pairwise similarity, clustering, balanced partitioning, embedding, etc.); compare different algorithms/frameworks; help product groups use these tools across Google in a loaded cluster (clients in Search, Ads, YouTube, Maps, Social); fundamental research (algorithmic foundations and hybrid algorithms/systems research).

Outline. Three perspectives. Part 1: application-inspired problems (algorithms for public/private graphs). Part 2: distributed optimization for NP-hard problems (distributed algorithms via composable core-sets). Part 3: joint systems/algorithms research (MapReduce + distributed hash table service).

Problems Inspired by Applications. Part 1: why do we need scalable graph mining? Stories: Algorithms for public/private graphs: how to solve a problem for each node on a public graph plus its own private network (with Chierichetti, Epasto, Kumar, Lattanzi, M.; KDD 15). Ego-net clustering: how to use graph structures to improve collaborative filtering (with Epasto, Lattanzi, Sebe, Taei, Verma; ongoing). Local random walks for conductance optimization: local algorithms for finding well-connected clusters (with Allen-Zhu, Lattanzi; ICML 13).

Private-Public networks Idealistic vision

Private-Public networks: reality. ~52% of NYC Facebook users hide their friends ("My friends are private"; "Only my friends can see my friends").

Applications: friend suggestions. Network signals are very useful [CIKM 03]: number of common neighbors, personalized PageRank, Katz.

Applications: friend suggestions. Network signals are very useful [CIKM 03]: number of common neighbors, personalized PageRank, Katz. From a user's perspective, there are also interesting signals.

Applications: advertising. Maximize the reachable sets: how many can be reached by re-sharing?

Applications: advertising. Maximize the reachable sets: how many can be reached by re-sharing? More influential from a global perspective.

Applications: advertising. Maximize the reachable sets: how many can be reached by re-sharing? More influential from Starbucks' perspective.

Private-Public problem. There is a public graph G; in addition, each node u has access to a local private graph G_u.

Private-Public problem. For each u, we would like to execute some computation on G ∪ G_u. Doing it naively is too expensive.

Private-Public problem. Can we preprocess G so that we can solve problems in G ∪ G_u efficiently? Fast preprocessing + fast computation for each u.

Private-Public problem. Ideally: preprocessing time Õ(|E_G|), preprocessing space Õ(|V_G|), post-processing time Õ(|E_{G_u}|).
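
As a concrete illustration of these bounds, here is a minimal sketch (my own, not from the talk; the function names and the choice of counting connected components as the per-user statistic are assumptions) of the preprocess-then-query pattern: component IDs of the public graph G are computed once, and each user's query then touches only its own private edges.

```python
def preprocess_public_components(public_edges, nodes):
    """One-time pass over the public graph G: label every node with a
    component ID via union-find. Time ~Õ(|E_G|), space ~Õ(|V_G|)."""
    parent = {v: v for v in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for u, v in public_edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    return {v: find(v) for v in nodes}


def components_with_private(comp_id, num_public_components, private_edges):
    """Per-user post-processing: union the public component IDs touched by
    the user's private edges G_u. Time ~Õ(|E_{G_u}|)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = 0
    for u, v in private_edges:
        cu, cv = find(comp_id[u]), find(comp_id[v])
        if cu != cv:
            parent[cu] = cv
            merges += 1  # each merge reduces the component count by one
    return num_public_components - merges
```

`num_public_components` (the number of distinct values in `comp_id`) is also computed once during preprocessing, so the per-user step never scans the whole public graph.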

Problems Studied. (Approximation) algorithms with provable bounds: reachability, approximate all-pairs shortest paths, correlation clustering, social affinity. Heuristics: personalized PageRank, centrality measures.

Part 2: Distributed Optimization for NP-Hard Problems on Large Data Sets. Two stories: (1) Distributed optimization via composable core-sets: sketch the problem in composable instances; distributed computation in a constant (1 or 2) number of rounds. (2) Balanced partitioning: partition into ~equal parts and minimize the cut.

Distributed Optimization Framework. The input set N is split into chunks T_1, ..., T_m across m machines; each machine i runs ALG on T_i to select a subset S_i; then ALG is run once more on the union of the selected elements S_1 ∪ ... ∪ S_m to find the final size-k output set.
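
The framework in this slide can be written down generically. The sketch below is my own illustration: `alg` stands for whichever single-machine algorithm ALG the slide refers to, and randomized partitioning is used (one of the options discussed later); in a real deployment the round-1 calls run in parallel on separate machines.

```python
import random

def distributed_select(items, k, alg, num_machines, seed=0):
    """Two-round composable core-set framework:
    round 1: partition the input into T_1..T_m and run `alg` on each chunk;
    round 2: run `alg` on the union of the selected elements S_1..S_m."""
    rng = random.Random(seed)
    chunks = [[] for _ in range(num_machines)]
    for item in items:                       # random partition of the input set N
        chunks[rng.randrange(num_machines)].append(item)
    selected = []
    for chunk in chunks:                     # round 1 (one call per machine)
        selected.extend(alg(chunk, k))       # S_i = ALG(T_i)
    return alg(selected, k)                  # round 2: final size-k output set
```

The sketches below plug concrete choices of ALG (greedy for coverage, farthest-point traversal for k-center, and so on) into this two-round skeleton.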

Composable Core-sets: a technique for effective distributed algorithms. One or two rounds of computation; minimal communication complexity; can also be used in streaming models and nearest-neighbor search. Problems: Diversity maximization via composable core-sets (Indyk, Mahabadi, Mahdian, Mirrokni, ACM PODS 14). Clustering problems via mapping core-sets (Bateni, Bhaskara, Lattanzi, Mirrokni, NIPS 2014). Coverage/submodular maximization via randomized composable core-sets (Mirrokni, Zadimoghaddam, ACM STOC 2015).

Problems considered. General: find a set S of k items and maximize f(S). Diversity maximization: find a set S of k points and maximize the sum of pairwise distances, i.e., diversity(S). Capacitated/balanced clustering: find a set S of k centers and cluster nodes around them while minimizing the sum of distances to S. Coverage/submodular maximization: find a set S of k items and maximize a submodular function f(S).

Distributed Clustering. Clustering: divide data in a metric space (X, d) into k groups around centers, where c(x) denotes the center closest to x. Minimize: k-center: max_x d(x, c(x)); k-means: sum_x d(x, c(x))^2; k-median: sum_x d(x, c(x)). An α-approximation algorithm has cost less than α * OPT.

Distributed Clustering. Many objectives: k-means, k-median, k-center (minimize the max cluster radius), ... Framework: divide the data into chunks V_1, V_2, ..., V_m; come up with representatives S_i on machine i, with |S_i| << |V_i|; solve the problem on the union of the S_i, and assign the remaining points to their closest representative.
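
A minimal sketch of the per-machine representative step for the k-center objective, assuming points are coordinate tuples and using greedy farthest-point traversal (a classical 2-approximation for k-center) as the single-machine subroutine; the mapping-core-set construction in the actual result additionally keeps a mapping of points to representatives, which is omitted here.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def farthest_point_reps(points, k):
    """Greedy farthest-point traversal: pick k representatives so that every
    point is within twice the optimal k-center radius of its closest one."""
    reps = [points[0]]
    dist = [euclidean(p, reps[0]) for p in points]
    while len(reps) < min(k, len(points)):
        idx = max(range(len(points)), key=lambda i: dist[i])  # farthest point
        reps.append(points[idx])
        for i, p in enumerate(points):
            dist[i] = min(dist[i], euclidean(p, points[idx]))
    return reps
```

Plugged into the two-round framework above, machine i would return `farthest_point_reps(V_i, k)` as its representatives S_i; the final clustering is computed on the union of the representatives, and remaining points are assigned to their closest representative's cluster.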

Balanced/Capacitated Clustering. Theorem (Bhaskara, Bateni, Lattanzi, M., NIPS 14): distributed balanced clustering with approximation ratio (small constant) * (best single-machine ratio); rounds of MapReduce: constant (2); memory: ~(n/m)^2 with m machines. Works for all Lp objectives (includes k-means, k-median, k-center). Improves on previous work: Bahmani, Kumar, Vassilvitskii, Vattani (parallel k-means++); Balcan, Ehrlich, Liang (core-sets for k-median and k-center).

Experiments. Aim: test the algorithm in terms of (a) scalability and (b) quality of the solution obtained. Setup: two base instances and subsamples (k = 1000, #machines = 200). US graph: N = x0 million, geodesic distances; World graph: N = x00 million, geodesic distances.
Graph | size of seq. instance | increase in OPT
US | 1/300 | 1.52
World | 1/1000 | 1.58
Accuracy: the analysis is pessimistic. Scaling: sub-linear.

Coverage/Submodular Maximization. Max-coverage: given a family of subsets S_1, ..., S_m, choose k subsets with maximum union cardinality. Submodular maximization: given a submodular function f, find a set S of k elements maximizing f(S). Applications: data summarization, feature selection, exemplar clustering, ...
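
For reference, a minimal sketch of the classical greedy algorithm that the single-machine baseline and the core-set constructions below build on; subsets are assumed to be Python sets, and the (1 - 1/e) guarantee stated in the comment is the standard one for max-coverage.

```python
def greedy_max_coverage(subsets, k):
    """Pick k subsets greedily by marginal coverage gain;
    the classical (1 - 1/e)-approximation for max-coverage."""
    covered, chosen = set(), []
    remaining = list(range(len(subsets)))
    for _ in range(min(k, len(subsets))):
        best = max(remaining, key=lambda i: len(subsets[i] - covered))
        if not subsets[best] - covered:
            break                            # no marginal gain left
        chosen.append(best)
        covered |= subsets[best]
        remaining.remove(best)
    return chosen, covered

# Example: family S_1..S_4, budget k = 2.
sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
print(greedy_max_coverage(sets, 2))          # ([0, 2], {1, 2, 3, 4, 5, 6})
```

Used inside the two-round skeleton above, with random partitioning and (per the result below) an enlarged per-machine output of 4k, this greedy is the kind of ALG the randomized composable core-set results analyze.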

Bad News! Theorem [Indyk, Mahabadi, Mahdian, M., PODS 14]: there exists no constant-factor approximate (deterministic) composable core-set for submodular maximization. Question: what if we apply random partitioning? Yes! Concurrently answered in two papers: Barbosa, Ene, Nguyen, Ward (ICML 15) and M., Zadimoghaddam (STOC 15).

Summary of Results [M., Zadimoghaddam, STOC 15]: 1. A class of 0.33-approximate randomized composable core-sets of size k for non-monotone submodular maximization. 2. Hard to go beyond 1/2 approximation with size k; impossible to get better than 1 - 1/e. 3. A 0.58-approximate randomized composable core-set of size 4k for monotone f; results in a 0.54-approximate distributed algorithm. 4. For small composable core-sets of size k' less than k: a sqrt(k'/k)-approximate randomized composable core-set.

(2 - √2)-approximate Randomized Core-set. Positive result [M., Zadimoghaddam]: if we increase the output size to 4k, Greedy is a ((2 - √2) - o(1)) ≈ 0.585-approximate randomized core-set for a monotone submodular function. Remark: in this result, we send each item to C random machines instead of one; as a result, the approximation factors are reduced by an O(ln(C)/C) term.

Summary: composable core-sets. Diversity maximization (PODS 14): constant-factor composable core-sets. Balanced clustering (k-center, k-median & k-means) (NIPS 14): mapping core-sets give a constant factor. Coverage and submodular maximization (STOC 15): impossible for deterministic composable core-sets; randomized core-sets give a 0.54-approximation. Future: apply core-sets to other ML/graph problems, e.g., feature selection. For submodular maximization: a (1 - 1/e)-approximate core-set? A (1 - 1/e)-approximation in 2 rounds (even with multiplicity)?

Distributed Balanced Partitioning via Linear Embedding. Based on work by Aydin, Bateni, Mirrokni.

Balanced Partitioning Problem. Given a graph G(V, E) with edge weights, find k clusters of approximately the same size and minimize the cut, i.e., the number of inter-cluster edges. Applications: minimize communication complexity in distributed computation; minimize the number of multi-shard queries while serving an algorithm over a graph, e.g., when computing shortest paths or directions on Maps.
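
To make the objective concrete, here is a small helper (my own, not from the talk; unweighted edges for simplicity) that scores a candidate assignment by the two quantities that matter here: the number of cut edges and the imbalance relative to perfectly equal parts.

```python
from collections import Counter

def evaluate_partition(edges, assignment, k):
    """edges: iterable of (u, v) pairs; assignment: dict node -> cluster in [0, k).
    Returns (number of cut edges, max cluster size divided by the ideal size)."""
    cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    sizes = Counter(assignment.values())
    ideal = len(assignment) / k
    imbalance = max(sizes.values()) / ideal   # 1.0 means perfectly balanced
    return cut, imbalance
```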

Outline of Algorithm. Three stages: 1. Reasonable initial ordering: (a) space-filling curves, (b) hierarchical clustering. 2. Semi-local moves: (a) min linear arrangement, (b) optimize by random swaps. 3. Introduce imbalance: (a) dynamic programming, (b) linear boundary adjustment, (c) min-cut boundary optimization. (Figure: an example ordering of nodes 0-11 of G=(V,E) after the initial ordering, after semi-local moves, and after introducing imbalance.)

Step 1 - Initial Embedding. Space-filling curves (geo graphs); hierarchical clustering (general graphs). (Figure: a hierarchical clustering tree over an example node ordering 0-11.)
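
Whatever produces the ordering (a space-filling curve for geo graphs or hierarchical clustering for general graphs), the last step of this stage is the same: cut the line into k contiguous, near-equal pieces. A minimal sketch under that assumption; the semi-local moves and the imbalance stage of the full pipeline are not shown.

```python
def split_ordering(ordered_nodes, k):
    """Cut a linear ordering of the nodes into k contiguous, near-equal parts.
    Returns a dict node -> cluster id in [0, k)."""
    n = len(ordered_nodes)
    return {node: min(rank * k // n, k - 1)
            for rank, node in enumerate(ordered_nodes)}

# With a 12-node ordering and k = 3: positions 0-3 go to cluster 0,
# 4-7 to cluster 1, and 8-11 to cluster 2.
```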

Datasets. Social graphs: Twitter: 41M nodes, 1.2B edges; LiveJournal: 4.8M nodes, 42.9M edges; Friendster: 65.6M nodes, 1.8B edges. Geo graphs: World graph (> 1B edges); country graphs (filtered).

Related Work. FENNEL, WSDM 14 [Tsourakakis et al.]: Microsoft Research; streaming algorithm. UB13, WSDM 13 [Ugander & Backstrom]: Facebook; balanced label propagation. Spinner, (very recent) arXiv [Martella et al.]. METIS: in-memory.

Comparison to Previous Work:
k | Spinner (5%) | UB13 (5%) | Affinity (0%) | Our Alg (0%)
20 | 38% | 37% | 35.71% | 27.5%
40 | 40% | 43% | 40.83% | 33.71%
60 | 43% | 46% | 43.03% | 36.65%
80 | 44% | 47.5% | 43.27% | 38.65%
100 | 46% | 49% | 45.05% | 41.53%

Comparison to Previous Work:
k | Spinner (5%) | Fennel (10%) | Metis (2-3%) | Our Alg (0%)
2 | 15% | 6.8% | 11.98% | 7.43%
4 | 31% | 29% | 24.39% | 18.16%
8 | 49% | 48% | 35.96% | 33.55%

Outline: Part 3. Practice: algorithms + systems research. Two stories: (1) Connected components in MapReduce and beyond. (2) Going beyond MapReduce to build efficient tools in practice: ASYMP, a new asynchronous message-passing system.

Graph Mining Frameworks. Applying various frameworks to graph algorithmic problems. Iterative MapReduce (Flume): more widely available, fault-tolerant tool; can be optimized with algorithmic tricks. Iterative MapReduce + DHT service (Flume): better speed compared to MR. Pregel: good for synchronous computation with many rounds; simpler implementation. ASYMP (ASYnchronous Message-Passing): more scalable, more efficient use of CPU; asynchronous self-stabilizing algorithms.

Metrics for MapReduce algorithms. Running time: number of MapReduce rounds; quasi-linear-time processing of inputs. Communication complexity: linear communication per round; total communication across multiple rounds. Load balancing: no mapper or reducer should be overloaded. Locality of messages: send messages locally when possible; use the same key for mapper and reducer when possible; effective while using MR with a DHT (more later).

Connected Components: Example output Web Subgraph: 8.5B nodes, 700B edges

Prior Work: Connected Components in MR. "Connected components in MapReduce", Rastogi et al., ICDE 12:
Algorithm | #MR Rounds | Communication / Round | Practice
Hash-Min | D (diameter) | O(m+n) | many rounds
Hash-to-All | log D | O(n...) | long rounds
Hash-to-Min | open | O(n log n + m) | best
Hash-Greater-to-Min | 3 log D | 2(n+m) | OK, but not the best

Connected Components: Summary. Connected components in MR & MR+DHT: simple, local algorithms with O(log^2 n) round complexity; communication efficient (#edges non-increasing); use a Distributed HashTable service (DHT) to improve #rounds to Õ(log n) [from ~20 to ~5]. Data: graphs with ~XT edges; public data with 10B edges. Results: MapReduce: 10-20 times faster than Hash-to-Min; MR+DHT: 20-40 times faster than Hash-to-Min; ASYMP: a simple algorithm in ASYMP is 25-55 times faster than Hash-to-Min. Kiveris, Lattanzi, M., Rastogi, Vassilvitskii, SOCC 14.

ASYMP: ASYnchronous Message Passing. ASYMP: a new graph mining framework; compare with MapReduce and Pregel. Computation does not happen in a synchronized number of rounds; the fault-tolerance implementation is also asynchronous; more efficient use of CPU cycles. We study its fault tolerance and scalability; impressive empirical performance (e.g., for connectivity and shortest paths). Fleury, Lattanzi, M.: ongoing.

Asymp model. Nodes are distributed among many machines (workers). Each node keeps a state and sends messages to its neighbors. Each machine has a priority queue for sending messages to other machines. Initialization: set node states and activate some nodes. Main propagation loop (roughly): until all nodes converge to a stable state, asynchronously update states and send the top messages in each priority queue. Stop condition: stop when the priority queues are empty.
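
ASYMP itself is an internal system, so the following is only a toy, single-process simulation of the propagation loop described above (a state per node, a priority queue per worker, run until the queues drain), using minimum-label propagation for connected components as the update rule; all names are illustrative.

```python
import heapq
from collections import defaultdict

def asymp_like_components(edges, nodes, num_workers=4):
    """Toy simulation of an asynchronous propagation loop: a node's state is
    its current (minimum) component label, a message proposes a smaller label
    to a neighbor, and workers keep draining their priority queues."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    state = {v: v for v in nodes}                 # initial label = own id
    queues = [[] for _ in range(num_workers)]

    def worker(v):
        return hash(v) % num_workers

    for v in nodes:                               # activate all nodes
        heapq.heappush(queues[worker(v)], (state[v], v, state[v]))
    while any(queues):                            # stop when queues are empty
        for q in queues:
            if not q:
                continue
            _, v, proposed = heapq.heappop(q)     # best (smallest) label first
            first_touch = (proposed == state[v] == v)
            if proposed < state[v] or first_touch:
                state[v] = min(state[v], proposed)
                for w in adj[v]:                  # propagate the improvement
                    if state[v] < state[w]:
                        heapq.heappush(queues[worker(w)],
                                       (state[v], w, state[v]))
    return state                                  # node -> component label
```

The real system distributes the states and the queues across machines and overlaps computation with communication; this toy keeps the same structure in one process.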

Asymp worker design

Data Sets. 5 public and 5 internal Google graphs, e.g.: UK Web graph: 106M nodes, 6.6B edges [public]; Google+ subgraph: 178M nodes, 2.9B edges; keyword similarity: 371M nodes, 3.5B edges; document similarity: 4,700M nodes, 452B edges. Sequence of Web subgraphs: ~1B, 3B, 9B, 27B core nodes [16B, 47B, 110B, 356B]; ~36B, 108B, 324B, 1010B edges respectively. Sequence of RMAT graphs [synthetic and public]: ~2^26, 2^28, 2^30, 2^32, 2^34 nodes; ~2B, 8B, 34B, 137B, 547B edges respectively.

Comparison with best MR algorithms: running time comparison. (Figure: speed-up factors, on a scale from 1x to 50x, for MR ext, MR int, MR+HT, and Asymp on the test graphs GP, O, RE, F, P, LJ.)

Asymp Fault-tolerance. Asynchronous checkpointing: store the current states of nodes once in a while. Upon failure of a machine: fetch the last recorded state of each node, activate these nodes (they send messages to their neighbors), and ask them to resend any messages that may have been lost. Therefore, a self-stabilizing algorithm works correctly in ASYMP. Example: Dijkstra's shortest-path algorithm.

Impact of failures on running time. Make a fraction (or all) of the machines fail over time. Question: what is the impact of frequent failures? Let D be the running time without any failures. Then:
% machine failures over the whole period (not per batch) | 6% of machines failing at a time | 12% of machines failing at a time
50% | time ~= 2D | time ~= 1.4D
100% | time ~= 3.6D | time ~= 3.2D
200% | time ~= 5.3D | time ~= 4.1D
More frequent small failures are worse than less frequent large failures; more robust against group-machine failures.

Questions? Thank you!

Algorithmic approach: Operation 1. Large-star(v): connect all strictly larger neighbors to the minimum of v's neighborhood, including v itself. Do this in parallel on each node and build a new graph. Theorems (KLMRV 14): executing Large-star in parallel preserves connectivity; every Large-star operation reduces the height of the tree by a constant factor.

Algorithmic approach: Operation 2. Small-star(v): connect all smaller neighbors, and v itself, to the minimum of v's neighborhood including v, i.e., connect all parents to the minimum parent. Theorem (KLMRV 14): executing Small-star in parallel preserves connectivity.

Final Algorithm: Combine Operations. Input: a set of edges with a unique ID per node. Algorithm: repeat until convergence: Large-star; Small-star. Theorem (KLMRV 14): the above algorithm converges in O(log^2 n) rounds.
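
A minimal single-machine sketch of the two operations and the alternating loop (in the real setting each round is a MapReduce over the edge list; here plain Python sets stand in for it, and node IDs are assumed to be comparable, e.g., integers).

```python
from collections import defaultdict

def neighborhoods(edges):
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs

def large_star(edges):
    """Large-star(v): connect every strictly larger neighbor of v to the
    minimum of v's neighborhood including v itself."""
    new_edges = set()
    for v, nb in neighborhoods(edges).items():
        m = min(nb | {v})
        new_edges.update((u, m) for u in nb if u > v)
    return new_edges

def small_star(edges):
    """Small-star(v): connect every neighbor u <= v, and v itself, to the
    minimum of v's neighborhood including v."""
    new_edges = set()
    for v, nb in neighborhoods(edges).items():
        smaller = {u for u in nb if u <= v} | {v}
        m = min(smaller)
        new_edges.update((u, m) for u in smaller if u != m)
    return new_edges

def connected_components(edges):
    """Alternate Large-star and Small-star until the edge set stabilizes;
    at convergence every edge links a node to its component's minimum ID."""
    edges = {tuple(sorted(e)) for e in edges}
    while True:
        new_edges = {tuple(sorted(e)) for e in small_star(large_star(edges))}
        if new_edges == edges:
            return edges
        edges = new_edges

# Example: a path 1-2-3-4 plus an isolated edge 7-8.
print(connected_components({(1, 2), (2, 3), (3, 4), (7, 8)}))
# returns {(1, 2), (1, 3), (1, 4), (7, 8)} (printed order may vary)
```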

Improved Connected Components in MR. Idea 1: alternate between Large-star and Small-star: fewer rounds compared to Hash-to-Min, less communication compared to Hash-Greater-to-Min. Theory: provable O(log^2 n) MR rounds. Optimization: avoid large-degree nodes by branching them into a tree of height two. Practice: graphs with 1T edges; public data with 10B edges; 2 to 20 times faster than Hash-to-Min (best of ICDE 12); takes 5 to 22 rounds on these graphs.

CC in MR + DHT Service. Idea 2: use a Distributed HashTable (DHT) service to save on #rounds. After a small number of rounds (e.g., after the 3rd round), consider all active cluster IDs and resolve their mapping in an in-memory array (e.g., using the DHT). Theory: Õ(log n) MR rounds + O(n/log n) memory. Practice: graphs with 1T edges; public data with 10B edges; 4.5 to 40 times faster than Hash-to-Min (best of the ICDE 12 paper), and 1.5 to 3 times faster than our best pure MR implementation; takes 3 to 5 rounds on these graphs.
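
An illustration of the in-memory resolution step (my own sketch, not the production code): after a few rounds the nodes point into a relatively small set of active cluster IDs, so the ID-to-ID mapping fits in memory (the DHT in the real system) and chains of IDs can be collapsed in one pass instead of spending more MapReduce rounds.

```python
def collapse_labels(mapping):
    """mapping: dict cluster_id -> smaller cluster_id (possibly chained).
    Collapse every chain to its final root in one in-memory pass."""
    resolved = {}

    def find(x):
        if x not in mapping or mapping[x] == x:
            return x
        if x not in resolved:
            resolved[x] = find(mapping[x])   # memoized pointer jumping
        return resolved[x]

    return {x: find(x) for x in mapping}

# Example: the chain 9 -> 7 -> 3 -> 1 collapses so that all three map to 1.
print(collapse_labels({9: 7, 7: 3, 3: 1}))   # {9: 1, 7: 1, 3: 1}
```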

Data Sets. 5 public and 5 internal Google graphs, e.g.: UK Web graph: 106M nodes, 6.6B edges [public]; Google+ subgraph: 178M nodes, 2.9B edges; keyword similarity: 371M nodes, 3.5B edges; document similarity: 4,700M nodes, 452B edges. Sequence of RMAT graphs [synthetic and public]: ~2^26, 2^28, 2^30, 2^32, 2^34 nodes; ~2B, 8B, 34B, 137B, 547B edges respectively. Algorithms: Min2Hash; Alternate Optimized (MR-based); our best MR + DHT implementation; Pregel implementation.

Speedup: Comparison with HTM

#Rounds: Comparing different algorithms

Comparison with Pregel

Warm-up: # connected components.

Warm-up: # connected components. (Figure: an example graph with nodes labeled by their components A, B, C.) We can compute the components and assign an ID to each component.

Warm-up: # connected components. After adding the private edges, it is possible to recompute the count by counting the number of newly connected components.
