Learning Graphs with Queries. Lev Reyzin Clique talk Fall 2006

Similar documents
Learning and Verifying Graphs Using Queries with a Focus on Edge Counting

SFI Talk. Lev Reyzin Yahoo! Research (work done while at Yale University) talk based on 2 papers, both with Dana Angluin and James Aspnes

Parallel Breadth First Search

Parameterized graph separation problems

On Covering a Graph Optimally with Induced Subgraphs

These notes present some properties of chordal graphs, a set of undirected graphs that are important for undirected graphical models.

LEARNING CIRCUITS AND NETWORKS BY INJECTING VALUES. Lev Reyzin Yale University

The strong chromatic number of a graph

How many leaves on the decision tree? There are n! leaves, because every permutation appears at least once.

Faster and Dynamic Algorithms For Maximal End-Component Decomposition And Related Graph Problems In Probabilistic Verification

Objective of the present work

Solution for Homework set 3

Reading for this lecture (Goodrich and Tamassia):

Deciding k-colorability of P 5 -free graphs in polynomial time

Network construction with subgraph connectivity constraints

Encoding Data Structures

Fundamental Properties of Graphs

Property Testing for Sparse Graphs: Structural graph theory meets Property testing

Math 170- Graph Theory Notes

Fast Skew Partition Recognition

On the computational complexity of the domination game

Highway Dimension and Provably Efficient Shortest Paths Algorithms

Distributed minimum spanning tree problem

On Universal Cycles of Labeled Graphs

Math 776 Graph Theory Lecture Note 1 Basic concepts

V10 Metabolic networks - Graph connectivity

These are not polished as solutions, but ought to give a correct idea of solutions that work. Note that most problems have multiple good solutions.

Graph Algorithms Using Depth First Search

PRAM Divide and Conquer Algorithms

arxiv: v4 [math.co] 4 Apr 2011

Colored trees in edge-colored graphs

Graph Theory and Network Measurment

Online Coloring Known Graphs

Exact Algorithms Lecture 7: FPT Hardness and the ETH

Dynamic Graph Coloring

Faster parameterized algorithms for Minimum Fill-In

Computational Geometry

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides

Algorithms for Minimum Spanning Trees

CONNECTIVITY CHECK IN 3-CONNECTED PLANAR GRAPHS WITH OBSTACLES

Lectures 8/9. 1 Overview. 2 Prelude:Routing on the Grid. 3 A couple of networks.

Bipartite Coverings and the Chromatic Number

CSE 431/531: Analysis of Algorithms. Greedy Algorithms. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo

Fixed Parameter Algorithms

Algorithmic Problems Related to Internet Graphs

K 4 C 5. Figure 4.5: Some well known family of graphs

DO NOT RE-DISTRIBUTE THIS SOLUTION FILE

Problem. Input: An array A = (A[1],..., A[n]) with length n. Output: a permutation A of A, that is sorted: A [i] A [j] for all. 1 i j n.

Review implementation of Stable Matching Survey of common running times. Turn in completed problem sets. Jan 18, 2019 Sprenkle - CSCI211

Evolution Module. 6.1 Phylogenetic Trees. Bob Gardner and Lev Yampolski. Integrated Biology and Discrete Math (IBMS 1300)

Soft Heaps And Minimum Spanning Trees

Cluster Editing with Locally Bounded Modifications Revisited

Connectivity, Graph Minors, and Subgraph Multiplicity

Characterizing Graphs (3) Characterizing Graphs (1) Characterizing Graphs (2) Characterizing Graphs (4)

Graph Theory S 1 I 2 I 1 S 2 I 1 I 2

Treewidth and graph minors

Faster parameterized algorithms for Minimum Fill-In

arxiv: v2 [cs.ds] 30 Sep 2016

Rainbow spanning trees in properly coloured complete graphs

Math 778S Spectral Graph Theory Handout #2: Basic graph theory

Minimum-Spanning-Tree problem. Minimum Spanning Trees (Forests) Minimum-Spanning-Tree problem

The Visibility Problem and Binary Space Partition. (slides by Nati Srebro)

Week 12: Minimum Spanning trees and Shortest Paths

Interleaving Schemes on Circulant Graphs with Two Offsets

Algorithms for GIS:! Quadtrees

Lecture 4: Primal Dual Matching Algorithm and Non-Bipartite Matching. 1 Primal/Dual Algorithm for weighted matchings in Bipartite Graphs

On Structural Parameterizations of the Matching Cut Problem

Seminar on Algorithms and Data Structures: Multiple-Edge-Fault-Tolerant Approximate Shortest-Path Trees [1]

Abstract. A graph G is perfect if for every induced subgraph H of G, the chromatic number of H is equal to the size of the largest clique of H.

Sublinear Time and Space Algorithms 2016B Lecture 7 Sublinear-Time Algorithms for Sparse Graphs

Lecture 1: A Monte Carlo Minimum Cut Algorithm

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

Straight-Line Drawings of 2-Outerplanar Graphs on Two Curves

MapReduce Algorithms. Barna Saha. March 28, 2016

The Price of Connectivity for Feedback Vertex Set

COT 6405 Introduction to Theory of Algorithms

Decreasing the Diameter of Bounded Degree Graphs

Trees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph.

Notes on Binary Dumbbell Trees

18.3 Deleting a key from a B-tree

Reference Sheet for CO142.2 Discrete Mathematics II

CS61BL. Lecture 5: Graphs Sorting

We can use a max-heap to sort data.

On Modularity Clustering. Group III (Ying Xuan, Swati Gambhir & Ravi Tiwari)

Acyclic Network. Tree Based Clustering. Tree Decomposition Methods

On 2-Subcolourings of Chordal Graphs

Graphs and Network Flows ISE 411. Lecture 7. Dr. Ted Ralphs

Definition For vertices u, v V (G), the distance from u to v, denoted d(u, v), in G is the length of a shortest u, v-path. 1

arxiv: v1 [cs.ds] 24 Apr 2013

HEAPS ON HEAPS* Downloaded 02/04/13 to Redistribution subject to SIAM license or copyright; see

The Encoding Complexity of Network Coding

Connected size Ramsey number for matchings vs. small stars or cycles

Design and Analysis of Algorithms

COMP Online Algorithms. Online Bin Packing. Shahin Kamali. Lecture 20 - Nov. 16th, 2017 University of Manitoba

Advanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2005 Simonas Šaltenis E1-215b

PTAS for Matroid Matching

V Advanced Data Structures

All-Pairs Nearly 2-Approximate Shortest-Paths in O(n 2 polylog n) Time

Discrete mathematics

A Note on Scheduling Parallel Unit Jobs on Hypercubes

Transcription:

Learning Graphs with Queries Lev Reyzin Clique talk Fall 2006

Some Papers Angluin, Chen. Learning a Hidden Graph Using O(log n) Queries Per Edge. (COLT 04) Bouvel, Grebinski, Kucherov. Combinatorial Search on Graphs Motivated by Bioinformatics Applications: A Brief Survey (WG 05) Reyzin, Srivastava. A Survey of Graph Learning with Queries (Manuscript 06) Hein. An Optimal Algorithm to Reconstruct Trees from Additive Distance Data. (Journal of Math Bio 89)

Hidden Graphs Oracle at Delphi

Queries Edge query pair of vertices in graph to detect presence of edge (social networks) Edge Counting query set of vertices in graph for number of edges (DNA sequencing) Edge Detection query set of vertices in graph for presence of edges (chemical reactions) Connectedness query pair of vertices in graph to discover if they re in same component (electrical networks) Shortest Path query two vertices in graph for shortest path length between them (evolutionary trees)

Edge Queries Edge Query: query pair of vertices in graph to detect presence of edge

Edge Detection Queries No Yes Yes Edge Detection Query: query a set of vertices in graph for presence of an edge

Edge Counting Queries 0 3 2 Edge Counting Query: query a set of vertices in a graph for number of edges in subgraph induced by the vertices

Shortest Path Queries Shortest Path: query two vertices in graph for shortest path length between them

Connectedness Queries Connectedness Queries: query pair of vertices in graph to discover if they re in same component

Relative Power of Queries upper bounds lower bounds

Target Classes of Graphs general graphs trees partitions

Results Summary n = number of nodes, E = number of edges, k = number of connected components, d = maximum degree E = Edge Query, ED = Edge Detection Query, EC = Edge Counting Query, SP = Shortest Path Query, C = Connectedness Query references omitted

Learning Partitions with ED Upper bound: O(n 2 ) trivial. query all pairs of vertices. Lower bound: Ω(n 2 ) Distinguishing vs K n/2 K n/2 K n/2 K n/2 requires Ω(n 2 ) by adversarial argument

Results Summary n = number of nodes, E = number of edges, k = number of connected components, d = maximum degree E = Edge Query, ED = Edge Detection Query, EC = Edge Counting Query, SP = Shortest Path Query, C = Connectedness Query references omitted

Learning Partitions with C Upper bound: O(nk) algorithm - Step 1: Place v 1 in its own component - Step i > 1: Query C(v i,v w ) for items v w from each existing component; if a yes is encountered, place v i in the corresponding component and continue to step i+1. Otherwise place v i in its own component.

An Example 1 1 7 3 5 1 2 1 3 2 6 1 3 2 4 2 4 8 1 3 2 5 4 partitions 1 3 2 5 6 4 1 3 7 2 5 6 4 1 3 7 2568 4

Learning Partitions with C Upper bound: O(nk) correctness: trivial running time: For complexity, note that there at most k components at any step (since there are at most k components at phase n and components are never destroyed); hence n vertices take at most nk queries.

Back to Our Table n = number of nodes, E = number of edges, k = number of connected components, d = maximum degree E = Edge Query, ED = Edge Detection Query, EC = Edge Counting Query, SP = Shortest Path Query, C = Connectedness Query references omitted

Learning Partitions with EC Lower bound: Ω(n) information theoretic argument: The number of partitions of an element set is given by the Bell number B n. lg(b n ) = n lg n (de Bruijn 81) each EC query gives lg(c(n,2)) = lg n bits. we need Ω((n lg n) / (lg n)) = Ω(n)

An Algorithm for the Upper Bound Upper bound: O(n log n) Phase 1: set C = {c 1 } with c 1 =1 Phase i : let v = (v i+1 ) query EC(C+v) if EC(C+v)=EC(C) add a new component c=v to C else split C into roughly equal halves C 1 and C 2 and query EC(C 1 +v),ec(c 2 +v). Recurse until EC({c j }+v) > EC(c j ) for a single component c j. Call c j a live component. Repeat recursively on C/c j until all live components are found. Merge them and v into 1 component in C.

An Example 1 1 7 3 5 1 2 1 2 3 2 6 8 1 2 3 4 4 partitions 1 2 3 4 5 1 2 56 3 4 13 7 2 56 4 13 7 2 568 4

Proof Sketch Correctness of the algorithm is simple by induction on the phase. C contains the components of G[1... i] at end of phase i Running time is bounded by O(n lg n) because we do O(lg n) queries to find each live component. But each time we find a live component, the number of components in our set decreases by 1. Since we can have at most n partitions, the total running time is bounded by O(n lg n)

Back to Our Table n = number of nodes, E = number of edges, k = number of connected components, d = maximum degree E = Edge Query, ED = Edge Detection Query, EC = Edge Counting Query, SP = Shortest Path Query, C = Connectedness Query references omitted

Learning Graphs with ED (Angluin, Chen COLT 04) Upper bound: O( E log n) Lemma 1: if S 1 and S 2 are two non-empty independent sets of vertices in G. We can find s edges between S 1 and S 2 in O(s log n) Case 1: Case 2:

Learning Graphs w/ ED - continued (Angluin, Chen COLT 04) Upper bound: O( E log n) Fact: A graph with m edges can be O(m) 1/2 colored we can collapse pairs of vertices not joined by an edge, until we get a clique of m edges. It can be O(t)-colored and has O(t 2 ) vertices. Lemma 2: if S 1 and S 2 ( S 1 < S 2 are two non-empty sets of vertices in G (w/ s 1 and s 2 edges respectively). We can find s edges between S 1 and S 2 in O(s*log S 2 +s 1 +s 2 ) we color both S 1 and S 2 with (s 1 ) 1/2 and (s 2 ) 1/2 colors for each pair of color classes in S 1 and S 2, we query the union recurse

Learning Graphs w/ ED - continued (Angluin, Chen COLT 04) Upper bound: O( E log n) The Algorithm: 1) If V = 2, mark pair of vertices as edge andreturn 2) Divide V into halves S1 and S2. Ask ED(S 1 ) and ED(S 1 ) 3) Recursively solve the problem for S i if ED(S i )=1 4) Using lemma 2, find edges between S 1 and S 2

Back to Our Table n = number of nodes, E = number of edges, k = number of connected components, d = maximum degree E = Edge Query, ED = Edge Detection Query, EC = Edge Counting Query, SP = Shortest Path Query, C = Connectedness Query references omitted

Learning Trees with SP Queries (Hein, 89) Upper bound: O(d n log n) Proof intuition: 0 0 0 0 0 0 0 3 4 4 1 0 0 0 0 3 2 3 0 0 0 0 0 0

Open Problems and Future Work Close the gap for partition with EC queries Consider other queries: traceroute queries, beacon queries, Graph Verification. Given a result from a certain class, how hard is it to verify. Very interesting: verifying general graphs with EC queries.

Beacon and Traceroute Queries Target 2 Beacon 1 2 Traceroute 3 3 1

Verification

Thank You Questions?