Advanced Query Processing

Size: px
Start display at page:

Download "Advanced Query Processing"

Transcription

1 Advanced Query Processing CSEP544 Nov CSEP544 Optimal Sequential Algorithms Nov / 28

2 Lecture 9: Advanced Query Processing Optimal Sequential Algorithms. Semijoin Reduction Optimal Parallel Algorithms. CSEP544 Optimal Sequential Algorithms Nov / 28

3 Bibliography E Friedgut, Hypergraphs, entropy, and inequalities, American Mathematical Monthly, , Albert Atserias, Martin Grohe, Dniel Marx: Size Bounds and Query Plans for Relational Joins. SIAM J. Comput. 42(4): (2013) Hung Q. Ngo, Christopher Ré, Atri Rudra: Skew strikes back: new developments in the theory of join algorithms. SIGMOD Record 42(4): 5-16 (2013) Paul Beame, Paraschos Koutris, Dan Suciu: Skew in parallel query processing. PODS 2014: Paul Beame, Paraschos Koutris, Dan Suciu: Communication steps for parallel query processing. PODS 2013: CSEP544 Optimal Sequential Algorithms Nov / 28

4 Natural Join R S Joins R, S on all common attributes, removes duplicate attributes R A B a 1 b 1 a 1 b 2 a 2 b 3 a 3 b 4 S A C a 1 c 1 a 1 c 2 a 3 c 3 a 4 c 4 T = R S A B C a 1 b 1 c 1 a 1 b 1 c 2 a 1 b 2 c 1 a 1 b 2 c 2 a 3 b 4 c 3 Input schemas: R(A, B), S(A, C) Output schema: T (A, B, C) CSEP544 Optimal Sequential Algorithms Nov / 28

5 Very Quick Review of Basic Join Algorithms Compute R A=B S Nested-loop join Hash-join Merge-join (To describe in class.) Complexity: O(( R + S + R A=B S ) log( R + S )) Ignoring log factors, Complexity: O( Input + Output ) CSEP544 Optimal Sequential Algorithms Nov / 28

6 Conjunctive Queries Example Q 1 (x, y, z, u) = R(x, y), S(y, z), T (z, u) Relational Algebra: (R(x, y) S(y, z)) T (z, u) First Order Logic: Q 1 = {(x, y, z, u) (x, y) R (y, z) S (z, u) T } SQL: select from where CSEP544 Optimal Sequential Algorithms Nov / 28

7 Conjunctive Queries Example Q 1 (x, y, z, u) = R(x, y), S(y, z), T (z, u) Relational Algebra: (R(x, y) S(y, z)) T (z, u) First Order Logic: Q 1 = {(x, y, z, u) (x, y) R (y, z) S (z, u) T } SQL: select from where Example Q 2 (x, u) = R(x, y), S(y, z), T (z, u) Relational Algebra: Π x,u ((R(x, y) S(y, z)) T (z, u)) First Order Logic: Q 1 = {(x, u) y z((x, y) R (y, z) S (z, u) T )} SQL: select from where CSEP544 Optimal Sequential Algorithms Nov / 28

8 Traditional Approach to Computing Conjunctive Queries Q(x, y, z) = R(x, y), S(y, z), T (z, x) Optimizer generates a query plan: Temp(x, y, z) =R(x, y) S(y, z) Q(x, y, z) =Temp(x, y, z) T (z, x) Optimizers examines many possible plans, evaluates the cheapest plan. Problem: intermediate results may be large, and very hard to estimate. CSEP544 Optimal Sequential Algorithms Nov / 28

9 Upper Bound on the Size of the Answer Consider the join of two relations: Q(x, y, z) = R(x, y), S(y, z) Question If R = m 1, S = m 2, how large can Q be? CSEP544 Optimal Sequential Algorithms Nov / 28

10 Upper Bound on the Size of the Answer Consider the join of two relations: Q(x, y, z) = R(x, y), S(y, z) Question If R = m 1, S = m 2, how large can Q be? Can be 0 CSEP544 Optimal Sequential Algorithms Nov / 28

11 Upper Bound on the Size of the Answer Consider the join of two relations: Q(x, y, z) = R(x, y), S(y, z) Question If R = m 1, S = m 2, how large can Q be? Can be 0 Can be m 1 m 2 CSEP544 Optimal Sequential Algorithms Nov / 28

12 Upper Bound on the Size of the Answer Consider the join of two relations: Q(x, y, z) = R(x, y), S(y, z) Question If R = m 1, S = m 2, how large can Q be? Can be 0 Can be m 1 m 2 Answer: 0 Q m 1 m 2. CSEP544 Optimal Sequential Algorithms Nov / 28

13 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? CSEP544 Optimal Sequential Algorithms Nov / 28

14 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? Naive answer: m 1 m 2 m 3 (why?) CSEP544 Optimal Sequential Algorithms Nov / 28

15 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? Naive answer: m 1 m 2 m 3 (why?) Better answer: m 1 m 2 (why?) CSEP544 Optimal Sequential Algorithms Nov / 28

16 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? Naive answer: m 1 m 2 m 3 (why?) Better answer: m 1 m 2 (why?) But also: m 1 m 3, m 2 m 3 CSEP544 Optimal Sequential Algorithms Nov / 28

17 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? Naive answer: m 1 m 2 m 3 (why?) Better answer: m 1 m 2 (why?) But also: m 1 m 3, m 2 m 3 We will show: Also (and better!): Q m 1 m 2 m 3 There exists an algorithm that computes Q in time min(all the above) How this generalizes to any conjunctive query. CSEP544 Optimal Sequential Algorithms Nov / 28

18 The Hypergraph of a Query Definition Let Q be a full conjunctive query without self-joins. The hypergraph G of Q consists of: Nodes(G) = Vars(Q) the set of variables of Q HyperEdges(G) = Atoms(Q) the set of atoms of Q. CSEP544 Optimal Sequential Algorithms Nov / 28

19 The Hypergraph of a Query Definition Let Q be a full conjunctive query without self-joins. The hypergraph G of Q consists of: Nodes(G) = Vars(Q) the set of variables of Q HyperEdges(G) = Atoms(Q) the set of atoms of Q. Q(x,y,z) = R(x,y),S(y,z),T(z,x) x z y Q(x,y,z) = R(x,y,z),S(x),T(y),K(z),M(x,u) u x z y CSEP544 Optimal Sequential Algorithms Nov / 28

20 Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition An edge cover = subset of edges that contain all nodes. Full conjunctive query: Q(x) = R 1 (x 1 ),..., R l (x l ) Relation sizes: R 1 = m 1,..., R l = m l Proposition (Simple!) Let R i1,..., R iu be any edge cover. Then Q m i1 m i2 m iu (proof in class) CSEP544 Optimal Sequential Algorithms Nov / 28

21 Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition An edge cover = subset of edges that contain all nodes. Full conjunctive query: Q(x) = R 1 (x 1 ),..., R l (x l ) Relation sizes: R 1 = m 1,..., R l = m l Proposition (Simple!) Let R i1,..., R iu be any edge cover. Then Q m i1 m i2 m iu (proof in class) CSEP544 Optimal Sequential Algorithms Nov / 28

22 Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition An edge cover = subset of edges that contain all nodes. Full conjunctive query: Q(x) = R 1 (x 1 ),..., R l (x l ) Relation sizes: R 1 = m 1,..., R l = m l Proposition (Simple!) Let R i1,..., R iu be any edge cover. Then Q m i1 m i2 m iu (proof in class) CSEP544 Optimal Sequential Algorithms Nov / 28

23 Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition An edge cover = subset of edges that contain all nodes. Full conjunctive query: Q(x) = R 1 (x 1 ),..., R l (x l ) Relation sizes: R 1 = m 1,..., R l = m l Proposition (Simple!) Let R i1,..., R iu be any edge cover. Then Q m i1 m i2 m iu (proof in class) CSEP544 Optimal Sequential Algorithms Nov / 28

24 Fractional Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition A fractional edge cover = sequence of positive numbers u 1,..., u l s.t.: i j x i R j u j 1 Theorem (AGM 13) Let u 1,..., u l be any fractional edge cover. Then Q m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28

25 Fractional Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition A fractional edge cover = sequence of positive numbers u 1,..., u l s.t.: i j x i R j u j 1 Theorem (AGM 13) Let u 1,..., u l be any fractional edge cover. Then Q m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28

26 Fractional Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition A fractional edge cover = sequence of positive numbers u 1,..., u l s.t.: i j x i R j u j 1 Theorem (AGM 13) Let u 1,..., u l be any fractional edge cover. Then Q m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28

27 Examples Q(x, y, z) = R(x, y), S(y, z), T (z, x) R = S = T = m AGM u (Q) = m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28

28 Examples Q(x, y, z) = R(x, y), S(y, z), T (z, x) R = S = T = m AGM u (Q) = m u 1 1 mu 2 2 mu l l A fractional edge: u = (1/2, 1/2, 1/2) 1/2 x 1/2 z 1/2 y CSEP544 Optimal Sequential Algorithms Nov / 28

29 Examples Q(x, y, z) = R(x, y), S(y, z), T (z, x) R = S = T = m AGM u (Q) = m u 1 1 mu 2 2 mu l l A fractional edge: u = (1/2, 1/2, 1/2) 1/2 x 1/2 z 1/2 y It follows that Q m 1/2 m 1/2 m 1/2 = m 3/2 With m edges you can built at most m 3/2 triangles! CSEP544 Optimal Sequential Algorithms Nov / 28

30 AGM Bound Definition AGM(Q) = min u m u 1 1 mu 2 2 mu l l Thus: Q AGM(Q). Example Q(x, y, z) = R(x, y), S(y, z), T (z, x), R = m 1, S = m 2, T = m 3 u = (1, 1, 0) (1, 0, 1) (0, 1, 1) ( 1 2, 1 2, 1 2 ) AGM(Q) = min of m 1 m 2 m 1 m 3 m 2 m 3 (m 1 m 2 m 3 ) 1/2 Example Q(x, y, z, v, w) = R(x, y), S(y, z), T (z, v), K(v, w) u = (1, 0, 1, 1) (1, 1, 0, 1) AGM(Q) = min of m 1 m 3 m 4 m 1 m 2 m 4 CSEP544 Optimal Sequential Algorithms Nov / 28

31 AGM Bound Definition AGM(Q) = min u m u 1 1 mu 2 2 mu l l Thus: Q AGM(Q). Example Q(x, y, z) = R(x, y), S(y, z), T (z, x), R = m 1, S = m 2, T = m 3 u = (1, 1, 0) (1, 0, 1) (0, 1, 1) ( 1 2, 1 2, 1 2 ) AGM(Q) = min of m 1 m 2 m 1 m 3 m 2 m 3 (m 1 m 2 m 3 ) 1/2 Example Q(x, y, z, v, w) = R(x, y), S(y, z), T (z, v), K(v, w) u = (1, 0, 1, 1) (1, 1, 0, 1) AGM(Q) = min of m 1 m 3 m 4 m 1 m 2 m 4 CSEP544 Optimal Sequential Algorithms Nov / 28

32 AGM Bound Definition AGM(Q) = min u m u 1 1 mu 2 2 mu l l Thus: Q AGM(Q). Example Q(x, y, z) = R(x, y), S(y, z), T (z, x), R = m 1, S = m 2, T = m 3 u = (1, 1, 0) (1, 0, 1) (0, 1, 1) ( 1 2, 1 2, 1 2 ) AGM(Q) = min of m 1 m 2 m 1 m 3 m 2 m 3 (m 1 m 2 m 3 ) 1/2 Example Q(x, y, z, v, w) = R(x, y), S(y, z), T (z, v), K(v, w) u = (1, 0, 1, 1) (1, 1, 0, 1) AGM(Q) = min of m 1 m 3 m 4 m 1 m 2 m 4 CSEP544 Optimal Sequential Algorithms Nov / 28

33 The Worst-Case Query Output Question Can the query output ever get as large as the AGM bound? AGM u (Q) = m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28

34 The Worst-Case Query Output Question Can the query output ever get as large as the AGM bound? AGM u (Q) = m u 1 1 mu 2 2 mu l l Example Q(x, y, z) = R(x, y), S(y, z), T (z, x) Find three relations R = S = T = m such that Q = m 3/2 CSEP544 Optimal Sequential Algorithms Nov / 28

35 The Worst-Case Query Output Question Can the query output ever get as large as the AGM bound? AGM u (Q) = m u 1 1 mu 2 2 mu l l Example Q(x, y, z) = R(x, y), S(y, z), T (z, x) Find three relations R = S = T = m such that Q = m 3/2 Answer: let n = m 1/2, and R = S = T = [n] [n]. Then Q = n 3 = m 3/2. CSEP544 Optimal Sequential Algorithms Nov / 28

36 Background in Algebra: Duality of Linear Programs AGM u (Q) = m u 1 1 mu 2 2 mu l l AGM(Q) = min u AGM u (Q) is the optimal solution to: minimize u j log m j j i j x i R j u j 1 CSEP544 Optimal Sequential Algorithms Nov / 28

37 Background in Algebra: Duality of Linear Programs AGM u (Q) = m u 1 1 mu 2 2 mu l l AGM(Q) = min u AGM u (Q) is the optimal solution to: minimize u j log m j j i j x i R j u j 1 Fractional Edge Cover maximize v i i j v i log m j i x i R j Fractional Vertex Packing CSEP544 Optimal Sequential Algorithms Nov / 28

38 Background in Algebra: Duality of Linear Programs AGM u (Q) = m u 1 1 mu 2 2 mu l l AGM(Q) = min u AGM u (Q) is the optimal solution to: minimize u j log m j j i j x i R j u j 1 Fractional Edge Cover maximize v i i j v i log m j i x i R j Fractional Vertex Packing Theorem (Strong Duality of Linear Programs) min u j u j log m j = max v v i i CSEP544 Optimal Sequential Algorithms Nov / 28

39 Background in Algebra: Duality of Linear Programs Q(x, y, z) = R(x, y), S(y, z), T (z, x) 1/2 x 1/2 z 1/2 y min(u R log R + u S log S + u T log T ) x u R + u T 1 y u R + u S 1 z u S + u T 1 CSEP544 Optimal Sequential Algorithms Nov / 28

40 Background in Algebra: Duality of Linear Programs Q(x, y, z) = R(x, y), S(y, z), T (z, x) 1/2 x 1/2 z 1/2 y min(u R log R + u S log S + u T log T ) x u R + u T 1 y u R + u S 1 z u S + u T 1 max(v x + v y + v z) R v x + v y log R S v y + v z log S T v x + v z log T Strong duality theorem: min u R log R + u S log S + u T log T = max v x + v y + v z. CSEP544 Optimal Sequential Algorithms Nov / 28

41 Background in Algebra: Duality of Linear Programs Q(x, y, z) = R(x, y), S(y, z), T (z, x) 1/2 x 1/2 z 1/2 y min(u R log R + u S log S + u T log T ) x u R + u T 1 y u R + u S 1 z u S + u T 1 max(v x + v y + v z) R v x + v y log R S v y + v z log S T v x + v z log T Strong duality theorem: min u R log R + u S log S + u T log T = max v x + v y + v z. In class: what are the optimal solutions in these cases: R = S = T R = S T CSEP544 Optimal Sequential Algorithms Nov / 28

42 The AGM Bound is Tight AGM u (Q) = m u 1 1 mu 2 2 mu l l AGM(Q) = min u AGM u (Q) is the optimal solution to: minimize u j log m j j i j x i R j u j 1 maximize v i i j v i log m j i x i R j Theorem The AGM bound is tight Proof: start with an optimal solution v i. Define R(x 1, x 5, x 8 ) = [2 v 1 ] [2 v 5 ] [2 v 8 ] etc Then Q = 2 v 1+v = 2 u 1 log m 1 +u 2 log m = m u 1 1 mu 2 2 CSEP544 Optimal Sequential Algorithms Nov / 28

43 Computing Full Conjunctive Queries Recall: all database systems compute one join at a time This may be much larger than the maximum output size, AGM(Q). Goal: design a algorithm that runs in time AGM(Q). Worst-Case-Optimal algorithm: runs in time AGM(Q). CSEP544 Optimal Sequential Algorithms Nov / 28

44 Generic Join Compute Q(x) = R 1 (x 1 ),..., R l (x l ) If x = 1 then return R 1 R l. Otherwise, choose a variable x Assuem it occurrs in atoms R i1,..., R ik Compute A = Π x (R i1 ) Π x (R ik ) For each a A, compute Result a = Q[a/x] using Generic-Join Return a Result a. Runtime: O(AGM(Q)) (Plus a log n factor for index lookup) CSEP544 Optimal Sequential Algorithms Nov / 28

45 Generic Join Compute Q(x) = R 1 (x 1 ),..., R l (x l ) If x = 1 then return R 1 R l. Otherwise, choose a variable x Assuem it occurrs in atoms R i1,..., R ik Compute A = Π x (R i1 ) Π x (R ik ) For each a A, compute Result a = Q[a/x] using Generic-Join Return a Result a. Runtime: O(AGM(Q)) (Plus a log n factor for index lookup) CSEP544 Optimal Sequential Algorithms Nov / 28

46 Generic Join Example Q(x, y, z) = R(x, y), S(y, z), T (z, x) Compute A = Π x (R) Π x (T ) = {a 1,..., a n } For each a i A, denote R (y) = R(a i, y), T (z) = T (z, a i ) Compute Result i (a i, y, z) = R (y), S(y, z), T (z) Return i Result i Runtime: O(m 3/2 ) assuming R = S = T = m. CSEP544 Optimal Sequential Algorithms Nov / 28

47 Details of Generic Join Fix variable order: x 1, x 2,... (AGM bound always holds, but in practice the order can matter a lot) Order/index the relations accordingly. For example, R(x 3, x 6, x 7 ) has a B+-index on x 3, x 6, x 7. Computing A = Π x (R i1 ) Π x (R ik ) is similar to multi-way merge join. Must ensure that runtime is min( R i1, R i2,...). When we iterate a A, we are making one more binding in all indexes. LeapFrog Tree Join (by LogicBlox) is based on these ideas. CSEP544 Optimal Sequential Algorithms Nov / 28

48 Details of Generic Join Q(x, y, z) = R(x, y), S(y, z), T (z, x) Compute A = Π x (R) Π x (T ) = {a 1,..., a n } For each a i A, denote R (y) = R(a i, y), T (z) = T (z, a i ) Compute Result i (a i, y, z) = R (y), S(y, z), T (z) Return i Result i Runtime: O(m 3/2 ) assuming R = S = T = m. Variable order: x, y, z. What happens in each case? Complete bipartite graph: R = S = T = [m 1/2 ] [m 1/2 ]. Skewed at x: R = [1] [m], S [m] [m], T = [m] [1]. Skewed at y: R = [m] [1], S = [1] [m], T [m] [m] Skewed at z: R [m] [m], S = [m] [1], T [1] [m] CSEP544 Optimal Sequential Algorithms Nov / 28

49 Discussion Major advantage over one-join-at-at-time algorithms for cyclic queries! For acyclic queries, the story is more complex (discussed later) We haven t proven its optimality yet: will do this next. CSEP544 Optimal Sequential Algorithms Nov / 28

50 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 CSEP544 Optimal Sequential Algorithms Nov / 28

51 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 CSEP544 Optimal Sequential Algorithms Nov / 28

52 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i ) w CSEP544 Optimal Sequential Algorithms Nov / 28

53 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): Theorem (Friedgut 04) i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i Let Q(x) = R 1 (x 1 ),..., R l (x l ) be a query and u 1,..., u l be a fractional edge cover. Then: 1 u 1 1 u l u x a 1,x1 a l,xl ( x1 a 1 ( a u l xl 1,x 1 ) l,x l ) ) w What are the queries in the examples above? CSEP544 Optimal Sequential Algorithms Nov / 28

54 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): Theorem (Friedgut 04) i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i Let Q(x) = R 1 (x 1 ),..., R l (x l ) be a query and u 1,..., u l be a fractional edge cover. Then: 1 u 1 1 u l u x a 1,x1 a l,xl ( x1 a 1 ( a u l xl 1,x 1 ) l,x l ) ) w What are the queries in the examples above? Q Cauchy-Schwartz (x) = R(x), S(x); CSEP544 Optimal Sequential Algorithms Nov / 28

55 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): Theorem (Friedgut 04) i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i Let Q(x) = R 1 (x 1 ),..., R l (x l ) be a query and u 1,..., u l be a fractional edge cover. Then: 1 u 1 1 u l u x a 1,x1 a l,xl ( x1 a 1 ( a u l xl 1,x 1 ) l,x l ) ) w What are the queries in the examples above? Q Cauchy-Schwartz (x) = R(x), S(x); Q triangle (x, y, z) = R(x, y), S(y, z), T (z, x); CSEP544 Optimal Sequential Algorithms Nov / 28

56 What are the queries in the examples above? Q Cauchy-Schwartz (x) = R(x), S(x); Q triangle (x, y, z) = R(x, y), S(y, z), T (z, x); Q Hölder (x) = R(x), S(x), T (x) CSEP544 Optimal Sequential Algorithms Nov / 28 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): Theorem (Friedgut 04) i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i Let Q(x) = R 1 (x 1 ),..., R l (x l ) be a query and u 1,..., u l be a fractional edge cover. Then: 1 u 1 1 u l u x a 1,x1 a l,xl ( x1 a 1 ( a u l xl 1,x 1 ) l,x l ) ) w

57 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l CSEP544 Optimal Sequential Algorithms Nov / 28

58 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x CSEP544 Optimal Sequential Algorithms Nov / 28

59 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. CSEP544 Optimal Sequential Algorithms Nov / 28

60 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) CSEP544 Optimal Sequential Algorithms Nov / 28

61 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz byz u2 yz x axy u1 czx u3 group by x CSEP544 Optimal Sequential Algorithms Nov / 28

62 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz b u2 yz yz( x byz u2 yz x axy u1 czx u3 group by x a xy ) u1 ( c zx ) u3 Hölder u 1 + u 3 1 x CSEP544 Optimal Sequential Algorithms Nov / 28

63 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz b u2 yz yz( x = byza u2 u1 y C u3 yz byz u2 yz x axy u1 czx u3 group by x a xy ) u1 ( c zx ) u3 Hölder u 1 + u 3 1 x z ( yz b yz ) u2 ( y A y ) u1 ( C z ) u3 Induction for Q z CSEP544 Optimal Sequential Algorithms Nov / 28

64 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz b u2 yz yz( x = byza u2 u1 y C u3 yz =( yz b yz ) u2 ( xy byz u2 yz x axy u1 czx u3 group by x a xy ) u1 ( c zx ) u3 Hölder u 1 + u 3 1 x z ( yz b yz ) u2 ( y a xy ) u1 ( c zx ) u3 zx A y ) u1 ( C z ) u3 Induction for Q z CSEP544 Optimal Sequential Algorithms Nov / 28

65 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz b u2 yz yz( x = byza u2 u1 y C u3 yz =( yz b yz ) u2 ( xy byz u2 yz x axy u1 czx u3 group by x a xy ) u1 ( c zx ) u3 Hölder u 1 + u 3 1 x z ( yz b yz ) u2 ( y a xy ) u1 ( c zx ) u3 zx A y ) u1 ( C z ) u3 Induction for Q z QED CSEP544 Optimal Sequential Algorithms Nov / 28

66 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28

67 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28

68 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28

69 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28

70 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28

71 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28

72 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28

73 Proof of Optimality for Generic Join Use exactly the same induction step as in Friedgut s inequality. CSEP544 Optimal Sequential Algorithms Nov / 28

Join Algorithms. Qiu, Yuan and Zhang, Haoqian

Join Algorithms. Qiu, Yuan and Zhang, Haoqian Join Algorithms Qiu, Yuan and Zhang, Haoqian Contents 1. Join algorithms in the lecture a. Nested-Loop Join b. Sort-merge Join c. Hash Join 2. Preliminaries a. Join as Hypergraphs b. The AGM bound for

More information

TECHNICAL REPORT Leapfrog Triejoin: A Worst-Case Optimal Join Algorithm. October 1, 2012 Todd L. Veldhuizen LB1201

TECHNICAL REPORT Leapfrog Triejoin: A Worst-Case Optimal Join Algorithm. October 1, 2012 Todd L. Veldhuizen LB1201 TECHNICAL REPORT Leapfrog Triejoin: A Worst-Case Optimal Join Algorithm October, 22 Todd L. Veldhuizen LB2 Leapfrog Triejoin: a worst-case optimal join algorithm Todd L. Veldhuizen Contents Introduction......................................

More information

Lecture 1: Conjunctive Queries

Lecture 1: Conjunctive Queries CS 784: Foundations of Data Management Spring 2017 Instructor: Paris Koutris Lecture 1: Conjunctive Queries A database schema R is a set of relations: we will typically use the symbols R, S, T,... to denote

More information

1 Linear programming relaxation

1 Linear programming relaxation Cornell University, Fall 2010 CS 6820: Algorithms Lecture notes: Primal-dual min-cost bipartite matching August 27 30 1 Linear programming relaxation Recall that in the bipartite minimum-cost perfect matching

More information

Massively Parallel Communication and Query Evaluation

Massively Parallel Communication and Query Evaluation Massively Parallel Communication and Query Evaluation Paul Beame U. of Washington Based on joint work with ParaschosKoutrisand Dan Suciu [PODS 13], [PODS 14] 1 Massively Parallel Systems 2 MapReduce [Dean,Ghemawat

More information

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17 Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa

More information

Query Evaluation and Optimization

Query Evaluation and Optimization Query Evaluation and Optimization Jan Chomicki University at Buffalo Jan Chomicki () Query Evaluation and Optimization 1 / 21 Evaluating σ E (R) Jan Chomicki () Query Evaluation and Optimization 2 / 21

More information

Parallel Breadth First Search

Parallel Breadth First Search CSE341T/CSE549T 11/03/2014 Lecture 18 Parallel Breadth First Search Today, we will look at a basic graph algorithm, breadth first search (BFS). BFS can be applied to solve a variety of problems including:

More information

Size and Treewidth Bounds for Conjunctive Queries

Size and Treewidth Bounds for Conjunctive Queries Size and Treewidth Bounds for Conjunctive Queries Georg Gottlob Computing Laboratory and Oxford Man Institute of Quantitative Finance, University of Oxford,UK ggottlob@gmail.com Stephanie Tien Lee Computing

More information

Pentagons vs. triangles

Pentagons vs. triangles Discrete Mathematics 308 (2008) 4332 4336 www.elsevier.com/locate/disc Pentagons vs. triangles Béla Bollobás a,b, Ervin Győri c,1 a Trinity College, Cambridge CB2 1TQ, UK b Department of Mathematical Sciences,

More information

Mathematical and Algorithmic Foundations Linear Programming and Matchings

Mathematical and Algorithmic Foundations Linear Programming and Matchings Adavnced Algorithms Lectures Mathematical and Algorithmic Foundations Linear Programming and Matchings Paul G. Spirakis Department of Computer Science University of Patras and Liverpool Paul G. Spirakis

More information

Lecture 3: Totally Unimodularity and Network Flows

Lecture 3: Totally Unimodularity and Network Flows Lecture 3: Totally Unimodularity and Network Flows (3 units) Outline Properties of Easy Problems Totally Unimodular Matrix Minimum Cost Network Flows Dijkstra Algorithm for Shortest Path Problem Ford-Fulkerson

More information

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory

More information

Optimal Query Processing Meets Information Theory

Optimal Query Processing Meets Information Theory Optimal Query Processing Meets Information Theory Dan Suciu University of Washington Mahmoud Abo-Khamis Hung Ngo RelationalAI Inc. [PODS 2016] [PODS 2017] Basic Question What is the optimal runtime to

More information

Evaluation of relational operations

Evaluation of relational operations Evaluation of relational operations Iztok Savnik, FAMNIT Slides & Textbook Textbook: Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems, McGraw-Hill, 3 rd ed., 2007. Slides: From Cow Book

More information

Notes for Lecture 20

Notes for Lecture 20 U.C. Berkeley CS170: Intro to CS Theory Handout N20 Professor Luca Trevisan November 13, 2001 Notes for Lecture 20 1 Duality As it turns out, the max-flow min-cut theorem is a special case of a more general

More information

Approximation Algorithms: The Primal-Dual Method. My T. Thai

Approximation Algorithms: The Primal-Dual Method. My T. Thai Approximation Algorithms: The Primal-Dual Method My T. Thai 1 Overview of the Primal-Dual Method Consider the following primal program, called P: min st n c j x j j=1 n a ij x j b i j=1 x j 0 Then the

More information

Database Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week.

Database Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week. Database Systems ( 料 ) December 13/14, 2006 Lecture #10 1 Announcement Assignment #4 is due next week. 2 1 Overview of Query Evaluation Chapter 12 3 Outline Query evaluation (Overview) Relational Operator

More information

Combinatorial Optimization

Combinatorial Optimization Combinatorial Optimization Frank de Zeeuw EPFL 2012 Today Introduction Graph problems - What combinatorial things will we be optimizing? Algorithms - What kind of solution are we looking for? Linear Programming

More information

Query Evaluation (i)

Query Evaluation (i) ICS 421 Spring 2010 Query Evaluation (i) Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa 1 Query Enumerate

More information

CS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 36

CS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 36 CS 473: Algorithms Ruta Mehta University of Illinois, Urbana-Champaign Spring 2018 Ruta (UIUC) CS473 1 Spring 2018 1 / 36 CS 473: Algorithms, Spring 2018 LP Duality Lecture 20 April 3, 2018 Some of the

More information

8 Matroid Intersection

8 Matroid Intersection 8 Matroid Intersection 8.1 Definition and examples 8.2 Matroid Intersection Algorithm 8.1 Definitions Given two matroids M 1 = (X, I 1 ) and M 2 = (X, I 2 ) on the same set X, their intersection is M 1

More information

λ -Harmonious Graph Colouring

λ -Harmonious Graph Colouring λ -Harmonious Graph Colouring Lauren DeDieu McMaster University Southwestern Ontario Graduate Mathematics Conference June 4th, 201 What is a graph? What is vertex colouring? 1 1 1 2 2 Figure : Proper Colouring.

More information

{ 1} Definitions. 10. Extremal graph theory. Problem definition Paths and cycles Complete subgraphs

{ 1} Definitions. 10. Extremal graph theory. Problem definition Paths and cycles Complete subgraphs Problem definition Paths and cycles Complete subgraphs 10. Extremal graph theory 10.1. Definitions Let us examine the following forbidden subgraph problems: At most how many edges are in a graph of order

More information

Logical Aspects of Massively Parallel and Distributed Systems

Logical Aspects of Massively Parallel and Distributed Systems Logical Aspects of Massively Parallel and Distributed Systems Frank Neven Hasselt University PODS Tutorial June 29, 2016 PODS June 29, 2016 1 / 62 Logical aspects of massively parallel and distributed

More information

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection

More information

K 4 C 5. Figure 4.5: Some well known family of graphs

K 4 C 5. Figure 4.5: Some well known family of graphs 08 CHAPTER. TOPICS IN CLASSICAL GRAPH THEORY K, K K K, K K, K K, K C C C C 6 6 P P P P P. Graph Operations Figure.: Some well known family of graphs A graph Y = (V,E ) is said to be a subgraph of a graph

More information

RELATIONAL OPERATORS #1

RELATIONAL OPERATORS #1 RELATIONAL OPERATORS #1 CS 564- Spring 2018 ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? Algorithms for relational operators: select project 2 ARCHITECTURE OF A DBMS query

More information

Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected

Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected graph G with at least 2 vertices contains at least 2

More information

Advanced Operations Research Techniques IE316. Quiz 1 Review. Dr. Ted Ralphs

Advanced Operations Research Techniques IE316. Quiz 1 Review. Dr. Ted Ralphs Advanced Operations Research Techniques IE316 Quiz 1 Review Dr. Ted Ralphs IE316 Quiz 1 Review 1 Reading for The Quiz Material covered in detail in lecture. 1.1, 1.4, 2.1-2.6, 3.1-3.3, 3.5 Background material

More information

Lecture 7. s.t. e = (u,v) E x u + x v 1 (2) v V x v 0 (3)

Lecture 7. s.t. e = (u,v) E x u + x v 1 (2) v V x v 0 (3) COMPSCI 632: Approximation Algorithms September 18, 2017 Lecturer: Debmalya Panigrahi Lecture 7 Scribe: Xiang Wang 1 Overview In this lecture, we will use Primal-Dual method to design approximation algorithms

More information

In this lecture, we ll look at applications of duality to three problems:

In this lecture, we ll look at applications of duality to three problems: Lecture 7 Duality Applications (Part II) In this lecture, we ll look at applications of duality to three problems: 1. Finding maximum spanning trees (MST). We know that Kruskal s algorithm finds this,

More information

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #10: Query Processing Outline introduction selection projection join set & aggregate operations Prakash 2018 VT CS 4604 2

More information

arxiv: v7 [cs.db] 5 Jan 2017

arxiv: v7 [cs.db] 5 Jan 2017 EmptyHeaded: A Relational Engine for Graph Processing Christopher R. Aberger Stanford University caberger@stanford.edu Susan Tu Stanford University sctu@stanford.edu Christopher Ré Stanford University

More information

Maximum flow problem CE 377K. March 3, 2015

Maximum flow problem CE 377K. March 3, 2015 Maximum flow problem CE 377K March 3, 2015 Informal evaluation results 2 slow, 16 OK, 2 fast Most unclear topics: max-flow/min-cut, WHAT WILL BE ON THE MIDTERM? Most helpful things: review at start of

More information

Evaluation of Relational Operations

Evaluation of Relational Operations Evaluation of Relational Operations Chapter 14 Comp 521 Files and Databases Fall 2010 1 Relational Operations We will consider in more detail how to implement: Selection ( ) Selects a subset of rows from

More information

Optimization of logical query plans Eliminating redundant joins

Optimization of logical query plans Eliminating redundant joins Optimization of logical query plans Eliminating redundant joins 66 Optimization of logical query plans Query Compiler Execution Engine SQL Translation Logical query plan "Intermediate code" Logical plan

More information

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 Implementation of Relational Operations CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 First comes thought; then organization of that thought, into ideas and plans; then transformation of those plans into

More information

Extremal results for Berge-hypergraphs

Extremal results for Berge-hypergraphs Extremal results for Berge-hypergraphs Dániel Gerbner Cory Palmer Abstract Let G be a graph and H be a hypergraph both on the same vertex set. We say that a hypergraph H is a Berge-G if there is a bijection

More information

Implementation of Relational Operations

Implementation of Relational Operations Implementation of Relational Operations Module 4, Lecture 1 Database Management Systems, R. Ramakrishnan 1 Relational Operations We will consider how to implement: Selection ( ) Selects a subset of rows

More information

2.3 Algorithms Using Map-Reduce

2.3 Algorithms Using Map-Reduce 28 CHAPTER 2. MAP-REDUCE AND THE NEW SOFTWARE STACK one becomes available. The Master must also inform each Reduce task that the location of its input from that Map task has changed. Dealing with a failure

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms Group Members: 1. Geng Xue (A0095628R) 2. Cai Jingli (A0095623B) 3. Xing Zhe (A0095644W) 4. Zhu Xiaolu (A0109657W) 5. Wang Zixiao (A0095670X) 6. Jiao Qing (A0095637R) 7. Zhang

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part VIII Lecture 16, March 19, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part VII Algorithms for Relational Operations (Cont d) Today s Session:

More information

College of Computer & Information Science Fall 2007 Northeastern University 14 September 2007

College of Computer & Information Science Fall 2007 Northeastern University 14 September 2007 College of Computer & Information Science Fall 2007 Northeastern University 14 September 2007 CS G399: Algorithmic Power Tools I Scribe: Eric Robinson Lecture Outline: Linear Programming: Vertex Definitions

More information

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition.

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition. 18.433 Combinatorial Optimization Matching Algorithms September 9,14,16 Lecturer: Santosh Vempala Given a graph G = (V, E), a matching M is a set of edges with the property that no two of the edges have

More information

Solving Boolean Equations with BDDs and Clause Forms. Gert Smolka

Solving Boolean Equations with BDDs and Clause Forms. Gert Smolka Solving Boolean Equations with BDDs and Clause Forms Gert Smolka Abstract Methods for solving Boolean equations BDDs [Bryant 1986] Clause forms [Quine 1959] Efficient data structure and algorithms for

More information

From Hypertree Width to Submodular Width and Data-dependent Structural Decompositions

From Hypertree Width to Submodular Width and Data-dependent Structural Decompositions From Hypertree Width to Submodular Width and Data-dependent Structural Decompositions Francesco Scarcello, DIMES, Università della Calabria and Georg Gottlob, Vienna, Oxford... Foundational Challenges

More information

Extremal Graph Theory: Turán s Theorem

Extremal Graph Theory: Turán s Theorem Bridgewater State University Virtual Commons - Bridgewater State University Honors Program Theses and Projects Undergraduate Honors Program 5-9-07 Extremal Graph Theory: Turán s Theorem Vincent Vascimini

More information

Lecture 2. 1 Introduction. 2 The Set Cover Problem. COMPSCI 632: Approximation Algorithms August 30, 2017

Lecture 2. 1 Introduction. 2 The Set Cover Problem. COMPSCI 632: Approximation Algorithms August 30, 2017 COMPSCI 632: Approximation Algorithms August 30, 2017 Lecturer: Debmalya Panigrahi Lecture 2 Scribe: Nat Kell 1 Introduction In this lecture, we examine a variety of problems for which we give greedy approximation

More information

CS599: Convex and Combinatorial Optimization Fall 2013 Lecture 14: Combinatorial Problems as Linear Programs I. Instructor: Shaddin Dughmi

CS599: Convex and Combinatorial Optimization Fall 2013 Lecture 14: Combinatorial Problems as Linear Programs I. Instructor: Shaddin Dughmi CS599: Convex and Combinatorial Optimization Fall 2013 Lecture 14: Combinatorial Problems as Linear Programs I Instructor: Shaddin Dughmi Announcements Posted solutions to HW1 Today: Combinatorial problems

More information

GRAPH DECOMPOSITION BASED ON DEGREE CONSTRAINTS. March 3, 2016

GRAPH DECOMPOSITION BASED ON DEGREE CONSTRAINTS. March 3, 2016 GRAPH DECOMPOSITION BASED ON DEGREE CONSTRAINTS ZOÉ HAMEL March 3, 2016 1. Introduction Let G = (V (G), E(G)) be a graph G (loops and multiple edges not allowed) on the set of vertices V (G) and the set

More information

CAS CS 460/660 Introduction to Database Systems. Query Evaluation II 1.1

CAS CS 460/660 Introduction to Database Systems. Query Evaluation II 1.1 CAS CS 460/660 Introduction to Database Systems Query Evaluation II 1.1 Cost-based Query Sub-System Queries Select * From Blah B Where B.blah = blah Query Parser Query Optimizer Plan Generator Plan Cost

More information

Query Processing. Introduction to Databases CompSci 316 Fall 2017

Query Processing. Introduction to Databases CompSci 316 Fall 2017 Query Processing Introduction to Databases CompSci 316 Fall 2017 2 Announcements (Tue., Nov. 14) Homework #3 sample solution posted in Sakai Homework #4 assigned today; due on 12/05 Project milestone #2

More information

15-415/615 Faloutsos 1

15-415/615 Faloutsos 1 Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415/615 Faloutsos 1 Outline introduction selection

More information

TotalCost = 3 (1, , 000) = 6, 000

TotalCost = 3 (1, , 000) = 6, 000 156 Chapter 12 HASH JOIN: Now both relations are the same size, so we can treat either one as the smaller relation. With 15 buffer pages the first scan of S splits it into 14 buckets, each containing about

More information

COMP260 Spring 2014 Notes: February 4th

COMP260 Spring 2014 Notes: February 4th COMP260 Spring 2014 Notes: February 4th Andrew Winslow In these notes, all graphs are undirected. We consider matching, covering, and packing in bipartite graphs, general graphs, and hypergraphs. We also

More information

Database Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler

Database Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler Database Theory Database Theory VU 181.140, SS 2011 1. Introduction: Relational Query Languages Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 8 March,

More information

Chapter 13: Query Optimization. Chapter 13: Query Optimization

Chapter 13: Query Optimization. Chapter 13: Query Optimization Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical

More information

Relational Model, Relational Algebra, and SQL

Relational Model, Relational Algebra, and SQL Relational Model, Relational Algebra, and SQL August 29, 2007 1 Relational Model Data model. constraints. Set of conceptual tools for describing of data, data semantics, data relationships, and data integrity

More information

CS473-Algorithms I. Lecture 13-A. Graphs. Cevdet Aykanat - Bilkent University Computer Engineering Department

CS473-Algorithms I. Lecture 13-A. Graphs. Cevdet Aykanat - Bilkent University Computer Engineering Department CS473-Algorithms I Lecture 3-A Graphs Graphs A directed graph (or digraph) G is a pair (V, E), where V is a finite set, and E is a binary relation on V The set V: Vertex set of G The set E: Edge set of

More information

Shannon Switching Game

Shannon Switching Game EECS 495: Combinatorial Optimization Lecture 1 Shannon s Switching Game Shannon Switching Game In the Shannon switching game, two players, Join and Cut, alternate choosing edges on a graph G. Join s objective

More information

CSE 544: Principles of Database Systems

CSE 544: Principles of Database Systems CSE 544: Principles of Database Systems Semijoin Reductions Theory Wrap-up CSE544 - Spring, 2012 1 Announcements Makeup lectures: Friday, May 18, 10:30-11:50, CSE 405 Friday, May 25, 10:30-11:50, CSE 405

More information

Lecture 22 Acyclic Joins and Worst Case Join Results Instructor: Sudeepa Roy

Lecture 22 Acyclic Joins and Worst Case Join Results Instructor: Sudeepa Roy CompSci 516 ata Intensive Computing Systems Lecture 22 Acyclic Joins and Worst Case Join Results Instructor: Sudeepa Roy uke CS, Fall 2016 CompSci 516: ata Intensive Computing Systems Announcements Final

More information

CPSC 536N: Randomized Algorithms Term 2. Lecture 10

CPSC 536N: Randomized Algorithms Term 2. Lecture 10 CPSC 536N: Randomized Algorithms 011-1 Term Prof. Nick Harvey Lecture 10 University of British Columbia In the first lecture we discussed the Max Cut problem, which is NP-complete, and we presented a very

More information

Overview of DB & IR. ICS 624 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Overview of DB & IR. ICS 624 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa ICS 624 Spring 2011 Overview of DB & IR Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 1/12/2011 Lipyeow Lim -- University of Hawaii at Manoa 1 Example

More information

Steiner Trees and Forests

Steiner Trees and Forests Massachusetts Institute of Technology Lecturer: Adriana Lopez 18.434: Seminar in Theoretical Computer Science March 7, 2006 Steiner Trees and Forests 1 Steiner Tree Problem Given an undirected graph G

More information

Parameterized coloring problems on chordal graphs

Parameterized coloring problems on chordal graphs Parameterized coloring problems on chordal graphs Dániel Marx Department of Computer Science and Information Theory, Budapest University of Technology and Economics Budapest, H-1521, Hungary dmarx@cs.bme.hu

More information

Introduction Alternative ways of evaluating a given query using

Introduction Alternative ways of evaluating a given query using Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction

More information

CSE 417 Network Flows (pt 3) Modeling with Min Cuts

CSE 417 Network Flows (pt 3) Modeling with Min Cuts CSE 417 Network Flows (pt 3) Modeling with Min Cuts Reminders > HW6 is due on Friday start early bug fixed on line 33 of OptimalLineup.java: > change true to false Review of last two lectures > Defined

More information

arxiv:cs/ v1 [cs.ds] 20 Feb 2003

arxiv:cs/ v1 [cs.ds] 20 Feb 2003 The Traveling Salesman Problem for Cubic Graphs David Eppstein School of Information & Computer Science University of California, Irvine Irvine, CA 92697-3425, USA eppstein@ics.uci.edu arxiv:cs/0302030v1

More information

Reductions and Satisfiability

Reductions and Satisfiability Reductions and Satisfiability 1 Polynomial-Time Reductions reformulating problems reformulating a problem in polynomial time independent set and vertex cover reducing vertex cover to set cover 2 The Satisfiability

More information

Number Theory and Graph Theory

Number Theory and Graph Theory 1 Number Theory and Graph Theory Chapter 6 Basic concepts and definitions of graph theory By A. Satyanarayana Reddy Department of Mathematics Shiv Nadar University Uttar Pradesh, India E-mail: satya8118@gmail.com

More information

Applied Lagrange Duality for Constrained Optimization

Applied Lagrange Duality for Constrained Optimization Applied Lagrange Duality for Constrained Optimization Robert M. Freund February 10, 2004 c 2004 Massachusetts Institute of Technology. 1 1 Overview The Practical Importance of Duality Review of Convexity

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

Shortest path problems

Shortest path problems Next... Shortest path problems Single-source shortest paths in weighted graphs Shortest-Path Problems Properties of Shortest Paths, Relaxation Dijkstra s Algorithm Bellman-Ford Algorithm Shortest-Paths

More information

Evaluation of Relational Operations. Relational Operations

Evaluation of Relational Operations. Relational Operations Evaluation of Relational Operations Chapter 14, Part A (Joins) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Relational Operations v We will consider how to implement: Selection ( )

More information

Database Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler

Database Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler Database Theory Database Theory VU 181.140, SS 2018 1. Introduction: Relational Query Languages Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 6 March,

More information

CS242: Probabilistic Graphical Models Lecture 3: Factor Graphs & Variable Elimination

CS242: Probabilistic Graphical Models Lecture 3: Factor Graphs & Variable Elimination CS242: Probabilistic Graphical Models Lecture 3: Factor Graphs & Variable Elimination Instructor: Erik Sudderth Brown University Computer Science September 11, 2014 Some figures and materials courtesy

More information

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize.

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize. Cornell University, Fall 2017 CS 6820: Algorithms Lecture notes on the simplex method September 2017 1 The Simplex Method We will present an algorithm to solve linear programs of the form maximize subject

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

Schema Mappings and Data Exchange

Schema Mappings and Data Exchange Schema Mappings and Data Exchange Lecture #2 EASSLC 2012 Southwest University August 2012 1 The Relational Data Model (E.F. Codd 1970) The Relational Data Model uses the mathematical concept of a relation

More information

Math 170- Graph Theory Notes

Math 170- Graph Theory Notes 1 Math 170- Graph Theory Notes Michael Levet December 3, 2018 Notation: Let n be a positive integer. Denote [n] to be the set {1, 2,..., n}. So for example, [3] = {1, 2, 3}. To quote Bud Brown, Graph theory

More information

Lecture #7. 1 Introduction. 2 Dijkstra s Correctness. COMPSCI 330: Design and Analysis of Algorithms 9/16/2014

Lecture #7. 1 Introduction. 2 Dijkstra s Correctness. COMPSCI 330: Design and Analysis of Algorithms 9/16/2014 COMPSCI 330: Design and Analysis of Algorithms 9/16/2014 Lecturer: Debmalya Panigrahi Lecture #7 Scribe: Nat Kell 1 Introduction In this lecture, we will further examine shortest path algorithms. We will

More information

CS 377 Database Systems

CS 377 Database Systems CS 377 Database Systems Relational Algebra and Calculus Li Xiong Department of Mathematics and Computer Science Emory University 1 ER Diagram of Company Database 2 3 4 5 Relational Algebra and Relational

More information

Hash-Based Indexing 165

Hash-Based Indexing 165 Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19

More information

9.1 Cook-Levin Theorem

9.1 Cook-Levin Theorem CS787: Advanced Algorithms Scribe: Shijin Kong and David Malec Lecturer: Shuchi Chawla Topic: NP-Completeness, Approximation Algorithms Date: 10/1/2007 As we ve already seen in the preceding lecture, two

More information

A List Heuristic for Vertex Cover

A List Heuristic for Vertex Cover A List Heuristic for Vertex Cover Happy Birthday Vasek! David Avis McGill University Tomokazu Imamura Kyoto University Operations Research Letters (to appear) Online: http://cgm.cs.mcgill.ca/ avis revised:

More information

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007 Relational Query Optimization Yanlei Diao UMass Amherst October 23 & 25, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:

More information

Greedy Algorithms 1. For large values of d, brute force search is not feasible because there are 2 d

Greedy Algorithms 1. For large values of d, brute force search is not feasible because there are 2 d Greedy Algorithms 1 Simple Knapsack Problem Greedy Algorithms form an important class of algorithmic techniques. We illustrate the idea by applying it to a simplified version of the Knapsack Problem. Informally,

More information

Logic and Databases. Lecture 4 - Part 2. Phokion G. Kolaitis. UC Santa Cruz & IBM Research - Almaden

Logic and Databases. Lecture 4 - Part 2. Phokion G. Kolaitis. UC Santa Cruz & IBM Research - Almaden Logic and Databases Phokion G. Kolaitis UC Santa Cruz & IBM Research - Almaden Lecture 4 - Part 2 2 / 17 Alternative Semantics of Queries Bag Semantics We focused on the containment problem for conjunctive

More information

4 Greedy Approximation for Set Cover

4 Greedy Approximation for Set Cover COMPSCI 532: Design and Analysis of Algorithms October 14, 2015 Lecturer: Debmalya Panigrahi Lecture 12 Scribe: Allen Xiao 1 Overview In this lecture, we introduce approximation algorithms and their analysis

More information

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

Menu. Algebraic Simplification - Boolean Algebra EEL3701 EEL3701. MSOP, MPOS, Simplification

Menu. Algebraic Simplification - Boolean Algebra EEL3701 EEL3701. MSOP, MPOS, Simplification Menu Minterms & Maxterms SOP & POS MSOP & MPOS Simplification using the theorems/laws/axioms Look into my... 1 Definitions (Review) Algebraic Simplification - Boolean Algebra Minterms (written as m i ):

More information

Introduction to Approximation Algorithms

Introduction to Approximation Algorithms Introduction to Approximation Algorithms Dr. Gautam K. Das Departmet of Mathematics Indian Institute of Technology Guwahati, India gkd@iitg.ernet.in February 19, 2016 Outline of the lecture Background

More information

2 The Fractional Chromatic Gap

2 The Fractional Chromatic Gap C 1 11 2 The Fractional Chromatic Gap As previously noted, for any finite graph. This result follows from the strong duality of linear programs. Since there is no such duality result for infinite linear

More information

Generalized trace ratio optimization and applications

Generalized trace ratio optimization and applications Generalized trace ratio optimization and applications Mohammed Bellalij, Saïd Hanafi, Rita Macedo and Raca Todosijevic University of Valenciennes, France PGMO Days, 2-4 October 2013 ENSTA ParisTech PGMO

More information

Monochromatic loose-cycle partitions in hypergraphs

Monochromatic loose-cycle partitions in hypergraphs Monochromatic loose-cycle partitions in hypergraphs András Gyárfás Alfréd Rényi Institute of Mathematics Hungarian Academy of Sciences Budapest, P.O. Box 27 Budapest, H-364, Hungary gyarfas.andras@renyi.mta.hu

More information

Course No: 4411 Database Management Systems Fall 2008 Midterm exam

Course No: 4411 Database Management Systems Fall 2008 Midterm exam Course No: 4411 Database Management Systems Fall 2008 Midterm exam Last Name: First Name: Student ID: Exam is 80 minutes. Open books/notes The exam is out of 20 points. 1 1. (16 points) Multiple Choice

More information

Assignment 5: Solutions

Assignment 5: Solutions Algorithm Design Techniques Assignment 5: Solutions () Port Authority. [This problem is more commonly called the Bin Packing Problem.] (a) Suppose K = 3 and (w, w, w 3, w 4 ) = (,,, ). The optimal solution

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part VI Lecture 14, March 12, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part V Hash-based indexes (Cont d) and External Sorting Today s Session:

More information