Advanced Query Processing
|
|
- Ashlie Long
- 5 years ago
- Views:
Transcription
1 Advanced Query Processing CSEP544 Nov CSEP544 Optimal Sequential Algorithms Nov / 28
2 Lecture 9: Advanced Query Processing Optimal Sequential Algorithms. Semijoin Reduction Optimal Parallel Algorithms. CSEP544 Optimal Sequential Algorithms Nov / 28
3 Bibliography E Friedgut, Hypergraphs, entropy, and inequalities, American Mathematical Monthly, , Albert Atserias, Martin Grohe, Dniel Marx: Size Bounds and Query Plans for Relational Joins. SIAM J. Comput. 42(4): (2013) Hung Q. Ngo, Christopher Ré, Atri Rudra: Skew strikes back: new developments in the theory of join algorithms. SIGMOD Record 42(4): 5-16 (2013) Paul Beame, Paraschos Koutris, Dan Suciu: Skew in parallel query processing. PODS 2014: Paul Beame, Paraschos Koutris, Dan Suciu: Communication steps for parallel query processing. PODS 2013: CSEP544 Optimal Sequential Algorithms Nov / 28
4 Natural Join R S Joins R, S on all common attributes, removes duplicate attributes R A B a 1 b 1 a 1 b 2 a 2 b 3 a 3 b 4 S A C a 1 c 1 a 1 c 2 a 3 c 3 a 4 c 4 T = R S A B C a 1 b 1 c 1 a 1 b 1 c 2 a 1 b 2 c 1 a 1 b 2 c 2 a 3 b 4 c 3 Input schemas: R(A, B), S(A, C) Output schema: T (A, B, C) CSEP544 Optimal Sequential Algorithms Nov / 28
5 Very Quick Review of Basic Join Algorithms Compute R A=B S Nested-loop join Hash-join Merge-join (To describe in class.) Complexity: O(( R + S + R A=B S ) log( R + S )) Ignoring log factors, Complexity: O( Input + Output ) CSEP544 Optimal Sequential Algorithms Nov / 28
6 Conjunctive Queries Example Q 1 (x, y, z, u) = R(x, y), S(y, z), T (z, u) Relational Algebra: (R(x, y) S(y, z)) T (z, u) First Order Logic: Q 1 = {(x, y, z, u) (x, y) R (y, z) S (z, u) T } SQL: select from where CSEP544 Optimal Sequential Algorithms Nov / 28
7 Conjunctive Queries Example Q 1 (x, y, z, u) = R(x, y), S(y, z), T (z, u) Relational Algebra: (R(x, y) S(y, z)) T (z, u) First Order Logic: Q 1 = {(x, y, z, u) (x, y) R (y, z) S (z, u) T } SQL: select from where Example Q 2 (x, u) = R(x, y), S(y, z), T (z, u) Relational Algebra: Π x,u ((R(x, y) S(y, z)) T (z, u)) First Order Logic: Q 1 = {(x, u) y z((x, y) R (y, z) S (z, u) T )} SQL: select from where CSEP544 Optimal Sequential Algorithms Nov / 28
8 Traditional Approach to Computing Conjunctive Queries Q(x, y, z) = R(x, y), S(y, z), T (z, x) Optimizer generates a query plan: Temp(x, y, z) =R(x, y) S(y, z) Q(x, y, z) =Temp(x, y, z) T (z, x) Optimizers examines many possible plans, evaluates the cheapest plan. Problem: intermediate results may be large, and very hard to estimate. CSEP544 Optimal Sequential Algorithms Nov / 28
9 Upper Bound on the Size of the Answer Consider the join of two relations: Q(x, y, z) = R(x, y), S(y, z) Question If R = m 1, S = m 2, how large can Q be? CSEP544 Optimal Sequential Algorithms Nov / 28
10 Upper Bound on the Size of the Answer Consider the join of two relations: Q(x, y, z) = R(x, y), S(y, z) Question If R = m 1, S = m 2, how large can Q be? Can be 0 CSEP544 Optimal Sequential Algorithms Nov / 28
11 Upper Bound on the Size of the Answer Consider the join of two relations: Q(x, y, z) = R(x, y), S(y, z) Question If R = m 1, S = m 2, how large can Q be? Can be 0 Can be m 1 m 2 CSEP544 Optimal Sequential Algorithms Nov / 28
12 Upper Bound on the Size of the Answer Consider the join of two relations: Q(x, y, z) = R(x, y), S(y, z) Question If R = m 1, S = m 2, how large can Q be? Can be 0 Can be m 1 m 2 Answer: 0 Q m 1 m 2. CSEP544 Optimal Sequential Algorithms Nov / 28
13 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? CSEP544 Optimal Sequential Algorithms Nov / 28
14 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? Naive answer: m 1 m 2 m 3 (why?) CSEP544 Optimal Sequential Algorithms Nov / 28
15 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? Naive answer: m 1 m 2 m 3 (why?) Better answer: m 1 m 2 (why?) CSEP544 Optimal Sequential Algorithms Nov / 28
16 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? Naive answer: m 1 m 2 m 3 (why?) Better answer: m 1 m 2 (why?) But also: m 1 m 3, m 2 m 3 CSEP544 Optimal Sequential Algorithms Nov / 28
17 Upper Bound on the Size of the Answer Q(x, y, z) = R(x, y), S(y, z), T (z, x) Question If R = m 1, S = m 2, T = m 3, how large can the result be? Naive answer: m 1 m 2 m 3 (why?) Better answer: m 1 m 2 (why?) But also: m 1 m 3, m 2 m 3 We will show: Also (and better!): Q m 1 m 2 m 3 There exists an algorithm that computes Q in time min(all the above) How this generalizes to any conjunctive query. CSEP544 Optimal Sequential Algorithms Nov / 28
18 The Hypergraph of a Query Definition Let Q be a full conjunctive query without self-joins. The hypergraph G of Q consists of: Nodes(G) = Vars(Q) the set of variables of Q HyperEdges(G) = Atoms(Q) the set of atoms of Q. CSEP544 Optimal Sequential Algorithms Nov / 28
19 The Hypergraph of a Query Definition Let Q be a full conjunctive query without self-joins. The hypergraph G of Q consists of: Nodes(G) = Vars(Q) the set of variables of Q HyperEdges(G) = Atoms(Q) the set of atoms of Q. Q(x,y,z) = R(x,y),S(y,z),T(z,x) x z y Q(x,y,z) = R(x,y,z),S(x),T(y),K(z),M(x,u) u x z y CSEP544 Optimal Sequential Algorithms Nov / 28
20 Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition An edge cover = subset of edges that contain all nodes. Full conjunctive query: Q(x) = R 1 (x 1 ),..., R l (x l ) Relation sizes: R 1 = m 1,..., R l = m l Proposition (Simple!) Let R i1,..., R iu be any edge cover. Then Q m i1 m i2 m iu (proof in class) CSEP544 Optimal Sequential Algorithms Nov / 28
21 Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition An edge cover = subset of edges that contain all nodes. Full conjunctive query: Q(x) = R 1 (x 1 ),..., R l (x l ) Relation sizes: R 1 = m 1,..., R l = m l Proposition (Simple!) Let R i1,..., R iu be any edge cover. Then Q m i1 m i2 m iu (proof in class) CSEP544 Optimal Sequential Algorithms Nov / 28
22 Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition An edge cover = subset of edges that contain all nodes. Full conjunctive query: Q(x) = R 1 (x 1 ),..., R l (x l ) Relation sizes: R 1 = m 1,..., R l = m l Proposition (Simple!) Let R i1,..., R iu be any edge cover. Then Q m i1 m i2 m iu (proof in class) CSEP544 Optimal Sequential Algorithms Nov / 28
23 Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition An edge cover = subset of edges that contain all nodes. Full conjunctive query: Q(x) = R 1 (x 1 ),..., R l (x l ) Relation sizes: R 1 = m 1,..., R l = m l Proposition (Simple!) Let R i1,..., R iu be any edge cover. Then Q m i1 m i2 m iu (proof in class) CSEP544 Optimal Sequential Algorithms Nov / 28
24 Fractional Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition A fractional edge cover = sequence of positive numbers u 1,..., u l s.t.: i j x i R j u j 1 Theorem (AGM 13) Let u 1,..., u l be any fractional edge cover. Then Q m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28
25 Fractional Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition A fractional edge cover = sequence of positive numbers u 1,..., u l s.t.: i j x i R j u j 1 Theorem (AGM 13) Let u 1,..., u l be any fractional edge cover. Then Q m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28
26 Fractional Edge Cover of a Hypergraph G G = nodes x 1,..., x k and hyperedges R 1,..., R l. Definition A fractional edge cover = sequence of positive numbers u 1,..., u l s.t.: i j x i R j u j 1 Theorem (AGM 13) Let u 1,..., u l be any fractional edge cover. Then Q m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28
27 Examples Q(x, y, z) = R(x, y), S(y, z), T (z, x) R = S = T = m AGM u (Q) = m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28
28 Examples Q(x, y, z) = R(x, y), S(y, z), T (z, x) R = S = T = m AGM u (Q) = m u 1 1 mu 2 2 mu l l A fractional edge: u = (1/2, 1/2, 1/2) 1/2 x 1/2 z 1/2 y CSEP544 Optimal Sequential Algorithms Nov / 28
29 Examples Q(x, y, z) = R(x, y), S(y, z), T (z, x) R = S = T = m AGM u (Q) = m u 1 1 mu 2 2 mu l l A fractional edge: u = (1/2, 1/2, 1/2) 1/2 x 1/2 z 1/2 y It follows that Q m 1/2 m 1/2 m 1/2 = m 3/2 With m edges you can built at most m 3/2 triangles! CSEP544 Optimal Sequential Algorithms Nov / 28
30 AGM Bound Definition AGM(Q) = min u m u 1 1 mu 2 2 mu l l Thus: Q AGM(Q). Example Q(x, y, z) = R(x, y), S(y, z), T (z, x), R = m 1, S = m 2, T = m 3 u = (1, 1, 0) (1, 0, 1) (0, 1, 1) ( 1 2, 1 2, 1 2 ) AGM(Q) = min of m 1 m 2 m 1 m 3 m 2 m 3 (m 1 m 2 m 3 ) 1/2 Example Q(x, y, z, v, w) = R(x, y), S(y, z), T (z, v), K(v, w) u = (1, 0, 1, 1) (1, 1, 0, 1) AGM(Q) = min of m 1 m 3 m 4 m 1 m 2 m 4 CSEP544 Optimal Sequential Algorithms Nov / 28
31 AGM Bound Definition AGM(Q) = min u m u 1 1 mu 2 2 mu l l Thus: Q AGM(Q). Example Q(x, y, z) = R(x, y), S(y, z), T (z, x), R = m 1, S = m 2, T = m 3 u = (1, 1, 0) (1, 0, 1) (0, 1, 1) ( 1 2, 1 2, 1 2 ) AGM(Q) = min of m 1 m 2 m 1 m 3 m 2 m 3 (m 1 m 2 m 3 ) 1/2 Example Q(x, y, z, v, w) = R(x, y), S(y, z), T (z, v), K(v, w) u = (1, 0, 1, 1) (1, 1, 0, 1) AGM(Q) = min of m 1 m 3 m 4 m 1 m 2 m 4 CSEP544 Optimal Sequential Algorithms Nov / 28
32 AGM Bound Definition AGM(Q) = min u m u 1 1 mu 2 2 mu l l Thus: Q AGM(Q). Example Q(x, y, z) = R(x, y), S(y, z), T (z, x), R = m 1, S = m 2, T = m 3 u = (1, 1, 0) (1, 0, 1) (0, 1, 1) ( 1 2, 1 2, 1 2 ) AGM(Q) = min of m 1 m 2 m 1 m 3 m 2 m 3 (m 1 m 2 m 3 ) 1/2 Example Q(x, y, z, v, w) = R(x, y), S(y, z), T (z, v), K(v, w) u = (1, 0, 1, 1) (1, 1, 0, 1) AGM(Q) = min of m 1 m 3 m 4 m 1 m 2 m 4 CSEP544 Optimal Sequential Algorithms Nov / 28
33 The Worst-Case Query Output Question Can the query output ever get as large as the AGM bound? AGM u (Q) = m u 1 1 mu 2 2 mu l l CSEP544 Optimal Sequential Algorithms Nov / 28
34 The Worst-Case Query Output Question Can the query output ever get as large as the AGM bound? AGM u (Q) = m u 1 1 mu 2 2 mu l l Example Q(x, y, z) = R(x, y), S(y, z), T (z, x) Find three relations R = S = T = m such that Q = m 3/2 CSEP544 Optimal Sequential Algorithms Nov / 28
35 The Worst-Case Query Output Question Can the query output ever get as large as the AGM bound? AGM u (Q) = m u 1 1 mu 2 2 mu l l Example Q(x, y, z) = R(x, y), S(y, z), T (z, x) Find three relations R = S = T = m such that Q = m 3/2 Answer: let n = m 1/2, and R = S = T = [n] [n]. Then Q = n 3 = m 3/2. CSEP544 Optimal Sequential Algorithms Nov / 28
36 Background in Algebra: Duality of Linear Programs AGM u (Q) = m u 1 1 mu 2 2 mu l l AGM(Q) = min u AGM u (Q) is the optimal solution to: minimize u j log m j j i j x i R j u j 1 CSEP544 Optimal Sequential Algorithms Nov / 28
37 Background in Algebra: Duality of Linear Programs AGM u (Q) = m u 1 1 mu 2 2 mu l l AGM(Q) = min u AGM u (Q) is the optimal solution to: minimize u j log m j j i j x i R j u j 1 Fractional Edge Cover maximize v i i j v i log m j i x i R j Fractional Vertex Packing CSEP544 Optimal Sequential Algorithms Nov / 28
38 Background in Algebra: Duality of Linear Programs AGM u (Q) = m u 1 1 mu 2 2 mu l l AGM(Q) = min u AGM u (Q) is the optimal solution to: minimize u j log m j j i j x i R j u j 1 Fractional Edge Cover maximize v i i j v i log m j i x i R j Fractional Vertex Packing Theorem (Strong Duality of Linear Programs) min u j u j log m j = max v v i i CSEP544 Optimal Sequential Algorithms Nov / 28
39 Background in Algebra: Duality of Linear Programs Q(x, y, z) = R(x, y), S(y, z), T (z, x) 1/2 x 1/2 z 1/2 y min(u R log R + u S log S + u T log T ) x u R + u T 1 y u R + u S 1 z u S + u T 1 CSEP544 Optimal Sequential Algorithms Nov / 28
40 Background in Algebra: Duality of Linear Programs Q(x, y, z) = R(x, y), S(y, z), T (z, x) 1/2 x 1/2 z 1/2 y min(u R log R + u S log S + u T log T ) x u R + u T 1 y u R + u S 1 z u S + u T 1 max(v x + v y + v z) R v x + v y log R S v y + v z log S T v x + v z log T Strong duality theorem: min u R log R + u S log S + u T log T = max v x + v y + v z. CSEP544 Optimal Sequential Algorithms Nov / 28
41 Background in Algebra: Duality of Linear Programs Q(x, y, z) = R(x, y), S(y, z), T (z, x) 1/2 x 1/2 z 1/2 y min(u R log R + u S log S + u T log T ) x u R + u T 1 y u R + u S 1 z u S + u T 1 max(v x + v y + v z) R v x + v y log R S v y + v z log S T v x + v z log T Strong duality theorem: min u R log R + u S log S + u T log T = max v x + v y + v z. In class: what are the optimal solutions in these cases: R = S = T R = S T CSEP544 Optimal Sequential Algorithms Nov / 28
42 The AGM Bound is Tight AGM u (Q) = m u 1 1 mu 2 2 mu l l AGM(Q) = min u AGM u (Q) is the optimal solution to: minimize u j log m j j i j x i R j u j 1 maximize v i i j v i log m j i x i R j Theorem The AGM bound is tight Proof: start with an optimal solution v i. Define R(x 1, x 5, x 8 ) = [2 v 1 ] [2 v 5 ] [2 v 8 ] etc Then Q = 2 v 1+v = 2 u 1 log m 1 +u 2 log m = m u 1 1 mu 2 2 CSEP544 Optimal Sequential Algorithms Nov / 28
43 Computing Full Conjunctive Queries Recall: all database systems compute one join at a time This may be much larger than the maximum output size, AGM(Q). Goal: design a algorithm that runs in time AGM(Q). Worst-Case-Optimal algorithm: runs in time AGM(Q). CSEP544 Optimal Sequential Algorithms Nov / 28
44 Generic Join Compute Q(x) = R 1 (x 1 ),..., R l (x l ) If x = 1 then return R 1 R l. Otherwise, choose a variable x Assuem it occurrs in atoms R i1,..., R ik Compute A = Π x (R i1 ) Π x (R ik ) For each a A, compute Result a = Q[a/x] using Generic-Join Return a Result a. Runtime: O(AGM(Q)) (Plus a log n factor for index lookup) CSEP544 Optimal Sequential Algorithms Nov / 28
45 Generic Join Compute Q(x) = R 1 (x 1 ),..., R l (x l ) If x = 1 then return R 1 R l. Otherwise, choose a variable x Assuem it occurrs in atoms R i1,..., R ik Compute A = Π x (R i1 ) Π x (R ik ) For each a A, compute Result a = Q[a/x] using Generic-Join Return a Result a. Runtime: O(AGM(Q)) (Plus a log n factor for index lookup) CSEP544 Optimal Sequential Algorithms Nov / 28
46 Generic Join Example Q(x, y, z) = R(x, y), S(y, z), T (z, x) Compute A = Π x (R) Π x (T ) = {a 1,..., a n } For each a i A, denote R (y) = R(a i, y), T (z) = T (z, a i ) Compute Result i (a i, y, z) = R (y), S(y, z), T (z) Return i Result i Runtime: O(m 3/2 ) assuming R = S = T = m. CSEP544 Optimal Sequential Algorithms Nov / 28
47 Details of Generic Join Fix variable order: x 1, x 2,... (AGM bound always holds, but in practice the order can matter a lot) Order/index the relations accordingly. For example, R(x 3, x 6, x 7 ) has a B+-index on x 3, x 6, x 7. Computing A = Π x (R i1 ) Π x (R ik ) is similar to multi-way merge join. Must ensure that runtime is min( R i1, R i2,...). When we iterate a A, we are making one more binding in all indexes. LeapFrog Tree Join (by LogicBlox) is based on these ideas. CSEP544 Optimal Sequential Algorithms Nov / 28
48 Details of Generic Join Q(x, y, z) = R(x, y), S(y, z), T (z, x) Compute A = Π x (R) Π x (T ) = {a 1,..., a n } For each a i A, denote R (y) = R(a i, y), T (z) = T (z, a i ) Compute Result i (a i, y, z) = R (y), S(y, z), T (z) Return i Result i Runtime: O(m 3/2 ) assuming R = S = T = m. Variable order: x, y, z. What happens in each case? Complete bipartite graph: R = S = T = [m 1/2 ] [m 1/2 ]. Skewed at x: R = [1] [m], S [m] [m], T = [m] [1]. Skewed at y: R = [m] [1], S = [1] [m], T [m] [m] Skewed at z: R [m] [m], S = [m] [1], T [1] [m] CSEP544 Optimal Sequential Algorithms Nov / 28
49 Discussion Major advantage over one-join-at-at-time algorithms for cyclic queries! For acyclic queries, the story is more complex (discussed later) We haven t proven its optimality yet: will do this next. CSEP544 Optimal Sequential Algorithms Nov / 28
50 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 CSEP544 Optimal Sequential Algorithms Nov / 28
51 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 CSEP544 Optimal Sequential Algorithms Nov / 28
52 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i ) w CSEP544 Optimal Sequential Algorithms Nov / 28
53 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): Theorem (Friedgut 04) i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i Let Q(x) = R 1 (x 1 ),..., R l (x l ) be a query and u 1,..., u l be a fractional edge cover. Then: 1 u 1 1 u l u x a 1,x1 a l,xl ( x1 a 1 ( a u l xl 1,x 1 ) l,x l ) ) w What are the queries in the examples above? CSEP544 Optimal Sequential Algorithms Nov / 28
54 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): Theorem (Friedgut 04) i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i Let Q(x) = R 1 (x 1 ),..., R l (x l ) be a query and u 1,..., u l be a fractional edge cover. Then: 1 u 1 1 u l u x a 1,x1 a l,xl ( x1 a 1 ( a u l xl 1,x 1 ) l,x l ) ) w What are the queries in the examples above? Q Cauchy-Schwartz (x) = R(x), S(x); CSEP544 Optimal Sequential Algorithms Nov / 28
55 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): Theorem (Friedgut 04) i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i Let Q(x) = R 1 (x 1 ),..., R l (x l ) be a query and u 1,..., u l be a fractional edge cover. Then: 1 u 1 1 u l u x a 1,x1 a l,xl ( x1 a 1 ( a u l xl 1,x 1 ) l,x l ) ) w What are the queries in the examples above? Q Cauchy-Schwartz (x) = R(x), S(x); Q triangle (x, y, z) = R(x, y), S(y, z), T (z, x); CSEP544 Optimal Sequential Algorithms Nov / 28
56 What are the queries in the examples above? Q Cauchy-Schwartz (x) = R(x), S(x); Q triangle (x, y, z) = R(x, y), S(y, z), T (z, x); Q Hölder (x) = R(x), S(x), T (x) CSEP544 Optimal Sequential Algorithms Nov / 28 Friedgut s Inequality Cauchy-Schwartz: i a i b i ( i a 2 i ) 1 2 ( i b 2 i ) 1 2 Triangle: i,j,k a ij b jk c ki ( i,j a 2 ij ) 1 2 ( j,k b 2 jk ) 1 2 ( k,i c 2 ki ) 1 2 Hölder (u + v + w 1): Theorem (Friedgut 04) i a i b i c i ( i a 1 u i ) u ( i b 1 v i ) v ( i c 1 w i Let Q(x) = R 1 (x 1 ),..., R l (x l ) be a query and u 1,..., u l be a fractional edge cover. Then: 1 u 1 1 u l u x a 1,x1 a l,xl ( x1 a 1 ( a u l xl 1,x 1 ) l,x l ) ) w
57 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l CSEP544 Optimal Sequential Algorithms Nov / 28
58 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x CSEP544 Optimal Sequential Algorithms Nov / 28
59 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. CSEP544 Optimal Sequential Algorithms Nov / 28
60 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) CSEP544 Optimal Sequential Algorithms Nov / 28
61 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz byz u2 yz x axy u1 czx u3 group by x CSEP544 Optimal Sequential Algorithms Nov / 28
62 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz b u2 yz yz( x byz u2 yz x axy u1 czx u3 group by x a xy ) u1 ( c zx ) u3 Hölder u 1 + u 3 1 x CSEP544 Optimal Sequential Algorithms Nov / 28
63 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz b u2 yz yz( x = byza u2 u1 y C u3 yz byz u2 yz x axy u1 czx u3 group by x a xy ) u1 ( c zx ) u3 Hölder u 1 + u 3 1 x z ( yz b yz ) u2 ( y A y ) u1 ( C z ) u3 Induction for Q z CSEP544 Optimal Sequential Algorithms Nov / 28
64 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz b u2 yz yz( x = byza u2 u1 y C u3 yz =( yz b yz ) u2 ( xy byz u2 yz x axy u1 czx u3 group by x a xy ) u1 ( c zx ) u3 Hölder u 1 + u 3 1 x z ( yz b yz ) u2 ( y a xy ) u1 ( c zx ) u3 zx A y ) u1 ( C z ) u3 Induction for Q z CSEP544 Optimal Sequential Algorithms Nov / 28
65 Friedgut s Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l x a u 1 1,x 1 a u l l,x l ( x1 a 1,x1 ) u1 ( xl a l,xl ) u l Proof: by induction on x Base Case. x = 1: Q(x) = R 1 (x),..., R l (x), u u l 1 Prove: x a u1 1,x au l l,x ( x a 1,x ) u1 ( x a l,x ) u l This is Hölder. Induction Step. Pick a variable x, and remove it. For example, Q(x, y, z) = R(x, y), S(y, z), T (z, x) becomes Q (y, z) = R (y), S(y, z), T (z) axy u1 byzc u2 zx u3 = xyz b u2 yz yz( x = byza u2 u1 y C u3 yz =( yz b yz ) u2 ( xy byz u2 yz x axy u1 czx u3 group by x a xy ) u1 ( c zx ) u3 Hölder u 1 + u 3 1 x z ( yz b yz ) u2 ( y a xy ) u1 ( c zx ) u3 zx A y ) u1 ( C z ) u3 Induction for Q z QED CSEP544 Optimal Sequential Algorithms Nov / 28
66 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28
67 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28
68 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28
69 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28
70 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28
71 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28
72 The AGM Inequality Proof Query Q(x) = R 1 (x 1 ),..., R l (x l ), fractional cover u 1,..., u l Sizes R 1 = m 1,..., R l = m l Prove Q m u 1 1 mu l l Let Dom = the domain of all constants in the relations R 1,..., R l. For every j = 1,..., l, and every tuple x j Dom x j, define: 1 if the tuple x j belongs to R j a j,xj = 0 otherwise Then: m j = R j = xj Dom x j a j,x j, Q = x Dom x a 1,x1 a l,xl Now use Friedgut s inequality: Q = a u 1 1,x 1 a u l l,x l ( a 1,x1 ) u 1 ( a l,xl ) u l = m u 1 1 mu l l x x 1 x l QED CSEP544 Optimal Sequential Algorithms Nov / 28
73 Proof of Optimality for Generic Join Use exactly the same induction step as in Friedgut s inequality. CSEP544 Optimal Sequential Algorithms Nov / 28
Join Algorithms. Qiu, Yuan and Zhang, Haoqian
Join Algorithms Qiu, Yuan and Zhang, Haoqian Contents 1. Join algorithms in the lecture a. Nested-Loop Join b. Sort-merge Join c. Hash Join 2. Preliminaries a. Join as Hypergraphs b. The AGM bound for
More informationTECHNICAL REPORT Leapfrog Triejoin: A Worst-Case Optimal Join Algorithm. October 1, 2012 Todd L. Veldhuizen LB1201
TECHNICAL REPORT Leapfrog Triejoin: A Worst-Case Optimal Join Algorithm October, 22 Todd L. Veldhuizen LB2 Leapfrog Triejoin: a worst-case optimal join algorithm Todd L. Veldhuizen Contents Introduction......................................
More informationLecture 1: Conjunctive Queries
CS 784: Foundations of Data Management Spring 2017 Instructor: Paris Koutris Lecture 1: Conjunctive Queries A database schema R is a set of relations: we will typically use the symbols R, S, T,... to denote
More information1 Linear programming relaxation
Cornell University, Fall 2010 CS 6820: Algorithms Lecture notes: Primal-dual min-cost bipartite matching August 27 30 1 Linear programming relaxation Recall that in the bipartite minimum-cost perfect matching
More informationMassively Parallel Communication and Query Evaluation
Massively Parallel Communication and Query Evaluation Paul Beame U. of Washington Based on joint work with ParaschosKoutrisand Dan Suciu [PODS 13], [PODS 14] 1 Massively Parallel Systems 2 MapReduce [Dean,Ghemawat
More informationAnnouncement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17
Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa
More informationQuery Evaluation and Optimization
Query Evaluation and Optimization Jan Chomicki University at Buffalo Jan Chomicki () Query Evaluation and Optimization 1 / 21 Evaluating σ E (R) Jan Chomicki () Query Evaluation and Optimization 2 / 21
More informationParallel Breadth First Search
CSE341T/CSE549T 11/03/2014 Lecture 18 Parallel Breadth First Search Today, we will look at a basic graph algorithm, breadth first search (BFS). BFS can be applied to solve a variety of problems including:
More informationSize and Treewidth Bounds for Conjunctive Queries
Size and Treewidth Bounds for Conjunctive Queries Georg Gottlob Computing Laboratory and Oxford Man Institute of Quantitative Finance, University of Oxford,UK ggottlob@gmail.com Stephanie Tien Lee Computing
More informationPentagons vs. triangles
Discrete Mathematics 308 (2008) 4332 4336 www.elsevier.com/locate/disc Pentagons vs. triangles Béla Bollobás a,b, Ervin Győri c,1 a Trinity College, Cambridge CB2 1TQ, UK b Department of Mathematical Sciences,
More informationMathematical and Algorithmic Foundations Linear Programming and Matchings
Adavnced Algorithms Lectures Mathematical and Algorithmic Foundations Linear Programming and Matchings Paul G. Spirakis Department of Computer Science University of Patras and Liverpool Paul G. Spirakis
More informationLecture 3: Totally Unimodularity and Network Flows
Lecture 3: Totally Unimodularity and Network Flows (3 units) Outline Properties of Easy Problems Totally Unimodular Matrix Minimum Cost Network Flows Dijkstra Algorithm for Shortest Path Problem Ford-Fulkerson
More informationR & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:
Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory
More informationOptimal Query Processing Meets Information Theory
Optimal Query Processing Meets Information Theory Dan Suciu University of Washington Mahmoud Abo-Khamis Hung Ngo RelationalAI Inc. [PODS 2016] [PODS 2017] Basic Question What is the optimal runtime to
More informationEvaluation of relational operations
Evaluation of relational operations Iztok Savnik, FAMNIT Slides & Textbook Textbook: Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems, McGraw-Hill, 3 rd ed., 2007. Slides: From Cow Book
More informationNotes for Lecture 20
U.C. Berkeley CS170: Intro to CS Theory Handout N20 Professor Luca Trevisan November 13, 2001 Notes for Lecture 20 1 Duality As it turns out, the max-flow min-cut theorem is a special case of a more general
More informationApproximation Algorithms: The Primal-Dual Method. My T. Thai
Approximation Algorithms: The Primal-Dual Method My T. Thai 1 Overview of the Primal-Dual Method Consider the following primal program, called P: min st n c j x j j=1 n a ij x j b i j=1 x j 0 Then the
More informationDatabase Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week.
Database Systems ( 料 ) December 13/14, 2006 Lecture #10 1 Announcement Assignment #4 is due next week. 2 1 Overview of Query Evaluation Chapter 12 3 Outline Query evaluation (Overview) Relational Operator
More informationCombinatorial Optimization
Combinatorial Optimization Frank de Zeeuw EPFL 2012 Today Introduction Graph problems - What combinatorial things will we be optimizing? Algorithms - What kind of solution are we looking for? Linear Programming
More informationQuery Evaluation (i)
ICS 421 Spring 2010 Query Evaluation (i) Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa 1 Query Enumerate
More informationCS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 36
CS 473: Algorithms Ruta Mehta University of Illinois, Urbana-Champaign Spring 2018 Ruta (UIUC) CS473 1 Spring 2018 1 / 36 CS 473: Algorithms, Spring 2018 LP Duality Lecture 20 April 3, 2018 Some of the
More information8 Matroid Intersection
8 Matroid Intersection 8.1 Definition and examples 8.2 Matroid Intersection Algorithm 8.1 Definitions Given two matroids M 1 = (X, I 1 ) and M 2 = (X, I 2 ) on the same set X, their intersection is M 1
More informationλ -Harmonious Graph Colouring
λ -Harmonious Graph Colouring Lauren DeDieu McMaster University Southwestern Ontario Graduate Mathematics Conference June 4th, 201 What is a graph? What is vertex colouring? 1 1 1 2 2 Figure : Proper Colouring.
More information{ 1} Definitions. 10. Extremal graph theory. Problem definition Paths and cycles Complete subgraphs
Problem definition Paths and cycles Complete subgraphs 10. Extremal graph theory 10.1. Definitions Let us examine the following forbidden subgraph problems: At most how many edges are in a graph of order
More informationLogical Aspects of Massively Parallel and Distributed Systems
Logical Aspects of Massively Parallel and Distributed Systems Frank Neven Hasselt University PODS Tutorial June 29, 2016 PODS June 29, 2016 1 / 62 Logical aspects of massively parallel and distributed
More informationFaloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline
Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection
More informationK 4 C 5. Figure 4.5: Some well known family of graphs
08 CHAPTER. TOPICS IN CLASSICAL GRAPH THEORY K, K K K, K K, K K, K C C C C 6 6 P P P P P. Graph Operations Figure.: Some well known family of graphs A graph Y = (V,E ) is said to be a subgraph of a graph
More informationRELATIONAL OPERATORS #1
RELATIONAL OPERATORS #1 CS 564- Spring 2018 ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? Algorithms for relational operators: select project 2 ARCHITECTURE OF A DBMS query
More informationSection 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected
Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected graph G with at least 2 vertices contains at least 2
More informationAdvanced Operations Research Techniques IE316. Quiz 1 Review. Dr. Ted Ralphs
Advanced Operations Research Techniques IE316 Quiz 1 Review Dr. Ted Ralphs IE316 Quiz 1 Review 1 Reading for The Quiz Material covered in detail in lecture. 1.1, 1.4, 2.1-2.6, 3.1-3.3, 3.5 Background material
More informationLecture 7. s.t. e = (u,v) E x u + x v 1 (2) v V x v 0 (3)
COMPSCI 632: Approximation Algorithms September 18, 2017 Lecturer: Debmalya Panigrahi Lecture 7 Scribe: Xiang Wang 1 Overview In this lecture, we will use Primal-Dual method to design approximation algorithms
More informationIn this lecture, we ll look at applications of duality to three problems:
Lecture 7 Duality Applications (Part II) In this lecture, we ll look at applications of duality to three problems: 1. Finding maximum spanning trees (MST). We know that Kruskal s algorithm finds this,
More informationCS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing
CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #10: Query Processing Outline introduction selection projection join set & aggregate operations Prakash 2018 VT CS 4604 2
More informationarxiv: v7 [cs.db] 5 Jan 2017
EmptyHeaded: A Relational Engine for Graph Processing Christopher R. Aberger Stanford University caberger@stanford.edu Susan Tu Stanford University sctu@stanford.edu Christopher Ré Stanford University
More informationMaximum flow problem CE 377K. March 3, 2015
Maximum flow problem CE 377K March 3, 2015 Informal evaluation results 2 slow, 16 OK, 2 fast Most unclear topics: max-flow/min-cut, WHAT WILL BE ON THE MIDTERM? Most helpful things: review at start of
More informationEvaluation of Relational Operations
Evaluation of Relational Operations Chapter 14 Comp 521 Files and Databases Fall 2010 1 Relational Operations We will consider in more detail how to implement: Selection ( ) Selects a subset of rows from
More informationOptimization of logical query plans Eliminating redundant joins
Optimization of logical query plans Eliminating redundant joins 66 Optimization of logical query plans Query Compiler Execution Engine SQL Translation Logical query plan "Intermediate code" Logical plan
More informationImplementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12
Implementation of Relational Operations CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 First comes thought; then organization of that thought, into ideas and plans; then transformation of those plans into
More informationExtremal results for Berge-hypergraphs
Extremal results for Berge-hypergraphs Dániel Gerbner Cory Palmer Abstract Let G be a graph and H be a hypergraph both on the same vertex set. We say that a hypergraph H is a Berge-G if there is a bijection
More informationImplementation of Relational Operations
Implementation of Relational Operations Module 4, Lecture 1 Database Management Systems, R. Ramakrishnan 1 Relational Operations We will consider how to implement: Selection ( ) Selects a subset of rows
More information2.3 Algorithms Using Map-Reduce
28 CHAPTER 2. MAP-REDUCE AND THE NEW SOFTWARE STACK one becomes available. The Master must also inform each Reduce task that the location of its input from that Map task has changed. Dealing with a failure
More informationApproximation Algorithms
Approximation Algorithms Group Members: 1. Geng Xue (A0095628R) 2. Cai Jingli (A0095623B) 3. Xing Zhe (A0095644W) 4. Zhu Xiaolu (A0109657W) 5. Wang Zixiao (A0095670X) 6. Jiao Qing (A0095637R) 7. Zhang
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part VIII Lecture 16, March 19, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part VII Algorithms for Relational Operations (Cont d) Today s Session:
More informationCollege of Computer & Information Science Fall 2007 Northeastern University 14 September 2007
College of Computer & Information Science Fall 2007 Northeastern University 14 September 2007 CS G399: Algorithmic Power Tools I Scribe: Eric Robinson Lecture Outline: Linear Programming: Vertex Definitions
More informationMatching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition.
18.433 Combinatorial Optimization Matching Algorithms September 9,14,16 Lecturer: Santosh Vempala Given a graph G = (V, E), a matching M is a set of edges with the property that no two of the edges have
More informationSolving Boolean Equations with BDDs and Clause Forms. Gert Smolka
Solving Boolean Equations with BDDs and Clause Forms Gert Smolka Abstract Methods for solving Boolean equations BDDs [Bryant 1986] Clause forms [Quine 1959] Efficient data structure and algorithms for
More informationFrom Hypertree Width to Submodular Width and Data-dependent Structural Decompositions
From Hypertree Width to Submodular Width and Data-dependent Structural Decompositions Francesco Scarcello, DIMES, Università della Calabria and Georg Gottlob, Vienna, Oxford... Foundational Challenges
More informationExtremal Graph Theory: Turán s Theorem
Bridgewater State University Virtual Commons - Bridgewater State University Honors Program Theses and Projects Undergraduate Honors Program 5-9-07 Extremal Graph Theory: Turán s Theorem Vincent Vascimini
More informationLecture 2. 1 Introduction. 2 The Set Cover Problem. COMPSCI 632: Approximation Algorithms August 30, 2017
COMPSCI 632: Approximation Algorithms August 30, 2017 Lecturer: Debmalya Panigrahi Lecture 2 Scribe: Nat Kell 1 Introduction In this lecture, we examine a variety of problems for which we give greedy approximation
More informationCS599: Convex and Combinatorial Optimization Fall 2013 Lecture 14: Combinatorial Problems as Linear Programs I. Instructor: Shaddin Dughmi
CS599: Convex and Combinatorial Optimization Fall 2013 Lecture 14: Combinatorial Problems as Linear Programs I Instructor: Shaddin Dughmi Announcements Posted solutions to HW1 Today: Combinatorial problems
More informationGRAPH DECOMPOSITION BASED ON DEGREE CONSTRAINTS. March 3, 2016
GRAPH DECOMPOSITION BASED ON DEGREE CONSTRAINTS ZOÉ HAMEL March 3, 2016 1. Introduction Let G = (V (G), E(G)) be a graph G (loops and multiple edges not allowed) on the set of vertices V (G) and the set
More informationCAS CS 460/660 Introduction to Database Systems. Query Evaluation II 1.1
CAS CS 460/660 Introduction to Database Systems Query Evaluation II 1.1 Cost-based Query Sub-System Queries Select * From Blah B Where B.blah = blah Query Parser Query Optimizer Plan Generator Plan Cost
More informationQuery Processing. Introduction to Databases CompSci 316 Fall 2017
Query Processing Introduction to Databases CompSci 316 Fall 2017 2 Announcements (Tue., Nov. 14) Homework #3 sample solution posted in Sakai Homework #4 assigned today; due on 12/05 Project milestone #2
More information15-415/615 Faloutsos 1
Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415/615 Faloutsos 1 Outline introduction selection
More informationTotalCost = 3 (1, , 000) = 6, 000
156 Chapter 12 HASH JOIN: Now both relations are the same size, so we can treat either one as the smaller relation. With 15 buffer pages the first scan of S splits it into 14 buckets, each containing about
More informationCOMP260 Spring 2014 Notes: February 4th
COMP260 Spring 2014 Notes: February 4th Andrew Winslow In these notes, all graphs are undirected. We consider matching, covering, and packing in bipartite graphs, general graphs, and hypergraphs. We also
More informationDatabase Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler
Database Theory Database Theory VU 181.140, SS 2011 1. Introduction: Relational Query Languages Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 8 March,
More informationChapter 13: Query Optimization. Chapter 13: Query Optimization
Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical
More informationRelational Model, Relational Algebra, and SQL
Relational Model, Relational Algebra, and SQL August 29, 2007 1 Relational Model Data model. constraints. Set of conceptual tools for describing of data, data semantics, data relationships, and data integrity
More informationCS473-Algorithms I. Lecture 13-A. Graphs. Cevdet Aykanat - Bilkent University Computer Engineering Department
CS473-Algorithms I Lecture 3-A Graphs Graphs A directed graph (or digraph) G is a pair (V, E), where V is a finite set, and E is a binary relation on V The set V: Vertex set of G The set E: Edge set of
More informationShannon Switching Game
EECS 495: Combinatorial Optimization Lecture 1 Shannon s Switching Game Shannon Switching Game In the Shannon switching game, two players, Join and Cut, alternate choosing edges on a graph G. Join s objective
More informationCSE 544: Principles of Database Systems
CSE 544: Principles of Database Systems Semijoin Reductions Theory Wrap-up CSE544 - Spring, 2012 1 Announcements Makeup lectures: Friday, May 18, 10:30-11:50, CSE 405 Friday, May 25, 10:30-11:50, CSE 405
More informationLecture 22 Acyclic Joins and Worst Case Join Results Instructor: Sudeepa Roy
CompSci 516 ata Intensive Computing Systems Lecture 22 Acyclic Joins and Worst Case Join Results Instructor: Sudeepa Roy uke CS, Fall 2016 CompSci 516: ata Intensive Computing Systems Announcements Final
More informationCPSC 536N: Randomized Algorithms Term 2. Lecture 10
CPSC 536N: Randomized Algorithms 011-1 Term Prof. Nick Harvey Lecture 10 University of British Columbia In the first lecture we discussed the Max Cut problem, which is NP-complete, and we presented a very
More informationOverview of DB & IR. ICS 624 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa
ICS 624 Spring 2011 Overview of DB & IR Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 1/12/2011 Lipyeow Lim -- University of Hawaii at Manoa 1 Example
More informationSteiner Trees and Forests
Massachusetts Institute of Technology Lecturer: Adriana Lopez 18.434: Seminar in Theoretical Computer Science March 7, 2006 Steiner Trees and Forests 1 Steiner Tree Problem Given an undirected graph G
More informationParameterized coloring problems on chordal graphs
Parameterized coloring problems on chordal graphs Dániel Marx Department of Computer Science and Information Theory, Budapest University of Technology and Economics Budapest, H-1521, Hungary dmarx@cs.bme.hu
More informationIntroduction Alternative ways of evaluating a given query using
Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction
More informationCSE 417 Network Flows (pt 3) Modeling with Min Cuts
CSE 417 Network Flows (pt 3) Modeling with Min Cuts Reminders > HW6 is due on Friday start early bug fixed on line 33 of OptimalLineup.java: > change true to false Review of last two lectures > Defined
More informationarxiv:cs/ v1 [cs.ds] 20 Feb 2003
The Traveling Salesman Problem for Cubic Graphs David Eppstein School of Information & Computer Science University of California, Irvine Irvine, CA 92697-3425, USA eppstein@ics.uci.edu arxiv:cs/0302030v1
More informationReductions and Satisfiability
Reductions and Satisfiability 1 Polynomial-Time Reductions reformulating problems reformulating a problem in polynomial time independent set and vertex cover reducing vertex cover to set cover 2 The Satisfiability
More informationNumber Theory and Graph Theory
1 Number Theory and Graph Theory Chapter 6 Basic concepts and definitions of graph theory By A. Satyanarayana Reddy Department of Mathematics Shiv Nadar University Uttar Pradesh, India E-mail: satya8118@gmail.com
More informationApplied Lagrange Duality for Constrained Optimization
Applied Lagrange Duality for Constrained Optimization Robert M. Freund February 10, 2004 c 2004 Massachusetts Institute of Technology. 1 1 Overview The Practical Importance of Duality Review of Convexity
More informationAdvanced Database Systems
Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed
More informationShortest path problems
Next... Shortest path problems Single-source shortest paths in weighted graphs Shortest-Path Problems Properties of Shortest Paths, Relaxation Dijkstra s Algorithm Bellman-Ford Algorithm Shortest-Paths
More informationEvaluation of Relational Operations. Relational Operations
Evaluation of Relational Operations Chapter 14, Part A (Joins) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Relational Operations v We will consider how to implement: Selection ( )
More informationDatabase Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler
Database Theory Database Theory VU 181.140, SS 2018 1. Introduction: Relational Query Languages Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 6 March,
More informationCS242: Probabilistic Graphical Models Lecture 3: Factor Graphs & Variable Elimination
CS242: Probabilistic Graphical Models Lecture 3: Factor Graphs & Variable Elimination Instructor: Erik Sudderth Brown University Computer Science September 11, 2014 Some figures and materials courtesy
More informationLecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize.
Cornell University, Fall 2017 CS 6820: Algorithms Lecture notes on the simplex method September 2017 1 The Simplex Method We will present an algorithm to solve linear programs of the form maximize subject
More informationQuery Processing & Optimization
Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction
More informationSchema Mappings and Data Exchange
Schema Mappings and Data Exchange Lecture #2 EASSLC 2012 Southwest University August 2012 1 The Relational Data Model (E.F. Codd 1970) The Relational Data Model uses the mathematical concept of a relation
More informationMath 170- Graph Theory Notes
1 Math 170- Graph Theory Notes Michael Levet December 3, 2018 Notation: Let n be a positive integer. Denote [n] to be the set {1, 2,..., n}. So for example, [3] = {1, 2, 3}. To quote Bud Brown, Graph theory
More informationLecture #7. 1 Introduction. 2 Dijkstra s Correctness. COMPSCI 330: Design and Analysis of Algorithms 9/16/2014
COMPSCI 330: Design and Analysis of Algorithms 9/16/2014 Lecturer: Debmalya Panigrahi Lecture #7 Scribe: Nat Kell 1 Introduction In this lecture, we will further examine shortest path algorithms. We will
More informationCS 377 Database Systems
CS 377 Database Systems Relational Algebra and Calculus Li Xiong Department of Mathematics and Computer Science Emory University 1 ER Diagram of Company Database 2 3 4 5 Relational Algebra and Relational
More informationHash-Based Indexing 165
Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19
More information9.1 Cook-Levin Theorem
CS787: Advanced Algorithms Scribe: Shijin Kong and David Malec Lecturer: Shuchi Chawla Topic: NP-Completeness, Approximation Algorithms Date: 10/1/2007 As we ve already seen in the preceding lecture, two
More informationA List Heuristic for Vertex Cover
A List Heuristic for Vertex Cover Happy Birthday Vasek! David Avis McGill University Tomokazu Imamura Kyoto University Operations Research Letters (to appear) Online: http://cgm.cs.mcgill.ca/ avis revised:
More informationRelational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007
Relational Query Optimization Yanlei Diao UMass Amherst October 23 & 25, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:
More informationGreedy Algorithms 1. For large values of d, brute force search is not feasible because there are 2 d
Greedy Algorithms 1 Simple Knapsack Problem Greedy Algorithms form an important class of algorithmic techniques. We illustrate the idea by applying it to a simplified version of the Knapsack Problem. Informally,
More informationLogic and Databases. Lecture 4 - Part 2. Phokion G. Kolaitis. UC Santa Cruz & IBM Research - Almaden
Logic and Databases Phokion G. Kolaitis UC Santa Cruz & IBM Research - Almaden Lecture 4 - Part 2 2 / 17 Alternative Semantics of Queries Bag Semantics We focused on the containment problem for conjunctive
More information4 Greedy Approximation for Set Cover
COMPSCI 532: Design and Analysis of Algorithms October 14, 2015 Lecturer: Debmalya Panigrahi Lecture 12 Scribe: Allen Xiao 1 Overview In this lecture, we introduce approximation algorithms and their analysis
More informationIntroduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe
Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms
More informationMenu. Algebraic Simplification - Boolean Algebra EEL3701 EEL3701. MSOP, MPOS, Simplification
Menu Minterms & Maxterms SOP & POS MSOP & MPOS Simplification using the theorems/laws/axioms Look into my... 1 Definitions (Review) Algebraic Simplification - Boolean Algebra Minterms (written as m i ):
More informationIntroduction to Approximation Algorithms
Introduction to Approximation Algorithms Dr. Gautam K. Das Departmet of Mathematics Indian Institute of Technology Guwahati, India gkd@iitg.ernet.in February 19, 2016 Outline of the lecture Background
More information2 The Fractional Chromatic Gap
C 1 11 2 The Fractional Chromatic Gap As previously noted, for any finite graph. This result follows from the strong duality of linear programs. Since there is no such duality result for infinite linear
More informationGeneralized trace ratio optimization and applications
Generalized trace ratio optimization and applications Mohammed Bellalij, Saïd Hanafi, Rita Macedo and Raca Todosijevic University of Valenciennes, France PGMO Days, 2-4 October 2013 ENSTA ParisTech PGMO
More informationMonochromatic loose-cycle partitions in hypergraphs
Monochromatic loose-cycle partitions in hypergraphs András Gyárfás Alfréd Rényi Institute of Mathematics Hungarian Academy of Sciences Budapest, P.O. Box 27 Budapest, H-364, Hungary gyarfas.andras@renyi.mta.hu
More informationCourse No: 4411 Database Management Systems Fall 2008 Midterm exam
Course No: 4411 Database Management Systems Fall 2008 Midterm exam Last Name: First Name: Student ID: Exam is 80 minutes. Open books/notes The exam is out of 20 points. 1 1. (16 points) Multiple Choice
More informationAssignment 5: Solutions
Algorithm Design Techniques Assignment 5: Solutions () Port Authority. [This problem is more commonly called the Bin Packing Problem.] (a) Suppose K = 3 and (w, w, w 3, w 4 ) = (,,, ). The optimal solution
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part VI Lecture 14, March 12, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part V Hash-based indexes (Cont d) and External Sorting Today s Session:
More information