Database Group Research Overview. Immanuel Trummer

Size: px
Start display at page:

Download "Database Group Research Overview. Immanuel Trummer"

Transcription

1 Database Group Research Overview Immanuel Trummer

2 Talk Overview User Query Data Analysis Result Processing

3 Talk Overview Fact Checking Query User Data Vocalization Data Analysis Result Processing Query Optimization

4 Talk Overview Fact Checking Query User Data Vocalization Data Analysis Result Processing Query Optimization

5 Data Vocalization

6 How to Present Data?

7 How to Present Data? Data Visual Interfaces User

8 How to Present Data? Data Visualization Visual Interfaces Data User

9 How to Present Data? Data Visualization Visual Interfaces Data Audio Interfaces User

10 How to Present Data? Data Visualization Visual Interfaces Data Data Vocalization Audio Interfaces User

11 Reading versus Listening Feature Written Text Voice Output Slowing Rereading Skimming

12 Vocalization: Overview Data Vocalization User

13 Vocalization: Overview Cognitive Load Memory Load Data Vocalization User

14 Vocalization: Overview Speaking Time Cognitive Load Memory Load Data Vocalization User

15 Vocalization: Overview Precision Speaking Time Cognitive Load Memory Load Data Vocalization User

16 Vocalization: Overview Precision Speaking Time Cognitive Load Memory Load Data Vocalization User Optimization Problem: Given input relation, find time-optimal vocalization under constraints on precision, output structure, and memory load.

17 Example

18 Vocalization: Naive Baseline Restaurant Cuisine Rating Upstate Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View Traditional American 4.9 La Masseria Italian 3.2 Input

19 Vocalization: Naive Baseline Restaurant Cuisine Rating Upstate Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View Traditional American 4.9 Vocalization Upstate with Traditional American cuisine and four point seven five stars user average rating. Thai Castle with Thai cuisine and three point three stars user average rating. John s with Traditional American cuisine and four point seven stars user average rating. Paris with French cuisine and three point three stars user average rating. The View with Traditional American cuisine and four point nine stars user average rating. La Masseria with Italian cuisine and three point two stars average rating. La Masseria Italian 3.2 Input Output

20 Vocalization: Naive Baseline Restaurant Cuisine Rating Upstate Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View Traditional American 4.9 Vocalization Upstate with Traditional American cuisine and four point seven five stars user average rating. Thai Castle with Thai cuisine and three point three stars user average rating. John s with Traditional American cuisine and four point seven stars user average rating. Paris with French cuisine and three point three stars user average rating. The View with Traditional American cuisine and four point nine stars user average rating. La Masseria with Italian cuisine and three point two stars average rating. La Masseria Italian 3.2 Input 502 Characters Output

21 Vocalization: Naive Baseline Restaurant Cuisine Rating Upstate Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Input Vocalization Upstate with Traditional American cuisine and four point seven five stars user average rating. Thai Castle with Thai cuisine and three point three stars user average rating. John s with Traditional American cuisine and four point seven stars user average rating. Paris with French cuisine and three point three stars user average rating. The View with Traditional American cuisine and four point nine stars user average rating. La Masseria with Italian cuisine and three point two stars average rating. " 502 Characters Output

22 Vocalization: Naive Baseline Restaurant Cuisine Rating Upstate Equal Values Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Input Vocalization Upstate with Traditional American cuisine and four point seven five stars user average rating. Thai Castle with Thai cuisine and three point three stars user average rating. John s with Traditional American cuisine and four point seven stars user average rating. Paris with French cuisine and three point three stars user average rating. The View with Traditional American cuisine and four point nine stars user average rating. La Masseria with Italian cuisine and three point two stars average rating. " 502 Characters Output

23 More Concise Version La Masseria with Italian cuisine and three point two stars average rating. Restaurant Cuisine Rating Upstate Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Vocalization Restaurants with Traditional American cuisine: Upstate with four point seven five stars user average rating. John s with four point seven stars user average rating. The View with four point nine stars user average rating. Restaurants with three point three stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine. Input Output

24 More Concise Version La Masseria with Italian cuisine and three point two stars average rating. Restaurant Cuisine Rating Upstate Traditional American Input 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Vocalization Restaurants with Traditional American cuisine: Upstate with four point seven five stars user average rating. John s with four point seven stars user average rating. The View with four point nine stars user average rating. Restaurants with three point three stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine. Contexts Output

25 More Concise Version La Masseria with Italian cuisine and three point two stars average rating. Restaurant Cuisine Rating Upstate Traditional American Input 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Vocalization Restaurants with Traditional American cuisine: Upstate with four point seven five stars user average rating. John s with four point seven stars user average rating. The View with four point nine stars user average rating. Restaurants with three point three stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine. Scopes Output

26 More Concise Version La Masseria with Italian cuisine and three point two stars average rating. Restaurant Cuisine Rating Upstate Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Vocalization Restaurants with Traditional American cuisine: Upstate with four point seven five stars user average rating. John s with four point seven stars user average rating. The View with four point nine stars user average rating. Restaurants with three point three stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine. Input Output

27 More Concise Version La Masseria with Italian cuisine and three point two stars average rating. Restaurant Cuisine Rating Upstate Traditional American Input 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Vocalization Restaurants with Traditional American cuisine: Upstate with four point seven five stars user average rating. John s with four point seven stars user average rating. The View with four point nine stars user average rating. Restaurants with three point three stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine Characters Output

28 More Concise Version La Masseria with Italian cuisine and three point two stars average rating. Restaurant Cuisine Rating Upstate Traditional American Input 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Vocalization Restaurants with Traditional American cuisine: Upstate with four point seven five stars user average rating. John s with four point seven stars user average rating. The View with four point nine stars user average rating. Restaurants with three point three stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine Characters Output "

29 More Concise Version La Masseria with Italian cuisine and three point two stars average rating. Restaurant Cuisine Rating Upstate Traditional American Input 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Vocalization Similar Values Restaurants with Traditional American cuisine: Upstate with four point seven five stars user average rating. John s with four point seven stars user average rating. The View with four point nine stars user average rating. Restaurants with three point three stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine. Output

30 Even More Concise Version Restaurant Cuisine Rating Upstate Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View Traditional American 4.9 Vocalization Restaurants with Traditional American cuisine and four to five stars user average rating: Upstate. John s. The View. Restaurants with three to four stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine. La Masseria with Italian cuisine. La Masseria Italian 3.2 Input Output

31 Even More Concise Version Restaurant Cuisine Rating Upstate Traditional American 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View Traditional American 4.9 Vocalization Restaurants with Traditional American cuisine and four to five stars user average rating: Upstate. John s. The View. Restaurants with three to four stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine. La Masseria with Italian cuisine. La Masseria Italian 3.2 Input Characters Output

32 Even More Concise Version Restaurant Cuisine Rating Upstate Traditional American Input 4.75 Thai Castle Thai 3.3 John s Traditional American 4.7 Paris French 3.3 The View La Masseria Traditional American 4.9 Italian 3.2 Vocalization Restaurants with Traditional American cuisine and four to five stars user average rating: Upstate. John s. The View. Restaurants with three to four stars user average rating: Thai Castle with Thai cuisine. Paris with French cuisine. La Masseria with Italian cuisine Characters Output #

33 Vocalization Problem Input: a relation to vocalize Search Space: sequence of scopes Constraints: Context size S Categorical value domains: Domain size C Numerical value domains: Upper Bound Lower Bound * W Memory Precision Precision Objective: minimize speaking time!

34 Vocalization Problem Input: a relation to vocalize NP-Hard! Search Space: sequence of scopes Constraints: Context size S Categorical value domains: Domain size C Numerical value domains: Upper Bound Lower Bound * W Memory Precision Precision Objective: minimize speaking time!

35 Algorithms

36 Overview of Algorithms

37 Overview of Algorithms Polynomial Exponential Complexity

38 Overview of Algorithms Guarantees Optimal Near-Optimal None Polynomial Exponential Complexity

39 Overview of Algorithms Optimal Guarantees Integer Programming Near-Optimal None Polynomial Exponential Complexity

40 Overview of Algorithms Optimal Guarantees Integer Programming Near-Optimal None Polynomial Two-Phase Algorithm Exponential Complexity

41 Overview of Algorithms Optimal Near-Optimal None Guarantees Integer Programming Greedy Approach Two-Phase Algorithm Polynomial Exponential Complexity

42 Greedy Algorithm Find Optimal Voice Output Optimal Exponential

43 Greedy Algorithm Find Optimal Context Set Find Optimal Voice Output Optimal Optimal Exponential Polynomial

44 Greedy Algorithm Greedily Construct Context Set Find Optimal Context Find Optimal Voice Output? Optimal Optimal Polynomial Exponential Polynomial

45 Greedy Algorithm Greedily Construct Context Set Find Optimal Context Find Optimal Voice Output? Optimal Optimal Time Savings(Context Set) Property Holds? Submodular Monotone Non-Negative Polynomial Exponential Polynomial

46 Greedy Algorithm Greedily Construct Context Set Find Optimal Context Find Optimal Voice Output Near-Optimal Optimal Optimal Time Savings(Context Set) Property Holds? Submodular Monotone Non-Negative Polynomial Exponential Polynomial

47 Greedy Algorithm Greedily Construct Context Set Greedily Construct Context Find Optimal Voice Output Near-Optimal? Optimal Time Savings(Context Set) Property Holds? Submodular Monotone Non-Negative Polynomial Polynomial Polynomial

48 Greedy Algorithm Greedily Construct Context Set Greedily Construct Context Find Optimal Voice Output Near-Optimal? Optimal Time Savings(Context Set) Property Holds? Submodular Monotone Non-Negative Polynomial Polynomial Polynomial Time Savings(Assignments) Property Holds? Submodular Monotone Non-Negative

49 Greedy Algorithm Greedily Construct Context Set Greedily Construct Context Find Optimal Voice Output Near-Optimal Near-Optimal Optimal Time Savings(Context Set) Property Holds? Submodular Monotone Non-Negative Polynomial Polynomial Polynomial Time Savings(Assignments) Property Holds? Submodular Monotone Non-Negative

50 Experiments

51 Scope of Experiments

52 Scope of Experiments Restaurants Mobile Phones Football Statistics Laptop Models Scenario

53 Scope of Experiments Restaurants Mobile Phones Football Statistics Laptop Models Naive Baseline Scenario Algorithm Integer Programming Two-Phase Algorithm Greedy Approach (Different Configurations)

54 Scope of Experiments Metric User Preference Optimization Time Speech Length Restaurants Mobile Phones Football Statistics Laptop Models Naive Baseline Scenario Algorithm Integer Programming Two-Phase Algorithm Greedy Approach (Different Configurations)

55 Experimental Platform Hardware MacBook Pro 2.3 GHz CPU 8 GB memory Software CPLEX 12.7 IBM Watson MacOS 10.12

56 Speech Length Precision Complexity #Rows Greedy Approach Two-Phase Algorithm Integer Programming High Medium Low High Medium Low Low Low Low Medium Medium Medium #Rows #Rows #Rows #Rows #Rows Laptops Restaurants Football Phones

57 Speech Length Reduced Length Precision Complexity #Rows Greedy Approach Two-Phase Algorithm Integer Programming High Medium Low High Medium Low Low Low Low Medium Medium Medium #Rows #Rows #Rows #Rows #Rows Laptops Restaurants Football Phones

58 Speech Length Reduced Length Precision Complexity #Rows Greedy Approach Two-Phase Algorithm Integer Programming High Medium Low High Medium Low Low Low Low Medium Medium Medium #Rows #Rows #Rows #Rows #Rows Laptops Restaurants Football Phones

59 Speech Length Reduced Length Precision Complexity #Rows Greedy Approach Two-Phase Algorithm Integer Programming High Medium Low High Medium Low Low Low Low Medium Medium Medium #Rows #Rows #Rows #Rows #Rows Laptops Restaurants Football Phones

60 Speech Length Reduced Length Precision Complexity #Rows Greedy Approach Two-Phase Algorithm Integer Programming High Medium Low High Medium Low Low Low Low Medium Medium Medium #Rows #Rows #Rows #Rows #Rows Best Laptops Restaurants Football Phones

61 Speech Length Reduced Length Precision Complexity #Rows Greedy Approach Two-Phase Algorithm Integer Programming High Medium Low High Medium Low Low Low Low Medium Medium Medium #Rows #Rows #Rows #Rows #Rows Best Entropy: 0.59 Entropy: 0.7 Laptops Restaurants Entropy: 0.65 Entropy: 0.59 Football Phones

62 User Preference vs. Time 10 Naive Concise 60 Naive Concise Preference (#Votes) Speech Length (Seconds) #Rows #Rows

63 Optimization Time Precision Complexity Greedy Approach Two-Phase Algorithm Integer Programming High Medium Low High Medium Low Low Low Low Medium Medium Medium Laptops Restaurants Football Phones #Rows #Rows #Rows #Rows #Rows #Rows

64 Best Precision Complexity Optimization Time Greedy Approach Two-Phase Algorithm Integer Programming High Medium Low High Medium Low Low Low Low Medium Medium Medium Laptops Restaurants Football Phones #Rows #Rows #Rows #Rows #Rows #Rows

65 Data to Text Written Text Sonification Non-Speech Speech Output Text for Voice Output Vocalization Relational Input Input is Text Text Reader Audio Output Visual Output Visualization

66 Talk Overview Fact Checking Query User Data Vocalization Data Analysis Result Processing Query Optimization

67 Talk Overview Fact Checking Query User Data Vocalization Data Analysis Result Processing Query Optimization

68 Query Optimization

69 Problem Model Tuning Knobs Operation order Operators Query Query Optimization Cost Metric Execution time Optimal Plan(s)

70 Problem Model Tuning Knobs Operation order Operators NP-Hard to Approximate Query Query Optimization Optimal Plan(s) NP-Hard to Solve Cost Metric Execution time

71 Pre-Computation Before Run Time VLDB 15 VLDBJ 16 Dissertation Overview Query Template Optimizer Potentially Optimal Plans CACM Research Jim Gray Highlight Award Best of VLDB Query Optimizer Best of SIGMOD Optimal Plans SLOW Exhaustive SLOW SIGMOD 17 SIGMOD 14 Query Optimizer Solver Near-Optimal Plans Solver FASTER Query Optimizer FASTER Near-Optimal Plans VLDB 16 Optimizer SIGMOD 15 Query * Plans Parallel FASTER Query Optimizer Acceptable Plans Near-Optimal Plans Optimal Plans Approximate Incremental Random Optimality Guarantees VLDB 16 FASTER Optimizer SIGMOD 16 Query Best Effort Plans Quantum Optimization Platform Query Optimizer Best Effort Plans FAST FAST

72 Solving Hard Optimization Problems Optimization Problem

73 Solving Hard Optimization Problems Optimization Problem Specialized Solver

74 Solving Hard Optimization Problems Standard Problem Optimization Problem Specialized Solver

75 Solving Hard Optimization Problems Standard Problem Optimization Problem Specialized Solver Standard Solver

76 Solving Hard Optimization Problems Join Ordering Optimization Problem Standard Problem Specialized Solver Standard Solver

77 Solving Hard Optimization Problems Join Ordering Optimization Problem Standard Problem Specialized Solver Query Optimizer Standard Solver

78 Solving Hard Optimization Problems Join Ordering Optimization Problem MILP Standard Problem Specialized Solver Query Optimizer Standard Solver

79 Solving Hard Optimization Problems Join Ordering Optimization Problem MILP Standard Problem Specialized Solver Query Optimizer Standard Solver MILP Solver

80 Evolution of MILP Solvers CPLEX Version CPLEX MILP Speedup Year Source: A brief history of linear and mixed-integer linear programming by R. Bixby

81 Evolution of MILP Solvers CPLEX Version Hardware Independently CPLEX MILP Speedup Year Source: A brief history of linear and mixed-integer linear programming by R. Bixby

82 Potential Benefits 1E+16 Search Space Size (#Relations) 1E+08 1E Number of Joined Tables

83 Potential Benefits 1E+16 Exhaustive Non-Exhaustive Search Space Size (#Relations) 1E+08 1E Number of Joined Tables

84 Potential Benefits 1E+16 Exhaustive MILP Non-Exhaustive Search Space Size (#Relations) 1E+08 1E Number of Joined Tables

85 Potential Benefits Search Space Size (#Relations) 1E+16 1E+08 1E+00 Exhaustive MILP Non-Exhaustive + Parallel Optimization Number of Joined Tables

86 Potential Benefits Search Space Size (#Relations) 1E+16 1E+08 1E+00 Exhaustive MILP Non-Exhaustive + Parallel Optimization + Incremental Optimization Number of Joined Tables

87 Potential Benefits Search Space Size (#Relations) 1E+16 1E+08 1E+00 Exhaustive MILP Non-Exhaustive + Parallel Optimization + Incremental Optimization + Configuration Options Number of Joined Tables

88 Transformation

89 Overview Join Ordering Given tables with cardinality, predicates with selectivity Find optimal join order, best operators, etc. Transformation Mixed Integer Linear Programming Integer and continuous Variables Set of Linear Constraints Linear Objective Function

90 Transformation Scope Queries: Projections, N-Ary Predicates, Sub-Queries, Operators: Hash Join, BNL Join, Sort-Merge Join, Cost: Interesting Orders, Correlated Predicates, Byte Sizes

91 Transformation Scope Queries: Projections, N-Ary Predicates, Sub-Queries, See Paper Operators: Hash Join, BNL Join, Sort-Merge Join, Cost: Interesting Orders, Correlated Predicates, Byte Sizes

92 Transformation Scope Queries: Projections, N-Ary Predicates, Sub-Queries, See Paper Operators: Hash Join, BNL Join, Sort-Merge Join, Cost: Interesting Orders, Correlated Predicates, Byte Sizes This Talk: Highly Simplified Model

93 SELECT * FROM R,S,T WHERE P(R,S)

94 SELECT * FROM R,S,T WHERE P(R,S)????

95 SELECT * FROM R,S,T WHERE P(R,S) BR BS BR BS Operand Tables? Operand Tables BT BR BS BT? Operand Tables BT BR BS?? Operand Tables BT BR BS Operand Tables BT Variables

96 SELECT * FROM R,S,T WHERE P(R,S) BR BS BR BS Operand Tables? Operand Tables BT BR BS BT? Operand Tables BT BR BS?? Operand Tables BT BR BS Operand Tables BT Constraints Variables

97 SELECT * FROM R,S,T WHERE P(R,S) BR BS BR BS Operand Tables? Operand Tables BT BR BS BT? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

98 SELECT * FROM R,S,T WHERE P(R,S) BR BS BR BS Operand Tables? Operand Tables BT BR BS BT =1? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

99 RC SELECT * FROM R,S,T WHERE P(R,S) Cardinality BR RC BS BR? Operand Tables Cardinality BS Operand Tables BT RC BR BS BT =1 Cardinality? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

100 RC SELECT * FROM R,S,T WHERE P(R,S) Cardinality BP? Operand Tables Applicable Predicates BR RC BS BR Cardinality BS Operand Tables BT RC BR BS BT =1 Cardinality? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

101 Calculating Cardinality T 1 T n WHERE P 1 P m

102 Calculating Cardinality Cardinality(T 1 T n WHERE P 1 P m )

103 Calculating Cardinality Cardinality(T 1 T n WHERE P 1 P m ) = icardinality(t i )* iselectivity(p i )

104 Calculating Cardinality Cardinality(T 1 T n WHERE P 1 P m ) = icardinality(t i )* iselectivity(p i )

105 Calculating Cardinality Cardinality(T 1 T n WHERE P 1 P m ) = icardinality(t i )* iselectivity(p i ) Not Linear!

106 Calculating Cardinality Cardinality(T 1 T n WHERE P 1 P m ) = icardinality(t i )* iselectivity(p i )

107 Calculating Cardinality Log(Cardinality(T 1 T n WHERE P 1 P m )) = Log( icardinality(t i )* iselectivity(p i ))

108 Calculating Cardinality Log(Cardinality(T 1 T n WHERE P 1 P m )) = Log( icardinality(t i )* iselectivity(p i )) = ilog(cardinality(t i ))+ ilog(selectivity(p i ))

109 Calculating Cardinality Log(Cardinality(T 1 T n WHERE P 1 P m )) = Log( icardinality(t i )* iselectivity(p i )) = ilog(cardinality(t i ))+ ilog(selectivity(p i ))

110 Calculating Cardinality Log(Cardinality(T 1 T n WHERE P 1 P m )) = Log( icardinality(t i )* iselectivity(p i )) = ilog(cardinality(t i ))+ ilog(selectivity(p i )) Linear!

111 RC SELECT * FROM R,S,T WHERE P(R,S) Cardinality BP? Operand Tables Applicable Predicates BR RC BS BR Cardinality BS Operand Tables BT RC BR BS BT =1 Cardinality? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

112 RC SELECT * FROM R,S,T WHERE P(R,S) Cardinality RL BP? Operand Tables Log-Cardinality Applicable Predicates BR RC BS BR Cardinality BS Operand Tables BT RC BR BS BT =1 Cardinality? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

113 Calculating Cardinality II Cardinality Log Cardinality

114 Calculating Cardinality II Cardinality Log Cardinality

115 Calculating Cardinality II Cardinality Log Cardinality BT1 BT2 BT3

116 Calculating Cardinality II Cardinality =0 =1 Log Cardinality =0 BT1 BT2 BT3 =1 =0 =1

117 Calculating Cardinality II Cardinality Approximate Cardinality =0 =1 Log Cardinality =0 BT1 BT2 BT3 =1 =0 =1

118 RC SELECT * FROM R,S,T WHERE P(R,S) Cardinality RL BP? Operand Tables Log-Cardinality Applicable Predicates BR RC BS BR Cardinality BS Operand Tables BT RC BR BS BT =1 Cardinality? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

119 RC SELECT * FROM R,S,T WHERE P(R,S) Cardinality BT1 BT2 RL BP? Operand Tables Cardinality Bounds Log-Cardinality Applicable Predicates BR RC BS BR Cardinality BS Operand Tables BT3 BT RC BR BS BT =1 Cardinality? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

120 RC SELECT * FROM R,S,T WHERE P(R,S) Cardinality RC Approximate Cardinality BT1 BT2 RL BP? Operand Tables Cardinality Bounds Log-Cardinality Applicable Predicates BR RC BS BR Cardinality BS Operand Tables BT3 BT RC BR BS BT =1 Cardinality? Operand Tables BT =1 BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

121 RC SELECT * FROM R,S,T WHERE P(R,S) Cardinality RC Approximate Cardinality BT1 BT2 RL BP? Operand Tables Cardinality Bounds Log-Cardinality Applicable Predicates BR BS Cardinality Operand Tables BT3 BT =1 Cardinality? Operand Tables =1 #Variables O(nrTables 2 ) RC BR BS RC BR BS BT BT BR BS?? Operand Tables BT =1 BR BS Operand Tables BT =1 Constraints Variables

122 Experimental Results

123 Prototypical Implementation Postgres Parser Optimizer ILP Solver Execution

124 Prototypical Implementation Postgres Parser Optimizer ILP Solver Execution

125 Prototypical Implementation Postgres Parser Optimizer ILP Solver Decomposition Join Ordering Execution

126 Prototypical Implementation Postgres Parser Optimizer ILP Solver Decomposition Join Ordering Genetic Algorithm Dynamic Programming Execution

127 Prototypical Implementation Postgres Parser Optimizer ILP Solver Decomposition Join Ordering Genetic Algorithm Dynamic Programming MILP Proxy Execution Query & Statistics Optimal Order MILP Problem MILP MILP Solution Generation

128 Prototypical Implementation Postgres Parser Optimizer ILP Solver 32 LOC Decomposition Join Ordering Genetic Algorithm Dynamic Programming MILP Proxy Execution Query & Statistics Optimal Order MILP Problem 396 LOC MILP MILP Solution Generation

129 Experimental Scope

130 Parameters Experimental Scope

131 Parameters Experimental Scope NFL TPC-H Graphs Loans Survey Musicbrainz Database

132 Parameters Experimental Scope NFL TPC-H Graphs Loans Survey Musicbrainz Database Star Chain Cycle Irregular Queries

133 Parameters Experimental Scope NFL TPC-H Graphs Loans Survey Musicbrainz Database Integer Programming Dynamic Programming: Bushy Dynamic Programming: LD+Cross Dynamic Programming: Left-Deep Genetic Algorithm: Left-Deep Genetic Algorithm: Bushy Optimizers Star Chain Cycle Irregular Queries

134 Parameters Experimental Scope NFL TPC-H Graphs Loans Survey Musicbrainz Database Integer Programming Dynamic Programming: Bushy Dynamic Programming: LD+Cross Dynamic Programming: Left-Deep Genetic Algorithm: Left-Deep Genetic Algorithm: Bushy Optimizers Star Chain Cycle Irregular Queries Metrics

135 Parameters Experimental Scope NFL TPC-H Graphs Loans Survey Musicbrainz Database Integer Programming Dynamic Programming: Bushy Dynamic Programming: LD+Cross Dynamic Programming: Left-Deep Genetic Algorithm: Left-Deep Genetic Algorithm: Bushy Optimizers Star Chain Cycle Irregular Queries Metrics Optimization Time Plan Quality

136 Parameters Experimental Scope NFL TPC-H Graphs Loans Survey Musicbrainz Database Integer Programming Dynamic Programming: Bushy Dynamic Programming: LD+Cross Dynamic Programming: Left-Deep Genetic Algorithm: Left-Deep Genetic Algorithm: Bushy Optimizers Star Chain Cycle Irregular Queries Metrics Optimization Time Plan Quality Guarantees Actual

137 Parameters Experimental Scope NFL TPC-H Graphs Loans Survey Musicbrainz Database Integer Programming Dynamic Programming: Bushy Dynamic Programming: LD+Cross Dynamic Programming: Left-Deep Genetic Algorithm: Left-Deep Genetic Algorithm: Bushy Optimizers Star Chain Cycle Irregular Queries Metrics Optimization Time Plan Quality Guarantees Actual Estimated Measured

138 Parameters Experimental Scope NFL TPC-H Graphs Loans Survey Musicbrainz Database Integer Programming Dynamic Programming: Bushy Dynamic Programming: LD+Cross Dynamic Programming: Left-Deep Genetic Algorithm: Left-Deep Genetic Algorithm: Bushy Optimizers Star Chain Cycle Irregular Queries Metrics Optimization Time Plan Quality Guarantees Actual Estimated Approximated Model Postgres Model Measured DBMS-X Postgres

139 Optimization Time DP-Left DP-Left+Cartesian DP-Bushy IP T-out 500 sec IP T-out 50 sec IP T-out 5 sec 500 sec 50 sec 5 sec Optimization Time (ms) Memory out Memory out Memory out Number of Joined Tables DBMS: Postgres 9.6 ILP Solver: Gurobi 6.5 Hardware: 12 GB Memory, 2.6 GHz CPU

140 Optimization Time DP-Left DP-Left+Cartesian DP-Bushy IP T-out 500 sec IP T-out 50 sec IP T-out 5 sec 500 sec 50 sec 5 sec Optimization Time (ms) Memory out Memory out Memory out Number of Joined Tables DBMS: Postgres 9.6 ILP Solver: Gurobi 6.5 Hardware: 12 GB Memory, 2.6 GHz CPU

141 Optimization Time DP-Left DP-Left+Cartesian DP-Bushy IP T-out 500 sec IP T-out 50 sec IP T-out 5 sec 500 sec 50 sec 5 sec Optimization Time (ms) Memory out Memory out Memory out Join Orders x Relations x Number of Joined Tables DBMS: Postgres 9.6 ILP Solver: Gurobi 6.5 Hardware: 12 GB Memory, 2.6 GHz CPU

142 MILP Optimizer Timeouts Budget 5 Timeouts Budget 50 Timeouts Budget 500 Timeouts Timeout (%) Number of Joined Tables DBMS: Postgres 9.6 ILP Solver: Gurobi 6.5 Hardware: 12 GB Memory, 2.6 GHz CPU

143 Guarantees on Plan Quality After 5 Seconds Survey-Star Loan-Star NFL-Star Graph-Chain Graph-Cycle Plan Cost Bound Number of Joined Tables DBMS: Postgres 9.6 ILP Solver: Gurobi 6.5 Hardware: 12 GB Memory, 2.6 GHz CPU

144 Approximate Cost Model vs. Postgres Cost Model Theoretical Worst Case Empirical Worst Case Optimal Plan 4 Cost Overhead Number of Joined Tables DBMS: Postgres 9.6 ILP Solver: Gurobi 6.5 Hardware: 12 GB Memory, 2.6 GHz CPU

145 10000 Musicbrainz : Combined Optimization and Execution Time Genetic Optimizer Left Genetic Optimizer Bushy MILP Optimizer Genetic Optimizer Left Genetic Optimizer Bushy MILP Optimizer Genetic Optimizer Left Genetic Optimizer Bushy MILP Optimizer Joined Tables Joined Tables Joined Tables Optimization Time Execution Time Total Time DBMS: Postgres 9.6 ILP Solver: Gurobi 6.5 Hardware: 12 GB Memory, 2.6 GHz CPU

146 Replacing Cost Models Motivation: unreliable execution cost estimates Correlated data User-defined predicates Goal: regret bounded query evaluation Method: use reinforcement learning (UCT)

147 Talk Overview Fact Checking Query User Data Vocalization Data Analysis Result Processing Query Optimization

148 Talk Overview Fact Checking Query User Data Vocalization Data Analysis Result Processing Query Optimization

149 The FactChecker

150 Text Summarizing Data

151 Text Summarizing Data

152 Text Summarizing Data

153 Authoring Tools Spell Checker

154 Authoring Tools Fact Checker

155 Input Overview Output Data Text Fact Checker Verified

156 Input Overview Output Data Text Fact Checker Claim: The average income in New York was $61,437. Verified

157 Input Overview Output Data Text Fact Checker Claim: The average income in New York was $61,437. Query: SELECT Avg(Income) FROM Population WHERE State = NY Claimed Result: 61,437 Verified

158 Input Overview Output Data Text Fact Checker Claim: The average income in New York was $61,437. Query: SELECT Avg(Income) FROM Population WHERE State = NY Claimed Result: 61,437 Evaluation Result: 60,419 Verified

159 Input Overview Output Data Text No Match! Fact Checker Claim: The average income in New York was $61,437. Query: SELECT Avg(Income) FROM Population WHERE State = NY Claimed Result: 61,437 Evaluation Result: 60,419 Verified

160 Fact Checking vs. Natural Language Querying Criterion NL Querying Fact Checking Input Type Query Query + Result Input Format One Sentence Multiple Sentences Batch Size One Query Multiple Related Claims

161 High-Level Overview Main Ideas Massive scale evaluation of candidates Search for common document theme Scope Focus on query sub-class E.g., SELECT AVG(Income)WHERE Profession = Computer Scientist AND Location = NY Semi-automatic fact checking

162 Matching

163 Matching Query Fragments Avg() Average, mean, State = NY NY, State, New York State, Empire State Income Income, Gain,

164 Matching Query Fragments Avg() Average, mean, State = NY NY, State, New York State, Empire State Claim Sentence Income Income, Gain,

165 Claim Keywords Matching Query Fragments Headline Section Headline Avg() Average, mean, Paragraph Start Claim Sentence State = NY NY, State, New York State, Empire State Income Income, Gain,

166 Claim Keywords IR Matching Query Fragments Headline Section Headline Avg() Average, mean, Paragraph Start Claim Sentence State = NY NY, State, New York State, Empire State Income Income, Gain,

167 Probabilistic Model Document Parameters Pr(Function) Pr(Aggregate) Pr(Predicate)

168 Probabilistic Model Document Parameters Pr(Function) Pr(Aggregate) Pr(Predicate) Claim 1 Variables Claim n Variables Q1 Qn

169 Probabilistic Model Document Parameters Pr(Function) Pr(Aggregate) Pr(Predicate) Claim 1 Variables Claim n Variables Q1 Qn S1 E1 Sn En

170 Probabilistic Model Document Parameters Pr(Function) Pr(Aggregate) Pr(Predicate) Claim 1 Variables Claim n Variables Q1 Qn S1 E1 Sn En

171 Probabilistic Model Document Parameters Pr(Function) Pr(Aggregate) Pr(Predicate) Claim 1 Variables Expectation Claim n Variables Q1 Qn S1 E1 Sn En

172 Probabilistic Model Maximization Document Parameters Pr(Function) Pr(Aggregate) Pr(Predicate) Claim 1 Variables Expectation Claim n Variables Q1 Qn S1 E1 Sn En

173 Probabilistic Model Maximization Document Parameters Pr(Function) Pr(Aggregate) Pr(Predicate) Claim 1 Variables Expectation Claim n Variables Q1 Qn S1 E1 Sn En Requires Execution!

174 Query Evaluation Fact Checker - Execution Engine Cost Model + Optimizer Query Merging Result Cache Incremental Execution Relational DBMS

175 User Interface

176 User Interface

177 User Interface

178 User Interface

179 Test Cases 34 articles with relational data sets Sources New York Times FiveThirtyEight Stack Overflow Data set size: from few KB to ~100 MB

180 Analysis of Test Cases

181 Claims by Article Correct Incorrect # Claims Articles

182 Claim Statistics Granularity Total Erroneous Erroneous (%) Articles Claims

183 Query Complexity No Predicates One Predicate Two Predicates 12% 18% 70%

184 Impact of Document Theme Aggregation Function Set of Restricted Columns Aggregated Column Coverage (%) Top-N Likely Candidates (N)

185 Evaluating Fact Checking

186 Accuracy of Query Mapping Total Correct Claim Wrong Claim Coverage (%) Top-N Queries (N)

187 Number of Possible Queries 1E+12 #Possible Queries 1E+06 1E+00 Test Cases

188 Impact of Keyword Source 80 Sentence +Previous +First in Paragraph +Synonyms +Headlines 70 Coverage (%) Top 1 Top 5 Top 10 Top-N Likely Candidates (N)

189 Time vs. Accuracy: #Iterations Top Coverage (%) Execution Time (s)

190 Impact of Optimizations on Execution Time Version Run Time (s) Speedup Naive Query Merging 82 x62 + Result Caching 57 x1.4 + Incremental Evaluation 54 x1.05

191 Execution Time Breakdown Indexing Expectation Maximization Information Retrieval Query Processing 10% 5% 24% 20% 6% 61% 67% 7% Small Data Large Data

192 User Study

193 User Study Setup Goal: compare FactChecker vs. SQL interface Criterion: time to verify claims in documents Participants: eight participants (7 CS majors) Setup: Intro: six minutes tutorial for fact checker Alternate between two interfaces Two long articles, 20 minutes per article Four short articles, 5 minutes per article

194 #Claims Verified by Article Stack Overflow 2016 Stack Overflow 2015 HipHop Candidate Airline Etiquettes Elo Blattner Stack Overflow # Claims Verified Time (s) Time (s) Time (s) Time (s) Fact Checker (AVG) SQL (AVG) Time (s) Time (s)

195 FC-Speedup by User 18 Speedup (#Verifications/Minute) U1 U2 U3 U4 U5 U6 U7 U8 AVG User

196 Features Used Top-1 Top-5 Top-10 Custom 5% 13% 45% 38%

197 Precision and Recall Approach Recall Precision F1 Score FactChecker + User FactChecker (Automatic) 100% 91% 95% 67% 67% 67% SQL + User 30% 57% 39%

198 Talk Overview Fact Checking Query User Data Vocalization Data Analysis Result Processing Query Optimization

199 ?

Accelerating Analytical Workloads

Accelerating Analytical Workloads Accelerating Analytical Workloads Thomas Neumann Technische Universität München April 15, 2014 Scale Out in Big Data Analytics Big Data usually means data is distributed Scale out to process very large

More information

Column Stores vs. Row Stores How Different Are They Really?

Column Stores vs. Row Stores How Different Are They Really? Column Stores vs. Row Stores How Different Are They Really? Daniel J. Abadi (Yale) Samuel R. Madden (MIT) Nabil Hachem (AvantGarde) Presented By : Kanika Nagpal OUTLINE Introduction Motivation Background

More information

Join Processing for Flash SSDs: Remembering Past Lessons

Join Processing for Flash SSDs: Remembering Past Lessons Join Processing for Flash SSDs: Remembering Past Lessons Jaeyoung Do, Jignesh M. Patel Department of Computer Sciences University of Wisconsin-Madison $/MB GB Flash Solid State Drives (SSDs) Benefits of

More information

Vocalizing Large Time Series Efficiently

Vocalizing Large Time Series Efficiently Vocalizing Large Time Series Efficiently Immanuel Trummer, Mark Bryan, Ramya Narasimha {it224,mab539,rn343}@cornell.edu Cornell University ABSTRACT We vocalize query results for time series data. We describe

More information

Column-Stores vs. Row-Stores. How Different are they Really? Arul Bharathi

Column-Stores vs. Row-Stores. How Different are they Really? Arul Bharathi Column-Stores vs. Row-Stores How Different are they Really? Arul Bharathi Authors Daniel J.Abadi Samuel R. Madden Nabil Hachem 2 Contents Introduction Row Oriented Execution Column Oriented Execution Column-Store

More information

Weaving Relations for Cache Performance

Weaving Relations for Cache Performance VLDB 2001, Rome, Italy Best Paper Award Weaving Relations for Cache Performance Anastassia Ailamaki David J. DeWitt Mark D. Hill Marios Skounakis Presented by: Ippokratis Pandis Bottleneck in DBMSs Processor

More information

Parallel Query Optimisation

Parallel Query Optimisation Parallel Query Optimisation Contents Objectives of parallel query optimisation Parallel query optimisation Two-Phase optimisation One-Phase optimisation Inter-operator parallelism oriented optimisation

More information

Advanced Data Management

Advanced Data Management Advanced Data Management Medha Atre Office: KD-219 atrem@cse.iitk.ac.in Aug 11, 2016 Assignment-1 due on Aug 15 23:59 IST. Submission instructions will be posted by tomorrow, Friday Aug 12 on the course

More information

XML Data Stream Processing: Extensions to YFilter

XML Data Stream Processing: Extensions to YFilter XML Data Stream Processing: Extensions to YFilter Shaolei Feng and Giridhar Kumaran January 31, 2007 Abstract Running XPath queries on XML data steams is a challenge. Current approaches that store the

More information

Column-Stores vs. Row-Stores: How Different Are They Really?

Column-Stores vs. Row-Stores: How Different Are They Really? Column-Stores vs. Row-Stores: How Different Are They Really? Daniel J. Abadi, Samuel Madden and Nabil Hachem SIGMOD 2008 Presented by: Souvik Pal Subhro Bhattacharyya Department of Computer Science Indian

More information

Triangle SQL Server User Group Adaptive Query Processing with Azure SQL DB and SQL Server 2017

Triangle SQL Server User Group Adaptive Query Processing with Azure SQL DB and SQL Server 2017 Triangle SQL Server User Group Adaptive Query Processing with Azure SQL DB and SQL Server 2017 Joe Sack, Principal Program Manager, Microsoft Joe.Sack@Microsoft.com Adaptability Adapt based on customer

More information

Relational Database Index Design and the Optimizers

Relational Database Index Design and the Optimizers Relational Database Index Design and the Optimizers DB2, Oracle, SQL Server, et al. Tapio Lahdenmäki Michael Leach (C^WILEY- IX/INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents Preface xv 1

More information

Red Fox: An Execution Environment for Relational Query Processing on GPUs

Red Fox: An Execution Environment for Relational Query Processing on GPUs Red Fox: An Execution Environment for Relational Query Processing on GPUs Georgia Institute of Technology: Haicheng Wu, Ifrah Saeed, Sudhakar Yalamanchili LogicBlox Inc.: Daniel Zinn, Martin Bravenboer,

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs

More information

Database Learning: Toward a Database that Becomes Smarter Over Time

Database Learning: Toward a Database that Becomes Smarter Over Time Database Learning: Toward a Database that Becomes Smarter Over Time Yongjoo Park Ahmad Shahab Tajik Michael Cafarella Barzan Mozafari University of Michigan, Ann Arbor Today s databases Database Users

More information

Applying Multi-Core Model Checking to Hardware-Software Partitioning in Embedded Systems

Applying Multi-Core Model Checking to Hardware-Software Partitioning in Embedded Systems V Brazilian Symposium on Computing Systems Engineering Applying Multi-Core Model Checking to Hardware-Software Partitioning in Embedded Systems Alessandro Trindade, Hussama Ismail, and Lucas Cordeiro Foz

More information

Obstacle-Aware Longest-Path Routing with Parallel MILP Solvers

Obstacle-Aware Longest-Path Routing with Parallel MILP Solvers , October 20-22, 2010, San Francisco, USA Obstacle-Aware Longest-Path Routing with Parallel MILP Solvers I-Lun Tseng, Member, IAENG, Huan-Wen Chen, and Che-I Lee Abstract Longest-path routing problems,

More information

Outline. Database Management and Tuning. Outline. Join Strategies Running Example. Index Tuning. Johann Gamper. Unit 6 April 12, 2012

Outline. Database Management and Tuning. Outline. Join Strategies Running Example. Index Tuning. Johann Gamper. Unit 6 April 12, 2012 Outline Database Management and Tuning Johann Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 6 April 12, 2012 1 Acknowledgements: The slides are provided by Nikolaus Augsten

More information

Red Fox: An Execution Environment for Relational Query Processing on GPUs

Red Fox: An Execution Environment for Relational Query Processing on GPUs Red Fox: An Execution Environment for Relational Query Processing on GPUs Haicheng Wu 1, Gregory Diamos 2, Tim Sheard 3, Molham Aref 4, Sean Baxter 2, Michael Garland 2, Sudhakar Yalamanchili 1 1. Georgia

More information

Improving the Performance of OLAP Queries Using Families of Statistics Trees

Improving the Performance of OLAP Queries Using Families of Statistics Trees Improving the Performance of OLAP Queries Using Families of Statistics Trees Joachim Hammer Dept. of Computer and Information Science University of Florida Lixin Fu Dept. of Mathematical Sciences University

More information

Outline. Database Tuning. Join Strategies Running Example. Outline. Index Tuning. Nikolaus Augsten. Unit 6 WS 2014/2015

Outline. Database Tuning. Join Strategies Running Example. Outline. Index Tuning. Nikolaus Augsten. Unit 6 WS 2014/2015 Outline Database Tuning Nikolaus Augsten University of Salzburg Department of Computer Science Database Group 1 Examples Unit 6 WS 2014/2015 Adapted from Database Tuning by Dennis Shasha and Philippe Bonnet.

More information

Elementary IR: Scalable Boolean Text Search. (Compare with R & G )

Elementary IR: Scalable Boolean Text Search. (Compare with R & G ) Elementary IR: Scalable Boolean Text Search (Compare with R & G 27.1-3) Information Retrieval: History A research field traditionally separate from Databases Hans P. Luhn, IBM, 1959: Keyword in Context

More information

A Class of Submodular Functions for Document Summarization

A Class of Submodular Functions for Document Summarization A Class of Submodular Functions for Document Summarization Hui Lin, Jeff Bilmes University of Washington, Seattle Dept. of Electrical Engineering June 20, 2011 Lin and Bilmes Submodular Summarization June

More information

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #10: Query Processing Outline introduction selection projection join set & aggregate operations Prakash 2018 VT CS 4604 2

More information

International Journal of Advance Engineering and Research Development. Performance Enhancement of Search System

International Journal of Advance Engineering and Research Development. Performance Enhancement of Search System Scientific Journal of Impact Factor(SJIF): 3.134 International Journal of Advance Engineering and Research Development Volume 2,Issue 7, July -2015 Performance Enhancement of Search System Ms. Uma P Nalawade

More information

Track Join. Distributed Joins with Minimal Network Traffic. Orestis Polychroniou! Rajkumar Sen! Kenneth A. Ross

Track Join. Distributed Joins with Minimal Network Traffic. Orestis Polychroniou! Rajkumar Sen! Kenneth A. Ross Track Join Distributed Joins with Minimal Network Traffic Orestis Polychroniou Rajkumar Sen Kenneth A. Ross Local Joins Algorithms Hash Join Sort Merge Join Index Join Nested Loop Join Spilling to disk

More information

Record Placement Based on Data Skew Using Solid State Drives

Record Placement Based on Data Skew Using Solid State Drives BPOE-5 Record Placement Based on Data Skew Using Solid State Drives Jun Suzuki 1,2, Shivaram Venkataraman 2, Sameer Agarwal 2, Michael Franklin 2, and Ion Stoica 2 1 Green Platform Research Laboratories,

More information

Automated Configuration of MIP solvers

Automated Configuration of MIP solvers Automated Configuration of MIP solvers Frank Hutter, Holger Hoos, and Kevin Leyton-Brown Department of Computer Science University of British Columbia Vancouver, Canada {hutter,hoos,kevinlb}@cs.ubc.ca

More information

CMPT 354: Database System I. Lecture 7. Basics of Query Optimization

CMPT 354: Database System I. Lecture 7. Basics of Query Optimization CMPT 354: Database System I Lecture 7. Basics of Query Optimization 1 Why should you care? https://databricks.com/glossary/catalyst-optimizer https://sigmod.org/sigmod-awards/people/goetz-graefe-2017-sigmod-edgar-f-codd-innovations-award/

More information

Bridging the Processor/Memory Performance Gap in Database Applications

Bridging the Processor/Memory Performance Gap in Database Applications Bridging the Processor/Memory Performance Gap in Database Applications Anastassia Ailamaki Carnegie Mellon http://www.cs.cmu.edu/~natassa Memory Hierarchies PROCESSOR EXECUTION PIPELINE L1 I-CACHE L1 D-CACHE

More information

Chapter 9. Cardinality Estimation. How Many Rows Does a Query Yield? Architecture and Implementation of Database Systems Winter 2010/11

Chapter 9. Cardinality Estimation. How Many Rows Does a Query Yield? Architecture and Implementation of Database Systems Winter 2010/11 Chapter 9 How Many Rows Does a Query Yield? Architecture and Implementation of Database Systems Winter 2010/11 Wilhelm-Schickard-Institut für Informatik Universität Tübingen 9.1 Web Forms Applications

More information

Lecture #14 Optimizer Implementation (Part I)

Lecture #14 Optimizer Implementation (Part I) 15-721 ADVANCED DATABASE SYSTEMS Lecture #14 Optimizer Implementation (Part I) Andy Pavlo / Carnegie Mellon University / Spring 2016 @Andy_Pavlo // Carnegie Mellon University // Spring 2017 2 TODAY S AGENDA

More information

QUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION

QUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION E0 261 Jayant Haritsa Computer Science and Automation Indian Institute of Science JAN 2014 Slide 1 Database Engines Main Components Query Processing Transaction Processing Access Methods JAN 2014 Slide

More information

Workload Characterization and Optimization of TPC-H Queries on Apache Spark

Workload Characterization and Optimization of TPC-H Queries on Apache Spark Workload Characterization and Optimization of TPC-H Queries on Apache Spark Tatsuhiro Chiba and Tamiya Onodera IBM Research - Tokyo April. 17-19, 216 IEEE ISPASS 216 @ Uppsala, Sweden Overview IBM Research

More information

HANA Performance. Efficient Speed and Scale-out for Real-time BI

HANA Performance. Efficient Speed and Scale-out for Real-time BI HANA Performance Efficient Speed and Scale-out for Real-time BI 1 HANA Performance: Efficient Speed and Scale-out for Real-time BI Introduction SAP HANA enables organizations to optimize their business

More information

A Case for Merge Joins in Mediator Systems

A Case for Merge Joins in Mediator Systems A Case for Merge Joins in Mediator Systems Ramon Lawrence Kirk Hackert IDEA Lab, Department of Computer Science, University of Iowa Iowa City, IA, USA {ramon-lawrence, kirk-hackert}@uiowa.edu Abstract

More information

Venice: Reliable Virtual Data Center Embedding in Clouds

Venice: Reliable Virtual Data Center Embedding in Clouds Venice: Reliable Virtual Data Center Embedding in Clouds Qi Zhang, Mohamed Faten Zhani, Maissa Jabri and Raouf Boutaba University of Waterloo IEEE INFOCOM Toronto, Ontario, Canada April 29, 2014 1 Introduction

More information

A Fast Randomized Algorithm for Multi-Objective Query Optimization

A Fast Randomized Algorithm for Multi-Objective Query Optimization A Fast Randomized Algorithm for Multi-Objective Query Optimization Immanuel Trummer and Christoph Koch {firstname}.{lastname}@epfl.ch École Polytechnique Fédérale de Lausanne ABSTRACT Query plans are compared

More information

CSE 232A Graduate Database Systems

CSE 232A Graduate Database Systems CSE 232A Graduate Database Systems Arun Kumar Topic 4: Query Optimization Chapters 12 and 15 of Cow Book Slide ACKs: Jignesh Patel, Paris Koutris 1 Lifecycle of a Query Query Result Query Database Server

More information

Architecture-Conscious Database Systems

Architecture-Conscious Database Systems Architecture-Conscious Database Systems 2009 VLDB Summer School Shanghai Peter Boncz (CWI) Sources Thank You! l l l l Database Architectures for New Hardware VLDB 2004 tutorial, Anastassia Ailamaki Query

More information

Pathfinder/MonetDB: A High-Performance Relational Runtime for XQuery

Pathfinder/MonetDB: A High-Performance Relational Runtime for XQuery Introduction Problems & Solutions Join Recognition Experimental Results Introduction GK Spring Workshop Waldau: Pathfinder/MonetDB: A High-Performance Relational Runtime for XQuery Database & Information

More information

Optimizing Join Enumeration in Transformation- based Query Optimizers

Optimizing Join Enumeration in Transformation- based Query Optimizers Optimizing Join Enumeration in Transformation- based Query Optimizers ANIL SHANBHAG, S. SUDARSHAN IIT BOMBAY anils@mit.edu, sudarsha@cse.iitb.ac.in VLDB 2014 Query Optimization: Quick Background System

More information

HYRISE In-Memory Storage Engine

HYRISE In-Memory Storage Engine HYRISE In-Memory Storage Engine Martin Grund 1, Jens Krueger 1, Philippe Cudre-Mauroux 3, Samuel Madden 2 Alexander Zeier 1, Hasso Plattner 1 1 Hasso-Plattner-Institute, Germany 2 MIT CSAIL, USA 3 University

More information

Leveraging Transitive Relations for Crowdsourced Joins*

Leveraging Transitive Relations for Crowdsourced Joins* Leveraging Transitive Relations for Crowdsourced Joins* Jiannan Wang #, Guoliang Li #, Tim Kraska, Michael J. Franklin, Jianhua Feng # # Department of Computer Science, Tsinghua University, Brown University,

More information

Background: disk access vs. main memory access (1/2)

Background: disk access vs. main memory access (1/2) 4.4 B-trees Disk access vs. main memory access: background B-tree concept Node structure Structural properties Insertion operation Deletion operation Running time 66 Background: disk access vs. main memory

More information

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection

More information

How Achaeans Would Construct Columns in Troy. Alekh Jindal, Felix Martin Schuhknecht, Jens Dittrich, Karen Khachatryan, Alexander Bunte

How Achaeans Would Construct Columns in Troy. Alekh Jindal, Felix Martin Schuhknecht, Jens Dittrich, Karen Khachatryan, Alexander Bunte How Achaeans Would Construct Columns in Troy Alekh Jindal, Felix Martin Schuhknecht, Jens Dittrich, Karen Khachatryan, Alexander Bunte Number of Visas Received 1 0,75 0,5 0,25 0 Alekh Jens Health Level

More information

Submodular Optimization

Submodular Optimization Submodular Optimization Nathaniel Grammel N. Grammel Submodular Optimization 1 / 28 Submodularity Captures the notion of Diminishing Returns Definition Suppose U is a set. A set function f :2 U! R is submodular

More information

Relational Database Index Design and the Optimizers

Relational Database Index Design and the Optimizers Relational Database Index Design and the Optimizers DB2, Oracle, SQL Server, et al. Tapio Lahdenmäki Michael Leach A JOHN WILEY & SONS, INC., PUBLICATION Relational Database Index Design and the Optimizers

More information

Automatic Scaling Iterative Computations. Aug. 7 th, 2012

Automatic Scaling Iterative Computations. Aug. 7 th, 2012 Automatic Scaling Iterative Computations Guozhang Wang Cornell University Aug. 7 th, 2012 1 What are Non-Iterative Computations? Non-iterative computation flow Directed Acyclic Examples Batch style analytics

More information

OPEN INFORMATION EXTRACTION FROM THE WEB. Michele Banko, Michael J Cafarella, Stephen Soderland, Matt Broadhead and Oren Etzioni

OPEN INFORMATION EXTRACTION FROM THE WEB. Michele Banko, Michael J Cafarella, Stephen Soderland, Matt Broadhead and Oren Etzioni OPEN INFORMATION EXTRACTION FROM THE WEB Michele Banko, Michael J Cafarella, Stephen Soderland, Matt Broadhead and Oren Etzioni Call for a Shake Up in Search! Question Answering rather than indexed key

More information

JHPCSN: Volume 4, Number 1, 2012, pp. 1-7

JHPCSN: Volume 4, Number 1, 2012, pp. 1-7 JHPCSN: Volume 4, Number 1, 2012, pp. 1-7 QUERY OPTIMIZATION BY GENETIC ALGORITHM P. K. Butey 1, Shweta Meshram 2 & R. L. Sonolikar 3 1 Kamala Nehru Mahavidhyalay, Nagpur. 2 Prof. Priyadarshini Institute

More information

Principles of Data Management. Lecture #9 (Query Processing Overview)

Principles of Data Management. Lecture #9 (Query Processing Overview) Principles of Data Management Lecture #9 (Query Processing Overview) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v Midterm

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2015 Quiz I

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2015 Quiz I Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.830 Database Systems: Fall 2015 Quiz I There are 12 questions and 13 pages in this quiz booklet. To receive

More information

Record Placement Based on Data Skew Using Solid State Drives

Record Placement Based on Data Skew Using Solid State Drives Record Placement Based on Data Skew Using Solid State Drives Jun Suzuki 1, Shivaram Venkataraman 2, Sameer Agarwal 2, Michael Franklin 2, and Ion Stoica 2 1 Green Platform Research Laboratories, NEC j-suzuki@ax.jp.nec.com

More information

Introduction to Column Stores with MemSQL. Seminar Database Systems Final presentation, 11. January 2016 by Christian Bisig

Introduction to Column Stores with MemSQL. Seminar Database Systems Final presentation, 11. January 2016 by Christian Bisig Final presentation, 11. January 2016 by Christian Bisig Topics Scope and goals Approaching Column-Stores Introducing MemSQL Benchmark setup & execution Benchmark result & interpretation Conclusion Questions

More information

Shark. Hive on Spark. Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker

Shark. Hive on Spark. Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker Shark Hive on Spark Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker Agenda Intro to Spark Apache Hive Shark Shark s Improvements over Hive Demo Alpha

More information

BDCC: Exploiting Fine-Grained Persistent Memories for OLAP. Peter Boncz

BDCC: Exploiting Fine-Grained Persistent Memories for OLAP. Peter Boncz BDCC: Exploiting Fine-Grained Persistent Memories for OLAP Peter Boncz NVRAM System integration: NVMe: block devices on the PCIe bus NVDIMM: persistent RAM, byte-level access Low latency Lower than Flash,

More information

Overview. DW Performance Optimization. Aggregates. Aggregate Use Example

Overview. DW Performance Optimization. Aggregates. Aggregate Use Example Overview DW Performance Optimization Choosing aggregates Maintaining views Bitmapped indices Other optimization issues Original slides were written by Torben Bach Pedersen Aalborg University 07 - DWML

More information

D2.P2: A Query Planner Based on Quantitative and Structural Information. Francesco Scarcello Alfredo Mazzitelli Università della Calabria

D2.P2: A Query Planner Based on Quantitative and Structural Information. Francesco Scarcello Alfredo Mazzitelli Università della Calabria D2.P2: A Query Planner Based on Quantitative and Structural Information Francesco Scarcello Alfredo Mazzitelli Università della Calabria Query Planners Given a query and some information on the database,

More information

Review. Support for data retrieval at the physical level:

Review. Support for data retrieval at the physical level: Query Processing Review Support for data retrieval at the physical level: Indices: data structures to help with some query evaluation: SELECTION queries (ssn = 123) RANGE queries (100

More information

Histogram-Aware Sorting for Enhanced Word-Aligned Compress

Histogram-Aware Sorting for Enhanced Word-Aligned Compress Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes 1- University of New Brunswick, Saint John 2- Université du Québec at Montréal (UQAM) October 23, 2008 Bitmap indexes SELECT

More information

Parallel DBMS. Prof. Yanlei Diao. University of Massachusetts Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Parallel DBMS. Prof. Yanlei Diao. University of Massachusetts Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke Parallel DBMS Prof. Yanlei Diao University of Massachusetts Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke I. Parallel Databases 101 Rise of parallel databases: late 80 s Architecture: shared-nothing

More information

Performance Optimization for Informatica Data Services ( Hotfix 3)

Performance Optimization for Informatica Data Services ( Hotfix 3) Performance Optimization for Informatica Data Services (9.5.0-9.6.1 Hotfix 3) 1993-2015 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic,

More information

Smooth Scan: Statistics-Oblivious Access Paths. Renata Borovica-Gajic Stratos Idreos Anastasia Ailamaki Marcin Zukowski Campbell Fraser

Smooth Scan: Statistics-Oblivious Access Paths. Renata Borovica-Gajic Stratos Idreos Anastasia Ailamaki Marcin Zukowski Campbell Fraser Smooth Scan: Statistics-Oblivious Access Paths Renata Borovica-Gajic Stratos Idreos Anastasia Ailamaki Marcin Zukowski Campbell Fraser Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q16 Q18 Q19 Q21 Q22

More information

Overview of Data Exploration Techniques. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri

Overview of Data Exploration Techniques. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri Overview of Data Exploration Techniques Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri data exploration not always sure what we are looking for (until we find it) data has always been big volume

More information

Weaving Relations for Cache Performance

Weaving Relations for Cache Performance Weaving Relations for Cache Performance Anastassia Ailamaki Carnegie Mellon Computer Platforms in 198 Execution PROCESSOR 1 cycles/instruction Data and Instructions cycles

More information

FASTER UPPER BOUNDING OF INTERSECTION SIZES

FASTER UPPER BOUNDING OF INTERSECTION SIZES FASTER UPPER BOUNDING OF INTERSECTION SIZES IBM Research Tokyo - Japan Daisuke Takuma Hiroki Yanagisawa 213 IBM Outline Motivation of upper bounding Contributions Data structure Proposal Evaluation (introduce

More information

Approximation Schemes for Many-Objective Query Optimization

Approximation Schemes for Many-Objective Query Optimization Approximation Schemes for Many-Objective Query Optimization Immanuel Trummer and Christoph Koch École Polytechnique Fédérale de Lausanne {firstname}.{lastname}@epfl.ch ABSTRACT The goal of multi-objective

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

Distributed Query Optimization: Use of mobile Agents Kodanda Kumar Melpadi

Distributed Query Optimization: Use of mobile Agents Kodanda Kumar Melpadi Distributed Query Optimization: Use of mobile Agents Kodanda Kumar Melpadi M.Tech (IT) GGS Indraprastha University Delhi mk_kumar_76@yahoo.com Abstract DDBS adds to the conventional centralized DBS some

More information

Just In Time Compilation in PostgreSQL 11 and onward

Just In Time Compilation in PostgreSQL 11 and onward Just In Time Compilation in PostgreSQL 11 and onward Andres Freund PostgreSQL Developer & Committer Email: andres@anarazel.de Email: andres.freund@enterprisedb.com Twitter: @AndresFreundTec anarazel.de/talks/2018-09-07-pgopen-jit/jit.pdf

More information

Improving IBM Red Brick Warehouse Query Performance

Improving IBM Red Brick Warehouse Query Performance Improving IBM Red Brick Warehouse Query Performance Aman Sinha, Richard Taylor, Mandar Pimpale, David Wilhite, Cindy Fung European Red Brick Users Group Conference 2003 Milano, Italy September 9 September

More information

ORACLE TRAINING CURRICULUM. Relational Databases and Relational Database Management Systems

ORACLE TRAINING CURRICULUM. Relational Databases and Relational Database Management Systems ORACLE TRAINING CURRICULUM Relational Database Fundamentals Overview of Relational Database Concepts Relational Databases and Relational Database Management Systems Normalization Oracle Introduction to

More information

Asking the Right Questions in Crowd Data Sourcing

Asking the Right Questions in Crowd Data Sourcing MoDaS Mob Data Sourcing Asking the Right Questions in Crowd Data Sourcing Tova Milo Tel Aviv University Outline Introduction to crowd (data) sourcing Databases and crowds Declarative is good How to best

More information

COLUMN-STORES VS. ROW-STORES: HOW DIFFERENT ARE THEY REALLY? DANIEL J. ABADI (YALE) SAMUEL R. MADDEN (MIT) NABIL HACHEM (AVANTGARDE)

COLUMN-STORES VS. ROW-STORES: HOW DIFFERENT ARE THEY REALLY? DANIEL J. ABADI (YALE) SAMUEL R. MADDEN (MIT) NABIL HACHEM (AVANTGARDE) COLUMN-STORES VS. ROW-STORES: HOW DIFFERENT ARE THEY REALLY? DANIEL J. ABADI (YALE) SAMUEL R. MADDEN (MIT) NABIL HACHEM (AVANTGARDE) PRESENTATION BY PRANAV GOEL Introduction On analytical workloads, Column

More information

Keywords : Skyline Join, Skyline Objects, Non Reductive, Block Nested Loop Join, Cardinality, Dimensionality

Keywords : Skyline Join, Skyline Objects, Non Reductive, Block Nested Loop Join, Cardinality, Dimensionality Volume 4, Issue 3, March 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Skyline Evaluation

More information

PS2 out today. Lab 2 out today. Lab 1 due today - how was it?

PS2 out today. Lab 2 out today. Lab 1 due today - how was it? 6.830 Lecture 7 9/25/2017 PS2 out today. Lab 2 out today. Lab 1 due today - how was it? Project Teams Due Wednesday Those of you who don't have groups -- send us email, or hand in a sheet with just your

More information

Graph-Based Synopses for Relational Data. Alkis Polyzotis (UC Santa Cruz)

Graph-Based Synopses for Relational Data. Alkis Polyzotis (UC Santa Cruz) Graph-Based Synopses for Relational Data Alkis Polyzotis (UC Santa Cruz) Data Synopses Data Query Result Data Synopsis Query Approximate Result Problem: exact answer may be too costly to compute Examples:

More information

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory

More information

Evaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi

Evaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi Evaluation of Relational Operations: Other Techniques Chapter 14 Sayyed Nezhadi Schema for Examples Sailors (sid: integer, sname: string, rating: integer, age: real) Reserves (sid: integer, bid: integer,

More information

Multi-threaded Queries. Intra-Query Parallelism in LLVM

Multi-threaded Queries. Intra-Query Parallelism in LLVM Multi-threaded Queries Intra-Query Parallelism in LLVM Multithreaded Queries Intra-Query Parallelism in LLVM Yang Liu Tianqi Wu Hao Li Interpreted vs Compiled (LLVM) Interpreted vs Compiled (LLVM) Interpreted

More information

Primavera Compression Server 5.0 Service Pack 1 Concept and Performance Results

Primavera Compression Server 5.0 Service Pack 1 Concept and Performance Results - 1 - Primavera Compression Server 5.0 Service Pack 1 Concept and Performance Results 1. Business Problem The current Project Management application is a fat client. By fat client we mean that most of

More information

Columnstore and B+ tree. Are Hybrid Physical. Designs Important?

Columnstore and B+ tree. Are Hybrid Physical. Designs Important? Columnstore and B+ tree Are Hybrid Physical Designs Important? 1 B+ tree 2 C O L B+ tree 3 B+ tree & Columnstore on same table = Hybrid design 4? C O L C O L B+ tree B+ tree ? C O L C O L B+ tree B+ tree

More information

Heckaton. SQL Server's Memory Optimized OLTP Engine

Heckaton. SQL Server's Memory Optimized OLTP Engine Heckaton SQL Server's Memory Optimized OLTP Engine Agenda Introduction to Hekaton Design Consideration High Level Architecture Storage and Indexing Query Processing Transaction Management Transaction Durability

More information

SQL Queries. for. Mere Mortals. Third Edition. A Hands-On Guide to Data Manipulation in SQL. John L. Viescas Michael J. Hernandez

SQL Queries. for. Mere Mortals. Third Edition. A Hands-On Guide to Data Manipulation in SQL. John L. Viescas Michael J. Hernandez SQL Queries for Mere Mortals Third Edition A Hands-On Guide to Data Manipulation in SQL John L. Viescas Michael J. Hernandez r A TT TAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco

More information

The Heuristic (Dark) Side of MIP Solvers. Asja Derviskadic, EPFL Vit Prochazka, NHH Christoph Schaefer, EPFL

The Heuristic (Dark) Side of MIP Solvers. Asja Derviskadic, EPFL Vit Prochazka, NHH Christoph Schaefer, EPFL The Heuristic (Dark) Side of MIP Solvers Asja Derviskadic, EPFL Vit Prochazka, NHH Christoph Schaefer, EPFL 1 Table of content [Lodi], The Heuristic (Dark) Side of MIP Solvers, Hybrid Metaheuristics, 273-284,

More information

Impala Intro. MingLi xunzhang

Impala Intro. MingLi xunzhang Impala Intro MingLi xunzhang Overview MPP SQL Query Engine for Hadoop Environment Designed for great performance BI Connected(ODBC/JDBC, Kerberos, LDAP, ANSI SQL) Hadoop Components HDFS, HBase, Metastore,

More information

CSE 530A. Query Planning. Washington University Fall 2013

CSE 530A. Query Planning. Washington University Fall 2013 CSE 530A Query Planning Washington University Fall 2013 Scanning When finding data in a relation, we've seen two types of scans Table scan Index scan There is a third common way Bitmap scan Bitmap Scans

More information

CS 310: Memory Hierarchy and B-Trees

CS 310: Memory Hierarchy and B-Trees CS 310: Memory Hierarchy and B-Trees Chris Kauffman Week 14-1 Matrix Sum Given an M by N matrix X, sum its elements M rows, N columns Sum R given X, M, N sum = 0 for i=0 to M-1{ for j=0 to N-1 { sum +=

More information

An Initial Study of Overheads of Eddies

An Initial Study of Overheads of Eddies An Initial Study of Overheads of Eddies Amol Deshpande University of California Berkeley, CA USA amol@cs.berkeley.edu Abstract An eddy [2] is a highly adaptive query processing operator that continuously

More information

Column-Stores vs. Row-Stores: How Different Are They Really?

Column-Stores vs. Row-Stores: How Different Are They Really? Column-Stores vs. Row-Stores: How Different Are They Really? Daniel Abadi, Samuel Madden, Nabil Hachem Presented by Guozhang Wang November 18 th, 2008 Several slides are from Daniel Abadi and Michael Stonebraker

More information

Restricted Use Case Modeling Approach

Restricted Use Case Modeling Approach RUCM TAO YUE tao@simula.no Simula Research Laboratory Restricted Use Case Modeling Approach User Manual April 2010 Preface Use case modeling is commonly applied to document requirements. Restricted Use

More information

DATABASE MANAGEMENT SYSTEMS

DATABASE MANAGEMENT SYSTEMS www..com Code No: N0321/R07 Set No. 1 1. a) What is a Superkey? With an example, describe the difference between a candidate key and the primary key for a given relation? b) With an example, briefly describe

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2008 Quiz II

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Database Systems: Fall 2008 Quiz II Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.830 Database Systems: Fall 2008 Quiz II There are 14 questions and 11 pages in this quiz booklet. To receive

More information

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text.

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text. Lifecycle of a Query CSE 190D Database System Implementation Arun Kumar Query Query Result

More information

Robustness in Automatic Physical Database Design

Robustness in Automatic Physical Database Design Robustness in Automatic Physical Database Design Kareem El Gebaly Ericsson kareem.gebaly@ericsson.com Ashraf Aboulnaga University of Waterloo ashraf@cs.uwaterloo.ca ABSTRACT Automatic physical database

More information

Optimal Sequential Multi-Way Number Partitioning

Optimal Sequential Multi-Way Number Partitioning Optimal Sequential Multi-Way Number Partitioning Richard E. Korf, Ethan L. Schreiber, and Michael D. Moffitt Computer Science Department University of California, Los Angeles Los Angeles, CA 90095 IBM

More information

La Fragmentation Horizontale Revisitée: Prise en Compte de l Interaction de Requêtes

La Fragmentation Horizontale Revisitée: Prise en Compte de l Interaction de Requêtes National Engineering School of Mechanic & Aerotechnics 1, avenue Clément Ader - BP 40109-86961 Futuroscope cedex France La Fragmentation Horizontale Revisitée: Prise en Compte de l Interaction de Requêtes

More information

Web User Session Reconstruction Using Integer Programming

Web User Session Reconstruction Using Integer Programming Calhoun: The NPS Institutional Archive DSpace Repository Faculty and Researchers Faculty and Researchers Collection 2008 Web User Session Reconstruction Using Integer Programming Dell, Robert F.; Román,

More information