Query optimization. Query Optimization. Query Optimization. Cost estimation. Another reason why plain IOs not enough: Parallelism

Size: px
Start display at page:

Download "Query optimization. Query Optimization. Query Optimization. Cost estimation. Another reason why plain IOs not enough: Parallelism"

Transcription

1 Query optimization Query Optimization It is safer to accept any chance that offers itself, and extemporize a procedure to fit it, than to get a good plan matured, and wait for a chance of using it. Thomas Hardy (187) in Far from the Madding Crowd Generate query plans Estimate size of intermediate results Estimate cost of plan ($,time, ) Q P1 P P3 Pn C1 C C3 Cn pick minimum C55 Query Optimization 1 C55 Query Optimization Query Optimization Query: 1 3 3@ite3 esult@ite1 Possible plans Plan 1: end and 3 to ite1. Perform query at ite 1 Plan : end 3 to ite; Evaluate I = 3; send I to ite1; Evaluate result = I 1 Many other plans (including types of joins, number of sites, semijoins, etc) C55 Query Optimization 3 Cost estimation As in centralized system: estimate result sizes But: # IOs may not be best metric e.g., Transmission time may dominate work work answer at site at site T1 T >>> TIME > or $ C55 Query Optimization Another reason why plain IOs not enough: Parallelism Plan A Plan B site 1 50 IOs 100 IOs site 70 IOs site 3 50 IOs Cost metrics IOs, Bytes transmitted, $, Can add together esponse time metric cannot add need scheduling and dependency info skew important Task 1 Task Task 3 C55 Query Optimization 5 C55 Query Optimization 6 1

2 Take into account: (in parallel/distributed system) tart up costs (for parallel operation) Data distribution costs/time Contention memory, disk, network, Assembling result ite 1 ite ite 3 ite Example: esponse time tartup Distri- earching Final bution +send results proc. C55 Query Optimization 7 C55 Query Optimization 8 Query Optimizer Cost model Plan space Deep tree vs bushy tree Enumeration/earch strategy Exhaustive (with pruning) Hill climbing (greedy) Query separation DD-1 (semi-join based) (1) Exhaustive - consider all query plans with a set of techniques - prune some plans (heuristics) C55 Query Optimization 9 C55 Query Optimization 10 Example: A B T > > T In generating plans, keep goal in mind: T T T T T 1 1 ( ) T (T ) ship semi ship T semi to join to join Heuristics: 1 Prune because cross-product not necessary Prune because larger relation first C55 Query Optimization 11 e.g.: Goal is parallelism in system with fast net, consider partitioning relation(s) first e.g.: Goal is reduction of net traffic, consider semi-joins C55 Query Optimization 1

3 () Hill climbing 1 Worse plans Better plans x Initial plan Hill Climbing Algorithm tep 1: Do initial processing tep : elect initial feasible solution (P 0 ).1 Determine the candidate result sites - sites where a relation referenced in the query exist. Compute the cost of transferring all the other referenced relations to each candidate site.3 P 0 = candidate site with minimum cost tep 3: Determine candidate splits of P 0 into P 1 = {P 1a, P 1b } 3.1 P 1a consists of sending one of the relations to the other relation's site 3. P 1b consists of sending the join of the relations to the final result site C55 Query Optimization 13 C55 Query Optimization 1 Hill Climbing Algorithm tep : eplace P 0 with P 1 that gives cost(p 1a ) + cost(local join) + cost(p 1b ) < cost(p 0 ) tep 5: ecursively apply steps 3 on P 1 until no such plans can be found tep 6: Check for redundant transmissions in the final plan and eliminate them. Example σ(u) V el ite ize tuple size = U 3 90 V 0 Goal: minimize data transmission C55 Query Optimization 15 C55 Query Optimization 16 tep 1: T V A B C T V el ite ize tuple size = T 3 30 V 0 T = σ(u) electivity is 1/3 tep : Initial plan send relations to one site What site do we send all relations to? To site 1: cost=0+30+0=90 To site : cost= =80 To site 3: cost=10+0+0=70 To site : cost= =60 C55 Query Optimization 17 C55 Query Optimization 18 3

4 P0: (1 ) ( ) T (3 ) Compute T V at site teps 3 & Consider sending each relation to neighbor: e.g.: 1 1 C55 Query Optimization 19 C55 Query Optimization 0 Assume: ize = 0 T = 5 T V = cost = 30 cost = 30 cost = 0 No savings Worse! 1 T T T cost = 50 cost = 35 cost = 5 A Win! A Bigger Win 5 0 T 3 C55 Query Optimization 1 C55 Query Optimization P1: P1a: ( 3) α = T P1b: (1 ) α (3 ) compute answer at site tep 5: epeat teps 3 & - Treat α = T as relation α 1 3 vs α 1 α α C55 Query Optimization 3 C55 Query Optimization

5 Hill climbing may miss best plan! Example: best plan could be: V PB: T (3 ) 30 β=t V β β T β ( ) β β = β β ( 1) 1 β = β β (1 ) Compute answer 1 33 = total [optional] T Costs could be low because is very selective C55 Query Optimization 5 (3) Query separation - separate query into or more steps - optimize each step independently C55 Query Optimization 6 Example: simple queries 1. Compute = ΠA[σc ] = ΠA[σc3 ]. Compute J = 3. Compute σc σc1 Ans = σc1{[j σc ] [J σc3 ]} A σc3 In other words: (a) Compute A values in answer (steps 1,) (b) Get tuples from sites with matching A values and compute answer (step 3) C55 Query Optimization 7 C55 Query Optimization 8 imple query - elations have a single attribute - Output has a single attribute e.g., J Idea Decompose query into Local processing imple query (or queries) Final processing Optimize simple query Philosophy Hard part is distributed join Do this part with only keys; get rest of data later impler to optimize simple queries C55 Query Optimization 9 C55 Query Optimization 30 5

6 DD-1 Algorithm tep 1: In the execution strategy (call it E), include all the local processing tep : eflect the effects of local processing on the database profile tep 3: Construct a set of beneficial semijoin operations (B) as follows : B = Ø For each semijoin J i B B J i if cost(j i ) < benefit(j i ) C55 Query Optimization 31 Consider the following query ELECT 3.C FOM 1,, 3 WHEE 1.A =.A AND.B = 3.B which has the following query graph and statistics: relation card tuple size relation size ite 1 ite ite A B attribute F size(π attribute ) 1.A A F: selectivity factor.b esult = F* C55 Query Optimization 3.B Beneficial semijoins: Assume: T MG = 0 J 1 = 1, whose benefit is 100 = (1 0.3)*3000 and cost is 36 J = 3, whose benefit is 1800 = (1 0.)*3000 and cost is 80 Nonbeneficial semijoins: J 3 = 1, whose benefit is 300 = (1 0.8)*1500 and cost is 30 Cost = transfer semijoin attribute = T MG + T T *(size) Benefit = cost of transferring irrelevant tuples (avoided by the semijoin) = (1-F)*size*T T J = 3, whose benefit is 0 and cost is 00 DD-1 Algorithm Iterative Process tep : emove the most beneficial J i from B and append it to E tep 5: Modify the database profile accordingly tep 6: Modify B appropriately compute new benefit/cost values check if any new semijoin need to be included in B tep 7: If B Ø, go back to tep. C55 Query Optimization 33 C55 Query Optimization 3 Iteration 1: emove J 1 from B and add it to E. Update statistics = size() = 900 (= ) size (.A) = 30*0.3 = 96 F (.A) = ~ = ~0. Iteration : Two beneficial semijoins: relation card tuple size relation size F attribute 1.A.A.B 3.B size(π attribute ) J = 3, whose benefit is 50 = (1 0.) 900 and cost is 80 J 3 = 1 ', whose benefit is 300=(1 0.8) 1500 and cost is 96 Add J to E NOT 110=(1-0.)*1500!! Update statistics size( ) = 360 (= 900*0.) Note: selectivity of may also change, but not important in this example. C55 Query Optimization 35 Iteration 3: No new beneficial semijoins. emove remaining beneficial semijoin J 3 from B and add it to E. Update statistics size(1) = 100 (= ) F (1.A) = ~ = 0. C55 Query Optimization 36 6

7 Assembly ite election tep 8: Find the site where the largest amount of data resides and select it as the assembly site Example: DD-1 Algorithm Amount of data stored at sites: ite 1: 100 ite : 360 ite 3: 000 Therefore, ite 3 will be chosen as the assembly site. DD-1 Algorithm Postprocessing tep 9: For each i at the assembly site, find the semijoins of the type i j where the total cost of E without this semijoin is smaller than the cost with it and remove the semijoin from E. tep 10: Permute the order of semijoins if doing so would improve the total cost of E. Example: Final strategy: end ( 1) 3 to ite 3 end (1 ) to ite 3 C55 Query Optimization 37 C55 Query Optimization 38 ummary: Query Optimization Cost/result estimation Three key components in optimizer Cost model, search space, enumeration algorithm In practice, avoid bad plans (rather than find optimal) trategies Exhaustive Hill climbing eparation DD-1 (semi-join based) C55 Query Optimization 39 7

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing C 37 Parallel and Distributed Data Processing pring 2016 Notes : Query Optimization Query Optimization Cost estimation trategies for exploring plans Q min C 37 Notes 2 Based on estimating result sizes

More information

σ (R.B = 1 v R.C > 3) (S.D = 2) Conjunctive normal form Topics for the Day Distributed Databases Query Processing Steps Decomposition

σ (R.B = 1 v R.C > 3) (S.D = 2) Conjunctive normal form Topics for the Day Distributed Databases Query Processing Steps Decomposition Topics for the Day Distributed Databases Query processing in distributed databases Localization Distributed query operators Cost-based optimization C37 Lecture 1 May 30, 2001 1 2 Query Processing teps

More information

Optimization of Distributed Queries

Optimization of Distributed Queries Query Optimization Optimization of Distributed Queries Issues in Query Optimization Joins and Semijoins Query Optimization Algorithms Centralized query optimization: Minimize the cots function Find (the

More information

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Query Processing Methodology Distributed Query Optimization

More information

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Query Processing Methodology Distributed Query Optimization

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing C 347 Parallel and Distributed Data Processing pring 2016 Query Processing Decomposition Localization Optimization Notes 3: Query Processing C 347 Notes 3 2 Decomposition ame as in centralized system 1.

More information

Magda Balazinska - CSE 444, Spring

Magda Balazinska - CSE 444, Spring Query Optimization Algorithm CE 444: Database Internals Lectures 11-12 Query Optimization (part 2) Enumerate alternative plans (logical & physical) Compute estimated cost of each plan Compute number of

More information

Query Optimization in Relational Database Systems

Query Optimization in Relational Database Systems Query Optimization in Relational Database Systems It is safer to accept any chance that offers itself, and extemporize a procedure to fit it, than to get a good plan matured, and wait for a chance of using

More information

Chapter 4 Distributed Query Processing

Chapter 4 Distributed Query Processing Chapter 4 Distributed Query Processing Table of Contents Overview of Query Processing Query Decomposition and Data Localization Optimization of Distributed Queries Chapter4-1 1 1. Overview of Query Processing

More information

SDD-1 Algorithm Implementation

SDD-1 Algorithm Implementation National Institute of Technology Karnataka, Surathkal Project Report on SDD-1 Algorithm Implementation Under the Guidance of: Mr. Dr. Anantha Narayana (Professor) Submitted by: Mr. Vasanth Raja Chittampally

More information

Hash-Based Indexing 165

Hash-Based Indexing 165 Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19

More information

DISTRIBUTED QUERY OPTIMIZATION USING HILL CLIMBING ALGORITHM FOR COMPLEX CHURCH DATABASES

DISTRIBUTED QUERY OPTIMIZATION USING HILL CLIMBING ALGORITHM FOR COMPLEX CHURCH DATABASES DISTRIBUTED QUERY OPTIMIZATION USING HILL CLIMBING ALGORITHM FOR COMPLEX CHURCH DATABASES Esiefarienrhe Michael Bukohwo 1, Philemon Uten Emmoh 2 and Choji Davou Nyab 3 1,2 Department of Mathematics/Statistics/ComputerScience,University

More information

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 1

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 1 Optimization of Join Queries on Distributed Relations Using Semi-Joins Suresh Sapa 1, K. P. Supreethi 2 1, 2 JNTUCEH, Hyderabad, India Abstract The processing and optimizing a join query in distributed

More information

Query Processing and Query Optimization. Prof Monika Shah

Query Processing and Query Optimization. Prof Monika Shah Query Processing and Query Optimization Query Processing SQL Query Is in Library Cache? System catalog (Dict / Dict cache) Scan and verify relations Parse into parse tree (relational Calculus) View definitions

More information

CS54200: Distributed. Introduction

CS54200: Distributed. Introduction CS54200: Distributed Database Systems Query Processing 9 March 2009 Prof. Chris Clifton Query Processing Introduction Converting user commands from the query language (SQL) to low level data manipulation

More information

PROBLEM SOLVING AND SEARCH IN ARTIFICIAL INTELLIGENCE

PROBLEM SOLVING AND SEARCH IN ARTIFICIAL INTELLIGENCE Artificial Intelligence, Computational Logic PROBLEM SOLVING AND SEARCH IN ARTIFICIAL INTELLIGENCE Lecture 2 Uninformed Search vs. Informed Search Sarah Gaggl Dresden, 28th April 2015 Agenda 1 Introduction

More information

Advanced Databases: Parallel Databases A.Poulovassilis

Advanced Databases: Parallel Databases A.Poulovassilis 1 Advanced Databases: Parallel Databases A.Poulovassilis 1 Parallel Database Architectures Parallel database systems use parallel processing techniques to achieve faster DBMS performance and handle larger

More information

Problem solving and search

Problem solving and search Problem solving and search hapter 3 hapter 3 1 Outline Problem-solving agents Problem types Problem formulation Example problems asic search algorithms hapter 3 3 Example: omania On holiday in omania;

More information

Problem Solving and Search. Chapter 3

Problem Solving and Search. Chapter 3 Problem olving and earch hapter 3 Outline Problem-solving agents Problem formulation Example problems asic search algorithms In the simplest case, an agent will: formulate a goal and a problem; Problem-olving

More information

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology Mobile and Heterogeneous databases Distributed Database System Query Processing A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you

More information

ITCS 6150 Intelligent Systems. Lecture 6 Informed Searches

ITCS 6150 Intelligent Systems. Lecture 6 Informed Searches ITCS 6150 Intelligent Systems Lecture 6 Informed Searches Compare two heuristics Compare these two heuristics h 2 is always better than h 1 for any node, n, h 2 (n) >= h 1 (n) h 2 dominates h 1 Recall

More information

Problem solving and search

Problem solving and search Problem solving and search hapter 3 hapter 3 1 Outline Problem-solving agents Problem types Problem formulation Example problems asic search algorithms hapter 3 2 estricted form of general agent: Problem-solving

More information

Track Join. Distributed Joins with Minimal Network Traffic. Orestis Polychroniou! Rajkumar Sen! Kenneth A. Ross

Track Join. Distributed Joins with Minimal Network Traffic. Orestis Polychroniou! Rajkumar Sen! Kenneth A. Ross Track Join Distributed Joins with Minimal Network Traffic Orestis Polychroniou Rajkumar Sen Kenneth A. Ross Local Joins Algorithms Hash Join Sort Merge Join Index Join Nested Loop Join Spilling to disk

More information

Reminders. Problem solving and search. Problem-solving agents. Outline. Assignment 0 due 5pm today

Reminders. Problem solving and search. Problem-solving agents. Outline. Assignment 0 due 5pm today ssignment 0 due 5pm today eminders Problem solving and search ssignment 1 posted, due 2/9 ection 105 will move to 9-10am starting next week hapter 3 hapter 3 1 hapter 3 2 Outline Problem-solving agents

More information

Background material for Chapter 3 taken from CS 245

Background material for Chapter 3 taken from CS 245 Background material for Chapter 3 taken from C 245 contributed by Hector Garcia-Molina selected/cut by tefan Böttcher Query Processing Q Query Plan Focus: elational ystem Others? Background C 245 material

More information

Outline. Problem solving and search. Problem-solving agents. Example: Romania. Example: Romania. Problem types. Problem-solving agents.

Outline. Problem solving and search. Problem-solving agents. Example: Romania. Example: Romania. Problem types. Problem-solving agents. Outline Problem-solving agents Problem solving and search hapter 3 Problem types Problem formulation Example problems asic search algorithms hapter 3 1 hapter 3 3 estricted form of general agent: Problem-solving

More information

Relational Algebra Equivalencies. Database Systems: The Complete Book Ch

Relational Algebra Equivalencies. Database Systems: The Complete Book Ch elational Algebra Equivalencies Database ystems: he Complete Book Ch. 16.2-16.3 Implementing: Joins olution 1 (Nested-Loop) For Each (a in A) { For Each (b in B) { emit (a, b); }} A B 2 Implementing: Joins

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part V Lecture 13, March 10, 2014 Mohammad Hammoud Today Welcome Back from Spring Break! Today Last Session: DBMS Internals- Part IV Tree-based (i.e., B+

More information

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact: Query Evaluation Techniques for large DB Part 1 Fact: While data base management systems are standard tools in business data processing they are slowly being introduced to all the other emerging data base

More information

Parallel Query Optimisation

Parallel Query Optimisation Parallel Query Optimisation Contents Objectives of parallel query optimisation Parallel query optimisation Two-Phase optimisation One-Phase optimisation Inter-operator parallelism oriented optimisation

More information

Preprocessing and Feature Selection DWML, /16

Preprocessing and Feature Selection DWML, /16 Preprocessing and Feature Selection DWML, 2007 1/16 When features don t help Data generated by process described by Bayesian network: Class Class A 1 Class 0 1 0.5 0.5 0.4 0.6 0.5 0.5 A 1 A 3 A 3 Class

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs

More information

Relational Database Systems 2 8. Join Order Optimization

Relational Database Systems 2 8. Join Order Optimization Relational Database Systems 2 8. Join Order Optimization Wolf-Tilo Balke Jan-Christoph Kalo Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 8 Join Order

More information

Query Processing. high level user query. low level data manipulation. query processor. commands

Query Processing. high level user query. low level data manipulation. query processor. commands Query Processing high level user query query processor low level data manipulation commands 1 Selecting Alternatives SELECT ENAME FROM EMP,ASG WHERE EMP.ENO = ASG.ENO AND DUR > 37 Strategy A ΠENAME(σDUR>37

More information

CompSci 516 Data Intensive Computing Systems. Lecture 11. Query Optimization. Instructor: Sudeepa Roy

CompSci 516 Data Intensive Computing Systems. Lecture 11. Query Optimization. Instructor: Sudeepa Roy CompSci 516 Data Intensive Computing Systems Lecture 11 Query Optimization Instructor: Sudeepa Roy Duke CS, Fall 2017 CompSci 516: Database Systems 1 Announcements HW2 has been posted on sakai Due on Oct

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part V Lecture 15, March 15, 2015 Mohammad Hammoud Today Last Session: DBMS Internals- Part IV Tree-based (i.e., B+ Tree) and Hash-based (i.e., Extendible

More information

CSE 232A Graduate Database Systems

CSE 232A Graduate Database Systems CSE 232A Graduate Database Systems Arun Kumar Topic 4: Query Optimization Chapters 12 and 15 of Cow Book Slide ACKs: Jignesh Patel, Paris Koutris 1 Lifecycle of a Query Query Result Query Database Server

More information

Problem solving by search

Problem solving by search Problem solving by search based on tuart ussel s slides (http://aima.cs.berkeley.edu) February 27, 2017, E533KUI - Problem solving by search 1 Outline Problem-solving agents Problem types Problem formulation

More information

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text.

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text. Lifecycle of a Query CSE 190D Database System Implementation Arun Kumar Query Query Result

More information

Hardware-Software Codesign

Hardware-Software Codesign Hardware-Software Codesign 4. System Partitioning Lothar Thiele 4-1 System Design specification system synthesis estimation SW-compilation intellectual prop. code instruction set HW-synthesis intellectual

More information

The Enhancement of Semijoin Strategies in Distributed Query Optimization

The Enhancement of Semijoin Strategies in Distributed Query Optimization The Enhancement of Semijoin Strategies in Distributed Query Optimization F. Najjar and Y. Slimani Dept. Informatique - Facult6 des Sciences de Tunis Campus Universitaire - 1060 Tunis, Tunisie yahya, slimani@f

More information

Integration of Transactional Systems

Integration of Transactional Systems Integration of Transactional Systems Distributed Query Processing Robert Wrembel Poznań University of Technology Institute of Computing Science Robert.Wrembel@cs.put.poznan.pl www.cs.put.poznan.pl/rwrembel

More information

Ad hoc and Sensor Networks Topology control

Ad hoc and Sensor Networks Topology control Ad hoc and Sensor Networks Topology control Goals of this chapter Networks can be too dense too many nodes in close (radio) vicinity This chapter looks at methods to deal with such networks by Reducing/controlling

More information

Feature Subset Selection using Clusters & Informed Search. Team 3

Feature Subset Selection using Clusters & Informed Search. Team 3 Feature Subset Selection using Clusters & Informed Search Team 3 THE PROBLEM [This text box to be deleted before presentation Here I will be discussing exactly what the prob Is (classification based on

More information

Problem-solving agents. Problem solving and search. Example: Romania. Reminders. Example: Romania. Outline. Chapter 3

Problem-solving agents. Problem solving and search. Example: Romania. Reminders. Example: Romania. Outline. Chapter 3 Problem-solving agents Problem solving and search hapter 3 function imple-problem-olving-gent( percept) returns an action static: seq, an action sequence, initially empty state, some description of the

More information

Problem solving and search

Problem solving and search Problem solving and search hapter 3 hapter 3 1 eminders ssignment 0 due midnight Thursday 9/8 ssignment 1 posted, due 9/20 (online or in box in 283) hapter 3 2 Outline Problem-solving agents Problem types

More information

Uninformed Search Methods. Informed Search Methods. Midterm Exam 3/13/18. Thursday, March 15, 7:30 9:30 p.m. room 125 Ag Hall

Uninformed Search Methods. Informed Search Methods. Midterm Exam 3/13/18. Thursday, March 15, 7:30 9:30 p.m. room 125 Ag Hall Midterm Exam Thursday, March 15, 7:30 9:30 p.m. room 125 Ag Hall Covers topics through Decision Trees and Random Forests (does not include constraint satisfaction) Closed book 8.5 x 11 sheet with notes

More information

Goals for Today. CS 133: Databases. Final Exam: Logistics. Why Use a DBMS? Brief overview of course. Course evaluations

Goals for Today. CS 133: Databases. Final Exam: Logistics. Why Use a DBMS? Brief overview of course. Course evaluations Goals for Today Brief overview of course CS 133: Databases Course evaluations Fall 2018 Lec 27 12/13 Course and Final Review Prof. Beth Trushkowsky More details about the Final Exam Practice exercises

More information

Lecture 20: Query Optimization (2)

Lecture 20: Query Optimization (2) Lecture 20: Query Optimization (2) Wednesday, May 19, 2010 1 Outline Search space Algorithms for enumerating query plans Estimating the cost of a query plan 2 Key Decisions Logical plan What logical plans

More information

Parser: SQL parse tree

Parser: SQL parse tree Jinze Liu Parser: SQL parse tree Good old lex & yacc Detect and reject syntax errors Validator: parse tree logical plan Detect and reject semantic errors Nonexistent tables/views/columns? Insufficient

More information

ABC basics (compilation from different articles)

ABC basics (compilation from different articles) 1. AIG construction 2. AIG optimization 3. Technology mapping ABC basics (compilation from different articles) 1. BACKGROUND An And-Inverter Graph (AIG) is a directed acyclic graph (DAG), in which a node

More information

A Fast Randomized Algorithm for Multi-Objective Query Optimization

A Fast Randomized Algorithm for Multi-Objective Query Optimization A Fast Randomized Algorithm for Multi-Objective Query Optimization Immanuel Trummer and Christoph Koch {firstname}.{lastname}@epfl.ch École Polytechnique Fédérale de Lausanne ABSTRACT Query plans are compared

More information

Datalog Evaluation. Linh Anh Nguyen. Institute of Informatics University of Warsaw

Datalog Evaluation. Linh Anh Nguyen. Institute of Informatics University of Warsaw Datalog Evaluation Linh Anh Nguyen Institute of Informatics University of Warsaw Outline Simple Evaluation Methods Query-Subquery Recursive Magic-Set Technique Query-Subquery Nets [2/64] Linh Anh Nguyen

More information

Ad hoc and Sensor Networks Chapter 10: Topology control

Ad hoc and Sensor Networks Chapter 10: Topology control Ad hoc and Sensor Networks Chapter 10: Topology control Holger Karl Computer Networks Group Universität Paderborn Goals of this chapter Networks can be too dense too many nodes in close (radio) vicinity

More information

Assignment No: Create a College database and apply different queries on it. 2. Implement GUI for SQL queries and display result of the query

Assignment No: Create a College database and apply different queries on it. 2. Implement GUI for SQL queries and display result of the query Assignment No: 1 GUI Implementation for SQL queries Learning Outcomes: At the end of this assignment students will be able to To create a simple table Write queries for the manipulation of the table Design

More information

Chapter 11: Query Optimization

Chapter 11: Query Optimization Chapter 11: Query Optimization Chapter 11: Query Optimization Introduction Transformation of Relational Expressions Statistical Information for Cost Estimation Cost-based optimization Dynamic Programming

More information

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory

More information

Course Content. Objectives of Lecture? CMPUT 391: Spatial Data Management Dr. Jörg Sander & Dr. Osmar R. Zaïane. University of Alberta

Course Content. Objectives of Lecture? CMPUT 391: Spatial Data Management Dr. Jörg Sander & Dr. Osmar R. Zaïane. University of Alberta Database Management Systems Winter 2002 CMPUT 39: Spatial Data Management Dr. Jörg Sander & Dr. Osmar. Zaïane University of Alberta Chapter 26 of Textbook Course Content Introduction Database Design Theory

More information

Set 2: State-spaces and Uninformed Search. ICS 271 Fall 2015 Kalev Kask

Set 2: State-spaces and Uninformed Search. ICS 271 Fall 2015 Kalev Kask Set 2: State-spaces and Uninformed Search ICS 271 Fall 2015 Kalev Kask You need to know State-space based problem formulation State space (graph) Search space Nodes vs. states Tree search vs graph search

More information

TABU search and Iterated Local Search classical OR methods

TABU search and Iterated Local Search classical OR methods TABU search and Iterated Local Search classical OR methods tks@imm.dtu.dk Informatics and Mathematical Modeling Technical University of Denmark 1 Outline TSP optimization problem Tabu Search (TS) (most

More information

Outline. TABU search and Iterated Local Search classical OR methods. Traveling Salesman Problem (TSP) 2-opt

Outline. TABU search and Iterated Local Search classical OR methods. Traveling Salesman Problem (TSP) 2-opt TABU search and Iterated Local Search classical OR methods Outline TSP optimization problem Tabu Search (TS) (most important) Iterated Local Search (ILS) tks@imm.dtu.dk Informatics and Mathematical Modeling

More information

Chapter 13: Query Optimization. Chapter 13: Query Optimization

Chapter 13: Query Optimization. Chapter 13: Query Optimization Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical

More information

Heuristic Optimisation

Heuristic Optimisation Heuristic Optimisation Part 3: Classification of algorithms. Exhaustive search Sándor Zoltán Németh http://web.mat.bham.ac.uk/s.z.nemeth s.nemeth@bham.ac.uk University of Birmingham S Z Németh (s.nemeth@bham.ac.uk)

More information

4 INFORMED SEARCH AND EXPLORATION. 4.1 Heuristic Search Strategies

4 INFORMED SEARCH AND EXPLORATION. 4.1 Heuristic Search Strategies 55 4 INFORMED SEARCH AND EXPLORATION We now consider informed search that uses problem-specific knowledge beyond the definition of the problem itself This information helps to find solutions more efficiently

More information

The Stratosphere Platform for Big Data Analytics

The Stratosphere Platform for Big Data Analytics The Stratosphere Platform for Big Data Analytics Hongyao Ma Franco Solleza April 20, 2015 Stratosphere Stratosphere Stratosphere Big Data Analytics BIG Data Heterogeneous datasets: structured / unstructured

More information

OSPF (Open Shortest Path First)

OSPF (Open Shortest Path First) OSPF (Open Shortest Path First) Open : specification publicly available RFC 1247, RFC 2328 Working group formed in 1988 Goals: Large, heterogeneous internetworks Uses the Link State algorithm Topology

More information

Nominal Data. May not have a numerical representation Distance measures might not make sense. PR and ANN

Nominal Data. May not have a numerical representation Distance measures might not make sense. PR and ANN NonMetric Data Nominal Data So far we consider patterns to be represented by feature vectors of real or integer values Easy to come up with a distance (similarity) measure by using a variety of mathematical

More information

Local Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld )

Local Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) Local Search and Optimization Chapter 4 Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) 1 Outline Local search techniques and optimization Hill-climbing

More information

CS 188: Artificial Intelligence. Recap Search I

CS 188: Artificial Intelligence. Recap Search I CS 188: Artificial Intelligence Review of Search, CSPs, Games DISCLAIMER: It is insufficient to simply study these slides, they are merely meant as a quick refresher of the high-level ideas covered. You

More information

STAT 598L Probabilistic Graphical Models. Instructor: Sergey Kirshner. Exact Inference

STAT 598L Probabilistic Graphical Models. Instructor: Sergey Kirshner. Exact Inference STAT 598L Probabilistic Graphical Models Instructor: Sergey Kirshner Exact Inference What To Do With Bayesian/Markov Network? Compact representation of a complex model, but Goal: efficient extraction of

More information

Lecture 5: Search Algorithms for Discrete Optimization Problems

Lecture 5: Search Algorithms for Discrete Optimization Problems Lecture 5: Search Algorithms for Discrete Optimization Problems Definitions Discrete optimization problem (DOP): tuple (S, f), S finite set of feasible solutions, f : S R, cost function. Objective: find

More information

EXECUTION PLAN OPTIMIZATION TECHNIQUES

EXECUTION PLAN OPTIMIZATION TECHNIQUES EXECUTION PLAN OPTIMIZATION TECHNIQUES Július Štroffek Database Sustaining Engineer Sun Microsystems, Inc. Tomáš Kovařík Faculty of Mathematics and Physics, Charles University, Prague PostgreSQL Conference,

More information

Lecture #14 Optimizer Implementation (Part I)

Lecture #14 Optimizer Implementation (Part I) 15-721 ADVANCED DATABASE SYSTEMS Lecture #14 Optimizer Implementation (Part I) Andy Pavlo / Carnegie Mellon University / Spring 2016 @Andy_Pavlo // Carnegie Mellon University // Spring 2017 2 TODAY S AGENDA

More information

Outline. Motivation. Traditional Database Systems. A Distributed Indexing Scheme for Multi-dimensional Range Queries in Sensor Networks

Outline. Motivation. Traditional Database Systems. A Distributed Indexing Scheme for Multi-dimensional Range Queries in Sensor Networks A Distributed Indexing Scheme for Multi-dimensional Range Queries in Sensor Networks Tingjian Ge Outline Introduction and Overview Concepts and Technology Inserting Events into the Index Querying the Index

More information

Parallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism

Parallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism Parallel DBMS Parallel Database Systems CS5225 Parallel DB 1 Uniprocessor technology has reached its limit Difficult to build machines powerful enough to meet the CPU and I/O demands of DBMS serving large

More information

Query Optimization. Chapter 13

Query Optimization. Chapter 13 Query Optimization Chapter 13 What we want to cover today Overview of query optimization Generating equivalent expressions Cost estimation 432/832 2 Chapter 13 Query Optimization OVERVIEW 432/832 3 Query

More information

Lecture 8. Database Management and Queries

Lecture 8. Database Management and Queries Lecture 8 Database Management and Queries Lecture 8: Outline I. Database Components II. Database Structures A. Conceptual, Logical, and Physical Components III. Non-Relational Databases A. Flat File B.

More information

Introduction to Combinatorial Algorithms

Introduction to Combinatorial Algorithms Fall 2009 Intro Introduction to the course What are : Combinatorial Structures? Combinatorial Algorithms? Combinatorial Problems? Combinatorial Structures Combinatorial Structures Combinatorial structures

More information

HYRISE In-Memory Storage Engine

HYRISE In-Memory Storage Engine HYRISE In-Memory Storage Engine Martin Grund 1, Jens Krueger 1, Philippe Cudre-Mauroux 3, Samuel Madden 2 Alexander Zeier 1, Hasso Plattner 1 1 Hasso-Plattner-Institute, Germany 2 MIT CSAIL, USA 3 University

More information

Machine Learning based Compilation

Machine Learning based Compilation Machine Learning based Compilation Michael O Boyle March, 2014 1 Overview Machine learning - what is it and why is it useful? Predictive modelling OSE Scheduling and low level optimisation Loop unrolling

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

Nominal Data. May not have a numerical representation Distance measures might not make sense PR, ANN, & ML

Nominal Data. May not have a numerical representation Distance measures might not make sense PR, ANN, & ML Decision Trees Nominal Data So far we consider patterns to be represented by feature vectors of real or integer values Easy to come up with a distance (similarity) measure by using a variety of mathematical

More information

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch Advanced Databases Lecture 4 - Query Optimization Masood Niazi Torshiz Islamic Azad university- Mashhad Branch www.mniazi.ir Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Algorithm Design (4) Metaheuristics

Algorithm Design (4) Metaheuristics Algorithm Design (4) Metaheuristics Takashi Chikayama School of Engineering The University of Tokyo Formalization of Constraint Optimization Minimize (or maximize) the objective function f(x 0,, x n )

More information

Local Search for CSPs

Local Search for CSPs Local Search for CSPs Alan Mackworth UBC CS CSP February, 0 Textbook. Lecture Overview Domain splitting: recap, more details & pseudocode Local Search Time-permitting: Stochastic Local Search (start) Searching

More information

Local Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld )

Local Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) Local Search and Optimization Chapter 4 Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) 1 2 Outline Local search techniques and optimization Hill-climbing

More information

Local Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld )

Local Search and Optimization Chapter 4. Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) Local Search and Optimization Chapter 4 Mausam (Based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld ) 1 2 Outline Local search techniques and optimization Hill-climbing

More information

CS542. Algorithms on Secondary Storage Sorting Chapter 13. Professor E. Rundensteiner. Worcester Polytechnic Institute

CS542. Algorithms on Secondary Storage Sorting Chapter 13. Professor E. Rundensteiner. Worcester Polytechnic Institute CS542 Algorithms on Secondary Storage Sorting Chapter 13. Professor E. Rundensteiner Lesson: Using secondary storage effectively Data too large to live in memory Regular algorithms on small scale only

More information

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement. COS 597: Principles of Database and Information Systems Query Optimization Query Optimization Query as expression over relational algebraic operations Get evaluation (parse) tree Leaves: base relations

More information

Preclass Warmup. ESE535: Electronic Design Automation. Motivation (1) Today. Bisection Width. Motivation (2)

Preclass Warmup. ESE535: Electronic Design Automation. Motivation (1) Today. Bisection Width. Motivation (2) ESE535: Electronic Design Automation Preclass Warmup What cut size were you able to achieve? Day 4: January 28, 25 Partitioning (Intro, KLFM) 2 Partitioning why important Today Can be used as tool at many

More information

A Comparison of Knives for Bread Slicing

A Comparison of Knives for Bread Slicing A Comparison of Knives for Bread Slicing Alekh Jindal Endre Palatinus Vladimir Pavlov Jens Dittrich Information Systems Group, Saarland University http://infosys.cs.uni-saarland.de ABSTRACT Vertical partitioning

More information

Midterm Examination CS 540-2: Introduction to Artificial Intelligence

Midterm Examination CS 540-2: Introduction to Artificial Intelligence Midterm Examination CS 54-2: Introduction to Artificial Intelligence March 9, 217 LAST NAME: FIRST NAME: Problem Score Max Score 1 15 2 17 3 12 4 6 5 12 6 14 7 15 8 9 Total 1 1 of 1 Question 1. [15] State

More information

Architecture and Implementation of Database Systems (Winter 2015/16)

Architecture and Implementation of Database Systems (Winter 2015/16) Jens Teubner Architecture & Implementation of DBMS Winter 2015/16 1 Architecture and Implementation of Database Systems (Winter 2015/16) Jens Teubner, DBIS Group jens.teubner@cs.tu-dortmund.de Winter 2015/16

More information

Data Preprocessing. Why Data Preprocessing? MIT-652 Data Mining Applications. Chapter 3: Data Preprocessing. Multi-Dimensional Measure of Data Quality

Data Preprocessing. Why Data Preprocessing? MIT-652 Data Mining Applications. Chapter 3: Data Preprocessing. Multi-Dimensional Measure of Data Quality Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data e.g., occupation = noisy: containing

More information

Local Search. (Textbook Chpt 4.8) Computer Science cpsc322, Lecture 14. May, 30, CPSC 322, Lecture 14 Slide 1

Local Search. (Textbook Chpt 4.8) Computer Science cpsc322, Lecture 14. May, 30, CPSC 322, Lecture 14 Slide 1 Local Search Computer Science cpsc322, Lecture 14 (Textbook Chpt 4.8) May, 30, 2017 CPSC 322, Lecture 14 Slide 1 Announcements Assignment1 due now! Assignment2 out today CPSC 322, Lecture 10 Slide 2 Lecture

More information

Mining Frequent Patterns without Candidate Generation

Mining Frequent Patterns without Candidate Generation Mining Frequent Patterns without Candidate Generation Outline of the Presentation Outline Frequent Pattern Mining: Problem statement and an example Review of Apriori like Approaches FP Growth: Overview

More information

CIS 192: Artificial Intelligence. Search and Constraint Satisfaction Alex Frias Nov. 30 th

CIS 192: Artificial Intelligence. Search and Constraint Satisfaction Alex Frias Nov. 30 th CIS 192: Artificial Intelligence Search and Constraint Satisfaction Alex Frias Nov. 30 th What is AI? Designing computer programs to complete tasks that are thought to require intelligence 4 categories

More information

The BKZ algorithm. Joop van de Pol

The BKZ algorithm. Joop van de Pol The BKZ algorithm Department of Computer Science, University f Bristol, Merchant Venturers Building, Woodland Road, Bristol, BS8 1UB United Kingdom. May 9th, 2014 The BKZ algorithm Slide 1 utline Lattices

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University HW2 Q1.1 parts (b) and (c) cancelled. HW3 released. It is long. Start early. CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/26/17 Jure Leskovec, Stanford

More information

Advanced Databases. Lecture 15- Parallel Databases (continued) Masood Niazi Torshiz Islamic Azad University- Mashhad Branch

Advanced Databases. Lecture 15- Parallel Databases (continued) Masood Niazi Torshiz Islamic Azad University- Mashhad Branch Advanced Databases Lecture 15- Parallel Databases (continued) Masood Niazi Torshiz Islamic Azad University- Mashhad Branch www.mniazi.ir Parallel Join The join operation requires pairs of tuples to be

More information