Other Query Languages II Winter 2006-2007 Lecture 12
Last Lecture
Previously discussed tuple relational calculus
  Purely declarative query language
  Same expressive power as relational algebra
  Could also express unsafe statements
    Statements that generate infinite relations!
Datalog language based on relational calculus
  Very succinct, clean language for stating queries
  Some powerful features, such as recursive queries
  Not used in commercial database application development
Domain Relational Calculus
Another form of relational calculus
Instead of tuple variables, uses domain variables
  Values range over an attribute's domain
Very similar to tuple relational calculus
Queries have the form:
  { < x1, x2, …, xn > | P(x1, x2, …, xn) }
  xi are domain variables
  Schema of result specified by < x1, x2, …, xn >
  P is a formula composed of atoms
Formal Definitions
Valid atoms for P:
  < x1, x2, …, xn > ∈ r
    r is a relation with n attributes
  x Θ y
    x and y are domain variables
    Θ is a comparison operator
  x Θ c
    c is a constant
All atoms are also formulas
Formal Definitions (2)
Compositions of atoms and formulas:
If P1 is a formula, then so are ¬P1 and (P1)
If P1 and P2 are formulae, then so are:
  P1 ∨ P2
  P1 ∧ P2
  P1 ⇒ P2
If P1(x) is a formula where x is a free domain variable, then so are:
  ∀ x (P1(x))    for all values of x, P1(x) is true
  ∃ x (P1(x))    there exists a value of x for which P1(x) is true
Shorthand:
  ∃ a, b, c (P1(a, b, c)) instead of ∃ a (∃ b (∃ c (P1(a, b, c))))
Example Queries
Find all details of loans over $1200.
In tuple relational calculus:
  { t | t ∈ loan ∧ t[amount] > 1200 }
In domain relational calculus:
  { < n, b, a > | < n, b, a > ∈ loan ∧ a > 1200 }
Very similar to tuple relational calculus form
Example Queries (2)
Find loan numbers for loans over $1200.
In tuple relational calculus:
  { t | ∃ s ∈ loan ( t[loan_number] = s[loan_number] ∧ s[amount] > 1200 ) }
In domain relational calculus:
  { < n > | ∃ b, a ( < n, b, a > ∈ loan ∧ a > 1200 ) }
Difference is when variables are constrained
  s is bound to loan immediately, by ∃ s ∈ loan
  b, a are initially unconstrained, until formula < n, b, a > ∈ loan
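As a rough illustration, the two domain relational calculus queries on loans over $1200 map closely onto Python set comprehensions. The sample tuples below are made up for the example; only the schema loan(loan_number, branch_name, amount) comes from the slides.

```python
# Hypothetical sample data for loan(loan_number, branch_name, amount).
loan = {
    ("L-11", "Round Hill", 900),
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
    ("L-17", "Downtown", 1000),
}

# { < n, b, a > | < n, b, a > ∈ loan ∧ a > 1200 }
# Every domain variable appears in the result, so whole tuples come back.
loans_over_1200 = {(n, b, a) for (n, b, a) in loan if a > 1200}

# { < n > | ∃ b, a ( < n, b, a > ∈ loan ∧ a > 1200 ) }
# b and a are existentially quantified: they are matched but not returned.
loan_numbers_over_1200 = {n for (n, b, a) in loan if a > 1200}

print(sorted(loans_over_1200))
print(sorted(loan_numbers_over_1200))
```

Because the results are Python sets, duplicates are eliminated automatically, which matches the set semantics of the calculus.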
Joining Relations
This query requires multiple relations:
  Find the names of customers with loans at the Perryridge branch, and the loan amounts.
Domain relational calculus:
  { < c, a > | ∃ n ( < c, n > ∈ borrower ∧
               ∃ b ( < n, b, a > ∈ loan ∧ b = "Perryridge" )) }
All customer names and amounts, such that:
  Customer name appears in borrower, and
  Associated loan number in borrower also appears in loan, with a branch name of "Perryridge"
Joining Relations (2)
Domain relational calculus:
  { < c, a > | ∃ n ( < c, n > ∈ borrower ∧
               ∃ b ( < n, b, a > ∈ loan ∧ b = "Perryridge" )) }
Tuple relational calculus:
  { t | ∃ s ∈ borrower ( t[customer_name] = s[customer_name] ∧
        ∃ u ∈ loan ( u[loan_number] = s[loan_number] ∧
                     t[amount] = u[amount] ∧
                     u[branch_name] = "Perryridge" )) }
Join operation is more implicit in DRC form
Statement of relationships is very explicit in TRC
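The implicit join in the DRC form can be sketched in Python as a nested comprehension. The data is invented for illustration; the schemas borrower(customer_name, loan_number) and loan(loan_number, branch_name, amount) follow the slides.

```python
# Hypothetical sample data; only the schemas come from the slides.
borrower = {("Adams", "L-16"), ("Hayes", "L-15"), ("Smith", "L-11")}
loan = {
    ("L-11", "Round Hill", 900),
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
}

# { < c, a > | ∃ n ( < c, n > ∈ borrower ∧
#              ∃ b ( < n, b, a > ∈ loan ∧ b = "Perryridge" )) }
perryridge_loans = {
    (c, a)
    for (c, n) in borrower
    for (n2, b, a) in loan
    if n2 == n and b == "Perryridge"  # shared variable n becomes an equality test
}

print(sorted(perryridge_loans))
```

In the calculus, reusing the variable n in both atoms is what joins the relations; Python has no such unification, so the sketch binds a second name (n2) and compares it explicitly.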
Set Operations
Find customers with an account or a loan.
  Set union operation
Domain relational calculus:
  { < c > | ∃ ln ( < c, ln > ∈ borrower ) ∨ ∃ an ( < c, an > ∈ depositor ) }
Can change to set intersection, set difference with simple modifications
Identical to tuple relational calculus
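A small sketch of the "simple modifications": swapping the connective between the two existential subformulas switches between union, intersection, and difference. The sample tuples are hypothetical.

```python
# Hypothetical sample data; only the schemas come from the slides.
borrower = {("Adams", "L-16"), ("Hayes", "L-15"), ("Smith", "L-11")}
depositor = {("Hayes", "A-102"), ("Johnson", "A-101"), ("Smith", "A-215")}

# ∃ ln ( < c, ln > ∈ borrower )
has_loan = {c for (c, ln) in borrower}
# ∃ an ( < c, an > ∈ depositor )
has_account = {c for (c, an) in depositor}

account_or_loan = has_loan | has_account    # subformulas joined by ∨
account_and_loan = has_loan & has_account   # subformulas joined by ∧
account_no_loan = has_account - has_loan    # depositor part ∧ ¬ borrower part

print(sorted(account_or_loan))
print(sorted(account_and_loan))
print(sorted(account_no_loan))
```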
Safety of Expressions
Like tuple relational calculus, can specify unsafe expressions
  { < n, b, a > | ¬( < n, b, a > ∈ loan ) }
Same issue as before: result contains values outside the domain of the formula
What about this:
  { < x > | ∃ y ( < x, y > ∈ r ) ∧ ∃ z ( ¬( < x, z > ∈ r ) ∧ P(x, z) ) }
Domain variable z can range over infinite values!
  Can't evaluate second half of this formula
Safety of Expressions (2)
For an expression:
  { < x1, x2, …, xn > | P(x1, x2, …, xn) }
Considered safe if these rules hold:
  All values that appear in the expression's result are from dom(P)
  For every "there exists" subformula of form ∃ x (P1(x)), the subformula is true iff there is a value x in dom(P1) such that P1(x) is true
  For every "for all" subformula of form ∀ x (P1(x)), the subformula is true iff P1(x) is true for all values x from dom(P1)
Ensures that no domain variable has an infinite set of values
Safety of Expressions (3)
For this query:
  { < x > | ∃ y ( < x, y > ∈ r ) ∧ ∃ z ( ¬( < x, z > ∈ r ) ∧ P(x, z) ) }
Formula ∃ z ( ¬( < x, z > ∈ r ) ∧ P(x, z) ) is true for values of z outside of the formula's domain
Not safe.
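The first safety rule can be illustrated with a short sketch. It assumes the usual textbook reading of dom(P): every value that appears in a relation mentioned in P, plus any constants written in P. The helper name active_domain and the sample data are hypothetical.

```python
def active_domain(relations, constants=()):
    """Collect every value appearing in the given relations or constants."""
    values = set(constants)
    for relation in relations:
        for row in relation:
            values.update(row)
    return values


# Hypothetical sample data for loan(loan_number, branch_name, amount).
loan = {
    ("L-15", "Perryridge", 1500),
    ("L-17", "Downtown", 1000),
}

# Safe: { < n, b, a > | < n, b, a > ∈ loan ∧ a > 1200 }
dom_safe = active_domain([loan], constants=[1200])
safe_result = {(n, b, a) for (n, b, a) in loan if a > 1200}
safe_ok = all(v in dom_safe for row in safe_result for v in row)

# Unsafe: { < n, b, a > | ¬( < n, b, a > ∈ loan ) }
# This made-up tuple satisfies the formula, yet none of its values is in dom(P).
dom_unsafe = active_domain([loan])
witness = ("L-999", "Nowhere", -1)
satisfies_unsafe = witness not in loan
escapes_domain = any(v not in dom_unsafe for v in witness)

print(safe_ok, satisfies_unsafe, escapes_domain)
```

The unsafe expression cannot be materialized at all, so the sketch only exhibits one witness tuple that belongs to its result while lying outside dom(P).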
Domain Relational Calculus
Has same expressive power as tuple relational calculus and relational algebra (if restricted to safe expressions)
As before, grouping and aggregation are extended operations, and must be added
Simpler than tuple relational calculus for some queries
  Schema of an expression is more obvious
  Expressing relationships is very easy
Query By Example
QBE is a query language based on the domain relational calculus
QBE syntax is two-dimensional
  Queries actually look like tables
  Query specifies examples of what to retrieve
Two variants:
  The original text-based version
  A graphical version used in the Microsoft Access Query Design interface
QBE Skeleton Tables
Skeleton tables specify results to retrieve
  Same columns as the actual tables, but with details of what to retrieve
In forming a query, only required tables are displayed
  Limits clutter
Example skeleton table:

  loan | loan_number | branch_name | amount
  -----+-------------+-------------+-------
       |             |             |
Example Queries
Find loan numbers of loans at Perryridge branch

  loan | loan_number | branch_name | amount
  -----+-------------+-------------+-------
       | P._x        | Perryridge  |

P. means "print this value"
_x is a domain variable (not required here)
Perryridge is a literal value
QBE eliminates duplicates automatically
  Can specify ALL. to display all values

  loan | loan_number | branch_name | amount
  -----+-------------+-------------+-------
       | P.ALL.      | Perryridge  |
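Example Queries (2)
To display all tuples in relation:

  loan | loan_number | branch_name | amount
  -----+-------------+-------------+-------
  P.   |             |             |

Can also specify conditions to apply
  Find all loans over $700

  loan | loan_number | branch_name | amount
  -----+-------------+-------------+-------
  P.   |             |             | > 700

Can apply negation
  Find names of all branches not located in Brooklyn

  branch | branch_name | branch_city | assets
  -------+-------------+-------------+-------
         | P.          | ¬Brooklyn   |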
QBE Domain Variables
Can use variables to constrain results
Example: Find all customers who live in same city as Jones

  customer | customer_name | customer_street | customer_city
  ---------+---------------+-----------------+--------------
           | P._x          |                 | _y
           | Jones         |                 | _y

Second row constrains _y to cities associated with customer name Jones
First row prints customer names with same _y value
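One way to read the two skeleton rows is as a self-join on the shared variable _y. The sketch below evaluates the query in two steps over invented customer tuples; the names jones_cities and same_city_as_jones are illustrative only.

```python
# Hypothetical sample data for
# customer(customer_name, customer_street, customer_city).
customer = {
    ("Jones", "Main", "Harrison"),
    ("Hayes", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
}

# Second row: customer_name = Jones binds _y to the city (or cities) of Jones.
jones_cities = {y for (name, street, y) in customer if name == "Jones"}

# First row: print _x for every customer whose city equals some such _y.
same_city_as_jones = {x for (x, street, y) in customer if y in jones_cities}

print(sorted(same_city_as_jones))
```

Note that Jones appears in the result as well, since nothing in the query as written excludes him.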
Multi-Relation Queries
QBE supports multiple-relation queries
Approach is simple:
  Use same variable name in multiple skeleton tables
  QBE constrains results to have matching values
Example: Find names of customers with a loan at Perryridge branch

  borrower | customer_name | loan_number
  ---------+---------------+------------
           | P._y          | _x

  loan | loan_number | branch_name | amount
  -----+-------------+-------------+-------
       | _x          | Perryridge  |
Multi-Relation Queries (2)
Can also perform "not-in" queries
Example: Find names of customers with an account, but no loan.

  depositor | customer_name | account_number
  ----------+---------------+---------------
            | P._x          |

  borrower | customer_name | loan_number
  ---------+---------------+------------
  ¬        | _x            |
The Condition Box
QBE also has a condition box for more general constraints
  Easier way to formulate some queries
Example: Find loan numbers of loans made to Smith or Jones.

  borrower | customer_name | loan_number
  ---------+---------------+------------
           | _n            | P.

  conditions
  ------------------------
  _n = Smith or _n = Jones

Can state this without condition box too, but it's more confusing
Microsoft Access QBE
MS Access includes a QBE interface
  Called Graphical Query-by-Example (GQBE)
Like text-based QBE, tables are selected for each query
  Can also select other queries, to build a query against a derived relation
Unlike QBE, joins are represented graphically, with links between tables
  Don't need to express relationships with variables
Somewhat different layout than text-based QBE
Example GQBE Queries Find loan numbers of loans at Perryridge branch:
Example Queries (2) Report total account deposits per customer city:
Example GQBE Queries (3)
GQBE queries are translated into SQL
Previous query, in SQL:

  SELECT customer.customer_city,
         Sum(account.balance) AS SumOfbalance
  FROM (customer INNER JOIN depositor
          ON customer.customer_name = depositor.customer_name)
       INNER JOIN account
          ON depositor.account_number = account.account_number
  GROUP BY customer.customer_city;

Like all auto-generated SQL, it's grungy
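To make the join-then-group logic of that statement concrete, here is a plain-Python sketch of the same computation. The data is invented, and the sketch assumes customer_name is a key of customer and account_number is a key of account, which the slides do not state.

```python
from collections import defaultdict

# Hypothetical sample data; schemas follow the bank example in the slides.
customer = {
    ("Hayes", "Main", "Harrison"),
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
}
depositor = {("Hayes", "A-102"), ("Jones", "A-217"), ("Smith", "A-215")}
account = {
    ("A-102", "Perryridge", 400),
    ("A-217", "Brighton", 750),
    ("A-215", "Mianus", 700),
}

# Lookup tables stand in for the two INNER JOIN conditions
# (valid only under the key assumptions stated above).
city_of = {name: city for (name, street, city) in customer}
balance_of = {number: balance for (number, branch, balance) in account}

# GROUP BY customer_city with Sum(balance).
sum_of_balance = defaultdict(int)
for (name, number) in depositor:
    if name in city_of and number in balance_of:
        sum_of_balance[city_of[name]] += balance_of[number]

print(dict(sum_of_balance))
```

As in the SQL, a joint account held by two customers in the same city would contribute its balance once per depositor row.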
Query By Example
Another query language, based on domain relational calculus
Language is inherently visual in nature
Allows queries to be stated in a reasonably simple, intuitive way
  User gives examples of what they want to retrieve
Not used in large-scale applications
  MS Access is widely used, but primarily for small, simple databases
  Graphical QBE interface is much simpler than SQL for casual database users
Upcoming Events
Next lecture is the midterm review
  Make sure to come to lecture
Next week we start focusing on database schema design
  Entity-relationship model
  Dependencies and normal forms