Lecture 11 UFCEKG 20 2: Data, Schemas and Applications
Database Theory & Practice (5): Introduction to the Structured Query Language (SQL)
Origins & history
Early 1970s: IBM develops Sequel as part of the System R project at its San Jose Research Lab.
1986: ANSI & ISO publish the standard SQL-86.
1987: IBM publishes its own SQL standard, the Systems Application Architecture Database Interface (SAA-SQL).
1989: SQL-89 published by ANSI (an extended version of SQL-86).
1992: SQL-92 published with better support for algebraic operations.
1999: SQL:1999 published with support for typing, stored procedures, triggers, BLOBs etc.
SQL-92 remains the most widely implemented standard, and most database vendors also provide their own (proprietary) extensions.
Components of SQL
The SQL language has several parts:
Data definition language (DDL). The SQL DDL provides commands for defining relation schemas, deleting relations, and modifying relation schemas.
Interactive data manipulation language (DML). The SQL DML includes a query language based on both the relational algebra and the tuple relational calculus. It also includes commands to insert tuples into, delete tuples from, and modify tuples in the database.
View definition. The SQL DDL includes commands for defining views.
Transaction control. SQL includes commands for specifying the beginning and ending of transactions.
Embedded SQL and dynamic SQL. Embedded and dynamic SQL define how SQL statements can be embedded within general-purpose programming languages, such as C, C++, Java, PL/I, Cobol, Pascal, and Fortran.
Integrity. The SQL DDL includes commands for specifying integrity constraints that the data stored in the database must satisfy. Updates that violate integrity constraints are disallowed.
Authorization. The SQL DDL includes commands for specifying access rights to relations and views.
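The parts listed above can be seen side by side in a small sketch. It assumes SQLite, driven through Python's standard sqlite3 module, as a stand-in for whichever DBMS is used in practice; the CHECK constraint and the view name london_suppliers are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a relation schema, including integrity constraints.
# The CHECK constraint is a hypothetical rule added for illustration.
conn.execute(
    "CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT NOT NULL, "
    "status INTEGER CHECK (status >= 0), city TEXT)"
)

# DML inside a transaction: the with-block commits on success
# and rolls back if an exception is raised.
with conn:
    conn.execute("INSERT INTO s VALUES (1, 'Smith', 20, 'London')")
    conn.execute("INSERT INTO s VALUES (2, 'Jones', 10, 'Paris')")

# View definition (hypothetical view name).
conn.execute(
    "CREATE VIEW london_suppliers AS SELECT sno, sname FROM s WHERE city = 'London'"
)
view_rows = conn.execute("SELECT * FROM london_suppliers").fetchall()

# Integrity: an update that violates the CHECK constraint is disallowed.
try:
    with conn:
        conn.execute("UPDATE s SET status = -5 WHERE sno = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

status_after = conn.execute("SELECT status FROM s WHERE sno = 1").fetchone()[0]
print(view_rows, rejected, status_after)
```

Authorization (GRANT and REVOKE) is left out because SQLite has no user accounts; a server DBMS would be needed to show it.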
SQL Example (example db)
The Supplier Parts Database

s
sno  sname  status  city
1    Smith  20      London
2    Jones  10      Paris
3    Blake  30      Paris
4    Clark  20      London
5    Adams  30      Athens

p
pno  pname  color  weight  city
1    Nut    Red    12.0    London
2    Bolt   Green  17.0    Paris
3    Screw  Blue   17.0    Oslo
4    Screw  Red    14.0    London
5    Cam    Blue   12.0    Paris
6    Cog    Red    19.0    London

sp
sno  pno  qty
1    1    300
1    2    200
1    3    400
1    4    200
1    5    100
1    6    100
2    1    300
2    2    400
3    2    200
4    2    200
4    4    300
4    5    400
SQL Example (project)
Project the columns:
    SELECT sname FROM s

sname
Smith
Jones
Blake
Clark
Adams

Computed columns:
    SELECT sname, status * 5 FROM s

sname  status * 5
Smith  100
Jones  50
Blake  150
Clark  100
Adams  150

Renamed columns:
    SELECT sname AS Supplier, status * 5 AS 'Status times Five' FROM s

Supplier  Status times Five
Smith     100
Jones     50
Blake     150
Clark     100
Adams     150
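A minimal runnable sketch of the renamed and computed columns, assuming SQLite through Python's sqlite3 module as the engine. The ORDER BY sno clause is an addition to make the row order deterministic, and the alias is written with double quotes, the standard SQL form for identifiers containing spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Projection with one renamed column and one computed, renamed column.
cur = conn.execute(
    'SELECT sname AS Supplier, status * 5 AS "Status times Five" FROM s ORDER BY sno'
)
headers = [col[0] for col in cur.description]  # column names as seen by the client
rows = cur.fetchall()

print(headers)
for row in rows:
    print(row)
```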
SELECT statement (restrict)
Restrict the rows:
    SELECT * FROM s WHERE city = 'London'

sno  sname  status  city
1    Smith  20      London
4    Clark  20      London

Complex condition:
    SELECT * FROM s WHERE city = 'London' OR status = 30

sno  sname  status  city
1    Smith  20      London
3    Blake  30      Paris
4    Clark  20      London
5    Adams  30      Athens
SELECT statement (restrict & project)
Restrict & Project:
    SELECT city FROM s WHERE sname = 'Smith' OR status = 20

city
London
London

Remove duplicate rows:
    SELECT DISTINCT city FROM s WHERE sname = 'Smith' OR status = 20

city
London
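The effect of DISTINCT on a restricted and projected result can be checked with a short sketch, again assuming SQLite via Python's sqlite3 module. The name is written as 'Smith' because SQLite compares strings case-sensitively by default, whereas some systems ignore case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

condition = "sname = 'Smith' OR status = 20"

# Restrict (WHERE) then project (city): duplicates are kept.
cities = [row[0] for row in conn.execute(f"SELECT city FROM s WHERE {condition} ORDER BY sno")]

# DISTINCT removes duplicate rows from the projected result.
distinct_cities = [row[0] for row in conn.execute(f"SELECT DISTINCT city FROM s WHERE {condition}")]

print(cities, distinct_cities)
```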
SELECT statement (group by & having)
Group By and Having
Use the GROUP BY clause to aggregate related rows:
    SELECT city, SUM(status) AS 'Total Status' FROM s GROUP BY city

city    Total Status
Athens  30
London  40
Paris   40

Use the HAVING clause to restrict rows aggregated with GROUP BY:
    SELECT city, SUM(status) AS 'Total Status' FROM s GROUP BY city HAVING SUM(status) > 30

city    Total Status
London  40
Paris   40
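The difference between the grouped result and the HAVING-filtered result can be reproduced with a small sketch (SQLite via Python's sqlite3 is an assumption; the alias total_status and the ORDER BY city clause are added here for a deterministic, easily compared output).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# One output row per city, with the status values of that city summed.
totals = conn.execute(
    "SELECT city, SUM(status) AS total_status FROM s GROUP BY city ORDER BY city"
).fetchall()

# HAVING filters the groups after aggregation (WHERE would filter rows before it).
large_totals = conn.execute(
    "SELECT city, SUM(status) AS total_status FROM s "
    "GROUP BY city HAVING SUM(status) > 30 ORDER BY city"
).fetchall()

print(totals)
print(large_totals)
```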
SELECT statement summarized
In many modern uses of databases, it is necessary to select some subset of the records from a table and let some other program manipulate the results. In SQL the SELECT statement is the workhorse for these operations. A summary of the SELECT statement:
    SELECT columns or computations
    FROM table
    WHERE condition
    GROUP BY columns
    HAVING condition
    ORDER BY column [ASC | DESC]
    LIMIT offset, count;
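A sketch that uses every clause of the summary in one query, assuming SQLite via Python's sqlite3 module (SQLite accepts the LIMIT offset, count form shown in the summary). The thresholds in WHERE and HAVING are arbitrary values chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

query = """
    SELECT city, SUM(status) AS total   -- columns or computations
    FROM s                              -- table
    WHERE status >= 20                  -- row filter, applied before grouping
    GROUP BY city                       -- one group per city
    HAVING SUM(status) >= 30            -- group filter, applied after aggregation
    ORDER BY total DESC, city ASC       -- sort the surviving groups
    LIMIT 1, 2                          -- skip 1 row, then return at most 2
"""
page = conn.execute(query).fetchall()
print(page)
```

Jones (status 10) is removed by WHERE, so Paris totals 30; the sorted groups are London 40, Athens 30, Paris 30, and the LIMIT clause skips London.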
SQL Comparison operators
In SQL, the WHERE clause is used to operate on subsets of a table. The following comparison operators are available:
Usual comparison operators: < > <= >= = <>
BETWEEN: used to test for a range
IN: used to test group membership
NOT: keyword used for negation
LIKE: operator that allows wildcards; _ matches any single character, % matches any sequence of characters
    SELECT salary FROM table WHERE name LIKE 'Fred %';
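The operators above can be tried against the supplier table s with a short sketch. It assumes SQLite via Python's sqlite3 module; names() is a hypothetical helper that pastes a condition into a WHERE clause, which is acceptable for fixed example strings but should not be done with user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)


def names(condition):
    """Return supplier names matching a WHERE condition, in sno order."""
    query = f"SELECT sname FROM s WHERE {condition} ORDER BY sno"
    return [row[0] for row in conn.execute(query)]


not_twenty = names("status <> 20")                     # inequality
in_range = names("status BETWEEN 20 AND 30")           # range test, both ends inclusive
in_group = names("city IN ('London', 'Athens')")       # group membership
not_in_group = names("city NOT IN ('London', 'Athens')")  # negation
second_la = names("sname LIKE '_la%'")                 # any one character, then 'la', then anything

print(not_twenty, in_range, in_group, not_in_group, second_la)
```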
SQL data types
SQL supports a very large number of data types & formats for internal storage of data.
Numeric
INTEGER, SMALLINT, BIGINT
NUMERIC(w,d), DECIMAL(w,d): numbers with width w and d decimal places
REAL, DOUBLE PRECISION: machine and database dependent
FLOAT(p): floating point number with p binary digits of precision
SQL data types (cont.)
Character
CHARACTER(L): a fixed-length character string of length L
CHARACTER VARYING(L) or VARCHAR(L): a variable-length character string with a maximum length of L
Binary
BIT(L), BIT VARYING(L): like the corresponding character types
BINARY LARGE OBJECT(L) or BLOB(L)
Temporal
DATE
TIME
TIMESTAMP
SQL Functions
SQL provides a wide range of predefined functions to perform data manipulation. Four types of functions:
arithmetic (sqrt(), log(), mod(), round())
date (sysdate(), month(), dayname())
character (length(), lower(), upper())
aggregate (min(), max(), avg(), sum())
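One function from each of the four groups, as a sketch assuming SQLite via Python's sqlite3 module. Function names vary between systems: sysdate(), month() and dayname() are MySQL-style, so SQLite's date() and strftime() are used here instead, and the % operator stands in for mod(). The date literal is an arbitrary example value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Arithmetic: rounding, absolute value, remainder.
arithmetic = conn.execute("SELECT round(17.0 / 3, 2), abs(-4), 17 % 5").fetchone()

# Date: extract the month, and add one month to a date.
dates = conn.execute(
    "SELECT strftime('%m', '2024-03-15'), date('2024-03-15', '+1 month')"
).fetchone()

# Character: applied to each selected row.
characters = conn.execute(
    "SELECT length(sname), lower(sname), upper(sname) FROM s WHERE sno = 1"
).fetchone()

# Aggregate: computed over all rows of the table.
aggregates = conn.execute(
    "SELECT MIN(status), MAX(status), AVG(status), SUM(status) FROM s"
).fetchone()

print(arithmetic, dates, characters, aggregates)
```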
Database & Table description commands
Since a single server can support many databases, each containing many tables, with each table having a variety of columns, it's often necessary to view which databases are available and what the table structures are within a particular database. The following commands (MySQL syntax) are often used for these purposes:
    SHOW DATABASES;
    SHOW TABLES IN database;
    SHOW COLUMNS IN table;
    DESCRIBE table;
DESCRIBE shows the columns and their types.
Inserting Records
Individual records can be entered using the INSERT command:
    INSERT INTO s VALUES (6, 'Thomas', 40, 'Cardiff');
Using the column names:
    INSERT INTO s (sno, sname, status, city) VALUES (6, 'Thomas', 40, 'Cardiff');
Insert multiple records:
    INSERT INTO s (sno, sname, status, city) VALUES (6, 'Thomas', 40, 'Cardiff'), (7, 'Hamish', 30, 'Glasgow');
Upload from file:
    LOAD DATA INFILE 'supplier.tab' INTO TABLE s FIELDS TERMINATED BY '\t';
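The INSERT forms can be exercised in sequence with the sketch below (SQLite via Python's sqlite3 is an assumption). Because sno is the primary key, each form needs its own key value here; suppliers 8 and 9 are hypothetical rows invented for the multi-row form. LOAD DATA INFILE is MySQL-specific and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Positional form: values must follow the column order of the table.
conn.execute("INSERT INTO s VALUES (6, 'Thomas', 40, 'Cardiff')")

# Named-column form: robust against later changes to the column order.
conn.execute("INSERT INTO s (sno, sname, status, city) VALUES (7, 'Hamish', 30, 'Glasgow')")

# Multi-row form (rows 8 and 9 are hypothetical example data).
conn.execute(
    "INSERT INTO s (sno, sname, status, city) "
    "VALUES (8, 'Evans', 10, 'Leeds'), (9, 'Price', 20, 'Bristol')"
)

# Re-using an existing primary key value is rejected.
try:
    conn.execute("INSERT INTO s VALUES (6, 'Thomas', 40, 'Cardiff')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

supplier_count = conn.execute("SELECT COUNT(*) FROM s").fetchone()[0]
print(supplier_count, duplicate_rejected)
```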
Updating (Editing) Existing Records
To change one or more column values in a table, the UPDATE command can be used. Edits are provided as a comma-separated list of column/value pairs.
    UPDATE s SET status = status + 10 WHERE city = 'London';
Note that the UPDATE command without a WHERE clause will update all the rows of a table.
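A sketch of the UPDATE above, assuming SQLite via Python's sqlite3 module. The cursor's rowcount attribute reports how many rows the statement changed, which is a quick way to confirm that the WHERE clause matched what was intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Only the London suppliers (Smith and Clark) are changed.
cur = conn.execute("UPDATE s SET status = status + 10 WHERE city = 'London'")
rows_changed = cur.rowcount

statuses = conn.execute("SELECT sno, status FROM s ORDER BY sno").fetchall()
print(rows_changed, statuses)
```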
Deleting Records
To delete one or more existing records, the DELETE FROM command is used.
    DELETE FROM s WHERE city = 'London';
Note the WHERE clause in the DELETE syntax. The WHERE clause specifies which record or records should be deleted. If the WHERE clause is omitted, all records will be deleted!
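A sketch of both cases, assuming SQLite via Python's sqlite3 module. Counting the matching rows with the same WHERE clause before deleting is a precaution added here, not something the DELETE syntax requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno INTEGER PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [(1, "Smith", 20, "London"), (2, "Jones", 10, "Paris"), (3, "Blake", 30, "Paris"),
     (4, "Clark", 20, "London"), (5, "Adams", 30, "Athens")],
)

# Preview how many rows the WHERE clause matches before deleting them.
to_delete = conn.execute("SELECT COUNT(*) FROM s WHERE city = 'London'").fetchone()[0]

conn.execute("DELETE FROM s WHERE city = 'London'")
remaining = [row[0] for row in conn.execute("SELECT sname FROM s ORDER BY sno")]

# Without a WHERE clause every remaining row is removed; the empty table still exists.
conn.execute("DELETE FROM s")
count_after_delete_all = conn.execute("SELECT COUNT(*) FROM s").fetchone()[0]

print(to_delete, remaining, count_after_delete_all)
```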
Normalization (avoiding redundancy)
Repeating data (the same column values across many records) wastes space (redundancy) and introduces insert & update anomalies. To avoid this, tables are often normalized and repeating fields are moved to their own tables. These are then related to the base or parent table using foreign keys. For instance, in the Quote example, author and category are moved to their own tables, since a specific category can have many associated quotes and an author can be the source of many quotes.
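A minimal sketch of that idea, assuming SQLite via Python's sqlite3 module. The schema of the Quote example is not given here, so every table name, column name and row below is hypothetical placeholder material. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Hypothetical normalized schema: author and category live in their own tables.
conn.executescript("""
CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE category (category_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE quote (
    quote_id INTEGER PRIMARY KEY,
    body TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES author(author_id),
    category_id INTEGER NOT NULL REFERENCES category(category_id)
);
INSERT INTO author VALUES (1, 'Author A'), (2, 'Author B');
INSERT INTO category VALUES (1, 'Category X');
INSERT INTO quote VALUES (1, 'First quote', 1, 1), (2, 'Second quote', 1, 1),
                         (3, 'Third quote', 2, 1);
""")

# Each author name is stored once; a join recombines it with the quotes.
rows = conn.execute("""
    SELECT q.body, a.name, c.label
    FROM quote AS q
    INNER JOIN author AS a ON q.author_id = a.author_id
    INNER JOIN category AS c ON q.category_id = c.category_id
    ORDER BY q.quote_id
""").fetchall()

# The foreign key rejects a quote that points at a non-existent author.
try:
    conn.execute("INSERT INTO quote VALUES (4, 'Orphan quote', 99, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows, rejected)
```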
Joins (1)
Joins are used to recombine records which have data spread across many tables. The following simple example database with two tables, m and f, is used to illustrate the various kinds of joins.

The m f database

m
id  name   age
1   tom    23
2   dick   20
3   harry  30

f
id  name  age
1   mary  23
2   anne  30
3   sue   34
Joins (2)
Product (or Cartesian Product)
    SELECT * FROM m, f

id  name   age  id  name  age
1   tom    23   1   mary  23
2   dick   20   1   mary  23
3   harry  30   1   mary  23
1   tom    23   2   anne  30
2   dick   20   2   anne  30
3   harry  30   2   anne  30
1   tom    23   3   sue   34
2   dick   20   3   sue   34
3   harry  30   3   sue   34

Synonymous with the CROSS JOIN, hence:
    SELECT * FROM m CROSS JOIN f;
would return the same result. On its own this is not very useful, but it is the basis for all other joins.
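The equivalence of the two forms can be checked with a short sketch, assuming SQLite via Python's sqlite3 module. The results are sorted before comparison because SQL does not guarantee a row order without ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE m (id INTEGER, name TEXT, age INTEGER);
CREATE TABLE f (id INTEGER, name TEXT, age INTEGER);
INSERT INTO m VALUES (1, 'tom', 23), (2, 'dick', 20), (3, 'harry', 30);
INSERT INTO f VALUES (1, 'mary', 23), (2, 'anne', 30), (3, 'sue', 34);
""")

# Comma form and CROSS JOIN form of the Cartesian product.
comma_form = sorted(conn.execute("SELECT * FROM m, f").fetchall())
cross_form = sorted(conn.execute("SELECT * FROM m CROSS JOIN f").fetchall())

# Every row of m is paired with every row of f: 3 x 3 = 9 rows.
print(len(comma_form), comma_form == cross_form)
```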
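Joins (3)
Natural join
Joins tables using some shared characteristic, usually (but not necessarily) a foreign key.
    SELECT * FROM m, f WHERE m.age = f.age

id  name   age  id  name  age
1   tom    23   1   mary  23
3   harry  30   2   anne  30

Strictly speaking, a join on an explicit equality condition like this one is an equi-join; SQL's NATURAL JOIN keyword matches on all identically named columns (here id, name and age) and returns each shared column only once.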
Joins (4)
Inner joins
The previous example, besides being a natural join, is also an example of an inner join. An inner join retrieves data only from those rows where the join condition is met.
    SELECT * FROM m, f WHERE m.age > f.age

id  name   age  id  name  age
3   harry  30   1   mary  23
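Inner joins can also be written with the explicit INNER JOIN ... ON syntax, which keeps the join condition apart from any other WHERE filters. The sketch below, assuming SQLite via Python's sqlite3 module, runs an equality condition and the greater-than condition in both styles and compares them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE m (id INTEGER, name TEXT, age INTEGER);
CREATE TABLE f (id INTEGER, name TEXT, age INTEGER);
INSERT INTO m VALUES (1, 'tom', 23), (2, 'dick', 20), (3, 'harry', 30);
INSERT INTO f VALUES (1, 'mary', 23), (2, 'anne', 30), (3, 'sue', 34);
""")

# Equality condition: pairs with the same age.
same_age = conn.execute(
    "SELECT m.name, f.name FROM m INNER JOIN f ON m.age = f.age ORDER BY m.id"
).fetchall()

# Inequality condition, written in the WHERE style and the INNER JOIN style.
older_where = conn.execute(
    "SELECT m.name, f.name FROM m, f WHERE m.age > f.age"
).fetchall()
older_join = conn.execute(
    "SELECT m.name, f.name FROM m INNER JOIN f ON m.age > f.age"
).fetchall()

print(same_age, older_where, older_join)
```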
Joins (5)
Outer joins
Unmatched rows can be included in the output using an outer join.
Left outer join:
    SELECT * FROM m LEFT OUTER JOIN f ON m.age = f.age

id  name   age  id    name  age
1   tom    23   1     mary  23
2   dick   20   NULL  NULL  NULL
3   harry  30   2     anne  30

Right outer join:
    SELECT * FROM m RIGHT OUTER JOIN f ON m.age = f.age

id    name   age   id  name  age
1     tom    23    1   mary  23
3     harry  30    2   anne  30
NULL  NULL   NULL  3   sue   34
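Both outer joins can be reproduced with the sketch below, assuming SQLite via Python's sqlite3 module. SQLite only accepts RIGHT OUTER JOIN from version 3.39, so the right outer join is emulated here by swapping the tables in a left outer join and listing m's columns first; this swap is a substitution, not the statement as written above. NULL arrives in Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE m (id INTEGER, name TEXT, age INTEGER);
CREATE TABLE f (id INTEGER, name TEXT, age INTEGER);
INSERT INTO m VALUES (1, 'tom', 23), (2, 'dick', 20), (3, 'harry', 30);
INSERT INTO f VALUES (1, 'mary', 23), (2, 'anne', 30), (3, 'sue', 34);
""")

# Left outer join: every row of m appears, padded with NULLs when f has no match.
left_rows = conn.execute(
    "SELECT * FROM m LEFT OUTER JOIN f ON m.age = f.age ORDER BY m.id"
).fetchall()

# Right outer join, emulated: every row of f appears, padded with NULLs for m.
right_rows = conn.execute(
    "SELECT m.*, f.* FROM f LEFT OUTER JOIN m ON m.age = f.age ORDER BY f.id"
).fetchall()

for row in left_rows + right_rows:
    print(row)
```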
Joins (6)
Self Join
A special case of the inner join. Here the table employee shows employees and their managers: Ruth manages Joe, who manages Tom, Dick and Harry.

emp_id  emp_name  mgr_id
1       Tom       4
2       Dick      4
3       Harry     4
4       Joe       5
5       Ruth      NULL

Show who manages whom by name:
    SELECT E1.emp_name AS Employee, E2.emp_name AS Manager
    FROM employee AS E1 INNER JOIN employee AS E2
    ON E1.mgr_id = E2.emp_id

Employee  Manager
Tom       Joe
Dick      Joe
Harry     Joe
Joe       Ruth
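The self join can be run as a sketch, assuming SQLite via Python's sqlite3 module. The ORDER BY clause and the LEFT OUTER JOIN variant are additions: the inner join drops Ruth because her mgr_id is NULL, while the left outer join keeps her with no manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, mgr_id INTEGER);
INSERT INTO employee VALUES (1, 'Tom', 4), (2, 'Dick', 4), (3, 'Harry', 4),
                            (4, 'Joe', 5), (5, 'Ruth', NULL);
""")

# The same table is used twice under two aliases: E1 as employee, E2 as manager.
managed = conn.execute("""
    SELECT E1.emp_name AS Employee, E2.emp_name AS Manager
    FROM employee AS E1 INNER JOIN employee AS E2
    ON E1.mgr_id = E2.emp_id
    ORDER BY E1.emp_id
""").fetchall()

# Variant: a left outer join also lists employees without a manager.
everyone = conn.execute("""
    SELECT E1.emp_name AS Employee, E2.emp_name AS Manager
    FROM employee AS E1 LEFT OUTER JOIN employee AS E2
    ON E1.mgr_id = E2.emp_id
    ORDER BY E1.emp_id
""").fetchall()

print(managed)
print(everyone)
```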