CS108 Lecture 18: Databases and SQL Databases for data storage and access The Structured Query Language Aaron Stevens 4 March 2013 What You ll Learn Today How does Facebook generate unique pages for each user? How is data stored within a computer? What is a database, and how does a database fit into an application? What are the operations we can do with data stored in a database? 1
Storing Data: Main Memory Memory is used to store programs and other data that are currently in use. LOAD, STOR operations Advantage of memory: short access times Read/write times in nanoseconds (10-9 sec) Disadvantages of memory: relatively expensive ($/byte) volatile Storing Data: Secondary Storage Secondary storage is used to store data for later use (examples: disks, CD, DVD). Data is written from memory to disk. When needed, data is read back into memory. 2
Secondary Storage Advantages of secondary storage: relatively inexpensive ($/byte) not volatile Disadvantage of secondary storage: long access times Read times in milliseconds (10-3 sec). in 10 ms, a modern CPU can perform millions of operations! Thus, it's important to minimize the number of times that the disk is accessed. Example: Facebook Profile What info is on a Facebook profile page? First Name Last Name Email Password Birthday About me Activities Favorite Books Favorite Movies Favorite Music Favorite Quotes Favorite TV shows No matter which page you view, it has these same elements These data are stored in a database table. 3
Whats a Table? Each table (sub-collection) is a collection of records, and each record contains fields. Example: a profiles table The primary key is a field which uniquely identifies one record within a table. Consider this URL: http://www.facebook.com/profile.php?id=919184 Primary Key The Primary Key: uniquely identifies a record within a table is an ideal search key is a way to create relationships between different tables Consider this URL again... http://www.facebook.com/profile.php?id=919184 4
Example: Status Updates Not all data fits neatly into the profile table. Consider status updates A separate Status table tracks status updates for all users. timestamp, userid, status Example: Status Table Each status message is related to exactly one profile by the foreign key (the field called id). These ids are called a foreign key, because they are primary keys in another table. 5
Example: Friends The Facebook friendrelationship is created by an entry in a friend table. Each record has two user ids: These ids are called a foreign keys, because they are primary keys in another table. Databases A database is: a collection of data stored in a way to enable quick access (by primary key) organized into related sub-collections called tables. Example: a mini facebook database: A profiles table has records of each user A status table has records of status messages A friends table has records friend relationships Each table (sub-collection) is organized by records, and each record contains fields. 6
DBMS A database is a collection of data (not software). A database management system (DBMS) is the software which manages a database. Functions of a DBMS: Efficient storage Providing a logical view of data (tables, records) Query processing Transaction management DBMS Efficient storage and retrieval. Indexing enables locating a record by unique attribute, called a key. Example: looking up stocks by their symbol. Logical representation - storage by record. Example: update the record for ID= 5. 7
Query Processing A query language is used to access and modify the data. SQL (Structured Query Language) is the standard for relational databases. Many different database vendors support SQL: Oracle, Sybase, IBM DB2, MS SQL Server, MS Access MySQL, SQLite (free/open-source) Transaction Integrity A transaction is an atomic sequence of operations that must complete together (or fail completely) but must not be half-done. Example: using the bank machine Check available balance, dispense cash, update balance. DBMS guarantees transaction integrity: completion or failure of a the entire sequence of steps that make up a transaction. 8
Database Applications End users rarely interact with a database directly. A database-enabled application allows the users to interact with the database without needing to know the query language. Structured Query Language You ll need to know some SQL to write databaseenabled applications. independent language -- syntax and semantics. SQL is comprised of 2 sub-languages: Data Manipulation Language (DML): SELECT, INSERT, UPDATE, DELETE Data Definition Language (DDL): CREATE TABLE, DROP TABLE, ALTER TABLE We ll focus on the DML for interacting with records. 9
Using sqlclient Program You may use the sqliteclient.py program to experiment with SQL statements: http://cs-webapps.bu.edu/cs108/util/sqlclient.py Example: Mini FB Database Lets consider just 2 tables: Notice that these tables share some data: ID is a primary key in profiles, and a foreign key in status; this enables cross-table relationships. 10
The SELECT Query General form: SELECT <field1>, <fields2>, FROM <table1>, <table2>, [WHERE <field>=<value> ] The minimal SELECT query requires only a list of fields (or * for all) and a single table. Example: SELECT * FROM profiles The SELECT Query (continued) The SELECT query can specify conditions using the WHERE clause, which creates a more refined result set (e.g. only matching records are returned). Example: 11
SQL Comparison Operators SQL WHERE clauses can support all of the usual comparison operations. Here are the SQL comparison operators. The SELECT Query (continued) Multiple WHERE criteria can be joined together using the logical operators AND, OR and NOT. Example: 12
The SELECT Query (continued) WHERE criteria can use wildcard comparisons as well, using the LIKE clause for nearmatches. Example: The INSERT Query General form: INSERT INTO <table> VALUES (<val1>, <val2>,<val3> ) The INSERT query will insert a record into the table. It requires a list values one value for each field in the record. Example: 13
The UPDATE Query General form: UPDATE <table> SET <field1>=<value1>, <field2>=<value2> [WHERE <field>=<value>] Example: Always use a WHERE clause in an UPDATE! The DELETE Query General form: DELETE FROM <table> [WHERE <field>=<value>] Example: The DELETE query is extremely dangerous. Always verify your criteria before deleting! Always use a WHERE clause in a DELETE! 14
What You Learned Today Databases persistence versus volatility. DBMS: efficiency, logical view, query language, and transaction integrity. Structured Query Language SELECT, UPDATE, INSERT, DELETE Constraints with WHERE clause Next Time: The Python DB API Python defines a standard API (objects and methods) for interaction with databases. No standard implementation of this interface. 3 rd party developers write their own libraries which conforms to the standard. We will be using 2 different DMBS in CS108: The SQLite3 DBMS comes standard with Python Free, nothing additional to install We ll move to the MySQL DBMS for web-application projects starting in 2 weeks 15
Using sqliteclient Program You may use the sqliteclient.py program to experiment with SQL statements. It is accesible here: http://cs-webapps.bu.edu/cs108/util/sqlclient.py Check your SQL statements against this client to rule out SQL syntax errors. Then implement the SQL with parameterized data in your client program. Announcements and To Do Readings: SQL Tutorial (today) http://www.firstsql.com/tutor.htm Python DBAPI and sqlite3 (for Wednesday) http://docs.python.org/library/sqlite3.html 16
Facebook Haiku Avoiding Facebook Became too much of a chore So I relented! - Scott M. Sokol Using Python with MS Access Should you want to use Python with Microsoft Access, you will need to download and install 2 components: 1- install Mark Hammond's pywin32-210 for python 2.5 http://starship.python.net/crew/mhammond/win32/downloads.html 2- download adodbapi.zip from http://adodbapi.sourceforge.net/ Unzip adodbapi.zip into C:\Python25\Lib\site-packages\ This is only available for Windows! 17