1
Why use an RDBMS? ❽ Data maintenance ❽ Standardized access ❽ Multi-user access ❽ Data protection 2
RDBMSs offer Data protection ❽ Recovery ❽ Concurrency ❽ Security 3
Data protection ❽ Recovery from ❽ User error ❽ Statement or process failure ❽ Instance failure ❽ Media failure ❽ Recovery must be: ❽ Predictable all committed transactions, no uncommitted trans ❽ Reliable ❽ Databases use optional transaction logging to allow recovery to the time just before the point of failure 4
Data protection ❽ Concurrency simultaneous access to the same data ❽ Multiple users modifying a table at the same time ❽ Read consistency ❽ Transactions are isolated from each other ❽ Locking 5
Data protection ❽ Security ❽ Users are assigned privileges ❿ Privileges on individual objects ❿ Privileges to perform specific actions ❿ Resource usage privileges ❽ Auditing ❿ Tracking the actions a user performs 6
RDBMS data storage ❽ All RDBMS need a way to obtain exclusive access to disk resources. ❽ Some systems prefer raw filesystems, others prefer them cooked. ❽ Known as tablespaces, devices, disks, chunks, 7
RDBMS organization ❽ Catalog or Data Dictionary ❽ An RDBMS addresses metadata with the same mechanisms as user data. ❽ The catalog of an RDBMS is the set of system objects required to implement itself. 8
Relational theory ❽ Relation ❽ The theoretical structure that is implemented as a table Degree (3) PIN STATUS DATE Tuples (rows) 123 124 125 126 APPROVED PENDING DENIED COMPLETED 19980410 19980729 19980510 19980215 Cardinality (4) Attributes (columns) 9
Relational theory ❽ Relation properties: ❽ Each tuple is distinct (no duplicates) ❽ Tuples are unordered (top->bottom) ❽ Attributes are unordered (left->right) ❽ All attribute values are atomic (relations do not contain repeating groups) 10
Relational theory ❽ Attribute values are taken from pools of legal values known as domains. ❽ In some cases, it is desirable to mark an attribute value as empty or missing. This can be achieved through the use of NULL values. 11
Relational theory ❽ Keys ❽ Candidate ❿ Provide tuple-level addressing system ❿ May be more than one per relation ❽ Primary ❿ One per relation (others are alternate keys) ❽ Foreign ❿ Attribute of a relation which refers to primary key in another relation 12
RDBMS organization ❽ Objects ❽ TABLE ❽ VIEW ❽ INDEX ❽ TRIGGER ❽ PROCEDURE 13
Relational algebra ❽ Relational theory is a mathematical model. It defines a number of operators and properties for the interaction of relations. ❽ One key concept is closure -- the output of any relational operation is another relation. This makes relational algebra very powerful! 14
Relational algebra ❽ Restrict ❽ Produce a tuple subset 15
Relational algebra ❽ Project ❽ Produce an attribute subset 16
Relational algebra ❽ (Cartesian) Product ❽ Produce all possible combinations a b c x y a a b b c c x y x y x y 17
Relational algebra Union Intersection Difference 18
Relational algebra ❽ (Natural) Join ❽ Produce all possible tuples which are a non-repeating combination of two tuples based on shared attribute value(s). a1 a2 a3 a4 b1 b2 b3 b3 b1 b2 b3 c1 c2 c3 a1 a2 a3 a4 b1 b2 b3 b3 c1 c2 c3 c3 19
Relational algebra ❽ Divide ❽ Given binary and unary relations, produce all values of the noncommon attribute which are present for all unary tuples. a a a b b x y z x y x z a 20
SQL ❽ Structured query language ❽ ANSI standard ❽ Broken into two components ❽ Data Definition Language (DDL) ❽ Data Manipulation Language (DML) 21
SQL ❽ To create a table use the CREATE TABLE command: CREATE TABLE table_name ( column_def_list ) CREATE TABLE PARCELS ( PARCEL_NO ASSESSED_VALUE INTEGER, FLOAT) ❽ The DROP command is used to remove objects from catalog: DROP TABLE PARCELS 22
SQL ❽ The most common SQL command is the SELECT statement. ❽ The basic format of a SELECT statement is: SELECT column_list FROM table_list {WHERE where_clause} {ORDER BY column_list} {GROUP BY column_list} 23
SQL ❽ Examples (I) SELECT * FROM PARCELS SELECT PARCEL_NO FROM PARCELS SELECT PARCEL_NO FROM PARCELS WHERE ASSESSED_VALUE > 360 AND TAX_DISTRICT = MRR 24
SQL ❽ Examples (II) SELECT b.owner_name as Owner, a.assessment as Cost FROM PARCELS a, OWNERS b WHERE ASSESSED_VALUE > 360 AND TAX_DISTRICT = MRR AND a.parcel_no = b.parcel_no 25
SQL ❽ ORDER BY ❽ Applies sort to resulting rows ❽ Example SELECT b.owner_name as Owner, a.assessment as Cost FROM PARCELS a, OWNERS b WHERE ASSESSED_VALUE > 360 AND TAX_DISTRICT = MRR AND a.parcel_no = b.parcel_no ORDER BY OWNER_NAME 26
SQL ❽ GROUP BY ❽ Used to summarize one or more rows ❽ column_list must be subset of selected columns ❽ Used in conjunction with summarization functions (SUM, MIN, MAX, ) ❽ Example SELECT a.owner_name as Owner, SUM(b.ASSESSMENT) as Total FROM OWNERS a, PARCELS b WHERE a.parcel_no = b.parcel_no GROUP BY OWNER_NAME 27
SQL ❽ Subselects ❽ The closure property allows for substitution of SELECT expressions into SQL statements ❽ Examples INSERT INTO PARCELS_BAK SELECT * FROM PARCELS SELECT PARCEL_NO FROM PARCELS WHERE ASSESSED_VALUE > (SELECT AVG(ASSESSED_VALUE) FROM PARCELS) 28
SQL ❽ The basic format for the INSERT command is: INSERT source INTO target(columns) VALUES (value_list) ❽ Example: INSERT INTO PARCELS (PARCEL_NO,ASSESSED_VALUE) VALUES (125,180) 29
SQL ❽ The basic format for the UPDATE command is: UPDATE target SET col_val_pair_list {WHERE where_clause} ❽ Example UPDATE PARCELS SET ASSESSED_VALUE = 200 WHERE PARCEL_NO = 123 30
SQL ❽ The basic format for the DELETE command is: DELETE FROM table {WHERE where_clause} ❽ Example: DELETE FROM PARCELS WHERE PARCEL_NO = 121 31
Indexing ❽ Queries represent the majority of operations performed on tables. ❽ Exhaustive searching of rows is SLOW! ❽ Indexes are used to store a sorted list of key values, allowing for faster searches. ❽ Examples ❽ CREATE INDEX PARCELS_PN ON PARCELS(PARCEL_NO) ❽ CREATE INDEX PARCELS_AV ON PARCELS(ASSESED_VALUE, PARCEL_NO) 32
Performance issues ❽ Many strategies have evolved in order to maximize RDBMS query performance: ❽ Block I/O ❿ Since locating the starting byte of a data page is the slowest part of a I/O request, accessing data in large blocks makes sense. ❽ Caching ❿ Disk access is very slow in comparison to memory access (~10,000x). ❿ A cache is a copy of most recently read pages. ❿ A blown cache penalty can occur if the cache is smaller than the frequentlyread pages. ❽ Indexing and Query optimization ❿ In a complex query, the order of the search is important ❿ Example: it is faster to first search on age, then on sex, to query for the 25 year old males ❿ Rule-based optimization; cost-based optimization; hints 33
SDE stores GIS data in an RDBMS ❽ ❽ GIS users want better data management ❽ data integrity ❽ fast access for many simultaneous users ❽ efficient use of the network ❽ common environment to manage spatial and tabular data ❽ SQL standard MIS users want spatial functionality ❽ include spatial data as a managed enterprise asset ❽ support GIS applications ❽ spatially enable applications. Example: ❿ Point-in-polygon query to determine auto insurance rate. ❿ Operator types in address, rate appears on screen. ❿ The operator never sees a map.
Traditional GIS data is not stored in RDBMS ❽ Problems coordinating transactions on data ❽ No referential integrity ❽ Different application environments Departmental Data Files Database Integrator or ODBC Enterprise Database Spatial files Tables RDBMS Tables GIS Apps. Database Apps.
Features ❽ Features are spatial objects ❽ object with a geometry attribute ❽ Vector model for geographic entities ❽ Features (rows) belong to feature classes (tables) Feature (row) Feature class (table) ROAD OID Shape RoadType 1 X,Y,Z,M,... Highway...... ❽ Feature location can be stored three ways: ❽ Binary ❽ Normalized ❽ Extended Type 36
Feature geometry Points Multipoints Lines Polygons 1 Line 1 Poly ❽ Segments have start/end with a curve in between ❽ Aggregate to paths/rings ❽ Aggregate to lines/polys ❽ Edit at any level 2 Paths Bezier curve Line Segments 3 Rings (closed paths) Circular arc 37
Feature coordinates ❽ Feature coordinates are: ❽ X - position in X Z ❽ Y - position in Y ❽ Z - position in Z (optional) X ❿ E.G.; elevation for a point or line segment end ❽ M - a measurement (optional) Y ❿ E.G.; Dynamic Segmentation measures M=30 M=35 M=35 M=42 One line made of two segments ❽ Stored as integer (scaled in Spatial Reference) 38
Storing the geodatabase ❽ Personal geodatabase ❽ Stored in an.mdb file Desktop ArcInfo client ❽ Automatically connected through JetEngine server ❽ ArcSDE geodatabase ❽ Stored in an RDBMS ❽ User connects through ArcSDE server ❽ The difference Personal geodatabase MS Access format (free) ❽ Type of RDBMS (and connection method) ArcSDE ArcSDE geodatabase Oracle SQL Server (not free) ❽ Multi-user editing and conflict resolution tools (ArcSDE) ❽ Once loaded, use same tools on either storage type 39
Geodatabase basics ❽ Stores tables, feature classes, feature datasets, more ❽ Tables ❽ A collection of attribute rows and columns ❽ Feature classes ❽ A collection of features ❽ Conceptually like a shapefile ❽ Feature datasets ❽ A collection of feature classes ❽ Conceptually like a coverage ❽ Raster datasets ❽ Rules ❽ Domains ❽ Connectivity rules feature dataset feature classes table 40
Introducing subtypes and domains ❽ Prevent illegal attribute assignment to features, tables ❽ Subtype - a subset of records within a field ❽ Domain - a definition of valid values for a field or subtype Feature class Streets PowerPoles Subtypes based on CLASS Primary Secondary Wood Steel Domain ST, RD, AV, BLVD Ln, Cir, Pl Height: 20-30 Height: 30-50 41
Subtypes in ArcMap ❽ Add, edit, symbolize by subtype ❽ New features get subtype defaults for attributes Feature class Subtypes Editor target list shows subtypes 42
Editing records with coded value domains ❽ Attribute editor only shows valid values ❽ The description is displayed instead of the code Attribute editor shows descriptions... but the underlying table stores the codes 43
Editing records that have range domains ❽ Perform edit in ArcMap ❽ Use Validate Selection to verify edit against range ❽ Invalid features remain selected LINE_SIZE has a range domain of 10-16 44
Introducing a Geometric network ❽ Feature classes in a single feature dataset ❽ A feature class can only participate in one network ❽ Connectivity based on geometric coincidence Valve Service Feature classes Geometric Network Lateral Main 45
Connectivity rules ❽ Validate using validate selection ❽ Edge - Junction ❽ Cardinality Junction B can connect to 3 A edges ❿ Number of junctions connecting to an edge ❿ Number of edges connecting to a junction ❽ Edge - Edge ❽ Default junction is automatically added during editing Edge A can connect to Edge B through junction C A C B B C A A A B A Edge A can have up to 2 B junctions attached Adding B Snap to A C is automatically created 46
Relationships ❽ An association between tables or feature classes Line2Maint Maint2Line LINE_ID LINE_MATERIAL 1 2 2 4 3 4 Line feature class LINE_ID MAINT_TYPE 1 1 1 2 1 2 1 3 1 2 2 1 3 1 Maintenance table ❽ Works with the geodatabase and coverages ❽ Tables must be in the same workspace ❽ Same data type 47
Accessing related records in ArcMap ❽ Fields in related table appear in Attributes editor Edit all related records Edit single related records ❽ Open related table ❽ Selected records in related table update when table is displayed 48
Simple and composite relationships ❽ Simple ❽ Objects exist independently ❽ Composite ❽ Destination objects cannot exist without origin objects (deleted) ❽ Destination features move with origin features Composite relationship, State to County Select state and move it the counties follow 49
Multi-user editing ❽ Many users may edit any version simultaneously ❽ You see your edits only ❽ Others see your edits when they save ❽ Version permissions ❽ Private ❿ Owner views and edits ❽ Public ❿ Everyone views and edits ❽ Protected ❿ Everyone views; owner edits Dan Molly Eric Ken Geodatabase Version: Plan 1 Version: Plan 2 Default 50
Conflicts ❽ Occur when two users edit the same feature ❽ A coordinate and attribute edit can create a conflict ❽ The second person to save will notice the conflict Molly USA: Plan 1 Dan Edit version (Dan s edit) Conflict version (Molly s edit) Pre-edit version 51
Displaying conflicts ❽ Triggered by Save or Reconcile ❽ Dan tried to save after Molly edited the same feature Conflicting feature class Conflicting feature Versions Conflicting fields 52
Summary ❽ RDBMS principles ❽ SQL ❽ Performance ❽ SDE ❽ Geodatabase 53
54