Ch. 21: Object-Oriented Databases

Learning Goals:
* Learn about the object data model
* Learn about object-oriented query languages and transactions

Topics:
* 21.1
* 21.2
* 21.3
* 21.4
* 21.5

Sources: Ch. 21, Bertino93, Kim 1989, Zdonik,...
* X3/SPARC/DBSSG/OODBTG final report, 1991.
* ODMG-93: Unified API for all OODBMS products.

Motivation

Well-known limitations of relational DBMSs
* Too simple for CAD/CAM, GIS, CASE,...
* Support a limited set of data types; no user-defined types
* Do not include generalization and aggregation
* Segmentation of objects across relations, with a performance penalty
* Impedance mismatch between the DB language and application languages
* No support for design transactions, versions, schema evolution,...

Approaches to next-generation databases
* Extended relational DBs (object-relational)
  - User-defined types, stored procedures/rules, transitive closure, generalization
  - Postgres (Illustra), Oracle v.7, Sybase 10, Starburst, new SQL
* Object-oriented DBs
  - New data model, OO query languages (e.g. integrated with C++)
  - Itasca, ObjectStore, Versant, Ontos, O2, Gemstone (Smalltalk),...

Q? Object data model = object programming model?

The basic object model is similar.

Objects and object identifiers
* An object has state (attributes) and behaviour (methods)
* OID: independent of attribute values, location, and structure

Complex values and types
* Attributes can be primitive (integer) or non-primitive (list)
* Constructors (e.g. set, tuple, list,...)

Encapsulation
* Interferes with querying; may not be enforced

Classes
* Template for a set of objects with the same attributes and methods
* Persistent vs. non-persistent classes
* Meta-classes; an object can change its class

Inheritance: a subclass inherits the attributes and methods of its superclass
* Inheritance allows a class, called the subclass, to be defined on the basis of an existing class (its superclass)
* The subclass can define additional attributes/methods or redefine inherited ones

Overloading, overriding, late binding of a method name
* Overloading: a common name for different operations inside a class or a set of classes
* Late binding: resolve the ambiguity at run-time
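
The points on inheritance, overriding, and late binding can be illustrated with a minimal Python sketch. The class names, the `oid` counter, and the `describe` method are hypothetical and chosen only for illustration; a real OODBMS generates and manages OIDs itself.

```python
import itertools

# Hypothetical OID generator: the identifier is assigned once and
# does not depend on attribute values, location, or structure.
_next_oid = itertools.count(1)


class Shape:
    def __init__(self, name):
        self.oid = next(_next_oid)
        self.name = name

    def area(self):
        return 0.0

    def describe(self):
        # self.area() is resolved at run time (late binding), so the
        # subclass version runs when self is a Rectangle.
        return f"{self.name}: area {self.area():.1f}"


class Rectangle(Shape):
    def __init__(self, name, width, height):
        super().__init__(name)
        self.width = width      # additional attributes
        self.height = height

    def area(self):             # redefines (overrides) the inherited method
        return self.width * self.height


shapes = [Shape("generic"), Rectangle("door", 2.0, 3.0)]
descriptions = [s.describe() for s in shapes]
```

Both objects answer the same `describe` call, but which `area` runs is decided by the object's class at run time, not by the declared type of the list.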

Object Data Model - Object Programming Model

Semantic extensions for data modeling

Composite objects, part-of relationship
* Additional semantics for references to other objects
* Weak (normal) vs. composite (part-of relationship)
* Exclusive/shared part (reference to sub-object)
* Dependent/independent parts: Q? Delete the part if the whole is deleted?

Associations: a link between entities in an application
* Q? How is an association different from a reference? (Bidirectional)
* Similar to a relationship in the E-R model; has a degree and cardinality
* Most OODBs do not represent associations explicitly

Integrity constraints: assertions to be satisfied by objects in a database
* Uniqueness constraint: identity vs. equality
* OODBs often support referential integrity constraints; however,
* they do not support other constraints peculiar to the object data model
  - migration of objects between classes, exclusivity constraints among classes, etc.
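
The exclusive/shared and dependent/independent distinctions for composite references can be sketched as follows. This is a toy in-memory store written for illustration; `ObjectStore`, `add_part`, and the flag names are assumptions, not the interface of any particular OODBMS.

```python
class ObjectStore:
    """Toy store with composite (part-of) references; all names hypothetical."""

    def __init__(self):
        self.objects = {}   # oid -> attribute dict
        self.parts = {}     # whole oid -> list of (part oid, dependent flag)
        self.owner = {}     # part oid -> whole oid, for exclusive parts only
        self._next = 1

    def create(self, **attrs):
        oid = self._next
        self._next += 1
        self.objects[oid] = attrs
        self.parts[oid] = []
        return oid

    def add_part(self, whole, part, dependent=True, exclusive=True):
        if exclusive:
            # An exclusive part may belong to at most one whole.
            if part in self.owner:
                raise ValueError("exclusive part already belongs to a whole")
            self.owner[part] = whole
        self.parts[whole].append((part, dependent))

    def delete(self, oid):
        for part, dependent in self.parts.pop(oid, []):
            if self.owner.get(part) == oid:
                del self.owner[part]
            # Dependent parts are deleted with the whole; independent ones survive.
            if dependent and part in self.objects:
                self.delete(part)
        self.objects.pop(oid, None)


store = ObjectStore()
car = store.create(kind="car")
engine = store.create(kind="engine")
manual = store.create(kind="manual")
store.add_part(car, engine, dependent=True, exclusive=True)
store.add_part(car, manual, dependent=False, exclusive=False)
store.delete(car)
remaining = sorted(o["kind"] for o in store.objects.values())
```

Deleting the car removes its dependent engine but leaves the independent, shared manual in the store. The sketch does not guard against a dependent part that is also shared by another whole, which a real system would have to decide how to handle.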

Object Data Model vs. Relational Model

Attribute/field values
* Multi-valued and composite attributes allowed (violates 1NF)
* Complex values via user-defined types + constructors
  - list, set, tuple
* Value types: extensible type system

Objects/classes vs. entities/instances
* Object-id (independent of value, location, and structure) ≠ primary key!
* Complex object: has references (OIDs), not foreign keys
* Models behaviour as well, via methods/stored procedures
* Encapsulation

Relationships
* Generalization/specialization (subclass1 ISA class1, inheritance)
* Classification/instantiation (instance1 IS A class1)
* Aggregation (part-of, composition hierarchy, composite object)
* Association (e.g. union, grouping of similar objects, for check-out/check-in)
* (-ve) But relationships are not explicit, unlike EER
* (-ve) References need not be bi-directional, unlike the relational model

Object Data Model vs. Relational Model (Contd.)

Integrity constraints (model-inherent)
* Uniqueness constraint: 2 meanings (identical vs. equal)
* Lack of automatic enforcement (e.g. of ICs inherent to the object model)
* However, often complemented with a flexible rule system
* Referential IC is hard-coded in methods

Management of changes over time
* Object versions
* Schema evolution (e.g. Orion, Itasca)
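
The two meanings of uniqueness (identical vs. equal) map directly onto Python's `is` and `==`. The `Person` class and its values below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # @dataclass generates __eq__, which compares attribute values.
    name: str
    city: str


a = Person("Ann", "Paris")
b = Person("Ann", "Paris")   # a distinct object with the same state
c = a                        # a second reference to the same object

identical = a is c                             # same object identity
equal_not_identical = (a == b) and (a is not b)
```

A uniqueness constraint based on identity would accept both `a` and `b` in the same collection, while one based on equality would reject `b` as a duplicate.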

Physical Data Storage

Data Manager (DM): relational vs. OO storage

Q? How would you implement the following on a relational DM?
* Complex values, methods, OIDs, object references, complex objects, inheritance

Relational DM (e.g. Postgres, Itasca)
* Convert complex values to atomic ones (conversion procedures)
* Methods = procedure-valued fields (stored procedures)
* OID = primary key; object reference = foreign key
* Complex object: decompose into multiple tuples
  - Join the relevant relations to compose an object
* Ex. a car = 4 tires + 4 doors + 1 engine +...
* Inheritance = how many tables for a class hierarchy?

Object DM
* One tuple per object
* Q? How to generate/manage object-ids and object references?
* Object index: OID -> location of object (usually hashed)
* Index: nested attribute/path -> OID; clustering
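
As a rough illustration of the relational approach, the car example can be decomposed into two tables and recomposed with a join. The sketch uses Python's standard `sqlite3` module purely as a stand-in relational data manager; the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- OID = primary key; object reference = foreign key
    CREATE TABLE car  (oid INTEGER PRIMARY KEY, model TEXT);
    CREATE TABLE part (oid INTEGER PRIMARY KEY, kind TEXT,
                       car_oid INTEGER REFERENCES car(oid));
""")

# One complex object is spread over 1 + 9 tuples.
conn.execute("INSERT INTO car VALUES (1, 'sedan')")
kinds = ["tire"] * 4 + ["door"] * 4 + ["engine"]
conn.executemany("INSERT INTO part (kind, car_oid) VALUES (?, 1)",
                 [(k,) for k in kinds])

# Composing the object again requires a join over the relevant relations.
rows = conn.execute("""
    SELECT car.model, part.kind, COUNT(*)
    FROM car JOIN part ON part.car_oid = car.oid
    WHERE car.oid = 1
    GROUP BY car.model, part.kind
    ORDER BY part.kind
""").fetchall()

car_object = {"model": rows[0][0],
              "parts": {kind: n for _, kind, n in rows}}
```

The join is the cost the slide alludes to: an object DM that stores one tuple per object could fetch the car through a single OID lookup instead.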

Query Languages / Query Optimization in OODB

Query language
* Seamless integration with programming languages
  - Ex. C++/Lisp API, ODMG standard
* Path expressions: astudent.department.manager.name
* User-defined types/methods:
  - Ex. distance(astudent.address, adepartment.location) < 5 miles
* Transitive closure: select* (Postgres)
* Query across a class hierarchy: select-any* (Postgres)

Query optimization issues
* User-defined methods are hard to optimize
* Class-hierarchy-based queries: need new strategies
* Multi-valued attributes: need new cost models
* Dual buffering (objects in the buffer and on disk are processed separately)
  - Separate access paths for objects in the buffer / on disk
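
Path expressions and class-hierarchy queries can be mimicked in a few lines of Python. This is only an analogy over in-memory objects, not Postgres or ODMG syntax; the classes, the `follow` helper, and the sample data are hypothetical.

```python
from functools import reduce


class Person:
    def __init__(self, name):
        self.name = name


class Department:
    def __init__(self, name, manager):
        self.name = name
        self.manager = manager


class Student(Person):
    def __init__(self, name, department):
        super().__init__(name)
        self.department = department


def follow(obj, path):
    """Evaluate a path expression such as 'department.manager.name'."""
    return reduce(getattr, path.split("."), obj)


boss = Person("Lee")
cs = Department("CS", boss)
astudent = Student("Ann", cs)
db = [boss, astudent]

manager_name = follow(astudent, "department.manager.name")

# Query on one class only vs. across the class hierarchy rooted at Person.
persons_only = [o.name for o in db if type(o) is Person]
persons_and_subclasses = [o.name for o in db if isinstance(o, Person)]
```

Each step of the path follows an object reference rather than joining on a key. The optimizer issue noted on the slide is visible even here: the hierarchy query has to consider the instances of every subclass, not a single extent.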

Transactions, Locking, Recovery

Classical transactions
* Lock granularity: intention lock on a large complex object
  - Real locks on smaller parts
* Transactions share objects in memory buffers
  - Database guarantees consistency of objects in the object buffers
  - Recovery needs to handle object buffers

Design transactions
* Ex. a programmer fixes a bug (change 3 modules, test, integrate)
* Long-lived, many reads and writes
* Concurrency control
  - Check-in/check-out, versions (working, transient, released)

Other issues
* Versions
* Schema evolution
* Rule systems
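
A check-out/check-in protocol for long-lived design transactions might look roughly like the sketch below. It models only working and released versions (transient versions are left out for brevity), and the `DesignObject` class and its method names are assumptions made for illustration.

```python
class DesignObject:
    """Hypothetical versioned object for long-lived design transactions."""

    def __init__(self, content):
        self.released = [content]     # history of released versions
        self.working = None           # private working version, if any
        self.checked_out_by = None

    def check_out(self, user):
        if self.checked_out_by is not None:
            raise RuntimeError(f"already checked out by {self.checked_out_by}")
        self.checked_out_by = user
        self.working = self.released[-1]   # start from the latest release
        return self.working

    def update(self, user, content):
        if self.checked_out_by != user:
            raise PermissionError("check out the object first")
        self.working = content

    def check_in(self, user):
        if self.checked_out_by != user:
            raise PermissionError("not checked out by this user")
        self.released.append(self.working)  # working version becomes released
        self.working = None
        self.checked_out_by = None


module = DesignObject("v1 (has bug)")
module.check_out("ann")
module.update("ann", "v2 (bug fixed)")
visible_during_edit = module.released[-1]   # other readers still see v1
module.check_in("ann")
visible_after_check_in = module.released[-1]
```

Unlike a classical lock held for a short transaction, the check-out can last for days while other users keep reading the last released version.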