Teaching Scheme
Business Information Technology/Software Engineering Management
Advanced Databases
Level: 4
Year: 200 2002

Jim Craven (jcraven@bournemouth.ac.uk)
Stephen McKearney (smckearn@bournemouth.ac.uk)
Melanie Coles (mcoles@bournemouth.ac.uk)

Autumn Term

30/9 Induction Week

7/0 Project Week (No Lecture) (JC)
This lecture is cancelled for project week.

4/0 Database Performance Issues (JC)
This lecture introduces the theme of optimising the performance of a relational database system.
When do performance issues arise in the database development process? How should performance issues be approached? What areas should be considered first when tuning a database? What are the constraints?
Reading: Ensor (Chapter 2), Corrigan

2/0 Service Time (SMcK)
This lecture discusses the techniques used to read and write data on a disc.
How is data read from or written to a disc? What are the main parts of the read/write process? How much time does it take to read information from a disc? What factors affect the speed of a disc? How may disc performance be improved? How can the speed of a disc be calculated?
Reading: Elmasri (Appendix B)

28/0 B+-Tree Indexing (SMcK)
This lecture discusses a popular type of index structure.
Why is an index necessary? What is a multilevel index? What is a B+-Tree index? How is an index searched? How is data inserted into or deleted from a B+-Tree index? How big does a B+-Tree get? What are the important properties of a B+-Tree?
Reading: Elmasri (Section 5.3), Silberschatz (Section 8.3), Knuth (Pages 473-479), Connolly (Appendix C.5.5)

Oracle - environment structure and documentation
This class will look at the structure of Oracle and how an Oracle database stores data. We will also look at the documentation provided with Oracle.

The Data Dictionary
We will look at the structure of a data dictionary and investigate some of the uses that may be made of the information contained in the dictionary.

Lecture review - Service time
This seminar will review some of the lecture examples and work through some exercises. We will also study some aspects of Oracle that are affected by the service time, and methods of measuring database performance.

Oracle - Controlling storage parameters
This class will investigate Oracle's storage structures and methods of adjusting the storage parameters in Oracle. One of the purposes of this class is to help you prepare for the assignment.
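
As an illustration for the service-time review exercises, the sketch below estimates the time to read one block as seek time plus average rotational latency plus transfer time. It is a simplified textbook-style model rather than a formula taken from the lecture notes; the function name and the sample figures (8 ms seek, 7200 rpm, 20 MB/s, 8 KB block) are assumptions chosen only for the example.

```python
def service_time_ms(seek_ms, rpm, transfer_mb_per_s, block_kb):
    """Estimate the time (ms) to read one block from disc.

    Simplified model: seek + average rotational latency + transfer.
    All parameter names are illustrative assumptions.
    """
    rotation_ms = 60_000.0 / rpm       # time for one full revolution
    latency_ms = rotation_ms / 2.0     # on average, wait half a revolution
    transfer_ms = (block_kb / 1024.0) / transfer_mb_per_s * 1000.0
    return seek_ms + latency_ms + transfer_ms


# Hypothetical disc: 8 ms average seek, 7200 rpm, 20 MB/s, 8 KB blocks.
estimate = service_time_ms(seek_ms=8.0, rpm=7200,
                           transfer_mb_per_s=20.0, block_kb=8)
print(f"{estimate:.2f} ms per random block read")
```

With figures like these, seek and rotational latency dominate the total, which is one reason sequential access and larger blocks are usually among the first options considered when improving disc performance.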

4/ B+-Tree Indexing 2 (SMcK)
This is the second part of the lecture B+-Tree Indexing.

/ Query processing (JC)
This lecture considers alternative strategies for joining relational tables, and the role and nature of the optimiser.
What are inner and outer joins? What is the basic operation for table joins? In a multi-table join, can the order of the joins speed up the processing? What other features or processing can speed up the join operation? What constraints need to be considered? Why is selectivity an important consideration?
Reading: Elmasri (Chapter 6), Silberschatz (Chapter 2), Corrigan (Chapter 6), Connolly (Chapter 20)

8/ Query processing 2 (JC)
This is the second lecture on join strategies and considers particular strategies and their associated costs.
What are the processing differences between simple, block and indexed nested loop joins? How do sort-merge and hash joins work? What are their relative advantages?
Reading: Elmasri (Chapter 6), Silberschatz (Chapter 2), Corrigan (Chapter 6), Connolly (Chapter 20)

25/ Query processing 3 (JC)
This completes the lectures on query processing. It covers the effects of selectivity on the choice of strategy and introduces the assignment.
How can a cost function be constructed?
Reading: Jain (Chapters 3)

2/2 Clustering (SMcK)
This lecture describes clustering and how it is used to build databases.
What is clustering? Why is clustering necessary? What types of clustering are possible? What are the advantages and disadvantages of clustering? What applications require clustering? How does Oracle cluster data? How does clustering compare to B+-Trees?
Reading: Oracle Concepts (Pages 5-23 to 5-27)

9/2 Denormalisation (SMcK)
This lecture describes a process called denormalisation that is used to improve the performance of databases.
What is denormalisation? Why is denormalisation necessary? What alternative methods could be used? What is the difference between normalised, unnormalised and denormalised? How is a database denormalised?
Reading: Corrigan (Chapter 5, page 69)

B+-tree Indexing
This class covers the B+-tree structure introduced in the B+-Tree Indexing lecture. We will investigate methods of modelling and measuring B+-tree performance.

Loading data in Oracle - SQL*Loader
We will review techniques for generating and loading data into databases. This class is intended to help you prepare for the assignment.

Lecture review - Query processing
This class will review examples based on the lecture on query processing.

Lecture review - Query processing
This class will review examples based on the lecture on query processing.

Oracle - Timing Queries
This class looks at different techniques for measuring the performance of database queries.

Clustering and Denormalisation lectures.
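
For the B+-tree modelling class listed above, one simple way to reason about "how big does a B+-Tree get?" is to estimate the number of levels from the number of keys and the node fan-out. The sketch below is a rough model under stated assumptions: leaf and internal nodes hold the same number of entries, and nodes are filled to a fixed percentage. The function name and the parameters `fanout` and `fill_percent` are hypothetical and do not come from the course material.

```python
import math


def bplus_tree_levels(num_keys, fanout, fill_percent=69):
    """Estimate how many levels a B+-tree needs for num_keys entries.

    Assumptions (illustrative only): every node, leaf or internal,
    holds `fanout` entries when full and is `fill_percent` % full.
    """
    per_node = max(2, fanout * fill_percent // 100)  # entries actually used
    nodes = math.ceil(num_keys / per_node)           # number of leaf nodes
    levels = 1
    while nodes > 1:                                 # add index levels up to the root
        nodes = math.ceil(nodes / per_node)
        levels += 1
    return levels


# One million keys, 100 entries per node.
print(bplus_tree_levels(1_000_000, 100, fill_percent=100))
print(bplus_tree_levels(1_000_000, 100))
```

Each level corresponds to roughly one block read when the index is searched from the root, so the estimate links directly to the service-time calculations from earlier in the term.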

Spring Term

6/ Distributed Databases (JC)
This lecture describes the concept and the features of distributed database systems, including the schema architecture for a DDBMS.
What are the needs for a distributed database environment? What are the features of a DDBMS? What models and platforms support distributed databases? What are the alternatives to a distributed database? What are the advantages and disadvantages of distributed databases?
Reading: Bell (Chapters, 2 and 3), Date (Chapter 20), Ozsu (Chapter and 4), Elmasri (Chapter 24), Connolly (Chapter 22 and 23)

3/ Distributed Databases 2 (JC)
This lecture considers the design and implementation issues for distributed databases.
What determines where the data is stored? When should replication and fragmentation be used? What are the various data storage models, and their relative strengths and weaknesses? How can concurrency be controlled? How can the integrity of the database be assured? What other management issues should be considered in a DDBMS environment?
Reading: Bell (Chapters 4, 6, 7 and 8), Silberschatz (Chapter 8), Ozsu (Chapter 5 and ), Connolly (Chapter 22 and 23)

20/ Distributed Databases 3 (JC)
This lecture appraises alternative strategies for query processing in a distributed database environment.
Why is query optimisation particularly important in a DDB environment? What are the alternative ways of joining distributed tables? How does a semi-join work?
Reading: Bell (Chapter 5), Silberschatz (Chapter 8), Ozsu (Chapter 7, 8 and 9), Connolly (Chapter 22 and 23)

27/ Extendible Hashing (SMcK)
This lecture describes a new type of hashing.
What is static hashing? What are the problems with static hashing? How can extendible hashing solve these problems? What are the main components of an extendible hashing index? How is an extendible hashing index searched? How is data inserted into or deleted from an extendible hashing index? What are the advantages and disadvantages of an extendible hashing index?
Reading: Silberschatz (Sections 8.5-8.7), Elmasri (Section 4.8)

3/2 Extendible Hashing 2 (SMcK)
This is the second part of the lecture Extendible Hashing.

Oracle - Investigating query execution
This class shall investigate techniques for analysing query execution and performance. We will use the tools available in Oracle to illustrate each technique.

Distributed Databases

Distributed Query Processing

Extendible Hashing
This class shall review the Extendible Hashing index that was introduced in the Extendible Hashing lecture. We shall compare the performance of the Extendible Hashing index against the B+-tree.

Extendible Hashing 2
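
To accompany the Extendible Hashing review classes, the following is a minimal in-memory sketch of the main components of an extendible hashing index: a directory indexed by the low-order bits of the hash value, buckets with a local depth, bucket splitting, and directory doubling. It is an illustration under assumptions, not the structure used in the lectures or in Oracle; the class names, the bucket capacity of 2 and the use of Python's built-in `hash` are all choices made for the example, and deletion is omitted.

```python
class Bucket:
    def __init__(self, depth, capacity):
        self.depth = depth        # local depth: hash bits this bucket depends on
        self.capacity = capacity  # maximum number of entries before a split
        self.items = {}


class ExtendibleHash:
    def __init__(self, bucket_capacity=2):
        self.capacity = bucket_capacity
        self.global_depth = 1     # hash bits used to index the directory
        self.directory = [Bucket(1, bucket_capacity), Bucket(1, bucket_capacity)]

    def _index(self, key):
        # Directory slot = low-order global_depth bits of the hash value.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.directory[self._index(key)].items.get(key)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)   # bucket full: split it and try again

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # No spare directory bit left: double the directory.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth, self.capacity)
        new_bit = 1 << (bucket.depth - 1)
        # Slots whose newly significant bit is 1 now point at the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = sibling
        # Redistribute the old entries between the two buckets.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._index(k)].items[k] = v


index = ExtendibleHash(bucket_capacity=2)
for k in range(20):
    index.put(k, k * k)
print(index.get(7), index.global_depth, len(index.directory))
```

A lookup touches one directory slot and one bucket regardless of how many keys are stored, which is the property to contrast with the multi-level search of a B+-tree when comparing the two structures.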

0/2 Comparison - B+-Trees, Hashing, Clustering, Denormalisation (SMcK)
This lecture presents a comparison of different methods of improving the performance of a database. Scenarios are presented that demonstrate which techniques are best suited to which problems.
What are the advantages and disadvantages of each technique? How is a particular method selected?

7/2 Project Week

24/2 Extended Storage Hierarchy (JC)
This lecture compares and contrasts the various physical storage media available for supporting database operations.
What is the extended storage hierarchy? What are the characteristics of the various media? What features and constraints affect performance? Can performance be improved?
Reading: Silberschatz (Chapters 0 and 7), Elmasri (Chapter 5)

3/3 DBMS Resources - Parallelism (JC)
This lecture considers parallel processing arrangements, including the cost of parallel processing, I/O parallelism and RAID storage.
What are the main parallel database architectures? What are their relative advantages and disadvantages? What parallel features are supported by a typical DBMS (e.g. Oracle)? When is their use appropriate?
Reading: Silberschatz (Chapters 6 and 7), Ozsu (Chapter 3)

0/3 Multi-Attribute Indexing (SMcK)
This lecture studies multi-attribute index structures.
What is a multi-attribute index? What is the difference between a multi-attribute index and a single-attribute index such as a B+-Tree? What types of query can a multi-attribute index be used to answer? What problems exist when creating a multi-attribute index? What types of multi-attribute index exist?
Reading: Silberschatz (Section 8.9)

7/3 Schema Integration (SMcK)
This lecture will look at the process of merging two or more databases to create a single database.
What is schema integration? Why is schema integration necessary? What problems exist when merging databases? What strategies may be used to merge schemas? What is the process for merging schemas?
Reading: Batini (Chapter 5), Elmasri (Pages 456-460)

24/3 Revision Lecture (SMcK)
Revision

Resource scheduling
CPU time allocation, disk and buffer management. Oracle internals.

3/3 Revision Lecture (JC)
Revision

Summer Term

28/4 Revision Lecture (SMcK)
Revision

5/5

Reading

Core
Elmasri and Navathe, Fundamentals of Database Design, McGraw-Hill, 1999.
Silberschatz, Abraham; Korth, Henry F. and Sudarshan, S., Database System Concepts, McGraw-Hill, 1998.

Recommended
Ramakrishnan, R., Database Management Systems, McGraw-Hill, 1998.
Connolly, Thomas and Begg, Database Systems: A Practical Approach to Design, Implementation and Management, Addison-Wesley, 1998.
Date, Chris J., Introduction to Database Systems, Longman Higher Education, 1999.
Gurry, Mark and Corrigan, Peter, Oracle Performance Tuning, O'Reilly UK, 1997.
Ensor, Dave and Stevenson, Ian, Oracle Design, O'Reilly UK, 1997.
Gray, Jim, The Benchmark Handbook, http://www.benchmarkresources.com/handbook/introduction.asp.
Harrison, Guy, Oracle SQL High Performance Tuning, Prentice Hall, 1997.
Jain, Raj, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, John Wiley and Sons, 1991.
Bell, David and Grimson, Jane, Distributed Database Systems, Addison-Wesley, 1992.
Ozsu, M. Tamer and Valduriez, Patrick, Principles of Distributed Database Systems, Prentice Hall, 1997.
Rodgers, Ulka, Denormalisation - Why, What and How?, Database Programming and Design, December 1989.
Batini, C.; Lenzerini, M. and Navathe, S. B., A Comparative Analysis of Methodologies for Database Schema Integration, ACM Computing Surveys, Vol. 18, No. 4, December 1986.

Learning Outcomes (L/O)
1. Select from a range of extensible and non-extensible databases for a business requirement, as might be demonstrated by contrasting how different database management system features are implemented.
2. Execute and evaluate performance benchmarking as the basis of selection between commercial products, as might be demonstrated by evaluating the performance of a database management system.
3. Act as a DBA for a distributed database, as might be demonstrated by proposing the configuration of a distributed database system.
4. Select from a range of software client-server components that will integrate to form a balanced system, as might be demonstrated by proposing a system configuration for a typical database system.