The Architectural Logic of Database Systems


E. J. Yannakoudakis
The Architectural Logic of Database Systems
With 69 Figures
Springer-Verlag London Berlin Heidelberg New York Paris Tokyo

E. J. Yannakoudakis, BSc, PhD, CEng, FBCS
Postgraduate School of Computer Sciences, University of Bradford, Bradford, West Yorkshire BD7 1DP, UK

ISBN-13: 978-3-540-19513-9
DOI: 10.1007/978-1-4471-1616-5
e-ISBN-13: 978-1-4471-1616-5

British Library Cataloguing in Publication Data
Yannakoudakis, E. J., 1950-
Architectural logic of database systems
1. Machine-readable files. Design
I. Title
005.74

Library of Congress Cataloging-in-Publication Data
Yannakoudakis, E. J., 1950-
The architectural logic of database systems.
Includes bibliographies and index.
1. Data base management. 2. Computer architecture. I. Title.
QA76.9.D3Y36 1988 005.74 88-3248

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. Duplication of this publication or parts thereof is only permitted under the provisions of the German Copyright Law of September 9, 1965, in its version of June 24, 1985, and a copyright fee must always be paid. Violations fall under the prosecution act of the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1988

The use of registered names, trademarks etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

Filmset by Saxon Printing Limited, Saxon House, Derby
Printed by Page Bros (Norwich) Limited, Mile Cross Lane, Norwich
2128/3916-543210

To Eve, John, Irene and Helen who involuntarily allowed me to finish this book.

Preface

If we look back to pre-database systems and the data units which were in use, we will establish a hierarchy starting with the concept of 'field' used to build 'records' which were in turn used to build higher data units such as 'files'. The file was considered to be the ultimate data unit of information processing and data binding 'monolith'. Moreover, pre-database systems were designed with one or more programming languages in mind and this in effect restricted independent development and modelling of the applications and associated storage structures.

Database systems came along not to turn the above three units into outmoded concepts, but rather to extend them further by establishing a higher logical unit for data description and thereby offer high level data manipulation functions. It also becomes possible for computer professionals and other users to view all information processing needs of an organisation through an integrated, disciplined and methodical approach. So, database systems employ the concepts field, record and file without necessarily making them transparent to the user, who is in effect offered a high level language to define data units and relationships, and another language to manipulate these. A major objective of database systems is to allow logical manipulations to be carried out independent of storage manipulations and vice versa.

A rather accurate parallel between database systems and high level languages such as FORTRAN, COBOL and Pascal can be drawn here by stating that database systems form a natural progressive step from file systems in the way that high level languages form a natural progressive step from assembly or low level languages. The Data Base Management System (DBMS) is the software necessary to set up, manipulate and maintain the object database, that is, the data of the organisation including appropriate control information.

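The objective of carrying out logical manipulations independent of storage manipulations can be illustrated with a small sketch. It is written in Python purely for illustration and is not taken from the book; the class names SequentialStore and HashedStore, the function employee_name and the sample record are all hypothetical. The same logical request is answered by two different storage layouts without the request itself changing.

```python
class SequentialStore:
    """Hypothetical storage layout: records kept as an ordered list of pairs."""

    def __init__(self):
        self._rows = []  # list of (key, record) pairs

    def put(self, key, record):
        # Drop any earlier record with the same key, then append the new one.
        self._rows = [(k, r) for k, r in self._rows if k != key]
        self._rows.append((key, record))

    def get(self, key):
        # Linear scan, as with a sequential file.
        for k, r in self._rows:
            if k == key:
                return r
        return None


class HashedStore:
    """Hypothetical storage layout: records kept in a hash table."""

    def __init__(self):
        self._rows = {}

    def put(self, key, record):
        self._rows[key] = record

    def get(self, key):
        return self._rows.get(key)


def employee_name(store, emp_no):
    """Logical manipulation: written once, unaware of the storage layout."""
    record = store.get(emp_no)
    return record["name"] if record else None


for store in (SequentialStore(), HashedStore()):
    store.put(17, {"name": "Smith", "dept": "Sales"})
    print(type(store).__name__, employee_name(store, 17))
```

Swapping one store for the other changes how the record is found but not what employee_name asks for, which is the sense of independence the paragraph above describes.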

Since the establishment of this higher concept and its acceptance by the computer community as the next step towards an even more advanced information processing environment, the market has been stocked with a plethora of books on the subject, its implications and application environments. However, few books are available for the person who has elementary knowledge of programming and who wishes to have a general introduction to database principles and at the same time acquire knowledge of current database management system software and the various levels at which it is utilised, independent of any vendor-related software. The present book tackles this and also discusses the database environment under the following major areas:

(a) The logic behind database systems
(b) The architecture of database systems and related software
(c) How an entire organisation can be viewed with the aid of appropriate database software
(d) How data can be defined and manipulated using database languages as well as natural language
(e) Models which can describe an organisation accurately
(f) Database design methodologies and techniques to bind record types together
(g) Potential administrative and technical tasks to be performed

A recent development has been the automation of database design following analyses of the different 'views' (applications) users may have of the same centralised data. The technique used is termed 'canonical database synthesis', where all user views are merged into a single unit which reflects the inherent structure of organisational data. The ultimate objective is to aid the design of the 'logical database structure' which:

(a) Is free from duplication
(b) Is optimised to reflect the organisation accurately
(c) Does not depend on specific vendor-software
(d) Can satisfy new applications without major restructuring

Canonical synthesis is presented here in a step-wise fashion with examples that illustrate the merging, analysis and grouping of user views to form closely related clusters of data elements. An algorithm for canonical synthesis has been implemented in Pascal and this is used to analyse a complete hospital database environment for a Regional Health Authority.

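The idea of merging user views and grouping their data elements can be sketched as follows. This is an illustrative Python sketch only, not the Pascal program mentioned above and not the full canonical synthesis algorithm. It assumes, for simplicity, that each view is a set of (key, attribute) pairs meaning "key identifies attribute"; the view names and data element names are invented.

```python
from collections import defaultdict

# Hypothetical user views, each a set of (key, attribute) associations.
ward_view = {("patient_no", "patient_name"), ("patient_no", "ward_no")}
billing_view = {("patient_no", "patient_name"), ("patient_no", "invoice_total")}
staffing_view = {("ward_no", "ward_sister")}


def merge_views(*views):
    """Union the views so that each association is recorded only once."""
    merged = set()
    for view in views:
        merged |= view
    return merged


def group_by_key(associations):
    """Cluster attributes under the key that identifies them."""
    groups = defaultdict(set)
    for key, attribute in associations:
        groups[key].add(attribute)
    return {key: sorted(attrs) for key, attrs in groups.items()}


schema = group_by_key(merge_views(ward_view, billing_view, staffing_view))
for key in sorted(schema):
    print(key, "->", schema[key])
```

The association between patient_no and patient_name appears in two views but only once in the merged result, which is the 'free from duplication' property in miniature. Steps such as the handling of many-to-many associations and the removal of redundant associations are deliberately left out of this sketch.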
Although the software we developed is not included in the book, the Appendix contains example reports it produces for the hospital database, starting with the input of user views followed by their processing and finally the design of the complete logical schema.

The American National Standards Institute (ANSI) and the International Standards Organisation (ISO) have adopted a new standard Relational Database Language (RDL) and a Network Database Language (NDL). Both RDL and NDL are presented here in their current form of development with our own extensions where appropriate, particularly for the definition of the storage structures. The book contains the syntax of the most important RDL and NDL commands for data definition and data manipulation functions. They are illustrated with simple examples that show the input, the statements the user actually types, and the end result.

The ultimate objective of this book is to demystify database concepts and methodologies and at the same time explain, in as simple a manner as possible, three important approaches to defining relationships among attributes: the 'hierarchical', 'network' and 'relational'.

The emphasis is on the relational and network approaches to database management because they appear to be suitable for most data processing applications. Besides, we have not seen nor are we likely to see an international standard for the hierarchical approach to database design.

The material includes ample examples and realistic attributes and relationships among these. It is presented in a rather laconic and synoptic fashion by avoiding unnecessarily long introductions to the various concepts and by making direct, factual and precise statements on 'what it is', 'how it works' and 'how it can be applied'. The material can in fact be split into four parts:

Part I: The database environment and underlying data models (Chapters 1, 2 and 3).
Part II: The architecture of database software and man-machine communication (Chapters 4 and 5).
Part III: Database design methodology (Chapters 6 and 7).
Part IV: The relational and network database architectures (Chapters 8 and 9).

The concepts and method of presentation are completely independent of hardware and commercial software packages. The person who masters the material presented here will be in a strong position to judge and evaluate any type of database software, regardless of whether this is offered on a micro, mini or large mainframe. Moreover, the ISO database language discussed here will provide a yardstick for comparative assessment for some years to come. The book will be useful to all people who wish to acquire a working, sound and up to date knowledge on the subject, its terminology and method of application.
Since it does not require any a priori knowledge of database systems but only simple programming principles, it is recommended to students taking A-level courses in computer studies, University undergraduates, postgraduates who may wish to use the DBMS as a tool for data manipulation, computer programmers who are about to commence programming under a DBMS, systems analysts who may wish to assess the feasibility of introducing a DBMS into their organisation, and database administrators who wish to acquire sufficient and integrated knowledge on logical database architecture, associated structures and software modules, and technical tasks behind setting up and maintaining a database. Finally, data processing managers will find the material useful, particularly the terminological dictionary; most importantly, though, they will be able to identify and establish appropriate administrative posts for the effective maintenance of a database.

Acknowledgements

I would like to thank Chris Stoker (University of Bradford) for the assistance he has given me in the production of the canonical synthesis reports for the hospital database presented in the Appendix. I would like particularly to thank Chi Pui Cheng (Hong Kong Polytechnic) for his invaluable comments in shaping up the chapter on RDL and NDL.

January 1988
E.J.Y.

Contents

1 Foundations of Databases
  1.1 Data and Information
  1.2 Program and File Communication
  1.3 Program and Meta-file Communication
  1.4 Towards a Database System
  1.5 High Level Database Software
  1.6 Summary
  1.7 References

2 The Logic of the Database Environment
  2.1 The Principle of Data Independence
    2.1.1 Physical Independence
    2.1.2 Logical Independence
  2.2 Standard Software and the Database
  2.3 Three Architectural Levels
    2.3.1 Logical Schema
    2.3.2 Logical Subschema
    2.3.3 Internal Schema
  2.4 Types of Users
    2.4.1 Database Administrator (DBA)
    2.4.2 System Software Engineer (SSE)
    2.4.3 Applications Analyst
    2.4.4 Applications Programmer
    2.4.5 General User
  2.5 Summary
  2.6 References

3 Data Structures and Data Models
  3.1 Introduction
  3.2 Data Structures and Relationships
    3.2.1 Data Structures on Keys
    3.2.2 Tree Structures
  3.3 Hierarchic Data Models
  3.4 Network Data Models
  3.5 Relational Data Models
    3.5.1 Relational Terminology
    3.5.2 Basic Characteristics of Relational Models
  3.6 An Example Schema Model
  3.7 An Example Subschema Model
  3.8 Summary
  3.9 References

4 The Architecture of Database Software
  4.1 Introduction
    4.1.1 Data Types and Qualifiers
  4.2 Data Description Language (DDL)
    4.2.1 RDL Commands
  4.3 Data Manipulation Language (DML)
    4.3.1 RDL Commands
  4.4 Data Storage Description Language (DSDL)
    4.4.1 RDL Commands
  4.5 Query Language
    4.5.1 RDL Commands
  4.6 Query By Example (QBE)
    4.6.1 Example Forms of QBE
  4.7 Data Dictionary
    4.7.1 Aims and Objectives
    4.7.2 The Data Dictionary and the Database
  4.8 An Overview of Software Integration
  4.9 Summary
  4.10 References

5 Communicating with Databases in Natural Language
  5.1 Programming Languages
  5.2 The PROLOG Programming Language
  5.3 Natural Language System Architecture
    5.3.1 The Language PROLOG and the Database
    5.3.2 Conclusions and Further Research
  5.4 Communicating with Databases by Voice
  5.5 Speech Synthesis
  5.6 Speech Recognition
  5.7 An Integrated View of Man-Machine Interfaces
  5.8 Summary
  5.9 References

6 Database Design Methodology
  6.1 Introduction
  6.2 Top-Down and Bottom-Up Design
  6.3 Major Stages in Database Design
  6.4 Six Mappings
  6.5 Decomposition and Normalisation
  6.6 Relationships between Attributes
  6.7 Key Attributes
  6.8 The Five Normal Forms
    6.8.1 First Normal Form (1NF)
    6.8.2 Second Normal Form (2NF)
    6.8.3 Third Normal Form (3NF)
    6.8.4 Fourth Normal Form (4NF)
    6.8.5 Fifth Normal Form (5NF)
    6.8.6 Conclusions
  6.9 Summary
  6.10 References

7 Canonical Synthesis for Database Design
  7.1 Introduction
  7.2 Element Associations
    7.2.1 Keys, Attributes, and Data Element Groups
  7.3 Readying the Views for Canonical Synthesis
    7.3.1 Inconsistent Associations
    7.3.2 Illegal Associations
    7.3.3 Normalisation of the Views
      7.3.3.1 First Normal Form (1NF)
      7.3.3.2 Second Normal Form (2NF)
      7.3.3.3 Third Normal Form (3NF)
    7.3.4 Deviation from Third Normal Form
  7.4 The Canonical Synthesis Algorithm
    7.4.1 Merging of the Views
    7.4.2 Keys and Attributes
    7.4.3 Concatenated Keys
      7.4.3.1 Problems with Concatenated Attributes
    7.4.4 Dealing with M:M Associations
    7.4.5 Dealing with 1:1 Associations
    7.4.6 Removal of Redundancies
      7.4.6.1 Programming Considerations
    7.4.7 Isolated and Intersecting Attributes
  7.5 Further Investigations on Canonical Synthesis
  7.6 Summary
  7.7 References

8 Relational Architecture
  8.1 Introduction
  8.2 Domains and Attributes
  8.3 Manipulation of Relational Tables
    8.3.1 Union
    8.3.2 Intersection
    8.3.3 Difference
    8.3.4 Selection
    8.3.5 Projection
    8.3.6 Join
    8.3.7 Division
  8.4 Subschema Definitions
  8.5 Representing Trees
    8.5.1 From Table to Tree
  8.6 Representing Networks
  8.7 Summary
  8.8 References

9 A Network Database Language
  9.1 Introduction
  9.2 Logical Relationships and Set Types
  9.3 Structural Relationships
    9.3.1 Order of Record Occurrences
    9.3.2 Membership of Record Occurrences
  9.4 Network Database Language
    9.4.1 Schema Definition
    9.4.2 Subschema Definition
    9.4.3 Data Manipulation
  9.5 Summary
  9.6 References

10 Dictionary of Database Terminology

11 Appendix. Example Reports from Canonical Synthesis

12 Acronyms

Subject Index