Session 5: The Data Dictionary

  relationship to systems analysis methodologies
  relationship to project management
  data definition vs. data representation
  taxonomy of data types

COMP 477/377, Fall 2018, Mr. Weisert

Project Dictionary 1. Data item definitions

Ways of documenting detailed requirements

  Flowcharts + record layouts
  "Victorian novel"
  Structured analysis
  Object-oriented analysis
  UML with use-cases
  Discrete requirements list
  Incremental approach / stories

  We saw this list before.
  No matter which one(s) you choose, there's one component you should always have. It goes with any of them. What is it?

The Project Data Dictionary

  A repository of rigorous definitions of every data item mentioned in the requirements documents.
    Its unique name
    Its type (What are the possibilities?)
    Its precise meaning, not the representation (what it is, not what it looks like)
    Its attributes in the real world
  What kinds of data item?

A data dictionary may be maintained

  With a specialized software product
    An independent data-dictionary program
    Part of a C.A.S.E. tool
    Some are also data directories. What's that?
  With a general software product
    A spreadsheet processor
    A database manager
    A word processor
  Manually
    Stack of index cards or forms
  but unfortunately...

COMP 477/377, Fall 2018. Copyright 2015 Conrad Weisert
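
The four parts of a dictionary entry listed above (unique name, type, precise meaning, real-world attributes) can be sketched as a small data structure. This is only an illustration kept in a general-purpose language, not a recommended tool; the class names, the sample entry, and its attribute values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataItemDefinition:
    """One data dictionary entry: what the item is, not what it looks like."""
    name: str                                        # unique name
    item_type: str                                   # e.g. "numeric", "discrete", "text"
    meaning: str                                     # precise real-world meaning
    attributes: dict = field(default_factory=dict)   # real-world attributes


class DataDictionary:
    """Repository that refuses a second definition under the same name."""

    def __init__(self):
        self._entries = {}

    def define(self, entry):
        key = entry.name.lower()
        if key in self._entries:
            raise ValueError(f"duplicate data item name: {entry.name}")
        self._entries[key] = entry

    def lookup(self, name):
        return self._entries[name.lower()]


# Hypothetical entry, invented for illustration only.
dictionary = DataDictionary()
dictionary.define(DataItemDefinition(
    name="QuantityOrdered",
    item_type="numeric",
    meaning="Number of units of a product requested on one order line",
    attributes={"unit of measure": "units", "range": "1 or more"},
))
```

Rejecting a duplicate name is what enforces the "unique name" requirement; nothing in the entry says how the item will be stored.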

Many projects don't!

  If a project doesn't create and maintain a data dictionary, it may define data items:
    a. through inline documentation or footnotes
       This is particularly common with use-cases and user stories.
       We call that an implicit data dictionary. What's wrong with that?
    b. or not at all!

Important!!

  The lack of a project data dictionary is a common cause of failure of application development projects.
  What do we mean by "failure"?
    a. serious schedule overrun
    b. serious cost overrun
    c. never satisfies the users
    d. combination of the above

An implicit data-dictionary example

  "Customers who place more than $10,000 business per year and have a good payment history or have been with us for more than 20 years are to receive priority treatment."
    - a business rule from a respected systems analysis textbook

  We saw that before!
  Are there data items there that should be defined?
  If so, should the business rule be rewritten?

Obvious questions

  About the same business rule:
    For how many years must a customer have purchased more than $10,000 worth of merchandise? How recently?
    What is a good payment history?
    During how many years of that twenty-year period, and how much, must a customer have spent to have "been with us"?
    What does priority treatment mean?
  How should those questions be answered?

Computer programs manipulate two kinds of data

  Application domain data
    Exist in the real world
    Usually known to users / sponsors
    May be persistent or transient
  Program data
    Have no real-world existence
    Are none of the users' business!
    Are usually transient
    Examples?

A taxonomy of data types

  The 3 fundamentally different kinds of data
  Some basic subtypes of those 3
  This is all independent of any programming language, but most programming languages provide support for some or all of them.
  A data dictionary is concerned only (or mainly) with application-domain data. Why?

Reminder: Application-domain data items: 3 basic types

  We explained last week:
    What those 3 basic types are
    Emphasis is on what an item is, not what it looks like
    Rigorous definition is the responsibility of the analyst
      Normally done in (what we're calling) phase 3, but can be augmented or clarified later, as needed.
    Internal representation will be determined later (by whom?)

Application-domain data items: 3 basic types

  1. Elementary items
  2. Composite items
  3. Container items (Java calls them Collections)

Application-domain data items: 3 basic types

  1. Elementary items are:
       not composed of other data items
       sometimes called "fields" or "elements" (when part of a composite item)
       defined in terms of their real-world meaning
  2. Composite items are:
       composed of other data items, which may be either elementary or composite
       sometimes called "structures", "records", "blocks", "data flows"
       also called "entities" or "subjects" when they play a primary role in an application system
       defined mainly in terms of their components

Data items: 3 kinds

  3. Container items are:
       structures that hold other data items, which are usually either elementary or composite (sometimes other containers)
       either:
         static (staying the same size and shape throughout their life span), or
         dynamic (growing, shrinking, or reconfiguring)
       either:
         homogeneous (all elements are of the same type), or
         heterogeneous (multiple kinds of data can be elements)
       defined in terms of their behavior

Reminder: We then examined just the first major type

  We categorized elementary data in four main subcategories.
  The systems analyst and user-audience are rarely interested in internal (in programs and files) representations.

1. Elementary data items

  Every elementary data item belongs to one (and only one) of these basic subtypes:
    discrete (or coded or enumerated)
      possible values belong to a finite set
    numeric
      some arithmetic operation is meaningful
    text (or character string) [see note on the next slide]
    logical (or Boolean or switch or indicator)
      2 possible values (T/F, Y/N, on/off, 0/1, present/absent, ...)
  Are there any others?
  How do C#, Java, etc. designate them?
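
As one possible answer to "how do languages designate them?", here is a minimal sketch of how the four elementary subtypes might be written in Python. The status codes, the sample values, and the variable names are hypothetical; the point is only which language facility lines up with which subtype.

```python
from decimal import Decimal
from enum import Enum


class MaritalStatus(Enum):
    """Discrete: possible values belong to a finite set (codes are hypothetical)."""
    SINGLE = "S"
    MARRIED = "M"
    DIVORCED = "D"
    WIDOWED = "W"


marital_status = MaritalStatus.MARRIED   # discrete
interest_rate = Decimal("0.0525")        # numeric: arithmetic is meaningful
employee_name = "Pat Doe"                # text
credit_approved = True                   # logical: exactly two possible values

# Arithmetic makes sense for the numeric item only.
yearly_interest = Decimal("1000") * interest_rate
```

These are representation choices a developer might make later; the dictionary definition itself would still say only what each item means.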

A language-dependent peculiarity

  In the C family of languages text data are often thought of as containers rather than elementary items!
  That's because they're usually represented by an array of char.
  But the great majority of char[] items and String objects are treated simply as fixed data: identifiers, titles, names, message text, etc.
  Programs can still use arrays when they need to scan or compose pieces of a string.

Which of the four elementary subtypes do these belong to?

  MaritalStatus
  EmployeeName
  InterestRate
  TelephoneNumber
  DueDate
  BookTitle
  CreditLimit
  Sex
  AccountNumber
  Weight (of a shipment)
  Velocity
  City
  State
  ZIP code
  Color (of a product)
  Price (of an item)
  CreditApproved

Attributes (or properties) of elementary data items

  This is only for data definition; it doesn't restrict developers to any internal representation.
  Attributes of numeric data items:
    unit of measure
    range
    precision
    scale
  Attributes of text data items:
    length
    internal format, delimiters
  Attributes of discrete data items:
    number of possible values (both current and potential)
    coding structure

Avoid false numerics

  Many discrete data items are represented by a sequence of numeric digits. Examples?
  That doesn't make them numeric. Why not?
  But many of them have "number" as part of their name, e.g. accountnumber
  We emphasize what a data item is, not what it looks like.
  Some old-fashioned tools take the opposite view. Which ones?
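
A small sketch of why a digit string is not automatically numeric: an account number is an identifier, so arithmetic on it is meaningless and a numeric representation can even damage it. The `AccountNumber` class and the sample value are hypothetical, written only to illustrate the point.

```python
class AccountNumber:
    """Discrete identifier that happens to be written with digits."""

    def __init__(self, digits):
        if not (digits.isascii() and digits.isdigit()):
            raise ValueError("an account number consists of digits only")
        self._digits = digits

    def __str__(self):
        return self._digits

    def __eq__(self, other):
        return isinstance(other, AccountNumber) and self._digits == other._digits

    def __hash__(self):
        return hash(self._digits)


account = AccountNumber("00123")

# Treating the same value as a number silently drops the leading zeros.
as_false_numeric = int("00123")
```

Because the class defines no arithmetic operators, an expression such as `account + account` raises `TypeError`, which matches the definition of a numeric item as one for which some arithmetic operation is meaningful.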

Attributes of composite data items

  List of component data items with rules for including them
  Can often be defined by a language-dependent commented structure definition (C struct).
  DeMarco's language-independent notation for specifying a composite data item:

    Symbol   Meaning
    =        is composed of
    +        followed by
    ( )      optional
    { }      iteration
    [ ]      alternatives

Composite item definition

  Example using DeMarco notation:

    EMPLOYEE INFO = NAME
                  + DATE OF BIRTH
                  + MARITAL STATUS
                  + 0{DEPENDENT INFO}15
                  + DATE HIRED
                  + [HOURLY WAGE | MONTHLY SALARY]
                  + (PREVIOUS EMPLOYER)
                  + etc.

  It is assumed that user-reviewers will be able (with suitable briefing) to understand that.

Examples of container data

  Static structures
    Arrays
  Dynamic structures:
    lists, stacks, queues
    trees, graphs
    dynamic arrays
    Why not character strings?
  External structures:
    files, data bases
    display / interface (GUI) objects

Exploiting packaged application software documentation

  If
    it appears very likely that the project will choose a specific application software product, and
    documentation for that product defines some data items exactly the way we want them, and
    those data definitions are clear and easy to find in the vendor's documentation, and
    that documentation is available to everyone on the project team, whether or not we buy the product,
  Then:
    It is acceptable (and desirable) to avoid redundant work by just referring to (or copying) the product documentation.
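
The EMPLOYEE INFO definition can also be expressed as a commented structure definition, here in Python rather than as a C struct. This sketch assumes that the iteration bounds 0 and 15 apply to DEPENDENT INFO and that HOURLY WAGE and MONTHLY SALARY are the two alternatives; the field types, the simplification of DEPENDENT INFO to a name, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Union


@dataclass
class HourlyWage:
    amount: float


@dataclass
class MonthlySalary:
    amount: float


@dataclass
class EmployeeInfo:
    name: str
    date_of_birth: date
    marital_status: str
    date_hired: date
    pay: Union[HourlyWage, MonthlySalary]                  # [ ] alternatives
    dependents: List[str] = field(default_factory=list)    # { } iteration, 0 to 15
    previous_employer: Optional[str] = None                # ( ) optional

    def __post_init__(self):
        # Enforce the inclusion rules that the notation states.
        if len(self.dependents) > 15:
            raise ValueError("at most 15 DEPENDENT INFO entries are allowed")
        if not isinstance(self.pay, (HourlyWage, MonthlySalary)):
            raise TypeError("pay must be an HourlyWage or a MonthlySalary")


# Hypothetical instance for illustration.
employee = EmployeeInfo(
    name="Pat Doe",
    date_of_birth=date(1980, 1, 1),
    marital_status="M",
    date_hired=date(2015, 6, 1),
    pay=MonthlySalary(5000.0),
    dependents=["Kim Doe", "Lee Doe"],
)
```

Each DeMarco symbol maps to one construct: `+` becomes successive fields, `( )` an optional field, `{ }` a bounded list, and `[ ]` a choice between types.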

Terminology update

  Java uses the term container in a different and narrower sense, to mean a GUI screen object (e.g. a window) that can contain other GUI objects.
  Java now uses the term collection in the more general sense, where we've been using container.
  Use whichever term you prefer as long as the context is clear.

Taxonomy summary: the top of the tree

  Data item
    Elementary item
      Discrete item
      Numeric item
      Text item
      Logical item
    Composite item
      Records, entities
      Other static tree structures
    Container item
      Arrays, vectors, matrices, tables
      Lists, stacks, queues
      Trees, graphs
      Files, databases
      Windows, boxes

  Which ones are suited to being represented as object-oriented classes?

Examples

  Elementary items
    Discrete
      COLOR (of a product)
      ACCOUNT NUMBER
      STATE (in an address)
      MARITAL STATUS
      BRANCH OFFICE CODE
      CREDIT CARD TYPE
      TELEPHONE NUMBER
    Numeric
      HOURS WORKED (of an employee)
      TEMPERATURE (of a substance)
      QUANTITY ORDERED (of a product)
      DATE SHIPPED (of an order)
      MASS (of a body)
      SPEED (of an object)
      DEPARTURE TIME (of a train)
    Text
      NAME (of an employee)
      DESCRIPTION (of a product)
      CITY (in an address)
      DUNNING LETTER
    Logical
      CREDIT APPROVAL (for an order)
      NEW CUSTOMER FLAG
      UNION MEMBER (for an employee)
      AUDIT TRACE OPTION
  Composite items
    Entity / subject
      EMPLOYEE
      VENDOR
      PRODUCT
      VEHICLE
    Other
      HOME ADDRESS (of an employee)
      CREDIT HISTORY (of a customer)
      SUBASSEMBLY (of a product)
      CURRENT ORBIT (of a satellite)

Avoid false composites

  Don't confuse a mixed-unit or structured representation of an elementary item with a true composite item.
  For example, these should be treated as elementary, not composite, items:
    Date = year + month + day
    PersonName = first + middle + last
    Time = hours + minutes + seconds
  Avoiding false composites is especially important to avoid clutter and complexity in a data dictionary.
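
A short sketch of what treating a date or a time as elementary looks like in practice: the value is handled as one quantity, and year, month, day (or hours, minutes, seconds) are just one way of writing it down. The variable names, the dates, and the 30-day term are hypothetical.

```python
from datetime import date, timedelta

# One elementary value, even though it is written as year, month, day.
date_shipped = date(2018, 10, 2)

# Arithmetic applies to the date as a whole, not to its three parts.
payment_due = date_shipped + timedelta(days=30)

# A mixed-unit time (1 h 30 min 15 s) is likewise a single quantity.
elapsed = timedelta(hours=1, minutes=30, seconds=15)
elapsed_seconds = elapsed.total_seconds()
```

Adding 30 days crosses a month boundary without the caller ever touching month or day separately, which is the practical benefit of not cluttering the dictionary with three component items.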

Data classes and inheritance

  Each basic data type can be divided into subtypes or data classes.
  Each such class can in turn be further divided into subclasses to any level.

Data classes and inheritance

  Each class inherits the properties of its parent classes:
    data representation, esp. internal, and attributes
    associated functions and operations
  This inheritance principle can greatly simplify the definition of both data items and the functions or processes that operate on them (even without any object-oriented tools).

Classes and data items

  In a class hierarchy, properties (or attributes) of a class T are inherited by both:
    subclasses of T
    specific data items (instances, objects) of type T
  It is, therefore, unnecessary and undesirable (why?) to specify attributes in the dictionary definition of every data item.
  Nevertheless, many older tools (COBOL, some data-dictionary systems, etc.) demand that we do so!
  Would the "year-2000 crisis" have occurred, if everyone had understood this principle?

Example

  We can define billingaddress as an instance of MailingAddress, where MailingAddress is a class that defines the structure and content of an address acceptable to post offices.
  Derived subclasses could include USStreetAddress, USBoxNoAddress, CanadaAddress, etc.
  It would then be redundant (and wrong) to define attributes of the individual data item billingaddress, such as
    number of lines for the street address portion and their maximum length
    format of a ZIP code
  Why?
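
The billingaddress example can be sketched with ordinary class inheritance. The limits on street lines and line length are hypothetical values chosen for illustration, as are the constructor parameters and the sample address; the ZIP pattern assumes the usual five-digit form with an optional four-digit extension.

```python
import re


class MailingAddress:
    """Attributes are stated once, at class level."""
    MAX_STREET_LINES = 2     # hypothetical limit
    MAX_LINE_LENGTH = 40     # hypothetical limit

    def __init__(self, street_lines, city):
        if len(street_lines) > self.MAX_STREET_LINES:
            raise ValueError("too many street address lines")
        if any(len(line) > self.MAX_LINE_LENGTH for line in street_lines):
            raise ValueError("street address line too long")
        self.street_lines = list(street_lines)
        self.city = city


class USStreetAddress(MailingAddress):
    """Subclass adds only what is specific to US street addresses."""
    ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")

    def __init__(self, street_lines, city, state, zip_code):
        super().__init__(street_lines, city)
        if not self.ZIP_PATTERN.fullmatch(zip_code):
            raise ValueError("invalid ZIP code format")
        self.state = state
        self.zip_code = zip_code


# The data item is just an instance; it restates none of the class attributes.
billing_address = USStreetAddress(["123 Main St"], "Springfield", "IL", "62701")
```

If the post office later changed a format rule, only the class definition would change; every data item of that class would pick up the change without its own dictionary entry being edited.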

Classes (or types) versus data items (or instances or objects)

  Inexperienced systems analysts and application designers sometimes confuse these two very different things.
  For example, these are appropriate names for classes, but not for data items:
    temperature
    date
    address
  Why not? What's wrong with them?
  Data items need more specific names (or to be components of a higher-level data aggregate)!

Ambiguous data names

  temperature: What is it the temperature of? When (starting temperature or current)?
  date: What is it the date of or for?
    shippingdate
    orderdate
    expirationdate
    etc.
  address: What is it the address of?
    mailing address
    billing address
    home address

Level of generality

  In a program with modular structure, the names of data items will depend upon the level of the program module:
    High-level modules will refer to a data item by its real-world name.
      A process Transaction module would refer to:
        customeraddress or address of customer
        dateshipped
    Low-level or general-purpose modules will use a more generic name.
      An edit (validation) module would refer to:
        address
        date
  Which is the analyst concerned with in phase 3?

Warnings

  "The most pernicious and subtle bugs are system bugs arising from mismatched assumptions made by the authors of various components." -- Fred Brooks, p. 142
  "Many, many failures concern exactly those aspects that were never quite specified." -- V.A. Vyssotsky, Bell Laboratories, quoted by Dr. Brooks, same page.