Week 6: Data. Let's focus first on application domain data.

Similar documents
Session 3: Data. Session 3a: Data

Ways of documenting Session 5: detailed requirements The Data Dictionary one any The Project A data dictionary Data Dictionary may be maintained

Ways of documenting Session 5: detailed requirements The Data Dictionary one any The Project A data dictionary Data Dictionary may be maintained

Session 3b: Defining data items

Week 11: Case study: Designing, building, & testing a Person class Background for the Project Needed in many applications Is it possible? practical?

Does anyone actually do this?

Do these criteria apply to work in this course?

Why do we have to know all that? The stored program concept (the procedural paradigm) Memory

Short Notes of CS201

CS201 - Introduction to Programming Glossary By

Session 4b: Review of Program Quality

Lecture 13: Object orientation. Object oriented programming. Introduction. Object oriented programming. OO and ADT:s. Introduction

Introduction to Programming Using Java (98-388)

What's that? Why? Is one "better" than the other? Terminology. Comparison. Loop testing. Some experts (e.g. Pezze & Young) call it structural testing

Chapter 11. Categories of languages that support OOP: 1. OOP support is added to an existing language

Session 2. Getting started with a well-structured system specification

Part 3. Why do we need both of them? The object-oriented programming paradigm (OOP) Two kinds of object. Important Special Kinds of Member Function

Data Structures (list, dictionary, tuples, sets, strings)

Reviewing for the Midterm Covers chapters 1 to 5, 7 to 9. Instructor: Scott Kristjanson CMPT 125/125 SFU Burnaby, Fall 2013

COP 3330 Final Exam Review

CPS122 Lecture: From Python to Java last revised January 4, Objectives:

Graphical Interface and Application (I3305) Semester: 1 Academic Year: 2017/2018 Dr Antoun Yaacoub

Weiss Chapter 1 terminology (parenthesized numbers are page numbers)

Object-Oriented Concepts and Design Principles

What are the characteristics of Object Oriented programming language?

Lecture 3. COMP1006/1406 (the Java course) Summer M. Jason Hinek Carleton University

Object-Oriented Programming

Contents. Figures. Tables. Examples. Foreword. Preface. 1 Basics of Java Programming 1. xix. xxi. xxiii. xxvii. xxix

Defining Data Items Conrad Weisert (2003)

Preface... (vii) CHAPTER 1 INTRODUCTION TO COMPUTERS

WA1278 Introduction to Java Using Eclipse

Classes and Methods לאוניד ברנבוים המחלקה למדעי המחשב אוניברסיטת בן-גוריון

SPARK-PL: Introduction

3D Graphics Programming Mira Costa High School - Class Syllabus,

Overview of OOP. Dr. Zhang COSC 1436 Summer, /18/2017

Outline. Computer Science 331. Information Hiding. What This Lecture is About. Data Structures, Abstract Data Types, and Their Implementations

OBJECT ORIENTED SYSTEM DEVELOPMENT Software Development Dynamic System Development Information system solution Steps in System Development Analysis

2. The object-oriented paradigm

Classes and Methods עזאם מרעי המחלקה למדעי המחשב אוניברסיטת בן-גוריון מבוסס על השקפים של אותו קורס שניתן בשנים הקודמות

Curriculum Map Grade(s): Subject: AP Computer Science

Chapter 8: Data Abstractions

E-R Model. Hi! Here in this lecture we are going to discuss about the E-R Model.

Lecture 2 and 3: Fundamental Object-Oriented Concepts Kenneth M. Anderson

Chapter 6 Introduction to Defining Classes

MITOCW watch?v=kz7jjltq9r4

The Essence of Object Oriented Programming with Java and UML. Chapter 2. The Essence of Objects. What Is an Object-Oriented System?

Topic 7: Algebraic Data Types

CompuScholar, Inc. Alignment to Nevada "Computer Science" Course Standards

Absolute C++ Walter Savitch

Session 2b: structured specifications Purpose and criteria Structured specification components Introduction to dataflow diagrams

THE OBJECT-ORIENTED DESIGN PROCESS AND DESIGN AXIOMS (CH -9)

Data Types. (with Examples In Haskell) COMP 524: Programming Languages Srinivas Krishnan March 22, 2011

Classes and Methods גרא וייס המחלקה למדעי המחשב אוניברסיטת בן-גוריון

Chapter 10. Object-Oriented Analysis and Modeling Using the UML. McGraw-Hill/Irwin

3. Simple Types, Variables, and Constants

C++ (Non for C Programmer) (BT307) 40 Hours

Week 5: Background. A few observations on learning new programming languages. What's wrong with this (actual) protest from 1966?

Topics. Modularity and Object-Oriented Programming. Dijkstra s Example (1969) Stepwise Refinement. Modular program development

Topic 2. Collections

Chapter 8: Data Abstractions

COWLEY COLLEGE & Area Vocational Technical School

Data Structure. Recitation III

And Parallelism. Parallelism in Prolog. OR Parallelism

CHAPTER 1 Introduction to Computers and Programming CHAPTER 2 Introduction to C++ ( Hexadecimal 0xF4 and Octal literals 031) cout Object

Naming in OOLs and Storage Layout Comp 412

Data Structures and Algorithms in Java. Second Year Software Engineering

Chapter 13 Object Oriented Programming. Copyright 2006 The McGraw-Hill Companies, Inc.

Credit where Credit is Due. Lecture 4: Fundamentals of Object Technology. Goals for this Lecture. Real-World Objects

Object-Oriented Software Engineering Practical Software Development using UML and Java

CPS 506 Comparative Programming Languages. Programming Language

Week 2: The Clojure Language. Background Basic structure A few of the most useful facilities. A modernized Lisp. An insider's opinion

CSCI 445 Amin Atrash. Control Architectures. Introduction to Robotics L. Itti, M. J. Mataric

Variables and Constants

Design Patterns IV. Alexei Khorev. 1 Structural Patterns. Structural Patterns. 2 Adapter Design Patterns IV. Alexei Khorev. Structural Patterns

CS 231 Data Structures and Algorithms, Fall 2016

Software Engineering Prof. Rushikesh K.Joshi IIT Bombay Lecture-15 Design Patterns

Design Patterns IV Structural Design Patterns, 1

Big Java Late Objects

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

Fundamentals of Programming Languages

PROGRAMMING IN C++ COURSE CONTENT

CS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University

Lecture 7: Data Abstractions

Classes, objects, methods and properties

Introduction to Computers and Programming Languages. CS 180 Sunil Prabhakar Department of Computer Science Purdue University

Object Orientated Analysis and Design. Benjamin Kenwright

M301: Software Systems & their Development. Unit 4: Inheritance, Composition and Polymorphism

PROCESS DEVELOPMENT METHODOLOGY The development process of an API fits the most fundamental iterative code development

An OBJECT contains VARIABLES = PROPERTIES and METHODS = BEHAVIORS. Outline the general nature of an object.

Algorithms and Data Structures

Problem Solving with C++

McGill University School of Computer Science COMP-202A Introduction to Computing 1

OBJECT ORIENTED SIMULATION LANGUAGE. OOSimL Reference Manual - Part 1

Object Oriented Programming is a programming method that combines: Advantage of Object Oriented Programming

VALLIAMMAI ENGINEERING COLLEGE

Design Patterns. Comp2110 Software Design. Department of Computer Science Australian National University. Second Semester

Review sheet for Final Exam (List of objectives for this course)

Design Pattern: Composite

Object-Oriented Programming Concepts

Concepts of Programming Languages

Transcription:

review of things we already (should) know criteria for external & internal representation basic elementary data types composite data types container data types derived subtypes abstract data types (ADT) programming language support COMP 370, Fall, 2016 Mr. Weisert Week 6: Data Computer programs manipulate two kinds of data Application domain data Exist in the real world Usually known to users / sponsors May be persistent or transient Often identified by systems analyst Program data Have no real-world existence None of the users' direct business Usually transient Often identified by programmer/designer Let's focus first on application domain data. Part 1: Choosing data representations Who does it? When? What are some criteria? Three examples color (of an object) name (or a person) weight (of a shipment) Example 1: representing color Two possible representations: Color Choice A Choice B white "WHT" 1 black "BLK" 2 red "RED" 3 blue "BLU" 4 green "GRN" 5 aquamarine "AQM" 6 Which is better? Why? What other choices should we consider? COMP 370, fall, 2016 1-4

Consider these code fragments a. if (color == "WHT") ++whitecount; else if (color == "BLK") ++blackcount; else if (color == "GRN") ++greencount; else if (color == "BLU") ++bluecount; else if (color == "RED") ++redcount; Example 2: representing a person's name First choice: a 48-byte field with embedded delimiter Limbaugh,Rush De Gaulle,Chuck Quayle,J. Danforth Second choice: a 48-byte field Rush Limbaugh Chuck De Gaulle J. Danforth Quayle Third choice: three 20-byte fields Chuck J. Which is best? Why? What other choices should we consider? b. ++itemcount[color]; De Gaulle Danforth Quayle Name representation considerations What if we want to sort the names? What if someone has more than three names? George Herbert Walker Bush What if someone has only one name? Madonna What if some cultures put the family name first? Chiang Kai-Shek What if one part of the name is very long? Example 3: representing shipping weight First choice: unit of measure: kilograms range: 0.1-50 precision:.05 scale: float Second choice: unit of measure: <pounds, ounces> range: <0,8> - <99,15> precision <0,1> scale integer, integer Which is better? Why? What other representations should we consider? COMP 370, fall, 2016 5-8

What do those examples show? End users prefer representations that are familiar. Programmers prefer representations that are simple. Therefore a computer application often needs two (or more) representations of the same data item. What should we call them? Data representation -- 2 kinds External representations are visible to human users. They appear in: printed reports screen displays transactions and other input data source documents Internal representations (rarely seen by the end user) appear in: computer programs data bases / master files temporary (work / scratch) files Criteria for choosing each are quite different. Criteria for choosing a data representation External representations should be: familiar to the people who enter them or have to understand them. reliable; not error-prone How do programs convert between them? Internal representations should be: simple and efficient to store and manipulate flexible to accommodate growth or likely future changes (if possible) interchangeable among application systems, organizations, etc. (i.e. standard) How do programs convert between the two representations? Internal to external External to internal When in in what functions do these conversions ideally occur? COMP 370, fall, 2016 9-12

Data Conversion: external to internal (input) usually done by an editing program, which: What does it do? Data Conversion: external to internal (input) usually done by an editing program, which: 1. converts each data item from its external (input) representation to a standard internal representation, so that later programs needn't know about or deal with messy external representations 2. validates that each data item conforms to its specification. It rejects those that fail, so that subsequent programs can never encounter illegal data. Thorough input editing is essential in any program that accepts input from the external world, whether batch, interactive, event-driven,... Data Conversion: internal to external (output) simpler than input Why? usually done in a reporting or output display program In C++ can be implemented as a customized output stream insertion operator (<<) or a class-specific print function (method). In Java and C# can be implemented in the tostring() or ToString() method. How can object-oriented programming help in distinguishing between internal and external representations? Member data items (almost always private) constitute the internal representation Output-stream functions and ToString() handle internal to external conversion Constructors handle external to internal conversion (sometimes not very well) COMP 370, fall, 2016 13-16

Any Questions? About data representation in general? About the need to distinguish between internal representation and external representation The criteria for choosing both This is an important topic that's not covered in many textbooks, Part 2: A taxonomy of data types The 3 fundamentally different kinds of data Some basic subtypes of those 3. This is all independent of any programming language. We talked about some of this (numeric data) in week 3 Data items: 3 kinds 1. Elementary items are not composed of other data items sometimes called "fields" or "elements" (when part of a composite item). defined in terms of their real world meaning 2. Composite items are: composed of other data items, which may be either elementary or composite, sometimes called "structures", "records", "blocks", "data flows", also called "entities" or "subjects" when they play a primary role in an application system defined mainly in terms of their components. What's the third kind? Data items: 3 kinds 3. Container items are structures that can hold other data items, either elementary or composite (rarely other containers). either: static (staying the same size and shape throughout their life span), or dynamic (growing, shrinking, or reconfiguring either: homogeneous (all elements of the same type), or heterogeneous (multiple kinds of data can be elements) defined in terms of their behavior COMP 370, fall, 2016 17-20

Elementary data items Every elementary data item belongs to one (and only one) of these basic types. discrete (or coded or enumerated) possible values belong to a finite set numeric some arithmetic operation is meaningful text (or character string) logical (or Boolean) 2 possible values Which basic types did the three earlier examples belong to? Attributes (or properties) of elementary data items Attributes of numeric data items: unit of measure range precision scale Attributes of text data items: length internal format, delimiters Attributes of discrete data items: number of possible values (both current and possible) coding structure Avoid false numerics Many discrete data items are represented by a sequence of numeric digits. Examples? That doesn't make them numeric. Why not? We emphasize what a data item is, not what it looks like. Some old-fashioned tools take the opposite view. Which ones? Examples of container data Static structures Arrays Dynamic structures: lists, stacks, queues trees, graphs dynamic arrays External structures: files, data bases display / interface (GUI) objects What about character strings? COMP 370, fall, 2016 21-24

Terminology update Java uses the term container in a narrower sense, to mean a GUI screen object (e.g. a window) that can contain other GUI objects. Java now uses the term collection in the more general sense, where we've been using container. Use whichever term you prefer as long as the context is clear. Elementary item Discrete item Numeric item Text item Logical item Taxonomy summary: the top of the tree Data item Composite item Records, entities Other static tree structures Container item Arrays, vectors, matrices, tables Lists, stacks, queues Trees, Graphs Files, databases Windows, boxes Which ones are suited to being represented as object-oriented classes? Examples Discrete COLOR (of a product) ACCOUNT NUMBER STATE (in an address) MARITAL STATUS BRANCH OFFICE CODE CREDIT CARD TYPE TELEPHONE NUMBER Text NAME (of an employee) DESCRIPTION (of a product) CITY (in an address) DUNNING LETTER Entity / subject EMPLOYEE VENDOR PRODUCT VEHICLE Elementary items Numeric HOURS WORKED (of an employee) TEMPERATURE (of a substance) QUANTITY ORDERED (of a product) DATE SHIPPED (of an order) MASS (of a body) SPEED (of an object) Composite items DEPARTURE TIME (of a train) Logical CREDIT APPROVAL (for an order) NEW CUSTOMER FLAG UNION MEMBER (for an employee) AUDIT TRACE OPTION Other HOME ADDRESS (of an employee) CREDIT HISTORY (of a customer) SUBASSEMBLY (of a product) CURRENT ORBIT (of a satellite) Avoid false composites Don't confuse a mixed unit or structured representation of an elementary item with a true composite item. For example, these should be treated as elementary, not composite, items: Date = year + month + day PersonName Time = first + middle + last = hours + minutes + seconds Avoiding false composites is especially important to avoid clutter in a data dictionary COMP 370, fall, 2016 25-28

Data classes and inheritance Each basic data type can be divided into subtypes or data classes. Each such class can in turn be further divided into subclasses to any level. Data classes and inheritance Each class inherits the properties of its parent classes: data representation, esp. internal, and attributes associated functions and operations This inheritance principle can greatly simplify the definition of both data items and the functions or processes that operate on them (even without any object-oriented tools.) Classes and data items In a class hierarchy, properties of a class T are inherited by both: subclasses of T specific data items (instances, objects) of type T It is, therefore, unnecessary and undesirable (why?) to specify every attribute in the definition of every data item. Nevertheless, many older tools (COBOL, some data-dictionary systems, etc.) demand that we do so! Would the "year-2000 crisis" have occurred, if everyone had understood this principle? Classes (or types) versus data items (or instances or objects) Inexperienced systems analysts sometimes confuse these two very different things. For example, which (class or data item) is each of these likely to be? temperature date address How do we know? These names by themselves are usually appropriate for a class rather than for a specific data item. Careful data-item names qualify a noun: oven temperature, customer address But low-level modules may use generic names. COMP 370, fall, 2016 29-32

Data representation standards For any class, there's usually a big advantage in choosing the same internal representation for all instances (data items). That promotes: data interchange re-usable program modules reliability efficiency Level can be a generic class or subclass a basic type Scope can be: industry-wide corporate project Abstract data type (ADT) A class defined: not in terms of the representation but in terms of the operations or functions that are permitted / supported ("behavior", "services") When we discuss or manipulate an ADT, we: are unaware of the internal representation, including names, types, or sequence, or memory layout of any component items interact with objects of that type only through well-defined interface functions ("methods") Why is that good? Implementing an ADT in software Program designers can always (and often should) exploit the ADT concept as a design aid, even with non-oop tools, but Object-oriented languages (Smalltalk, C++, Java,... ) support ADT's through a class definition capability. Hides (i.e. localizes) the internal representation Encapsulates functions that operate on objects (data items, instances) of the class What C facility was extended to support ADT's in C++? COMP 370, fall, 2016 33-36