AN INITIAL THEORETICAL FOUNDATION FOR OBJECT-ORIENTED SYSTEMS ANALYSIS AND DESIGN

A Dissertation Presented to the Department of Computer Science, Brigham Young University

In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

Stephen W. Clyde
April 1993

© 1993 by Stephen W. Clyde

This dissertation by Stephen W. Clyde is accepted in its present form by the Department of Computer Science of Brigham Young University as satisfying the dissertation requirement for the degree Doctor of Philosophy.

Scott N. Woodfield, Committee Chair
David W. Embley, Committee Member and Graduate Coordinator
Dan Olsen, Committee Member and Department Chair

ACKNOWLEDGEMENTS

I wish to express my sincere appreciation to the many individuals who helped me complete this work. I give special thanks to Dr. David Embley and Dr. Scott Woodfield for the countless hours they spent with me discussing and formulating the theoretical foundation presented in this dissertation. They have also been excellent mentors who have taught me to identify interesting problems, set meaningful objectives, and conduct sound research.

I acknowledge all the contributions made by members of the Object-oriented Systems Modeling group at Brigham Young University. In particular, I want to thank Susan Bodily, Jay Compton, Mark Gardner, Christophe Giraud-Carrier, Robert Jackson, Steve Liddle, and Brent Reed for their constructive comments on many of the key concepts in the dissertation. I give special thanks to my father, Dr. Allan Clyde, for his valuable insight into software engineering, as well as his moral support.

Funding and equipment for this research were provided by the Computer Science Department of Brigham Young University and the Hewlett-Packard Company. I express sincere thanks to John Burnham, who played a major role in securing the funding and equipment grants.

I especially want to thank my sweet wife, Emily, and my children, Jamie, David, Joseph, Matthew, and Nathan, for their unending patience and encouragement during the years of work that went into this research. They sacrificed many of the comforts and securities of life so I could pursue my degree.

Finally, I express my gratitude to the Lord for the strength to persevere and for insight into the problems and concepts that I encountered during the research.

TABLE OF CONTENTS

CHAPTER 1 TOWARDS AN ENGINEERING DISCIPLINE FOR OBJECT-ORIENTED SOFTWARE DEVELOPMENT
    1.1 INTRODUCTION
    1.2 OBJECT-ORIENTED SOFTWARE DEVELOPMENT
    1.3 SOFTWARE SYSTEMS ANALYSIS AND DESIGN
    1.4 SOFTWARE ENGINEERING
    1.5 A THEORETICAL FOUNDATION FOR A MODEL-DRIVEN APPROACH
    1.6 OVERVIEW OF THE DISSERTATION

CHAPTER 2 TUNABLE FORMALISM IN OBJECT-ORIENTED SYSTEMS ANALYSIS AND DESIGN: MEETING THE NEEDS OF BOTH THEORETICIANS AND PRACTITIONERS
    2.1 INTRODUCTION
    2.2 OSA
        2.2.1 Object-Relationship Model (ORM)
        2.2.2 Object-Behavior Model (OBM)
        2.2.3 Object-Interaction Model (OIM)
    2.3 FORMAL DEFINITION OF OSA
        2.3.1 OSA Semantics
        2.3.2 OSA Syntax
    2.4 BENEFITS OF FORMALISM
    2.5 VARYING LEVELS OF DETAIL AND DEGREES OF COMPLETION
    2.6 SUMMARY

CHAPTER 3 SET-BASED SYNTAX DEFINITION OF OBJECT-ORIENTED SYSTEM ANALYSIS (OSA)
    3.1 INTRODUCTION
    3.2 OSA STRUCTURES
    3.3 SYNTACTICAL CONSTRAINTS
    3.4 OSA MODEL INSTANCE

CHAPTER 4 OBJECT-ORIENTED SYSTEM MODELING LOGIC (OSM-LOGIC)
    4.1 INTRODUCTION
    4.2 LANGUAGE DEFINITION
    4.3 INTERPRETATIONS

CHAPTER 5 OSA-TO-OSM-LOGIC CONVERSION ALGORITHM
    5.1 INTRODUCTION
    5.2 PRELIMINARY TRANSFORMATIONS
    5.3 CONVERTING ORM COMPONENTS
        5.3.1 Static Properties
        5.3.2 Dynamic Properties
    5.4 CONVERTING OBM COMPONENTS
        5.4.1 Static Properties
        5.4.2 Dynamic Properties
    5.5 CONVERTING OIM COMPONENTS
        5.5.1 Static Properties
        5.5.2 Dynamic Properties

CHAPTER 6 OSA-BASED SYNTAX DEFINITION
    6.1 INTRODUCTION
    6.2 OSA META-MODEL
    6.3 STATIC EQUIVALENCE
        6.3.1 Static-Equivalence Relations
        6.3.2 Equivalence Classes
    6.4 ATEMPORAL INTERPRETATIONS
        6.4.1 Definition
        6.4.2 Mapping Equivalence Classes to Atemporal Interpretations
        6.4.3 Mapping Atemporal Interpretations to Meta-model Interpretations
    6.5 MAPPING ATEMPORAL INTERPRETATIONS TO OSA STRUCTURES
    6.6 ATEMPORAL EQUIVALENCE
    6.7 CONCLUDING REMARKS ON OSA'S SYNTAX DEFINITIONS

CHAPTER 7 OBJECT-CLASS CONGRUENCY
    7.1 INTRODUCTION
    7.2 DEFINITION
        7.2.1 OSA
        7.2.2 Potential Object Properties
        7.2.3 Immediate/Inherited Properties
        7.2.4 Common Properties
        7.2.5 Object-class Congruency
    7.3 CONGRUENCY PROBLEMS
        7.3.1 Property Overstatement
        7.3.2 Property Understatement
    7.4 CONGRUENCY TRANSFORMATIONS
        7.4.1 Eliminating Overstatements
        7.4.2 Removing Understatements
    7.5 CONCLUSION

CHAPTER 8 CONCLUSION
    8.1 SUMMARY
    8.2 FUTURE RESEARCH
        8.2.1 Extensions to Object-class Congruency
        8.2.2 Other Desirable Properties
        8.2.3 Transition from Analysis to Design
        8.2.4 Additional Types of Design Information
        8.2.5 Type Theory
    8.3 FINAL REMARK

APPENDIX A THE OSA META-MODEL

LIST OF FIGURES AND TABLES

Figure 2.1 Sample ORM
Figure 2.2 Details of a high-level object class
Figure 2.3 Sample state net
Figure 2.4 Details of a high-level state
Figure 2.5 Two-step formal semantics definition
Figure 2.6 A valid interpretation for the Pizza object class
Figure 2.7 Partial meta-model and interpretation
Figure 2.8 Varying detail and completion with generalization/specialization
Figure 2.9 Varying detail and completion with aggregation
Figure 2.10 Increasing detail and completion with incremental development
Figure 2.11 Varying detail and completion with natural language
Figure 2.12 Varying detail and completion with high-level components
Figure 2.13 Hiding the detail of the high-level object classes in Figure 2.12
Figure 3.1 A sample OSA model instance
Figure 3.2 Textual representation of sample model instance in?
Figure 4.1 A sample OSA model instance
Figure 4.2 Sample OSM-Logic Formulas
Figure 4.3 An interpretation for the formulas in Figure 4.2
Figure 5.1 Details of the open state for the Order object class
Figure 5.2 Sample ORM for pizza ordering system
Figure 5.3 State net for the Order object class
Figure 6.1 Partial meta-model and interpretation
Figure 6.2 Sample ORM
Figure 6.3 Atemporal equivalence
Figure 6.4 Statically equivalent interpretation
Figure 7.1 An initial analysis model instance
Figure 7.2 Extended analysis model
Figure 7.3 Extended analysis model with congruency
Table 7.1 Predicates for the OSA model instance in Figure 7.2
Table 7.2 Formulas for the OSA model instance in Figure 7.2
Figure 7.4 Inventory view of a car-sales and rental application
Figure 7.5 Inventory view with incongruency corrected
Figure 7.6 Missing specialization
Figure 7.7 Rental view of sample application
Figure 7.8 Save view of sample application
Table 7.3 Defined and common properties for Customer
Figure 7.9 Integrated Customer using a union of original common properties
Figure 7.10 Integrated Customer using a union of original I/I properties
Figure 7.11 Integrated Customer using a new generalization
Figure 7.12 An understatement of the Car object class
Figure 7.13 Corrected Car object class

CHAPTER 1 TOWARDS AN ENGINEERING DISCIPLINE FOR OBJECT-ORIENTED SOFTWARE DEVELOPMENT

1.1 INTRODUCTION

Industry and academia are embracing the object-oriented paradigm for software development with considerable energy and enthusiasm [23, 49]. Its appeal comes from its ability to manage complexity [4], model problem-domain concepts naturally throughout the development process [36], reuse problem-domain concepts [31, 33], and provide a common platform for software development activities [23]. Unfortunately, many aspects of the paradigm are still immature, especially those related to early development activities, such as analysis and design [18]. Consequently, software-development organizations are experiencing various difficulties in switching over to an object orientation, including scaling up technology for large applications [19], managing the development process [41], achieving expected productivity gains [37, 33], and teaching object-oriented concepts [29, 42].

These problems stem from the fact that object-oriented software development is still not an engineering discipline. At a minimum, a software engineering discipline requires an underlying theoretical foundation consisting of formal conceptual models, sound engineering procedures, and means for assessing quality. This dissertation moves object-oriented software development closer to a true engineering discipline by establishing an initial theoretical foundation for a model-driven approach to object-oriented systems analysis and design.

Before elaborating on the problem and our approach, we present some background information in the following sections. Section 1.2 presents a brief overview of object-oriented software development. Section 1.3 provides a summary of analysis and design goals. Section 1.4 discusses requirements for a software engineering discipline. Then, we introduce our approach in Section 1.5 and outline the rest of the dissertation in Section 1.6.

1.2 OBJECT-ORIENTED SOFTWARE DEVELOPMENT

Object-oriented software development focuses on the form and behavior of problem-domain entities. In contrast, traditional software-development paradigms force engineers to think about software systems in terms of the computational machines they run on. Object orientation shifts an engineer's focus from the machine that does the work to the problem domain [23]. For example, consider a software engineer who is developing a drawing application that allows users to place geometric shapes on the screen. With a traditional approach, the engineer would think about sequences of instructions (procedures) that tell the computer how to draw geometric shapes. With an object-oriented approach, the engineer would identify and describe geometric shapes, organize a software system around these entities, and then build software components that encapsulate their form and behavior. Throughout the development process, the engineer is able to relate the software components directly to entities in the problem domain.

The core concepts of object orientation include object identity, encapsulation, classification, abstraction, modularity, hierarchy, inheritance, typing, persistence, and concurrency [38, 49, 54, 55]. Of these, object identity, abstraction, and encapsulation are touted most often as the essence of object orientation. Object identity is the idea that entities exist independently of the values they hold or represent. Abstraction is the definition of services provided by an object that allow other entities to interact with it. Abstraction enables engineers to view an object in terms of its services without having to understand its details. Encapsulation is the enforcement of abstraction boundaries, or what Parnas called information hiding [39]. When an object encapsulates data and behavior, it prevents outside objects from directly affecting that data and behavior. All interactions with the object must occur through services defined by the abstraction. Further discussion of object identity, abstraction, and encapsulation, as well as the other core concepts, can be found in [4, 27, 55].

Object orientation in software is not new. Its roots date back to the 1960s, when researchers in artificial intelligence introduced objects as a way to represent knowledge [43]. In 1967, Dahl and Nygaard picked up on this idea and developed the Simula programming language, which many authors credit as being the first object-oriented programming language [23, 29, 43]. Simula inspired other object-oriented languages, including Smalltalk, Eiffel, and Ada [29]. As the popularity of object orientation grew, contemporary languages, such as C, Pascal, Cobol, and Lisp, were retrofitted with object-oriented constructs. Booch provides a summary of popular object-oriented programming languages and their genealogy [4].

In addition to programming languages, object orientation independently evolved through the 1970s and 1980s in two other areas: database systems and software development methods [29]. Theoretical research on object-oriented database systems is relatively scarce compared to other types of database systems, such as relational and hierarchical. The research that does exist is disjoint and based on a wide variety of objectives and terminology. Despite the lack of a solid theoretical foundation, there are several successful object-oriented database systems. Some of the more notable ones include O2, ObjectStore, Gemstone, Starburst, and POSTGRES [5]. O2, ObjectStore, and Gemstone have close ties to object-oriented programming languages, whereas Starburst and POSTGRES evolved from database languages.

Object-oriented analysis and design methods also sprang into existence in the 1980s without solid theoretical underpinnings. Among the more popular methods are Object-oriented Analysis (OOA) [10], Object-oriented System Analysis/Object Life Cycle [46, 47], Object-oriented Structured Design (OOSD) [53], Object Oriented Requirements Specification (OORS) [2], Object Modeling Technique (OMT) [45], Booch's design method [4], and Hierarchical Object Oriented Design (HOOD) [22]. Since these methods and their corresponding conceptual models lack a theoretical foundation, their semantics are informal and sometimes inconsistent.

Since object orientation has evolved independently in several areas and lacks a commonly accepted theoretical foundation, there is considerable confusion and controversy regarding fundamental concepts. For example, a recent article gives five different definitions for an object, three definitions for encapsulation, and two definitions for inheritance [49]. Other "overloaded" terms include class, type, state, and property. Currently, developers are left to resolve the differences and ambiguities for themselves and then figure out how to apply the concepts to their own projects.

An initial theoretical foundation that includes formal definitions for core concepts would improve knowledge transfer among practitioners and theoreticians. Currently, practitioners and theoreticians alike spend excessive time and effort in the workplace, at professional conferences, and in other forums debating core concepts before they can begin communicating about real problems. Formal definitions would minimize the time and effort required to establish common ground. Formalism can lead to more precise and unambiguous communication, which, in turn, leads to more efficient knowledge transfer, higher quality software, and more productive engineering practices.
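The drawing-application example and the three concepts singled out in this section (object identity, abstraction, and encapsulation) can be illustrated with a minimal sketch. The class names `Shape` and `Circle`, their methods, and the use of Python are assumptions made purely for illustration; they are not part of OSA or of any method cited above.

```python
import math


class Shape:
    """Abstraction: the services a shape offers to other objects (hypothetical)."""

    def area(self) -> float:
        raise NotImplementedError

    def move_by(self, dx: float, dy: float) -> None:
        raise NotImplementedError


class Circle(Shape):
    def __init__(self, x: float, y: float, radius: float) -> None:
        # Encapsulation: the leading underscores mark internal state.
        # Python enforces this by convention only, not by the language.
        self._x = x
        self._y = y
        self._radius = radius

    def area(self) -> float:
        return math.pi * self._radius ** 2

    def move_by(self, dx: float, dy: float) -> None:
        self._x += dx
        self._y += dy

    def position(self) -> tuple:
        return (self._x, self._y)


a = Circle(0.0, 0.0, 1.0)
b = Circle(0.0, 0.0, 1.0)

# Object identity: a and b hold the same values but are distinct objects.
same_values = (a.position(), a.area()) == (b.position(), b.area())
same_object = a is b

# Interaction happens through a service of the abstraction,
# and it affects only the object it is sent to.
a.move_by(2.0, 3.0)
```

Here `same_values` is true while `same_object` is false, which is the distinction object identity draws, and moving `a` leaves `b` where it was.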

1.3 SOFTWARE SYSTEMS ANALYSIS AND DESIGN A theoretical foundation for software development must support engineering activities for analysis and design, especially for the object-oriented paradigm, because abstractions developed during analysis and design are the abstractions used throughout the development process. Thus, before we establish a software-engineering foundation, we must understand the purpose and informational content of analysis and design. Software systems analysis is a process of exploring, understanding, and communicating the salient features of a problem domain. The information gathered during analysis feeds subsequent development activities, such as design. Software systems design is a process of organizing information in a solution domain [35]. There is nothing inherent about analysis and design that requires all analysis to be complete before design can begin. More often, the opposite is true; developers interleave analysis and design activities during the early phases of a development project. Object-oriented systems analysis focuses on identifying and classifying objects, relationships among objects, object behavior, and object interactions. Object-oriented systems design deals with organizing these components into encapsulations with well-defined abstraction boundaries. Engineers must be able to perform these two activities in concert with each other. A theoretical foundation for object-oriented software engineering must include a conceptual model for the information acquired during analysis and design. Analysis involves information pertaining to the form of problem domain entities and their relationships, how they behave in response to stimuli, and how they interact with each other. Design adds information about abstractions and encapsulations. The more complete, consistent, and formal the conceptual model is, the more precise and unambiguous engineers can be in their description of analysis and 5

design information. The challenge is to make it easy for engineers to describe the problem domain and organize information in the solution domain while retaining the ability to be precise and unambiguous. Most informal models are easy to use, but lack the ability to represent real-world problems accurately. For example, OOA and OMT describe object behavior with states and instantaneous transitions [10, 45]. An object can also perform actions as it transitions from one state to another state. Since transitions are instantaneous, the action must also be instantaneous. While this may be theoretically possible, it is inaccurate with respect to what usually takes place in real-world systems. Formal models, on the other hand, are typically more precise, but incomplete and hard to use. Several formal models have been proposed, including: F-Logic [25], O-Logic [26, 32], C-Logic [6], and COL [1]. Unfortunately, these models are incomplete with respect to behavior and interaction information. They are also difficult to use because they require in-depth knowledge of their underlying formalism. To achieve the combined the advantages of informal and formal models, a theoretical foundation must include a conceptual model for analysis and design information that is complete, consistent, formal, and easy to use. 1.4 SOFTWARE ENGINEERING In additional to a conceptual model for information, an engineering discipline requires quality assessment mechanisms, sound engineering procedures, and means for effectively disseminating knowledge. Since the ultimate goal of engineering is to produce quality products, it stands to reason that we must have ways to assess quality. For software engineering, this means that we must know what is good (or bad) about analysis and design model instances. More 6

precisely, there must be a formal definition of desirable properties for the information captured in the conceptual model. The desirable properties must represent those characteristics that ultimately lead to more efficient, usable, maintainable, and extensible software products. Booch and Rumbaugh informally discuss several desirable properties, including: coupling, cohesion, sufficiency, completeness, and primitiveness [4, 45]. However, without formal definitions and supporting tools, developers have a difficult time applying these concepts in general. Embley and Woodfield provide a semi-formal treatment of coupling and cohesion for abstract data types [16, 17]. Although their approach is semi-formal, they do not deal with inheritance, which has a definite impact on the coupling and cohesion of objects. Identifying and formally defining desirable properties is only half of what is necessary for quality assessment. The other half involves defining metrics that allow engineers to measure the properties in a given model instance accurately. Chidamber and Kemerer propose seven metrics for object-oriented design, two of which address coupling and cohesion [7]. The other five attempt to estimate the complexity of an object class and its overall influence in the system. Except for Chidamber s and Kemerer s work, very little research has been published in this area [28]. Quality assessment techniques lead to sound engineering procedures. A sound engineering procedure is an activity, formally defined in terms of a conceptual model, that produces a model instance with desirable properties. With sound engineering procedures, developers can systematically and repeatedly achieve good results, which in turn, enable managers to plan and track development projects accurately. Finally, an engineering disciple must support effective means for disseminating knowledge among researchers and practitioners. 
A theoretical foundation would enable this to occur if it included: a complete, consistent, formal, and easy-to-use conceptual model; quality assessment 7

mechanisms; and engineering procedures. Currently, there is no commonly accepted theoretical foundation with these qualities. As a result, technology transfer for software engineering is a major concern for many organizations today [28]. Universities still teach software engineering as more of a craft than an engineering discipline. 1.5 A THEORETICAL FOUNDATION FOR A MODEL-DRIVEN APPROACH This dissertation proposes an initial theoretical foundation for a model-driven approach to object-oriented systems analysis and design. The foundation includes the formal definition of a conceptual model, a desirable property, a metric, and several transformations that yield model instances with the desirable property. We refer to the foundation as a model-driven approach because it centers on a conceptual model. In contrast, a process-driven approach would focus on sequences of development activities. The foundation s conceptual model, called Object-oriented Systems Analysis (OSA), was originally developed by Embley, Kurtz, and Woodfield at Brigham Young University in cooperation with Hewlett Packard [14]. Although OSA started out strictly as an analysis model, we also use it with some minor extensions for design. When we use OSA with the design extensions, we refer to it as Object-oriented Systems Design (OSD). OSA provides modeling components for describing information acquired during analysis and design. These components include objects, object classes, relationships, relationship sets, states, transitions, interactions, and constraints. OSA s modeling components are conceptual building blocking that developers can use to describe complex systems of concurrent objects. Because we are pursuing a model-driven approach, we do not prescribe the order or way in which 8

developers use these components to describe a system. Instead, we provide an understanding of the core concepts and how they relate to each other by formally defining their syntax and semantics. That the definitions are formal does not imply that developers have to work at a formal level. The formal definitions enable developers to be precise when necessary. Formalism can also stimulate refinement activities that lead to unambiguous descriptions.

One of the challenges in developing a conceptual model for object-oriented software engineering is that it must meet the needs of both theoreticians and practitioners. Theoreticians want to explore the core concepts of object orientation, identify desirable properties, define metrics, and develop sound engineering procedures. To this end, they require a rigorous and formal conceptual model. On the other hand, practitioners want to develop quality software systems efficiently. They require a conceptual model that has enough expressive power to describe their problem and solution domains in a straightforward and natural way.

Our approach is to accommodate both theoreticians and practitioners by supporting tunable formalism, which allows concept-model users to vary their awareness of the underlying formalism without compromising expressiveness or accuracy. With tunable formalism, both theoreticians and practitioners can successfully use the same conceptual model. This can enable and stimulate increased interaction between research and practice.

1.6 OVERVIEW OF THE DISSERTATION

In Chapter 2, we define the requirements of tunable formalism, including expressiveness, formalism, and the ability to vary levels of detail. To show how OSA meets these requirements, we first briefly describe OSA's submodels and their primary model components. Next, we provide

an overview of how we formally defined OSA's semantics using a temporal, first-order logic language. We also summarize how the formal definition provides answers to fundamental, practical questions related to object-oriented software development. Finally, we show how OSA allows varying levels of abstraction and completion so that engineers, tool builders, and researchers can tune the formalism to suit their needs.

Chapters 3-6 contain the formal definition of OSA. Chapter 3 provides an initial set-based definition for OSA's syntax. Chapter 4 defines a two-sorted, first-order logic language with temporal semantics, called Object-oriented Systems Modeling Logic (OSM-Logic). Using the set-based syntax definition and OSM-Logic, we formally define the semantics of OSA in Chapter 5. Once OSA's semantics are formally defined, we can use OSA as a formal descriptive mechanism. In Chapter 6, we use OSA itself to create an alternate syntax definition, called the OSA-based syntax definition, and show that it is atemporally equivalent to the original set-based syntax definition.

In Chapter 7, we introduce and formally define a way to assess the quality of object classes in analysis and design models, called object-class congruency. Object-class congruency is based on the idea that immediate and inherited properties defined for an object class should match the common properties of the class's members. In addition, we define a congruency metric and several semantic-preserving transformations that convert incongruent classes into congruent classes. We also explain why object-class congruency leads to better abstractions for real-world concepts and to better implementation, extension, and reuse.

In Chapter 8, we summarize the results of the dissertation and discuss future research directions.

CHAPTER 2 TUNABLE FORMALISM IN OBJECT-ORIENTED SYSTEMS ANALYSIS AND DESIGN: MEETING THE NEEDS OF BOTH THEORETICIANS AND PRACTITIONERS [8]

2.1 INTRODUCTION

The use of formalisms in software development is controversial. Theoreticians claim that formalisms ensure accurate understanding in everything from analysis to verification and are necessary for establishing a sound theoretical foundation for engineering practices. Practitioners believe that formalisms hinder productivity and do not provide the means for constructing real-world applications in a reasonable amount of time. This chapter shows how we meet the needs of both theoreticians and practitioners by meeting the requirements of tunable formalism for our concept model, Object-oriented Systems Analysis (OSA).

Intuitively, tunable formalism means that model users can work with different levels of formalism, ranging from informal to mathematically rigorous. A software model with tunable formalism must (a) be sufficiently expressive for practitioners, (b) have a formally defined syntax and semantics, and (c) allow various levels of detail and completion.

An object-oriented software model is sufficiently expressive if it provides modeling components for describing objects, classes of objects, generalization and aggregation hierarchies, relationships among objects, object behavior, object interactions, and constraints. OSA provides a complete set of modeling components for describing these fundamental object-oriented concepts, which we present in a brief overview in Section 2.2. OSA modeling components are like those

provided by other object-oriented analysis models, such as Object Modeling Technique (OMT) [45], Object-oriented Analysis (OOA) [10], and Object-oriented Systems Analysis/Object Life Cycles (OOSA/OLC) [46, 47]. Unlike these other models, however, OSA model components are formally defined.

A formal definition of both the syntax and the semantics of an object-oriented software model is necessary to guarantee accurate understanding and communication. In this way, OSA is similar to other formal, object-oriented models that are based on a logic language, such as F-Logic [25], O-Logic [26, 32], C-Logic [6], and COL [1]. However, these models do not support engineering at an informal level and have limited or no means for describing object behavior. In Section 2.3, we explain how OSA is formally defined. We then discuss, in Section 2.4, some of the benefits of the formal definition.

The last requirement for providing tunable formalism is that the model allows varying levels of detail and degrees of completion. By levels of detail, we mean that higher-level components can be described in more detail with lower-level components, as we do for processes in data flow diagrams. Component leveling should be available for all object-oriented modeling concepts, including object classes, relationship sets, states, transitions, and interactions. The semantics of these components should be uniform throughout the leveling hierarchy. By varying degrees of completion, we mean that model components need not be fully described as they are added to a model instance. As more information becomes available, and as it becomes desirable to execute or simulate a model instance, descriptions may be completed. In Section 2.5, we explain how OSA allows varying levels of detail and degrees of completion.

2.2 OSA

Any analysis model that provides tunable formalism must have enough expressiveness for practitioners to productively describe real-world or conceptual systems. We now present an overview of OSA, which fulfills this requirement by providing a robust set of modeling components for object-oriented concepts. The full details of OSA are presented in [14]. Readers already familiar with OSA may skip to Section 2.3.

OSA has three sub-models: the Object-Relationship Model (ORM), the Object-Behavior Model (OBM), and the Object-Interaction Model (OIM). An ORM instance describes the objects in a system and their relationships with each other. An OBM instance consists of object-behavior descriptions for members of various classes of objects. An OIM instance describes interactions between objects. We represent an OSA model instance with one or more diagrams. Although a single diagram typically focuses on components from one sub-model, it may contain components from all three.

2.2.1 Object-Relationship Model (ORM)

ORM components describe objects, object classes, relationships, relationship sets, and constraints. Figure 2.1 shows a portion of an ORM instance for a pizza ordering system. The rectangles represent object classes, and the lines represent relationship sets. An object class is a set of objects that share common properties or behavior. An object is any identifiable entity. A relationship links two or more objects. A relationship set is a set of relationships such that each associates objects from the same collection of object classes. In Figure 2.1, the line

between Order and Total Price represents a relationship set consisting of relationships between order objects and total-price objects. The lines connecting Discount, Time Frame, and Pizza represent a ternary relationship set. Relationship sets with more than two connections use diamonds in their representation.

Figure 2.1 Sample ORM

Figure 2.1 shows two kinds of constraints that apply to relationship sets: participation constraints and co-occurrence constraints. The min:max pairs or single integers near the connections of object classes to relationship sets are participation constraints. A participation constraint restricts the number of times an object in the connected object class can participate in the relationship set. For example, the 1:20 near the Order object class specifies that an order object is related to between one and twenty pizza objects. The Pizza Time Frame ---(1:2)- Discount is a co-occurrence constraint, which says that each (pizza, time-frame) pair associates with either

one or two discounts.

ORM components include three special kinds of relationship sets: generalization/specialization, aggregation, and association. The open triangle in Figure 2.1 represents a generalization/specialization, which says that the object classes Large Pizza, Medium Pizza, and Small Pizza are all specializations of the object class Pizza. The set of objects in a specialization class is a subset of the set of objects in its generalization class. A generalization/specialization may have constraints, including mutual exclusion, union, and partition. A mutual exclusion constraint, represented by a "+", states that the subset object classes are mutually exclusive. A union constraint, represented by a "∪", states that the generalization object class is a union of all the specialization object classes. A partition constraint, represented by a "⊎", combines a mutual exclusion constraint with a union constraint. Thus, the generalization/specialization constraint in Figure 2.1 states that all pizzas are in exactly one category: large, medium, or small.

The solid triangle in Figure 2.1 represents an aggregation. An aggregation describes the composition of the objects in some object class. Figure 2.1 describes a pizza as consisting of a crust, a sauce serving, cheese servings, and topping servings.

The relationship set in Figure 2.1 connecting Order and Pizza has a star on the Order side. This relationship set represents an association, which describes the objects in an object class, called the set class, as sets of objects from another object class, called the member class. Here, an order is a set of one to twenty pizzas.

ORM components also include general constraints that can further restrict membership in object classes and relationship sets. We can write general constraints in a natural language, in an object-oriented programming language, or in a first-order logic language.
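The participation and partition constraints described above can be read as simple checks over sets of objects and relationships. The following is a minimal sketch under stated assumptions, not part of OSA's formal definition: the function names, the tuple encoding of relationships, and the sample order and pizza identifiers are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def check_participation(relationships, objects, position, minimum, maximum):
    """Return (sorted) the objects whose participation count is outside minimum:maximum.

    relationships: set of tuples, one tuple per relationship (hypothetical encoding)
    position: index of the connected object class within each tuple
    """
    counts = Counter(rel[position] for rel in relationships)
    # Counter returns 0 for objects that never participate.
    return sorted(obj for obj in objects
                  if not (minimum <= counts[obj] <= maximum))


def is_partition(generalization, specializations):
    """True if the specializations are mutually exclusive AND their union
    equals the generalization class (mutual exclusion plus union)."""
    union = set().union(*specializations)
    mutually_exclusive = sum(len(s) for s in specializations) == len(union)
    return mutually_exclusive and union == set(generalization)


# Hypothetical data loosely following the pizza ordering example.
orders = {"order1", "order2"}
order_pizza = {("order1", "pizzaA"), ("order1", "pizzaB")}

pizzas = {"pizzaA", "pizzaB", "pizzaC"}
large, medium, small = {"pizzaA"}, {"pizzaB"}, {"pizzaC"}

# order2 is related to no pizzas, so it violates the 1:20 participation constraint.
violations = check_participation(order_pizza, orders, 0, 1, 20)
partition_ok = is_partition(pizzas, [large, medium, small])
```

In this sketch, a union constraint alone would correspond to checking only `union == set(generalization)`, and a mutual exclusion constraint alone to checking only `mutually_exclusive`.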
General constraints, as well as all other types of constraints, may contain variables. The x+y>1 constraint in Figure 2.1 is an example of a general constraint with variables. It, along with the x:2 and y:15 participation

constraints on the aggregation, ensures that a pizza has at least either a cheese serving or a topping serving and allows up to 2 cheese servings and 15 topping servings.

High-level object classes and high-level relationship sets are complex object classes and relationship sets described in more detail by other ORM components. In Figure 2.1, the Time Frame object class is high level. Figure 2.2 shows its details. The relationship set between Time Frame and End says that each time-frame object is composed of one or more end objects, each of which is related to one or more start objects. Both start and end objects are made up of date and time objects. Defined in this way, the time frame of a discount can be more than just a simple continuous span of time. For instance, we can represent "Every Wednesday in March from 4:00 pm to 8:00 pm" as a time frame.

Figure 2.2 Details of a high-level object class

2.2.2 Object-Behavior Model (OBM)

An OBM instance describes the behavior of objects in a system. It consists of a collection of state nets, each of which defines the behavior for the members of an object class. Figure 2.3 shows a state net for the Order object class. The primary building blocks for state nets are states and transitions. We represent states with rounded boxes and transitions with rectangles with a horizontal dividing line. A state is an

abstraction of an object's status, phase, mode, or situation. An object may be in several different states at any time. For example, an order object may be in the Customer Waiting and the Unpaid states at the same time.

Figure 2.3 Sample state net

Transitions are the means by which objects leave and enter states. A transition consists of a trigger and an optional action. Triggers, which are in the top half of transition rectangles, describe the conditions or events that cause a transition to fire. Actions, which are in the bottom half, describe what an object does when a transition fires. A transition is enabled when an object is in all states of a conjunction of prior states. Usually, this amounts to an object being in one of the prior states, since prior-state conjunctions often include only one state. An enabled transition fires when the trigger holds. After firing and performing the transition's action, an object enters all states of a conjunction of subsequent states, which often consists of just one subsequent state. Transition [1] in Figure 2.3 says that if an order object is in the Open state or

in both the Customer Waiting and Unpaid states, then if the Cancel event occurs, the object leaves its prior state(s) and enters the Canceled state.

The a and b in Figure 2.3 identify a path over which there is a time constraint. The time constraint (a to b) ≤ 20 minutes specifies that the time from entering the Paid state to entering the Completed state should not be greater than 20 minutes. In addition to paths, time constraints can restrict the time spent in a state, transition, trigger, or action. Like general constraints, time constraints can be described in a natural language, an object-oriented programming language, or a first-order logic language.

High-level states and high-level transitions are states and transitions described by other state nets. As indicated by the shading, the Open state in Figure 2.3 is a high-level state. Figure 2.4 shows its details. When an order object is in the Open state, it is either waiting for a request or in transition [7], processing a request. We use high-level states to describe states that include actions to be performed while an object is in that state. We use high-level transitions to describe transitions with intermediate steps.

Figure 2.4 Details of a high-level state

Like many other behavior models based on states and transitions, such as Harel Charts [20], OMT State Diagrams [45], Object Charts [3, 21], and Petri-Nets [40], OBM state nets can describe intra-object concurrency. When a transition completes,

an object may enter more than one state. For example, Figure 2.3 specifies that an order enters both the Customer Waiting and Unpaid states after it is confirmed. However, unlike these other behavior models, OBM transitions are not necessarily instantaneous. This allows state nets to more realistically describe the behavior of objects in real-world systems. For example, in an actual pizzeria, it takes time for an order to transition from the Ready and Paid states to the Completed state, because the pizzas need to be given to the customer. We model this transitional action directly and naturally as part of transition [5]. With the other behavior models, we would have to either assume that this action is instantaneous or create an intermediate state. Assuming all transitional actions are instantaneous is inconsistent with the real world, and creating intermediate states to describe transitional actions increases complexity.

2.2.3 Object-Interaction Model (OIM)

An OIM captures information about interactions between concurrent objects. OIM components include interactions and various types of constraints. An interaction is a rendezvous between origin and destination objects. Like an Ada rendezvous, an OIM interaction can take time to complete while the origin and destination objects exchange information. Either the origin or destination of an interaction may be outside the model instance, and therefore, not explicitly described. Figure 2.4 shows an interaction from an origin outside the model instance to an order object in transition [7].

We can associate activity descriptions and object descriptions with interactions. Pizza Request is the activity description for the interaction in Figure 2.4. An activity description

identifies the type or purpose of the interaction. Items listed in parentheses are object descriptions. Object descriptions identify information to be communicated during an interaction.

We can associate a variety of constraints with an interaction, including origin constraints, destination constraints, and time constraints. An origin constraint restricts the set of potential objects that can initiate an interaction. Similarly, a destination constraint limits the set of potential objects that can receive an interaction. A time constraint restricts the amount of time it takes to complete an interaction. As with all other types of constraints, we may use variables in the description of these constraints.

OIM components also include high-level interactions, described by other OSA components. For example, we can represent buffered communication by a high-level interaction that we, in turn, describe by low-level interactions with the buffer. We would describe the behavior of the buffer with a state net.

Even though we describe OIM here as the third OSA sub-model, we often start developing an OSA model instance by first creating an OIM instance that provides a high-level overview of a system. We use high-level OIM components to summarize information flows, event sequences, and sub-system interactions. When more detail is required, we use lower-level OIM components to describe communication between individual objects. In this way, OIM fulfills the role of event traces and event flow diagrams in OMT [45], as well as data flow diagrams and Object Communication Diagrams in OOSA/OLC [46, 47].

2.3 FORMAL DEFINITION OF OSA

For a model to be formally tunable it must provide the means for users to work at any

level of formalism. To satisfy theoreticians, we must therefore provide a formal definition even though practitioners might not generally be interested. In this section, we provide a brief overview of how we formally define the semantics and syntax of OSA. Chapters 3-6 provide much more detail, and the full definition is in [9].

2.3.1 OSA Semantics

We formally define the semantics of OSA using a two-step approach, as Figure 2.5 shows.

Figure 2.5 Two-step formal semantic definition

The first step converts an OSA model instance to a temporal, first-order logic language, called Object-oriented Systems Modeling Logic (OSM-Logic). The conversion provides a set of formulas

that serves as an intermediate description for the meaning of the OSA model instance. The second step is the formal interpretation of this set of formulas using mathematical structures. A formal interpretation consists of a mapping of the language's symbols to objects, points in time, functions, and relations in a mathematical structure. An interpretation for a set of formulas is valid if and only if the formulas are true in the mathematical structure. Given an OSA model instance, we formally define its semantics as the set of all valid interpretations for the set of formulas resulting from the conversion of the model instance to OSM-Logic.

With the two-step approach, we are able to capture the essence of the OSA semantics at a high level of abstraction, namely in the conversion algorithm to OSM-Logic. An alternative, but unwieldy, approach is to define a formal interpretation of an OSA model directly, skipping the conversion to OSM-Logic. With this approach, the interpretation would have to map 215 different types of components and component relationships to elements in the mathematical structure. In contrast, the two-step approach allows us to capture the complexities of the semantics in the conversion algorithm to OSM-Logic, while leaving the formal interpretation relatively simple. It requires that we map only four types of components to elements in the mathematical structure. In addition, since OSM-Logic is a first-order logic language, we can take advantage of a large volume of existing theory.

It is not our intention in this chapter to give the full definition of the OSM-Logic language or the details of the OSA formal definition. Instead, we wish to show how the semantics are defined, as well as explain how we can use this definition to simplify the formal definition of the OSA syntax. Complete definitions for OSM-Logic, OSA semantics, and OSA syntax are found in [9].
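To make the shape of the two-step approach concrete, the following toy sketch may help. It is not the conversion algorithm or the OSM-Logic language defined in [9]: the dictionary encoding of a model instance, the `convert` and `is_valid` functions, and the sample structures are hypothetical, and the sketch ignores time entirely. It only illustrates the division of labor, in which step one produces formulas and step two checks whether an interpretation makes all of them true.

```python
def convert(model):
    """Step 1 (toy stand-in): turn a model instance into a list of named formulas.

    Each formula is a function that takes a structure (here, a dict mapping
    object-class names to sets of objects) and returns True or False.
    """
    formulas = []
    for sub, sup in model["specializations"]:
        # "Every member of the specialization is a member of the generalization."
        formulas.append((f"{sub} is a subset of {sup}",
                         lambda st, sub=sub, sup=sup: st[sub] <= st[sup]))
    return formulas


def is_valid(structure, formulas):
    """Step 2: an interpretation is valid iff every formula is true in it."""
    return all(check(structure) for _, check in formulas)


# Hypothetical model instance with two specializations of Pizza.
model = {"specializations": [("Large Pizza", "Pizza"), ("Small Pizza", "Pizza")]}
formulas = convert(model)

good = {"Pizza": {"p1", "p2"}, "Large Pizza": {"p1"}, "Small Pizza": {"p2"}}
bad = {"Pizza": {"p1"}, "Large Pizza": {"p1"}, "Small Pizza": {"p2"}}  # p2 is not a Pizza
```

Under this reading, the semantics of the toy model instance would be the collection of all structures for which `is_valid` returns True; `good` belongs to it and `bad` does not.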
Once we have discussed the formalization of OSA, we also wish to discuss, in later sections of this chapter, the benefits of the formalism for both practitioners and theoreticians. Consider an OSA model instance that consists of only a Pizza object class. What does

it mean for an object to become a pizza? Intuitively, we think of ingredients being put together and baked over a period of about 15 minutes. Now, how do we reflect this intuitive understanding in the semantics of an OSA model? Since we see that our semantics must handle objects and their changes over time, a natural choice is a two-sorted logic. Thus, OSM-Logic is two-sorted, where we assign one sort to points in time and the other to objects in the universe. (Although any multi-sorted logic can be reduced to a single-sorted logic [13], we choose a two-sorted logic for the convenience it provides in keeping time points separate from other objects.)

To give OSM-Logic temporal semantics, we add two restrictions to the standard definition of an interpretation. First, we restrict relations to include exactly zero, one, or two arguments of the time sort. Relations that do not have any time arguments are time-invariant relations. Relations that have one time argument are events that happen instantaneously. Relations that have two time arguments are called temporal relations. For a tuple in a temporal relation, the two time points define a time interval over which the other objects in the tuple are related. Second, if a tuple is a member of a temporal relation for some set of objects, then so are other tuples that have the same set of objects and have time intervals that are sub-intervals of the original. This restriction guarantees that if certain objects are related over a time interval, then they are related for any of its sub-intervals.

Predicate symbols in OSM-Logic represent relations. For example, the Pizza(_, _, _) predicate symbol represents a temporal relation describing objects in the Pizza object class. The first place in this predicate symbol corresponds to an object, the pizza. The other two places form a time interval over which the object is a member of the Pizza object class.
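The sub-interval restriction on temporal relations can be illustrated with a small sketch. The encoding is an assumption made for illustration only: time points are modeled as integers (minutes), intervals are closed with start no later than end, and a temporal relation is a set of (object, start, end) tuples. The function names and the sample pizza data are hypothetical.

```python
def holds(temporal_relation, obj, start, end):
    """True if obj is related over [start, end].

    Only maximal intervals need to be stored: any sub-interval of a stored
    interval is treated as belonging to the relation as well.
    """
    return start <= end and any(
        o == obj and s <= start and end <= e
        for (o, s, e) in temporal_relation
    )


def close_under_subintervals(temporal_relation):
    """Explicitly add every integer sub-interval of every stored interval."""
    return {
        (o, s2, e2)
        for (o, s, e) in temporal_relation
        for s2 in range(s, e + 1)
        for e2 in range(s2, e + 1)
    }


# Hypothetical example: p1 is becoming a pizza during minutes 0 to 15
# and is a pizza during minutes 15 to 60.
becoming_pizza = {("p1", 0, 15)}
pizza = {("p1", 15, 60)}

inside = holds(pizza, "p1", 20, 30)    # [20, 30] lies within [15, 60]
outside = holds(pizza, "p1", 10, 30)   # starts before p1 is a pizza
```

Storing only maximal intervals and answering sub-interval queries on demand is one design choice; `close_under_subintervals` shows the equivalent explicit closure, which is only practical for small, discrete time ranges.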
Similarly, the Becoming Pizza(_, _, _) predicate symbol represents a time interval over which an object is becoming a pizza. The conversion algorithm expresses the semantics of becoming a member of the Pizza