A PARSING APPROACH FOR SYSTEM BEHAVIOUR MODELING

IADIS International Conference Applied Computing 2007
ISBN: 978-972-8924-30-0, 2007 IADIS

Lau Sei Ping, Wee Bui Lin, Nurfauza bt Jali
Faculty of Computer Science and Information Technology
Universiti Malaysia Sarawak

ABSTRACT

Software maintenance becomes more complex and expensive as a system evolves, especially during iterative development where the design documents have not been updated to reflect code changes. As a result, software maintainers devote most of their effort to understanding the flow of the system they are working on. Reverse engineering tools facilitate this process by reproducing the design model, which gives the maintainer a clearer comprehension of the software's behaviour and architecture, especially through graphical models. The sequence diagram is one of the essential UML artefacts: it depicts the object interactions in the system and provides a self-documenting communication medium among project team members, especially developers and designers. This paper proposes a data extraction approach that maps program behaviour, reflecting the actual program design, through a reverse engineering process. A parsing technique is used throughout the data extraction process to transform source code into a tree data structure before it is analysed. The approach is demonstrated on static Java source code.

KEYWORDS

Reverse Engineering, Modelling, Parsing Technique, Sequence Diagram

1. INTRODUCTION

Software documentation is essential in the software development process: it gives all stakeholders of the system better comprehension and a point of reference, particularly during software maintenance. Without proper documentation, maintenance tasks can become very complex and expensive as the system evolves through repeated development cycles, especially where the design documents have not been updated to reflect the actual code changes. Software maintainers then devote most of their effort to understanding the flow and architectural design of the system they are maintaining. Reverse engineering tools were created to address this problem by reproducing the design model. The aim of reverse engineering is to reproduce the design model from the software itself, offering programmers a high-level presentation of the program and ensuring consistency with the actual implementation (Systä, 1999). This gives the software maintainer a clearer comprehension of the software's behaviour and architecture, especially through graphical models.

The sequence diagram is one of the essential UML artefacts: it depicts the object interactions in the system and serves as a self-documenting communication medium among project team members, especially developers and designers. Such diagrams capture important aspects of the object interactions and can naturally be used to define the testing goals that must be achieved during software testing (Rountev et al., 2005). To regenerate a sequence diagram, data extraction is one of the key components of reverse engineering, because it draws out the information relevant to the system's behaviour.

This paper presents a parsing approach to data extraction and analysis that maps the program behaviour reflecting the actual program design and presents it through a graphical model, namely the sequence diagram. The data extracted from static Java source code is converted into a tree data structure before it is analysed. This work concentrates only on reversing Java source code into a parsed graphical representation, and it operates on static source code.

2. RELATED WORKS

Researchers have introduced various methods and approaches for reverse engineering system behaviour into sequence diagrams. For instance, Quigley et al. used runtime results to identify the interactions between objects and classes in a program based on memory allocation (Quigley et al., 2000). The resulting graphical representation, expressed in memory numbers, was not very readable or easy for readers to comprehend. Rountev et al. used control flow analysis to model the flow of messages passed between objects (Rountev et al., 2004), and this work was able to show the behaviour of a method clearly. The same authors also produced highly precise object naming for mapping existing code to sequence diagrams through static analysis (Rountev et al., 2005). However, it could support only a portion of the program and was restricted to singleton method calls. Another method compacts repetitions in the dynamic information collected during the execution of object-oriented programs to generate the sequence diagram (Koji et al., 2005). It introduces four compaction rules that reduce the amount of method call trace information by abstracting repetition patterns and recursive calls. A similar work uses three techniques, namely filtering, slicing and information hiding, to extract information from a program execution trace at various levels of abstraction in order to generate the sequence diagram (Vasconcelos et al., 2005).
The Java Interactive Visualisation Environment (JIVE) is a prominent reverse engineering research project that takes a different approach to the same problem (Gestwicki et al., 2004). JIVE provides an interactive environment that facilitates teaching and debugging; it can show the relationships between objects, methods and static contexts visually from different views, depending on the user's interests. JIVE also depicts program behaviour at runtime with an interactive sequence diagram to allow a better understanding of program execution, which is also the main focus of our project. However, JIVE-generated sequence diagrams have been shown to be unable to produce a meaningful illustration of system behaviour for various program structures, especially programs with a graphical interface.

3. APPROACH AND METHODS

In this section, we propose an approach for retrieving the UML sequence diagram from Java source code. Two main inputs must be available: (a) the source code and (b) the language grammar. The source code consists of Java source files, either a single file or multiple files that are interrelated by object references. The language grammar is the detailed description of the syntax and organization of a particular programming language, and it is unique to each language. In our case, the Java grammar is used. The parser interprets the source and retrieves the necessary information based on the language syntax and structure defined in the grammar file. The parsing process is iterated several times to collect the different kinds of information required later for constructing the sequence diagram, such as class properties, relationships between classes and methods, method invocations and method lifetimes.

A tree data structure stores and organizes the information extracted during parsing and then serves as the blueprint for reconstructing the sequence diagram. Two major tree structures are constructed during data extraction. The first tree contains the class property information, while the second contains the properties of the invoked methods. As shown in Figure 1, the constructed tree then goes through the second phase of the parsing process. In this phase, the parsing technique is used again to traverse the tree and finally construct the sequence diagram from the stored information.
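To make the first of these two trees more concrete, the following Java sketch models a single class entry as a list of attributes and a list of methods, using the (Visibility, Type, Identifier) and (Visibility, Return_Type, Identifier, Parameter_List) tuple shapes that Section 3.1.2 defines. The names `ClassEntry`, `Attribute` and `Method` are illustrative assumptions and not the authors' implementation; the sample data is class A from Figure 4.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of one entry in the class-property tree. */
final class ClassEntry {

    /** Attribute tuple: (Visibility, Type, Identifier). */
    static final class Attribute {
        final String visibility;
        final String type;
        final String identifier;

        Attribute(String visibility, String type, String identifier) {
            this.visibility = visibility;
            this.type = type;
            this.identifier = identifier;
        }

        @Override
        public String toString() {
            return visibility + " " + type + " " + identifier;
        }
    }

    /** Method tuple: (Visibility, Return_Type, Identifier, Parameter_List). */
    static final class Method {
        final String visibility;
        final String returnType;
        final String identifier;
        final List<String> parameters; // each entry is "Type Identifier"

        Method(String visibility, String returnType, String identifier,
               List<String> parameters) {
            this.visibility = visibility;
            this.returnType = returnType;
            this.identifier = identifier;
            this.parameters = parameters;
        }

        @Override
        public String toString() {
            return visibility + " " + returnType + " " + identifier
                    + "(" + String.join(", ", parameters) + ")";
        }
    }

    final String name;
    final List<Attribute> attributes = new ArrayList<Attribute>();
    final List<Method> methods = new ArrayList<Method>();

    ClassEntry(String name) {
        this.name = name;
    }

    /** Builds the entry for class A of Figure 4 by hand (no parser involved). */
    static ClassEntry figure4ClassA() {
        ClassEntry a = new ClassEntry("A");
        a.attributes.add(new Attribute("private", "B", "b"));
        a.attributes.add(new Attribute("private", "C", "c"));
        a.methods.add(new Method("public", "void", "ma1", new ArrayList<String>()));
        return a;
    }

    public static void main(String[] args) {
        ClassEntry a = figure4ClassA();
        System.out.println("class " + a.name);
        for (Attribute attribute : a.attributes) System.out.println("  " + attribute);
        for (Method method : a.methods) System.out.println("  " + method);
    }
}
```

In the approach described here the entries would be filled by the parser rather than by hand; the sketch only shows what one node of the tree might hold.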

[Figure 1. Logical view of the parsing approach: the source code and the language grammar feed the parsing process, which produces a tree structure; a diagram builder then turns the tree into a sequence diagram.]

3.1 Architecture of Parsing Approach

[Figure 2. The architecture of the proposed parsing approach. Labelled elements: Java source file, File Handling, Java source code stream, GOLD Parser grammar, parser data, Class Property Gathering, Relation Identification, Sequence Diagram Presentation, sequence diagram.]

The proposed approach consists of four major modules: File Handling, Class Property Gathering, Relationship Identification and Presentation, as shown in Figure 2. The parsing technique is applied in the Class Property Gathering and Relationship Identification modules.

3.1.1 File Handling

The File Handling module locates, opens and loads the Java source files for the parsing process. Two types of Java source file are used in this module: the initial source file, which is the target Java source file whose behaviour is to be modelled, and the reference source files, which are the files that the initial file refers to through object creation and method invocation during system operation. Figure 3 illustrates the relationship between the initial file and the reference file in the parsing process.
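As a rough illustration of how reference source files might be located once the referenced class names are known, the Java sketch below looks for `X.java` next to the initial file for each referenced class `X`. The helper name `ReferenceFileLocator`, the one-class-per-file layout and the same-directory lookup are all assumptions of this sketch; the paper does not specify a lookup rule.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: finds reference source files next to the initial file. */
final class ReferenceFileLocator {

    /**
     * Returns the existing files named "<Class>.java" in the directory of the
     * initial file. Assumes each referenced class lives in its own file in
     * that same directory (an assumption of this sketch).
     */
    static List<Path> locate(Path initialFile, List<String> referencedClasses) {
        Path dir = initialFile.toAbsolutePath().getParent();
        List<Path> found = new ArrayList<Path>();
        for (String className : referencedClasses) {
            Path candidate = dir.resolve(className + ".java");
            if (Files.isRegularFile(candidate)) {
                found.add(candidate);
            }
        }
        return found;
    }

    /** Creates A.java, B.java and C.java in a temp directory and resolves B, C, D. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("parsing-sketch");
            Path initial = Files.createFile(dir.resolve("A.java"));
            Files.createFile(dir.resolve("B.java"));
            Files.createFile(dir.resolve("C.java"));
            List<String> names = new ArrayList<String>();
            for (Path p : locate(initial, Arrays.asList("B", "C", "D"))) {
                names.add(p.getFileName().toString());
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // D.java does not exist, so only B.java and C.java are reported.
        System.out.println(demo());
    }
}
```

A class that cannot be found is simply skipped here; a real tool would probably report it, since a missing reference file leaves a gap in the recovered diagram.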

[Figure 3. Relationship between the initial file (methods m1(), m2()) and the reference file (methods m3(), m4()), both Java source files, in the parsing process of system modelling.]

3.1.2 Class Property Gathering

The Class Property Gathering module has two major tasks. First, it provides Java file information to the File Handling module. File information such as the file path and file name is required for locating and opening the reference source files. With this information, the File Handling module can locate and load the reference source files required by the initial source file. Second, the module gathers the data needed to construct the relevant information about the classes involved during the parsing process. This information is part of the UML class diagram requirements, which include a class's attributes, operations and visibility (Systä, 1999).

Assume $CP$ is a finite set of classes identified during the first round of the parsing process and stored in a tree structure. If $E \in CP$, then $E = (A, M)$, where $A$ is a finite set of the Java class's attributes, each of the form (Visibility, Type, Identifier). Visibility is the access modifier of the Java identifier, which is one of the elements of the set {public, private, protected}. Type is the data type of the identifier, which is either a Java primitive type or a reference type. Identifier is the attribute's name assigned by the programmer and must strictly follow Java syntax. $M$ is a finite set of the Java class methods declared in $E$, each of the form (Visibility, Return_Type, Identifier, Parameter_List). Return_Type is the return data type of the method, while Parameter_List contains the parameters to be passed into the method, each of the form (Type, Identifier).

3.1.3 Relationship Identification Process

To construct the sequence diagram, its necessary building blocks need to be identified. The crucial properties to identify during this process are the lifeline of each method call and whether a control structure was involved in deciding which method to invoke within the class. To achieve this, the sequence of method calls needs to be traced by parsing through the Java source code. All the information gathered during this process is stored in a tree structure to ease data retrieval in the next process.

The relation between class X and class Y is defined through method invocation. Suppose X's source file is the initial source file and it invokes methods defined in class Y, whose source file is the reference source file, via an object of class Y declared in X's source file. In this case, class X is said to have a relation on class Y via method invocation within class X. Let $R$ be the finite set of relations of class X on class Y:

$R = \{ (Obj_Y, M_Y) \mid \text{class } X \text{ invokes a method of class } Y \}$

where X and Y are classes defined in $CP$, $Obj_Y$ is an object reference of class Y declared in class X, and $M_Y$ is the method member of Y invoked through $Obj_Y$. Each ordered pair in $R$ is a node in the tree structure, and each node is labelled using the universal address system, as illustrated in Figure 5(b). The lexicographic ordering of the labels represents the sequence of invoked methods in the initial source file and the reference source files.

3.1.4 Sequence Diagram Presentation

In this process, the data structures constructed during the Class Property Gathering and Relationship Identification processes serve as the blueprint for constructing the sequence diagram. Since the information is arranged in a tree structure, a prefix (preorder) tree traversal algorithm is applied for proper tracking and data retrieval. The sequence diagram is drawn from the data flow information and the relationships kept in the tree structure.
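The labelling and traversal described in Sections 3.1.3 and 3.1.4 can be sketched in Java as follows. Each node holds one (object reference, method) pair; the root is labelled 0, the k-th child of the root is labelled k, and the k-th child of a node labelled p is labelled p.k. A preorder traversal then visits the nodes in the order of their labels. The class `CallNode` and its method names are assumptions made for illustration, and the tree is built by hand from the calls in the Figure 4 sample rather than by a parser.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical node of the relation tree: one (object reference, method) pair. */
final class CallNode {
    final String objectRef;  // e.g. "b"; empty for the entry method
    final String method;     // e.g. "mb1"
    final List<CallNode> children = new ArrayList<CallNode>();
    String address;          // universal address, e.g. "1.1"

    CallNode(String objectRef, String method) {
        this.objectRef = objectRef;
        this.method = method;
    }

    /** Records that this method invokes objectRef.method() and returns the new node. */
    CallNode call(String objectRef, String method) {
        CallNode child = new CallNode(objectRef, method);
        children.add(child);
        return child;
    }

    /** Labels the root "0" and every other node by its path of child positions. */
    static void assignAddresses(CallNode root) {
        root.address = "0";
        label(root, "");
    }

    private static void label(CallNode node, String prefix) {
        for (int i = 0; i < node.children.size(); i++) {
            CallNode child = node.children.get(i);
            child.address = prefix.isEmpty()
                    ? String.valueOf(i + 1)
                    : prefix + "." + (i + 1);
            label(child, child.address);
        }
    }

    /** Preorder (prefix) traversal: the node first, then its children left to right. */
    static void preorder(CallNode node, List<String> out) {
        String target = node.objectRef.isEmpty()
                ? node.method
                : node.objectRef + "." + node.method;
        out.add(node.address + " " + target + "()");
        for (CallNode child : node.children) {
            preorder(child, out);
        }
    }

    /** Builds the call tree of the Figure 4 sample and returns the message order. */
    static List<String> figure4Sequence() {
        CallNode ma1 = new CallNode("", "ma1");
        CallNode mb1 = ma1.call("b", "mb1");
        mb1.call("c1", "mc1");
        CallNode mc2 = ma1.call("c", "mc2");
        mc2.call("b1", "mb2");
        assignAddresses(ma1);
        List<String> out = new ArrayList<String>();
        preorder(ma1, out);
        return out;
    }

    public static void main(String[] args) {
        for (String line : figure4Sequence()) {
            System.out.println(line);
        }
    }
}
```

Running `main` lists the calls as 0 ma1(), 1 b.mb1(), 1.1 c1.mc1(), 2 c.mc2(), 2.1 b1.mb2(). If the labels are ever sorted instead of produced by traversal, they should be compared component by component as numbers, since a plain string comparison would place "10" before "2".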

4. IMPLEMENTATION

As suggested in our proposed approach, the File Handling module is handled by the OS API. In our case, it was handled by MS Windows XP and VB6 (Visual Basic 6.0) for rapid implementation of the proposed approach. For the Class Property Gathering and Relationship Identification processes, we use the GOLD parser engine as our first-phase parsing engine, which is capable of parsing text strings and retrieving the information discussed in the previous section. GOLD Parser (GOLD Parsing System, 2006), an acronym for Grammar Oriented Language Developer, is a free and open source parser generator that supports multiple programming languages, including Java. GOLD Parser analyses the syntax and identifies the classes by tokenizing the reserved words, symbols and atoms of the language in the source strings, and then determines whether the token sequence is structurally valid. GOLD Parser is designed with a separate parse engine and grammar files, which are used to derive the table information for the parsed code. This design benefits our project, which requires extracting object information from the source code, such as the relationships between classes and objects and the behaviour of the program. However, the grammar needs to be altered to meet the project's needs, especially in terms of tree graph generation.

5. EXPERIMENTAL RESULTS

An experiment on generating the tree data structure from sample source code was carried out to evaluate the accuracy of data extraction using the parsing technique. The experiment uses the sample source code shown in Figure 4 as the input to the parsing engine.
    // Class A: the entry point ma1() calls into B and C.
    public class A {
        private B b;
        private C c;

        public void ma1() {
            b = new B();
            c = new C();
            b.mb1();
            c.mc2();
        }
    }

    // Class B: mb1() calls C.mc1(); mb2() is empty.
    public class B {
        public void mb1() {
            C c1 = new C();
            c1.mc1();
        }

        public void mb2() {
        }
    }

    // Class C: mc1() is empty; mc2() calls B.mb2().
    public class C {
        private B b1;

        public void mc1() {
        }

        public void mc2() {
            b1 = new B();
            b1.mb2();
        }
    }

Figure 4. Sample source code for the project experiment

[Figure 5. Data structures for data collection and object interaction. (a) Methods collected for each class: Class A: ma1(); Class B: mb1(), mb2(); Class C: mc1(), mc2(). (b) Levelled tree that represents the relation of object interactions: 0 ma1(); 1 b.mb1(); 1.1 c1.mc1(); 2 c.mc2(); 2.1 b1.mb2(). (c) Sequence diagram extracted from the tree, with lifelines a::A, b::B and c::C.]

Using the GOLD parser and interpretation of the parsed data, we generated two data trees: the method collection for each class, shown in Figure 5(a), and the levelled tree that represents the relationships among the classes and the order of object interactions in the system, illustrated in Figure 5(b). Based on the elements extracted into the levelled tree, we drew the sequence diagram manually, as shown in Figure 5(c). This demonstrates the consistency of the data extraction with the design model using our approach.

6. CONCLUSION

The idea presented in this paper is part of ongoing research on modelling system behaviour as sequence diagrams using a parsing technique. The approach lets the system programmer model the object interactions of a system and map them into a sequence diagram at relatively low cost through reverse engineering. It covers static analysis by parsing the complete source code, data extraction to generate the object trees, and the final generation of the system behaviour model as a sequence diagram. Reverse engineering the static structure of software through parsing can help the engineer ensure that architectural guidelines are followed, trace the sources of bugs, understand the current behaviour of the software, find unused code, and so forth. However, the scope of this work is limited to Java source code and bound by the restrictions of static analysis: the main program must be identified before parsing so that all files are tracked completely during the parsing process. Future work will address this weakness so that parsing can start from any single file selected at random among the files involved.

ACKNOWLEDGEMENT

The authors would like to thank C. E. Tan and E. Mit for proofreading the paper. This research was supported by UNIMAS grant 02(66)/524/2005(23).

REFERENCES

Gestwicki, P. V., & Jayaraman, B. (2004). JIVE: java interactive visualization environment. Object-oriented programming systems, languages, and applications (pp. 226-228). Vancouver, BC, Canada: ACM Press.
GOLD Parsing System. (2006, July 5). Retrieved July 8, 2006, from GOLD Parsing System - A Free, Multi-Programming Language, Parser: http://www.devincook.com/goldparser/
Koji, T., Takashi, I., Toshihiro, K., Shinji, K., & Katsuro, I. (2005). Extracting Sequence Diagram from Execution Trace of Java Program. Eighth International Workshop on Principles of Software Evolution (pp. 148-154). IEEE Computer Society.
Quigley, A. J., Postem, M., & Schmid, H. (2000). ReVis: Reverse Engineering by Clustering and Visual Object Classification. Software Engineering Conference (pp. 119-125). Australian.
Rountev, A., Kagan, S., & Sawin, J. (2005). Coverage Criteria for Testing of Object Interactions in Sequence Diagrams. Fundamental Approaches to Software Engineering (pp. 282-297). Springer-Verlag.
Rountev, A., & Connell, B. H. (2005). Object naming analysis for reverse-engineered sequence diagrams. 27th international conference on Software engineering (pp. 254-263). St. Louis, MO, USA: ACM Press.
Rountev, A., Volgin, O., & Reddoch, M. (2004). Control Flow Analysis for Reverse Engineering of Sequence Diagrams. Technical Report OSU-CISRC-3/04-TR12, Ohio State University, Department of Computer Science and Engineering, Ohio State.
Systä, T. (1999). Dynamic Reverse Engineering of Java Software. Workshop on Object-Oriented Technology (pp. 174-175). London, UK: Springer-Verlag.
Vasconcelos, A., Cepêda, R., & Werner, C. (2005). An Approach to Program Comprehension through Reverse Engineering of Complementary Software. 1st workshop on Program Comprehension through Dynamic Analysis (PCODA 2005) (pp. 58-62). Pittsburgh, USA.