July 5,

Executive Summary

More Maintainable Code with Grammars and the AnaGram¹ Programming Environment

Norman Wilde
University of West Florida
Pensacola, Florida 32514, USA
tel ; fax
wilde@cs.uwf.edu

For the last seven years our research group has been working on tools to aid the maintainer of old code. Not surprisingly, we have developed a keen interest in ways of making our own code easier to understand and maintain. Recently we have found that the technique of grammar-based programming seems to provide good, maintainable solutions to many of our programming problems.

Programs built around grammars seem to have several benefits that enhance maintainability. They are readable, since the grammar acts as a working formal specification of program behavior. They are robust, since the grammar eliminates many logic errors and "I didn't think of that" errors. They are flexible, and can be easily grown or extended as experience accumulates. And finally they are portable, since they use ASCII text and standard C/C++ programming techniques.

We have used grammar-based programming for several applications that do not much resemble traditional compiler construction tasks. One system generated test drivers for unit testing of C++ object classes. Another was a querying system to help maintainers extract information about large software systems. Still others have taken output from existing code analysis tools and translated it to work with our program understanding tools.

For the last three years we have been using the AnaGram grammar-based programming environment and we have found that it greatly helps us in developing these applications quickly. AnaGram is a PC multi-window system for writing, analyzing, and debugging programs that use grammars. We have had several good experiences with inexperienced students who produced correct, readable code using AnaGram.

This report, which describes some of the benefits of grammar-based programming and the AnaGram environment, was originally prepared at the request of SERC affiliate Siemens Corporate Research for the Siemens CASELAB system, which distributes information on CASE tools and methods to Siemens employees worldwide. The views presented are exclusively those of the author.

This report may be cited as SERC-TR--76F, Software Engineering Research Center, University of Florida, CSE-301, Gainesville, FL 32611, July/94.

¹ AnaGram is a trademark of Parsifal Software.

1. Grammar-Based Programming

When you say the word "parsing" to most programmers, the immediate word-association response is "compiler writing". Computer Science students are usually introduced to grammars in a compilers course that focuses on this one narrow application of parsing tools. Since few programmers write compilers after leaving university, we tend to think of parsing as a solution to somebody else's problem.

But parsing technology can be applied in a wide range of situations in which a program is controlled by a sequence of inputs. Parser generators can thus be used to provide clean, maintainable code to solve many programming problems.

A parser generator reads a syntax file containing a grammar and associated actions specified for both normal and error inputs. From these it creates a C or C++ source program (Figure 1). This generated program, when compiled and executed, will accept sequences of inputs that match the grammar and carry out the corresponding actions. If the inputs do not match the grammar, the program will instead perform the specified error handling actions.

Figure 1 - How a Parser Generator Works
(The syntax file, which has the grammar for the program inputs and the actions to be performed, is read by the parser generator to produce a C source file. The C compiler turns this into an executable program that reads the program inputs, performs actions when the input matches the grammar, and does error handling actions when the input does not match the grammar.)

2. Maintainability Benefits of Grammar-Based Programming

The main benefits of organizing a program around a grammar are readability, robustness, flexibility and portability. Together these make for more maintainable code.
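
The accept-and-act behaviour described in Section 1 can be illustrated with a small hand-written C sketch. It is not AnaGram output and not code from this report: it simply recognizes one tiny grammar (a non-empty run of decimal digits), performs an action for each digit that matches, and takes an error action when the input does not match. The names parse_number and report_error are hypothetical.

```c
#include <stdio.h>

/* Hypothetical error action: called when the input does not match. */
static void report_error(const char *input, int position)
{
    fprintf(stderr, "syntax error at column %d of \"%s\"\n",
            position + 1, input);
}

/*
 * Recognizes the grammar
 *     number -> digit
 *            -> number, digit
 * Returns 1 and stores the value in *value when the whole input
 * matches; otherwise runs the error action and returns 0.
 */
int parse_number(const char *input, int *value)
{
    int n = 0;
    int i;

    if (input[0] == '\0') {              /* empty input is not a number */
        report_error(input, 0);
        return 0;
    }
    for (i = 0; input[i] != '\0'; i++) {
        if (input[i] < '0' || input[i] > '9') {
            report_error(input, i);      /* input does not match */
            return 0;
        }
        n = 10 * n + (input[i] - '0');   /* action for each matched digit */
    }
    *value = n;
    return 1;
}
```

Even in this small case the hand-written version has to decide explicitly what happens on empty input and on a stray character. In a grammar-based program those decisions are stated in the grammar and its error productions, and the parser generator produces the corresponding logic.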

Readability: It is generally easy to read and understand a grammar-based program because the grammar explains most of the complicated logic the reader might need to worry about. For example, the key part of the AnaGram syntax file in Appendix I is the following:

    grammar
      -> line?..., eof                             = writeunidentifiedfilelist();

    line
      -> trace line, newline
      -> source line, newline
      -> other line, newline

    trace line
      -> "!trace on lines at ", location, not eol?...

    location
      -> my file:i, function spec, line spec:n     = writer2tline(i,n);
      -> routine entry point
      -> meaningless number
      -> unidentified location

This tells us immediately that the input (called grammar) is expected to be a sequence of lines and that each line may be a trace line, a source line, or an other line. Trace lines start with the string "!trace on lines at " which is followed by location information. If the location is composed of a file name, followed by a function name, followed by a line number, then the writer2tline function is called. Other cases, such as a routine entry point, are ignored.

While we may not understand immediately what an r2t line is, the overall behavior of the program is very clear. This program is scanning the input for trace lines and writing something when it finds each one. The grammar provides an excellent top-down specification of what it expects as input and what it will do when it gets it.

Robustness: In a conventional program it can be very difficult to check that all contingencies have been handled. What if there are extra blanks in the input? What if the last input line is not complete? Special cases such as these are often found during testing and handled by patching the program's logic. After a few such patches the program becomes incomprehensible. In a grammar-based program, the grammar defines behavior for all such cases. The parser generator then checks for consistency and warns the programmer of possible errors.
We have found that many of our logic errors are caught by the parser generator and relatively few get through into testing of the final program.

Flexibility: Grammar-based programs can often be extended easily to handle new cases.
Often the addition of a couple of alternatives to the grammar is sufficient to take care of a new requirement. Such a change could be very complex to introduce in a conventional program because so many of the program's states could be affected. But the parser generator hides this kind of complication and generates a new, correct program from the modified grammar.

This flexibility can be especially useful in exploratory programming [WILD.94]. For example, the program in Appendix I takes output from a debugger and translates it for use in one of our projects. The exact form of the debugger's output was known only from examples. As a first cut at the problem, the grammar in the appendix was written from one of these examples. The error handling features of the parser generator can then be used to systematically develop a complete and correct program. The grammar contains the following production for error handling:

    other line
      -> error                                     = wrerror(PCB.line, PCB.column);

The token error covers anything else not described by this grammar. Any input line that does not match the grammar will produce a call to the wrerror function, which writes a message to standard error. The offending input can then be tracked down and added to the grammar.

We have used this technique several times. The result is a program that is easy to understand despite the many revisions. The grammar acts as a formal but readable specification at all stages of the process.

Portability: Grammar-based programs usually are designed to process simple ASCII text as input and thus tend to be easily portable. So far, AnaGram-generated programs have worked with all of the C or C++ compilers we have tried. Since it can be difficult to predict all the environments a program may eventually face, a conservative grammar-based design can reduce the life-cycle cost of repeated porting.

3. Some Applications

In this section we will describe some of the problems we have solved using grammar-based programming as part of our tool-building work. There are, of course, many other situations in which this technique can be useful.

3.1 Data Transformation Problems

To save development time, we try to make our tools use existing tools as much as possible. However, this frequently forces us to write data transformation programs. For example, the syntax file in Appendix I generated a program to transform output from the Digital Equipment Vax debugger for some experiments with a program understanding tool.
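
For contrast, the following is a rough sketch of how the recognition of a single trace line might be hand-coded in conventional C. It is not the parser that AnaGram generates from Appendix I. The function name parse_trace_line, the constants, and the assumption that the file name ends at the first backslash are ours, inferred from the sample log line described in the appendix.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define TRACE_PREFIX "!trace on lines at "
#define LINE_MARKER  "%LINE "

/*
 * Hand-coded recognizer for one debugger trace line.
 * On success, copies the file name into file (at most file_size - 1
 * characters), stores the line number in *line_number, and returns 1.
 * Returns 0 for any line that does not have the expected shape.
 */
int parse_trace_line(const char *line, char *file, size_t file_size,
                     int *line_number)
{
    const char *p;
    const char *marker;
    size_t len;

    if (strncmp(line, TRACE_PREFIX, strlen(TRACE_PREFIX)) != 0)
        return 0;                          /* not a trace line */
    p = line + strlen(TRACE_PREFIX);

    len = strcspn(p, "\\");                /* assumed: name ends at '\' */
    if (len == 0 || p[len] != '\\' || len >= file_size)
        return 0;
    memcpy(file, p, len);
    file[len] = '\0';

    marker = strstr(p + len, LINE_MARKER); /* skip the function spec */
    if (marker == NULL)
        return 0;
    marker += strlen(LINE_MARKER);
    if (!isdigit((unsigned char)*marker))
        return 0;
    *line_number = (int)strtol(marker, NULL, 10);
    return 1;
}
```

Under these assumptions, a line reporting line 577 of function getop in file RPN would yield the file name "RPN" and the line number 577. Each early return is a decision about unexpected input that the programmer had to remember to make; in the grammar-based version the same cases are visible as alternatives and error productions in the syntax file.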

Figure 2 - A Simple Data Transformation Problem
(The Vax debugger produces a debugger log file; the r2getvd transformation program converts it into the format needed by the program understanding tool.)

Such transformation programs can be written quickly using grammar-based programming and, as described, the error handling facilities of the parser generator allow systematic exploratory programming without making a rat's nest of the program logic. We have written several such programs as part of our maintenance tool set, including one to extract information from a C++ code analyzer and another to normalize Prolog facts in a C code factbase.

3.2 Program Querying and Unit Testing of C++ Objects

Jon Bentley in More Programming Pearls describes the technique of inventing "little languages" to solve problems [BEN.88]. Instead of writing a program with a large number of switches and parameters for the user to set, the programmer defines a language in which the user can express what needs to be done. Then a parser generator can be used to produce the "little language" program to actually solve the user's problem (Figure 3).

In one project we applied this method to define a simple query language that software maintainers could use to get information about the code they were maintaining [RICH.93]. The query program was written in AnaGram by an undergraduate student and was one of his first C language projects. Yet the final code was very easy to inspect and was almost completely error free.

Figure 3 - Programming Using a Little Language
(The syntax file, which specifies the "little language" for this problem, is read by the parser generator to produce a C source file. The C compiler turns this into the "little language" program, which reads a problem description, written by the user in the "little language", and produces the user's output.)

In another project, we developed a tool to help unit test C++ object classes. A little language was defined to specify objects and test conditions. The generated testing tool could read test specifications in the language and generate a C++ test driver program that would create the objects and step them through combinations of the specified conditions [WILD.93]².

4. The AnaGram Environment for Grammar-Based Programming

4.1 AnaGram Advantages

One reason that grammar-based programming has been little applied is that parser generators were often very clumsy to use. For the last three years we have been using beta versions of a new grammar-based programming environment called AnaGram that has greatly facilitated our work. AnaGram is not just a parser generator; it also provides a PC multi-window environment for analyzing and debugging grammars. Figure 4 shows the initial AnaGram screen.

² This C++ test driver generating system is available from the author.

Figure 4 - The Initial AnaGram Screen

You normally use your editor to create an AnaGram syntax file, perhaps building on one of the sample files included with the system. Then you enter the AnaGram environment to analyze and debug your grammar, switching back to your editor to make any changes. Finally you tell AnaGram to build the parser and it generates a C/C++ source file for you to run through your compiler.

One strength of AnaGram is the extensive help system. For example, if there are conflicts in your grammar you can call up a help screen that describes conflicts, explains the information in AnaGram's Conflicts window, and practically provides an on-line textbook on the subject of conflict handling (Figure 5). There are, literally, hundreds of help topics accessible either directly or from the Help Index window. We have found that this help facility greatly reduces the learning time for students starting out with grammar-based programming.

Figure 5 - An AnaGram Help Window

Another strong point is AnaGram's assistance with grammar conflicts. Grammars can be ambiguous, which means that, in some state, there is an input that the grammar could interpret in more than one way. Conflicts often represent "I didn't think of that!" cases that can be dangerous. They also occur in conventional programs, but there they may go undetected. In a grammar-based program, the parser generator can apply default rules for handling conflicts, but the programmer is warned to make sure the default is the right action to take. Good grammar-based programming practice is to eliminate conflicts whenever possible.

Understanding a conflict is sometimes difficult since several productions in the grammar may be involved. AnaGram offers several tools to help, such as the Conflict Trace window. In the example of Figure 6, the user has moved the cursor to one of the items in the Conflicts window and has requested a trace. AnaGram responds with a window showing that the grammar can get into this conflict if the inputs are "!trace on lines at ", followed by an identifier, followed by an identifier start character. If the next input is a digit, there would be two possible rules that could apply. With this information you can find the offending sections of the grammar and, usually, eliminate the conflict.

Figure 6 - Pinpointing a Grammar Conflict

Perhaps the most useful feature of AnaGram is its File Trace facility for debugging grammars. With most conventional parser generators you cannot start debugging until you build the C language parser and compile it. This takes time. In addition, the parser is often very difficult to debug since it is machine generated and contains large, anonymous state transition tables.

With AnaGram you can debug your program logic by feeding data into your grammar without leaving the multi-window environment. For example, in Figure 7, the user has asked for a File Trace on the data file called DEBUG1.LOG. The bottom window in the figure shows this data file; the highlighting shows how far the user has progressed reading the data. The right arrow key is used to read each additional token. The top window shows the stack that the parser has created from the data read so far. Thus we can see that we have identified an initial line, and that on the second line the parser has identified the keyword token "!trace on lines at ". It is about to read the RPN token.

As each token is read, the File Trace shows how the stack grows and how productions are used to reduce it. Most errors in logic can be very quickly identified. Most important, there is no need to build the parser, compile, and then work through a complicated debugging session.

Figure 7 - An AnaGram File Trace

AnaGram has some other advantages that are illustrated in the example in Appendix I. With most earlier parser generators, parsing had to be preceded by a lexical analysis step that often required a separate tool. Lexical analysis scans the input and breaks it up into tokens. It helps keep down the size of parsing tables and does some look-ahead to avoid conflicts in the grammar. However, it requires the programmer to use two tools instead of one, and to follow special conventions to hook the lexical analyzer and the parser together.

AnaGram incorporates several techniques that usually eliminate the need for a lexical analyzer. First, you can specify useful sets of characters directly as part of the grammar instead of leaving that to a lexical analyzer. For example, the syntax file in Appendix I has the following lexical definitions:

    /* lexical definitions */
    newline = '\n'
    eof = ^Z
    backslash = '\\'
    percen = '%'
    blank = ' '
    digit = '0-9'
    identifier start character = 'a-z' + 'A-Z' + '_' + '$' + '@'
    identifier follow character = identifier start character + digit
    not percen = ~(percen + newline + eof)
    not eol = ~(newline + eof)

Second, an AnaGram grammar can contain character strings that are automatically treated as single keyword tokens by the parser. AnaGram will do the look-ahead needed to identify these
keywords and distinguish them from the rest of the input. For example, the production for trace line contains such a string:

    trace line
      -> "!trace on lines at ", location, not eol?...

4.2 AnaGram Disadvantages

The main problem that we have encountered with AnaGram has been that it only runs on PCs. Since a lot of our work requires Unix, we have often had to develop the grammar using an emulator or a separate machine. Fortunately the parsers generated by AnaGram are portable and can compile and run on Unix machines.

A second inconvenience is that the AnaGram environment is function-key driven. There are a large number of different windows that can be used for different kinds of analysis, but it takes a little while to learn the function key combinations to call up each one at the right time. The quick reference window and the help system are a great aid here.

Finally, we should not give the impression that grammar-based programming is completely trivial to learn. While most computer science graduates will have seen all the relevant theory in a compilers course, it takes some practical experience to write good, clean grammars and to debug conflicts. But AnaGram is certainly a better platform than most for getting that experience.

4.3 AnaGram Specifications

AnaGram's requirements are as follows:

    IBM PC/AT or compatible, 512K memory
    Minimum 500K hard disk space
    DOS 3.1 or later
    C or C++ compiler

5. For Further Information

The author would be pleased to discuss grammar-based programming with SERC affiliates. He may best be reached by electronic mail at wilde@cs.uwf.edu.

AnaGram is available for $ directly from:

    Parsifal Software, P. O. Box 219
    Wayland, Massachusetts 01778, U. S. A.
    Tel (voice/fax): or from the U.S
    jholland@world.std.com
    Compuserve: 72603,1763

Enquiries about AnaGram can be directed to Jerry Holland at this address. Trial copies are available.

Bibliography

[BEN.88] Bentley, Jon, More Programming Pearls: Confessions of a Coder, Chapter 9, Addison-Wesley, Reading, MA,

[RICH.93] Richardson, Raymond and Wilde, Norman, Applying Extensible Dependency Analysis: A Case Study of a Heterogeneous System, report SERC-TR-62-F, Software Engineering Research Center, CIS Department, University of Florida, Gainesville, FL 32611, July

[WILD.93] Wilde, Norman, Testing Your Objects, C Users Journal, Vol. 11, No. 5, May

[WILD.94] Wilde, Norman, Dealing With Uncertain Inputs: Exploratory Software Engineering, C/C++ Users Journal, Vol. 12, No. 7, July 1994, pp

Appendix I

A Sample AnaGram Syntax File (Simplified)

    {
    /*
      File: r2getvd.syn, By: N. Wilde, Nov. 14/93

      Purpose: Get Vax Debugger input for RECON.
        This is an AnaGram syntax file that generates r2getvd.c,
        which converts a log file containing trace output from the
        VAX debugger to a RECON "<xx>.r2t" trace file. To help check
        consistency, r2getvd writes to standard output a list of all
        the file names found in the log file that it could NOT
        identify. If any log file line has an unexpected format, a
        message is written to standard error.

      Design: The Vax debugger log output contains, along with much
        else, lines like the following:
            !trace on lines at RPN\getop\%LINE
        that indicate that line 577 of function getop in file RPN.C
        has been executed. This input line should produce a r2t file
        line with:
            T
        where 004 is the file index of file RPN.C. The mapping
        between file indexes and file names is in the "my file"
        productions in this file.

        R2getvd reads the debugger log file. When it finds a line of
        the form shown above it looks up the corresponding file index
        and writes the corresponding r2t file line. If the line is of
        the form
            !trace on lines at XXXX...
        but XXXX is not one of the known file names, then XXXX is
        stored in a list and written out at the end of processing in
        the list of unidentified files.
    */
    }

    grammar
      -> line?..., eof                             = writeunidentifiedfilelist();

    line
      -> trace line, newline
      -> source line, newline
      -> other line, newline

    trace line
      -> "!trace on lines at ", location, not eol?...

    location
      -> my file:i, function spec, line spec:n     = writer2tline(i,n);
      -> routine entry point
      -> meaningless number
      -> unidentified location

    (int) my file
      -> "RPN"                                     = 004;
         /* ADD other abbreviated file names here */

    function spec
      -> not percen...

    (int) line spec
      -> "%LINE ", number:n                        = n;

    routine entry point
      -> "routine"

    unidentified location
      -> identifier                                = storeunidentifiedfile(release());

    meaningless number
      -> number

    source line
      -> "! ", blank?..., number, ":", not eol?...

    other line
      -> error                                     = wrerror(PCB.line, PCB.column);

    (int) number
      -> digit:d                                   = d - '0';
      -> number:n, digit:d                         = 10*n + (d - '0');

    identifier
      -> identifier start character:c              = collectfirst(c);
      -> identifier, identifier follow character:c = collect(c);

    /* lexical definitions */
    newline = '\n'
    eof = ^Z
    backslash = '\\'
    percen = '%'
    blank = ' '
    digit = '0-9'
    identifier start character = 'a-z' + 'A-Z' + '_' + '$' + '@'
    identifier follow character = identifier start character + digit
    not percen = ~(percen + newline + eof)
    not eol = ~(newline + eof)

    /* C Support Code */
    {
    #include <stdio.h>
    #include <string.h>

    /* Files shared with the parser actions (declarations are omitted
       from the simplified listing and are assumed here) */
    FILE *logfile = NULL;
    FILE *r2tfile = NULL;

    /* main program */
    int main(int argc, char *argv[])
    {
        int i;
        char logfilepath[100] = "";
        char r2tfilepath[100] = "";

        for (i = 1; i < argc; i++) {
            /* -L<path> names the debugger log file to read */
            if (0 == strncmp(argv[i], "-L", 2)) {
                strncpy(logfilepath, argv[i] + 2, sizeof logfilepath - 1);
                logfilepath[sizeof logfilepath - 1] = '\0';
                if (NULL == (logfile = fopen(logfilepath, "r"))) {
                    fprintf(stderr, "R2GETVD - couldn't open log file %s\n",
                            logfilepath);
                    return 1;
                }
            }
            /* -R<path> names the r2t trace file to write */
            if (0 == strncmp(argv[i], "-R", 2)) {
                strncpy(r2tfilepath, argv[i] + 2, sizeof r2tfilepath - 1);
                r2tfilepath[sizeof r2tfilepath - 1] = '\0';
                if (NULL == (r2tfile = fopen(r2tfilepath, "w"))) {
                    fprintf(stderr, "R2GETVD - couldn't open r2t file %s\n",
                            r2tfilepath);
                    return 1;
                }
            }
        }
        if ((NULL == r2tfile) || (NULL == logfile)) {
            fprintf(stderr, "R2GETVD ABORTING\n");
            return 1;
        }

        /* Write comment line of the r2t file */
        fprintf(r2tfile, "From Vax Log file %s\n", logfilepath);

        /* Call the parser to do the rest */
        r2getvd();

        /* Close all files and quit */
        fclose(logfile);
        fclose(r2tfile);
        return 0;
    }
    }
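
The simplified listing above omits the bodies of the support functions called from the grammar's actions. As a rough sketch only, and not the author's code, the identifier-collecting functions and the error reporter might look something like the following. The buffer size, the signatures, and the message text are all assumptions; only the function names collectfirst, collect, release and wrerror come from the syntax file.

```c
#include <stdio.h>

#define MAX_IDENTIFIER 64   /* assumed limit, not taken from the report */

static char identifier_buffer[MAX_IDENTIFIER + 1];
static int  identifier_length = 0;

/* Action for the first character of an identifier: start a new name. */
void collectfirst(int c)
{
    identifier_length = 0;
    identifier_buffer[identifier_length++] = (char)c;
    identifier_buffer[identifier_length] = '\0';
}

/* Action for each following character: append it if there is room. */
void collect(int c)
{
    if (identifier_length < MAX_IDENTIFIER) {
        identifier_buffer[identifier_length++] = (char)c;
        identifier_buffer[identifier_length] = '\0';
    }
}

/* Hand the collected identifier to the caller. The returned pointer is
   valid only until the next call to collectfirst, so a caller that
   keeps the name must copy it. */
const char *release(void)
{
    return identifier_buffer;
}

/* Error action: report where the unexpected input was found. */
void wrerror(int line, int column)
{
    fprintf(stderr, "R2GETVD - unexpected input at line %d, column %d\n",
            line, column);
}
```

With these definitions, the identifier productions build up a name one character at a time as the parser reduces them, and the unidentified location production passes the finished name to storeunidentifiedfile.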
1 Lexical Considerations
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler
More informationRegular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications
Agenda for Today Regular Expressions CSE 413, Autumn 2005 Programming Languages Basic concepts of formal grammars Regular expressions Lexical specification of programming languages Using finite automata
More informationCS164: Programming Assignment 2 Dlex Lexer Generator and Decaf Lexer
CS164: Programming Assignment 2 Dlex Lexer Generator and Decaf Lexer Assigned: Thursday, September 16, 2004 Due: Tuesday, September 28, 2004, at 11:59pm September 16, 2004 1 Introduction Overview In this
More informationPRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS
Objective PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS Explain what is meant by compiler. Explain how the compiler works. Describe various analysis of the source program. Describe the
More informationParsing and Pattern Recognition
Topics in IT 1 Parsing and Pattern Recognition Week 10 Lexical analysis College of Information Science and Engineering Ritsumeikan University 1 this week mid-term evaluation review lexical analysis its
More informationPipelines, Forks, and Shell
Scribe Notes for CS61: 11/14/13 By David Becerra and Arvind Narayanan Pipelines, Forks, and Shell Anecdote on Pipelines: Anecdote 1: In 1964, Bell Labs manager Doug Mcllroy sent a memo stating that programs
More informationChapter 2, Part I Introduction to C Programming
Chapter 2, Part I Introduction to C Programming C How to Program, 8/e, GE 2016 Pearson Education, Ltd. All rights reserved. 1 2016 Pearson Education, Ltd. All rights reserved. 2 2016 Pearson Education,
More informationCS 426 Fall Machine Problem 1. Machine Problem 1. CS 426 Compiler Construction Fall Semester 2017
CS 426 Fall 2017 1 Machine Problem 1 Machine Problem 1 CS 426 Compiler Construction Fall Semester 2017 Handed Out: September 6, 2017. Due: September 21, 2017, 5:00 p.m. The machine problems for this semester
More informationCompiler phases. Non-tokens
Compiler phases Compiler Construction Scanning Lexical Analysis source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2011 01 21 parse tree
More informationCS426 Compiler Construction Fall 2006
CS426 Compiler Construction David Padua Department of Computer Science University of Illinois at Urbana-Champaign 0. Course organization 2 of 23 Instructor: David A. Padua 4227 SC, 333-4223 Office Hours:
More informationFigure 2.1: Role of Lexical Analyzer
Chapter 2 Lexical Analysis Lexical analysis or scanning is the process which reads the stream of characters making up the source program from left-to-right and groups them into tokens. The lexical analyzer
More informationSyntax-Directed Translation
Syntax-Directed Translation ALSU Textbook Chapter 5.1 5.4, 4.8, 4.9 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 What is syntax-directed translation? Definition: The compilation
More informationCompiler Construction D7011E
Compiler Construction D7011E Lecture 2: Lexical analysis Viktor Leijon Slides largely by Johan Nordlander with material generously provided by Mark P. Jones. 1 Basics of Lexical Analysis: 2 Some definitions:
More informationReviewing gcc, make, gdb, and Linux Editors 1
Reviewing gcc, make, gdb, and Linux Editors 1 Colin Gordon csgordon@cs.washington.edu University of Washington CSE333 Section 1, 3/31/11 1 Lots of material borrowed from 351/303 slides Colin Gordon (University
More informationLexical Considerations
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2010 Handout Decaf Language Tuesday, Feb 2 The project for the course is to write a compiler
More informationThe NetBeans IDE is a big file --- a minimum of around 30 MB. After you have downloaded the file, simply execute the file to install the software.
Introduction to Netbeans This document is a brief introduction to writing and compiling a program using the NetBeans Integrated Development Environment (IDE). An IDE is a program that automates and makes
More informationCode Structure Visualization
TECHNISCHE UNIVERSITEIT EINDHOVEN Department of Mathematics and Computer Science MASTER S THESIS Code Structure Visualization by G.L.P.M. Lommerse Supervisor: Dr. Ir. A.C. Telea (TUE) Eindhoven, August
More informationOutline. Computer programming. Debugging. What is it. Debugging. Hints. Debugging
Outline Computer programming Debugging Hints Gathering evidence Common C errors "Education is a progressive discovery of our own ignorance." Will Durant T.U. Cluj-Napoca - Computer Programming - lecture
More informationThe structure of a compiler
The structure of a compiler Source code front-end Intermediate front-end representation compiler back-end machine code Front-end & Back-end C front-end Pascal front-end C front-end Intel x86 back-end Motorola
More informationLexical Considerations
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Fall 2005 Handout 6 Decaf Language Wednesday, September 7 The project for the course is to write a
More informationPESIT Bangalore South Campus Hosur road, 1km before Electronic City, Bengaluru -100 Department of Computer Science and Engineering
TEST 1 Date : 24 02 2015 Marks : 50 Subject & Code : Compiler Design ( 10CS63) Class : VI CSE A & B Name of faculty : Mrs. Shanthala P.T/ Mrs. Swati Gambhire Time : 8:30 10:00 AM SOLUTION MANUAL 1. a.
More informationPreprocessor Directives
C++ By 6 EXAMPLE Preprocessor Directives As you might recall from Chapter 2, What Is a Program?, the C++ compiler routes your programs through a preprocessor before it compiles them. The preprocessor can
More informationCompiling and Interpreting Programming. Overview of Compilers and Interpreters
Copyright R.A. van Engelen, FSU Department of Computer Science, 2000 Overview of Compilers and Interpreters Common compiler and interpreter configurations Virtual machines Integrated programming environments
More information10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis
Lexical and Syntactic Analysis Lexical and Syntax Analysis In Text: Chapter 4 Two steps to discover the syntactic structure of a program Lexical analysis (Scanner): to read the input characters and output
More informationFinal CSE 131B Spring 2004
Login name Signature Name Student ID Final CSE 131B Spring 2004 Page 1 Page 2 Page 3 Page 4 Page 5 Page 6 Page 7 Page 8 (25 points) (24 points) (32 points) (24 points) (28 points) (26 points) (22 points)
More informationChapter 9: Dealing with Errors
Chapter 9: Dealing with Errors What we will learn: How to identify errors Categorising different types of error How to fix different errors Example of errors What you need to know before: Writing simple
More informationCSCI312 Principles of Programming Languages!
CSCI312 Principles of Programming Languages!! Chapter 3 Regular Expression and Lexer Xu Liu Recap! Copyright 2006 The McGraw-Hill Companies, Inc. Clite: Lexical Syntax! Input: a stream of characters from
More informationCS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)
CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square) Introduction This semester, through a project split into 3 phases, we are going
More informationIntroduction to Lexical Analysis
Introduction to Lexical Analysis Outline Informal sketch of lexical analysis Identifies tokens in input string Issues in lexical analysis Lookahead Ambiguities Specifying lexers Regular expressions Examples
More informationLecture 12 CSE July Today we ll cover the things that you still don t know that you need to know in order to do the assignment.
Lecture 12 CSE 110 20 July 1992 Today we ll cover the things that you still don t know that you need to know in order to do the assignment. 1 The NULL Pointer For each pointer type, there is one special
More informationThis chapter is intended to take you through the basic steps of using the Visual Basic
CHAPTER 1 The Basics This chapter is intended to take you through the basic steps of using the Visual Basic Editor window and writing a simple piece of VBA code. It will show you how to use the Visual
More informationC++ Style Guide. 1.0 General. 2.0 Visual Layout. 3.0 Indentation and Whitespace
C++ Style Guide 1.0 General The purpose of the style guide is not to restrict your programming, but rather to establish a consistent format for your programs. This will help you debug and maintain your
More informationCHAPTER 2. Troubleshooting CGI Scripts
CHAPTER 2 Troubleshooting CGI Scripts OVERVIEW Web servers and their CGI environment can be set up in a variety of ways. Chapter 1 covered the basics of the installation and configuration of scripts. However,
More informationCSCI0330 Intro Computer Systems Doeppner. Lab 02 - Tools Lab. Due: Sunday, September 23, 2018 at 6:00 PM. 1 Introduction 0.
CSCI0330 Intro Computer Systems Doeppner Lab 02 - Tools Lab Due: Sunday, September 23, 2018 at 6:00 PM 1 Introduction 0 2 Assignment 0 3 gdb 1 3.1 Setting a Breakpoint 2 3.2 Setting a Watchpoint on Local
5 The Control Structure Diagram (CSD)
5 The Control Structure Diagram (CSD) The Control Structure Diagram (CSD) is an algorithmic level diagram intended to improve the comprehensibility of source code by clearly depicting control constructs,
COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table
COMPILER CONSTRUCTION Lab 2 Symbol table LABS Lab 3 LR parsing and abstract syntax tree construction using 'bison' Lab 4 Semantic analysis (type checking) PHASES OF A COMPILER Source Program Lab 2 Symtab
Chapter 2. Basics of Program Writing
Chapter 2. Basics of Program Writing Programs start as a set of instructions written by a human being. Before they can be used by the computer, they must undergo several transformations. In this chapter,
COMPILER DESIGN. For COMPUTER SCIENCE
COMPILER DESIGN For COMPUTER SCIENCE . COMPILER DESIGN SYLLABUS Lexical analysis, parsing, syntax-directed translation. Runtime environments. Intermediate code generation. ANALYSIS OF GATE PAPERS Exam
1. Lexical Analysis Phase
1. Lexical Analysis Phase The purpose of the lexical analyzer is to read the source program, one character at a time, and to translate it into a sequence of primitive units called tokens. Keywords, identifiers,
CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1
CSEP 501 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2008 1/8/2008 2002-08 Hal Perkins & UW CSE B-1 Agenda Basic concepts of formal grammars (review) Regular expressions
Ray Pereda Unicon Technical Report UTR-02. February 25, Abstract
iflex: A Lexical Analyzer Generator for Icon Ray Pereda Unicon Technical Report UTR-02 February 25, 2000 Abstract iflex is a software tool for building language processors. It is based on flex, a well-known
Compilers. Prerequisites
Compilers Prerequisites Data structures & algorithms Linked lists, dictionaries, trees, hash tables Formal languages & automata Regular expressions, finite automata, context-free grammars Machine organization
In this simple example, it is quite clear that there are exactly two strings that match the above grammar, namely: abc and abcc
JavaCC: LOOKAHEAD MiniTutorial 1. WHAT IS LOOKAHEAD The job of a parser is to read an input stream and determine whether or not the input stream conforms to the grammar. This determination in its most
Automatic Scanning and Parsing using LEX and YACC
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320-088X IMPACT FACTOR: 6.017 IJCSMC,
CS354 gdb Tutorial Written by Chris Feilbach
CS354 gdb Tutorial Written by Chris Feilbach Purpose This tutorial aims to show you the basics of using gdb to debug C programs. gdb is the GNU debugger, and is provided on systems that
Chapter 3. Describing Syntax and Semantics ISBN
Chapter 3 Describing Syntax and Semantics ISBN 0-321-49362-1 Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Copyright 2009 Addison-Wesley. All
UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger
UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division CS 164 Spring 2010 P. N. Hilfinger CS 164: Final Examination (revised) Name: Login: You have
IT 374 C# and Applications/ IT695 C# Data Structures
IT 374 C# and Applications/ IT695 C# Data Structures Module 2.1: Introduction to C# App Programming Xianrong (Shawn) Zheng Spring 2017 1 Outline Introduction Creating a Simple App String Interpolation
Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1
Defining Program Syntax Chapter Two Modern Programming Languages, 2nd ed. 1 Syntax And Semantics Programming language syntax: how programs look, their form and structure Syntax is defined using a kind
Compiling Regular Expressions COMP360
Compiling Regular Expressions COMP360 Logic is the beginning of wisdom, not the end. Leonard Nimoy Compiler's Purpose The compiler converts the program source code into a form that can be executed by the
flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.
flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input. More often than not, though, you'll want to use flex to generate a scanner that divides
TDDD55- Compilers and Interpreters Lesson 2
TDDD55- Compilers and Interpreters Lesson 2 November 11 2011 Kristian Stavåker (kristian.stavaker@liu.se) Department of Computer and Information Science Linköping University PURPOSE OF LESSONS The purpose
Automatic Generation of Graph Models for Model Checking
Automatic Generation of Graph Models for Model Checking E.J. Smulders University of Twente edwin.smulders@gmail.com ABSTRACT There exist many methods to prove the correctness of applications and verify
Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl
Compilerconstructie, autumn 2013 http://www.liacs.nl/home/rvvliet/coco/ Rudy van Vliet room 124 Snellius, tel. 071-527 5777 rvvliet(at)liacs(dot)nl lecture 1, Tuesday 3 September 2013 Overview 1 Why this
Introduction to Computers and C++ Programming p. 1 Computer Systems p. 2 Hardware p. 2 Software p. 7 High-Level Languages p. 8 Compilers p.
Introduction to Computers and C++ Programming p. 1 Computer Systems p. 2 Hardware p. 2 Software p. 7 High-Level Languages p. 8 Compilers p. 9 Self-Test Exercises p. 11 History Note p. 12 Programming and
Computer Science Lab Exercise 1
1 of 10 Computer Science 127 - Lab Exercise 1 Introduction to Excel User-Defined Functions (pdf) During this lab you will experiment with creating Excel user-defined functions (UDFs). Background We use
KU Compilerbau - Programming Assignment
716.077 KU Compilerbau - Programming Assignment Univ.-Prof. Dr. Franz Wotawa, Birgit Hofer Institute for Software Technology, Graz University of Technology April 20, 2011 Introduction During this semester
Type Checking and Type Equality
Type Checking and Type Equality Type systems are the biggest point of variation across programming languages. Even languages that look similar are often greatly different when it comes to their type systems.
Computers and Computation. The Modern Computer. The Operating System. The Operating System
The Modern Computer Computers and Computation What is a computer? A machine that manipulates data according to instructions. Despite their apparent complexity, at the lowest level computers perform simple
Implementing Dynamic Minimal-prefix Tries
SOFTWARE PRACTICE AND EXPERIENCE, VOL. 21(10), 1027-1040 (OCTOBER 1991) Implementing Dynamic Minimal-prefix Tries JOHN A. DUNDAS III Jet Propulsion Laboratory, California Institute of Technology, Mail
CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions
CSE 413 Programming Languages & Implementation Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions 1 Agenda Overview of language recognizers Basic concepts of formal grammars Scanner Theory
A program that performs lexical analysis may be termed a lexer, tokenizer, or scanner, though scanner is also a term for the first stage of a lexer.
Compiler Design A compiler is computer software that transforms computer code written in one programming language (the source language) into another programming language (the target language). The name
Creating a Shell or Command Interpreter Program CSCI411 Lab
Creating a Shell or Command Interpreter Program CSCI411 Lab Adapted from Linux Kernel Projects by Gary Nutt and Operating Systems by Tanenbaum Exercise Goal: You will learn how to write a LINUX shell
Chapter 4. Lexical and Syntax Analysis. Topics. Compilation. Language Implementation. Issues in Lexical and Syntax Analysis.
Topics Chapter 4 Lexical and Syntax Analysis Introduction Lexical Analysis Syntax Analysis Recursive-Descent Parsing Bottom-Up Parsing 2 Language Implementation Compilation There are three possible approaches
5. Control Statements
5. Control Statements This section of the course will introduce you to the major control statements in C++. These control statements are used to specify the branching in an algorithm/recipe. Control statements
A simple syntax-directed
Syntax-directed is a grammar-oriented compiling technique Programming languages: Syntax: what its programs look like? Semantic: what its programs mean? 1 A simple syntax-directed Lexical Syntax Character
UNIT - 5 EDITORS AND DEBUGGING SYSTEMS
UNIT - 5 EDITORS AND DEBUGGING SYSTEMS 5.1 Introduction An Interactive text editor has become an important part of almost any computing environment. Text editor acts as a primary interface to the computer
Lecture 7: Deterministic Bottom-Up Parsing
Lecture 7: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Tue Sep 20 12:50:42 2011 CS164: Lecture #7 1 Avoiding nondeterministic choice: LR We've been looking at general
Appendix A: Syntax Diagrams
A. Syntax Diagrams A-1 Appendix A: Syntax Diagrams References: Kathleen Jensen/Niklaus Wirth: PASCAL User Manual and Report, 4th Edition. Springer, 1991. Niklaus Wirth: Compilerbau (in German). Teubner,
CS 541 Spring Programming Assignment 2 CSX Scanner
CS 541 Spring 2017 Programming Assignment 2 CSX Scanner Your next project step is to write a scanner module for the programming language CSX (Computer Science experimental). Use the JFlex scanner-generation
LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012
Predictive Parsers LL(k) Parsing Can we avoid backtracking? Yes, if for a given input symbol and given nonterminal, we can choose the alternative appropriately. This is possible if the first terminal of
CS 415 Midterm Exam Spring SOLUTION
CS 415 Midterm Exam Spring 2005 - SOLUTION Name Email Address Student ID # Pledge: This exam is closed note, closed book. Questions will be graded on quality of answer. Please supply the best answer you
Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012
Derivations vs Parses Grammar is used to derive string or construct parser Context Free Grammars A derivation is a sequence of applications of rules Starting from the start symbol S......... (sentence)
EXPERIMENT NO : M/C Lenovo Think center M700 Ci3,6100,6th Gen. H81, 4GB RAM,500GB HDD
GROUP - B EXPERIMENT NO : 07 1. Title: Write a program using Lex specifications to implement lexical analysis phase of compiler to total nos of words, chars and line etc of given file. 2. Objectives :
Chapter 3: Lexing and Parsing
Chapter 3: Lexing and Parsing Aarne Ranta Slides for the book Implementing Programming Languages. An Introduction to Compilers and Interpreters, College Publications, 2012. Lexing and Parsing* Deeper understanding
Chapter 2 Basic Elements of C++
C++ Programming: From Problem Analysis to Program Design, Fifth Edition 2-1 Chapter 2 Basic Elements of C++ At a Glance Instructor's Manual Table of Contents Overview Objectives s Quick Quizzes Class Discussion
Programming Project 1: Lexical Analyzer (Scanner)
CS 331 Compilers Fall 2017 Programming Project 1: Lexical Analyzer (Scanner) Prof. Szajda Due Thursday, September 21, 11:59:59 pm 1 Overview of the Programming Project Programming projects I-IV will direct
Parser Tools: lex and yacc-style Parsing
Parser Tools: lex and yacc-style Parsing Version 6.11.0.6 Scott Owens January 6, 2018 This documentation assumes familiarity with lex and yacc style lexer and parser generators. 1 Contents 1 Lexers 3 1.1
COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview
COMP 181 Compilers Lecture 2 Overview September 7, 2006 Administrative Book? Hopefully: Compilers by Aho, Lam, Sethi, Ullman Mailing list Handouts? Programming assignments For next time, write a hello,
C Language, Token, Keywords, Constant, variable
C Language, Token, Keywords, Constant, variable A language written by Brian Kernighan and Dennis Ritchie. This was to be the language that UNIX was written in to become the first "portable" language. C
A Simple Syntax-Directed Translator
Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called
Optimizing Emulator Utilization by Russ Klein, Program Director, Mentor Graphics
Optimizing Emulator Utilization by Russ Klein, Program Director, Mentor Graphics INTRODUCTION Emulators, like Mentor Graphics Veloce, are able to run designs in RTL orders of magnitude faster than logic
Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens
Syntactic Analysis CS345H: Programming Languages Lecture 3: Lexical Analysis Thomas Dillig Main Question: How to give structure to strings Analogy: Understanding an English sentence First, we separate a
CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Autumn /20/ Hal Perkins & UW CSE H-1
CSE P 501 Compilers Implementing ASTs (in Java) Hal Perkins Autumn 2009 10/20/2009 2002-09 Hal Perkins & UW CSE H-1 Agenda Representing ASTs as Java objects Parser actions Operations on ASTs Modularity
Lesson 1: Writing Your First JavaScript
JavaScript 101 1-1 Lesson 1: Writing Your First JavaScript OBJECTIVES: In this lesson you will be taught how to Use the tag Insert JavaScript code in a Web page Hide your JavaScript
CS 6353 Compiler Construction Project Assignments
CS 6353 Compiler Construction Project Assignments In this project, you need to implement a compiler for a language defined in this handout. The programming language you need to use is C or C++ (and the
INCORPORATING ADVANCED PROGRAMMING TECHNIQUES IN THE COMPUTER INFORMATION SYSTEMS CURRICULUM
INCORPORATING ADVANCED PROGRAMMING TECHNIQUES IN THE COMPUTER INFORMATION SYSTEMS CURRICULUM Charles S. Saxon, Eastern Michigan University, charles.saxon@emich.edu ABSTRACT Incorporating advanced programming
Programming Assignment I Due Thursday, October 7, 2010 at 11:59pm
Programming Assignment I Due Thursday, October 7, 2010 at 11:59pm 1 Overview of the Programming Project Programming assignments I-IV will direct you to design and build a compiler for Cool. Each assignment
CS143 Handout 14 Summer 2011 July 6 th, LALR Parsing
CS143 Handout 14 Summer 2011 July 6 th, 2011 LALR Parsing Handout written by Maggie Johnson, revised by Julie Zelenski. Motivation Because a canonical LR(1) parser splits states based on differing lookahead
Chapter 4. Lexical and Syntax Analysis
Chapter 4 Lexical and Syntax Analysis Chapter 4 Topics Introduction Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing Copyright 2012 Addison-Wesley. All rights reserved.
CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions
CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions 1 Agenda Overview of language recognizers Basic concepts of formal grammars Scanner Theory
3. Simple Types, Variables, and Constants
3. Simple Types, Variables, and Constants This section of the lectures will look at simple containers in which you can store single values in the programming language C++. You might find it interesting
Object-oriented Compiler Construction
1 Object-oriented Compiler Construction Extended Abstract Axel-Tobias Schreiner, Bernd Kühl University of Osnabrück, Germany {axel,bekuehl}@uos.de, http://www.inf.uos.de/talks/hc2 A compiler takes a program
6.001 Notes: Section 15.1
6.001 Notes: Section 15.1 Slide 15.1.1 Our goal over the next few lectures is to build an interpreter, which in a very basic sense is the ultimate in programming, since doing so will allow us to define
Principles of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore
(Refer Slide Time: 00:20) Principles of Compiler Design Prof. Y. N. Srikant Department of Computer Science and Automation Indian Institute of Science, Bangalore Lecture - 4 Lexical Analysis-Part-3 Welcome
Starting to Program in C++ (Basics & I/O)
Copyright by Bruce A. Draper. 2017, All Rights Reserved. Starting to Program in C++ (Basics & I/O) On Tuesday of this week, we started learning C++ by example. We gave you both the Complex class code and
Exercises: Instructions and Advice
Instructions Exercises: Instructions and Advice The exercises in this course are primarily practical programming tasks that are designed to help the student master the intellectual content of the subjects
CSCI312 Principles of Programming Languages
Copyright 2006 The McGraw-Hill Companies, Inc. CSCI312 Principles of Programming Languages LL Parsing Xu Liu Derived from Keith Cooper's COMP 412 at Rice University Recap Copyright 2006 The McGraw-Hill
C++ for Java Programmers
Basics all Finished! Everything we have covered so far: Lecture 5 Operators Variables Arrays Null Terminated Strings Structs Functions 1 2 45 mins of pure fun Introduction Today: Pointers Pointers Even
Lecture 8: Deterministic Bottom-Up Parsing
Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1 Avoiding nondeterministic choice: LR We've been looking at general