Renovating an Open Source Project: RECODER


School of Mathematics and Systems Engineering
Reports from MSI - Rapporter från MSI

Renovating an Open Source Project: RECODER

Óscar León Fernández

June 2009
MSI Report
Växjö University
ISSN
SE VÄXJÖ
ISRN VXU/MSI/DA/E/ /--SE

Abstract

Renovating a program is always hard work and full of problems. It requires good knowledge of the old program as well as research and investigation. RECODER is a tool for supporting static metaprogramming of Java program sources. This tool has been patched over and over again as new versions of Java came out. Because of that, the code has become dirty and full of patches. In order to clean up the implementation, the code should be renovated. In this thesis we present the changes that were introduced to the grammar and the implementation of the new parser for RECODER using a different technology.

Key-words: AST, EBNF, Java, RECODER

Contents

1. Introduction
   1.1 Motivation
   1.2 Goal Criteria
   1.3 Overview of the report
2. Background
   2.1 BNF and EBNF
   2.2 Description of RECODER
   2.3 Parser Generators for Java
       2.3.1 JavaCC
       2.3.2 ANTLR
       2.3.3 SableCC
       2.3.4 CookCC
       2.3.5 CUP
3. Analysis of the JavaCC Parser Specification in the current RECODER version
   3.1 Analysis
   3.2 Conclusion
4. Description of the new implementation
   4.1 Analysis
   4.2 Conclusion
5. Evaluation
   5.1 Compatibility
   5.2 Maintainability
6. Conclusion & Future Work
   6.1 Summary
   6.2 Conclusion
   6.3 Future Work
References


1 Introduction

Reverse engineering has been used for renovating programs since the beginning of the modern computer. Nowadays, new technologies are constantly emerging that can improve performance or make a given task easier. But changing from an old technology to a new one is not easy. Several problems arise when you want to renovate a program: for example, the functionality of the program has to stay the same while the internal structure changes. From the point of view of the end user, no change in the functionality of the program should be visible. In terms of maintainability, the program has to be easy to correct and easy to read in order to make future maintenance easier.

Understanding the program is vital for renovating it. The majority of software development time is spent on the maintenance of old programs and not on the development of new software. To gain knowledge of the old programs, engineers need reliable information about the system, but this is not always available. Sometimes the documentation of a program is imprecise or incomplete, and sometimes the program is not documented at all. Also, the architecture described in a project does not always match what is actually programmed in the source code. That is why the most reliable source of information is the code itself, although it is also the most difficult to analyze. Hence we need tools that analyze the source code and return enough information for understanding the existing software system.

Reverse engineering is a process for obtaining the information necessary to renovate the code. With it we can recover the design specification of a system from its implementation. The recovered design specification helps us to understand the program before restructuring the code. The information can also be used as feedback for a new requirements specification or to help with maintenance tasks.

1.1 Motivation

RECODER has many parts that are deprecated or need deep remodelling. Some of these parts are based on old technologies, are badly documented, or consist of dark code (code that is a real mess or very confusing). The main problem that we confront here is renovating the front-end of the open source program RECODER; to be more precise, we have to renovate RECODER's parser for Java code. The parser is one of the most important parts of RECODER.

RECODER was created when Java was at version 1.2, and the grammar specification was originally written for that version. As new versions of Java were released, the grammar was patched and changed in places in order to follow the new specifications. In particular, the last versions, 1.5 and 1.6, introduced a lot of changes in the specification. These changes were introduced into RECODER's grammar with many patches and hacks. All these patches finally made the grammar dark, messy and hard to understand. For this reason a new grammar following the new specification is needed, in order to clarify the grammar and improve its readability and maintainability. It is also an opportunity to change the parser generator and introduce a new technology.

The current parser for Java source code is based on a JavaCC grammar, and we transform and update this grammar into a new one based on ANTLR.

1.2 Goal Criteria

This master thesis is focused on renovating the source code front-end of the tool RECODER. The first goal is to replace the current grammar, which was upgraded step by step to version 1.6, with a new specification written directly for the new version, in order to make it clearer. The new grammar should use ANTLR instead of JavaCC. The Abstract Syntax Tree (AST) created by the new grammar should be the same as the one created by the old grammar, so that nothing changes in the final result when a program is parsed.

Good functionality of the new parser based on ANTLR is desirable. This can be checked with the test suite that comes with the last released version of RECODER. The new parser should pass all the tests if possible and, if not, an explanation of the error and a possible solution should be given. The new parser therefore needs to be fully compatible with the old one, even though they are based on different technologies, so that the end user does not notice any change in functionality.

The new code has to be maintainable. The source code (in this case the grammar) has to be clear for the next developer who wants to deal with the new implementation for future changes. The new grammar has to be similar to the current specification in order to maintain compatibility. In summary, the work should:

- Replace the old grammar, based on Java 1.2 and patched, with a new one based directly on Java 1.6.
- Change the technology from JavaCC to ANTLR.
- Create the same AST as the old version.
- Pass all the tests in the test suite (if not, explain why they fail).

1.3 Overview of the report

This report is structured as follows. In Section 2 we give background information that is necessary to understand the problem and the parser tools that can be used. In Section 3 we describe the current grammar specification and point out some weak parts of the grammar. In Section 4 we describe our new implementation, explaining the differences between the original ANTLR grammar, the final version and the JavaCC grammar. In Section 5 we discuss the results that we have obtained. Finally, in Section 6, we give a conclusion and possible future work.

2 Background

In this section we talk about the notations used to represent grammars. We give an overview of RECODER, the framework that we want to renovate, and a brief explanation of some parser generators for Java. The most important ones for us are JavaCC, which is used in the current version of the parser in RECODER, and ANTLR, which is used in the new implementation of the parser.

2.1 BNF and EBNF

Backus Normal Form (BNF) is a formal, mathematical way to describe a language. A language is described with a set of terminals (tokens), a set of non-terminals, a start symbol that belongs to the set of non-terminals, and the rules or productions. The Extended Backus Normal Form (EBNF) is, as its name says, an addition to the normal notation that allows lists and optional symbols to be expressed in a very easy way. Three new symbols are added to the notation. The asterisk or star * means repeat the preceding symbol zero or more times; this operator is also called the Kleene star or Kleene closure. The plus + means repeat one or more times. The last operator added is the optional operator, represented with a question mark ?, which means that the element appears zero or one time. A small illustrative example of these operators is shown below.

2.2 Description of RECODER

RECODER is a tool for supporting static metaprogramming of Java program sources (the manipulation of the input program is called static when it is done at compile time, and dynamic when it is done at runtime). RECODER parses and analyzes Java programs and also performs transformations on the sources, saving the results in new files. A metaprogram is a program that takes another program as input data and manipulates that input in some way.

Figure 2.1: How RECODER works [1]

RECODER derives a metamodel of all the entities found in Java source files and class files. The model contains a detailed syntactic representation that can be unparsed to a file again.
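Returning for a moment to the EBNF operators of Section 2.1, the following small rules illustrate how they are used. The rule names are only illustrative; they are not taken from RECODER's grammar.

ParameterList : Parameter ("," Parameter)*        // zero or more further parameters after the first one
Digits        : (Digit)+                          // one or more digits
Dimension     : "[" (Expression)? "]"             // the size expression may be omitted

Each of these forms appears many times in the grammars discussed in the rest of this report.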

Figure 2.1 shows how program sources are parsed and analyzed. RECODER derives a metamodel of the program with all the entities found in the sources. Then, with a metaprogram as input, the code is transformed, possibly many times, and unparsed to a new source code file. RECODER reads all the source code completely, assuming that the input program compiles correctly. All transformations are applied to the source code; the bytecode is read only at declaration level. In other words, the bytecode is read only in order to know which methods and fields belong to a given class.

2.3 Parser Generators for Java

There are many parser generators for Java, but some are used more than others. Here we give an overview of some parser generators for Java. JavaCC is probably the most used at the moment. ANTLR is a good parser generator whose number of users is rising. SableCC, CookCC and CUP are other parser generators that are in use [2].

A parser generator is a tool that creates a parser from a formal description of the language, normally expressed as a BNF or EBNF grammar. The grammar describes the language that is to be parsed. Normally, when a program is parsed, a syntax tree is created. This syntax tree can be reduced by eliminating the symbols that carry no information. These reduced trees are called Abstract Syntax Trees, and the parser generators normally provide facilities to create them. In between the symbols of the rules, actions can be added that allow us to insert code for calculating attributes or values. The abstract syntax tree of RECODER is created inside these actions and not by using the tree-building facilities of the parser generators.

2.3.1 JavaCC

JavaCC [3] (Java Compiler Compiler) is an open source parser generator for the Java programming language (BSD license). The JavaCC grammar is LL(k) and can be written in EBNF notation. JavaCC generates top-down parsers (the tree is generated from the root to the leaves) and it does not allow left recursion, because an LL(k) parser applied to a left-recursive rule cannot decide which alternative has to be taken. JavaCC also provides other standard capabilities related to parser generation, such as tree building and actions.

2.3.2 ANTLR

ANTLR [4] (ANother Tool for Language Recognition) is a parser generator that automates the construction of language recognizers. It is possible to add actions with code snippets in different programming languages, and tree building is also supported. ANTLR can generate code for different target languages, although it generates Java code by default. The generated code is human-readable and easy to fold into other applications. The generated parser is a recursive descent recognizer using LL(*), an extension of LL(k) that uses arbitrary lookahead to make decisions depending on the rule. The code is under the BSD license. ANTLR supports multiple target languages such as Java, C#, Python, Ruby, Objective-C, C, and C++.
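To give a flavour of how such embedded actions look, the following fragment sketches a small ANTLR rule with a Java action. The rule is only illustrative and is not taken from RECODER's grammar; it assumes an IDENTIFIER token is defined in the lexer.

greeting returns [String result]
    :   'hello' id=IDENTIFIER
        { result = "Hello, " + $id.text + "!"; } // embedded Java action, run after the preceding tokens are matched
    ;

RECODER's new parser uses exactly this mechanism: the actions call RECODER's factory to build AST nodes, instead of relying on ANTLR's own tree-building facilities.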

2.3.3 SableCC

SableCC [5] is a parser generator that generates object-oriented frameworks for building compilers. It can generate strictly typed ASTs, and tree walkers are included. SableCC has a clean separation between the generated code and the code written by the user.

2.3.4 CookCC

CookCC [6] is a parser generator written in Java, but the target code can vary. It uses templates to generate source code, so it is easy to add a new target language. It also comes with a suite of test cases to assist in creating and testing new target languages. A unique feature of CookCC is that it allows the lexer and parser to be specified using Java annotations. This feature simplifies and eases the writing of lexers and parsers for Java.

2.3.5 CUP

CUP [7] (Constructor of Useful Parsers) is a system which generates LALR parsers from simple grammar specifications. It works similarly to the well known parser generator YACC [8] and offers most of its features. However, CUP is written in Java, uses specifications including embedded Java code, and produces parsers which are implemented in Java.
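One practical consequence of the choice between these tools is how recursion in the grammar has to be written. Since JavaCC and the ANTLR version used here generate top-down (LL) parsers, left-recursive rules cannot be used directly and must be rewritten, typically with EBNF repetition. A generic example, not taken from either grammar:

// Left recursive: rejected by LL parser generators such as JavaCC and ANTLR
expression : expression '+' term
           | term
           ;

// Equivalent form using EBNF repetition
expression : term ('+' term)*
           ;

LALR tools such as CUP, in contrast, accept the left-recursive form directly.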

3 Analysis of the JavaCC Parser Specification in the current RECODER version

In this section we discuss the current parser specification in depth, including some patches that show up in the grammar and code that has been commented out. The first subsection contains the complete analysis of the grammar, followed by a conclusion with the ideas that come out of the analysis.

The current grammar from which the parser is generated has been patched and updated every time the Java specification has changed. The patches that have been added have made the grammar messy and unreadable. The grammar is written for an old version of the parser generator JavaCC. Because of that, not only is the grammar deprecated, but some functionality that it uses is not necessary anymore.

3.1 Analysis

At the beginning of the grammar we can already see that it has been patched over and over again, as the following comment shows [Listing 3.1]. The comment talks about one patch adding semicolons, which is shown later in this section [Listing 3.18].

/** JavaCC AST generation specification based on the original Java1.1 grammar
    that comes with javacc, and includes the modification of D. Williams to
    accept the Java 1.2 strictfp modifier. Several patches have been added to
    allow semicolon after member declarations. */

Listing 3.1: The code has been patched, first by adding semicolons

Inside the body of the JavaCC parser there are some variables and methods that are no longer useful. These methods are related to the use of the .super predicate inside an explicit constructor invocation and are shown in the next code [Listing 3.2].

boolean superAllowed = true;

private boolean isSuperAllowed() {
    return superAllowed;
}

private void setAllowSuper(boolean b) {
    superAllowed = b;
}

Listing 3.2: Code that is no longer used

Some code exists to maintain downward compatibility with old versions of Java. In the old versions of Java there is no assert, and this word can be a normal identifier. To handle this, the following variables and methods are used to keep track of whether assert is a keyword. RECODER must be able to parse old programs written for old versions of Java as well as new code written for the new versions. This code was added when the new versions of Java were released [Listing 3.3].

boolean jdk1_4 = false;
boolean jdk1_5 = false;

public boolean isAwareOfAssert() {
    return jdk1_4;
}

public void setAwareOfAssert(boolean yes) {
    jdk1_4 = yes;
    if (yes == false)
        jdk1_5 = false;
}

public boolean isJava5() {
    return jdk1_5;
}

public void setJava5(boolean yes) {
    jdk1_5 = yes;
    if (yes)
        jdk1_4 = true;
}

Listing 3.3: Downward compatibility

There are also some methods, specific to JavaCC, for managing the position of the tokens in the code. By setting the size of a tabulation, JavaCC can report the correct offset within a line. These methods are used to set and get the number of white spaces that are equivalent to one tabulation. In some code bases the tabulation is fixed to a specific number of white spaces; sometimes it is eight and sometimes four, which is why we need to be able to fix the size of a tabulation [Listing 3.4].

public void setTabSize(int tabSize) {
    jj_input_stream.setTabSize(tabSize);
}

public int getTabSize() {
    return jj_input_stream.getTabSize(0); // whatever...
}

Listing 3.4: Code specific to JavaCC

There are some global variables in RECODER that are used inside the specification. These variables could be replaced by local variables, which would make the code easier to understand. When an array is declared with an unknown number of dimensions, the variable tmpDimension is used in the declarations to count the number of dimensions. This variable is incremented every time a dimension is found while reading the source [Listing 3.5].

/** temporary valid variable that is used to return an additional argument
    from parser method VariableDeclaratorId, since such an id may have a
    dimension */
private int tmpDimension;

Listing 3.5: Temporary variable that could be removed

Other support code has been erased because it is no longer necessary or because it has been adapted to the new ANTLR parser generator [Listing 3.6]. The first function, copyPrefixInfo, copies the relative position, start position and end position of an element of the AST. The function shiftToken maintains an iterator called current. This iterator stops at the token just before the actual one. Sometimes the token before it is a special token, as is the case for comments; in that case the token before is the special token. From the position of the actual token and the token before it, the relative position is calculated so that it can be set later on.

private void copyPrefixInfo(SourceElement oldResult, SourceElement newResult) {
    newResult.setRelativePosition(oldResult.getRelativePosition());
    newResult.setStartPosition(oldResult.getStartPosition());
    newResult.setEndPosition(oldResult.getEndPosition());
}

private void shiftToken() {
    if (current != token) {
        if (current != null) {
            while (current.next != token) {
                current = current.next;
            }
        }
        Token prev;
        if (token.specialToken != null) {
            prev = token.specialToken;
        } else {
            prev = current;
        }
        if (prev != null) {
            int col = token.beginColumn - 1;
            int lf = token.beginLine - prev.endLine;
            if (lf <= 0) {
                col -= prev.endColumn; // - 1
                if (col < 0) {
                    col = 0;
                }
            }
            position.setPosition(lf, col);
        }
        current = token;
    }
}

Listing 3.6: Functions unused in the new implementation

In the grammar specification, the token specification for the scanner appears first, followed by the rules for the parser. Three tokens deserve special mention because they show how the downward compatibility discussed before works [Listing 3.3]. We can see that if the Java version is old, the kind of the token is changed to IDENTIFIER. We can also see that something still remains to be done when @ is found [Listing 3.7].

< ASSERT: "assert" > { if (!myParser.jdk1_4) matchedToken.kind = IDENTIFIER; }
< ENUM: "enum" >     { if (!myParser.jdk1_5) matchedToken.kind = IDENTIFIER; }
< AT: "@" >          { if (!myParser.jdk1_5) { /* TODO */ } }

Listing 3.7: Code for downward compatibility

A program in Java can be contained in one or more compilation units. The first thing that can appear in a compilation unit is a package name, but it is not required. Then zero or more import declarations can appear, and after that some types can be declared. As we can see, the compilation unit rule has two alternatives: one with a package declaration and one without. This duplication could be avoided using the EBNF optional operator ? (which JavaCC writes with brackets [ ]), reducing the code substantially [Listing 3.8]. We can also see at the beginning one of the quick fixes done in the grammar.

CompilationUnit CompilationUnit() :
    // This is a quick "fix" - TypeDeclaration and PackageDeclaration unfortunately
    // can both start with an unlimited number of annotations. However, usually only one file
    // per package contains package annotations, so this is not a performance issue.
    ( LOOKAHEAD(PackageDeclaration())
      PackageDeclaration() (ImportDeclaration())* (TypeDeclaration())*
    | (ImportDeclaration())* (TypeDeclaration())*
    )

Listing 3.8: Compilation Unit in the grammar
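A sketch of how the rule could look with the optional operator instead of the duplicated alternatives; this is only an illustration of the simplification suggested above, not the rule as it stands in RECODER:

CompilationUnit CompilationUnit() :
    [ LOOKAHEAD(PackageDeclaration()) PackageDeclaration() ]
    (ImportDeclaration())*
    (TypeDeclaration())*

The lookahead is kept because, as the comment in Listing 3.8 explains, both a package declaration and a type declaration can start with annotations, but the duplicated (ImportDeclaration())* (TypeDeclaration())* tail disappears.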

The possible types are the following: classes, interfaces, enumerations and annotations [9] [Listing 3.9]. In Listing 3.9 we can also see code that can be removed, as the comment itself says. Next we can see the type declaration with its lookahead; this is not necessary in the new parser because it is LL(*): it will choose the best branch, consuming as many tokens as necessary to decide.

TypeDeclaration TypeDeclaration() :
    ( LOOKAHEAD( ( "abstract" | "final" | "public" | "strictfp" | AnnotationUse() )* "class" )
      result = ClassDeclaration()
      ...
    | ";"
    )
    {
        if (result != null) // may be removed as soon as Recoder fully understands Java5
            setPostfixInfo(result);
        return result;
    }

Listing 3.9: Type declarations with deprecated code at the end

In some parts of the grammar we can see code that has been commented out because the grammar has changed. In the following code the declaration of a constant inside an AnnotationTypeDeclaration is commented out [Listing 3.10].

// ConstantDeclaration
/*LOOKAHEAD( (AnnotationUse() | "public" | "static" | "final")* Type() VariableDeclarator(true))
  (AnnotationUse() | "public" | "static" | "final")* Type() VariableDeclarator(true)
  ( "," VariableDeclarator(true))* ";"*/

Listing 3.10: Commented code that has not been removed

In method declarations, when the method has a generic type, the current grammar contains a hack. This hack is not needed: the first token that appears in the rule can be taken in order to set the prefix. The hack is used in more places that use generic types and can simply be deleted. It will not be shown again, but it appears several times, for example in the constructor declaration [Listing 3.11].

15 [ "<" if (ml.size() == 0) // '<' of MethodDeclaration is first element then. Need to store the result somewhere... dummy = factory.createpublic() setprefixinfo(dummy) /* HACK */ typeparams = TypeParametersNoLE() ] Listing 3.11: Generic parameters in a method declaration In the same rule, methoddeclaration, we find old code commented out that it is not necessary anymore and can be deleted [Listing 3.12]. th.setexceptions(trl) // Throws th = factory.createthrows(trl) result.setthrown(th) Listing 3.12: More out commented code The code it is completely full of out commented code as we can see and we found a lot of patches and code from the past commented, that do the code more unreadable and dirty. Another example of out commented code it is found inside of the rule formalparameter and we can see how is used the variable tmpdimension that is incremented inside of the rule variabledeclaratorid [Listing 3.13]. id = VariableDeclaratorId() dim = tmpdimension /*if (varargspec!= null) dim++*/ Listing 3.13: Another piece of out commented code There is full rule that has been commented out probably to improve the performance with or because the changes in the grammar when the new versions of Java where released [Listing 3.14]. /* ASTList<UncollatedReferenceQualifier> NameList() : ASTList<UncollatedReferenceQualifier> result = new ASTArrayList<UncollatedReferenceQualifier>() UncollatedReferenceQualifier qn qn = Name() result.add(qn) 11

16 ( "," qn = Name() result.add(qn) )* return result */ Listing 3.14: Entire rule commented out Inside of an expression we can see this comment that talk about one expansion for performance reasons and but also says that is a weakness of the grammar that should be solved [Listing 3.15]. /* * This expansion has been written this way instead of: * Assignment() ConditionalExpression() * for performance reasons. * However, it is a weakening of the grammar for it allows the LHS of * assignments to be any conditional expression whereas it can only be * a primary expression. Consider adding a semantic predicate to work * around this. */ Listing 3.15: Comment showing a weakness of the grammar We found more strange comments in the enumconstant rule. In this code is set the position of the start position and the end position of an EnumConstructorReference, and as we can see this can be done before in the grammar [Listing 3.16]. ref = factory.createenumconstructorreference(args, cd) setprefixinfo(ref) // TODO this maybe too late?! setpostfixinfo(ref) spec = factory.createenumconstantspecification(id, ref) setprefixinfo(spec) // TODO this maybe too late?! Listing 3.16: Suspicious comments that inform us of a possible change At the end of the class declaration we also found a comment that gives us a clue that something is not well done in that place. This comment is after set the end position of a class declaration that should be a brace [Listing 3.17]. result.setmembers(mdl) setpostfixinfo(result) // coordinate of ""?! return result Listing 3.17: Another suspicious comment 12

In the following production of the grammar we can see the patch mentioned before: at the end of a field declaration a semicolon is always allowed, and it is marked with comments showing the patches. This patch appears not only in the members of a class; it is also repeated in the declarations of all the members of an interface, which can be a static block, a nested declaration or a method declaration [Listing 3.18].

MemberDeclaration ClassBodyDeclaration() :
    ( ...
    | (FieldDeclaration() (";")*) // patch
    )

Listing 3.18: Semicolon patch after a member declaration

We have found more commented-out code in VariableDeclaratorId; once again it is about setting the position [Listing 3.19].

Identifier VariableDeclaratorId() :
    ...
    setPostfixInfo(result);
    //setPrefixInfo(result);
    return result;

Listing 3.19: Commented code in a rule

In the shift expressions, code has been commented out to improve the performance of the JavaCC parser with some predicates [Listing 3.20]. This code is useless in the new implementation because it is not needed to recognize the token: instead of the token >> a new rule called RSIGNEDSHIFT is used, and instead of >>> a rule called RUNSIGNEDSHIFT.

Expression ShiftExpression() :
    AdditiveExpression()
    ( ( "<<"
      // | ">>"
      | RSIGNEDSHIFT()
      // | ">>>"
      | RUNSIGNEDSHIFT()
      )
      AdditiveExpression()
    )*

Listing 3.20: Code that is useless in the new implementation

Some parts of the grammar can be restructured in order to avoid unnecessary error checking after parsing. An example of this appears in the try-catch-finally statement. After a try block, at least one catch block or one finally block has to appear. In the current grammar this fact has to be checked in the semantic analysis [Listing 3.21].

/*
 * Semantic check required here to make sure that at least one
 * finally/catch is present.
 */
TryStatement : "try" BLOCK ( "catch" "(" FORMALPARAMETER ")" BLOCK )* ( "finally" BLOCK )?

Listing 3.21: Current try statement specification in EBNF

This can be solved very easily syntactically, and it is not necessary to do it semantically [Listing 3.22].

TryStatement : "try" BLOCK
    ( ( "catch" "(" FormalParameter() ")" BLOCK )+ ( "finally" BLOCK )?
    | "finally" BLOCK
    )

Listing 3.22: Hypothetical solution to the problem shown before

We found another hack, related to the one shown before [Listing 3.11], that has been added to insert the correct position of a method declaration [Listing 3.23]. This rule separates the symbol < from the rest of the type parameters of a generic type. That is done to manage the start position of a method declaration: depending on whether the number of modifiers is zero, the first element can be the less-than symbol.

// HACK for handling position of methoddeclarations correctly
ASTList<TypeParameterDeclaration> TypeParametersNoLE() :
{
    ASTList<TypeParameterDeclaration> res = new ASTArrayList<TypeParameterDeclaration>();
    TypeParameterDeclaration tp;
}
{
    tp = TypeParameter() { res.add(tp); }
    ("," tp = TypeParameter() { res.add(tp); })*
    ">"
    { return res; }
}

Listing 3.23: Hack for handling the position of method declarations

In PrimaryExpression we can also see a lot of code that should be changed, and suspicious comments [Listing 3.24].

The comments show that something needs to be changed. The first comment, with its many exclamation marks, probably points out that we are returning the result before the end of the function. The other comments talk about the types that should be at that position.

Expression PrimaryExpression() :
    ...
    return result; //!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    ...
    // the prefix MUST be a type expression!!!!!
    // we currently only create UncollatedReferenceQualifiers
    ...
    // should be a FieldReference?
    ...

Listing 3.24: Strange comments in primary expression

Inside the production PrimaryPrefix we found a rule that has been commented out, probably because the grammar was changed [Listing 3.25].

// LOOKAHEAD(NonWildcardTypeArguments() "this" Arguments())
// NonWildcardTypeArguments() "this" /* Arguments() is a mandatory suffix here!*/
// {
//     prefix.type = prefix.this
//     System.err.println("Ignoring NonWildcardTypeArguments!")
// }
//

Listing 3.25: Commented rule in PrimaryPrefix

Inside the production StatementExpression we can see evidence of another expansion. This production generates more than the legal Java for this kind of statement [Listing 3.26].

/*
 * The last expansion of this production accepts more than the legal
 * Java expansions for StatementExpression. This expansion does not
 * use PostfixExpression for performance reasons.
 */

Listing 3.26: Comment that shows us an expansion

Inside the rule ForStatement we can see commented-out code. This code was changed when the enhanced for loop (for each) was introduced in the newer versions of Java. Now the size of the lookahead in the rule has been changed in order to predict the alternative that has to be taken [Listing 3.27].

// {
//     result = factory.createFor();
//     setPrefixInfo(result);
// }
// "("

Listing 3.27: Commented-out code inside the for loop

3.2 Conclusion

In this section we have talked about the RECODER grammar. We have discovered a lot of old commented-out code that needs to be cleaned up. We have also found comments that point out possible errors in the grammar. Some rules can be simplified, which improves the grammar a bit. To conclude this section, we have seen why it is necessary to change the grammar and clean up the code.

4 Description of the new implementation

The new version of the parser for RECODER is based on a different parser generator called ANTLR. The syntax of this parser generator is similar to JavaCC but differs in some details. The grammar is written in EBNF and accepts actions embedded in the rules. In JavaCC, optional parts of the grammar are expressed with square brackets instead of the usual notation that uses ?; ANTLR is more readable with the operator ?, which is the standard in this notation. The original grammar on which we base our implementation is taken from the official ANTLR webpage, which contains many grammars for different languages and purposes. The original grammar can be downloaded for free from this link [10].

4.1 Analysis

First of all, we talk about the changes introduced in the original grammar from the ANTLR webpage to adapt it to our use with RECODER. Other changes have been introduced in order to clean up some rules or because they are not useful. The first change was made in order to simplify the grammar. In the original grammar an import declaration is described as below [Listing 4.1].

importdeclaration
    :   'import' ('static')? IDENTIFIER '.' '*' ';'
    |   'import' ('static')? IDENTIFIER ('.' IDENTIFIER)+ ('.' '*')? ';'
    ;

Listing 4.1: Original import declaration in the ANTLR grammar

As we can see, we can merge the two alternatives by changing the + into an asterisk and deleting the first alternative. The result is shown below [Listing 4.2]; the rule has been reduced by five lines and is clearer to read.

importdeclaration
    :   'import' ('static')? IDENTIFIER ('.' IDENTIFIER)* ('.' '*')? ';'
    ;

Listing 4.2: Import declaration simplified
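Both versions of the rule accept the same import declarations. For example, all of the following ordinary, on-demand and static imports (sample input only, not grammar rules) are matched by the merged rule of Listing 4.2:

import java.util.List;
import java.util.*;
import static java.lang.Math.max;
import static java.lang.Math.*;

The first alternative of the original rule, an identifier directly followed by .*, is simply the case where the ('.' IDENTIFIER)* loop of the merged rule is taken zero times, as in import java.*;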

The next change to the original grammar is deleting a rule that is unused. After checking that the rule is never used and is not useful to us, we decided to delete it. Judging by its name, the rule was supposed to be used in the import declaration above, but probably somebody changed the grammar and forgot to erase it [Listing 4.3].

qualifiedimportname
    :   IDENTIFIER ('.' IDENTIFIER)*
    ;

Listing 4.3: Rule deleted because it is not used

When a modifier appears (for example in a class declaration), the original grammar uses the following production [Listing 4.4]. In order to make the grammar similar to the JavaCC grammar, the asterisk has been taken out of this production and written outside of it, in every rule that uses modifiers. This change does not modify the language; it is just written in a different way.

modifiers
    :   ( annotation | 'public' | 'strictfp' )   //* -> The Kleene closure is now outside of this rule
    ;

Listing 4.4: Rule modified for similarity with the JavaCC grammar

We have changed the explicit constructor invocation in order to make it equal to the JavaCC grammar. The grammar was a bit different, but this does not change the final result. We can see the original production here [Listing 4.5].

explicitconstructorinvocation
    :   (nonwildcardtypearguments)?
        //NOTE: the position of Identifier 'super' is set to the type args position here
        ('this' | 'super') arguments ';'
    |   primary '.' (nonwildcardtypearguments)? 'super' arguments ';'
    ;

Listing 4.5: Original explicit constructor invocation

We can see that the language it expresses is the same; it is just a reordering of the grammar [Listing 4.6].

explicitconstructorinvocation returns [SpecialConstructorReference result]
    :   (nonwildcardtypearguments)? 'this' arguments ';'
    |   (expr = primarynosuper '.')? (nonwildcardtypearguments)? 'super' arguments ';'
    ;

Listing 4.6: Explicit constructor invocation in the final grammar

Normally, in a block statement you can declare local variables and classes, and you can also insert normal statements (including new blocks). In this sense the original grammar is less restrictive than the JavaCC grammar, because it allows interfaces to be declared inside a block, which is not allowed [Listing 4.7].

blockstatement
    :   localvariabledeclarationstatement
    |   classorinterfacedeclaration
    |   statement
    ;

Listing 4.7: Original block statement in the grammar

The grammar has been changed so that the second alternative uses a normal class declaration instead of a class or interface declaration, in order to stay closer to the JavaCC specification and save a semantic check [Listing 4.8].

blockstatement returns [Statement result]
    :   localvariabledeclarationstatement
    |   normalclassdeclaration
    |   statement
    ;

Listing 4.8: Changed block statement

We can now see how the problem that we pointed out before [Listing 3.21] is solved, in a way very similar to the possible solution we presented [Listing 3.22]: either a catch block or a finally block always appears [Listing 4.9].

trystatement
    :   'try' block
        ( catches 'finally' block
        | catches
        | 'finally' block
        )
    ;

Listing 4.9: Solution of the problem in the try statement

But we have more to say about the try statement. The original grammar introduced catches as a list of catch clauses [Listing 4.10], but that is not so convenient for the translation from the current JavaCC grammar to the new one, so some changes were introduced [Listing 4.11].

trystatement
    :   'try' block
        ( catches 'finally' block
        | catches
        | 'finally' block
        )
    ;

catches
    :   catchclause (catchclause)*
    ;

catchclause
    :   'catch' '(' formalparameter ')' block
    ;

Listing 4.10: Original try-catch-finally specification

Instead of this, the production catches was deleted, and the catch clause was renamed to catch block. The rule formalparameter was also replaced by normalparameterdecl, which contains the same content; because of that, formalparameter can be deleted, since it is not used anywhere else in the grammar. The resulting grammar looks like this [Listing 4.11].

trystatement
    :   'try' block
        ( catchblock+ 'finally' block
        | catchblock+
        | 'finally' block
        )
    ;

catchblock
    :   'catch' '(' normalparameterdecl ')' block
    ;

Listing 4.11: Modified try-catch block

There is one little change in the enhanced for loop statement in order to be closer to the JavaCC specification [Listing 4.12].

forstatement
    :   // enhanced for loop
        'for' '(' variablemodifiers type IDENTIFIER ':' expression ')' statement
    |   // normal for loop
        'for' '(' (forinit)? ';' (expression)? ';' (expressionlist)? ')' statement
    ;

Listing 4.12: Little change in the for loop

Instead of the variable modifiers, the type and the identifier, the rule now uses the for loop initialization, the same as in the normal for loop.

The biggest change made in the grammar concerns the primary expressions. The primary expressions in the ANTLR grammar were difficult to translate from the JavaCC grammar, so the approach was to take the JavaCC grammar and translate this part into the ANTLR syntax. After that we have to be sure that the meaning of the grammar does not change: it has to generate the same AST as the grammar before. In order to understand the equivalence between both versions, note that in the original ANTLR grammar the production that contains primary is a primary expression followed by an arbitrary number of selectors (primary selector*). In the final specification the structure is primaryprefix selector*.

primary
    :   parexpression
    |   'this' ('.' IDENTIFIER)* (identifiersuffix)?
    |   IDENTIFIER ('.' IDENTIFIER)* (identifiersuffix)?
    |   'super' supersuffix
    |   literal
    |   creator
    |   primitivetype ('[' ']')* '.' 'class'
    |   'void' '.' 'class'
    ;

primaryprefix
    :   parexpression
    |   'this'
    |   IDENTIFIER ('.' IDENTIFIER)* ( ('[' ']')* '.' 'class')?
    |   'super' '.' IDENTIFIER
    |   literal
    |   creator
    |   primitivetype ('[' ']')* '.' 'class'
    |   'void' '.' 'class'
    ;

Listing 4.13: Comparison between primary prefixes before and after the change (primary is the original ANTLR rule, primaryprefix the final one)

The listing above tries to match up the differences between the original ANTLR grammar and the final result: the original rule is shown first and the final one second [Listing 4.13]. At the beginning of the rules we can see that there are similarities. The parenthesized expression is still the same; however, the alternative that starts with 'this' has something more after it in the original, and the same happens at the end of the identifier alternative and the super alternative. The rest is the same as before. But as we can see, we have to introduce more changes in order to leave the language unaltered. The original grammar is difficult to follow because it mixes the prefix with the suffix in a primary expression. The identifier and super suffixes are attached inside the primary rule, so it is easier and clearer to move them into selector.

It is also better for the later translation because it is closer to the JavaCC specification.

supersuffix
    :   arguments
    |   '.' (nonwildcardtypearguments)? IDENTIFIER (arguments)?
    ;

identifiersuffix
    :   ('[' ']')+ '.' 'class'
    |   ('[' expression ']')+
    |   arguments
    |   '.' 'class'
    |   '.' nonwildcardtypearguments IDENTIFIER arguments
    |   '.' 'this'
    |   '.' 'super' arguments
    |   innercreator
    ;

selector
    :   '.' IDENTIFIER (arguments)?
    |   '.' 'this'
    |   '.' 'super' supersuffix
    |   innercreator
    |   '[' expression ']'
    ;

selector
    :   selectornosuper
    |   '.' 'super' supersuffix
    ;

selectornosuper
    :   '.' IDENTIFIER (arguments)?
    |   '.' 'this'
    |   '.' creator
    |   '[' expression ']'
    |   '.' nonwildcardtypearguments IDENTIFIER
    ;

Listing 4.13: Comparison between primary suffixes before and after the change; the first three rules belong to the original grammar, the last two to the final grammar

The original suffix rules (the first group above) are a bit difficult to understand. Since selector can be repeated as many times as we want, repeating the selector rule of the final grammar generates the same constructs as the original rules. It is difficult to see that both grammars generate the same language, but they are equivalent. So innercreator, identifiersuffix and supersuffix are moved around in the grammar without any effect on the final result.

We can also see that in the final grammar there is a production selectornosuper. This is a change introduced to control the use of the .super predicate, which before was controlled by a boolean variable [Listing 3.2]. A new rule selectornosuper is introduced, without the .super postfix, together with a primarynosuper rule that repeats this selector instead of the normal one when we are inside explicitconstructorinvocation.

Continuing with the changes to the original grammar, the production classcreatorrest, shown below in Listing 4.14, has been replaced by its content in creator.

creator
    :   'new' nonwildcardtypearguments classorinterfacetype classcreatorrest
    |   'new' classorinterfacetype classcreatorrest
    |   arraycreator
    ;

classcreatorrest
    :   arguments (classbody)?
    ;

Listing 4.14: Eliminated rule classcreatorrest

Finally, we can add some general simplifications to the grammar. When we find something like A (A)*, we can change it to (A)+. Sometimes it is better to leave the grammar as it is, but sometimes we can save space and simplify the code a bit.

In the next paragraphs of this section we discuss the differences between the current grammar used with JavaCC and the new one for ANTLR, with the changes explained earlier in this section. Comparing the compilation unit in the new ANTLR grammar [Listing 4.15] with the compilation unit in the JavaCC grammar, we can see that they are very close to each other. The only difference between the two is that the annotations are outside of the package declaration and that the two alternatives are merged into one, as we explained above for Listing 3.8.

compilationunit returns [CompilationUnit result]
    :   ((annotations)? packagedeclaration)? (importdeclaration)* (typedeclaration)*
    ;

Listing 4.15: Compilation unit in the new grammar

Apart from the annotations inside the package declaration, the rules are the same, and so are the import declarations. In the type declarations we can see the first big difference between the two grammars: they are just organized in another way, with different rules. In the JavaCC grammar the type declaration is directly a class, an interface, an enum or an annotation. The ANTLR grammar is a bit more complex: a type can be a class declaration or an interface declaration. As we said before, an enum is a special kind of class, so under the class declaration we have the normal class declaration and the enum declaration. The same happens for the interface declaration, where we have the normal interface declaration and the annotation type [Listing 4.16].

typedeclaration returns [TypeDeclaration result]
    :   res = classorinterfacedeclaration { result = res; }
    |   ';'
    ;

classorinterfacedeclaration returns [TypeDeclaration result]
    :   classdeclaration
    |   interfacedeclaration
    ;

classdeclaration returns [TypeDeclaration result]
    :   normalclassdeclaration
    |   enumdeclaration
    ;

interfacedeclaration returns [TypeDeclaration result]
    :   normalinterfacedeclaration
    |   annotationtypedeclaration
    ;

Listing 4.16: Type declarations in the ANTLR grammar

Between the current annotation-type declaration and the new one there are a few differences. The new grammar separates the body of the annotation type into its own rule, and the different members inside it into a new rule as well. In this way everything is more ordered, but maybe it is less efficient in terms of space and time. With the enumerations the same thing happens as with the annotations: the rule is almost equal to the current grammar, but it has one more intermediate rule for the body and another for the list of constants. The classes have the same structure, if we ignore that in the JavaCC version the modifiers are kept outside in order to distinguish local classes from normal classes. In the new grammar we use the same production for both, but this can easily be changed if you want to maintain the same structure as before.

We can see more things, like the fact that classes extend one type and implement a type list. In the JavaCC grammar, TypedName is used for this. However, there is another production in the old grammar called Type, and because of that the rule names are a bit confusing, so we need to know what the rules are called in both grammars. In the new grammar, type can be a primitive type or a complex type created by the user (class, interface, enum or annotation) [Listing 4.17].

type
    :   classorinterfacetype ('[' ']')*
    |   primitivetype ('[' ']')*
    ;

createdname
    :   classorinterfacetype
    |   primitivetype
    ;

classorinterfacetype
    :   IDENTIFIER (typearguments)? ('.' IDENTIFIER (typearguments)?)*
    ;

primitivetype
    :   'boolean' | 'char' | 'byte' | 'short' | 'int' | 'long' | 'float' | 'double'
    ;

Listing 4.17: Type in the new grammar

In the old grammar [Listing 4.18], Type can be a TypedName, which corresponds to classorinterfacetype in the new one [Listing 4.17], or it can be a RawType. The created name is only used when we are creating a new array [Listing 4.23]. A RawType can be a PrimitiveType (this one is the same) or a Name. However, a Name is the same as a TypedName but without type arguments (which are optional in that production anyway). Name could be deleted, but as we can see in the comments it is also used in the import declarations, so that is not an option. If we deleted Name from RawType, we would probably obtain the same result, because we can obtain the type from TypedName, and the structure would be the same as in the new grammar but with different rule names.

Type() :
    TypedName() ( "[" "]" )*
  | RawType()

TypedName() :
    <IDENTIFIER> [TypeArguments()] ( "." <IDENTIFIER> [TypeArguments()] )*

RawType() :
    ( PrimitiveType() | Name() ) ( "[" "]" )*

PrimitiveType() :
    "boolean" ... "double"

UncollatedReferenceQualifier Name() :

/*
 * A lookahead of 2 is required below since Name can be followed
 * by a .* when used in the context of an ImportDeclaration.
 */
    <IDENTIFIER> ( "." <IDENTIFIER> )*

Listing 4.18: Types in the JavaCC grammar

Now we are going to talk about the members of classes and interfaces. The only significant variation here is that the methods and the constructors are in the same rule, unlike in the JavaCC grammar [Listing 4.19].

methoddeclaration
    :   /* For constructor */
        (modifiers)* (typeparameters)? IDENTIFIER formalparameters
        ('throws' qualifiednamelist)?
        '{' (explicitconstructorinvocation)? (blockstatement)* '}'
    |   /* For methods */
        (modifiers)* (typeparameters)? (type | 'void') IDENTIFIER formalparameters
        ('[' ']')* ('throws' qualifiednamelist)?
        ( bod = block | ';' )
    ;

Listing 4.19: Method and constructor declaration in one rule

In the new grammar there is a separate rule for the methods in an interface that does not allow a body in the method declarations. The old grammar allows the body to be declared; in the new grammar this is now checked syntactically. The only restriction in interfaces is that you cannot declare the body of the methods. The annotations also have another kind of method declaration, because a default value can be fixed for the methods.

There are a lot of small details that differ between the two grammars, but the expressions are exactly the same if we skip the primary expressions. Inside the primary expressions there are some differences. The first difference concerns the literals, more exactly the floating point literals. The old grammar has only one token for this kind of number and differentiates between double and float inside the production [Listing 4.20], whereas the new grammar has two different tokens for the two kinds of constants [Listing 4.21].

Literal Literal() :
    <FLOATING_POINT_LITERAL>
    {
        if (token.image.endsWith("f") || token.image.endsWith("F")) {
            result = factory.createFloatLiteral(token.image);
        } else {
            result = factory.createDoubleLiteral(token.image);
        }
        setPrefixInfo(result);
        ...
    }

Listing 4.20: Floating point literals in the JavaCC grammar

literal
    :   FLOATLITERAL  { result = factory.createFloatLiteral($FLOATLITERAL.getText()); }
    |   DOUBLELITERAL { result = factory.createDoubleLiteral($DOUBLELITERAL.getText()); }
    ;

Listing 4.21: Floating point literals in the ANTLR grammar

There is one big difference in the allocation expressions: they are organized in a totally different order. In the old grammar [Listing 4.22] there are two rules, but there is no separation between the allocation of arrays and the allocation of normal classes.

TypeOperator AllocationExpression() :
    (   "new" PrimitiveType() ArrayDimsAndInits()
    |   "new" TypedName() [NonWildcardTypeArguments()]
        ( Arguments() [ClassBody()]
        | ArrayDimsAndInits()
        )
    )

ArrayDimsAndInits() :
    (   ("[" Expression() "]")+ ( "[" "]" )*
    |   ( "[" "]" )+ ArrayInitializer()
    )

Listing 4.22: Allocation expression in the old grammar

In the new grammar we can see that the normal allocations and the array allocations are separated very clearly. The grammar is more ordered and easier for RECODER to use.

creator
    :   'new' classorinterfacetype (nonwildcardtypearguments)? arguments (classbody)?
    |   arraycreator
    ;

arraycreator
    :   'new' createdname ('[' ']')+ arrayinitializer
    |   'new' createdname ('[' expression ']')+ ('[' ']')*
    ;

Listing 4.23: Allocation expressions in the new grammar, called creator

4.2 Conclusion

In this section we have shown the changes that we introduced in the original ANTLR grammar to adapt it to the RECODER grammar specification. We have also compared the RECODER grammar with the new ANTLR grammar, discussing what has been taken out of the grammar and what has come into the new implementation.

5 Evaluation

In this section of the thesis we evaluate all the work done. In the first subsection we talk about the compatibility with the old version and some failures that have been encountered. In the second subsection we talk about the maintainability of the program, although it is difficult to measure.

5.1 Compatibility

RECODER incorporates a test suite for checking that everything is working properly and there are no failures. The suite contains 88 test cases that our implementation should pass. Unfortunately, our implementation does not pass all the tests, but on the other hand we know where the problems are, and they are actually all related to one single issue. RECODER keeps the position of the tokens by saving their start position, their end position and their position relative to the last token before the current one. This is not properly supported in the new ANTLR specification yet. This position handling is the cause of all the errors that we find in the test suite.

The first error appears in the ParserPrinterTest. This test parses a program, then unparses it and saves the code in a string. After that the code is parsed again and compared with the saved string. The result should be the same, but it is not, because the position of the tokens is not set correctly. The structure of the program is maintained, but it is not exactly the same because some white spaces are added; the test fails even though the code is the same.

The second test that fails is called TestAnalysisReport, and it fails for the same reason as before. The relative positions of the tokens are set incorrectly, and when the code is unparsed a lot of white spaces are inserted in the middle. This test compares whether the sizes of two buffers are equal, which is not the case because of the white spaces that are inserted between the tokens.

The third failure appears because ModelRebuildTest extends the TestAnalysisReport class. Because of that, the same failure appears again when the test calls its parent class and runs the test, which fails again for the same reason.

The comments in RECODER are attached to the element in the AST closest to the comment. That means that the comments are not elements by themselves but attributes associated with the elements in the AST. This causes some errors related to the comments: since the relative position is set incorrectly, the comments end up in the wrong position as well. The fourth failure happens with single line comments, for the reasons explained before, in the test TestSingleLineCommentBug. The last test that fails, TestCommentAttachment, shows the same error again for the same reason: the position is set incorrectly and the comments are inserted in the wrong position.
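To make the failure mode concrete, the following fragment sketches how such a relative position can be derived from two consecutive tokens, in the spirit of the shiftToken code shown in Listing 3.6. The Token fields used here follow JavaCC's Token class; the sketch is illustrative and is not RECODER's actual implementation.

// Relative position of 'token' with respect to the previous token 'prev':
// the number of line feeds between them and the column offset on the new line.
int lf  = token.beginLine - prev.endLine;   // lines skipped since the previous token
int col = token.beginColumn - 1;            // column, counted from 0
if (lf <= 0) {
    // same line: the column is relative to the end of the previous token
    col -= prev.endColumn;
    if (col < 0) {
        col = 0;
    }
}
position.setPosition(lf, col);              // stored as the element's relative position

If lf or col is computed wrongly, the unparser re-inserts a different amount of white space between the tokens, which is exactly what the failing tests observe.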


More information

Full file at

Full file at Java Programming: From Problem Analysis to Program Design, 3 rd Edition 2-1 Chapter 2 Basic Elements of Java At a Glance Instructor s Manual Table of Contents Overview Objectives s Quick Quizzes Class

More information

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

More information

Syntax Errors; Static Semantics

Syntax Errors; Static Semantics Dealing with Syntax Errors Syntax Errors; Static Semantics Lecture 14 (from notes by R. Bodik) One purpose of the parser is to filter out errors that show up in parsing Later stages should not have to

More information

A Short Summary of Javali

A Short Summary of Javali A Short Summary of Javali October 15, 2015 1 Introduction Javali is a simple language based on ideas found in languages like C++ or Java. Its purpose is to serve as the source language for a simple compiler

More information

Grammars and Parsing. Paul Klint. Grammars and Parsing

Grammars and Parsing. Paul Klint. Grammars and Parsing Paul Klint Grammars and Languages are one of the most established areas of Natural Language Processing and Computer Science 2 N. Chomsky, Aspects of the theory of syntax, 1965 3 A Language...... is a (possibly

More information

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F). CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language

More information

Outline. 1 Introduction. 2 Context-free Grammars and Languages. 3 Top-down Deterministic Parsing. 4 Bottom-up Deterministic Parsing

Outline. 1 Introduction. 2 Context-free Grammars and Languages. 3 Top-down Deterministic Parsing. 4 Bottom-up Deterministic Parsing Parsing 1 / 90 Outline 1 Introduction 2 Context-free Grammars and Languages 3 Top-down Deterministic Parsing 4 Bottom-up Deterministic Parsing 5 Parser Generation Using JavaCC 2 / 90 Introduction Once

More information

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview COMP 181 Compilers Lecture 2 Overview September 7, 2006 Administrative Book? Hopefully: Compilers by Aho, Lam, Sethi, Ullman Mailing list Handouts? Programming assignments For next time, write a hello,

More information

CS664 Compiler Theory and Design LIU 1 of 16 ANTLR. Christopher League* 17 February Figure 1: ANTLR plugin installer

CS664 Compiler Theory and Design LIU 1 of 16 ANTLR. Christopher League* 17 February Figure 1: ANTLR plugin installer CS664 Compiler Theory and Design LIU 1 of 16 ANTLR Christopher League* 17 February 2016 ANTLR is a parser generator. There are other similar tools, such as yacc, flex, bison, etc. We ll be using ANTLR

More information

Weiss Chapter 1 terminology (parenthesized numbers are page numbers)

Weiss Chapter 1 terminology (parenthesized numbers are page numbers) Weiss Chapter 1 terminology (parenthesized numbers are page numbers) assignment operators In Java, used to alter the value of a variable. These operators include =, +=, -=, *=, and /=. (9) autoincrement

More information

Pace University. Fundamental Concepts of CS121 1

Pace University. Fundamental Concepts of CS121 1 Pace University Fundamental Concepts of CS121 1 Dr. Lixin Tao http://csis.pace.edu/~lixin Computer Science Department Pace University October 12, 2005 This document complements my tutorial Introduction

More information

Type Checking and Type Equality

Type Checking and Type Equality Type Checking and Type Equality Type systems are the biggest point of variation across programming languages. Even languages that look similar are often greatly different when it comes to their type systems.

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back

More information

Intermediate Code Generation

Intermediate Code Generation Intermediate Code Generation In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target

More information

Course Overview. Introduction (Chapter 1) Compiler Frontend: Today. Compiler Backend:

Course Overview. Introduction (Chapter 1) Compiler Frontend: Today. Compiler Backend: Course Overview Introduction (Chapter 1) Compiler Frontend: Today Lexical Analysis & Parsing (Chapter 2,3,4) Semantic Analysis (Chapter 5) Activation Records (Chapter 6) Translation to Intermediate Code

More information

Transformation of Java Card into Diet Java Card

Transformation of Java Card into Diet Java Card Semester Project Transformation of Java Card into Diet Java Card Erich Laube laubee@student.ethz.ch March 2005 Software Component Technology Group ETH Zurich Switzerland Prof. Peter Müller Supervisor:

More information

Wednesday, September 9, 15. Parsers

Wednesday, September 9, 15. Parsers Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs: What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

More information

Chapter 3. Describing Syntax and Semantics

Chapter 3. Describing Syntax and Semantics Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:

More information

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers. Part III : Parsing From Regular to Context-Free Grammars Deriving a Parser from a Context-Free Grammar Scanners and Parsers A Parser for EBNF Left-Parsable Grammars Martin Odersky, LAMP/DI 1 From Regular

More information

CPS122 Lecture: From Python to Java last revised January 4, Objectives:

CPS122 Lecture: From Python to Java last revised January 4, Objectives: Objectives: CPS122 Lecture: From Python to Java last revised January 4, 2017 1. To introduce the notion of a compiled language 2. To introduce the notions of data type and a statically typed language 3.

More information

Decaf Language Reference Manual

Decaf Language Reference Manual Decaf Language Reference Manual C. R. Ramakrishnan Department of Computer Science SUNY at Stony Brook Stony Brook, NY 11794-4400 cram@cs.stonybrook.edu February 12, 2012 Decaf is a small object oriented

More information

1. Describe History of C++? 2. What is Dev. C++? 3. Why Use Dev. C++ instead of C++ DOS IDE?

1. Describe History of C++? 2. What is Dev. C++? 3. Why Use Dev. C++ instead of C++ DOS IDE? 1. Describe History of C++? The C++ programming language has a history going back to 1979, when Bjarne Stroustrup was doing work for his Ph.D. thesis. One of the languages Stroustrup had the opportunity

More information

Software Tools ANTLR

Software Tools ANTLR 2009 Software Tools ANTLR Part II - Lecture 5 1 The University of Auckland New Zealand COMPSCI 732 Today s Outline 2009 Introduction to ANTLR Parsing Actions Generators 2 The University of Auckland New

More information

Alternatives for semantic processing

Alternatives for semantic processing Semantic Processing Copyright c 2000 by Antony L. Hosking. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies

More information

Object oriented programming. Instructor: Masoud Asghari Web page: Ch: 3

Object oriented programming. Instructor: Masoud Asghari Web page:   Ch: 3 Object oriented programming Instructor: Masoud Asghari Web page: http://www.masses.ir/lectures/oops2017sut Ch: 3 1 In this slide We follow: https://docs.oracle.com/javase/tutorial/index.html Trail: Learning

More information

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs

programming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs Chapter 2 :: Programming Language Syntax Programming Language Pragmatics Michael L. Scott Introduction programming languages need to be precise natural languages less so both form (syntax) and meaning

More information

Semantic actions for declarations and expressions

Semantic actions for declarations and expressions Semantic actions for declarations and expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate

More information

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing Görel Hedin Revised: 2017-09-04 This lecture Regular expressions Context-free grammar Attribute grammar

More information

Programming Lecture 3

Programming Lecture 3 Programming Lecture 3 Expressions (Chapter 3) Primitive types Aside: Context Free Grammars Constants, variables Identifiers Variable declarations Arithmetic expressions Operator precedence Assignment statements

More information

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done What is a compiler? What is a compiler? Traditionally: Program that analyzes and translates from a high level language (e.g., C++) to low-level assembly language that can be executed by hardware int a,

More information

Building Compilers with Phoenix

Building Compilers with Phoenix Building Compilers with Phoenix Syntax-Directed Translation Structure of a Compiler Character Stream Intermediate Representation Lexical Analyzer Machine-Independent Optimizer token stream Intermediate

More information

3. Java - Language Constructs I

3. Java - Language Constructs I Educational Objectives 3. Java - Language Constructs I Names and Identifiers, Variables, Assignments, Constants, Datatypes, Operations, Evaluation of Expressions, Type Conversions You know the basic blocks

More information

Programming Languages Third Edition

Programming Languages Third Edition Programming Languages Third Edition Chapter 12 Formal Semantics Objectives Become familiar with a sample small language for the purpose of semantic specification Understand operational semantics Understand

More information

Get JAVA. I will just tell you what I did (on January 10, 2017). I went to:

Get JAVA. I will just tell you what I did (on January 10, 2017). I went to: Get JAVA To compile programs you need the JDK (Java Development Kit). To RUN programs you need the JRE (Java Runtime Environment). This download will get BOTH of them, so that you will be able to both

More information

// the current object. functioninvocation expression. identifier (expressionlist ) // call of an inner function

// the current object. functioninvocation expression. identifier (expressionlist ) // call of an inner function SFU CMPT 379 Compilers Spring 2015 Assignment 4 Assignment due Thursday, April 9, by 11:59pm. For this assignment, you are to expand your Bunting-3 compiler from assignment 3 to handle Bunting-4. Project

More information

Module 10A Lecture - 20 What is a function? Why use functions Example: power (base, n)

Module 10A Lecture - 20 What is a function? Why use functions Example: power (base, n) Programming, Data Structures and Algorithms Prof. Shankar Balachandran Department of Computer Science and Engineering Indian Institute of Technology, Madras Module 10A Lecture - 20 What is a function?

More information

Scalify. Java -> Scala Source Translator Target: 100% of Java 1.5 Status: 90% of Java 1.4 Ninety/Ninety rule may apply

Scalify. Java -> Scala Source Translator Target: 100% of Java 1.5 Status: 90% of Java 1.4 Ninety/Ninety rule may apply Scalify Java -> Scala Source Translator Target: 100% of Java 1.5 Status: 90% of Java 1.4 Ninety/Ninety rule may apply http://github.com/paulp/scalify (BSD-like do-what-you-want license) Motivations Mixed

More information

Types, Values and Variables (Chapter 4, JLS)

Types, Values and Variables (Chapter 4, JLS) Lecture Notes CS 141 Winter 2005 Craig A. Rich Types, Values and Variables (Chapter 4, JLS) Primitive Types Values Representation boolean {false, true} 1-bit (possibly padded to 1 byte) Numeric Types Integral

More information

The following expression causes a divide by zero error:

The following expression causes a divide by zero error: Chapter 2 - Test Questions These test questions are true-false, fill in the blank, multiple choice, and free form questions that may require code. The multiple choice questions may have more than one correct

More information

Expressions and Data Types CSC 121 Fall 2015 Howard Rosenthal

Expressions and Data Types CSC 121 Fall 2015 Howard Rosenthal Expressions and Data Types CSC 121 Fall 2015 Howard Rosenthal Lesson Goals Understand the basic constructs of a Java Program Understand how to use basic identifiers Understand simple Java data types and

More information

Semantic actions for declarations and expressions

Semantic actions for declarations and expressions Semantic actions for declarations and expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate

More information

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

More information

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer. The Compiler So Far CSC 4181 Compiler Construction Scanner - Lexical analysis Detects inputs with illegal tokens e.g.: main 5 (); Parser - Syntactic analysis Detects inputs with ill-formed parse trees

More information

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill Syntax Analysis Björn B. Brandenburg The University of North Carolina at Chapel Hill Based on slides and notes by S. Olivier, A. Block, N. Fisher, F. Hernandez-Campos, and D. Stotts. The Big Picture Character

More information

LL(k) Compiler Construction. Choice points in EBNF grammar. Left recursive grammar

LL(k) Compiler Construction. Choice points in EBNF grammar. Left recursive grammar LL(k) Compiler Construction More LL parsing Abstract syntax trees Lennart Andersson Revision 2012 01 31 2012 Related names top-down the parse tree is constructed top-down recursive descent if it is implemented

More information

Semantic actions for declarations and expressions. Monday, September 28, 15

Semantic actions for declarations and expressions. Monday, September 28, 15 Semantic actions for declarations and expressions Semantic actions Semantic actions are routines called as productions (or parts of productions) are recognized Actions work together to build up intermediate

More information

1 Lexical Considerations

1 Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler

More information

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1 CSE P 501 Compilers Parsing & Context-Free Grammars Hal Perkins Spring 2018 UW CSE P 501 Spring 2018 C-1 Administrivia Project partner signup: please find a partner and fill out the signup form by noon

More information

CS 360 Programming Languages Interpreters

CS 360 Programming Languages Interpreters CS 360 Programming Languages Interpreters Implementing PLs Most of the course is learning fundamental concepts for using and understanding PLs. Syntax vs. semantics vs. idioms. Powerful constructs like

More information

6.001 Notes: Section 8.1

6.001 Notes: Section 8.1 6.001 Notes: Section 8.1 Slide 8.1.1 In this lecture we are going to introduce a new data type, specifically to deal with symbols. This may sound a bit odd, but if you step back, you may realize that everything

More information

LL(k) Compiler Construction. Top-down Parsing. LL(1) parsing engine. LL engine ID, $ S 0 E 1 T 2 3

LL(k) Compiler Construction. Top-down Parsing. LL(1) parsing engine. LL engine ID, $ S 0 E 1 T 2 3 LL(k) Compiler Construction More LL parsing Abstract syntax trees Lennart Andersson Revision 2011 01 31 2010 Related names top-down the parse tree is constructed top-down recursive descent if it is implemented

More information

Java Bytecode (binary file)

Java Bytecode (binary file) Java is Compiled Unlike Python, which is an interpreted langauge, Java code is compiled. In Java, a compiler reads in a Java source file (the code that we write), and it translates that code into bytecode.

More information

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming The Big Picture Again Syntactic Analysis source code Scanner Parser Opt1 Opt2... Optn Instruction Selection Register Allocation Instruction Scheduling machine code ICS312 Machine-Level and Systems Programming

More information

Programming Languages Third Edition. Chapter 9 Control I Expressions and Statements

Programming Languages Third Edition. Chapter 9 Control I Expressions and Statements Programming Languages Third Edition Chapter 9 Control I Expressions and Statements Objectives Understand expressions Understand conditional statements and guards Understand loops and variation on WHILE

More information

The SPL Programming Language Reference Manual

The SPL Programming Language Reference Manual The SPL Programming Language Reference Manual Leonidas Fegaras University of Texas at Arlington Arlington, TX 76019 fegaras@cse.uta.edu February 27, 2018 1 Introduction The SPL language is a Small Programming

More information

JavaCC: SimpleExamples

JavaCC: SimpleExamples JavaCC: SimpleExamples This directory contains five examples to get you started using JavaCC. Each example is contained in a single grammar file and is listed below: (1) Simple1.jj, (2) Simple2.jj, (3)

More information

Parsing Combinators: Introduction & Tutorial

Parsing Combinators: Introduction & Tutorial Parsing Combinators: Introduction & Tutorial Mayer Goldberg October 21, 2017 Contents 1 Synopsis 1 2 Backus-Naur Form (BNF) 2 3 Parsing Combinators 3 4 Simple constructors 4 5 The parser stack 6 6 Recursive

More information

COP 3402 Systems Software Top Down Parsing (Recursive Descent)

COP 3402 Systems Software Top Down Parsing (Recursive Descent) COP 3402 Systems Software Top Down Parsing (Recursive Descent) Top Down Parsing 1 Outline 1. Top down parsing and LL(k) parsing 2. Recursive descent parsing 3. Example of recursive descent parsing of arithmetic

More information

Properties of Regular Expressions and Finite Automata

Properties of Regular Expressions and Finite Automata Properties of Regular Expressions and Finite Automata Some token patterns can t be defined as regular expressions or finite automata. Consider the set of balanced brackets of the form [[[ ]]]. This set

More information

PL Revision overview

PL Revision overview PL Revision overview Course topics Parsing G = (S, P, NT, T); (E)BNF; recursive descent predictive parser (RDPP) Lexical analysis; Syntax and semantic errors; type checking Programming language structure

More information

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking Agenda COMP 181 Type checking October 21, 2009 Next week OOPSLA: Object-oriented Programming Systems Languages and Applications One of the top PL conferences Monday (Oct 26 th ) In-class midterm Review

More information

Syntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc.

Syntax. Syntax. We will study three levels of syntax Lexical Defines the rules for tokens: literals, identifiers, etc. Syntax Syntax Syntax defines what is grammatically valid in a programming language Set of grammatical rules E.g. in English, a sentence cannot begin with a period Must be formal and exact or there will

More information

Lexical Considerations

Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2010 Handout Decaf Language Tuesday, Feb 2 The project for the course is to write a compiler

More information

Context-free grammars (CFG s)

Context-free grammars (CFG s) Syntax Analysis/Parsing Purpose: determine if tokens have the right form for the language (right syntactic structure) stream of tokens abstract syntax tree (AST) AST: captures hierarchical structure of

More information

Frequently Asked Questions

Frequently Asked Questions Frequently Asked Questions This PowerTools FAQ answers many frequently asked questions regarding the functionality of the various parts of the PowerTools suite. The questions are organized in the following

More information

3.5 Practical Issues PRACTICAL ISSUES Error Recovery

3.5 Practical Issues PRACTICAL ISSUES Error Recovery 3.5 Practical Issues 141 3.5 PRACTICAL ISSUES Even with automatic parser generators, the compiler writer must manage several issues to produce a robust, efficient parser for a real programming language.

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

In Our Last Exciting Episode

In Our Last Exciting Episode In Our Last Exciting Episode #1 Lessons From Model Checking To find bugs, we need specifications What are some good specifications? To convert a program into a model, we need predicates/invariants and

More information

CPS122 Lecture: From Python to Java

CPS122 Lecture: From Python to Java Objectives: CPS122 Lecture: From Python to Java last revised January 7, 2013 1. To introduce the notion of a compiled language 2. To introduce the notions of data type and a statically typed language 3.

More information

A language is a subset of the set of all strings over some alphabet. string: a sequence of symbols alphabet: a set of symbols

A language is a subset of the set of all strings over some alphabet. string: a sequence of symbols alphabet: a set of symbols The current topic:! Introduction! Object-oriented programming: Python! Functional programming: Scheme! Python GUI programming (Tkinter)! Types and values! Logic programming: Prolog! Introduction! Rules,

More information

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

COP4020 Programming Languages. Syntax Prof. Robert van Engelen COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and context-free grammars Grammar derivations More about parse trees Top-down and bottom-up

More information

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table COMPILER CONSTRUCTION Lab 2 Symbol table LABS Lab 3 LR parsing and abstract syntax tree construction using ''bison' Lab 4 Semantic analysis (type checking) PHASES OF A COMPILER Source Program Lab 2 Symtab

More information

CMPSCI 187 / Spring 2015 Postfix Expression Evaluator

CMPSCI 187 / Spring 2015 Postfix Expression Evaluator CMPSCI 187 / Spring 2015 Postfix Expression Evaluator Due on Thursday, 05 March, 8:30 a.m. Marc Liberatore and John Ridgway Morrill I N375 Section 01 @ 10:00 Section 02 @ 08:30 1 CMPSCI 187 / Spring 2015

More information

6.001 Notes: Section 15.1

6.001 Notes: Section 15.1 6.001 Notes: Section 15.1 Slide 15.1.1 Our goal over the next few lectures is to build an interpreter, which in a very basic sense is the ultimate in programming, since doing so will allow us to define

More information

CONTENTS: Array Usage Multi-Dimensional Arrays Reference Types. COMP-202 Unit 6: Arrays

CONTENTS: Array Usage Multi-Dimensional Arrays Reference Types. COMP-202 Unit 6: Arrays CONTENTS: Array Usage Multi-Dimensional Arrays Reference Types COMP-202 Unit 6: Arrays Introduction (1) Suppose you want to write a program that asks the user to enter the numeric final grades of 350 COMP-202

More information