Semantics driven disambiguation: A comparison of different approaches


Clément Vasseur <clement.vasseur@lrde.epita.fr>

Technical Report n° 0416, revision 656

Context-free grammars allow the specification of ambiguous languages. Therefore, generalized parsers yield a set of parse trees, one for each possible interpretation of the input text. This parse forest must be post-processed in order to select the one parse tree that corresponds to the input, taking the semantic rules into consideration. We call this step the disambiguation process. In this report we describe three different methods for achieving this semantics driven disambiguation: term rewriting guided by algebraic specification (ASF+SDF), term rewriting using user-defined strategies (Stratego/XT) and attribute grammars. We also discuss the strengths and weaknesses of each one.

Keywords: disambiguation, program transformation, context-free grammars, term rewriting, attribute grammars, rewriting strategies

Laboratoire de Recherche et Développement de l'Epita, 14-16, rue Voltaire, F Le Kremlin-Bicêtre cedex, France. lrde@lrde.epita.fr

Copying this document

Copyright © 2004 LRDE. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with the Invariant Sections being just "Copying this document", no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is provided in the file COPYING.DOC.

Contents

1 Parsing and ambiguities
    Context-free grammars
    Generalized parsing
    Parse forests
    Example
        Main grammar
        Extension
2 Term rewriting using the Algebraic Specification Formalism (ASF)
    Overview
    The ASF+SDF Meta-environment
    Application
        SDF grammar
        ASF equations
        Language extension
    Discussion
        Strengths
        Weaknesses
3 Term rewriting using the paradigm of strategies (Stratego)
    Overview
    The Stratego/XT bundle
    Application
        Concrete syntax
        core-disamb
        extension-disamb
    Discussion
        Strengths
        Weaknesses
4 Attribute grammars
    Overview
    Attribute grammars in Stratego/XT
    Application
        Declaration
        Use
        Extension
    Discussion
        Strengths
        Weaknesses
Bibliography

Introduction

Context-free grammars and generalized parsing are becoming more and more popular, since they provide the power to tackle real-world parsing challenges using a simple and natural formalism. Thus, environments for language development and program transformation, such as ASF+SDF and Stratego/XT, can be seen as solid tools to undertake the processing of any language.

In order to parse an ambiguous input, generalized parsers produce a parse forest, which encodes all the possible derivations for the input text. Because syntactic considerations are not sufficient, the semantic rules of the language must be taken into consideration to select the one parse tree that corresponds to the input. This disambiguation process takes place after parsing.

In this report we describe three different methods for achieving this semantics driven disambiguation: term rewriting guided by algebraic specification (ASF+SDF), term rewriting using user-defined strategies (Stratego/XT) and attribute grammars. We also discuss the strengths and weaknesses of each one.

Since we do not explain each of these languages (SDF, ASF, Stratego) in detail, some basic knowledge is required. We only focus on the application of these environments to the problem of semantics driven disambiguation. Pointers to reference papers are given when necessary.

Chapter 1 Parsing and ambiguities

In this chapter we show the effects of using context-free grammars and generalized parsing in order to turn a possibly ambiguous input text into a parse forest.

1.1 Context-free grammars

Context-free grammars exhibit some properties that are beneficial to language development:

- They are closed under union. This means that context-free grammars can be modular. This is not the case for subsets of context-free grammars such as LALR(1), which introduce conflicts when two grammars are chained together.
- They do not impose restrictions on the grammar, so it does not need to be refactored in order to be used with a parser.
- Ambiguous languages can be specified with a context-free grammar.

1.2 Generalized parsing

Generalized parsing refers to a parsing method that yields all the possible representations for an ambiguous input. A popular approach to achieve this is GLR (Generalized LR, Tomita (1985)). Given a context-free grammar and a possibly ambiguous input text, a GLR parser is able to generate all the trees that correspond to the input. This result is called a parse forest.

In this report, we use the SGLR (Scannerless Generalized LR, Visser (1997b)) generic parser. A parse table is generated from the SDF (Syntax Definition Formalism, Visser (1997a)) context-free grammar. This table is given to the SGLR parser, along with the input text. The parser then returns a parse forest.

1.3 Parse forests

In order to be reasonably space efficient, a parse forest is generally encoded in a single tree using ambiguity nodes. The children of such a node represent each possible derivation from this node. Moreover, maximal subtree sharing is used in order to avoid a massive duplication of nodes in memory. These two techniques permit a compact representation of parse forests.
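To make the encoding concrete, here is a minimal Python sketch of a parse forest with ambiguity nodes and shared subtrees. It is only an illustration: the class names `Appl` and `Amb`, and the helper functions, are our own assumptions and do not reproduce the ATerm or AsFix data structures used by SGLR.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Appl:
    """A production applied to its children (an ordinary tree node)."""
    production: str
    children: Tuple["Node", ...] = ()


@dataclass(frozen=True)
class Amb:
    """An ambiguity node: each alternative is one possible derivation."""
    alternatives: Tuple["Node", ...]


# A node is a production application, an ambiguity node, or a token.
Node = Union[Appl, Amb, str]


def count_trees(node: Node) -> int:
    """Count the parse trees encoded by the forest rooted at `node`."""
    if isinstance(node, str):
        return 1
    if isinstance(node, Amb):
        return sum(count_trees(alt) for alt in node.alternatives)
    total = 1
    for child in node.children:
        total *= count_trees(child)
    return total


def ambiguous_use(name: str) -> Amb:
    """A use of `name` that may be a type name or an enum name."""
    return Amb((Appl("TypeName", (name,)), Appl("EnumName", (name,))))


# The two uses of "a" share one subtree object instead of duplicating it.
use_a = ambiguous_use("a")
forest = Appl("TranslationUnit", (use_a, use_a, ambiguous_use("b")))

print(count_trees(forest))  # 2 * 2 * 2 = 8 trees in one compact structure
```

Three ambiguous uses already encode eight trees, while the forest itself only holds two distinct ambiguity nodes. This is the kind of saving that ambiguity nodes and sharing are meant to provide.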

The SGLR parser uses the ATerm format to encode the parse forest. This format features maximal term sharing, a compact binary representation, and term annotations.

1.4 Example

In this report we consider an example grammar written in SDF. It is split in two distinct parts: the core grammar and an extension. This example exhibits one of the ambiguities introduced by the standard ISO C grammar (ISO/IEC, 1999), and replicated in the standard ISO C++ grammar (ISO/IEC, 2003).

Main grammar

The core grammar (Fig. 1.1) lets us declare types and enums, using the type and enum keywords. Then we can use the identifiers declared this way. The problem comes from the fact that the grammar, being context-free, cannot tell a type from an enum when an identifier is used.

    module Core
    imports Identifiers
    exports
      context-free syntax
        "type" Identifier     -> Decl
        "enum" Identifier     -> Decl
        Identifier            -> TypeName
        Identifier            -> EnumName
        TypeName | EnumName   -> Use
        (Decl | Use)+         -> TranslationUnit

Figure 1.1: The main grammar for our example

This grammar snippet does not specify the syntax for the identifiers. We take advantage of the modularity of SDF in order to show only the relevant portions of the whole grammar. Therefore, the grammar just "imports Identifiers", which defines the Identifier rule as well as the low-level lexical syntax. Refer to Visser (1997a) for an explanation of the syntax of SDF, which is beyond the scope of this report.

Extension

The extension allows us to declare classes along with types and enums, using the class keyword. The grammar in Fig. 1.2 imports the core grammar, and extends it with the few rules needed to implement this feature.

    module Extension
    imports Core
    exports
      context-free syntax
        "class" Identifier -> Decl
        Identifier         -> ClassName
        ClassName          -> Use

Figure 1.2: The grammar for the extension
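The ambiguity caused by these grammars can be illustrated with a small Python sketch. It is a hand-written stand-in for the toy language, not a GLR parser: the function `parse`, the tuple encoding of nodes and the whitespace tokenization are assumptions made for the example only.

```python
CORE_KEYWORDS = ("type", "enum")
EXTENSION_KEYWORDS = CORE_KEYWORDS + ("class",)

# Sort that a use of an identifier may have, per declaring keyword.
SORT_OF_KEYWORD = {"type": "TypeName", "enum": "EnumName", "class": "ClassName"}


def parse(text, keywords=CORE_KEYWORDS):
    """Split `text` into Decl nodes and ambiguous Use nodes."""
    tokens = text.split()
    units = []
    position = 0
    while position < len(tokens):
        token = tokens[position]
        if token in keywords:
            if position + 1 == len(tokens):
                raise SyntaxError("identifier expected after " + token)
            units.append(("Decl", token, tokens[position + 1]))
            position += 2
        else:
            # Syntax alone cannot choose: keep every possible derivation.
            alternatives = [(SORT_OF_KEYWORD[k], token) for k in keywords]
            units.append(("amb", alternatives))
            position += 1
    return units


core_forest = parse("type a enum b a b")
extended_forest = parse("class c c", EXTENSION_KEYWORDS)
print(core_forest[2])      # ('amb', [('TypeName', 'a'), ('EnumName', 'a')])
print(extended_forest[1])  # ('amb', [('TypeName', 'c'), ('EnumName', 'c'), ('ClassName', 'c')])
```

Every use of an identifier comes out as an ambiguity node, and importing the extension simply adds a third alternative to each of them. Nothing at this stage knows which alternative is the right one.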

Chapter 2 Term rewriting using the Algebraic Specification Formalism (ASF)

This chapter introduces the Algebraic Specification Formalism (ASF) in the context of semantics driven disambiguation.

2.1 Overview

The Algebraic Specification Formalism (van den Brand et al., 2003b) supports conditional rewrite rules and traversal functions using a user-defined (concrete) syntax. Because it allows the concise specification of program transformations, it is suitable for semantics driven disambiguation, as described in van den Brand et al. (2003a).

2.2 The ASF+SDF Meta-environment

The ASF+SDF Meta-Environment is an interactive development environment for the automatic generation of interactive systems for manipulating programs, specifications, or other texts written in a formal language. The Meta-Environment consists of the following major components:

- ATerm: an efficient term library.
- ToolBus: a component architecture.
- SDF2: a modular syntax definition formalism, supported by SGLR.
- ASF: a term rewriting language with an interpreter and a compiler.
- MetaStudio: a user interface that activates the above components, extended with a module browser and structure editors.
- TIDE: a generic interactive debugging environment.

2.3 Application

In order to implement the disambiguation using the ASF+SDF Meta-Environment, we must extend the original SDF grammar with several rules so as to be able to write the ASF equations with a concrete syntax.

    context-free syntax
      disamb(TranslationUnit, Env) -> <TranslationUnit, Env> {traversal(trafo, accu, top-down, continue)}
      disamb(Decl, Env)            -> <Decl, Env>            {traversal(trafo, accu, top-down, continue)}
      disamb(Use, Env)             -> <Use, Env>             {traversal(trafo, accu, top-down, continue)}
      Identifier ":" Kind -> Pair
      "[[" Pair* "]]"     -> Env
      "en"                -> Kind
      "ty"                -> Kind

    variables
      "Id"            -> Identifier
      "Ty"            -> TypeName
      "En"            -> EnumName
      "U*"[0-9\']*    -> {Use ","}*
      "P*"[0-9\']*    -> Pair*
      "env"           -> Env

Figure 2.1: SDF grammar

SDF grammar

As seen in Fig. 2.1, we define a disamb traversal, which is both a transformer (the tree is modified during the traversal, using rewrite rules) and an accumulator (information is gathered during the traversal). Moreover, the type of the traversal is given (top-down in this case; ASF only supports the most basic ones), and continue means that the traversal does not stop when a node is rewritten. Note that disamb needs to be declared for each node that we intend to traverse: TranslationUnit, Decl and Use.

In order to keep track of what has been declared before, we use an environment table. This environment is a list of Identifier:Kind pairs, which is filled when a declaration is found, and consulted when an ambiguity needs to be resolved. Two kinds are available: en for enums and ty for types.

    equations

    [env-ty]    disamb(type Id, [[ P* ]]) = <type Id, [[ P* Id : ty ]]>

    [env-en]    disamb(enum Id, [[ P* ]]) = <enum Id, [[ P* Id : en ]]>

    [disamb-ty] Id := Ty, [[ P*1 Id : ty P*2 ]] := env
                ==========================================
                disamb(amb(U*1, Ty, U*2), env) = <Ty, env>

    [disamb-en] Id := En, [[ P*1 Id : en P*2 ]] := env
                ==========================================
                disamb(amb(U*1, En, U*2), env) = <En, env>

Figure 2.2: ASF equations

ASF equations

The disambiguation module consists of a list of equations (Fig. 2.2). The names between brackets are the names of the equations.

The first two equations are used to declare an identifier. When disamb is applied to a "type Id" node (notice the concrete syntax) with a given environment, the Id is added to the environment, associated with the ty kind. The env-en equation is similar.

The other equations perform the actual disambiguation. They are conditional rewrite rules, with the rewrite rule under the bar and the conditions above it. Basically, disamb-ty rewrites an ambiguous node (amb(U*1, Ty, U*2)) to one of its children (Ty) if Id is registered with kind ty in the environment.

Therefore, the application of disamb in a top-down traversal exhibits two behaviors: an accumulator on declarations, adding entries to the environment table, and a transformer on ambiguous nodes, rewriting the nodes to remove the ambiguity.

Language extension

In order to handle the disambiguation of the extension, we add a cl kind, which represents a class. See Fig. 2.3. The ASF equations for the extension (Fig. 2.4) are entirely similar to the ones already explained, using the newly introduced cl kind.

2.4 Discussion

Strengths

Using equations to describe term rewriting leads to a clean formalism. Simple traversals have been added to this formalism, in order to concisely specify a transformation system. The Meta-Environment allows interactive development and testing of such transformation systems.

    module Extension
    imports Core
    exports
      context-free syntax
        "class" Identifier -> Decl
        Identifier         -> ClassName
        ClassName          -> Use
        "cl"               -> Kind
    hiddens
      variables
        "Cl" -> ClassName

Figure 2.3: SDF grammar for the extension

    equations

    [env-cl]    disamb(class Id, [[ P* ]]) = <class Id, [[ P* Id : cl ]]>

    [disamb-cl] Id := Cl, [[ P*1 Id : cl P*2 ]] := env
                ==========================================
                disamb(amb(U*1, Cl, U*2), env) = <Cl, env>

Figure 2.4: ASF equations for the extension
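The combined behavior of these equations (an accumulator on declarations, a transformer on ambiguity nodes) can be mimicked outside ASF+SDF. The following Python sketch is an approximation under our own assumptions: nodes are plain tuples, the translation unit is a flat list, and when several alternatives are registered the first one in the ambiguity node wins, which the equations themselves do not specify.

```python
KIND_OF_KEYWORD = {"type": "ty", "enum": "en", "class": "cl"}
KIND_OF_SORT = {"TypeName": "ty", "EnumName": "en", "ClassName": "cl"}


def disamb(units):
    """Top-down pass that accumulates an environment and rewrites amb nodes."""
    env = []      # plays the role of [[ Pair* ]]: (identifier, kind) pairs
    result = []
    for node in units:
        if node[0] == "Decl":
            # [env-ty], [env-en], [env-cl]: record the declared identifier.
            _, keyword, name = node
            env.append((name, KIND_OF_KEYWORD[keyword]))
        elif node[0] == "amb":
            # [disamb-ty], [disamb-en], [disamb-cl]: keep the alternative
            # whose kind is registered for this identifier, if any.
            for sort, name in node[1]:
                if (name, KIND_OF_SORT[sort]) in env:
                    node = (sort, name)
                    break
        result.append(node)
    return result, env


forest = [
    ("Decl", "type", "a"),
    ("Decl", "enum", "b"),
    ("amb", [("TypeName", "a"), ("EnumName", "a")]),
    ("amb", [("TypeName", "b"), ("EnumName", "b")]),
    ("amb", [("TypeName", "c"), ("EnumName", "c")]),  # never declared
]
tree, env = disamb(forest)
print(tree[2], tree[3])  # ('TypeName', 'a') ('EnumName', 'b')
```

The use of the undeclared identifier c is left ambiguous, just as no equation would match it in the ASF module.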

Weaknesses

Complicated traversals are not trivial in ASF. A lot of equations have to be written, since the traversal must be explicitly described.

Understanding a whole disambiguation module requires a detailed reading of the equations, both in the module itself and in the imported ones. Since every rewrite rule is potentially applicable to any node, there can be hidden interactions between equations, leading to cycles or unwanted effects. These are not too hard to debug, thanks to the execution trace (provided there are not too many equations), but they are definitely not apparent at first sight.

Chapter 3 Term rewriting using the paradigm of strategies (Stratego)

This chapter deals with semantic disambiguation using the Stratego language, which implements term rewriting using the paradigm of strategies.

3.1 Overview

The Stratego language focuses on term rewriting using user-defined strategies. It features conditional rewrite rules, generic traversals and scoped dynamic rules, and supports term specification in concrete syntax. Refer to Visser (2004) for a description of the language.

3.2 The Stratego/XT bundle

The Stratego/XT bundle (de Jonge et al., 2001) gathers several tools that can be used together in order to build dedicated program transformation tools: the Stratego compiler and its standard library, a tool that turns SDF grammars into parse tables, the SGLR generic parser, and GPP, the Generic Pretty Printer (de Jonge, 2000).

3.3 Application

In order to disambiguate our parse forest, we build a filter written in Stratego. This filter rewrites the parse forest into a parse tree, resolving the ambiguities using specified traversals and transformations. Moreover, dynamic rules are used to keep the necessary knowledge about the declared symbols.

Concrete syntax

First, the main grammar must be extended and merged with the Stratego grammar, so as to allow the specification of terms using concrete syntax. This means that code snippets can be written directly in the object language instead of as parse tree nodes, which can be very confusing.

    module StrategoCore
    imports StrategoRenamed Core
    exports
      context-free syntax
        "|d[" Decl "]|"     -> StrategoTerm {cons("toterm"), prefer}
        "|u[" Use "]|"      -> StrategoTerm {cons("toterm"), prefer}
        "|t[" TypeName "]|" -> StrategoTerm {cons("toterm"), prefer}
        "|e[" EnumName "]|" -> StrategoTerm {cons("toterm"), prefer}
      variables
        "i" -> Identifier
        "t" -> TypeName
        "e" -> EnumName

Figure 3.1: Grammar definition for concrete syntax

The grammar in Fig. 3.1 defines the |x[ ... ]| syntax, with x being d, u, t or e, corresponding to a Decl, Use, TypeName or EnumName code snippet. Therefore, the following term:

    |d[ type i ]|

can be used instead of the equivalent AsFix term:

    appl(prod([lit("type"),cf(opt(layout())),cf(sort("identifier"))],
              cf(sort("decl")), attrs([ term(cons("typedecl")) ])),
      [ appl(prod([char-class([ 116 ]),char-class([ 121 ]),char-class([ 112 ]),char-class([ 101 ])],
                  lit("type"), no-attrs()),
          [ 116, 121, 112, 101 ]),
        appl(prod([cf(layout())], cf(opt(layout())), no-attrs()),
          [ appl(prod([lex(layout())],cf(layout()), no-attrs()),
              [ appl(prod([lex(iter(char-class([ range(9, 10), 32 ])))], lex(layout()), no-attrs()),
                  [ appl(prod([char-class([ range(9, 10), 32 ])],
                              lex(iter(char-class([ range(9, 10), 32 ]))), no-attrs()),
                      [ 32 ]) ]) ]) ]),
        i ])

Needless to say, the first version increases the readability and maintainability of the Stratego filter. Moreover, a few variables are defined, so that they can be used in the concrete syntax. Refer to Visser (2002) for more information about concrete syntax in Stratego.

core-disamb

Stratego is modular, just like SDF. The filter in Fig. 3.2 imports lib (the standard library) and AsFix, because the concrete syntax will be expanded into ATerm nodes.

    module core-disamb
    imports lib AsFix
    strategies

      decl = ?|d[ type i ]|
           ; rules(use:+ amb(as) -> t where !|t[ i ]| => t; <getfirst(?|u[ t ]|)> as)

      decl = ?|d[ enum i ]|
           ; rules(use:+ amb(as) -> e where !|e[ i ]| => e; <getfirst(?|u[ e ]|)> as)

      core-disamb = io-wrap(alltd(decl <+ use))

Figure 3.2: The Stratego filter for disambiguating the core example

This Stratego module consists of a set of strategies. The main strategy, core-disamb, is the combination of other strategies:

- io-wrap is a wrapper that handles the input and output of the term. This is the strategy that gives our binary the ability to read from standard input or from a file, write to standard output or to a file, and accept options to change the behavior of the filter.
- alltd basically performs a top-down traversal of the current term.
- decl is the strategy we use to declare the identifiers. In fact, decl creates dynamic rules which are named use.

The two decl strategies are similar, and they are combined using non-deterministic choice in order to be called by the main strategy. This means that they are tried until one of them succeeds. The first part (?|d[ type i ]|) makes use of the concrete syntax in order to match a type declaration. If it fails, the strategy bails out. Otherwise, a dynamic rule is created, which rewrites an ambiguous node (amb(as)) to a TypeName if one of the children of the ambiguous node is a type identifier with the same name.

Thus, we manage to keep a context without using a symbol table. When a type declaration is found, it is known that future ambiguities about an identifier with the same name need to be resolved to a type name. Therefore, a dynamic rule is spawned at this moment.

extension-disamb

Fig. 3.3 shows the Stratego code for disambiguating a parse forest generated using the extension grammar. This filter imports core-disamb, thus inheriting the disambiguation mechanism for the base grammar. The main strategy is the same as the one for core-disamb.

In order to disambiguate classes, a decl strategy needs to be defined, which recognizes a class declaration and creates a corresponding use dynamic rule. This strategy is similar to the other decl strategies used previously. Therefore the extension filter is quite simple.
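The idea of spawning a rewrite rule at declaration time, instead of maintaining a symbol table, can be sketched in Python with closures. This is a loose analogy under our own assumptions, not Stratego semantics: failure is modelled by returning None, nodes are plain tuples, `decl_or_use` stands for the left choice `decl <+ use`, and `alltd` is a simplified version that only knows how to descend into a TranslationUnit node.

```python
SORT_OF_KEYWORD = {"type": "TypeName", "enum": "EnumName", "class": "ClassName"}


def make_filter():
    """Build a disambiguation filter with its own set of dynamic rules."""
    use_rules = []  # the dynamic rules named `use`, created during traversal

    def decl(node):
        """Match a declaration and spawn a `use` rule; None means failure."""
        if node[0] != "Decl":
            return None
        wanted = (SORT_OF_KEYWORD[node[1]], node[2])

        def rule(candidate):
            # amb(as) -> t, provided that t occurs among the alternatives
            if candidate[0] == "amb" and wanted in candidate[1]:
                return wanted
            return None

        use_rules.append(rule)  # loosely mirrors `rules(use:+ ...)`
        return node

    def use(node):
        """Apply the first dynamic rule that succeeds on `node`."""
        for rule in use_rules:
            rewritten = rule(node)
            if rewritten is not None:
                return rewritten
        return None

    def decl_or_use(node):
        """Left choice: try decl, and fall back on use if it fails."""
        result = decl(node)
        return result if result is not None else use(node)

    def alltd(strategy, node):
        """Apply `strategy` top-down, descending only where it fails."""
        result = strategy(node)
        if result is not None:
            return result
        if node[0] == "TranslationUnit":
            return ("TranslationUnit", [alltd(strategy, c) for c in node[1]])
        return node

    return lambda forest: alltd(decl_or_use, forest)


forest = ("TranslationUnit", [
    ("Decl", "type", "a"),
    ("Decl", "class", "k"),
    ("amb", [("TypeName", "a"), ("EnumName", "a"), ("ClassName", "a")]),
    ("amb", [("TypeName", "k"), ("EnumName", "k"), ("ClassName", "k")]),
])
tree = make_filter()(forest)
print(tree[1][2], tree[1][3])  # ('TypeName', 'a') ('ClassName', 'k')
```

Supporting the class keyword only required one more entry in the table that `decl` consults, which echoes how the extension filter adds a single decl definition.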

    module extension-disamb
    imports lib AsFix core-disamb
    strategies

      decl = ?|d[ class i ]|
           ; rules(use:+ amb(as) -> c where !|c[ i ]| => c; <getfirst(?|u[ c ]|)> as)

      extension-disamb = io-wrap(alltd(decl <+ use))

Figure 3.3: The Stratego filter for disambiguating the extension

3.4 Discussion

Strengths

Stratego code is very expressive. Complex traversals can be expressed very easily using the various combination operators and user-defined strategies.

Additional flexibility comes with the concept of scoped dynamic rules, which make it possible to keep a global context without explicitly carrying it everywhere.

Weaknesses

Stratego modules are not directly linked to the grammar they operate on. If the grammar is extended, care must be taken to make sure the Stratego code can handle it. Thus, code refactoring is sometimes necessary in order to extend the Stratego module. However, as we have seen in our example, when the code is well written, extension can be painless.

Chapter 4 Attribute grammars

This chapter explains how we tackle the semantic disambiguation problem using a prototype attribute grammar engine built with the Stratego/XT tool chain.

4.1 Overview

Attribute grammars allow the embedding of attributes in the parse tree. These attributes can be computed using the attributes of other nodes. Attributes that are computed using values from the parent node are inherited, whereas those using values from the child nodes are synthesized.

4.2 Attribute grammars in Stratego/XT

Our implementation (David, 2004) embeds Stratego code in the SDF grammar in order to compute the values. Attributes are arbitrary terms, which can be trees, integers, strings, or even tables. This flexibility is useful because we need to handle various types of attributes, such as symbol tables. After parsing, an evaluator computes the attributes until a fixed point is reached.

4.3 Application

In order to filter ambiguous nodes using attribute grammars, we use a special ok attribute. This attribute is a boolean value which represents the validity of a node. The nodes which are not ok are removed from the tree after the attribute processing is done. Thus, we just need to adjust the ok attributes of some nodes, depending on the symbol tables that we generate, in order to strip the invalid derivations.

Declaration

Attributes are embedded in the SDF grammar by means of the attributes annotation. Each attribute resides in a namespace. In Fig. 4.1 we use the disamb namespace.

    "type" Identifier -> Decl
      {attributes(disamb:
         root.out_table := ![(Identifier.common:string, TypeName()) | root.table]
      )}

Figure 4.1: Attribute annotation for the type declaration

There is a single value computed in this rule, which is the out_table attribute of the root node (Decl). The computation is written in Stratego. Basically, it builds a list of pairs, with the first one being (Identifier.common:string, TypeName()) and the other ones taken from the table attribute of the root node. This list is a symbol table, which maps identifier names to the TypeName symbol, which means that the identifier is a type.

Therefore, this rule just adds the identifier's name to the symbol table inherited from root.table, mapped to a TypeName, and puts the result in the root.out_table attribute. The latter will be forwarded to the parent nodes, in order to be used later on.

Use

In order to disambiguate the parse forest, we must first identify the rules that cause the ambiguities. In our example, we know that the rules that map an Identifier to a TypeName or an EnumName are ambiguous. Therefore, we annotate these rules with an ok attribute, as seen in Fig. 4.2.

    Identifier -> TypeName
      {attributes(disamb:
         root.ok := <lookup> (Identifier.common:string, root.table) => TypeName() < !1 + !0
      )}

Figure 4.2: Attribute annotation for the type usage

Then, we need to decide whether the corresponding node is the right one or not. In fact, we need to know whether the Identifier has been declared previously as a TypeName, and this information is located in the symbol table (root.table). Thus, we use the lookup strategy from the standard Stratego library, and produce 1 if the result matches a TypeName and 0 otherwise.

Extension

The disambiguation process for the extension is quite similar to what has been done previously. We use the ClassName term to declare classes in the symbol table. The rest is trivial. See Fig. 4.3.
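The interplay between the symbol table and the ok attribute can be sketched in a few lines of Python. This is a simplified model under our own assumptions: the table is threaded through a flat list of nodes in a single left-to-right pass (the actual evaluator iterates until a fixed point is reached), nodes are plain tuples, and `lookup`, `evaluate` and `prune` are hypothetical helper names, with `lookup` written to return the value of the first matching pair of an association list.

```python
SORT_OF_KEYWORD = {"type": "TypeName", "enum": "EnumName", "class": "ClassName"}


def lookup(key, table):
    """Return the value of the first pair whose key matches, else None."""
    for candidate, value in table:
        if candidate == key:
            return value
    return None


def evaluate(units):
    """Compute the `ok` attribute of every alternative in one pass."""
    table = []  # association list, most recent declaration first
    decorated = []
    for node in units:
        if node[0] == "Decl":
            # out_table := [(name, Sort) | table]
            _, keyword, name = node
            table = [(name, SORT_OF_KEYWORD[keyword])] + table
            decorated.append(node)
        else:
            # ok := 1 if the table maps the identifier to this sort, else 0
            alternatives = [
                (sort, name, 1 if lookup(name, table) == sort else 0)
                for sort, name in node[1]
            ]
            decorated.append(("amb", alternatives))
    return decorated


def prune(decorated):
    """Remove the alternatives that are not ok."""
    result = []
    for node in decorated:
        if node[0] == "amb":
            valid = [(sort, name) for sort, name, ok in node[1] if ok]
            node = valid[0] if len(valid) == 1 else ("amb", valid)
        result.append(node)
    return result


forest = [
    ("Decl", "type", "a"),
    ("Decl", "enum", "b"),
    ("amb", [("TypeName", "a"), ("EnumName", "a")]),
    ("amb", [("TypeName", "b"), ("EnumName", "b")]),
]
tree = prune(evaluate(forest))
print(tree[2], tree[3])  # ('TypeName', 'a') ('EnumName', 'b')
```

In this model, a use of an undeclared identifier loses all of its alternatives, since none of them is ok. Also note that the newest entry shadows older ones, because new pairs are added at the head of the table.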

    context-free syntax
      "class" Identifier -> Decl
        {attributes(disamb:
           root.out_table := ![(Identifier.common:string, ClassName()) | root.table]
        )}
      Identifier -> ClassName
        {attributes(disamb:
           root.ok := <lookup> (Identifier.common:string, root.table) => ClassName() < !1 + !0
        )}
      ClassName -> Use
        {attributes(disamb:
           ClassName.table := !root.table
        )}

Figure 4.3: Attribute annotations for disambiguating the extension

4.4 Discussion

Strengths

Attribute grammars are conceptually elegant, being a combination of declarativeness at the rule level and imperativeness at the attribute level. The rules are pretty much independent from each other.

Attribute grammars are naturally extended, just like the SDF grammar. Since the disambiguation code is broken down at the rule level, adding new rules tends to fit well in the existing code.

Using the Stratego language to compute the attributes brings several good features, such as the flexibility of the ATerm format, the whole Stratego standard library, and more generally the expressiveness of the language.

Weaknesses

Separation of concerns is hard to achieve. Attaching all the attribute processing to the rules means that different kinds of processing cannot be easily separated or disabled. Our implementation does provide namespaces, which is an improvement concerning this problem, but this is not a full-scale solution.

Attribute grammars tend to clutter the grammar. The code for the attribute computation is somewhat mixed with the rule declarations and the other annotations. When the code gets long and complicated, the grammar becomes less readable.

Trivial propagation of attributes requires boilerplate code that is cumbersome to maintain, and adds to the clutter. Newer versions of our implementation introduce syntactic sugar in order to automatically generate these traversals.

Conclusion

We have seen how to perform semantic disambiguation on a toy example. Three different approaches have been explored:

- Term rewriting using the Algebraic Specification Formalism (ASF), which is purely declarative.
- Term rewriting using the paradigm of strategies (Stratego), which is mostly imperative (functional style).
- Attribute grammars, using Stratego code to compute the attributes, thus mixing declarativeness and imperativeness.

Based on our previous experience with semantic disambiguation on the fairly large grammars of standard C/C++ (Anisko et al., 2003), it seems that attribute grammars are the formalism that fits this task best. However, we cannot really draw a definitive conclusion based on the toy example of this report. At the minimum, it needs to be enriched with the following features: scopes, namespaces (named scopes), and templates. These features will show how the different paradigms cope with the real-world language constructs that cause the biggest problems. Thus, this work is to be continued.

Bibliography

Anisko, R., David, V., and Vasseur, C. (2003). Transformers: a C++ program transformation framework. Technical Report 0310, LRDE Seminar-ClementVasseur-Transformers-Report.

David, V. (2004). Attribute grammars for C++ disambiguation. Technical Report 0405, LRDE.

de Jonge, M. (2000). A pretty-printer for every occasion.

de Jonge, M., Visser, E., and Visser, J. (2001). XT: A bundle of program transformation tools. In van den Brand, M. G. J. and Perigot, D., editors, Workshop on Language Descriptions, Tools and Applications (LDTA 01), volume 44 of Electronic Notes in Theoretical Computer Science. Elsevier Science Publishers.

ISO/IEC (1999). ISO/IEC 9899:1999 (E). Programming languages - C.

ISO/IEC (2003). ISO/IEC 14882:2003 (E). Programming languages - C++.

Tomita, M. (1985). Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems. Kluwer Academic Publishers.

van den Brand, M., Klusener, S., Moonen, L., and Vinju, J. J. (2003a). Generalized parsing and term rewriting: Semantics driven disambiguation. Volume 82 of Electronic Notes in Theoretical Computer Science. Elsevier Science Publishers.

van den Brand, M. G. J., Klint, P., and Vinju, J. J. (2003b). Term rewriting with traversal functions. ACM Trans. Softw. Eng. Methodol., 12(2):

Visser, E. (1997a). A family of syntax definition formalisms. Technical report.

Visser, E. (1997b). Scannerless generalized-LR parsing. Technical Report P9707, Programming Research Group, University of Amsterdam.

Visser, E. (2002). Meta-programming with concrete object syntax. In Batory, D., Consel, C., and Taha, W., editors, Generative Programming and Component Engineering (GPCE 02), volume 2487 of Lecture Notes in Computer Science, pages , Pittsburgh, PA, USA. Springer-Verlag.

Visser, E. (2004). Strategies for Program Transformation. Draft available from


More information

Disambiguation Filters for Scannerless Generalized LR Parsers

Disambiguation Filters for Scannerless Generalized LR Parsers Disambiguation Filters for Scannerless Generalized LR Parsers Mark G. J. van den Brand 1,4, Jeroen Scheerder 2, Jurgen J. Vinju 1, and Eelco Visser 3 1 Centrum voor Wiskunde en Informatica (CWI) Kruislaan

More information

Scannerless Boolean Parsing

Scannerless Boolean Parsing LDTA 2006 Preliminary Version Scannerless Boolean Parsing Adam Megacz Computer Science UC Berkeley Abstract Scannerless generalized parsing techniques allow parsers to be derived directly from unified,

More information

Disambiguation Filters for Scannerless Generalized LR Parsers

Disambiguation Filters for Scannerless Generalized LR Parsers Disambiguation Filters for Scannerless Generalized LR Parsers M.G.J. van den Brand 1,4, J. Scheerder 2, J.J. Vinju 1, and E. Visser 3 1 Centrum voor Wiskunde en Informatica (CWI), Kruislaan 413, 1098 SJ

More information

Conception of a static oriented language : an overview of SCOOL

Conception of a static oriented language : an overview of SCOOL Conception of a static oriented language : an overview of SCOOL Thomas Moulard Technical Report n o 0610, June 2006 revision 963 SCOOL is a static oriented language designed to solve the problems encountered

More information

TWEAST: A Simple and Effective Technique to Implement Concrete-Syntax AST Rewriting Using Partial Parsing

TWEAST: A Simple and Effective Technique to Implement Concrete-Syntax AST Rewriting Using Partial Parsing TWEAST: A Simple and Effective Technique to Implement Concrete-Syntax AST Rewriting Using Partial Parsing Akim Demaille Roland Levillain Benoît Sigoure EPITA Research and Development Laboratory (LRDE)

More information

Semantic Analysis. Lecture 9. February 7, 2018

Semantic Analysis. Lecture 9. February 7, 2018 Semantic Analysis Lecture 9 February 7, 2018 Midterm 1 Compiler Stages 12 / 14 COOL Programming 10 / 12 Regular Languages 26 / 30 Context-free Languages 17 / 21 Parsing 20 / 23 Extra Credit 4 / 6 Average

More information

Generalized Parsing and Term Rewriting: Semantics Driven Disambiguation

Generalized Parsing and Term Rewriting: Semantics Driven Disambiguation LDTA 03 Preliminary Version Generalized Parsing and Term Rewriting: Semantics Driven Disambiguation M.G.J. van den Brand 2,4, A.S. Klusener 3,4, L. Moonen 1, J.J. Vinju 1 1 Centrum voor Wiskunde en Informatica

More information

SERG. Natural and Flexible Error Recovery for Generated Modular Language Environments

SERG. Natural and Flexible Error Recovery for Generated Modular Language Environments Delft University of Technology Software Engineering Research Group Technical Report Series Natural and Flexible Error Recovery for Generated Modular Language Environments Maartje de Jonge, Lennart C.L.

More information

Domain-Specific Languages for Composable Editor Plugins

Domain-Specific Languages for Composable Editor Plugins Domain-Specific Languages for Composable Editor Plugins LDTA 2009, York, UK Lennart Kats (me), Delft University of Technology Karl Trygve Kalleberg, University of Bergen Eelco Visser, Delft University

More information

The Syntax Definition Formalism SDF

The Syntax Definition Formalism SDF Mark van den Brand Paul Klint Jurgen Vinju 2007-10-22 17:18:09 +0200 (Mon, 22 Oct 2007) Table of Contents An Introduction to SDF... 2 Why use SDF?... 2 How to use SDF... 4 Learning more... 4 This document...

More information

Book. Signatures and grammars. Signatures and grammars. Syntaxes. The 4-layer architecture

Book. Signatures and grammars. Signatures and grammars. Syntaxes. The 4-layer architecture Book Generic Language g Technology (2IS15) Syntaxes Software Language g Engineering g by Anneke Kleppe (Addison Wesley) Prof.dr. Mark van den Brand / Faculteit Wiskunde en Informatica 13-9-2011 PAGE 1

More information

SPIN s Promela to Java Compiler, with help from Stratego

SPIN s Promela to Java Compiler, with help from Stratego SPIN s Promela to Java Compiler, with help from Stratego Master s Thesis Edwin Vielvoije SPIN s Promela to Java Compiler, with help from Stratego THESIS submitted in partial fulfillment of the requirements

More information

An Evaluation of Domain-Specific Language Technologies for Code Generation

An Evaluation of Domain-Specific Language Technologies for Code Generation An Evaluation of Domain-Specific Language Technologies for Code Generation Christian Schmitt, Sebastian Kuckuk, Harald Köstler, Frank Hannig, Jürgen Teich Hardware/Software Co-Design, System Simulation,

More information

More On Syntax Directed Translation

More On Syntax Directed Translation More On Syntax Directed Translation 1 Types of Attributes We have productions of the form: A X 1 X 2 X 3... X n with semantic rules of the form: b:= f(c 1, c 2, c 3,..., c n ) where b and the c s are attributes

More information

Parsing II Top-down parsing. Comp 412

Parsing II Top-down parsing. Comp 412 COMP 412 FALL 2018 Parsing II Top-down parsing Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled

More information

Object-oriented Compiler Construction

Object-oriented Compiler Construction 1 Object-oriented Compiler Construction Extended Abstract Axel-Tobias Schreiner, Bernd Kühl University of Osnabrück, Germany {axel,bekuehl}@uos.de, http://www.inf.uos.de/talks/hc2 A compiler takes a program

More information

CS Exam #1-100 points Spring 2011

CS Exam #1-100 points Spring 2011 CS 4700 - Exam #1-100 points Spring 2011 Fill in the blanks (1 point each) 1. syntactic sugar is a term coined for additions to the syntax of a computer language that do not affect its expressiveness but

More information

Domain-Specific Languages for Composable Editor Plugins

Domain-Specific Languages for Composable Editor Plugins Electronic Notes in Theoretical Computer Science 253 (2010) 149 163 www.elsevier.com/locate/entcs Domain-Specific Languages for Composable Editor Plugins Lennart C. L. Kats,1,2 Karl T. Kalleberg +,3 Eelco

More information

7. Introduction to Denotational Semantics. Oscar Nierstrasz

7. Introduction to Denotational Semantics. Oscar Nierstrasz 7. Introduction to Denotational Semantics Oscar Nierstrasz Roadmap > Syntax and Semantics > Semantics of Expressions > Semantics of Assignment > Other Issues References > D. A. Schmidt, Denotational Semantics,

More information

Programming Languages Third Edition

Programming Languages Third Edition Programming Languages Third Edition Chapter 12 Formal Semantics Objectives Become familiar with a sample small language for the purpose of semantic specification Understand operational semantics Understand

More information

Stratego/XT 0.16: Components for Transformation Systems

Stratego/XT 0.16: Components for Transformation Systems Stratego/XT 0.16: Components for Transformation Systems Martin Bravenboer Department of Information and Computing Sciences, Utrecht University martin@cs.uu.nl Karl Trygve Kalleberg Department of Informatics

More information

Compiler construction

Compiler construction Compiler construction Martin Steffen March 13, 2017 Contents 1 Abstract 1 1.1 Symbol tables. 1 1.1.1 Introduction 1 1.1.2 Symbol table design and interface.. 2 1.1.3 Implementing symbol tables 3 1.1.4

More information

Stratego/XT 0.16: Components for Transformation Systems

Stratego/XT 0.16: Components for Transformation Systems Stratego/XT 0.16: Components for Transformation Systems Martin Bravenboer Karl Trygve Kalleberg Rob Vermaas Eelco Visser Technical Report UU-CS-2005-052 Department of Information and Computing Sciences

More information

Single-pass Static Semantic Check for Efficient Translation in YAPL

Single-pass Static Semantic Check for Efficient Translation in YAPL Single-pass Static Semantic Check for Efficient Translation in YAPL Zafiris Karaiskos, Panajotis Katsaros and Constantine Lazos Department of Informatics, Aristotle University Thessaloniki, 54124, Greece

More information

Syntax-Directed Translation

Syntax-Directed Translation Syntax-Directed Translation ALSU Textbook Chapter 5.1 5.4, 4.8, 4.9 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 What is syntax-directed translation? Definition: The compilation

More information

Pretty-Printing for Software Reengineering

Pretty-Printing for Software Reengineering Pretty-Printing for Software Reengineering Merijn de Jonge CWI P.O. Box 94079, 1090 GB Amsterdam, The Netherlands Merijn.de.Jonge@cwi.nl Abstract Automatic software reengineerings change or repair existing

More information

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing

Parsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides

More information

CA Compiler Construction

CA Compiler Construction CA4003 - Compiler Construction Semantic Analysis David Sinclair Semantic Actions A compiler has to do more than just recognise if a sequence of characters forms a valid sentence in the language. It must

More information

SERG. Domain-Specific Languages for Composable Editor Plugins

SERG. Domain-Specific Languages for Composable Editor Plugins Delft University of Technology Software Engineering Research Group Technical Report Series Domain-Specific Languages for Composable Editor Plugins Lennart C. L. Kats, Karl T. Kalleberg, Eelco Visser Report

More information

Interactive Disambiguation of Meta Programs with Concrete Object Syntax

Interactive Disambiguation of Meta Programs with Concrete Object Syntax Interactive Disambiguation of Meta Programs with Concrete Object Syntax Lennart Kats Karl T. Kalleberg Eelco Visser (TUDelft) (KolibriFX) (TUDelft) Meta-programming Meta-programming with Template Engines

More information

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far Outline Semantic Analysis The role of semantic analysis in a compiler A laundry list of tasks Scope Static vs. Dynamic scoping Implementation: symbol tables Types Statically vs. Dynamically typed languages

More information

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS Objective PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS Explain what is meant by compiler. Explain how the compiler works. Describe various analysis of the source program. Describe the

More information

The Meta-Environment. A Language Workbench for Source code Analysis, Visualization and Transformation. Jurgen Vinju IPA herfstdagen 2008

The Meta-Environment. A Language Workbench for Source code Analysis, Visualization and Transformation. Jurgen Vinju IPA herfstdagen 2008 The Meta-Environment A Language Workbench for Source code Analysis, Visualization and Transformation Jurgen Vinju IPA herfstdagen 2008 Overview Goal of The Meta-Environment Methods of The Meta-Environment

More information

What do Compilers Produce?

What do Compilers Produce? What do Compilers Produce? Pure Machine Code Compilers may generate code for a particular machine, not assuming any operating system or library routines. This is pure code because it includes nothing beyond

More information

RLSRunner: Linking Rascal with K for Program Analysis

RLSRunner: Linking Rascal with K for Program Analysis RLSRunner: Linking Rascal with K for Program Analysis Mark Hills 1,2, Paul Klint 1,2, and Jurgen J. Vinju 1,2 1 Centrum Wiskunde & Informatica, Amsterdam, The Netherlands 2 INRIA Lille Nord Europe, France

More information

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction Topics Chapter 3 Semantics Introduction Static Semantics Attribute Grammars Dynamic Semantics Operational Semantics Axiomatic Semantics Denotational Semantics 2 Introduction Introduction Language implementors

More information

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis? Anatomy of a Compiler Program (character stream) Lexical Analyzer (Scanner) Syntax Analyzer (Parser) Semantic Analysis Parse Tree Intermediate Code Generator Intermediate Code Optimizer Code Generator

More information

A Functional Graph Library

A Functional Graph Library A Functional Graph Library Christian Doczkal Universität des Saarlandes Abstract. Algorithms on graphs are of great importance, both in teaching and in the implementation of specific problems. Martin Erwig

More information

Verifiable composition of language extensions

Verifiable composition of language extensions Verifiable composition of language extensions Ted Kaminski Department of Computer Science and Engineering University of Minnesota, Minneapolis, MN, USA tedinski@cs.umn.edu Abstract. Domain-specific languages

More information

2.2 Syntax Definition

2.2 Syntax Definition 42 CHAPTER 2. A SIMPLE SYNTAX-DIRECTED TRANSLATOR sequence of "three-address" instructions; a more complete example appears in Fig. 2.2. This form of intermediate code takes its name from instructions

More information

The role of semantic analysis in a compiler

The role of semantic analysis in a compiler Semantic Analysis Outline The role of semantic analysis in a compiler A laundry list of tasks Scope Static vs. Dynamic scoping Implementation: symbol tables Types Static analyses that detect type errors

More information

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

More information

Low-level optimization

Low-level optimization Low-level optimization Advanced Course on Compilers Spring 2015 (III-V): Lecture 6 Vesa Hirvisalo ESG/CSE/Aalto Today Introduction to code generation finding the best translation Instruction selection

More information

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer: Theoretical Part Chapter one:- - What are the Phases of compiler? Six phases Scanner Parser Semantic Analyzer Source code optimizer Code generator Target Code Optimizer Three auxiliary components Literal

More information

Model transformations. Overview of DSLE. Model transformations. Model transformations. The 4-layer architecture

Model transformations. Overview of DSLE. Model transformations. Model transformations. The 4-layer architecture Overview of DSLE Model driven software engineering g in general Grammars, signatures and meta-models DSL Design Code generation Models increase the level of abstraction used for both hardware and software

More information

How to make a bridge between transformation and analysis technologies?

How to make a bridge between transformation and analysis technologies? How to make a bridge between transformation and analysis technologies? J.R. Cordy and J.J. Vinju July 19, 2005 1 Introduction At the Dagstuhl seminar on Transformation Techniques in Software Engineering

More information

Project 1: Scheme Pretty-Printer

Project 1: Scheme Pretty-Printer Project 1: Scheme Pretty-Printer CSC 4101, Fall 2017 Due: 7 October 2017 For this programming assignment, you will implement a pretty-printer for a subset of Scheme in either C++ or Java. The code should

More information

Stratego: A Language for Program Transformation Based on Rewriting Strategies

Stratego: A Language for Program Transformation Based on Rewriting Strategies Stratego: A Language for Program Transformation Based on Rewriting Strategies System Description of Stratego 0.5 Eelco Visser Institute of Information and Computing Sciences, Universiteit Utrecht, P.O.

More information

Introduction to Parsing

Introduction to Parsing Introduction to Parsing The Front End Source code Scanner tokens Parser IR Errors Parser Checks the stream of words and their parts of speech (produced by the scanner) for grammatical correctness Determines

More information

CSE 12 Abstract Syntax Trees

CSE 12 Abstract Syntax Trees CSE 12 Abstract Syntax Trees Compilers and Interpreters Parse Trees and Abstract Syntax Trees (AST's) Creating and Evaluating AST's The Table ADT and Symbol Tables 16 Using Algorithms and Data Structures

More information

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table COMPILER CONSTRUCTION Lab 2 Symbol table LABS Lab 3 LR parsing and abstract syntax tree construction using ''bison' Lab 4 Semantic analysis (type checking) PHASES OF A COMPILER Source Program Lab 2 Symtab

More information

Climb - A Generic and Dynamic Approach to Image Processing

Climb - A Generic and Dynamic Approach to Image Processing Climb - A Generic and Dynamic Approach to Image Processing Christopher Chedeau Technical Report n o 1001, Juin 2010 revision 2187 Climb is a generic image processing library. A case study of the erosion

More information

In Our Last Exciting Episode

In Our Last Exciting Episode In Our Last Exciting Episode #1 Lessons From Model Checking To find bugs, we need specifications What are some good specifications? To convert a program into a model, we need predicates/invariants and

More information

Programming Project II

Programming Project II Programming Project II CS 322 Compiler Construction Winter Quarter 2006 Due: Saturday, January 28, at 11:59pm START EARLY! Description In this phase, you will produce a parser for our version of Pascal.

More information

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl Compilerconstructie najaar 2013 http://www.liacs.nl/home/rvvliet/coco/ Rudy van Vliet kamer 124 Snellius, tel. 071-527 5777 rvvliet(at)liacs(dot)nl college 1, dinsdag 3 september 2013 Overview 1 Why this

More information

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design i About the Tutorial A compiler translates the codes written in one language to some other language without changing the meaning of the program. It is also expected that a compiler should make the target

More information

A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

More information

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

More information

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target. COMP 412 FALL 2017 Computing Inside The Parser Syntax-Directed Translation Comp 412 source code IR IR target Front End Optimizer Back End code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights

More information

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End Outline Semantic Analysis The role of semantic analysis in a compiler A laundry list of tasks Scope Static vs. Dynamic scoping Implementation: symbol tables Types Static analyses that detect type errors

More information

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal)

Parsing Part II. (Ambiguity, Top-down parsing, Left-recursion Removal) Parsing Part II (Ambiguity, Top-down parsing, Left-recursion Removal) Ambiguous Grammars Definitions If a grammar has more than one leftmost derivation for a single sentential form, the grammar is ambiguous

More information

Arbori Starter Manual Eugene Perkov

Arbori Starter Manual Eugene Perkov Arbori Starter Manual Eugene Perkov What is Arbori? Arbori is a query language that takes a parse tree as an input and builds a result set 1 per specifications defined in a query. What is Parse Tree? A

More information

COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis Prelude COMP 8 October, 9 What is triskaidekaphobia? Fear of the number s? No aisle in airplanes, no th floor in buildings Fear of Friday the th? Paraskevidedekatriaphobia or friggatriskaidekaphobia Why

More information

Programming Assignment III

Programming Assignment III Programming Assignment III First Due Date: (Grammar) See online schedule (submission dated midnight). Second Due Date: (Complete) See online schedule (submission dated midnight). Purpose: This project

More information

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation Language Implementation Methods The Design and Implementation of Programming Languages Compilation Interpretation Hybrid In Text: Chapter 1 2 Compilation Interpretation Translate high-level programs to

More information

Chapter 1. Formatting with Box and Pandora

Chapter 1. Formatting with Box and Pandora Chapter 1. Formatting with Box and Pandora Paul Klint Taeke Kooiker Jurgen Vinju 2007-10-20 22:13:15 +0200 (Sat, 20 Oct 2007) Table of Contents An introduction to Formatting... 1 Why Formatting?... 1 What

More information

A Tour of the Cool Support Code

A Tour of the Cool Support Code A Tour of the Cool Support Code 1 Introduction The Cool compiler project provides a number of basic data types to make the task of writing a Cool compiler tractable in the timespan of the course. This

More information

Concrete Syntax for Objects

Concrete Syntax for Objects Realizing Domain-Specific Language Embedding and Assimilation without Restrictions Martin Bravenboer Eelco Visser Institute of Information & Computing Sciences Utrecht University, The Netherlands June

More information

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions

More information

Grammars and Parsing. Paul Klint. Grammars and Parsing

Grammars and Parsing. Paul Klint. Grammars and Parsing Paul Klint Grammars and Languages are one of the most established areas of Natural Language Processing and Computer Science 2 N. Chomsky, Aspects of the theory of syntax, 1965 3 A Language...... is a (possibly

More information

CS 406: Syntax Directed Translation

CS 406: Syntax Directed Translation CS 406: Syntax Directed Translation Stefan D. Bruda Winter 2015 SYNTAX DIRECTED TRANSLATION Syntax-directed translation the source language translation is completely driven by the parser The parsing process

More information

CS131 Compilers: Programming Assignment 2 Due Tuesday, April 4, 2017 at 11:59pm

CS131 Compilers: Programming Assignment 2 Due Tuesday, April 4, 2017 at 11:59pm CS131 Compilers: Programming Assignment 2 Due Tuesday, April 4, 2017 at 11:59pm Fu Song 1 Policy on plagiarism These are individual homework. While you may discuss the ideas and algorithms or share the

More information

Restricting Grammars with Tree Automata

Restricting Grammars with Tree Automata Restricting Grammars with Tree Automata MICHAEL D ADAMS and MATTHEW MIGHT, University of Utah, USA Precedence and associativity declarations in systems like yacc resolve ambiguities in context-free grammars

More information

Summary: Semantic Analysis

Summary: Semantic Analysis Summary: Semantic Analysis 1 Basic Concepts When SA is performed: Semantic Analysis may be performed: In a two-pass compiler: after syntactic analysis is finished, the semantic analyser if called with

More information

The PCAT Programming Language Reference Manual

The PCAT Programming Language Reference Manual The PCAT Programming Language Reference Manual Andrew Tolmach and Jingke Li Dept. of Computer Science Portland State University September 27, 1995 (revised October 15, 2002) 1 Introduction The PCAT language

More information

CSE 401/M501 Compilers

CSE 401/M501 Compilers CSE 401/M501 Compilers ASTs, Modularity, and the Visitor Pattern Hal Perkins Autumn 2018 UW CSE 401/M501 Autumn 2018 H-1 Agenda Today: AST operations: modularity and encapsulation Visitor pattern: basic

More information

Pioneering Compiler Design

Pioneering Compiler Design Pioneering Compiler Design NikhitaUpreti;Divya Bali&Aabha Sharma CSE,Dronacharya College of Engineering, Gurgaon, Haryana, India nikhita.upreti@gmail.comdivyabali16@gmail.com aabha6@gmail.com Abstract

More information

CS 426 Fall Machine Problem 1. Machine Problem 1. CS 426 Compiler Construction Fall Semester 2017

CS 426 Fall Machine Problem 1. Machine Problem 1. CS 426 Compiler Construction Fall Semester 2017 CS 426 Fall 2017 1 Machine Problem 1 Machine Problem 1 CS 426 Compiler Construction Fall Semester 2017 Handed Out: September 6, 2017. Due: September 21, 2017, 5:00 p.m. The machine problems for this semester

More information

Database Systems. Project 2

Database Systems. Project 2 Database Systems CSCE 608 Project 2 December 6, 2017 Xichao Chen chenxichao@tamu.edu 127002358 Ruosi Lin rlin225@tamu.edu 826009602 1 Project Description 1.1 Overview Our TinySQL project is implemented

More information

Automatic Generation of Graph Models for Model Checking

Automatic Generation of Graph Models for Model Checking Automatic Generation of Graph Models for Model Checking E.J. Smulders University of Twente edwin.smulders@gmail.com ABSTRACT There exist many methods to prove the correctness of applications and verify

More information

Faster Scannerless GLR Parsing

Faster Scannerless GLR Parsing Faster Scannerless GLR Parsing Giorgios Economopoulos, Paul Klint, and Jurgen Vinju Centrum voor Wiskunde en Informatica (CWI), Kruislaan 413, 1098 SJ Amsterdam, The Netherlands Abstract. Analysis and

More information

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised: EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing Görel Hedin Revised: 2017-09-04 This lecture Regular expressions Context-free grammar Attribute grammar

More information

Extended abstract. The Pivot: A brief overview

Extended abstract. The Pivot: A brief overview Extended abstract The Pivot: A brief overview Bjarne Stroustrup and Gabriel Dos Reis bs@cs.tamu.edu, gdr@cs.tamu.edu Abstract This paper introduces the Pivot, a general framework for the analysis and transformation

More information