COMPILER DESIGN. Syntax-Directed Translation. COMP 442/6421 Compiler Design. Department of Computer Science and Software Engineering

Concordia University 1 COMPILER DESIGN Syntax-Directed Translation

Syntax-directed translation: analysis and synthesis Translation process driven by the syntactic structure of the program, as generated by the parser. 2 In syntax-directed translation, the semantic analysis and translation steps of the compilation process is divided in two parts: analysis (syntactic, semantic) synthesis (translation and optimization) The semantic analysis becomes the link between analysis and synthesis: translation (synthesis) is conditional to positive semantic analysis. The syntax-directed translation process is inducing very strong coupling between the syntax analysis phase, the semantic checking phase, and the translation phase.

Syntax-directed translation: semantic actions, semantic values, attribute migration Semantic actions are integrated in the parsing process. 3 Semantic actions are functions whose goal is to accumulate and/or process semantic information about the program as the parsing is done. Different pieces of information are gathered during the parsing of different syntactical rules. These pieces of information, named semantic values, or semantic attributes are gathered and accumulated in semantic records until they can be processed by further semantic actions. Some of the information that must be gathered spans over the application of several syntactical rules. This raises the need to migrate this information across the parse tree, a process known as attribute migration.

Syntax-directed translation: semantic actions Some semantic actions (implemented as semantic routines) do the analysis phase by performing semantic aggregation and/or checking in productions that need such aggregation and/or checks, depending on the semantic rules defined by the language specification, e.g. type checking/aggregation. 4 = type(a=b+c) A = B + C; type(a) A + type(b+c) type(b) Semantic actions assemble information in order to validate and eventually generate a meaning (i.e. translation) for the program elements generated by the productions. They are the starting point for translation(synthesis). Thus, the semantic routines are the heart of the compiler. B C type(c)

Syntax-directed translation: relationship to parse trees, nodes, and parsing methods Conceptually, semantic actions are associated with parse tree nodes: Semantic actions at the leaves of the tree typically gather semantic information, either from the token the node represents, often with the help from a lookup in the symbol table (i.e. to get the type of an identifier). Semantic actions in intermediate nodes typically aggregate, validate, or use semantic information and pass the information up in the tree (i.e. perform attribute migration). 5 As parse trees are also related to grammar rules, so are semantic actions. Semantic actions are inserted in the right-hand sides of the grammar: In recursive-descent predictive parsing, they are represented by function calls inserted in the parsing functions, attribute migration is performed by using parameters passed and returned between parsing functions. In table-driven parsers, semantic action symbols are inserted in the right-hand-sides and pushed onto the stack. Popping a semantic action symbol from the stack triggers the corresponding semantic action. Attribute migration is made by way of the stack. Some parsers may have to rely on a separate semantic stack. Alternatively, recursive-descent predictive parsers can also use a semantic stack instead of local variables and parameter passing/return values.

Syntax-directed translation and attribute grammars Semantic routines can be formalized using attribute grammars. Attribute grammars augment ordinary context-free grammars with attributes that represent semantic properties such as type, value or correctness used in semantic analysis (checking) and code generation (translation). 6 It is useful to keep checking and translation facilities distinct in the semantic routines implementation. Semantic checking is machine-independent and code generation is not, so separating them gives more flexibility to the compiler (front/back end).

Attributes An attribute is a property of a programming language construct, including data type, value, memory location/size, translated code, etc. Implementation-wise, they are also called semantic records. 7 The process of computing the value of an attribute is called binding. Static binding concerns binding that can be done at compile-time, and dynamic binding happens at run-time, e.g. for polymorphism. In our project, we are concerned solely on static compile-time binding.

Attributes migration Static attribute binding is done by gathering, propagating, and aggregating attributes while traversing the parse tree. Attributes are gathered at tree leaves, propagated across tree nodes, and aggregated at some parent nodes when additional information is available. This can be done as the program is being parsed using syntax-directed translation. 8 Synthetized attributes : attributes gathered from a child in the syntax tree Inherited attributes : attributes gathered from a sibling in the syntax tree

Example: Semantic rules and attribute migration 9 E 1 id = E 2 E = : [E 2 : v: ] [(id: ) defvar] [id.val v] [E 1.val: v] Y=3*X+Z E(v 3*vx+vz : ) E 1 E 2 E 3 E * : [E 2 : v 2 : ] [E 3 : v 3 : ] [E 1.val: v 2 v 3 ] id(v Y : ) = E(v (3*vx)+vz : ) E 1 E 2 + E 3 E + : [E 2 : v 2 : ] [E 3 : v 3 : ] [E 1.val: v 2 +v 3 ] E(v 3*vx : ) + E(v Z : ) E id E id : [(id: ) defvar] [(E.val: ) id.val] E(3: ) * E(v X : ) id(v Z : ) E const E const : [const : v: ] [(E.val: ) v] const(3: ) id(v X : ) Here, semantic information only flows upwards in the tree. All aggregated semantic information is available on the same subtree.

Example 2: attribute migration Problems arise when rules are factorized: 10 E a+b*c E TE E +TE T FT T FT F id const F 1 id(v a : ) T 1 T 1 F 2 + E 1 T 2 T 2 E 2 id(v b : ) * F 3 T 3 id(v c : )

Example 2: attribute migration Solution: migrate attributes sideways, i.e. inherited attributes 11 E a+b*c E TE E +TE T FT T FT F id const F 1 id(v a : ) T 1 T 1 F 2 + E 1 T 2 T 2 E 2 id(v b : ) * F 3 T 3 id(v c : ) Here, left operands are available on a different subtree, so they have to be migrated sideways across different subtrees in order to be aggregated.

Attribute migration: implementation in recursive-descent predictive parser 12 Parse(){ semrec Es } lookahead = NextToken() if (E(Es);Match('$')) return(true); return(false); //semantic record created //before the call //passed as a reference //to parsing functions //that will compute its value semrec potentially represents any kind of semantic record. In this example, it represents semantic records that carry type information across the parsing of an expression. It could also be tree nodes that are created, migrated and grafted/adopted in order to construct an abstract syntax tree. It could also in fact be both, as the tree is created, information, such as type or symbol tables can be gathered as the parse is done or the tree is traversed.

Attribute migration: implementation in recursive-descent predictive parser 13 E(semrec &Es){ semrec Ts,E's if (lookahead is in [0,1,(]) if (T(Ts);E'(Ts,E's);) write(e->te') Es = E's return(true) return(false) return(false) } // E' inherits Ts from T // Synthetised attribute sent up Each parsing function potentially defines its own semantic records used locally the processing of its own subtree. Ts,E's are semantic records produced/used by the T() and E () functions and returned by them to the E() function.

Attribute migration: implementation in recursive-descent predictive parser 14 E'(semrec &Ti, type &E's){ semrec Ts,E'2s if (lookahead is in [+]) if (Match('+');T(Ts);E'(Ts,E'2s)) write(e'->te') E's = semcheckop(ti,e'2s) } return(true) return(false) if (lookahead is in [$,)] write(e'->epsilon) E's = Ti return(true) return(false) // E' inherits from T // Semantic check & synthetized // attribute sent up // Synth. attr. is inhertied // from T (sibling, not child) // and sent up Some semantic actions will do some semantic checking and/or semantic aggregation, such as a tree node adopting a child node, or inferring the type of an expression from two child operands.

Attribute migration: implementation in recursive-descent predictive parser 15 T(semrec &Ts){ semrec Fs, T's if (lookahead is in [0,1,(]) if (F(Fs);T'(Fs,T's);) write(t->ft') Ts = T's return(true) return(false) return(false) } // T' inherits Fs from F // Synthetized attribute sent up

Attribute migration: implementation in recursive-descent predictive parser 16 T'(semrec &Fi, type &T's){ semrec Fs, T'2s if (lookahead is in [*]) if (Match('*');F(Fs);T'(Fs,T'2s)) write(t'->*ft') T's = semcheckop(fi,t'2s) } return(true) return(false) if (lookahead is in [+,$,)] write(t'->epsilon) T's = Fi return(true) return(false) // T' inherits from F // Semantic check and // synthetized attribute sent up // Synthetized attribute is // inhertied from F sibling // and sent up the tree

Attribute migration: implementation in recursive-descent predictive parser 17 F(semrec &Fs){ semrec Es if (lookahead is in [id]) if (Match('id')) write(f->id) Fs = gettype(id.name,table) } return(true) return(false) if (lookahead is in [(]) if (Match('(');E(Es);Match(')')) write(f->(e)) Fs = Es return(true) return(false) return(false) // Gets the attribute ``type'' // from the symbol table and // sends it up the tree as Fs // Synthetized attribute from E // sent up the tree as attribute // of F Some semantic actions at the leaves will create new information to be propagated up.

Attribute migration: implementation in recursive-descent predictive parser 18 type semcheckop(type ti,type ts){ if (ti == ts) return(ti) return(typerror) } type gettype(name, table){ if (name is in table) return (type) return(typerror) }

Attribute migration: example 19 Es E a+b*c E s Ts T 1 Ti E 1 E s T s E s Fs F 1 Fi T 1 T s + Ts T 2 Ti E 2 E s Fs T s id(v a : ) Fs F 2 Fi T 2 T s Fs T s id(v b : ) * Fs F 3 Fi T 3 T s Fs id(v c : )