The Zephyr Abstract Syntax Description Language

Similar documents
The Zephyr Abstract Syntax Description Language

PART ONE Fundamentals of Compilation

Fundamentals of Compilation

Compilation 2014 Warm-up project

CPSC 411: Introduction to Compiler Construction

CS 132 Compiler Construction

CS 352: Compilers: Principles and Practice

Compiler Construction

Topic 5: Syntax Analysis III

Topic 3: Syntax Analysis I

Modern Compiler Implementation in ML

Compiler Construction 1. Introduction. Oscar Nierstrasz

Connecting Definition and Use? Tiger Semantic Analysis. Symbol Tables. Symbol Tables (cont d)

Chapter 4. Abstract Syntax

Type Checking. Outline. General properties of type systems. Types in programming languages. Notation for type rules.

Outline. General properties of type systems. Types in programming languages. Notation for type rules. Common type rules. Logical rules of inference

Abstract Syntax. Mooly Sagiv. html://

SML-SYNTAX-LANGUAGE INTERPRETER IN JAVA. Jiahao Yuan Supervisor: Dr. Vijay Gehlot

Data Abstraction. An Abstraction for Inductive Data Types. Philip W. L. Fong.

Compiler Construction 1. Introduction. Oscar Nierstrasz

LECTURE 3. Compiler Phases

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

Compilers. Compiler Construction Tutorial The Front-end

More On Syntax Directed Translation

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

Syntax and Grammars 1 / 21

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

JavaCC. JavaCC File Structure. JavaCC is a Lexical Analyser Generator and a Parser Generator. The structure of a JavaCC file is as follows:

CS 152: Programming Languages

CA Compiler Construction

xtc Robert Grimm Making C Safely Extensible New York University

CS 406: Syntax Directed Translation

ASTs, Objective CAML, and Ocamlyacc

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Review: Concrete syntax for Impcore

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Undergraduate Compilers in a Day

Structure of JavaCC File. JavaCC Rules. Lookahead. Example JavaCC File. JavaCC rules correspond to EBNF rules. JavaCC rules have the form:

5. Syntax-Directed Definitions & Type Analysis

(Not Quite) Minijava

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

JavaCC Rules. Structure of JavaCC File. JavaCC rules correspond to EBNF rules. JavaCC rules have the form:

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

Lecture 14 Sections Mon, Mar 2, 2009

Abstract Syntax Trees

CS 432 Fall Mike Lam, Professor. Code Generation

Lexical and Syntax Analysis

Syntax-Directed Translation. Lecture 14

Compiler construction

CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Autumn /20/ Hal Perkins & UW CSE H-1

Interpreters. Prof. Clarkson Fall Today s music: Step by Step by New Kids on the Block

Syntax. A. Bellaachia Page: 1

Chapter 3. Parsing #1

Meanings of syntax. Norman Ramsey Geoffrey Mainland. COMP 105 Programming Languages Tufts University. January 26, 2015

CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Winter /22/ Hal Perkins & UW CSE H-1

Project Compiler. CS031 TA Help Session November 28, 2011

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

6.184 Lecture 4. Interpretation. Tweaked by Ben Vandiver Compiled by Mike Phillips Original material by Eric Grimson

CMPT 379 Compilers. Parse trees

Semantic actions for declarations and expressions

CSE3322 Programming Languages and Implementation

Compiler Principle and Technology. Prof. Dongming LU April 15th, 2019

Building Compilers with Phoenix

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

Semantic actions for declarations and expressions. Monday, September 28, 15

Modules, Structs, Hashes, and Operational Semantics

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

Time : 1 Hour Max Marks : 30

Semantic actions for declarations and expressions

Lecture 12: Conditional Expressions and Local Binding

Building Compilers with Phoenix

Lecture 09: Data Abstraction ++ Parsing is the process of translating a sequence of characters (a string) into an abstract syntax tree.

EDA180: Compiler Construc6on. Top- down parsing. Görel Hedin Revised: a

CSCE 314 Programming Languages

10/18/18. Outline. Semantic Analysis. Two types of semantic rules. Syntax vs. Semantics. Static Semantics. Static Semantics.

CS558 Programming Languages

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

Comp 411 Principles of Programming Languages Lecture 3 Parsing. Corky Cartwright January 11, 2019

COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Thanks! Review. Course Goals. General Themes in this Course. There are many programming languages. Teaching Assistants. John Mitchell.

Table-Driven Top-Down Parsers

COS 320. Compiling Techniques

CPSC 411, Fall 2010 Midterm Examination

Design issues for objectoriented. languages. Objects-only "pure" language vs mixed. Are subclasses subtypes of the superclass?

Context Analysis. Mooly Sagiv. html://

Abstract Syntax Trees & Top-Down Parsing

CSc 453. Compilers and Systems Software. 13 : Intermediate Code I. Department of Computer Science University of Arizona

Compiler Optimisation

4. Semantic Processing and Attributed Grammars

CSE 582 Autumn 2002 Exam 11/26/02

Project Overview. 1 Introduction. 2 A quick introduction to SOOL

CPS 506 Comparative Programming Languages. Syntax Specification

Type Inference Systems. Type Judgments. Deriving a Type Judgment. Deriving a Judgment. Hypothetical Type Judgments CS412/CS413

Programming Languages and Compilers (CS 421)

n (0 1)*1 n a*b(a*) n ((01) (10))* n You tell me n Regular expressions (equivalently, regular 10/20/ /20/16 4

Transcription:

The Zephyr Abstract Syntax Description Language Daniel C. Wang danwang@cs.princeton.edu Princeton University with Andrew Appel, Jeff Korn, and Chris Serra Copyright 1997, Daniel C. Wang

Modular Compilers parser Front End AST semantic analysis Back End... IR opt-n tokens AST IR IR prg.c lexer translate IR opt-1 code gen. prg.o AST - Abstract Syntax Tree IR - Intermediate Representation

Multi-Language Compiler C Front End C++ Front End Shared Back End IR Java Front End

Compiler Construction Tools syntax desc. parser generator parser AST semantic analysis... IR opt-n tokens AST IR IR lexer translate IR opt-1 code gen. token desc. lexer generator MIPS desc. code gen. generator x86 desc. Sparc desc.

Compiler Glue Construction Tools glue generator Glue desc. AST IR parser AST semantic analysis... IR opt-n tokens lexer AST translate IR IR opt-1 IR code gen.

Experimenting with Optimizers Optimizer IR opt-1 IR my-opt IR opt-n IR

The Optimizer of Babel IR.h opt-1.c?? my-opt.java?? opt-n.sml IR.sml

Write to "Pickle" Files IR.h opt-1.c wr.c IR.pkl 0100101 rd.java opt-2.java wr.java IR.pkl 0100101 rd.sml opt-3.sml IR.sml

Universal Glue Made to Order IR.{c,h} IR.c wr.c rd.c IR desc. Glue Generator structure IR IR.sml wr.sml rd.sml package IR IR.java wr.java rd.java

Abstract Syntax Description Language Serves the same purpose as CORBA s IDL, SGML s DTD or ASN.1 Describes a class of tree data structures with grammar-like description Description An Example exp exp = :=(var, exp) := exp = +(exp, exp) exp = var :=(x, +(y, 1)) x + exp = int y 1

Why Not Use Existing Alternatives? CORBA s IDL biased to C types Does not describe recursive structures well. SGML too complex for machines Documents described by DTDs can be difficult to parse. HTML hard to describe with LR(1) grammar. ASN.1 is a complex language ASN.1 grammar has 300 productions, 150 nonterminals, and semantics defined with 70+ pages.

Design Goals Simple More concise and simpler than alternatives Generate natural and readable code Programmers must be able to use and read the code Produce code for C, C++, Java, and ML Deal with existing IR s such as FLINT, SUIF, and lcc s IR

ASDL By Example Straight Line Program x := 1; y := x + y; print(x,y); Type Constructor Primitives ASDL Description of SLP ASTs stm = Compound(stm, stm) Assign(identifier, exp) Print(exp_list) exp_list = ExpList(exp, exp_list) Nil exp = Id(identifier) Num(int) Op(exp, binop, exp) binop = Plus Minus Times Div

Generated Code (C like Languages) stm = Compound(stm,stm) Assign(identifier,exp) Print(exp_list) Tagged unions and structs asdlgen typedef struct _stm *stm_ty; struct _stm { enum {Compound_kind=1, Assign_kind=2, Print_kind=3} kind; union { struct { stm_ty stm1; stm_ty stm2; } Compound; struct { identifier_ty identifier1; exp_ty exp1; } Assign; struct { exp_list_ty exp_list1; } Print; } v; }; stm_ty Compound(stm_ty stm1, stm_ty stm2); stm_ty Assign(identifier_ty identifier1, exp_ty exp1); stm_ty Print(exp_list_ty exp_list1);

Generated Code (OO Languages) Use one level of inheritance abstract public class stm { protected int _k; public final int kind(){ return _k; } public static final int Compound_kind = 1; public static final int Assign_kind = 2; public static final int Print_kind = 3; } stm public final class Compound extends stm { public stm stm1; public stm stm2; public Compound(stm stm1, stm stm2){...} } public final class Assign extends stm {...} public final class Print extends stm {...} Compound Assign Print

Generated Code (ML) Use ML data types datatype stm = Compound of (stm * stm) Assign of (identifier * exp) Print of (exp_list) and...

Pickle Format Binary format Simple prefix encoding Arbitrary precision integers Designed to be concise x x := * + 1024 1 := x + * x 1024 1

Generated Picklers... void pkl_write_stm(stm_ty x, outstream_ty s) { switch(x->kind) { case Compound_kind: pkl_write_int(1, s); pkl_write_stm(x->v.compound.stm1, s); pkl_write_stm(x->v.compound.stm2, s); break; case Assign_kind: pkl_write_int(2, s);... break; case Print_kind: pkl_write_int(3, s);... break; default: pkl_die(); } }

Other Features Field names, sequences, and optional Product type and attributes Optional Sequence Field Name Product Type Type Attributes pos = (string? file, int line, int offset) stm = Compound(stm head, stm next) Assign(identifier lval, exp rval) Print(exp* args) attributes(pos p)...

C++ SUIF vs. ASDL SUIF C++ SUIF Code that Defines Structure ASDL SUIF ~2300 lines ~200 lines ASDL SUIF ~200 lines C++ ~3300 lines asdlgen Java C ~3600 lines ~3500 lines SML ~1300 lines Total ~12000 lines

ASDL SUIF Description Not Perfect Not all subtyping constraints captured 2 cases Awkward encoding of two levels of inheritance 1 case

Problems with Subtyping abstract public class block { } public class Basic extends block { } public class Funct extends block { } abstract public class table_entry { } public class VarEntry extends table_entry { } public class FunctEntry extends table_entry { public Funct value; } block = Basic( ) Funct( ) table_entry = VarEntry( ) FunctEntry(block, ) More Liberal in ASDL

Encoding Inheritance abstract public class A { ty 0 f 0 ; } public class B extends A { ty 1 f 1 ; } public class C extends B { ty 2 f 2 ; } f 0 f 1 f 0 f 1 f 2 Not Natural in ASDL a = A(ty 0 f 0, kids) kids = B(ty 1 f 1 ) C(ty 1 f 1, ty 2 f 2 ) f 1 f 0 f 1 f 2

Modula-3 vs. ASDL Picklers Modula-3 Arbitrary graphs Modula-3 specific Can pickle all Modula-3 types Uses runtime type information ASDL Just trees Less language specific Can pickle only types defined in ASDL Uses static type information

Syntax Comparison stm = Compound(stm head, stm next) Assign(stm head, stm next) Print(exp* args) ASDL CORBA s IDL interface stm stm { enum enum stm_tag { Compound_tag, Assign_tag, Print_tag}; attribute stm_tag tag; tag; } interface Compound : stm stm { attribute stm stm head; head; attribute stm stm next; next; } interface Assign : stm stm { attribute id id string; attribute exp exp exp; exp; } interface Print Print : stm stm { attribute sequence<exp> args; args; }

Syntax Comparison cont. SGML DTD <!ENTITY % stm stm "(Compound Assign Print)"> <!ELEMENT Compound - - (%stm,%stm)> <!ELEMENT Assign - - (%id,%exp)> <!ELEMENT Print - - (%exp*)> ANS.1 Stm ::= CHOICE { compound SEQUENCE {head Stm, next Stm}, assign SEQUENCE {head Stm, next Stm}, print SEQUENCE {args SEQUENCE OF Exp} }

ASDL Status asdlgen produces C, C++, Java, and SML data structures and picklers from ASDL descriptions Working on second generation browser Full public release with SML sources, binary distribution, and documentation in January Web demo and other information at htttp://www.cs.princeton.edu/zephyr/asdl

Modular descriptions Future Work Deal with name space management and separate compilation issues Separate Description from Implementation Use appropriate language specific implementations rather than concrete descriptions Encoding for FLINT, lcc s IR, and improved SUIF 2.0 encoding Support for Graphs (??)

National Compiler Infrastructure Stanford University Monica Lam et al. SUIF parallelizing compiler http://suif.stanford.edu/ Princeton University Andrew Appel Meta-thinking and Glue University of Virgina Jack Davidson Norman Ramsey Very Portable Optimizer (VPO) Back end generators (CSDL) http://www.cs.virginia.edu/zephyr/

Description vs. Implementation Description real = (int mantissa, int exp) Concrete Encoding typedef struct _real* real_ty; struct _real { int mantissa; int exp; } void pkl_write_real(...) { pkl_write_int(x->mantissa, s); pkl_write_int(x->exp, s); } Abstract Encoding typedef double real_ty; void pkl_write_real(...){?? }