Elements of Programming Languages. Lecture 1: Abstract syntax

James Cheney, University of Edinburgh
September 21, 2017

Today we will introduce some basic tools used throughout the course:
- Concrete vs. abstract syntax
- Abstract syntax trees
- Induction over expressions

Concrete vs. abstract syntax

We will start out with a very simple (almost trivial) programming language called L_Arith to illustrate these concepts: expressions built from integers, + and *. Examples:

  1 + 2        ---> 3
  1 + 2 * 3    ---> 7
  (1 + 2) * 3  ---> 9

Concrete syntax is the actual syntax of a programming language. It is specified using context-free grammars (or generalizations of them), and is used in the compiler/interpreter front-end to decide how to interpret strings as programs.

Abstract syntax captures the essential constructs of a programming language. It is specified using so-called Backus-Naur Form (BNF) grammars, and is used in specifications and implementations to describe the abstract syntax trees of a language.
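The arrow examples above can be checked mechanically. As a minimal sketch (not part of the slides; evaluation proper is covered later in the course), assuming a hypothetical encoding of ASTs as nested Python tuples:

```python
# Sketch: evaluate L_Arith ASTs encoded as nested tuples (an assumed
# encoding): ("Num", n), ("Plus", e1, e2), ("Times", e1, e2).
def eval_expr(e):
    if e[0] == "Num":
        return e[1]
    if e[0] == "Plus":
        return eval_expr(e[1]) + eval_expr(e[2])
    return eval_expr(e[1]) * eval_expr(e[2])  # "Times"

# The three examples above: 1 + 2, 1 + (2 * 3), (1 + 2) * 3
print(eval_expr(("Plus", ("Num", 1), ("Num", 2))))                         # 3
print(eval_expr(("Plus", ("Num", 1), ("Times", ("Num", 2), ("Num", 3)))))  # 7
print(eval_expr(("Times", ("Plus", ("Num", 1), ("Num", 2)), ("Num", 3))))  # 9
```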

CFG vs. BNF

A context-free grammar giving concrete syntax for expressions:

  E ::= E PLUS F | F
  F ::= F TIMES F | NUM | LPAREN E RPAREN

This needs to handle precedence, parentheses, etc. Tokenization (+ becoming PLUS, and so on), comments and whitespace are usually handled by a separate stage.

A BNF grammar giving abstract syntax for expressions:

  Expr ∋ e ::= e1 + e2 | e1 * e2 | n ∈ N

This says there are three kinds of expressions:
- Additions e1 + e2, where two expressions are combined with the + operator
- Multiplications e1 * e2, where two expressions are combined with the * operator
- Numbers n ∈ N

Much like CFG rules, we can derive more complex expressions:

  e --> e1 + e2 --> 3 + e2 --> 3 + (e3 * e4) --> ...

BNF conventions

We will usually use BNF-style rules to define abstract syntax trees, and assume that concrete syntax issues such as precedence, parentheses and whitespace are handled elsewhere. Conventions:
- The subscripts on occurrences of e on the right-hand side don't affect the meaning; they are just for readability.
- We will freely use parentheses in abstract syntax notation to disambiguate, e.g. (1 + 2) * 3 vs. 1 + (2 * 3).

Abstract Syntax Trees (ASTs)

We view a BNF grammar as defining a collection of abstract syntax trees, for example the trees for 1 + 2, (1 + 2) * 3 and 1 + (2 * 3). [Tree diagrams in the original slides.] These can be represented in a program as trees, or in other ways (which we will cover in due course).
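To make the CFG side concrete, here is a minimal front-end sketch (my own illustration; the names tokenize and parse_expr and the tuple AST encoding are assumptions, not the course's). The left-recursive rule E ::= E PLUS F is rewritten as E ::= F (PLUS F)* so that recursive descent works:

```python
import re

# Sketch of a front-end for the concrete grammar above.
def tokenize(s):
    # NUM, PLUS, TIMES, LPAREN, RPAREN; whitespace is skipped.
    return re.findall(r"\d+|[+*()]", s)

def parse_expr(s):
    tokens = tokenize(s)

    def parse_e(i):  # E ::= F (PLUS F)*
        left, i = parse_f(i)
        while i < len(tokens) and tokens[i] == "+":
            right, i = parse_f(i + 1)
            left = ("Plus", left, right)
        return left, i

    def parse_f(i):  # F ::= atom (TIMES atom)*
        left, i = parse_atom(i)
        while i < len(tokens) and tokens[i] == "*":
            right, i = parse_atom(i + 1)
            left = ("Times", left, right)
        return left, i

    def parse_atom(i):  # NUM | LPAREN E RPAREN
        if tokens[i] == "(":
            e, i = parse_e(i + 1)
            return e, i + 1  # skip ")"
        return ("Num", int(tokens[i])), i + 1

    ast, _ = parse_e(0)
    return ast

# Precedence and parentheses are resolved here, in the front-end,
# so the abstract syntax needs neither:
print(parse_expr("1 + 2 * 3"))    # ('Plus', ('Num', 1), ('Times', ('Num', 2), ('Num', 3)))
print(parse_expr("(1 + 2) * 3"))  # ('Times', ('Plus', ('Num', 1), ('Num', 2)), ('Num', 3))
```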

Languages for examples

We will use several languages for examples throughout the course:
- Java: typed, object-oriented
- Python: untyped, object-oriented with some functional features
- Haskell: typed, functional
- Scala: typed, combines functional and OO features
- Sometimes others, to discuss specific features

You do not need to already know all these languages!

ASTs in Java

In Java, ASTs can be defined using a class hierarchy:

  abstract class Expr { }

  class Num extends Expr {
      public int n;
      Num(int _n) { n = _n; }
  }

  class Plus extends Expr {
      public Expr e1;
      public Expr e2;
      Plus(Expr _e1, Expr _e2) { e1 = _e1; e2 = _e2; }
  }

  class Times extends Expr { ... }  // similar

Traverse ASTs by adding a method to each class:

  abstract class Expr {
      abstract public int size();
  }

  class Num extends Expr {
      ...
      public int size() { return 1; }
  }

  class Plus extends Expr {
      ...
      public int size() { return e1.size() + e2.size() + 1; }
  }

  class Times extends Expr { ... }  // similar

ASTs in Python

Python is similar, but shorter (no types):

  class Expr:
      pass  # "abstract"

  class Num(Expr):
      def __init__(self, n):
          self.n = n
      def size(self):
          return 1

  class Plus(Expr):
      def __init__(self, e1, e2):
          self.e1 = e1
          self.e2 = e2
      def size(self):
          return self.e1.size() + self.e2.size() + 1

  class Times(Expr):
      ...  # similar

ASTs in Haskell

In Haskell, ASTs are easily defined as datatypes:

  data Expr = Num Integer
            | Plus Expr Expr
            | Times Expr Expr

Likewise one can easily write functions to traverse them:

  size :: Expr -> Integer
  size (Num n)       = 1
  size (Plus e1 e2)  = size e1 + size e2 + 1
  size (Times e1 e2) = size e1 + size e2 + 1

ASTs in Scala

In Scala, ASTs can be defined conveniently using case classes:

  abstract class Expr
  case class Num(n: Integer) extends Expr
  case class Plus(e1: Expr, e2: Expr) extends Expr
  case class Times(e1: Expr, e2: Expr) extends Expr

Again one can easily write functions to traverse them using pattern matching:

  def size(e: Expr): Int = e match {
    case Num(n)        => 1
    case Plus(e1, e2)  => size(e1) + size(e2) + 1
    case Times(e1, e2) => size(e1) + size(e2) + 1
  }

Creating ASTs

  Java:    new Plus(new Num(2), new Num(2))
  Python:  Plus(Num(2), Num(2))
  Haskell: Plus (Num 2) (Num 2)
  Scala:   new Plus(new Num(2), new Num(2))
           Plus(Num(2), Num(2))   // the "new" is optional for case classes
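Putting the Python pieces together: a self-contained, runnable version of the class hierarchy, with the "Creating ASTs" example traversed by size (the second, larger example tree is my own addition):

```python
class Expr:
    pass  # "abstract" base class, as on the slide

class Num(Expr):
    def __init__(self, n):
        self.n = n
    def size(self):
        return 1

class Plus(Expr):
    def __init__(self, e1, e2):
        self.e1, self.e2 = e1, e2
    def size(self):
        return self.e1.size() + self.e2.size() + 1

class Times(Expr):
    def __init__(self, e1, e2):
        self.e1, self.e2 = e1, e2
    def size(self):
        return self.e1.size() + self.e2.size() + 1

# Creating an AST and then traversing it:
print(Plus(Num(2), Num(2)).size())                 # 3
print(Plus(Num(1), Times(Num(2), Num(3))).size())  # 5
```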

Precedence, Parentheses and Parsimony

Infix notation and operator precedence rules are convenient for programmers (they look like familiar math) but complicate the language front-end. Some languages, notably LISP/Scheme/Racket, eschew infix notation. All programs are essentially so-called S-expressions:

  s ::= a | (a s1 ... sn)

so their concrete syntax is very close to abstract syntax. For example:

  1 + 2        ---> (+ 1 2)
  1 + 2 * 3    ---> (+ 1 (* 2 3))
  (1 + 2) * 3  ---> (* (+ 1 2) 3)

The three most important reasoning techniques

The three most important reasoning techniques for programming languages are:
- (Mathematical) induction (over N)
- (Structural) induction (over ASTs)
- (Rule) induction (over derivations)

We will briefly review the first and present structural induction. We will cover rule induction later.

Mathematical induction

Recall the principle of mathematical induction. Given a property P of natural numbers, if:
- P(0) holds
- for any n ∈ N, if P(n) holds then P(n + 1) also holds

then P(n) holds for all n ∈ N.

Induction over expressions

A similar principle holds for expressions. Given a property P of expressions, if:
- P(n) holds for every number n ∈ N
- for any expressions e1, e2, if P(e1) and P(e2) hold then P(e1 + e2) also holds
- for any expressions e1, e2, if P(e1) and P(e2) hold then P(e1 * e2) also holds

then P(e) holds for all expressions e. Note that we are performing induction over abstract syntax trees, not numbers!
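The closeness of S-expressions to abstract syntax can be made literal. A minimal sketch (my illustration: the tuple AST encoding and the name to_sexp are assumptions) that prints an AST in S-expression form, reproducing the examples above:

```python
def to_sexp(e):
    # ("Num", n) prints as the literal; operators become prefix applications.
    if e[0] == "Num":
        return str(e[1])
    op = "+" if e[0] == "Plus" else "*"
    return "(%s %s %s)" % (op, to_sexp(e[1]), to_sexp(e[2]))

# 1 + 2 * 3 and (1 + 2) * 3, as in the examples above:
print(to_sexp(("Plus", ("Num", 1), ("Times", ("Num", 2), ("Num", 3)))))  # (+ 1 (* 2 3))
print(to_sexp(("Times", ("Plus", ("Num", 1), ("Num", 2)), ("Num", 3))))  # (* (+ 1 2) 3)
```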

Proof of expression induction principle

Define the size of an expression in the obvious way:

  size(n)       = 1
  size(e1 + e2) = size(e1) + size(e2) + 1
  size(e1 * e2) = size(e1) + size(e2) + 1

Given a property P satisfying the assumptions of expression induction, we prove the property

  Q(n) = "for all e with size(e) < n, we have P(e)"

Since any expression e has a finite size, P(e) then holds for any expression.

Proof. We prove that Q(n) holds for all n by induction on n:
- The base case n = 0 is vacuous.
- For n + 1, assume Q(n) holds and consider any e with size(e) < n + 1. There are three cases:
  - If e = m ∈ N, then P(e) holds by part 1 of the expression induction principle.
  - If e = e1 + e2, then size(e1) < size(e) ≤ n, and similarly size(e2) < size(e) ≤ n. So, by induction, P(e1) and P(e2) hold, and by part 2 of the expression induction principle P(e) holds.
  - If e = e1 * e2, the same reasoning applies.

Summary

We covered:
- Concrete vs. abstract syntax
- Abstract syntax trees
- Abstract syntax of L_Arith in several languages
- Structural induction over syntax trees

This might seem like a lot to absorb, but don't worry! We will revisit and reinforce these concepts throughout the course.

Next time:
- Evaluation
- A simple interpreter
- Operational semantics rules
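Returning to the proof above: the inequality that drives it, that each immediate subexpression is strictly smaller than the whole, is easy to spot-check. A sketch over an assumed tuple encoding of ASTs (not the course's own representation):

```python
# size over tuple-encoded ASTs: ("Num", n), ("Plus", e1, e2), ("Times", e1, e2).
def size(e):
    if e[0] == "Num":
        return 1
    return size(e[1]) + size(e[2]) + 1  # Plus or Times

e = ("Plus", ("Num", 1), ("Times", ("Num", 2), ("Num", 3)))
print(size(e))  # 5
# Each immediate subexpression is strictly smaller, as the proof requires:
assert size(e[1]) < size(e) and size(e[2]) < size(e)
```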