CS301 Compiler Design and Implementation Handout # 18 Prof. Lyn Turbak February 19, 2003 Wellesley College

Similar documents
Interpre'ng and Compiling Intex SOLUTIONS

Interpre'ng and Compiling Intex

Problem Set 5 Due:??

Interpre'ng and Compiling Intex

Valex: Primitive Operators and Desugaring

Valex: Mul*ple Value Types, Condi*onals, Dynamic Type Checking and Desugaring

FOFL and FOBS: First-Order Functions

Bindex: Naming, Free Variables, and Environments

Bindex: Naming, Free Variables, and Environments SOLUTIONS

CS251 Programming Languages Handout # 29 Prof. Lyn Turbak March 7, 2007 Wellesley College

Hofl, a Higher-order Functional Language. 1 An Overview of Hofl. 2 Abstractions and Function Applications

PostFix. PostFix command semanccs (except exec) PostFix Syntax. A PostFix Interpreter in Racket. CS251 Programming Languages Fall 2017, Lyn Turbak

Valex: Mul*ple Value Types, Condi*onals, Dynamic Type Checking and Desugaring SOLUTIONS

CS251 Programming Languages Spring 2016, Lyn Turbak Department of Computer Science Wellesley College

Metaprogramming in SML: PostFix and S-expressions

Metaprogramming in SML: PostFix and S-expressions

Hofl: First-class Functions and Scoping

PostFix. PostFix command semanccs (except exec) PostFix Syntax. A PostFix Interpreter in Racket. CS251 Programming Languages Spring 2018, Lyn Turbak

Sum-of-Product Data Types

ASTs, Objective CAML, and Ocamlyacc

Sum-of-Product Data Types

List Processing in Ocaml

Problem Set 1 Due: 11:59pm Wednesday, February 7

Lecture 09: Data Abstraction ++ Parsing is the process of translating a sequence of characters (a string) into an abstract syntax tree.

An-Najah National University Faculty of Engineering Electrical Engineering Department Programmable Logic Controller. Chapter 11 Math instruction

CS251 Programming Languages Handout # 36 Prof. Lyn Turbak April 4, 2007 Wellesley College Revised April 14, Scoping in Hofl

CIS 194: Homework 3. Due Wednesday, February 11, Interpreters. Meet SImPL

6.821 Programming Languages Handout Fall MASSACHVSETTS INSTITVTE OF TECHNOLOGY Department of Electrical Engineering and Compvter Science

CSc 372. Comparative Programming Languages. 4 : Haskell Basics. Department of Computer Science University of Arizona

Lecture 12: Conditional Expressions and Local Binding

CSc 372 Comparative Programming Languages. 4 : Haskell Basics

Chapter Two MIPS Arithmetic

Problem Set CVO 103, Spring 2018

9/10/10. Arithmetic Operators. Today. Assigning floats to ints. Arithmetic Operators & Expressions. What do you think is the output?

6.184 Lecture 4. Interpretation. Tweaked by Ben Vandiver Compiled by Mike Phillips Original material by Eric Grimson

CS152: Programming Languages. Lecture 5 Pseudo-Denotational Semantics. Dan Grossman Spring 2011

CS251 Programming Languages Spring 2016, Lyn Turbak Department of Computer Science Wellesley College

CSc 520. Principles of Programming Languages 11: Haskell Basics

61A Lecture 26. Monday, October 31

CS251 Programming Languages Handout # 47 Prof. Lyn Turbak May 22, 2005 Wellesley College. Scheme

Lecture 2: Big-Step Semantics

Func+on applica+ons (calls, invoca+ons)

Lecture 4 Abstract syntax

CS153: Compilers Lecture 15: Local Optimization

COMP520 - GoLite Type Checking Specification

Homework 3 COSE212, Fall 2018

CS115 - Module 10 - General Trees

Intermediate Code Generation

Syntax and Grammars 1 / 21

Special Section: Building Your Own Compiler

MP 3 A Lexer for MiniJava

LECTURE 3. Compiler Phases

CSI32 Object-Oriented Programming

Sir Muhammad Naveed. Arslan Ahmed Shaad ( ) Muhammad Bilal ( )

Data Abstraction. An Abstraction for Inductive Data Types. Philip W. L. Fong.

The Typed Racket Guide

Course PJL. Arithmetic Operations

3. Java - Language Constructs I

Week 2: Console I/O and Operators Arithmetic Operators. Integer Division. Arithmetic Operators. Gaddis: Chapter 3 (2.14,3.1-6,3.9-10,5.

CIS 110: Introduction to Computer Programming

Module 2 - Part 2 DATA TYPES AND EXPRESSIONS 1/15/19 CSE 1321 MODULE 2 1

On a 64-bit CPU. Size/Range vary by CPU model and Word size.

Functions & First Class Function Values

Full file at C How to Program, 6/e Multiple Choice Test Bank

Metaprogramming assignment 3

1 Lexical Considerations

In this chapter you ll learn:

Jim Lambers ENERGY 211 / CME 211 Autumn Quarter Programming Project 4

Typed Racket: Racket with Static Types

CS 6353 Compiler Construction Project Assignments

CSE341: Programming Languages Lecture 17 Implementing Languages Including Closures. Dan Grossman Autumn 2018

Computer Programming, I. Laboratory Manual. Experiment #2. Elementary Programming

Introduction. Following are the types of operators: Unary requires a single operand Binary requires two operands Ternary requires three operands

CSE 341 Section 7. Ethan Shea Autumn Adapted from slides by Nicholas Shahan, Dan Grossman, Tam Dang, and Eric Mullen

CSE 341 Section 7. Eric Mullen Spring Adapted from slides by Nicholas Shahan, Dan Grossman, and Tam Dang

MP 3 A Lexer for MiniJava

These notes are intended exclusively for the personal usage of the students of CS352 at Cal Poly Pomona. Any other usage is prohibited without

Decaf Language Reference

LAB A Translating Data to Binary

CS 6353 Compiler Construction Project Assignments

The PCAT Programming Language Reference Manual

1 Introduction. 2 Definition of the IR. 2.1 Types. CS134b Lab #3 January 29, 2001 Due February 12, 2001

CMPT 379 Compilers. Parse trees

Homework 6. What To Turn In. Reading. Problems. Handout 15 CSCI 334: Spring, 2017

Meta-Programming with Modelica. Model Transformations

Recursion and Induction: Haskell; Primitive Data Types; Writing Function Definitions

Computer Science 21b: Structure and Interpretation of Computer Programs (Spring Term, 2004)

CS111: PROGRAMMING LANGUAGE II

Topic 4 Expressions and variables

CHAPTER 3 BASIC INSTRUCTION OF C++

L28: Advanced functional programming

Intermediate Representations

2.1. Chapter 2: Parts of a C++ Program. Parts of a C++ Program. Introduction to C++ Parts of a C++ Program

! A program is a set of instructions that the. ! It must be translated. ! Variable: portion of memory that stores a value. char

A simple syntax-directed

CEN 414 Java Programming

CVO103: Programming Languages. Lecture 5 Design and Implementation of PLs (1) Expressions

CS 11 Ocaml track: lecture 2

Wellesley College CS251 Programming Languages Spring, 2000 FINAL EXAM REVIEW PROBLEM SOLUTIONS

Basic operators, Arithmetic, Relational, Bitwise, Logical, Assignment, Conditional operators. JAVA Standard Edition

Transcription:

CS30 Compiler Design and Implementation Handout # 8 Prof. Lyn Turbak February 9, 003 Wellesley College Intex: An Introduction to Program Manipulation Introduction An interpreter is a program written in one language (the target language or implementation language) that executes a program written another language (the source language). The source and target languages are typically different. For example, we might write an interpreter for Java in Scheme. We would call such an interpreter a Java interpreter, naming it after the source language, not the target language. It is possible to write an interpreter for a language in itself; such an interpreter is known as a meta-circular interpreter. For example, chapter 4 of Abelson and Sussman s Structure and Interpretation of Computer Programs presents a meta-circular interpreter for Scheme. (Of course, a meta-circular interpreter involves the boot-strapping issues we have considered earlier in our high-level discussions of interpretation and compilation.) One of the best ways to gain insight into programming language design and implementation is read, write, and modify interpreters and translators. We will spend a significant amount of time in this course studying interpreters and translators. We begin with an intepreter for an extremely simple toy language, an integer expression language we ll call Intex. To understand various programming language features, we will add them to Intex to get more complex languages. Eventually, we will build up to more realistic source languages. Abstract Syntax for Intex An Intex program specifies a function that takes any number of integer arguments and returns an integer result. Abstractly, an Intex program is a tree constructed out of the following kinds of nodes: A program node is a node with two components: () A non-negative integer numargs specifying the number of arguments to the program and () A body expression. An expression node is one of the following: A literal specifying an integer constant, known as its ; An argument reference specifying an integer index that references a program argument by position; A binary application specifying a binary ope (known as its ) and two operand expressions (known as rand and rand). An binary ope node is one of the following five opes: addition, subtraction, multiplication, division, or remainder. These nodes can be be depicted graphically and arranged into trees. For example, Fig. depicts the trees denoting three sample Intex programs: () a program that squares its single input, () a program that averages its two inputs, and (3) a program that converts its single input, a temperature measured in Fahrenheit, to a temperature measured in Celsius. Such trees are known as abstract

syntax trees (ASTs), because they specify the abstract logical structure of a program without any hint of how the program might be written down in concrete syntax. We leave discussion of the concrete syntax of Intex programs for another day. Pgm Pgm Pgm numargs body numargs body numargs body Mul randrand Arg Arg Div rand rand Lit Div rand rand Lit Add rand rand Arg Arg Mul rand rand Lit 9 Sub randrand Arg Lit 5 Squaring program Averaging program 3 Fahrenheit-to-Celsius converter Figure : Abstract syntax trees for three sample Intex programs. It is easy to express Intex ASTs using Ocaml datatypes. Fig. introduces three types to express the three different kinds of Intex AST nodes:. The pgm type has a single constructor, Pgm, with two components (numargs and body);. The exp type is a recursive type with three constructors: Lit (for integer literals), Arg (for argument references), and (for binary applications). 3. The binop type has five constructors, one for each possible ope. Using these datatypes, the three sample trees depicted in Fig. can be express in Ocaml as shown in Fig. 3. type pgm = Pgm of int * exp (* numargs, body *) and exp = Lit of int (* *) Arg of int (* index *) of binop * exp * exp (*, rand, rand *) and binop = Add Sub Mul Div Rem (* Arithmetic ops *) Figure : Ocaml datatypes for Intex abstract syntax. 3 Manipulating Intex Programs Intex programs and expressions are just trees, and can be easily manipulated as such. Here we study three different programs that manipulate Intex ASTs.

let sqr = Pgm(, (Mul, Arg, Arg )) let avg = Pgm(, (Div, (Add, Arg, Arg ), Lit )) let fc = Pgm(, (Div, (Mul, (Sub, Arg, Lit 3), Lit 5), Lit 9)) Figure 3: Sample programs expressed using Ocaml datatypes. 3. Program Size Define the size of an Intex program as the number of boxed nodes in the graphical depiction of its AST. Then the size of an Intex program can be determined as follows: let rec sizepgm (Pgm(_,body)) = + (expsize body) and sizeexp e = match e with Lit i -> Arg index -> (_,r,r) -> + (sizeexp r) + (sizeexp r) (* add one for and one for node *) For example: # open Intex;; # sizepgm sqr;; - : int = 5 # sizepgm avg;; - : int = 8 # sizepgm fc;; - : int = The tree manipulation performed by sizeexp is an instance of a general fold ope on Intex expressions that captures the essence of divide, conquer, and glue on these expressions: let rec fold litfun argfun appfun exp = match exp with Lit i -> litfun i Arg index -> argfun index (op,rand,rand) -> appfun op (fold litfun argfun appfun rand) (fold litfun argfun appfun rand) Using fold, we can re-express sizeexp as: 3

(* fold-based version *) let sizeexp e = fold (fun _ -> ) (fun _ -> ) (fun _ n n -> + n + n) e 3. Static Argument Checking We can statically (i.e., without running the program) check if all the argument indices are valid by a simple tree walk that determines the minimum and maximum argument indices: let rec argcheck (Pgm(n,body)) = let (lo,hi) = argrange body in (lo >= ) && (hi <= n) and argrange e = match e with Lit i -> (max_int, min_int) Arg index -> (index, index) (_,r,r) -> let (lo, hi) = argrange r and (lo, hi) = argrange r in (min lo lo, max hi hi) Again, argrange can be expressed in terms of fold: (* fold-based version *) let argrange e = fold (fun _ -> (max_int, min_int)) (fun index -> (index, index)) (fun _ (lo,hi) (lo,hi) -> (min lo lo, max hi hi)) e 3.3 Interpretation An interpreter for Intex is presented in Fig. 4. It determines the integer of an Intex program given a list of integer arguments. In certain situations, it is necessary to indicate an error; the EvalError exception is used for this. Even the interpreter can be expressed in terms of fold! 4

module IntexInterp = struct open Intex exception EvalError of string let rec run (Pgm(n,body)) ints = let len = List.length ints in if n = List.length ints then eval body ints raise (EvalError ("Program expected " ^ (string_of_int n) ^ " arguments but got " ^ (string_of_int len))) end (* direct version *) and eval exp args = match exp with Lit i -> i Arg index -> if (index <= 0) (index > List.length args) then raise (EvalError("Illegal arg index: " ^ (string_of_int index))) List.nth args (index - ) (op,rand,rand) -> primapply op (eval rand args) (eval rand args) and primapply binop x y = match binop with Add -> x + y Sub -> x - y Mul -> x * y Div -> if y = 0 then raise (EvalError ("Division by 0: " ^ (string_of_int x))) x / y Rem -> if y = 0 then raise (EvalError ("Remainder by 0: " ^ (string_of_int x))) x mod y Figure 4: An interpreter for Intex. 5

(* fold-based version *) let eval exp = fold (fun i -> (fun args -> i)) (fun index -> (fun args -> if (index <= 0) (index > List.length args) then raise (EvalError("Illegal arg index: " ^ (string_of_int index))) List.nth args (index - ))) (fun op fun fun -> (fun args -> primapply op (fun args) (fun args))) exp 4 S-Expression Form for Intex Programs It is easy to represent Intex programs and expressions in an s-expression format. Fig. 5 shows how to convert between the two. 6

and stringexp s = sexpexp (stringsexp s) and stringpgm s = sexppgm (stringsexp s) (************************************************************ Unparsing to S-Expressions ************************************************************) let rec pgmsexp p = match p with Pgm (n, e) -> Seq ([Sym "intex"; Int n; expsexp e]) and expsexp e = match e with Lit i -> Int i Arg i -> Seq [Sym "$"; Int i] (, rand, rand) -> Seq ([Sym (primopstring ); expsexp rand; expsexp rand]) and primopstring p = match p with Add -> "+" Sub -> "-" Mul -> "*" Div -> "/" Rem -> "%" and expstring s = sexpstring (expsexp s) and pgmstring s = sexpstring (pgmsexp s) (************************************************************ Parsing from S-Expressions ************************************************************) let rec sexppgm sexp = match sexp with Sexp.Seq [Sexp.Sym("intex"); Sexp.Int(n); body] -> Pgm(n, sexpexp body) _ -> raise (SyntaxError ("invalid Intex program: " ^ (sexpstring sexp))) and sexpexp sexp = match sexp with Int i -> Lit i Seq([Sym "$"; Int i]) -> Arg i Seq([Sym p; rand; rand]) -> (stringprimop p, sexpexp rand, sexpexp rand) _ -> raise (SyntaxError ("invalid Intex expression: " ^ (sexpstring sexp))) and stringprimop s = match s with "+" -> Add "-" -> Sub "*" -> Mul "/" -> Div "%" -> Rem _ -> raise (SyntaxError ("invalid Intex 7 primop: " ^ s))