Lexical and Syntax Analysis. Abstract Syntax

What is Parsing? A parser translates a string of characters, which is easy for humans to write, into a data structure, which is easy for programs to process. A parser also checks that the input string is well-formed, and if not, rejects it.

Example 1. A parser for CSV (Comma Separated Value) input turns the records "Charlton, 49", "Lineker, 48" and "Beckham, 17" into an array of pairs, each holding a name and a number: (Charlton, 49), (Lineker, 48), (Beckham, 17).

Concrete and Abstract Syntax. The concrete syntax is a set of rules that describe valid inputs to the parser. The abstract syntax is a set of rules that describe valid outputs from the parser. The data structure produced by a parser is commonly termed the abstract syntax tree.

Concrete and Abstract Syntax. The parser's input, a string of characters, conforms to the concrete syntax of the language. Its output, a data structure, conforms to the abstract syntax of the language.

Concrete syntax. The concrete syntax is usually defined by regular expressions and context-free grammars. Example:
    name = [a-zA-Z]+
    goals = [0-9]+
    squad = ( name, goals )*

Abstract syntax. The abstract syntax is usually specified as a data type in the programming language being used, in our case C. Example:
    struct player { char* name; int goals; };
    struct squad { struct player* players; int size; };
An abstract syntax tree is a value of this type.
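
To connect the concrete and abstract syntax of Example 1, here is a minimal sketch that parses one record such as "Charlton, 49" into a struct player. The function name parse_player, the 31-character name limit and the use of sscanf are assumptions made for illustration, not part of the lecture. Note that %d is looser than goals = [0-9]+ (it also accepts a sign), so the sketch rejects negative values separately.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct player { char* name; int goals; };

/* Hypothetical helper: parse one record such as "Charlton, 49".
   Returns 1 and fills *out on success, 0 if the text is malformed
   or memory cannot be allocated. */
int parse_player(const char* text, struct player* out) {
  char name[32];
  int goals;
  assert(text != NULL && out != NULL);
  /* %31[a-zA-Z] reads up to 31 letters; %d reads the number of goals. */
  if (sscanf(text, " %31[a-zA-Z] , %d", name, &goals) != 2 || goals < 0)
    return 0;                       /* ill-formed input is rejected */
  out->name = malloc(strlen(name) + 1);
  if (out->name == NULL) return 0;
  strcpy(out->name, name);
  out->goals = goals;
  return 1;
}
```

The caller owns out->name after a successful call and should release it with free.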

This lecture. How to define abstract syntax and how to construct abstract syntax trees in the programming language C. The lecture also revisits some important C programming techniques.

POINTERS Pointer: a variable that holds the address of a core storage location. [The Free Dictionary]

Pointers. Declare a variable x of type int and initialise it to the value 10.
    int x = 10;
Declare a variable p of type int* (read: int pointer).
    int* p;
Make p point to x (or assign the address of x to p).
    p = &x;
Now p points to x, which holds 10.

Pointers. Print the value pointed to by p (here, the value of x).
    printf("%i\n", *p);
Assign 20 to the location pointed to by p, so that x now holds 20.
    *p = 20;
The value NULL means "points to nothing".
    p = NULL;
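
The pointer steps can be replayed in one small function. This is only a sketch: the name pointer_demo is made up for illustration, and the assert calls are there to state what holds at each step.

```c
#include <assert.h>
#include <stddef.h>

/* Replays the pointer steps and returns the final value of x. */
int pointer_demo(void) {
  int x = 10;        /* x holds 10 */
  int* p = &x;       /* p points to x */
  assert(*p == 10);  /* reading through p gives the value of x */
  *p = 20;           /* writing through p changes x */
  p = NULL;          /* p now points to nothing */
  assert(p == NULL);
  return x;
}
```

Calling pointer_demo() returns 20, because the assignment through p changed x itself.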

DYNAMIC ALLOCATION Dynamic Allocation: the allocation of memory storage for use in a computer program. [The Free Dictionary]

Array allocation. Declare a variable p of type int*.
    int* p;
Allocate memory for an array of 4 int values and let p point to it.
    p = malloc(4 * sizeof(int));

Array indexing. Assign 10 to the location pointed to by p.
    *p = 10;
Assign 20 to the first element of the array pointed to by p (the same location as *p).
    p[0] = 20;
Copy the first element of the array to the third element.
    p[2] = p[0];
The first and third elements now both hold 20.

Array deallocation. When finished with an array allocated by malloc, call free to release the space; otherwise your program may run out of memory.
    free(p);
The space is released, so it can be reused by future calls to malloc.
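
Allocation, indexing and deallocation fit together as in the following sketch. The function name array_demo and the choice to return -1 when malloc fails are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Replays the array steps and returns the third element,
   or -1 if the allocation fails (an assumption of this sketch). */
int array_demo(void) {
  int* p = malloc(4 * sizeof(int));  /* room for 4 int values */
  int third;
  if (p == NULL) return -1;          /* malloc can fail */
  *p = 10;                           /* same location as p[0] */
  assert(p[0] == 10);
  p[0] = 20;
  p[2] = p[0];                       /* copy first element to third */
  third = p[2];
  free(p);                           /* p must not be used after this */
  return third;
}
```

The value is copied into third before free is called, since reading p[2] after free would be undefined behaviour.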

STRINGS String: a series of consecutive characters. [The Free Dictionary]

Strings. Declare a variable s, initialised to point to the string "hi", which is stored as the characters h and i followed by a terminating \0.
    char* s = "hi";
Let s point to the next character.
    s = s + 1;
And let s point to the previous character again.
    s = s - 1;
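
Stepping a pointer through a string until the terminating '\0' is a common idiom. A minimal sketch, using a hypothetical helper count_char that is not part of the lecture:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how often c occurs in the string s,
   moving the pointer forward one character at a time. */
int count_char(const char* s, char c) {
  int n = 0;
  assert(s != NULL);
  while (*s != '\0') {   /* stop at the terminating character */
    if (*s == c) n++;
    s = s + 1;           /* let s point to the next character */
  }
  return n;
}
```

Because s is passed by value, the caller's pointer is unchanged; only the local copy moves.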

Exercise 1. What is printed by the following program?
    #include <stdio.h>
    int f(char* s) {
      int i = 0;
      while (s[i] != '\0') i++;
      return i;
    }
    int main(void) {
      char* x = "Hello";
      printf("%i\n", f(x));
      return 0;
    }

USER-DEFINED TYPES Type: the general character or structure held in common by a number of things. [The Free Dictionary]

Type definitions. A typedef declaration allows a new name to be given to an existing type.
    typedef int Integer;
    typedef char* String;
Here int and char* are the existing types, and Integer and String are the new names. Example use:
    String s;    /* Declare a string s */
    Integer i;   /* and an integer i */
    i = 0;
    s = "hello";

Enumerations. An enum declaration introduces a new type whose values are members of a given set.
    enum colour {RED, GREEN, BLUE};
Here enum colour is the new type, and RED, GREEN and BLUE are its possible values. Example use:
    enum colour c;
    c = RED;
    if (c == RED) printf("Red\n");
Give it a shorter name:
    typedef enum colour Colour;
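
Enumerations pair naturally with switch statements. The sketch below maps each Colour to a string; the function name colour_name is an assumption made for illustration.

```c
#include <assert.h>

enum colour { RED, GREEN, BLUE };
typedef enum colour Colour;

/* Hypothetical helper: return a printable name for a colour. */
const char* colour_name(Colour c) {
  switch (c) {
    case RED:   return "Red";
    case GREEN: return "Green";
    case BLUE:  return "Blue";
  }
  assert(0 && "unknown colour");  /* only reached for an invalid value */
  return "?";
}
```

Enumeration constants are integers numbered from 0 unless stated otherwise, so here RED is 0, GREEN is 1 and BLUE is 2.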

Structures. A struct declaration introduces a new type that is a conjunction of one or more existing types.
    struct rectangle { float width; float height; };
The new type struct rectangle holds a width and a height. Example use:
    struct rectangle r;
    r.width = 10;
    r.height = 20;
A circle:
    struct circle { float radius; };

Unions. A union declaration introduces a new type that is a disjunction of one or more existing types.
    union shape { struct circle circ; struct rectangle rect; };
The new type union shape holds a circle or a rectangle. Example use:
    union shape s;
    s.circ.radius = 10;

Tagged unions. Often a tag is used to denote the active member of a union. Another definition of shape:
    struct shape {
      enum { CIRCLE, RECTANGLE } tag;
      union { struct circle circ; struct rectangle rect; };
    };
    typedef struct shape Shape;

Tagged unions. Example: s is a circle and t is a rectangle, and both are of type Shape.
    Shape s, t;
    s.tag = CIRCLE; s.circ.radius = 10;
    t.tag = RECTANGLE; t.rect.width = 5; t.rect.height = 15;

Tagged unions. It is often convenient to define a constructor function for each member of the tagged union.
    Shape mkcircle(float r) {
      Shape s;
      s.tag = CIRCLE;
      s.circ.radius = r;
      return s;
    }
    Shape mkrectangle(float w, float h) {
      Shape s;
      s.tag = RECTANGLE;
      s.rect.width = w;
      s.rect.height = h;
      return s;
    }

Tagged unions. Example revisited: s is a circle and t is a rectangle, and both are of type Shape.
    Shape s, t;
    s = mkcircle(10);
    t = mkrectangle(5, 15);

Tagged unions. Example: compute the area of any given shape s.
    float area(Shape s) {
      if (s.tag == CIRCLE) {
        float r = s.circ.radius;
        return (3.14 * r * r);
      }
      if (s.tag == RECTANGLE) {
        return (s.rect.width * s.rect.height);
      }
      return 0;  /* not reached for a valid tag */
    }
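
Put together, the tagged union, its constructors and the area function form a complete unit. The sketch below uses a switch on the tag instead of two if statements, which is a stylistic choice of this sketch and not the lecture's version. It also relies on the anonymous union inside struct shape, which needs C11 or a compiler extension.

```c
#include <assert.h>

struct circle { float radius; };
struct rectangle { float width; float height; };

struct shape {
  enum { CIRCLE, RECTANGLE } tag;  /* says which member is active */
  union {                          /* anonymous union: C11 or extension */
    struct circle circ;
    struct rectangle rect;
  };
};
typedef struct shape Shape;

Shape mkcircle(float r) {
  Shape s;
  s.tag = CIRCLE;
  s.circ.radius = r;
  return s;
}

Shape mkrectangle(float w, float h) {
  Shape s;
  s.tag = RECTANGLE;
  s.rect.width = w;
  s.rect.height = h;
  return s;
}

/* Dispatch on the tag, then read only the active member. */
float area(Shape s) {
  switch (s.tag) {
    case CIRCLE:    return 3.14f * s.circ.radius * s.circ.radius;
    case RECTANGLE: return s.rect.width * s.rect.height;
  }
  assert(0 && "unknown tag");  /* only reached for an invalid tag */
  return 0.0f;
}
```

Reading a member other than the one named by the tag gives a meaningless value, which is why every function over Shape checks the tag first.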

Exercise 2. Define a function to compute the perimeter of any given shape s.
    float perim(Shape s) {... }

Recursive structures. A value of type struct t may contain a value of type struct t*.
    struct list { int head; struct list* tail; };
    typedef struct list List;
Suppose p is a value of type List*. Then (*p).head can be written p->head, and (*(*p).tail).head can be written p->tail->head.

Recursive structures. Example: inserting an item onto the front of a linked list.
    List* insert(List* xs, int x) {
      List* ys = malloc(sizeof(List));
      ys->head = x;
      ys->tail = xs;
      return ys;
    }
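
A short sketch of the list in use follows. The helpers sum and free_list are hypothetical additions for illustration, and the assert after malloc stands in for real error handling.

```c
#include <assert.h>
#include <stdlib.h>

struct list { int head; struct list* tail; };
typedef struct list List;

/* Insert x onto the front of xs; NULL represents the empty list. */
List* insert(List* xs, int x) {
  List* ys = malloc(sizeof(List));
  assert(ys != NULL);  /* sketch only: real code should handle failure */
  ys->head = x;
  ys->tail = xs;
  return ys;
}

/* Hypothetical helper: add up all items by following tail pointers. */
int sum(List* xs) {
  int total = 0;
  for (; xs != NULL; xs = xs->tail) total += xs->head;
  return total;
}

/* Hypothetical helper: release every node of the list. */
void free_list(List* xs) {
  while (xs != NULL) {
    List* next = xs->tail;  /* save the tail before freeing the node */
    free(xs);
    xs = next;
  }
}
```

For example, insert(insert(insert(NULL, 3), 2), 1) builds the list 1, 2, 3, because each call puts its item at the front.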

Exercise 3. Define a function to compute the length of a given list xs.
    int length(List* xs) {... }

CASE STUDY A simplifier for arithmetic expressions.

Concrete syntax. Here is a concrete syntax for arithmetic expressions.
    v = [a-z]+
    n = [0-9]+
    e → v | n | e + e | e * e | ( e )
Example expression: x * y + (foo * 10)

Simplification. Consider the algebraic law: ∀x. x * 1 = x. This law can be used to simplify expressions by using it as a rewrite rule from left to right. Example simplification: x * (y * 1) → x * y

Problem.
1. Define an abstract syntax, in C, for arithmetic expressions.
2. Define constructor functions so that we can build abstract syntax trees representing expressions.
3. Implement the simplification rule as a C function over abstract syntax trees.

1. Abstract syntax.
    typedef enum { ADD, MUL } Op;
    struct expr {
      enum { VAR, NUM, APP } tag;
      union {
        char* var;
        int num;
        struct { struct expr* e1; Op op; struct expr* e2; } app;
      };
    };
    typedef struct expr Expr;
An expression is a variable, or a number, or an application made of a left expr, an op and a right expr.

2. Constructor functions.
    Expr* mkvar(char* v) {
      Expr* e = malloc(sizeof(Expr));
      e->tag = VAR; e->var = v;
      return e;
    }
    Expr* mknum(int n) {
      Expr* e = malloc(sizeof(Expr));
      e->tag = NUM; e->num = n;
      return e;
    }
    Expr* mkapp(Expr* e1, Op op, Expr* e2) {
      Expr* e = malloc(sizeof(Expr));
      e->tag = APP; e->app.op = op;
      e->app.e1 = e1; e->app.e2 = e2;
      return e;
    }

2. Abstract syntax trees. An abstract syntax tree that represents the expression x + (y * 2) can be constructed by the following C expression:
    mkapp(mkvar("x"), ADD, mkapp(mkvar("y"), MUL, mknum(2)))
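
The abstract syntax and the constructors can be combined into one compilable unit, sketched below. The helpers new_node and example are hypothetical names introduced here; new_node uses assert in place of real handling of a failed malloc, and the anonymous union needs C11 or a compiler extension.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { ADD, MUL } Op;

struct expr {
  enum { VAR, NUM, APP } tag;
  union {                       /* anonymous union: C11 or extension */
    char* var;
    int num;
    struct { struct expr* e1; Op op; struct expr* e2; } app;
  };
};
typedef struct expr Expr;

/* Hypothetical helper: allocate one tree node. */
static Expr* new_node(void) {
  Expr* e = malloc(sizeof(Expr));
  assert(e != NULL);  /* sketch only: real code should handle failure */
  return e;
}

Expr* mkvar(char* v) {
  Expr* e = new_node();
  e->tag = VAR; e->var = v;
  return e;
}

Expr* mknum(int n) {
  Expr* e = new_node();
  e->tag = NUM; e->num = n;
  return e;
}

Expr* mkapp(Expr* e1, Op op, Expr* e2) {
  Expr* e = new_node();
  e->tag = APP; e->app.op = op;
  e->app.e1 = e1; e->app.e2 = e2;
  return e;
}

/* Builds the tree for x + (y * 2). */
Expr* example(void) {
  return mkapp(mkvar("x"), ADD, mkapp(mkvar("y"), MUL, mknum(2)));
}
```

The root of the resulting tree is an APP node with operator ADD, whose right child is itself an APP node with operator MUL. The parentheses of the concrete syntax do not appear in the tree; the nesting of nodes records the grouping.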

3. Simplification. The rewrite rule x * 1 → x is implemented by
    void simplify(Expr* e) {
      if (e->tag == APP && e->app.op == MUL &&
          e->app.e2->tag == NUM && e->app.e2->num == 1) {
        *e = *(e->app.e1);
      }
      if (e->tag == APP) {
        simplify(e->app.e1);
        simplify(e->app.e2);
      }
    }
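
To try the simplifier on the earlier example x * (y * 1), the sketch below repeats the type and constructor definitions so that it compiles on its own. The helper new_node and the assert calls are additions of this sketch; the body of simplify follows the lecture.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { ADD, MUL } Op;
struct expr {
  enum { VAR, NUM, APP } tag;
  union {                       /* anonymous union: C11 or extension */
    char* var;
    int num;
    struct { struct expr* e1; Op op; struct expr* e2; } app;
  };
};
typedef struct expr Expr;

/* Hypothetical helper: allocate one tree node. */
static Expr* new_node(void) {
  Expr* e = malloc(sizeof(Expr));
  assert(e != NULL);  /* sketch only: real code should handle failure */
  return e;
}
Expr* mkvar(char* v) { Expr* e = new_node(); e->tag = VAR; e->var = v; return e; }
Expr* mknum(int n)   { Expr* e = new_node(); e->tag = NUM; e->num = n; return e; }
Expr* mkapp(Expr* e1, Op op, Expr* e2) {
  Expr* e = new_node();
  e->tag = APP; e->app.op = op; e->app.e1 = e1; e->app.e2 = e2;
  return e;
}

/* Apply x * 1 -> x at this node, then simplify the children. */
void simplify(Expr* e) {
  assert(e != NULL);
  if (e->tag == APP && e->app.op == MUL &&
      e->app.e2->tag == NUM && e->app.e2->num == 1) {
    *e = *(e->app.e1);  /* overwrite this node with a copy of its left child */
  }
  if (e->tag == APP) {
    simplify(e->app.e1);
    simplify(e->app.e2);
  }
}
```

Two properties of this version are worth knowing. The struct copy overwrites the node in place, so the old left child and the NUM node for 1 are no longer reachable and are never freed. Also, the rule is tried before the children are simplified, so a single call turns (x * 1) * 1 into x * 1 rather than x; simplifying the children first would be one way to handle such inputs.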

Homework exercises. Implement a pretty printer that prints an abstract syntax tree in a concrete form.
    void print(Expr* e) {... }
Extend the simplifier to exploit the following algebraic law: ∀x. x * 0 = 0

Motivation for Parsing. In SYAC, we are interested in how to implement the following kind of function:
    Expr* parse(char* string) {... }
It takes a string conforming to the concrete syntax and returns an abstract syntax tree.