Lexical and Syntax Analysis

Lexical and Syntax Analysis (of Programming Languages): Abstract Syntax

What is Parsing?
A parser takes a string of characters, which is easy for humans to write, and produces a data structure, which is easy for programs to process. A parser also checks that the input string is well-formed, and if not, rejects it.

Example 1
The parser takes CSV (Comma Separated Value) text as input:
    Charlton, 49
    Lineker, 48
    Beckham, 17
and produces an array of pairs as output:
    49  Charlton
    17  Beckham
    48  Lineker

Concrete and Abstract Syntax
The concrete syntax is a set of rules that describe valid inputs to the parser.
The abstract syntax is a set of rules that describe valid outputs from the parser.
The data structure produced by a parser is commonly termed the abstract syntax tree.

Concrete and Abstract Syntax
The parser's input, a string of characters, conforms to the concrete syntax of the language. Its output, a data structure, conforms to the abstract syntax of the language.

Abstract syntax
The abstract syntax is usually specified as a data type in the programming language being used, in our case C. Example:
    typedef struct { char* name; int goals; } Player;
    typedef struct { Player* players; int size; } Squad;
An abstract syntax tree is a value of this type.
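To make the link between Example 1 and these types concrete, here is a minimal sketch of a parser that turns the CSV text into a Squad. The names parse_squad and free_squad, the 63-character limit on names, and the choice to return 1 for accept and 0 for reject are all assumptions made for illustration; the slides do not prescribe them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char* name; int goals; } Player;
typedef struct { Player* players; int size; } Squad;

/* Hypothetical helper: release everything owned by a Squad. */
void free_squad(Squad* s) {
  assert(s != NULL);
  for (int i = 0; i < s->size; i++) free(s->players[i].name);
  free(s->players);
  s->players = NULL;
  s->size = 0;
}

/* Hypothetical parser for the CSV of Example 1.
   Returns 1 and fills *out if text is well-formed;
   returns 0 and leaves *out empty otherwise. */
int parse_squad(const char* text, Squad* out) {
  assert(text != NULL && out != NULL);
  out->players = NULL;
  out->size = 0;
  for (;;) {
    char name[64];
    int goals = 0;
    int used = 0;

    text += strspn(text, " \t\r\n");   /* skip space between records */
    if (*text == '\0') return 1;        /* end of input: accept */

    /* One record: a name without commas, a comma, then an integer. */
    if (sscanf(text, "%63[^,\n],%d%n", name, &goals, &used) != 2) {
      free_squad(out);
      return 0;                         /* ill-formed: reject */
    }

    Player* grown = realloc(out->players, (out->size + 1) * sizeof(Player));
    char* copy = malloc(strlen(name) + 1);
    if (grown != NULL) out->players = grown;
    if (grown == NULL || copy == NULL) {
      free(copy);
      free_squad(out);
      return 0;                         /* out of memory */
    }
    strcpy(copy, name);
    out->players[out->size].name = copy;
    out->players[out->size].goals = goals;
    out->size++;
    text += used;
  }
}
```

The input string plays the role of the concrete syntax and the resulting Squad value plays the role of the abstract syntax tree. A caller would pass the CSV text and a pointer to a Squad, check the result, and call free_squad when done.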

This Chapter
How to define the abstract syntax and how to construct abstract syntax trees in the programming language C. The chapter also revisits some important C programming techniques. If you need a C tutorial then the following books are recommended.

POINTERS
Pointer: a variable that holds the address of a core storage location. [The Free Dictionary]

Pointers
Declare a variable x of type int and initialise it to the value 10.
    int x = 10;
Declare a variable p of type int* (read: int pointer).
    int* p;
Make p point to x (or assign the address of x to p).
    p = &x;

Pointers
Print p (here, the address of x). A pointer is printed with %p, not %i.
    printf("%p\n", (void*) p);
Print the value pointed to by p (here, the value of x, which is 10).
    printf("%i\n", *p);
Assign 20 to the location pointed to by p, so that x becomes 20.
    *p = 20;
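The pointer steps above can be collected into one function. This is only a sketch; the function name pointer_demo is hypothetical, and the address printed by the first printf differs from run to run.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical demo: performs the pointer steps in order and
   returns the final value of x. */
int pointer_demo(void) {
  int x = 10;
  int* p;

  p = &x;                     /* p now holds the address of x */
  printf("%p\n", (void*) p);  /* prints an address; the value varies */
  printf("%i\n", *p);         /* prints the value of x through p */

  *p = 20;                    /* writes to x through p */
  assert(*p == x);            /* *p and x name the same location */
  return x;
}
```

Because p points to x, the assignment through *p changes x itself, which is why the function returns 20 rather than 10.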

Exercise 1
What is printed by the following program?
    #include <stdio.h>
    void swap(int* x, int* y) {
      int tmp;
      tmp = *x;
      *x = *y;
      *y = tmp;
    }
    int main(void) {
      int a = 1;
      int b = 2;
      swap(&a, &b);
      printf("a=%i, b=%i\n", a, b);
      return 0;
    }

DYNAMIC ALLOCATION
Dynamic Allocation: the allocation of memory storage for use in a computer program. [The Free Dictionary]

Array allocation
Declare a variable p of type int*.
    int* p;
Allocate memory for an array of 4 int values and let p point to it.
    p = malloc(4 * sizeof(int));

Array indexing
Assign 10 to the location pointed to by p.
    *p = 10;
Assign 20 to the first element of the array pointed to by p (the same location as *p).
    p[0] = 20;
Copy the first element of the array to the third element.
    p[2] = p[0];

Array deallocation
When finished with an array allocated by malloc, call free to release the space; otherwise your program may run out of memory.
    free(p);
The space is released, so it can be reused by future calls to malloc.
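The allocation, indexing and deallocation steps fit together as in the sketch below. The helper name make_array is hypothetical, and the NULL check is an addition: malloc returns NULL when it cannot satisfy a request, which the slides do not cover.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an array of n ints, each set to value.
   Returns NULL if the allocation fails. The caller must free the result. */
int* make_array(int n, int value) {
  assert(n > 0);
  int* p = malloc(n * sizeof(int));
  if (p == NULL) return NULL;          /* allocation failed */
  for (int i = 0; i < n; i++) {
    p[i] = value;                      /* p[i] is the same as *(p + i) */
  }
  return p;
}
```

A caller would use the array with the indexing operations shown earlier and then call free exactly once on the returned pointer.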

STRINGS
String: a series of consecutive characters. [The Free Dictionary]

Strings
Declare a variable s, initialised to point to the string "hi", which is stored as the characters h, i, \0.
    char* s = "hi";
Let s point to the next character.
    s = s + 1;
And let s point to the previous character again.
    s = s - 1;
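Moving a char* along a string until it reaches the terminating \0 is a common way to visit every character. The sketch below uses that idea; the function count_char is a hypothetical example and not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how often c occurs in the string s
   by advancing s one character at a time. */
int count_char(const char* s, char c) {
  assert(s != NULL);
  int n = 0;
  while (*s != '\0') {   /* stop at the terminating null character */
    if (*s == c) n++;
    s = s + 1;           /* let s point to the next character */
  }
  return n;
}
```

Only the local copy of the pointer moves, so the caller's pointer still refers to the start of the string afterwards.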

Exercise 2
What is printed by the following program?
    #include <stdio.h>
    int f(char* s) {
      int i = 0;
      while (s[i] != '\0') i++;
      return i;
    }
    int main(void) {
      char* x = "Hello";
      printf("%i\n", f(x));
      return 0;
    }

USER-DEFINED TYPES
Type: the general character or structure held in common by a number of things. [The Free Dictionary]

Type definitions
A typedef declaration allows a new name to be given to a type.
    typedef int Integer;    /* Integer is a new name for the existing type int */
    typedef char* String;   /* String is a new name for the existing type char* */
Example use:
    String s;   /* Declare a string s */
    Integer i;  /* and an integer i */
    i = 0;
    s = "hello";

Enumerations
An enum declaration introduces a new type whose values are members of a given set.
    enum colour {RED, GREEN, BLUE};   /* new type enum colour; possible values RED, GREEN, BLUE */
Example use:
    enum colour c;
    c = RED;
    if (c == RED) printf("Red\n");
Give it a shorter name:
    typedef enum colour Colour;

Structures
A struct declaration introduces a new type that is a conjunction of one or more existing types.
    struct rectangle {   /* new type: a width and a height */
      float width;
      float height;
    };
Example use:
    struct rectangle r;
    r.width = 10;
    r.height = 20;
A circle:
    struct circle {
      float radius;
    };

Unions
A union declaration introduces a new type that is a disjunction of one or more existing types.
    union shape {   /* new type: a circle or a rectangle */
      struct circle circ;
      struct rectangle rect;
    };
Example use:
    union shape s;
    s.circ.radius = 10;

Tagged unions
Often a tag is used to denote the active disjunct of a union. Another definition of shape:
    struct shape {
      enum { CIRCLE, RECTANGLE } tag;
      union {   /* anonymous union: standard since C11 */
        struct circle circ;
        struct rectangle rect;
      };
    };

Tagged unions
Example: s is a circle and t is a rectangle, and both are of type struct shape.
    struct shape s, t;
    s.tag = CIRCLE;
    s.circ.radius = 10;
    t.tag = RECTANGLE;
    t.rect.width = 5;
    t.rect.height = 15;

Tagged unions
Example: compute the area of any given shape s.
    float area(struct shape s) {
      if (s.tag == CIRCLE) {
        float r = s.circ.radius;
        return (3.14 * r * r);
      }
      if (s.tag == RECTANGLE) {
        return (s.rect.width * s.rect.height);
      }
      return 0;  /* not reached when the tag is valid */
    }
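The three tagged-union slides can be combined into one compilable unit. The sketch below is one possible arrangement: the helpers make_circle and make_rectangle are hypothetical additions that set the tag and the matching member together, and area is rewritten with a switch, which is a stylistic choice rather than the slides' own code.

```c
#include <assert.h>

struct circle { float radius; };
struct rectangle { float width; float height; };

struct shape {
  enum { CIRCLE, RECTANGLE } tag;
  union {                       /* anonymous union: C11 or later */
    struct circle circ;
    struct rectangle rect;
  };
};

/* Hypothetical constructor: a shape whose active member is circ. */
struct shape make_circle(float radius) {
  struct shape s;
  s.tag = CIRCLE;
  s.circ.radius = radius;
  return s;
}

/* Hypothetical constructor: a shape whose active member is rect. */
struct shape make_rectangle(float width, float height) {
  struct shape s;
  s.tag = RECTANGLE;
  s.rect.width = width;
  s.rect.height = height;
  return s;
}

/* Inspect the tag first, then read only the member it selects. */
float area(struct shape s) {
  switch (s.tag) {
    case CIRCLE:
      return 3.14f * s.circ.radius * s.circ.radius;
    case RECTANGLE:
      return s.rect.width * s.rect.height;
  }
  assert(0 && "unknown tag");   /* only reached if the tag is corrupt */
  return 0.0f;
}
```

Setting the tag inside a constructor keeps the tag and the active member from drifting apart, which is the main risk when the two are assigned by hand.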

Recursive structures
A value of type struct t may contain a value of type struct t*.
    struct list {
      int head;
      struct list* tail;
    };
    typedef struct list List;
Suppose xs is a value of type List*. The arrow operator abbreviates a dereference followed by a field access:
    (*xs).head            can be written   xs->head
    (*(*xs).tail).head    can be written   xs->tail->head

Recursive structures
Example: inserting an item onto the front of a linked list.
    List* insert(List* xs, int x) {
      List* ys = malloc(sizeof(List));
      ys->head = x;
      ys->tail = xs;
      return ys;
    }
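A short sketch of how insert might be used to build, traverse and release a list follows. Representing the empty list by NULL is an assumption, as are the helper functions sum and free_list; the assert on the result of malloc stands in for proper error handling.

```c
#include <assert.h>
#include <stdlib.h>

struct list {
  int head;
  struct list* tail;
};
typedef struct list List;

/* Insert x onto the front of xs and return the new list. */
List* insert(List* xs, int x) {
  List* ys = malloc(sizeof(List));
  assert(ys != NULL);     /* sketch only: real code would report failure */
  ys->head = x;
  ys->tail = xs;
  return ys;
}

/* Hypothetical helper: add up all items, following tail pointers
   until the empty list (NULL, by assumption) is reached. */
int sum(const List* xs) {
  int total = 0;
  for (; xs != NULL; xs = xs->tail) {
    total += xs->head;
  }
  return total;
}

/* Hypothetical helper: free every node of the list. */
void free_list(List* xs) {
  while (xs != NULL) {
    List* next = xs->tail;  /* read the tail before freeing the node */
    free(xs);
    xs = next;
  }
}
```

Calling insert three times starting from NULL, with 3, then 2, then 1, yields the list 1, 2, 3, because each call places its item at the front.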

CASE STUDY
A simplifier for arithmetic expressions.

Concrete syntax
Consider the following concrete syntax for arithmetic expressions, where v ranges over variable names and n over integers.
    e → v
      | n
      | e + e
      | e * e
      | ( e )
Example expression:
    x * y + (z * 10)

Simplification
Consider the algebraic law:
    ∀x. x * 1 = x
This law can be used to simplify expressions by using it as a rewrite rule from left to right. Example simplification:
    x * (y * 1)  →  x * y

Problem
1. Define an abstract syntax, in C, for arithmetic expressions.
2. Show how to construct abstract syntax trees that represent arithmetic expressions.
3. Implement the simplification as a C function that takes and returns an abstract syntax tree.

Abstract syntax
    typedef enum { ADD, MUL } Op;
    struct expr {
      enum { VAR, NUM, APP } tag;
      union {
        char* var;                 /* a variable, */
        int num;                   /* or a number, */
        struct {                   /* or an op and two sub-expressions */
          struct expr* e1;
          Op op;
          struct expr* e2;
        } app;
      };
    };
    typedef struct expr Expr;

Constructors
    Expr* mkvar(char* v) {
      Expr* e = malloc(sizeof(Expr));
      e->tag = VAR;
      e->var = v;
      return e;
    }
    Expr* mknum(int n) {
      Expr* e = malloc(sizeof(Expr));
      e->tag = NUM;
      e->num = n;
      return e;
    }
    Expr* mkapp(Expr* e1, Op op, Expr* e2) {
      Expr* e = malloc(sizeof(Expr));
      e->tag = APP;
      e->app.op = op;
      e->app.e1 = e1;
      e->app.e2 = e2;
      return e;
    }

Abstract syntax trees
An abstract syntax tree that represents the expression x + y * 2 can be constructed by the following C expression:
    mkapp(mkvar("x"), ADD, mkapp(mkvar("y"), MUL, mknum(2)))

Simplification
The law ∀x. x * 1 = x is implemented by
    void simplify(Expr* e) {
      if (e->tag == APP && e->app.op == MUL &&
          e->app.e2->tag == NUM && e->app.e2->num == 1) {
        *e = *(e->app.e1);   /* overwrite the node x * 1 with x */
      }
      if (e->tag == APP) {
        simplify(e->app.e1);
        simplify(e->app.e2);
      }
    }
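The abstract syntax, the constructors and the simplifier can be assembled into one compilable unit, sketched below. Two things here are assumptions rather than the slides' code. First, this variant simplifies the sub-expressions before applying the rule at the current node, so that an expression such as (x * 1) * 1 reduces all the way to x; the version on the slide applies the rule first and then recurses, which leaves x * 1 in that case. Second, the discarded nodes are freed, which is only safe if every node was created by the constructors and has exactly one parent. The helper new_expr is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { ADD, MUL } Op;

struct expr {
  enum { VAR, NUM, APP } tag;
  union {                        /* anonymous union: C11 or later */
    char* var;
    int num;
    struct { struct expr* e1; Op op; struct expr* e2; } app;
  };
};
typedef struct expr Expr;

/* Hypothetical helper: allocate one node, stopping on failure. */
static Expr* new_expr(void) {
  Expr* e = malloc(sizeof(Expr));
  assert(e != NULL);             /* sketch only: no error recovery */
  return e;
}

Expr* mkvar(char* v) {
  Expr* e = new_expr();
  e->tag = VAR;
  e->var = v;
  return e;
}

Expr* mknum(int n) {
  Expr* e = new_expr();
  e->tag = NUM;
  e->num = n;
  return e;
}

Expr* mkapp(Expr* e1, Op op, Expr* e2) {
  Expr* e = new_expr();
  e->tag = APP;
  e->app.op = op;
  e->app.e1 = e1;
  e->app.e2 = e2;
  return e;
}

/* Bottom-up variant (an assumption, see the text above):
   simplify the children first, then rewrite x * 1 to x here. */
void simplify(Expr* e) {
  assert(e != NULL);
  if (e->tag == APP) {
    simplify(e->app.e1);
    simplify(e->app.e2);
    if (e->app.op == MUL &&
        e->app.e2->tag == NUM && e->app.e2->num == 1) {
      Expr* left = e->app.e1;
      Expr* one = e->app.e2;
      *e = *left;                /* e takes over the contents of x */
      free(left);                /* assumes single ownership of nodes */
      free(one);
    }
  }
}
```

For example, building x * (y * 1) with mkapp(mkvar("x"), MUL, mkapp(mkvar("y"), MUL, mknum(1))) and passing it to simplify leaves a tree for x * y.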

Homework exercises
Implement a pretty printer that prints an abstract syntax tree in a concrete form.
    void print(Expr* e) {... }
Extend the simplifier to exploit the following algebraic law.
    ∀x. x * 0 = 0

Motivation for LSA
In LSA, we are interested in how to implement the following kind of function:
    Expr* parse(char* string) {... }
It takes a string conforming to the concrete syntax and returns an abstract syntax tree.