Decision, Computation and Language


Decision, Computation and Language
Regular Expressions
Dr. Muhammad S Khan (mskhan@liv.ac.uk)
Ashton Building, Room G22
http://www.csc.liv.ac.uk/~khan/comp218

Regular expressions M S Khan (Univ. of Liverpool) COMP218 Decision, Computation and Language 2

Union

Concatenation

Closure (also called Kleene closure)
If we take a regular expression and add the superscript *, we get a new r.e. that represents the set of all words you can make by taking any finite sequence of words from the original r.e. (including the empty sequence) and concatenating them together.

Example
({cat})* denotes the words ε, cat, catcat, catcatcat, catcatcatcat, ...
Notice that using closure, you can define an infinite language!
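The closure example above can be sketched in Python. This is only an illustration: the function name closure_up_to and the cut-off max_repeats are assumptions introduced here, since the real closure is infinite and can only be listed up to some bound.

```python
from itertools import product

def closure_up_to(words, max_repeats):
    """List the words of (words)* built from at most max_repeats pieces.

    Hypothetical helper: the true closure is infinite, so we stop at a bound.
    """
    result = []
    for n in range(max_repeats + 1):
        # Choosing n words (with repetition) and concatenating them;
        # n = 0 gives the empty word, written ε on the slide.
        for combo in product(sorted(words), repeat=n):
            w = "".join(combo)
            if w not in result:
                result.append(w)
    return result

print(closure_up_to({"cat"}, 4))
# ['', 'cat', 'catcat', 'catcatcat', 'catcatcatcat']
```

The empty string at the front of the list plays the role of ε: it arises from concatenating zero words.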


Common extensions to the notation

Lexical tokens

Regular Languages
Regular languages are those languages that can be described using regular expressions. We will see that these are the same languages as those that can be accepted by finite automata (Kleene's theorem).
In terms of description length, some languages are better expressed as DFAs and some as regular expressions. Regular expressions usually give an easy-to-understand description; on the other hand, it is obvious how to run a DFA on an input string.

A language that is easier to describe using a r.e.

Regular Expressions vs. Finite Automata
Regular expressions offer a declarative way to express the pattern of any string we want to accept, e.g., 01* + 10*
Automata => more machine-like <input: string, output: [accept/reject]>
Regular expressions => more program syntax-like
Unix environments heavily use regular expressions, e.g., bash shell, grep, vi & other editors, sed
Perl scripting: good for string processing
Lexical analyzers such as Lex or Flex
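As a small illustration of the "program syntax-like" side, the expression 01* + 10* can be tried out with Python's re module. Note the notation change: the union operator written + on the slides is written | in Python, and fullmatch is used so that the whole string must match. The name accepts is chosen here for illustration only.

```python
import re

# Textbook 01* + 10* becomes 01*|10* in Python's regex syntax.
PATTERN = re.compile(r"01*|10*")

def accepts(w):
    """Return True if the whole string w matches 01* + 10*."""
    return PATTERN.fullmatch(w) is not None

for w in ["0111", "1000", "0", "0110", ""]:
    print(repr(w), accepts(w))
```

Like an automaton, the function takes a string as input and answers accept or reject; the difference is that the pattern is stated declaratively.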

Regular expressions = Finite Automata (DFA, NFA, ε-NFA)
Regular expressions: syntactical expressions
Finite automata: automata/machines
Regular languages: formal language classes

Example
L = { w | w is a binary string which does not contain two consecutive 0s or two consecutive 1s anywhere }
E.g., w = 01010101 is in L, while w = 10010 is not in L
Goal: Build a regular expression for L
Four cases for w:
Case A: w starts with 0 and |w| is even
Case B: w starts with 1 and |w| is even
Case C: w starts with 0 and |w| is odd
Case D: w starts with 1 and |w| is odd
Regular expression for the four cases:
Case A: (01)*
Case B: (10)*
Case C: 0(10)*
Case D: 1(01)*
(The empty string ε, which is also in L, is covered by (01)* and (10)*.)
Since L is the union of all 4 cases:
Reg Exp for L = (01)* + (10)* + 0(10)* + 1(01)*
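One way to gain confidence in the four-case expression is to compare it against the definition of L on every short binary string. The sketch below does this in Python; the names in_L and regex_matches_definition and the length bound are assumptions made for the illustration, and a bounded check like this is evidence rather than a proof.

```python
import re
from itertools import product

# (01)* + (10)* + 0(10)* + 1(01)*, with union written as | in Python.
L_REGEX = re.compile(r"(?:01)*|(?:10)*|0(?:10)*|1(?:01)*")

def in_L(w):
    """Definition of L: no two consecutive 0s and no two consecutive 1s."""
    return "00" not in w and "11" not in w

def regex_matches_definition(max_len):
    """Compare regex and definition on all binary strings up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if (L_REGEX.fullmatch(w) is not None) != in_L(w):
                return False
    return True

print(regex_matches_definition(10))  # True
```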

Precedence of Operators
Highest to lowest:
1. * operator (star)
2. . operator (concatenation)
3. + operator (union)
Example: 01* + 1 = ( 0 . ((1)*) ) + 1
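Python's re module happens to use the same precedence order (star, then concatenation, then alternation written |), so the example can be checked directly. The sketch below compares the unparenthesised pattern with the fully parenthesised reading and with one incorrect reading; the helper name language and the length bound are assumptions for illustration.

```python
import re
from itertools import product

def language(pattern, max_len=5):
    """Set of binary strings up to max_len fully matched by pattern."""
    rx = re.compile(pattern)
    return {
        "".join(bits)
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
        if rx.fullmatch("".join(bits))
    }

implicit = language(r"01*|1")           # 01* + 1, no parentheses
explicit = language(r"(?:0(?:1*))|1")   # (0.((1)*)) + 1
wrong = language(r"(?:01)*|1")          # misreading: (01)* + 1

print(implicit == explicit)  # True
print(implicit == wrong)     # False
```

For instance, 011 belongs to 01* + 1 but not to (01)* + 1, while 0101 is the other way round.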

Finite Automata & Regular Expressions
(Proofs in the book.)
To show that they are interchangeable, consider the following theorems:
Theorem 1: For every DFA A there exists a regular expression R such that L(R) = L(A)
Theorem 2: For every regular expression R there exists an ε-NFA E such that L(E) = L(R)
Kleene Theorem: Reg Ex → ε-NFA (Theorem 2) → NFA → DFA → Reg Ex (Theorem 1)

DFA to RE construction (Theorem 1: DFA → Reg Ex)
Informally, trace all distinct paths (traversing cycles only once) from the start state to each of the final states and enumerate all the expressions along the way.
Example: a DFA with states q0 (start), q1 and q2 (final):
q0 loops on 1 and moves to q1 on 0
q1 loops on 0 and moves to q2 on 1
q2 loops on 0,1
Expressions along the path: (1*) 0 (0*) 1 (0 + 1)*
Resulting regular expression: 1*00*1(0+1)*
Q) What is the language?
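The example can be checked by simulating the DFA and comparing it with the derived expression on all short inputs. The transition table below is a reading of the slide's diagram (q0 start, q2 final), which is an assumption because the figure itself did not survive; the function names are likewise illustrative.

```python
import re
from itertools import product

# Transition table as read from the example: (state, symbol) -> next state.
DELTA = {
    ("q0", "1"): "q0", ("q0", "0"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

def dfa_accepts(w, start="q0", finals=("q2",)):
    """Run the DFA on w, one symbol at a time."""
    state = start
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in finals

# 1*00*1(0+1)* with union written as | in Python.
DERIVED = re.compile(r"1*00*1(?:0|1)*")

def agree_up_to(max_len):
    """True if DFA and regex agree on every binary string up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if dfa_accepts(w) != (DERIVED.fullmatch(w) is not None):
                return False
    return True

print(agree_up_to(10))  # True
```

Running dfa_accepts on a few strings of your own is a quick way to explore the question posed on the slide.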

RE to ε-NFA construction (Theorem 2: Reg Ex → ε-NFA)
Example: (0+1)*01(0+1)*
[Figure: ε-NFA assembled from sub-automata for (0+1)*, 01 and (0+1)*]
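A minimal sketch of how such an ε-NFA can be assembled from smaller pieces is given below. It follows the usual Thompson-style construction, which may differ in detail from the diagram on the slide; the class NFA and the helpers symbol, concat, union, star and accepts are names invented for this illustration.

```python
from itertools import count

_ids = count()  # source of fresh state numbers

class NFA:
    """ε-NFA fragment with one start and one accept state.
    edges is a list of (source, label, target); label None means ε."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

def symbol(c):
    s, a = next(_ids), next(_ids)
    return NFA(s, a, [(s, c, a)])

def concat(x, y):
    return NFA(x.start, y.accept,
               x.edges + y.edges + [(x.accept, None, y.start)])

def union(x, y):
    s, a = next(_ids), next(_ids)
    extra = [(s, None, x.start), (s, None, y.start),
             (x.accept, None, a), (y.accept, None, a)]
    return NFA(s, a, x.edges + y.edges + extra)

def star(x):
    s, a = next(_ids), next(_ids)
    extra = [(s, None, x.start), (s, None, a),
             (x.accept, None, x.start), (x.accept, None, a)]
    return NFA(s, a, x.edges + extra)

def eps_closure(nfa, states):
    """All states reachable from states using only ε-edges."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in nfa.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, w):
    current = eps_closure(nfa, {nfa.start})
    for c in w:
        moved = {d for s, lab, d in nfa.edges if s in current and lab == c}
        current = eps_closure(nfa, moved)
    return nfa.accept in current

def any_bits():
    return star(union(symbol("0"), symbol("1")))  # (0+1)*

# (0+1)* 01 (0+1)*
nfa = concat(concat(any_bits(), concat(symbol("0"), symbol("1"))), any_bits())
print(accepts(nfa, "1101"), accepts(nfa, "110"))  # True False
```

Each operator of the regular expression contributes a small gadget with fresh states, which is why any_bits() is called twice rather than reusing one fragment.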

Algebraic Laws of Regular Expressions
Commutative: E+F = F+E
Associative: (E+F)+G = E+(F+G), (EF)G = E(FG)
Identity: E+Φ = E, εE = Eε = E
Annihilator: ΦE = EΦ = Φ

Algebraic Laws
Distributive: E(F+G) = EF + EG, (F+G)E = FE + GE
Idempotent: E + E = E
Involving Kleene closures: (E*)* = E*, Φ* = ε, ε* = ε, E⁺ = EE*
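The laws can be spot-checked on small finite languages by representing a language as a Python set of strings, cut off at a maximum length. This is only a sanity check on sample languages chosen here (E, F and G below are arbitrary), not a proof; the helper names concat, star and check_laws are assumptions for the sketch.

```python
def concat(A, B, n):
    """Concatenation of languages A and B, keeping words of length <= n."""
    return {x + y for x in A for y in B if len(x + y) <= n}

def star(A, n):
    """Kleene closure of A, keeping words of length <= n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, A, n) - result
        result |= frontier
    return result

def check_laws(E, F, G, n):
    """Evaluate each law on the given sample languages."""
    EMPTY, EPS = set(), {""}   # Φ and {ε}
    return {
        "E+F = F+E": E | F == F | E,
        "(EF)G = E(FG)": concat(concat(E, F, n), G, n)
                         == concat(E, concat(F, G, n), n),
        "E+Φ = E": E | EMPTY == E,
        "εE = Eε = E": concat(EPS, E, n) == E == concat(E, EPS, n),
        "ΦE = EΦ = Φ": concat(EMPTY, E, n) == concat(E, EMPTY, n) == EMPTY,
        "E(F+G) = EF+EG": concat(E, F | G, n)
                          == concat(E, F, n) | concat(E, G, n),
        "(F+G)E = FE+GE": concat(F | G, E, n)
                          == concat(F, E, n) | concat(G, E, n),
        "E+E = E": E | E == E,
        "(E*)* = E*": star(star(E, n), n) == star(E, n),
        "Φ* = ε": star(EMPTY, n) == EPS,
        "ε* = ε": star(EPS, n) == EPS,
    }

results = check_laws({"a", "bb"}, {"b", ""}, {"ab"}, 6)
print(all(results.values()))  # True
```

Note that concatenation is deliberately absent from the commutative law: with these samples, EF and FE are different sets.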

Observations, exercise

Equivalences amongst regular expressions

True or False?
Let R and S be two regular expressions. Then:
1. ((R*)*)* = R*?
2. (R+S)* = R* + S*?
3. (RS + R)* RS = (RR*S)*?
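One way to explore questions like 2 and 3 is to pick concrete expressions for R and S and search for a string on which the two sides disagree. The sketch below takes R = a and S = b (an arbitrary choice made here) and uses Python's re module, where union is written |. A string found this way shows the identity fails in general; finding none up to the bound would not prove it holds. Question 1 can instead be settled by applying the law (E*)* = E* twice.

```python
import re
from itertools import product

def first_difference(p, q, alphabet="ab", max_len=6):
    """Shortest string (in enumeration order) matched by exactly one of
    the two patterns, or None if they agree up to max_len."""
    rp, rq = re.compile(p), re.compile(q)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if (rp.fullmatch(w) is None) != (rq.fullmatch(w) is None):
                return w
    return None

# Question 2 with R = a, S = b: (a+b)* versus a* + b*
print(repr(first_difference(r"(?:a|b)*", r"a*|b*")))

# Question 3 with R = a, S = b: (ab + a)* ab versus (aa*b)*
print(repr(first_difference(r"(?:ab|a)*ab", r"(?:aa*b)*")))
```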

Using regular expressions for lexical analysis
Use regular expressions to specify lexical tokens.
A lexical-analyzer generator (such as Lex or Flex) can convert r.e.'s to DFAs. Given a sequence of characters that may or may not be a token, it is easy to check whether it is accepted by the DFA. (Notice that it is not straightforward to check directly whether a string of symbols belongs to the language given by a regular expression.)
A compiler, given a program, will look for a prefix that corresponds to a token; when that prefix is found, it adds it to the list of tokens, deletes it from the program, and repeats on the remainder of the program. (A lexical error is generated if no prefix matches any token expression.)
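The prefix-matching loop described above might be sketched as follows. The token names and patterns (NUM, ID, OP, WS) are invented for the example and do not come from the slides, and choosing the longest matching prefix is an assumption made here (it is the convention used by tools such as Lex, but the slide does not specify it).

```python
import re

# Hypothetical token definitions for a tiny expression language.
TOKEN_SPECS = [
    ("NUM", re.compile(r"[0-9]+")),
    ("ID", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("OP", re.compile(r"[+\-*/=()]")),
    ("WS", re.compile(r"[ \t\n]+")),   # whitespace, skipped below
]

def tokenize(program):
    """Split program into (token name, text) pairs."""
    tokens, pos = [], 0
    while pos < len(program):
        best_name, best_text = None, ""
        # Try every token expression at the current position and
        # keep the longest prefix that matches (assumed convention).
        for name, pattern in TOKEN_SPECS:
            m = pattern.match(program, pos)
            if m and len(m.group()) > len(best_text):
                best_name, best_text = name, m.group()
        if best_name is None:
            # No prefix matches any token expression: lexical error.
            raise SyntaxError(
                f"lexical error at position {pos}: {program[pos]!r}")
        if best_name != "WS":
            tokens.append((best_name, best_text))
        pos += len(best_text)   # "delete" the prefix and repeat
    return tokens

print(tokenize("x1 = 42 + y"))
# [('ID', 'x1'), ('OP', '='), ('NUM', '42'), ('OP', '+'), ('ID', 'y')]
```

A production scanner would compile all token expressions into a single DFA instead of trying each pattern in turn; the loop here only mirrors the description on the slide.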

Taking stock