Languages and Strings. Chapter 2

Similar documents
Proof Techniques Alphabets, Strings, and Languages. Foundations of Computer Science Theory

MA/CSSE 474. Today's Agenda

Why Study the Theory of Computation?

Regular Expressions. Chapter 6

Regular Expression Module-2

DVA337 HT17 - LECTURE 4. Languages and regular expressions

Languages and Finite Automata

Glynda, the good witch of the North

Finite Automata Part Three

Notes for Comp 454 Week 2

Chapter Seven: Regular Expressions. Formal Language, chapter 7, slide 1

Automata Theory TEST 1 Answers Max points: 156 Grade basis: 150 Median grade: 81%

Chapter Seven: Regular Expressions

CHAPTER TWO LANGUAGES. Dr Zalmiyah Zakaria

Finite Automata Part Three

Regular Expressions & Automata

Slides for Faculty Oxford University Press All rights reserved.

CS412/413. Introduction to Compilers Tim Teitelbaum. Lecture 2: Lexical Analysis 23 Jan 08

Last lecture CMSC330. This lecture. Finite Automata: States. Finite Automata. Implementing Regular Expressions. Languages. Regular expressions

Regular Languages. Regular Language. Regular Expression. Finite State Machine. Accepts

CMSC 132: Object-Oriented Programming II

Welcome! CSC445 Models Of Computation Dr. Lutz Hamel Tyler 251

CMPSCI 250: Introduction to Computation. Lecture #28: Regular Expressions and Languages David Mix Barrington 2 April 2014

Languages and Compilers

Formal languages and computation models

HKN CS 374 Midterm 1 Review. Tim Klem Noah Mathes Mahir Morshed

Regular Expressions. Lecture 10 Sections Robb T. Koether. Hampden-Sydney College. Wed, Sep 14, 2016

Skyup's Media. PART-B 2) Construct a Mealy machine which is equivalent to the Moore machine given in table.

ITEC2620 Introduction to Data Structures

1.3 Functions and Equivalence Relations 1.4 Languages

Finite Automata Part Three

Lecture 6,

Multiple Choice Questions

CS402 Theory of Automata Solved Subjective From Midterm Papers. MIDTERM SPRING 2012 CS402 Theory of Automata

University of Nevada, Las Vegas Computer Science 456/656 Fall 2016

Automata & languages. A primer on the Theory of Computation. The imitation game (2014) Benedict Cumberbatch Alan Turing ( ) Laurent Vanbever

CS 314 Principles of Programming Languages

1. [5 points each] True or False. If the question is currently open, write O or Open.

CS402 - Theory of Automata FAQs By

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

Regular Expressions. Regular Expressions. Regular Languages. Specifying Languages. Regular Expressions. Kleene Star Operation

Automata Theory CS S-FR Final Review

Complexity Theory. Compiled By : Hari Prasad Pokhrel Page 1 of 20. ioenotes.edu.np

Turing Machine Languages

Strings, Languages, and Regular Expressions

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

MATH Iris Loeb.

JNTUWORLD. Code No: R

Announcements. CS243: Discrete Structures. Strong Induction and Recursively Defined Structures. Review. Example (review) Example (review), cont.

Rice s Theorem and Enumeration

CSCI 270: Introduction to Algorithms and Theory of Computing Fall 2017 Prof: Leonard Adleman Scribe: Joseph Bebel

Front End: Lexical Analysis. The Structure of a Compiler

CSE 105 THEORY OF COMPUTATION

Power Set of a set and Relations

Problems, Languages, Machines, Computability, Complexity

Lecture 5: The Halting Problem. Michael Beeson

Decision Properties for Context-free Languages

Introduction to Lexical Analysis

Reflection in the Chomsky Hierarchy

The Turing Machine. Unsolvable Problems. Undecidability. The Church-Turing Thesis (1936) Decision Problem. Decision Problems

Lexical Analysis - 1. A. Overview A.a) Role of Lexical Analyzer

UNION-FREE DECOMPOSITION OF REGULAR LANGUAGES

Introduction to Lexical Analysis

Caveat lector: This is the first edition of this lecture note. Please send bug reports and suggestions to

LECTURE NOTES THEORY OF COMPUTATION

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

QUESTION BANK. Formal Languages and Automata Theory(10CS56)

Material from Recitation 1

Finite-State Transducers in Language and Speech Processing

LECTURE NOTES THEORY OF COMPUTATION

Learn Smart and Grow with world

CS321 Languages and Compiler Design I. Winter 2012 Lecture 4

Compilers. Nai-Wei Lin Department of Computer Science and Information Engineering National Chung Cheng University

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

Lexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata

AUBER (Models of Computation, Languages and Automata) EXERCISES

UNIT I PART A PART B

TOPIC PAGE NO. UNIT-I FINITE AUTOMATA

Compiler Construction 2011, Lecture 2

Computability Summary

Week 2: Syntax Specification, Grammars

Infinity and Uncountability. Countable Countably infinite. Enumeration

Alphabets, strings and formal. An introduction to information representation

Chapter 4: Regular Expressions

3.15: Applications of Finite Automata and Regular Expressions. Representing Character Sets and Files

A Little Point Set Topology

Final Course Review. Reading: Chapters 1-9

Name: Finite Automata

ECS 20 Lecture 9 Fall Oct 2013 Phil Rogaway

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

CS402 - Theory of Automata Glossary By

Closure Properties of CFLs; Introducing TMs. CS154 Chris Pollett Apr 9, 2007.

n λxy.x n y, [inc] [add] [mul] [exp] λn.λxy.x(nxy) λmn.m[inc]0 λmn.m([add]n)0 λmn.n([mul]m)1

Regular Expressions. A Language for Specifying Languages. Thursday, September 20, 2007 Reading: Stoughton 3.1

5.2: Closure Properties of Recursive and Recursively Enumerable Languages

CS308 Compiler Principles Lexical Analyzer Li Jiang

CSE 311 Lecture 21: Context-Free Grammars. Emina Torlak and Kevin Zatloukal

Lexical Analysis. Prof. James L. Frankel Harvard University

8 ε. Figure 1: An NFA-ǫ

CHAPTER 4. COMPUTABILITY AND DECIDABILITY

Transcription:

Languages and Strings Chapter 2

Let's Look at Some Problems int alpha, beta; alpha = 3; beta = (2 + 5) / 10; (1) Lexical analysis: Scan the program and break it up into variable names, numbers, etc. (2) Parsing: Create a tree that corresponds to the sequence of operations that should be executed, e.g., / + 10 2 5 (3) Optimization: Realize that we can skip the first assignment since the value is never used and that we can precompute the arithmetic expression, since it contains only constants. (4) Termination: Decide whether the program is guaranteed to halt. (5) Interpretation: Figure out what (if anything) useful it does.

A Framework for Analyzing Problems We need a single framework in which we can analyze a very diverse set of problems. The framework we will use is Language Recognition A language is a (possibly infinite) set of finite length strings over a finite alphabet.

Strings A string is a finite sequence, possibly empty, of symbols drawn from some alphabet Σ. ε is the empty string. Σ* is the set of all possible strings over an alphabet Σ. Alphabet name Alphabet symbols Example strings The English alphabet The binary alphabet {a, b, c,, z} ε, aabbcg, aaaaa {0, 1} ε, 0, 001100 A star alphabet {,,,,, } ε,, A music alphabet {w, h, q, e, x, r, l} ε, w l h h l hqq l

Functions on Strings Counting: s is the number of symbols in s. ε = 0 1001101 = 7 # c (s) is the number of times that c occurs in s. # a (abbaaa) = 4.

More Functions on Strings Concatenation: st is the concatenation of s and t. If x = good and y = bye, then xy = goodbye. Note that xy = x + y. ε in string concatenation: x (x ε = ε x = x). Concatenation is associative: s, t, w ((st)w = s(tw)).

More Functions on Strings Replication: For each string w and each natural number i, the string w i is: Examples: w 0 = ε w i+1 = w i w a 3 = aaa (bye) 2 = byebye a 0 b 3 = bbb

More Functions on Strings Reverse: For each string w, w R is defined as: if w = 0 then w R = w = ε if w 1 then: a Σ ( u Σ* (w = ua)). So define w R = a u R.

Concatenation and Reverse of Strings Theorem: If w and x are strings, then (w x) R = x R w R. Example: (nametag) R = (tag) R (name) R = gateman

Concatenation and Reverse of Strings Proof: By induction on x : x = 0: Then x = ε, and (wx) R = (w ε) R = (w) R = ε w R = ε R w R = x R w R. n 0 ((( x = n) ((w x) R = x R w R )) (( x = n + 1) ((w x) R = x R w R ))): Consider any string x, where x = n + 1. Then x = u a for some character a and u = n. So: (w x) R = (w (u a)) R rewrite x as ua = ((w u) a) R associativity of concatenation = a (w u) R definition of reversal = a (u R w R ) induction hypothesis = (a u R ) w R associativity of concatenation = (ua) R w R definition of reversal = x R w R rewrite ua as x

Relations on Strings aaa is a substring of aaabbbaaa aaaaaa is not a substring of aaabbbaaa aaa is a proper substring of aaabbbaaa Every string is a substring of itself. ε is a substring of every string.

The Prefix Relations s is a prefix of t iff: x Σ* (t = sx). s is a proper prefix of t iff: s is a prefix of t and s t. Examples: The prefixes of abba are: ε, a, ab, abb, abba. The proper prefixes of abba are: ε, a, ab, abb. Every string is a prefix of itself. ε is a prefix of every string.

The Suffix Relations s is a suffix of t iff: x Σ* (t = xs). s is a proper suffix of t iff: s is a suffix of t and s t. Examples: The suffixes of abba are: ε, a, ba, bba, abba. The proper suffixes of abba are: ε, a, ba, bba. Every string is a suffix of itself. ε is a suffix of every string.

Defining a Language A language is a (finite or infinite) set of strings over a finite alphabet Σ. Examples: Let Σ = {a, b} Some languages over Σ:, {ε}, {a, b}, {ε, a, aa, aaa, aaaa, aaaaa} The language Σ* contains an infinite number of strings, including: ε, a, b, ab, ababaa.

Example Language Definitions L = {x {a, b}* : all a s precede all b s} ε, a, aa, aabbb, and bb are in L. aba, ba, and abc are not in L.

Example Language Definitions L = {x : y {a, b}* : x = ya} Simple English description:

More Example Language Definitions L = {} = L = {ε}

English not Well-Defined L = {w: w is a sentence in English}. Examples: Kerry hit the ball. Colorless green ideas sleep furiously. The window needs fixed. Ball the Stacy hit blue.

A Halting Problem Language L = {w: w is a C program that halts on all inputs}. Well specified Useful

Enumeration More useful: lexicographic order Shortest first Within a length, dictionary order This way, can enumerate a language The lexicographic enumeration of: {w {a, b}* : w is even} :

How Large is a Language? The smallest language over any Σ is, with cardinality 0. The largest is Σ*. How big is it?

How Large is a Language? Theorem: If Σ then Σ* is countably infinite. Proof: The elements of Σ* can be lexicographically enumerated by the following procedure: Enumerate all strings of length 0, then length 1, then length 2, and so forth. Within the strings of a given length, enumerate them in dictionary order. This enumeration is infinite since there is no longest string in Σ*. Since there exists an infinite enumeration of Σ*, it is countably infinite. n

How Large is a Language? So the smallest language has cardinality 0. The largest is countably infinite. So every language is either finite or countably infinite.

How Many Languages Are There? Theorem: If Σ then the set of languages over Σ is uncountably infinite. Proof: The set of languages defined on Σ is P(Σ*). Σ* is countably infinite. If S is a countably infinite set, P(S) is uncountably infinite. So P(Σ*) is uncountably infinite. n

Functions on Languages Set operations Union Intersection Complement Language operations Concatenation Kleene star

Concatenation of Languages If L 1 and L 2 are languages over Σ: Examples: L 1 L 2 = {w Σ* : s L 1 ( t L 2 (w = st))} L 1 = {cat, dog} L 2 = {apple, pear} L 1 L 2 ={catapple, catpear, dogapple, dogpear} L{ε} = {ε}l = L L = L =

Kleene Star Kleene operator L* = {ε} {w Σ* : k 1 ( w 1, w 2, w k L (w = w 1 w 2 w k ))} In other words, L* is the set of strings that can be formed by concatenating together zero or more strings in L Is closed under concatenation Example: L = {dog, cat, fish} L* = {ε, dog, cat, fish, dogdog, dogcat, fishcatfish, fishdogdogfishcat, } If L =, L* =? If L = {ε}, L* =? Is L* the concatenation closure of L?

Stephen Cole Kleene 1909 1994, mathematical logician One of many distinguished students (e.g., Alan Turing) of Alonzo Church at Princeton. Best known as a founder of the branch of mathematical logic known as recursion theory. Also invented regular expressions. Kleeneness is next to Godelness Cleanliness is next to Godliness A pic of pic by me Obituaries from New York Times Modern computer science, in large part, grew out of recursion theory, and his work has been tremendously influential for years. One of the reasons for the importance of recursion theory is that it gives us a way of showing that some mathematical problems can never be solved, no matter how much computing power is available

The + Operator Sometimes useful to require at least one element of L be selected L + = L L* if ε L then L + = L* - {ε} Otherwise L + = L* L + is the closure of L under concatenation. recall closure of L needs to contain L

Concatenation and Reverse of Languages Theorem: (L 1 L 2 ) R = L 2R L 1R. Proof: x ( y ((xy) R = y R x R )) Theorem 2.1 (L 1 L 2 ) R = {(xy) R : x L 1 and y L 2 } Definition of concatenation of languages = {y R x R : x L 1 and y L 2 } Lines 1 and 2 = L 2R L R 1 Definition of concatenation of languages

What About Meaning? A n B n = {a n b n : n 0}. Do these strings mean anything? But some languages are useful because their strings have meanings. English, Chinese, Java, C++, Perl Need a precise way to map each string to its meaning (semantics).

Semantic Interpretation Functions A semantic interpretation function assigns meanings to the strings of a language. Language is mostly infinite, no way to map each string Must define a function that knows the meanings of basic units and can compose those meanings according to some fixed set of rules, to build meanings for larger expressions Compositional semantic interpretation function How about English? Compositional? I gave him a mug I m going to give him a piece of my mind

Semantic Interpretation Functions When we define a formal language for a specific purpose, we design it so that there exists a compositional semantic interpretation function For example, C++ and Java One significant property of semantic interpretation functions for useful languages is that they are generally not one-to-one Chocolate, please, I d like chocolate, I ll have chocolate x++, x = x + 1