Languages and Automata

Languages and Automata: What are the Big Ideas?

Tuesday, August 30, 2011
Reading: Sipser 0.1
CS235 Languages and Automata
Department of Computer Science, Wellesley College

Why Take CS235?

1. It's required for CS majors. (bad reason)
2. It teaches some important Big Ideas. (good reason)
3. It's a lead-in to Programming Languages (CS251) and Compilers (CS301)*, in terms of both theory and practice (scanning, parsing, ML programming).
4. It's a lead-in to CS310 Theoretical Foundations in Cryptology. Was taught by Randy in Spring '11, and is planned again for Spring '13.

* Hasn't been taught for a while, but might be offered again in the future if there's sufficient interest.

The Big Ideas

1. Models of Computation
2. Undecidability: Not Everything is Computable
3. Mathematical Foundations for CS
4. Cool Applications

#1 Models of Computation: Regular Languages

E.g.: all binary strings consisting of all 1s or sequences of 10s (includes the empty string, written ε)

Regular Expression: 1* ∪ (10)*

Right-Linear Grammar:
  S → A
  S → B
  A → ε
  A → 1A
  B → ε
  B → 10B

Deterministic Finite Automaton: [state diagram not reproduced]

Nondeterministic Finite Automaton: [state diagram not reproduced]
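To make the equivalence of these models concrete, here is a minimal Python sketch that checks the regular expression against a hand-built DFA for the same language. The state names (start, one, ones, ten, ten_one) and the helper names are assumptions made for illustration; they are not taken from the slide's diagram.

```python
import re
from itertools import product

# Regular expression for "all 1s, or repetitions of 10".
REGEX = re.compile(r"1*|(?:10)*")

# Transition table of a DFA for the same language.
# Any (state, symbol) pair missing from the table leads to the dead state.
DELTA = {
    ("start", "1"): "one",       # read "1"
    ("one", "1"): "ones",        # read "11": committed to 1*
    ("one", "0"): "ten",         # read "10": committed to (10)*
    ("ones", "1"): "ones",
    ("ten", "1"): "ten_one",     # halfway through another "10"
    ("ten_one", "0"): "ten",
}
ACCEPTING = {"start", "one", "ones", "ten"}


def dfa_accepts(w: str) -> bool:
    """Run the DFA on w and report whether it ends in an accepting state."""
    state = "start"
    for ch in w:
        state = DELTA.get((state, ch), "dead")
    return state in ACCEPTING


def regex_accepts(w: str) -> bool:
    """Report whether the whole string w matches the regular expression."""
    return REGEX.fullmatch(w) is not None


def all_binary_strings(max_len: int):
    """Yield every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)


# Strings on which the two descriptions disagree (there should be none).
mismatches = [w for w in all_binary_strings(8)
              if dfa_accepts(w) != regex_accepts(w)]
```

Exhaustive testing up to a fixed length is only evidence, not a proof; proving that the two descriptions define the same language is the kind of argument the course develops.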

#1 Models of Computation: Context-Free Languages

E.g.: all binary strings with an equal number of 0s and 1s. Note: this language is not regular.

Context-Free Grammar:
  S → ε
  S → SS
  S → 0S1
  S → 1S0

Nondeterministic Pushdown Automaton: [state diagram not reproduced]

Q: Can we create a deterministic recognizer/parser for a CFL?
A: Yes, but can only do so efficiently for a restricted subclass of CFLs.

#1 Models of Computation: Turing Machines

E.g.: all strings of the form a^n b^n c^n. (This is not context-free.)

[Diagram: a tape holding a a b b c c ..., read by a finite control in state s]

#1 TM State Transitions for {w#w | w ∈ {a,b}*}

[State diagram not reproduced: states q1 through q8, q_accept, and q_reject. This picture is adapted from p. 145 of Sipser.]

#1 Models of Computation: The Chomsky Hierarchy

From most to least powerful:

o Unrestricted grammar = TM
o Context-sensitive grammar = NLBA
o CFG = NPDA
o Right-linear grammar = DFA = NFA = Reg. Exp.
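The two example machines above can be imitated in a few lines of Python. This is a rough sketch under stated assumptions, not the automata drawn on the slides: the first function uses a single stack, in the spirit of a pushdown automaton but deterministically, to recognize strings with equally many 0s and 1s; the second mimics the Turing machine's strategy for w#w by crossing off matching symbols on either side of the #. Both function names are made up for illustration.

```python
def equal_zeros_ones(w: str) -> bool:
    """Stack-based recognizer for binary strings with as many 0s as 1s.

    The stack only ever holds copies of one symbol: the one currently
    in surplus. A symbol of the other kind cancels one stacked symbol.
    """
    stack = []
    for ch in w:
        if ch not in "01":
            return False
        if stack and stack[-1] != ch:
            stack.pop()          # cancel against the surplus symbol
        else:
            stack.append(ch)     # extend the surplus
    return not stack             # accept only if everything cancelled


def accepts_w_hash_w(s: str) -> bool:
    """Recognize {w#w | w in {a,b}*} by crossing off matched symbols."""
    tape = list(s)
    if tape.count("#") != 1 or any(c not in "ab#" for c in tape):
        return False
    middle = tape.index("#")
    left, right = 0, middle + 1
    while left < middle:
        # Compare the next unmarked symbol on each side of the '#'.
        if right >= len(tape) or tape[left] != tape[right]:
            return False
        tape[left] = tape[right] = "x"   # cross both symbols off
        left += 1
        right += 1
    return right == len(tape)            # nothing may remain on the right
```

The second function keeps two index variables where the real machine would shuttle its head back and forth across the tape; random access is a convenience of Python, not something the Turing machine model provides.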

#1 Models of Computation: Language Map

Language classes on the map, from largest to smallest, with example languages in each:

o RE = Recursively Enumerable (Turing-Recognizable/Acceptable) Languages, semi-decidable+: HALT_TM, ACCEPT_TM
o Dec = Recursive (Turing-Decidable) Languages, decidable: a^n b^n c^n, ww, ACCEPT_DFA, EMPTY_DFA, EQ_DFA, INFINITE_DFA, ACCEPT_CFG, EMPTY_CFG
o CFL = Context-Free Languages: a^n b^n, ww^R
o Reg = Regular Languages: a*b*, (a+b)*bbb(a+b)*

#1 Models of Computation: Forlan

o Forlan is a calculator for models of computation: a set of modules in the Standard ML programming language that we'll be using in class and assignments to do hands-on experiments with the models of computation we'll be learning.
o Created by Alley Stoughton while at Kansas State University.
o See http://alleystoughton.us/forlan/ for:
  - Draft textbook (free online)
  - Textbook slides
  - Software = Standard ML modules, Java programs (we'll use these in CS235)
  - Alley's KSU course: CIS 570

#2 Undecidability: Not Everything is Computable

[Diagram: the languages decided by Turing Machines form a small region inside a vast universe of languages that cannot be decided by Turing Machines.]

The Halting Problem: It is impossible to write a program that always determines whether or not other programs halt.

Rice's Theorem: Any nontrivial question we might ask about programs in general cannot be answered by other programs.

#2: The Landscape of Undecidability

Regions of Lan (all languages), with example languages in each. The complements listed here were drawn on the slide as names with a bar over them.

o not-even-semi-decidable (outside both RE and co-RE): EQ_TM, complement of EQ_TM
o RE (semi-decidable+): HALT_TM, ACCEPT_TM, complement of EMPTY_TM
o co-RE (semi-decidable-): complement of HALT_TM, complement of ACCEPT_TM, EMPTY_TM
o Dec (decidable): where RE and co-RE overlap

Note: undecidable = semi-decidable+ ∪ semi-decidable- ∪ not-even-semi-decidable
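The idea behind the Halting Problem can be previewed in Python. The sketch below is a simplification under loud assumptions: `halts` stands for any claimed halting decider (a hypothetical function, since no correct one can exist), and programs are modeled as zero-argument Python callables rather than as program text, which is what the real proof works with. Given any such `halts`, we build a program that does the opposite of whatever `halts` predicts about it.

```python
from typing import Callable

Program = Callable[[], object]


def make_paradox(halts: Callable[[Program], bool]) -> Program:
    """Build a program that contradicts the prediction `halts` makes about it."""
    def paradox() -> str:
        if halts(paradox):
            while True:        # predicted to halt, so loop forever
                pass
        return "halted"        # predicted to loop, so halt at once
    return paradox


def always_says_loops(program: Program) -> bool:
    """A (wrong) candidate decider that claims no program ever halts."""
    return False


# The candidate predicts that `paradox` loops forever, yet calling it returns.
paradox = make_paradox(always_says_loops)
prediction = always_says_loops(paradox)
result = paradox()
```

A candidate that answers True instead is refuted the other way: its paradox program would loop forever, so we deliberately do not run that case here. Either way the candidate is wrong about at least one program, which is the heart of the diagonalization argument.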

#2: With Great Power Comes Great Uncomputability

Turing machines and equivalent models of computation (lambda calculus, Java, SML, etc.) are far more powerful than finite automata and pushdown automata. But the power is gained via features that can cause programs to loop infinitely. If we want the power, we must live with the looping.

#3 Mathematical Foundations of Computer Science

o Explain computation by using standard mathematical structures:
  - Sets
  - Strings
  - Tuples
  - Relations
  - Functions
  - Languages = sets of strings
  - Trees
  - Graphs
o Use many proof techniques, including:
  - Proof by construction
  - Proof by algebra
  - Proof by picture
  - Proof by contradiction (including diagonalization)
  - Proof by induction
o Prove negative results as well as positive results:
  - Pumping lemma for regular languages
  - Pumping lemma for context-free languages
  - Undecidability results

#4 Cool Applications of the Material in this Course

o Using regular expressions in Java, JavaScript, Python, Emacs, etc.
o Using Unix tools like grep, sed, and awk.
o Writing virus detection scanners.
o Designing hardware (CS240) and network protocols (CS242).
o Programming language implementation: creating scanners and parsers using tools like PEG parsers and ANTLR.
o ML programming (appetizer for CS251, CS301).
o Knowing which problems not to solve.

#4 Applications: Compiler Structure

Compiler pipeline, in order (stages, with the data passed between them in parentheses):

  (Source Program, a character stream)
  Lexer (a.k.a. Scanner, Tokenizer)
  (Tokens)
  Parser
  (Abstract Syntax Tree, AST)
  Type Checker
  (Intermediate Representation)
  Semantic Analysis
  (Intermediate Representation)
  Optimizer
  (Intermediate Representation)
  Code Generator
  (Machine code or byte code)

The slide groups the stages into a Front End (CS235), Middle Stages (CS251/CS301), and a Back End (CS301).

Global Analysis Information (Symbol and Attribute Tables) is used by all phases of the compiler.
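The first stage of this pipeline, the lexer, is a direct application of regular expressions. Here is a minimal Python sketch, assuming a made-up set of token categories (COMMENT, WS, INTLIT, IDENT, OP, PUNCT, KEYWORD) for a small C-like language; it is not the scanner used in CS235.

```python
import re

# Token categories and their regular expressions (illustrative choices).
# Order matters: COMMENT must be tried before OP so "//" is not read as "/".
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),
    ("WS",      r"\s+"),
    ("INTLIT",  r"\d+"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"&&|\|\||<=|>=|==|!=|[<>*+\-/=]"),
    ("PUNCT",   r"[(){};]"),
]
KEYWORDS = {"if", "else", "return"}
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_SPEC))


def tokenize(src: str) -> list[tuple[str, str]]:
    """Convert a character stream into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(src):
        match = MASTER.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("WS", "COMMENT"):
            continue                      # the lexer drops these
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


SAMPLE = "if (num > 0 && num <= top) { // Is num in range?\n return c*num } else {return 0;}"
sample_tokens = tokenize(SAMPLE)
```

Combining all the patterns into one alternation and dispatching on the matched group name is one common way to write a small lexer in Python; scanner generators instead compile the patterns into a single DFA.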

#4 Applications: Front End Example

Source program:

  if (num > 0 && num <= top) { // Is num in range?
    return c*num
  } else {return 0;}

Lexer (ignores whitespace, comments) produces the tokens:

  if ( num > 0 && num <= top ) { return c * num } else { return 0 ; }

Parser (creates AST) produces:

  conditional
    logical: and
      relational: greater
        num
        intlit 0
      relational: less-or-equal
        num
        top
    return
      arithmetic: times
        c
        num
    return
      intlit 0

Administrivia

o Syllabus, problem sets, office hours, etc. are on the CS235 home page: http://cs.wellesley.edu/~cs235
o Weekly problem sets, typically due Wed. @ midnight (start early).
o I'll try to grade psets in a timely fashion. The next problem set won't be due until at least one day after the previous one has been returned.
o Two 4-hour self-scheduled take-home midterm exams.
o There will be either a final scanning/parsing project or a 2.5-hour self-scheduled final exam (I'll make a decision soon).

Textbook

Sipser's Introduction to the Theory of Computation (2nd ed.)

Great book, but
1. we'll only cover the first ~200 pages, and
2. it's expensive:
   Bookstore: ~$1?? new
   Amazon: ~$113 new (cheaper used)

I have put copies of the book in the CS Library (in SCI E125). Some are 1st ed., which is OK for most purposes.

Other Resources

o Other particularly useful books in the E125 CS library:
  - Dexter Kozen's Automata and Computability
  - L. C. Paulson's ML for the Working Programmer
  - Andrew Appel's Modern Compiler Implementation in ML
o Many other automata/computability books in E125.
o I'll also hand out a few articles in class.

Finding Help

o My office hours (in SCI 257, not my office):
  Mon: 2:40-5pm
  Tue: 1:30-4pm
  Wed: 4-6pm
  I'll also answer questions after lecture. Sometimes I'll need to reschedule due to meetings/talks.
o Amanda Daigle will hold weekly drop-in tutoring hours (TBA).
o Q&A: http://piazza.com/class#fall2011/cs235
o Google group (announcements/coordination): http://groups.google.com/group/wellesley-cs235-fall11
o Form study groups and work on psets together (subject to the collaboration policies below).

Collaboration Policies

o You may talk with anyone about any non-exam problem, but must write up all proofs/programs on your own.
o On every assignment, you must list all people with whom you collaborated on any problem.
o You must not look at anyone else's written proofs/programs.
o You must not look at any solutions from previous semesters of CS235.
o You may find other information in textbooks and on the Internet, but must cite any source you use for solutions.
o I consider violations of the above policies to be Honor Code violations and will report them to the Honor Code Council.

Acknowledgments

o Randy Shull is responsible for many of the assignment problems and slides (esp. ones with clip art).
o Chelsea Hoover extended Forlan to handle nondeterministic pushdown automata.

Questions? I love questions!