System & Network Engineering. Regular Expressions ESA 2008/2009. Mark v/d Zwaag, Eelco Schatborn 22 september 2008
|
|
- Corey Conley
- 5 years ago
- Views:
Transcription
1 1 Regular Expressions ESA 2008/2009 Mark v/d Zwaag, Eelco Schatborn 22 september 2008
2 Today: Regular1 Expressions and Grammars Formal Languages Context-free grammars; BNF, ABNF Unix Regular Expressions
3 Regular 1 Expressions A regular expression (regexp, regex, regxp) is a string (a word), that, according to certain syntax rules, describes a set of strings (a language). Regular Expressions are used in many text (unix) editors, tools and programming languages to search for patterns in a text, and for substitution of strings
4 1 History Theory of formal languages Kleenes algebra of regular sets Ken Thompson introduced the RE notation to ed Regex is used in grep, awk, emacs, vi, perl.
5 Regular Expressions 1 in formal languages theory a regular expression represents a set of strings/words (a language). Regular Expressions are made up of constants and operators Given a finite alphabet Σ, the following constants are defined: empty set represents the empty set empty string (length 0) ɛ represents the set {ɛ} literals a character A in Σ represents the set {A}
6 Regular Expressions 1 in formal languages theory The operators Assume Regular Expressions R and S concatenation (RS) represents the set {αβ α R, β S} choice (R S) represents a choice between R and S iteration (R) represents the closure of R under concatenation; R = {ɛ} R RR RRR Binding strength: Kleene star > concatenation > choice ((ab)c) can be written as abc (a (b(c) )) as a bc.
7 Regular Expressions 1 in formal languages theory Examples Let Σ = {0, 1}, then 00 represents {00} represents { }
8 Regular Expressions 1 in formal languages theory Examples Let Σ = {0, 1}, then 00 represents {00} represents { } 0 1 represents {0, 1} represents {10, 01}
9 Regular Expressions 1 in formal languages theory Examples Let Σ = {0, 1}, then 00 represents {00} represents { } 0 1 represents {0, 1} represents {10, 01} 0 represents {ɛ, 0, 00, 000,...} (01) represents {ɛ, 01, 0101, ,...}
10 Regular Expressions 1 in formal languages theory Examples Let Σ = {0, 1}, then 00 represents {00} represents { } 0 1 represents {0, 1} represents {10, 01} 0 represents {ɛ, 0, 00, 000,...} (01) represents {ɛ, 01, 0101, ,...} (0 00)1 represents {0, 00, 01, 001, 011, 0011,...} (0 00)1 =
11 Regular Expressions 1 in formal languages theory More Examples a b represents {a, ɛ, b, bb,...} (a b) = b (ab ) ab (c ɛ) represents the set of strings that begin with a single a, followed by zero or more b, and ending in an optional c.
12 Regular 1 Languages languages represented by Regular Expressions are called regular languages they correspond with so called type 3 grammars in the Chomsky hierarchy Example of a Regular grammar S as S ba A ɛ A ca with startsymbol S corresponds with regular expression
13 Regular 1 Languages languages represented by Regular Expressions are called regular languages they correspond with so called type 3 grammars in the Chomsky hierarchy Example of a Regular grammar S as S ba A ɛ A ca with startsymbol S corresponds with regular expression a bc
14 Context-Free 1 languages Regular Expressions/Languages/Grammars are less expressive than context-free grammars (CFGs) Example Context-Free Language a k b k S asb S ɛ (CFG) with start symbol S. This language cannot be expressed by a regular expression.
15 Context-Free 1 languages Regular Expressions/Languages/Grammars are less expressive than context-free grammars (CFGs) Example Context-Free Language a k b k S asb S ɛ (CFG) with start symbol S. This language cannot be expressed by a regular expression. Note For instance regular expressions in Perl are not strictly regular. Extra expressiveness can be detrimental to effectiveness Worst-case complexity or matching a string agains a Perl RE is time exponential to the length of the input
16 Backus-Naur Form System & Network Engineering Example 1 BNF Grammars a format for defining CFG s <bit> ::= 0 1 <expr> ::= <bit> (<expr> + <expr>) (<expr> * <expr>) This BNF grammar generates the strings 0, 1, (0 + 1), (1 * (1 + 1)),... Non-terminals <bit>, <expr> Terminals 0, 1, (, ), *, +, (spatie)
17 Backus-Naur Form System & Network Engineering Example 1 BNF Grammars a format for defining CFG s <bit> ::= 0 1 <expr> ::= <bit> (<expr> + <expr>) (<expr> * <expr>) This BNF grammar generates the strings 0, 1, (0 + 1), (1 * (1 + 1)),... Non-terminals <bit>, <expr> Terminals 0, 1, (, ), *, +, (spatie) Note This language is context-free, but not regular; it cannot be described using a regular expression
18 ABNF: 1 Augmented BNF RFC 2234, Augmented BNF for Syntax Specifications: ABNF. Obsolete. RFC 4234, Augmented BNF for Syntax Specifications: ABNF. Specificatie of ABNF in ABNF
19 1ABNF: Rules Rules have names like elements, rule0 and char-a-z Rulenames may be put inside brackets <elements> Case-insensitive: <rulename>, <rulename>, and <RULENAME> refer to the same rule
20 1ABNF: Rules Rules have names like elements, rule0 and char-a-z Rulenames may be put inside brackets <elements> Case-insensitive: <rulename>, <rulename>, and <RULENAME> refer to the same rule A rule is defined by a sequence name = elements ; comment crlf
21 ABNF: 1 Terminal Values Terminal values: characters A character encoding like US-ASCII may be used %b (binary 65, US-ASCII A ) %x42 (hexadecimal 66, US-ASCII B ) %d67 (decimal 67, US-ASCII C ) %d13.10 (the sequence CR,LF) Literal text: abc (the sequence a,b,c) NB: case-insensitive US-ASCII
22 1 Example Example 1 rulename = "abc" and rulename = "ABc" both generate the set { abc, Abc, abc, abc, ABc, abc, AbC, ABC}
23 1 Example Example 1 rulename = "abc" and rulename = "ABc" both generate the set { abc, Abc, abc, abc, ABc, abc, AbC, ABC} Example 2 rulename = %d97 %d98 %d99 and rulename = %d both generate the set { abc }
24 ABNF: 1 Basic operators Concatenation Alternatives Repetition (variable) Grouping Comments
25 ABNF: 1 Concatenation Rule = Rule1 Rule2 Example magic = xyzzy foo bar xyzzy = "xyzzy" foo = "foo" bar = "bar" magic "xyzzyfoobar"
26 ABNF: 1 Alternatives Rule = Rule1 / Rule2 Sometimes uses pipe ( ) instead of / Example magic = xyzzy / foo / bar magic "xyzzy", but also magic "foo", and also magic "bar"
27 ABNF: 1Variable Repetition Rule = <n> * <m> Rule1 n and m are optional decimal values Default for n is 0 and for m is Example magic = <2> * <3> xyzzy magic "xyzzyxyzzy" magic "xyzzyxyzzyxyzzy"
28 Rule = ( Rule1 ) ABNF: 1 Grouping Only used for parsing (syntax) Has no semantic counterpart Example magictoo = ( magic ) magictoo has the same productions as magic
29 Rule = ( Rule1 ) ABNF: 1 Grouping Only used for parsing (syntax) Has no semantic counterpart Example magictoo = ( magic ) magictoo has the same productions as magic Example elem (foo / bar) blat versus elem foo / bar blat Use grouping to avoid misunderstanding (elem foo) / (bar blat)
30 ABNF: 1 Comment Rule =... ; Followed by an explanation Example magic = xyzzy "," foo "," bar ; comma separated magic magic "xyzzy,foo,bar"
31 ABNF: 1 More operators Incremental alternative Value ranges Optional presence Specific repetition
32 ABNF: Incremental 1 alternative Alternatives may be added later in extra rules Rule =/ Rule1 Example magic = "xyzzy" magic =/ "foo" magic =/ "bar" Equivalently: magic = "xyzzy" / "foo" / "bar"
33 ABNF: 1 Value ranges Uses - as range indicator in terminal specifications Example DIGIT = %x30-39 ; "0" / "1" /... / "9" UPPER = %x41-5a ; "A" / "B" /... / "Z"
34 ABNF: 1 Value ranges Uses - as range indicator in terminal specifications Example DIGIT = %x30-39 ; "0" / "1" /... / "9" UPPER = %x41-5a ; "A" / "B" /... / "Z" DIGIT is a Core Rule. More core rules: ALPHA = %x41-5a / %x61-7a ; A-Z / a-z BIT = "0" / "1"
35 ABNF: 1 Optional presence Rule = [ Rule1 ] Equivalently: Rule = *<1> Rule1 Example magic = [ "xyzzy" ] magic "xyzzy", but also magic ""
36 ABNF: 1 Specific repetition Rule = <n> Rule1 Equivalently: Rule = <n> * <n> Rule1 Example magic = <3> "xyzzy" magic "xyzzyxyzzyxyzzy"
37 1Unix regexps The following syntax is more or less standard in many unix tools and programming languages Basic rules 1. Every printable character that is not a meta character is a regular expression that represents itself, for example a for the letter a 2.. represents any single character except newline
38 1Unix regexps The following syntax is more or less standard in many unix tools and programming languages Basic rules 1. Every printable character that is not a meta character is a regular expression that represents itself, for example a for the letter a 2.. represents any single character except newline 3. ^ represents the beginning of a line 4. $ represents the ending of a line
39 1Unix regexps The following syntax is more or less standard in many unix tools and programming languages Basic rules 1. Every printable character that is not a meta character is a regular expression that represents itself, for example a for the letter a 2.. represents any single character except newline 3. ^ represents the beginning of a line 4. $ represents the ending of a line 5. \ followed by a metacharacter represents the character itself Meaning \. represents a dot. 6. [E] represents a single character. The characterization of E is put in brackets.
40 1 Examples [a] represents a [abc] represents any of a, b, and c [a-z] represents a character in the range a z (ordered according to the ASCII notation)
41 1 Examples [a] represents a [abc] represents any of a, b, and c [a-z] represents a character in the range a z (ordered according to the ASCII notation) [A-Za-z0-9] represents a digit or character
42 1 Examples [a] represents a [abc] represents any of a, b, and c [a-z] represents a character in the range a z (ordered according to the ASCII notation) [A-Za-z0-9] represents a digit or character [^E] represents every character that is not represented by [E] [acq-z] represents a, c, or a character in q z.
43 1Unix regexps Inductive rules 1. if A and B are Regular Expressions then AB is a regular expression (concatenation)
44 1Unix regexps Inductive rules 1. if A and B are Regular Expressions then AB is a regular expression (concatenation) 2. A B is a regular expression (choice/unification)
45 1Unix regexps Inductive rules 1. if A and B are Regular Expressions then AB is a regular expression (concatenation) 2. A B is a regular expression (choice/unification) 3. A* is a regular expression (Kleene star, zero or more) 4. A+ is a regular expression (one or more) 5. A? is a regular expression (zero or one)
46 1Unix regexps Inductive rules 1. if A and B are Regular Expressions then AB is a regular expression (concatenation) 2. A B is a regular expression (choice/unification) 3. A* is a regular expression (Kleene star, zero or more) 4. A+ is a regular expression (one or more) 5. A? is a regular expression (zero or one) 6. (A) is a regular expression (awk, egrep, perl) 7. \(A\) is a regular expression (vi, sed, grep)
47 1Unix regexps Inductive rules 1. A{m, n} for integers m and n represents m to n concatenated instances of A 2. A{m} represents m concatenated instances of A Iteration binds stronger than concatenation which binds stronger than choice, so A BC* = A (B(C)*)
48 $ cat aap (aap) aap $ grep (aap) aap (aap) $ grep -E (aap) aap (aap) aap $ cat noot not noot nooot $ grep o\{2\} noot noot nooot $ grep -E o{3} noot nooot System & Network Engineering 1 Examples
49 More 1 Examples regex does doesn t A. A9, Aa, AA aa, AAA a.c abc, aac, a4c, a+c ABC, abcd, abbc a\.c a.c abc
50 More 1 Examples regex does doesn t A. A9, Aa, AA aa, AAA a.c abc, aac, a4c, a+c ABC, abcd, abbc a\.c a.c abc.ap aap, lap, hap [al]ap aap, lap [\^al]ap hap, kap aap, lap [al]+ap aap, lap, aaap, alap, laap, llap
51 More 1 Examples regex does doesn t A. A9, Aa, AA aa, AAA a.c abc, aac, a4c, a+c ABC, abcd, abbc a\.c a.c abc.ap aap, lap, hap [al]ap aap, lap [\^al]ap hap, kap aap, lap [al]+ap aap, lap, aaap, alap, laap, llap [^A-Z] 5, b A, Q, W [abc]* aaab, cba Iets.* Iets, Iets is beter dan niets Iets.+ Iets is beter dan niets Iets
52 More 1 Examples regex does doesn t A. A9, Aa, AA aa, AAA a.c abc, aac, a4c, a+c ABC, abcd, abbc a\.c a.c abc.ap aap, lap, hap [al]ap aap, lap [\^al]ap hap, kap aap, lap [al]+ap aap, lap, aaap, alap, laap, llap [^A-Z] 5, b A, Q, W [abc]* aaab, cba Iets.* Iets, Iets is beter dan niets Iets.+ Iets is beter dan niets Iets [ab]{4} abba, baba aba a(bc)*d ad, abc, abcbcd abcxd, bcbcd
53 Dutch 1 Examples Telephone number (land line) ^[-0-9+() ]*$ ^[0-9]{3,4}-[0-9]{6,7}$ Postal code ^[1-9]{1}[0-9]{3}[:space:]?[A-Z]{2}
54 More 1 Examples [A-Za-z0-9_-]+([.]{1}[A-Za-z0-9_-]+)*@[A-Za-z0-9-]+ ([.]{1}[A-Za-z0-9-]+)+ E.P.Schatborn@uva.nl Mobile ^06[-]?[0-9]{8}$
55 One1 More Example $ cat sedscr s/\([a-z][a-z]*\){\([^}]*\)}/\1<\2>/
56 One1 More Example $ cat sedscr s/\([a-z][a-z]*\){\([^}]*\)}/\1<\2>/ $ cat b a{ab} A{d} abc{aba} ABC(a) ABC{a}ab
57 One1 More Example $ cat sedscr s/\([a-z][a-z]*\){\([^}]*\)}/\1<\2>/ $ cat b a{ab} A{d} abc{aba} ABC(a) ABC{a}ab $ sed -f sedscr b a{ab} A<d> abc{aba} ABC(a) ABC<a>ab
58 1 Assignment 1 create a unix regular expression to show all lines that contain URI s in a html file. (hrefs... ) use grep -E use wget to retrieve a page look at pages with many URI s, like 2 create a regexp to read all telephone numbers correctly. (correct length etc... ) 3 create a regular expression using grep that will remove all lines with only comments from any bash script file
We have the following problem
We have the following problem need to check a lot of files (the 70 or so files comprising the source for this book, actually) to confirm that each file contained `SetSize exactly as oben (or as rarely)
More informationLecture 18 Regular Expressions
Lecture 18 Regular Expressions In this lecture Background Text processing languages Pattern searches with grep Formal Languages and regular expressions Finite State Machines Regular Expression Grammer
More informationCS 301. Lecture 05 Applications of Regular Languages. Stephen Checkoway. January 31, 2018
CS 301 Lecture 05 Applications of Regular Languages Stephen Checkoway January 31, 2018 1 / 17 Characterizing regular languages The following four statements about the language A are equivalent The language
More informationNon-deterministic Finite Automata (NFA)
Non-deterministic Finite Automata (NFA) CAN have transitions on the same input to different states Can include a ε or λ transition (i.e. move to new state without reading input) Often easier to design
More informationPrinciples of Programming Languages COMP251: Syntax and Grammars
Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2007
More informationLanguages and Compilers
Principles of Software Engineering and Operational Systems Languages and Compilers SDAGE: Level I 2012-13 3. Formal Languages, Grammars and Automata Dr Valery Adzhiev vadzhiev@bournemouth.ac.uk Office:
More informationCS 314 Principles of Programming Languages. Lecture 3
CS 314 Principles of Programming Languages Lecture 3 Zheng Zhang Department of Computer Science Rutgers University Wednesday 14 th September, 2016 Zheng Zhang 1 CS@Rutgers University Class Information
More informationLexical Analysis (ASU Ch 3, Fig 3.1)
Lexical Analysis (ASU Ch 3, Fig 3.1) Implementation by hand automatically ((F)Lex) Lex generates a finite automaton recogniser uses regular expressions Tasks remove white space (ws) display source program
More informationCPS 506 Comparative Programming Languages. Syntax Specification
CPS 506 Comparative Programming Languages Syntax Specification Compiling Process Steps Program Lexical Analysis Convert characters into a stream of tokens Lexical Analysis Syntactic Analysis Send tokens
More informationECS 120 Lesson 7 Regular Expressions, Pt. 1
ECS 120 Lesson 7 Regular Expressions, Pt. 1 Oliver Kreylos Friday, April 13th, 2001 1 Outline Thus far, we have been discussing one way to specify a (regular) language: Giving a machine that reads a word
More informationJNTUWORLD. Code No: R
Code No: R09220504 R09 SET-1 B.Tech II Year - II Semester Examinations, April-May, 2012 FORMAL LANGUAGES AND AUTOMATA THEORY (Computer Science and Engineering) Time: 3 hours Max. Marks: 75 Answer any five
More informationCSE 311 Lecture 21: Context-Free Grammars. Emina Torlak and Kevin Zatloukal
CSE 311 Lecture 21: Context-Free Grammars Emina Torlak and Kevin Zatloukal 1 Topics Regular expressions A brief review of Lecture 20. Context-free grammars Syntax, semantics, and examples. 2 Regular expressions
More informationPerl Regular Expressions. Perl Patterns. Character Class Shortcuts. Examples of Perl Patterns
Perl Regular Expressions Unlike most programming languages, Perl has builtin support for matching strings using regular expressions called patterns, which are similar to the regular expressions used in
More informationannouncements CSE 311: Foundations of Computing review: regular expressions review: languages---sets of strings
CSE 311: Foundations of Computing Fall 2013 Lecture 19: Regular expressions & context-free grammars announcements Reading assignments 7 th Edition, pp. 878-880 and pp. 851-855 6 th Edition, pp. 817-819
More informationCMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters
: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Scanner Parser Static Analyzer Intermediate Representation Front End Back End Compiler / Interpreter
More informationRegular Expressions. Chapter 6
Regular Expressions Chapter 6 Regular Languages Generates Regular Language Regular Expression Recognizes or Accepts Finite State Machine Stephen Cole Kleene 1909 1994, mathematical logician One of many
More informationDVA337 HT17 - LECTURE 4. Languages and regular expressions
DVA337 HT17 - LECTURE 4 Languages and regular expressions 1 SO FAR 2 TODAY Formal definition of languages in terms of strings Operations on strings and languages Definition of regular expressions Meaning
More informationLecture 4: Syntax Specification
The University of North Carolina at Chapel Hill Spring 2002 Lecture 4: Syntax Specification Jan 16 1 Phases of Compilation 2 1 Syntax Analysis Syntax: Webster s definition: 1 a : the way in which linguistic
More informationRegular Expressions 1
Regular Expressions 1 Basic Regular Expression Examples Extended Regular Expressions Extended Regular Expression Examples 2 phone number 3 digits, dash, 4 digits [[:digit:]][[:digit:]][[:digit:]]-[[:digit:]][[:digit:]][[:digit:]][[:digit:]]
More informationRegular Expressions. Todd Kelley CST8207 Todd Kelley 1
Regular Expressions Todd Kelley kelleyt@algonquincollege.com CST8207 Todd Kelley 1 POSIX character classes Some Regular Expression gotchas Regular Expression Resources Assignment 3 on Regular Expressions
More informationPart 5 Program Analysis Principles and Techniques
1 Part 5 Program Analysis Principles and Techniques Front end 2 source code scanner tokens parser il errors Responsibilities: Recognize legal programs Report errors Produce il Preliminary storage map Shape
More informationLecture 8: Context Free Grammars
Lecture 8: Context Free s Dr Kieran T. Herley Department of Computer Science University College Cork 2017-2018 KH (12/10/17) Lecture 8: Context Free s 2017-2018 1 / 1 Specifying Non-Regular Languages Recall
More informationRegular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications
Agenda for Today Regular Expressions CSE 413, Autumn 2005 Programming Languages Basic concepts of formal grammars Regular expressions Lexical specification of programming languages Using finite automata
More informationAugmented BNF for Syntax Specifications: ABNF
Network Working Group Request for Comments: 4234 Obsoletes: 2234 Category: Standards Track D. Crocker, Editor Brandenburg InternetWorking P. Overell THUS plc. October 2005 Augmented BNF for Syntax Specifications:
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back
More informationMIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology
MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Massachusetts Institute of Technology Language Definition Problem How to precisely define language Layered structure
More informationRegular Expressions. Steve Renals (based on original notes by Ewan Klein) ICL 12 October Outline Overview of REs REs in Python
Regular Expressions Steve Renals s.renals@ed.ac.uk (based on original notes by Ewan Klein) ICL 12 October 2005 Introduction Formal Background to REs Extensions of Basic REs Overview Goals: a basic idea
More informationPattern Matching. An Introduction to File Globs and Regular Expressions
Pattern Matching An Introduction to File Globs and Regular Expressions Copyright 2006 2009 Stewart Weiss The danger that lies ahead Much to your disadvantage, there are two different forms of patterns
More informationPattern Matching. An Introduction to File Globs and Regular Expressions. Adapted from Practical Unix and Programming Hunter College
Pattern Matching An Introduction to File Globs and Regular Expressions Adapted from Practical Unix and Programming Hunter College Copyright 2006 2009 Stewart Weiss The danger that lies ahead Much to your
More informationCategory: Standards Track January Augmented BNF for Syntax Specifications: ABNF
Network Working Group D. Crocker, Ed. Request for Comments: 5234 Brandenburg InternetWorking STD: 68 P. Overell Obsoletes: 4234 THUS plc. Category: Standards Track January 2008 Status of This Memo Augmented
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationCS321 Languages and Compiler Design I. Winter 2012 Lecture 4
CS321 Languages and Compiler Design I Winter 2012 Lecture 4 1 LEXICAL ANALYSIS Convert source file characters into token stream. Remove content-free characters (comments, whitespace,...) Detect lexical
More informationArchitecture of Compilers, Interpreters. CMSC 330: Organization of Programming Languages. Front End Scanner and Parser. Implementing the Front End
Architecture of Compilers, Interpreters : Organization of Programming Languages ource Analyzer Optimizer Code Generator Context Free Grammars Intermediate Representation Front End Back End Compiler / Interpreter
More informationChapter Seven: Regular Expressions
Chapter Seven: Regular Expressions Regular Expressions We have seen that DFAs and NFAs have equal definitional power. It turns out that regular expressions also have exactly that same definitional power:
More informationCS 314 Principles of Programming Languages
CS 314 Principles of Programming Languages Lecture 2: Syntax Analysis Zheng (Eddy) Zhang Rutgers University January 22, 2018 Announcement First recitation starts this Wednesday Homework 1 will be release
More informationMIT Specifying Languages with Regular Expressions and Context-Free Grammars
MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Language Definition Problem How to precisely
More informationCMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationCSCE 314 Programming Languages
CSCE 314 Programming Languages Syntactic Analysis Dr. Hyunyoung Lee 1 What Is a Programming Language? Language = syntax + semantics The syntax of a language is concerned with the form of a program: how
More informationBNF, EBNF Regular Expressions. Programming Languages,
BNF, EBNF Regular Expressions Programming Languages, 234319 1 Reminder - (E)BNF A notation for describing the grammar of a language The notation consists of: Terminals: the actual legal strings, written
More informationCMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationWhere We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser
More informationIntroduction to Regular Expressions
Introduction to Regular Expressions Basil L. Contovounesios blc@netsoc.tcd.ie For the Dublin University Internet Society [Netsoc] Tuesday 3 rd November, 2015 Finite Automata Imagine a mysterious black
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationCHAPTER TWO LANGUAGES. Dr Zalmiyah Zakaria
CHAPTER TWO LANGUAGES By Dr Zalmiyah Zakaria Languages Contents: 1. Strings and Languages 2. Finite Specification of Languages 3. Regular Sets and Expressions Sept2011 Theory of Computer Science 2 Strings
More informationIntroduction to Syntax Analysis. The Second Phase of Front-End
Compiler Design IIIT Kalyani, WB 1 Introduction to Syntax Analysis The Second Phase of Front-End Compiler Design IIIT Kalyani, WB 2 Syntax Analysis The syntactic or the structural correctness of a program
More informationRegex, Sed, Awk. Arindam Fadikar. December 12, 2017
Regex, Sed, Awk Arindam Fadikar December 12, 2017 Why Regex Lots of text data. twitter data (social network data) government records web scrapping many more... Regex Regular Expressions or regex or regexp
More informationCMPSCI 250: Introduction to Computation. Lecture #28: Regular Expressions and Languages David Mix Barrington 2 April 2014
CMPSCI 250: Introduction to Computation Lecture #28: Regular Expressions and Languages David Mix Barrington 2 April 2014 Regular Expressions and Languages Regular Expressions The Formal Inductive Definition
More informationToday s Lecture. The Unix Shell. Unix Architecture (simplified) Lecture 3: Unix Shell, Pattern Matching, Regular Expressions
Lecture 3: Unix Shell, Pattern Matching, Regular Expressions Today s Lecture Review Lab 0 s info on the shell Discuss pattern matching Discuss regular expressions Kenneth M. Anderson Software Methods and
More informationRegular Languages. MACM 300 Formal Languages and Automata. Formal Languages: Recap. Regular Languages
Regular Languages MACM 3 Formal Languages and Automata Anoop Sarkar http://www.cs.sfu.ca/~anoop The set of regular languages: each element is a regular language Each regular language is an example of a
More informationContext-Free Languages and Parse Trees
Context-Free Languages and Parse Trees Mridul Aanjaneya Stanford University July 12, 2012 Mridul Aanjaneya Automata Theory 1/ 41 Context-Free Grammars A context-free grammar is a notation for describing
More informationCOP 3402 Systems Software Syntax Analysis (Parser)
COP 3402 Systems Software Syntax Analysis (Parser) Syntax Analysis 1 Outline 1. Definition of Parsing 2. Context Free Grammars 3. Ambiguous/Unambiguous Grammars Syntax Analysis 2 Lexical and Syntax Analysis
More informationNotes for Comp 454 Week 2
Notes for Comp 454 Week 2 This week we look at the material in chapters 3 and 4. Homework on Chapters 2, 3 and 4 is assigned (see end of notes). Answers to the homework problems are due by September 10th.
More informationCSCI 2132 Software Development. Lecture 7: Wildcards and Regular Expressions
CSCI 2132 Software Development Lecture 7: Wildcards and Regular Expressions Instructor: Vlado Keselj Faculty of Computer Science Dalhousie University 20-Sep-2017 (7) CSCI 2132 1 Previous Lecture Pipes
More informationFormal Languages and Grammars. Chapter 2: Sections 2.1 and 2.2
Formal Languages and Grammars Chapter 2: Sections 2.1 and 2.2 Formal Languages Basis for the design and implementation of programming languages Alphabet: finite set Σ of symbols String: finite sequence
More informationQUESTION BANK. Formal Languages and Automata Theory(10CS56)
QUESTION BANK Formal Languages and Automata Theory(10CS56) Chapter 1 1. Define the following terms & explain with examples. i) Grammar ii) Language 2. Mention the difference between DFA, NFA and εnfa.
More informationCompiler Design. 2. Regular Expressions & Finite State Automata (FSA) Kanat Bolazar January 21, 2010
Compiler Design. Regular Expressions & Finite State Automata (FSA) Kanat Bolazar January 1, 010 Contents In these slides we will see 1.Introduction, Concepts and Notations.Regular Expressions, Regular
More informationCMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationLexical Analysis. Sukree Sinthupinyo July Chulalongkorn University
Sukree Sinthupinyo 1 1 Department of Computer Engineering Chulalongkorn University 14 July 2012 Outline Introduction 1 Introduction 2 3 4 Transition Diagrams Learning Objectives Understand definition of
More informationHomework. Context Free Languages. Before We Start. Announcements. Plan for today. Languages. Any questions? Recall. 1st half. 2nd half.
Homework Context Free Languages Homework #2 returned Homework #3 due today Homework #4 Pg 133 -- Exercise 1 (use structural induction) Pg 133 -- Exercise 3 Pg 134 -- Exercise 8b,c,d Pg 135 -- Exercise
More informationDecision Properties for Context-free Languages
Previously: Decision Properties for Context-free Languages CMPU 240 Language Theory and Computation Fall 2018 Context-free languages Pumping Lemma for CFLs Closure properties for CFLs Today: Assignment
More informationCSE 3302 Programming Languages Lecture 2: Syntax
CSE 3302 Programming Languages Lecture 2: Syntax (based on slides by Chengkai Li) Leonidas Fegaras University of Texas at Arlington CSE 3302 L2 Spring 2011 1 How do we define a PL? Specifying a PL: Syntax:
More informationConfiguring the RADIUS Listener LEG
CHAPTER 16 Revised: July 28, 2009, Introduction This module describes the configuration procedure for the RADIUS Listener LEG. The RADIUS Listener LEG is configured using the SM configuration file p3sm.cfg,
More informationLexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata
Lexical Analysis Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata Phase Ordering of Front-Ends Lexical analysis (lexer) Break input string
More informationCS412/413. Introduction to Compilers Tim Teitelbaum. Lecture 2: Lexical Analysis 23 Jan 08
CS412/413 Introduction to Compilers Tim Teitelbaum Lecture 2: Lexical Analysis 23 Jan 08 Outline Review compiler structure What is lexical analysis? Writing a lexer Specifying tokens: regular expressions
More informationChapter 2 :: Programming Language Syntax
Chapter 2 :: Programming Language Syntax Michael L. Scott kkman@sangji.ac.kr, 2015 1 Regular Expressions A regular expression is one of the following: A character The empty string, denoted by Two regular
More informationCS 315 Programming Languages Syntax. Parser. (Alternatively hand-built) (Alternatively hand-built)
Programming languages must be precise Remember instructions This is unlike natural languages CS 315 Programming Languages Syntax Precision is required for syntax think of this as the format of the language
More informationRegular Expressions. with a brief intro to FSM Systems Skills in C and Unix
Regular Expressions with a brief intro to FSM 15-123 Systems Skills in C and Unix Case for regular expressions Many web applications require pattern matching look for tag for links Token search
More informationCSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1
CSEP 501 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter 2008 1/8/2008 2002-08 Hal Perkins & UW CSE B-1 Agenda Basic concepts of formal grammars (review) Regular expressions
More informationChapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1
Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 1. Introduction Parsing is the task of Syntax Analysis Determining the syntax, or structure, of a program. The syntax is defined by the grammar rules
More informationTheory and Compiling COMP360
Theory and Compiling COMP360 It has been said that man is a rational animal. All my life I have been searching for evidence which could support this. Bertrand Russell Reading Read sections 2.1 3.2 in the
More informationprogramming languages need to be precise a regular expression is one of the following: tokens are the building blocks of programs
Chapter 2 :: Programming Language Syntax Programming Language Pragmatics Michael L. Scott Introduction programming languages need to be precise natural languages less so both form (syntax) and meaning
More informationIntroduction to Syntax Analysis
Compiler Design 1 Introduction to Syntax Analysis Compiler Design 2 Syntax Analysis The syntactic or the structural correctness of a program is checked during the syntax analysis phase of compilation.
More informationPart III. Shell Config. Tobias Neckel: Scripting with Bash and Python Compact Max-Planck, February 16-26,
Part III Shell Config Compact Course @ Max-Planck, February 16-26, 2015 33 Special Directories. current directory.. parent directory ~ own home directory ~user home directory of user ~- previous directory
More informationLexical and Syntax Analysis
COS 301 Programming Languages Lexical and Syntax Analysis Sebesta, Ch. 4 Syntax analysis Programming languages compiled, interpreted, or hybrid All have to do syntax analysis For a compiled language parse
More informationWeek 2: Syntax Specification, Grammars
CS320 Principles of Programming Languages Week 2: Syntax Specification, Grammars Jingke Li Portland State University Fall 2017 PSU CS320 Fall 17 Week 2: Syntax Specification, Grammars 1/ 62 Words and Sentences
More informationRegular Expressions. Regular Expression Syntax in Python. Achtung!
1 Regular Expressions Lab Objective: Cleaning and formatting data are fundamental problems in data science. Regular expressions are an important tool for working with text carefully and eciently, and are
More information2. λ is a regular expression and denotes the set {λ} 4. If r and s are regular expressions denoting the languages R and S, respectively
Regular expressions: a regular expression is built up out of simpler regular expressions using a set of defining rules. Regular expressions allows us to define tokens of programming languages such as identifiers.
More informationChapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)
Chapter 3: Describing Syntax and Semantics Introduction Formal methods of describing syntax (BNF) We can analyze syntax of a computer program on two levels: 1. Lexical level 2. Syntactic level Lexical
More informationWhile Statement Examples. While Statement (35.15) Until Statement (35.15) Until Statement Example
While Statement (35.15) General form. The commands in the loop are performed while the condition is true. while condition one-or-more-commands While Statement Examples # process commands until a stop is
More informationCompiler Construction
Compiler Construction Exercises 1 Review of some Topics in Formal Languages 1. (a) Prove that two words x, y commute (i.e., satisfy xy = yx) if and only if there exists a word w such that x = w m, y =
More informationTheoretical Part. Chapter one:- - What are the Phases of compiler? Answer:
Theoretical Part Chapter one:- - What are the Phases of compiler? Six phases Scanner Parser Semantic Analyzer Source code optimizer Code generator Target Code Optimizer Three auxiliary components Literal
More informationCSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions
CSE 413 Programming Languages & Implementation Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions 1 Agenda Overview of language recognizers Basic concepts of formal grammars Scanner Theory
More informationCOP4020 Programming Languages. Syntax Prof. Robert van Engelen
COP4020 Programming Languages Syntax Prof. Robert van Engelen Overview Tokens and regular expressions Syntax and context-free grammars Grammar derivations More about parse trees Top-down and bottom-up
More informationLec-5-HW-1, TM basics
Lec-5-HW-1, TM basics (Problem 0)-------------------- Design a Turing Machine (TM), T_sub, that does unary decrement by one. Assume a legal, initial tape consists of a contiguous set of cells, each containing
More information2. Lexical Analysis! Prof. O. Nierstrasz!
2. Lexical Analysis! Prof. O. Nierstrasz! Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes.! http://www.cs.ucla.edu/~palsberg/! http://www.cs.purdue.edu/homes/hosking/!
More informationCompiler phases. Non-tokens
Compiler phases Compiler Construction Scanning Lexical Analysis source code scanner tokens regular expressions lexical analysis Lennart Andersson parser context free grammar Revision 2011 01 21 parse tree
More informationMore Scripting and Regular Expressions. Todd Kelley CST8207 Todd Kelley 1
More Scripting and Regular Expressions Todd Kelley kelleyt@algonquincollege.com CST8207 Todd Kelley 1 Regular Expression Summary Regular Expression Examples Shell Scripting 2 Do not confuse filename globbing
More informationDecision, Computation and Language
Decision, Computation and Language Regular Expressions Dr. Muhammad S Khan (mskhan@liv.ac.uk) Ashton Building, Room G22 http://www.csc.liv.ac.uk/~khan/comp218 Regular expressions M S Khan (Univ. of Liverpool)
More informationContext-Free Grammars
Context-Free Grammars Describing Languages We've seen two models for the regular languages: Automata accept precisely the strings in the language. Regular expressions describe precisely the strings in
More informationBioinformatics Programming. EE, NCKU Tien-Hao Chang (Darby Chang)
Bioinformatics Programming EE, NCKU Tien-Hao Chang (Darby Chang) 1 Regular Expression 2 http://rp1.monday.vip.tw1.yahoo.net/res/gdsale/st_pic/0469/st-469571-1.jpg 3 Text patterns and matches A regular
More informationCSE 401 Midterm Exam 11/5/10
Name There are 5 questions worth a total of 100 points. Please budget your time so you get to all of the questions. Keep your answers brief and to the point. The exam is closed books, closed notes, closed
More informationCSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions
CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions 1 Agenda Overview of language recognizers Basic concepts of formal grammars Scanner Theory
More informationParsing Combinators: Introduction & Tutorial
Parsing Combinators: Introduction & Tutorial Mayer Goldberg October 21, 2017 Contents 1 Synopsis 1 2 Backus-Naur Form (BNF) 2 3 Parsing Combinators 3 4 Simple constructors 4 5 The parser stack 6 6 Recursive
More informationOutline. 1 Scanning Tokens. 2 Regular Expresssions. 3 Finite State Automata
Outline 1 2 Regular Expresssions Lexical Analysis 3 Finite State Automata 4 Non-deterministic (NFA) Versus Deterministic Finite State Automata (DFA) 5 Regular Expresssions to NFA 6 NFA to DFA 7 8 JavaCC:
More informationCSE 401/M501 Compilers
CSE 401/M501 Compilers Languages, Automata, Regular Expressions & Scanners Hal Perkins Spring 2018 UW CSE 401/M501 Spring 2018 B-1 Administrivia No sections this week Read: textbook ch. 1 and sec. 2.1-2.4
More informationProgramming Lecture 3
Programming Lecture 3 Expressions (Chapter 3) Primitive types Aside: Context Free Grammars Constants, variables Identifiers Variable declarations Arithmetic expressions Operator precedence Assignment statements
More information2010: Compilers REVIEW: REGULAR EXPRESSIONS HOW TO USE REGULAR EXPRESSIONS
2010: Compilers Lexical Analysis: Finite State Automata Dr. Licia Capra UCL/CS REVIEW: REGULAR EXPRESSIONS a Character in A Empty string R S Alternation (either R or S) RS Concatenation (R followed by
More informationLexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!
Lexical Analysis Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast! Compiler Passes Analysis of input program (front-end) character stream
More informationUnleashing the Shell Hands-On UNIX System Administration DeCal Week 6 28 February 2011
Unleashing the Shell Hands-On UNIX System Administration DeCal Week 6 28 February 2011 Last time Compiling software and the three-step procedure (./configure && make && make install). Dependency hell and
More informationChapter Seven: Regular Expressions. Formal Language, chapter 7, slide 1
Chapter Seven: Regular Expressions Formal Language, chapter 7, slide The first time a young student sees the mathematical constant π, it looks like just one more school artifact: one more arbitrary symbol
More information