Computer Systems and Architecture

Similar documents
Computer Systems and Architecture

Essentials for Scientific Computing: Stream editing with sed and awk

Understanding Regular Expressions, Special Characters, and Patterns

Regular Expressions. Regular expressions match input within a line Regular expressions are very different than shell meta-characters.

Regex, Sed, Awk. Arindam Fadikar. December 12, 2017

CSE 390a Lecture 7. Regular expressions, egrep, and sed

CSE 303 Lecture 7. Regular expressions, egrep, and sed. read Linux Pocket Guide pp , 73-74, 81

Lecture 18 Regular Expressions

User Commands sed ( 1 )

ITST Searching, Extracting & Archiving Data

Regular Expressions. Michael Wrzaczek Dept of Biosciences, Plant Biology Viikki Plant Science Centre (ViPS) University of Helsinki, Finland

Certification. String Processing with Regular Expressions

UNIX / LINUX - REGULAR EXPRESSIONS WITH SED

Topic 4: Grep, Find & Sed

Regular Expressions. Regular Expression Syntax in Python. Achtung!

Motivation (Scenarios) Topic 4: Grep, Find & Sed. Displaying File Names. grep

Pattern Matching. An Introduction to File Globs and Regular Expressions. Adapted from Practical Unix and Programming Hunter College

Pattern Matching. An Introduction to File Globs and Regular Expressions

UNIX II:grep, awk, sed. October 30, 2017

Regular Expressions 1

Paolo Santinelli Sistemi e Reti. Regular expressions. Regular expressions aim to facilitate the solution of text manipulation problems

Grep and Shell Programming

Perl Regular Expressions. Perl Patterns. Character Class Shortcuts. Examples of Perl Patterns

IT441. Regular Expressions. Handling Text: DRAFT. Network Services Administration

5/8/2012. Exploring Utilities Chapter 5

CS Unix Tools & Scripting Lecture 7 Working with Stream

Practical 02. Bash & shell scripting

Unleashing the Shell Hands-On UNIX System Administration DeCal Week 6 28 February 2011

Bash Script. CIRC Summer School 2015 Baowei Liu

Systems Programming/ C and UNIX

IB047. Unix Text Tools. Pavel Rychlý Mar 3.

Advanced Handle Definition

CS Unix Tools & Scripting

sed Stream Editor Checks for address match, one line at a time, and performs instruction if address matched

Bioinformatics Programming. EE, NCKU Tien-Hao Chang (Darby Chang)

CS 2112 Lab: Regular Expressions

psed [-an] script [file...] psed [-an] [-e script] [-f script-file] [file...]

Getting to grips with Unix and the Linux family

Table of contents. Our goal. Notes. Notes. Notes. Summer June 29, Our goal is to see how we can use Unix as a tool for developing programs

applied regex implementing REs using finite state automata using REs to find patterns Informatics 1 School of Informatics, University of Edinburgh 1

Regular expressions: Text editing and Advanced manipulation. HORT Lecture 4 Instructor: Kranthi Varala

FILTERS USING REGULAR EXPRESSIONS grep and sed

Mineração de Dados Aplicada

Structure of Programming Languages Lecture 3

Regular Expressions. using REs to find patterns. implementing REs using finite state automata. Sunday, 4 December 11

CS 301. Lecture 05 Applications of Regular Languages. Stephen Checkoway. January 31, 2018

CS Advanced Unix Tools & Scripting

Learning Ruby. Regular Expressions. Get at practice page by logging on to csilm.usu.edu and selecting. PROGRAMMING LANGUAGES Regular Expressions

Pathologically Eclectic Rubbish Lister

Regular Expressions for Linguists: A Life Skill

CS Unix Tools. Fall 2010 Lecture 5. Hussam Abu-Libdeh based on slides by David Slater. September 17, 2010

Programming Languages & Translators. XML Document Manipulation Language (XDML) Language Reference Manual

Regular Expressions. Regular expressions are a powerful search-and-replace technique that is widely used in other environments (such as Unix and Perl)

Regular Expressions. Todd Kelley CST8207 Todd Kelley 1

CSE 374 Midterm Exam Sample Solution 2/6/12

This page covers the very basics of understanding, creating and using regular expressions ('regexes') in Perl.

Regular Expressions Explained

CSCI 2132 Software Development. Lecture 8: Introduction to C

# use temporary files to store partial results # remember to delete these temporary files when you do not need them anymore

Digital Humanities. Tutorial Regular Expressions. March 10, 2014

CSC207 Week 9. Larry Zhang

CSE 374 Programming Concepts & Tools. Laura Campbell (thanks to Hal Perkins) Winter 2014 Lecture 6 sed, command-line tools wrapup

Introduction p. 1 Who Should Read This Book? p. 1 What You Need to Know Before Reading This Book p. 2 How This Book Is Organized p.

Advanced training. Linux components Command shell. LiLux a.s.b.l.

CSCI 4152/6509 Natural Language Processing Lecture 6: Regular Expressions; Text Processing in Perl

Basic Shell Scripting Practice. HPC User Services LSU HPC & LON March 2018

Unix/Linux Primer. Taras V. Pogorelov and Mike Hallock School of Chemical Sciences, University of Illinois

Regular Expressions for Technical Writers (tutorial)

STATS Data analysis using Python. Lecture 0: Introduction and Administrivia

Lecture 2. Regular Expression Parsing Awk

Regular Expressions!!

Server-side Web Development (I3302) Semester: 1 Academic Year: 2017/2018 Credits: 4 (50 hours) Dr Antoun Yaacoub

Regular Expressions for Information Processing in ABAP. Ralph Benzinger SAP AG

The answers to this test are in the Answer Key on the last page(s).

More regular expressions, synchronizing data, comparing files

CSE II-Sem)

A shell can be used in one of two ways:

Regular Expressions Primer

Learning Language. Reference Manual. George Liao (gkl2104) Joseanibal Colon Ramos (jc2373) Stephen Robinson (sar2120) Huabiao Xu(hx2104)

Describing Languages with Regular Expressions

Regular Expressions. Upsorn Praphamontripong. CS 1111 Introduction to Programming Spring [Ref:

Overview. Unix/Regex Lab. 1. Setup & Unix review. 2. Count words in a text. 3. Sort a list of words in various ways. 4.

Lecture 3 Tonight we dine in shell. Hands-On Unix System Administration DeCal

Regular Expressions for Technical Writers

WRITE COMMANDS USING sed or grep (1 mark each) 2014 oct./nov march/april Commands using grep or egrep. (1 mark each)

Regular Expressions. Todd Kelley CST8207 Todd Kelley 1

Introduction to regular expressions

CS214-AdvancedUNIX. Lecture 2 Basic commands and regular expressions. Ymir Vigfusson. CS214 p.1

CS434 Compiler Construction. Lecture 3 Spring 2005 Department of Computer Science University of Alabama Joel Jones

More Examples. Lex/Flex/JLex

Dr. Sarah Abraham University of Texas at Austin Computer Science Department. Regular Expressions. Elements of Graphics CS324e Spring 2017

CSE 374 Midterm Exam 11/2/15. Name Id #

QUESTION BANK ON UNIX & SHELL PROGRAMMING-502 (CORE PAPER-2)

Using Lex or Flex. Prof. James L. Frankel Harvard University

CST Lab #5. Student Name: Student Number: Lab section:

Indian Institute of Technology Kharagpur. PERL Part III. Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. I.I.T.

CS 124/LINGUIST 180 From Languages to Information. Unix for Poets Dan Jurafsky

Vi & Shell Scripting

Wildcards and Regular Expressions

Shell Programming Overview

Transcription:

Computer Systems and Architecture Stephen Pauwels Regular Expressions Academic Year 2018-2019

Outline What is a Regular Expression? Tools Anchors, Character sets and Modifiers Advanced Regular Expressions

Regular Expressions A regular expression is a pattern that describes a set of strings Search and manipulate text based on patterns Flexible and powerful Three parts Anchors: specify the position of the pattern in relation to a line of text Character sets: match one or more characters in a single position Modifiers: specify how many times the previous character set is repeated

Anchors ^ (beginning of line) and $ (end of line) Examples: ^A A at the beginning of a line A$ A at the end of a line A^ A^ anywhere on a line $A $A anywhere on a line ^^ ^ at the beginning of a line $$ $ at the end of a line

Character Sets Simplest character set: abc matches the character sequence abc. represents any single character Ranges: Between [ and ]: one of these characters/patterns [^ and ]: NOT one of these characters/patterns Use - between characters to denote a range between these characters Want to use literal characters with a special meaning? Escape with backslash \ \. matches a. \* matches an asterix {,}, (,),<,> don t have a special meaning

Character sets [A-Z] Any capital letter [A-Za-z] Any letter [] The characters [] [0] The character 0 [0-9] Any number [^0-9] Any character other than a number [-0-9] Any number or a - [0-9-] Any number or a - [^-0-9] Any character except a number or a - []0-9] Any number or a ] [0-9]] Any number followed by a ]

Modifiers Combining character sets: ^T[a-z][aeiou] Matches a line that starts with T, followed by a letter and a vowel Use modifiers to repeat character sets [0-9]* Matches zero or more numbers [0-9] [0-9]* Matches one or more numbers [0-9]\{5\} Matches five numbers [0-9]\{5,8\} matches five to eight numbers [0-9]\{5,\} matches five or more numbers Match only words: use \< and \> Surrounding characters are anything but a letter, number, underscore, new line or end of line \<[tt]he\> matches any line with the word the or The

Backreferences Reuse patterns: remember what you found earlier Mark pattern with \( and \) Refer to previously marked patterns with \1, \2, \3, Examples \([a-z]\)\1 \<\([a-z]\)[a-z]*\1\> \([a-z]\)\([a-z]\)[a-z]\2\1 Matches two identical letters Matches every word that starts and ends with the same letter Matches every 5-letter palindrome

Tools Grep Print lines matching a pattern Sed Read and modify the input stream as specified by a pattern Awk More advanced string handling

Grep grep class /usr/share/dict/words Print all words that contain the string class grep ^class /usr/share/dict/words Print all words that begin with the string class grep class$ /usr/share/dict/words Print all words that end with the string class grep ^c..ss$ /usr/share/dict/words Print all 5-letter words that begin with c and end with ss grep ^c.*ss^ /usr/share/dict/words Print all words that begin with c and end with ss

Sed sed s/from/to/g Replace all occurrences of regex from to to Substitute command: s: Substitute /../../: Delimiter from: Regular expression to: Replacement string g: Flags Usage: cat oldfile.txt sed s/from/to/ sed s/from/to/ < oldfile.txt sed s/from/to/ < oldfile.txt > newfile.txt

Sed Other delimiters sed s:/usr/local/bin:/home/bin: sed s /usr/local/bin /home/bin Use & as the matched string sed s/[a-z]*/(&)/ places parenthesis around a string Using \1, \2 to keep part of the pattern

Sed Options sed e: combine options sed -e s/a/a/ -e s/b/b/ sed f: read commands from script file sed -n: silent mode

Sed Flags What to do when there is more than one occurrence of pattern on a single line? /../../: Only the first occurrence is replaced /../../g: Global replacement /../../3: Replace the third occurrence /../../2g: Replace but the first occurrence /../../p: Print modified lines sed n s/pattern/&/p duplicates the function of grep /../../w filename: Write all modified lines to filename

Extended Regular Expressions Used by egrep and awk? matches 0 or 1 instances of the character set before + matches 1 or more instances of the character set before \{, \}, \(, \), \<, \> no longer have special meaning ^(Ruben Pieter) matches every line that starts with Ruben or Pieter

Exercises Blackboard Course webpage http://msdl.cs.mcgill.ca/people/hv/teaching/computersystemsarchitecture/#cs2