Computer Systems and Architecture

Similar documents
Computer Systems and Architecture

Essentials for Scientific Computing: Stream editing with sed and awk

Understanding Regular Expressions, Special Characters, and Patterns

Regex, Sed, Awk. Arindam Fadikar. December 12, 2017

CSE 390a Lecture 7. Regular expressions, egrep, and sed

Regular Expressions. Regular expressions match input within a line Regular expressions are very different than shell meta-characters.

CSE 303 Lecture 7. Regular expressions, egrep, and sed. read Linux Pocket Guide pp , 73-74, 81

Lecture 18 Regular Expressions

User Commands sed ( 1 )

Regular Expressions. Michael Wrzaczek Dept of Biosciences, Plant Biology Viikki Plant Science Centre (ViPS) University of Helsinki, Finland

ITST Searching, Extracting & Archiving Data

Regular Expressions. Regular Expression Syntax in Python. Achtung!

Certification. String Processing with Regular Expressions

UNIX / LINUX - REGULAR EXPRESSIONS WITH SED

Topic 4: Grep, Find & Sed

IB047. Unix Text Tools. Pavel Rychlý Mar 3.

Pattern Matching. An Introduction to File Globs and Regular Expressions. Adapted from Practical Unix and Programming Hunter College

Motivation (Scenarios) Topic 4: Grep, Find & Sed. Displaying File Names. grep

Pattern Matching. An Introduction to File Globs and Regular Expressions

UNIX II:grep, awk, sed. October 30, 2017

Regular Expressions 1

CS 2112 Lab: Regular Expressions

Paolo Santinelli Sistemi e Reti. Regular expressions. Regular expressions aim to facilitate the solution of text manipulation problems

Grep and Shell Programming

Perl Regular Expressions. Perl Patterns. Character Class Shortcuts. Examples of Perl Patterns

5/8/2012. Exploring Utilities Chapter 5

IT441. Regular Expressions. Handling Text: DRAFT. Network Services Administration

CS Unix Tools & Scripting Lecture 7 Working with Stream

Getting to grips with Unix and the Linux family

Unleashing the Shell Hands-On UNIX System Administration DeCal Week 6 28 February 2011

Bash Script. CIRC Summer School 2015 Baowei Liu

Systems Programming/ C and UNIX

sed Stream Editor Checks for address match, one line at a time, and performs instruction if address matched

Advanced Handle Definition

CS Unix Tools & Scripting

Bioinformatics Programming. EE, NCKU Tien-Hao Chang (Darby Chang)

psed [-an] script [file...] psed [-an] [-e script] [-f script-file] [file...]

Table of contents. Our goal. Notes. Notes. Notes. Summer June 29, Our goal is to see how we can use Unix as a tool for developing programs

Regular expressions: Text editing and Advanced manipulation. HORT Lecture 4 Instructor: Kranthi Varala

Mineração de Dados Aplicada

CS 301. Lecture 05 Applications of Regular Languages. Stephen Checkoway. January 31, 2018

FILTERS USING REGULAR EXPRESSIONS grep and sed

Unix/Linux Primer. Taras V. Pogorelov and Mike Hallock School of Chemical Sciences, University of Illinois

Regular Expressions. using REs to find patterns. implementing REs using finite state automata. Sunday, 4 December 11

Practical 02. Bash & shell scripting

Learning Ruby. Regular Expressions. Get at practice page by logging on to csilm.usu.edu and selecting. PROGRAMMING LANGUAGES Regular Expressions

CS Advanced Unix Tools & Scripting

CSCI 4152/6509 Natural Language Processing Lecture 6: Regular Expressions; Text Processing in Perl

Pathologically Eclectic Rubbish Lister

Regular Expressions for Linguists: A Life Skill

CS Unix Tools. Fall 2010 Lecture 5. Hussam Abu-Libdeh based on slides by David Slater. September 17, 2010

Programming Languages & Translators. XML Document Manipulation Language (XDML) Language Reference Manual

Regular Expressions. Regular expressions are a powerful search-and-replace technique that is widely used in other environments (such as Unix and Perl)

The answers to this test are in the Answer Key on the last page(s).

Describing Languages with Regular Expressions

applied regex implementing REs using finite state automata using REs to find patterns Informatics 1 School of Informatics, University of Edinburgh 1

Regular Expressions. Todd Kelley CST8207 Todd Kelley 1

CSE 374 Midterm Exam Sample Solution 2/6/12

This page covers the very basics of understanding, creating and using regular expressions ('regexes') in Perl.

Regular Expressions Explained

CSCI 2132 Software Development. Lecture 8: Introduction to C

# use temporary files to store partial results # remember to delete these temporary files when you do not need them anymore

Digital Humanities. Tutorial Regular Expressions. March 10, 2014

CSE 374 Programming Concepts & Tools. Laura Campbell (thanks to Hal Perkins) Winter 2014 Lecture 6 sed, command-line tools wrapup

Structure of Programming Languages Lecture 3

Introduction p. 1 Who Should Read This Book? p. 1 What You Need to Know Before Reading This Book p. 2 How This Book Is Organized p.

Advanced training. Linux components Command shell. LiLux a.s.b.l.

Basic Shell Scripting Practice. HPC User Services LSU HPC & LON March 2018

Regular Expressions for Technical Writers (tutorial)

Lecture 2. Regular Expression Parsing Awk

Regular Expressions Primer

Regular Expressions!!

Pieter van den Hombergh. April 13, 2018

An Introduction to Regular Expressions in Python

More regular expressions, synchronizing data, comparing files

A shell can be used in one of two ways:

CSE II-Sem)

Regular Expressions. Upsorn Praphamontripong. CS 1111 Introduction to Programming Spring [Ref:

Overview. Unix/Regex Lab. 1. Setup & Unix review. 2. Count words in a text. 3. Sort a list of words in various ways. 4.

Lecture 3 Tonight we dine in shell. Hands-On Unix System Administration DeCal

Regular Expressions for Technical Writers

WRITE COMMANDS USING sed or grep (1 mark each) 2014 oct./nov march/april Commands using grep or egrep. (1 mark each)

Regular Expressions. Todd Kelley CST8207 Todd Kelley 1

CS214-AdvancedUNIX. Lecture 2 Basic commands and regular expressions. Ymir Vigfusson. CS214 p.1

CS434 Compiler Construction. Lecture 3 Spring 2005 Department of Computer Science University of Alabama Joel Jones

STATS Data analysis using Python. Lecture 0: Introduction and Administrivia

CSC207 Week 9. Larry Zhang

More Examples. Lex/Flex/JLex

Dr. Sarah Abraham University of Texas at Austin Computer Science Department. Regular Expressions. Elements of Graphics CS324e Spring 2017

CSE 374 Midterm Exam 11/2/15. Name Id #

QUESTION BANK ON UNIX & SHELL PROGRAMMING-502 (CORE PAPER-2)

CST Lab #5. Student Name: Student Number: Lab section:

Using Lex or Flex. Prof. James L. Frankel Harvard University

Server-side Web Development (I3302) Semester: 1 Academic Year: 2017/2018 Credits: 4 (50 hours) Dr Antoun Yaacoub

Regular Expressions for Information Processing in ABAP. Ralph Benzinger SAP AG

CS 124/LINGUIST 180 From Languages to Information. Unix for Poets Dan Jurafsky

Indian Institute of Technology Kharagpur. PERL Part III. Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. I.I.T.

Vi & Shell Scripting

Shell Programming Overview

CSE 374 Midterm Exam Sample Solution 2/11/13. Question 1. (8 points) What does each of the following bash commands do?

Transcription:

Computer Systems and Architecture Regular Expressions Bart Meyers University of Antwerp August 29, 2012

Outline What? Tools Anchors, character sets and modifiers Advanced Regular expressions Exercises

Regular Expressions A regular expression is a pattern that describes a set of strings Search and manipulate text based on patterns Flexible and powerful Three parts: Anchors: specify the position of the pattern in relation to a line of text Character sets: match one or more characters in a single position Modifiers: specify how many times the previous character set is repeated

Tools Grep (grep) Print lines matching a pattern Sed (sed) Read and modify the input stream as specified by a pattern. Awk (awk) More advanced string handling

Grep grep class /usr/share/dict/words Print all words that contain the string class grep ^class /usr/share/dict/words Print all words that begin with the string class grep class$ /usr/share/dict/words Print all words that end with the string class grep ^c..ss$ /usr/share/dict/words Print all 5-letter words that begin with c and end with ss grep ^c.*ss$ /usr/share/dict/words Print all words that begin with c and end with ss

Sed sed s/from/to/g Replace all occurences of regex from by to Substitute command: s : Substitute command /../../ : Delimiter from : Regular expression to : Replacement string g : Flags Usage: cat oldfile.txt sed s/from/to/ sed s/from/to/ < oldfile.txt sed s/from/to/ < oldfile.txt > newfile.txt

Sed Other delimiters sed s:/usr/local/bin:/home/bin: sed s /usr/local/bin /home/bin Using & as the matched string sed s/[a-z]*/(&)/ places parenthesis around a string Using \1, \2... to keep part of the pattern

Sed Options sed -e : combine commands sed -e s/a/a/ -e s/b/b/ sed -f : read commands from script file sed -n : silent mode

Sed Flags What to do when there is more than one occurence of a pattern on a single line? /../../ : Only the first occurrence of from is replaced /../../g : Global replacement /../../3 : Replace the third occurrence /../../2g : Replace all but the first occurrence /../../p : Print the modified lines sed -n s/pattern/&/p duplicates the function of grep /../../w filename : Write all modified lines to filename

Anchors ^ (beginning of line) and $ (end of line) Examples: ^A A at the beginning of a line A$ A at the end of a line A^ Aˆ anywhere on a line $A $A anywhere on a line ^^ ˆ at the beginning of a line $$ $ at the end of a line

Character sets Simplest character set: abc matches the character sequence abc. represents any single character Ranges Between [ and ]: one of these characters/patterns [^ and ]: NOT one of these characters/patterns Use - between characters to denote a range between these characters Want to use literal characters with a special meaning? Escape with backslash \ \. matches a. \* matches an asterix {, }, (, ), <, > don t have a special meaning

Character sets [A-Z] Any capital letter [A-Za-z] Any letter [] The characters [] [0] The character 0 [0-9] Any number [^0-9] Any character other than a number [-0-9] Any number or a - [0-9-] Any number or a - [^-0-9] Any character except a number or a - []0-9] Any number or a ] [0-9]] Any number followed by a ]

Modifiers Combining character sets: ^T[a-z][aeiou] Matches a line that starts with T, followed by a letter and a vowel Use modifiers to repeat character sets [0-9]* matches zero or more numbers [0-9][0-9]* matches one or more numbers [0-9]\{5\} matches five numbers [0-9]\{5,8\} matches five to eight numbers [0-9]\{5,\} matches five or more numbers Match only words: use \< and \> Surrounding characters are anything but a letter, number, underscore, new line or end of line \<[tt]he\> matches any line with the word the or The.

Backreferences Reuse patterns: remember what you found earlier Mark pattern with \( and \) Refer to previously market patterns with \1, \2, \3... Examples \([a-z]\)\1 matches two identical letters. \<\([a-z]\)[a-z]*\1\> matches every word that starts and ends with the same letter. \([a-z]\)\([a-z]\)[a-z]\2\1 matches every 5-letter palindrome.

Extended regular expressions Used by egrep and awk? matches 0 or 1 instances of the character set before + matches 1 or more instances of the character set before \{, \}, \(, \), \<, \> no longer have a special meaning ^(Ruben Pieter) matches every line that starts with Ruben or Pieter

Exercises http://msdl.cs.mcgill.ca/people/hv/teaching/ ComputerSystemsArchitecture/#CS2