STREAM EDITOR - REGULAR EXPRESSIONS

http://www.tutorialspoint.com/sed/sed_regular_expressions.htm
Copyright tutorialspoint.com

It is regular expressions that make SED powerful and efficient. A number of complex tasks can be solved with them, as any command-line expert knows. Like many other GNU/Linux utilities, SED supports regular expressions, which are often referred to as regex. This chapter describes regular expressions in detail. It is divided into three sections: standard regular expressions, POSIX classes of regular expressions, and metacharacters.

Standard Regular Expressions

Start of Line ^

In regular expression terminology, the caret (^) matches the start of a line. The following example prints all the lines that start with the pattern "The".

[jerry]$ sed -n '/^The/ p' books.txt
The Two Towers, J. R. R. Tolkien
The Alchemist, Paulo Coelho
The Fellowship of the Ring, J. R. R. Tolkien
The Pilgrimage, Paulo Coelho

End of Line $

The end of a line is represented by the dollar ($) symbol. The following example prints the lines that end with "Coelho".

[jerry]$ sed -n '/Coelho$/ p' books.txt
The Alchemist, Paulo Coelho
The Pilgrimage, Paulo Coelho

Single Character .

The dot (.) matches any single character except the end-of-line character. The following example prints all three-letter words that end with the character "t".

[jerry]$ echo -e "cat\nbat\nrat\nmat\nbatting\nrats\nmats" | sed -n '/^..t$/p'
cat
bat
rat
mat

Match Character Set []

In regular expression terminology, a character set is written in square brackets []. It matches exactly one character out of those listed. The following example matches the patterns "Call" and "Tall" but not "Ball".

[jerry]$ echo -e "Call\nTall\nBall" | sed -n '/[CT]all/ p'
Call
Tall

Exclusive Set [^]

In an exclusive set, a caret placed immediately after the opening bracket negates the set, so the expression matches any single character that is not listed. The following example prints only "Ball".

[jerry]$ echo -e "Call\nTall\nBall" | sed -n '/[^CT]all/ p'
Ball

Character Range [-]

When a character range is provided, the regular expression matches any single character within the range specified in square brackets. The following example matches "Call" and "Tall" but not "Ball".

[jerry]$ echo -e "Call\nTall\nBall" | sed -n '/[C-Z]all/ p'
Call
Tall

Now let us change the range to "A-P" and observe the result.

[jerry]$ echo -e "Call\nTall\nBall" | sed -n '/[A-P]all/ p'
Call
Ball

Zero or One Occurrence \?

In SED, \? matches zero or one occurrence of the preceding character. The following example matches "Behaviour" as well as "Behavior"; the "\?" makes the "u" optional.

[jerry]$ echo -e "Behaviour\nBehavior" | sed -n '/Behaviou\?r/ p'
Behaviour
Behavior

One or More Occurrences \+

In SED, \+ matches one or more occurrences of the preceding character. The following example prints the lines that contain one or more occurrences of "2".

[jerry]$ echo -e "111\n22\n123\n234\n456\n222" | sed -n '/2\+/ p'

22
123
234
222

Zero or More Occurrences *

The asterisk (*) matches zero or more occurrences of the preceding character. The following example matches "ca", "cat", "catt", and so on.

[jerry]$ echo -e "ca\ncat" | sed -n '/cat*/ p'
ca
cat

Exactly N Occurrences {n}

{n} matches exactly "n" occurrences of the preceding character. The following example prints only three-digit numbers. First, create the following file, which contains only numbers.

[jerry]$ cat numbers.txt
1
10
100
1000
10000
100000
1000000
10000000
100000000
1000000000

Now let us write the SED expression.

[jerry]$ sed -n '/^[0-9]\{3\}$/ p' numbers.txt
100

Note that each curly brace is escaped with the "\" character.

At Least N Occurrences {n,}

{n,} matches at least "n" occurrences of the preceding character. The following example prints all the numbers that have five or more digits.

[jerry]$ sed -n '/^[0-9]\{5,\}$/ p' numbers.txt
10000
100000

1000000
10000000
100000000
1000000000

M to N Occurrences {m,n}

{m,n} matches at least "m" and at most "n" occurrences of the preceding character. The following example prints all the numbers that have at least five but no more than eight digits.

[jerry]$ sed -n '/^[0-9]\{5,8\}$/ p' numbers.txt
10000
100000
1000000
10000000

Pipe \|

In SED, the pipe character behaves like a logical OR: it matches the expression on either side of it. The following example matches either "str1" or "str3".

[jerry]$ echo -e "str1\nstr2\nstr3\nstr4" | sed -n '/str\(1\|3\)/ p'
str1
str3

Note that the parentheses and the pipe are each escaped with the "\" character.

Escaping Characters

Certain character sequences have a special meaning. For example, a newline is represented by "\n", a carriage return by "\r", and so on. To match such a sequence literally, as ordinary characters, escape it with the backslash (\) character. This section illustrates the escaping of special characters.

Escaping "\"

The following example matches the pattern "\".

[jerry]$ echo 'str1\str2' | sed -n '/\\/ p'
str1\str2

Escaping "\n"

The following example matches the literal two-character sequence "\n" (a backslash followed by "n"), not an actual newline.

[jerry]$ echo 'str1\nstr2' | sed -n '/\\n/ p'
str1\nstr2
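The difference between the literal sequence "\n" and a real newline can be checked directly. The sketch below assumes a POSIX shell and uses printf instead of echo so that the input is unambiguous; the variable names are only illustrative. It applies the same pattern first to one line that contains a backslash followed by "n", and then to two separate lines.

```shell
# Input 1: a single line containing the two characters "\" and "n"
# (printf '%s' prints its argument verbatim, without interpreting escapes).
literal_hit=$(printf '%s\n' 'str1\nstr2' | sed -n '/\\n/p')

# Input 2: two separate lines; here printf expands \n in its format string
# into a real newline, so no backslash reaches sed.
newline_hit=$(printf 'str1\nstr2\n' | sed -n '/\\n/p')

# Number of lines sed sees in input 2 ('$=' prints the last line number).
line_count=$(printf 'str1\nstr2\n' | sed -n '$=')

printf 'literal: %s\n' "$literal_hit"
printf 'newline: %s\n' "${newline_hit:-<no match>}"
printf 'lines:   %s\n' "$line_count"
```

Only the first input produces a match, because SED reads the second input as two lines, neither of which contains a backslash.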

Escaping "\r"

The following example matches the literal two-character sequence "\r" (a backslash followed by "r"), not an actual carriage return.

[jerry]$ echo 'str1\rstr2' | sed -n '/\\r/ p'
str1\rstr2

Escaping "\dnnn"

This matches a character whose decimal ASCII value is "nnn". The following example matches only the character "a".

[jerry]$ echo -e "a\nb\nc" | sed -n '/\d97/ p'
a

Escaping "\onnn"

This matches a character whose octal ASCII value is "nnn". The following example matches only the character "b".

[jerry]$ echo -e "a\nb\nc" | sed -n '/\o142/ p'
b

Escaping "\xHH"

This matches a character whose hexadecimal ASCII value is "HH". The following example matches only the character "c".

[jerry]$ echo -e "a\nb\nc" | sed -n '/\x63/ p'
c

POSIX Classes of Regular Expressions

Certain reserved words have a special meaning inside square brackets. These reserved words are referred to as POSIX classes of regular expressions. This section describes the POSIX classes supported by SED.

[:alnum:]

It stands for alphabetical and numeric characters. The following example matches only "One" and "123", but does not match the tab character.

[jerry]$ echo -e "One\n123\n\t" | sed -n '/[[:alnum:]]/ p'
One
123
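A POSIX class is valid only inside a bracket expression, which is why the example is written [[:alnum:]] with two pairs of brackets. The outer brackets can hold more than the class name. The following is a minimal sketch with made-up sample input and illustrative variable names; it relies only on the bracket and interval syntax shown earlier.

```shell
# Hypothetical sample input, one item per line.
sample=$(printf 'One\n123\nab cd\n_x\n')

# Lines that consist entirely of alphanumeric characters.
only_alnum=$(printf '%s\n' "$sample" | sed -n '/^[[:alnum:]]\{1,\}$/p')

# The class combined with a literal underscore inside the same brackets.
alnum_or_underscore=$(printf '%s\n' "$sample" | sed -n '/^[[:alnum:]_]\{1,\}$/p')

# A leading caret negates the class: lines that contain at least one
# character that is not alphanumeric.
has_other=$(printf '%s\n' "$sample" | sed -n '/[^[:alnum:]]/p')

printf 'only alnum:\n%s\n' "$only_alnum"
printf 'alnum or underscore:\n%s\n' "$alnum_or_underscore"
printf 'contains other characters:\n%s\n' "$has_other"
```

The same combinations apply to the other classes described in this section.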

[:alpha:]

It stands for alphabetical characters only. The following example matches only the word "One".

[jerry]$ echo -e "One\n123\n\t" | sed -n '/[[:alpha:]]/ p'
One

[:blank:]

It stands for a blank character, which can be either a space or a tab. The following example matches only the tab character.

[jerry]$ echo -e "One\n123\n\t" | sed -n '/[[:blank:]]/ p' | cat -vte
^I$

Note that the command "cat -vte" is used to make the tab character visible (as ^I).

[:digit:]

It stands for decimal digits only. The following example matches only the digits "123".

[jerry]$ echo -e "abc\n123\n\t" | sed -n '/[[:digit:]]/ p'
123

[:lower:]

It stands for lowercase letters only. The following example matches only "one".

[jerry]$ echo -e "one\nTWO\n\t" | sed -n '/[[:lower:]]/ p'
one

[:upper:]

It stands for uppercase letters only. The following example matches only "TWO".

[jerry]$ echo -e "one\nTWO\n\t" | sed -n '/[[:upper:]]/ p'
TWO

[:punct:]

It stands for punctuation characters, that is, printable characters that are neither spaces nor alphanumeric. The following example matches only "One,Two".

[jerry]$ echo -e "One,Two\nThree\nFour" | sed -n '/[[:punct:]]/ p'

One,Two

[:space:]

It stands for whitespace characters, such as space, tab, and form feed. The following example illustrates this.

[jerry]$ echo -e "One\n123\f\t" | sed -n '/[[:space:]]/ p' | cat -vte
123^L^I$

Metacharacters

In addition to the traditional regular expressions, SED supports metacharacters in the style of Perl regular expressions. Note that metacharacter support is specific to GNU SED and may not work with other variants of SED. Let us discuss metacharacters in detail.

Word Boundary \b

In regular expression terminology, "\b" matches a word boundary. For example, "\bthe\b" matches "the" but not "these", "there", "they", "then", and so on. The following example illustrates this.

[jerry]$ echo -e "these\nthe\nthey\nthen" | sed -n '/\bthe\b/ p'
the

Non-Word Boundary \B

In regular expression terminology, "\B" matches a position that is not a word boundary. For example, "the\B" matches "these" and "they" but not "the". The following example illustrates this.

[jerry]$ echo -e "these\nthe\nthey" | sed -n '/the\B/ p'
these
they

Single Whitespace \s

In SED, "\s" stands for a single whitespace character. The following example matches "Line\t1" but does not match "Line2".

[jerry]$ echo -e "Line\t1\nLine2" | sed -n '/Line\s/ p'
Line    1

Single Non-Whitespace \S

In SED, "\S" stands for a single non-whitespace character. The following example matches "Line2" but does not match "Line\t1".

[jerry]$ echo -e "Line\t1\nLine2" | sed -n '/Line\S/ p'
Line2

Single Word Character \w

In SED, "\w" stands for a single word character, i.e., an alphabetical character, a digit, or an underscore (_). The following example illustrates this.

[jerry]$ echo -e "One\n123\n1_2\n&;#" | sed -n '/\w/ p'
One
123
1_2

Single Non-Word Character \W

In SED, "\W" stands for a single non-word character, the exact opposite of "\w". The following example illustrates this.

[jerry]$ echo -e "One\n123\n1_2\n&;#" | sed -n '/\W/ p'
&;#

Beginning of Pattern Space \`

In SED, "\`" matches the beginning of the pattern space. The following example matches only the word "One".

[jerry]$ echo -e "One\nTwo One" | sed -n '/\`One/ p'
One