Certification String Processing with Regular Expressions
UNIT 4: String Processing with Regular Expressions

UNIT 4: Objectives
• Learn how the regular expression pattern matching system works
• Explore the use of regular expressions in string processing tools

UNIT 4: Agenda
• Regular expressions
• grep
• sed
• less
• Extended regular expressions and awk

Pattern Matching with Regular Expressions
• Regular expressions are a pattern matching engine
• Used by many tools, including: grep, sed, less, vi, awk
• Values:
    • Power over ease of use
    • Greed! (a pattern matches the longest string it can)
• Two types: Basic and Extended

Wildcard Characters
• Wildcard characters stand for another single character:
    .         any single character
    [abc]     any single character in the set
    [a-c]     any single character in the range
    [^abc]    any single character not in the set
    [^a-c]    any single character not in the range

Modifiers
• Modifiers determine the number of the previous character
    *         zero or more of the previous char
    \+        one or more of the previous char
    \?        zero or one of the previous char
    \{i\}     exactly i of the previous character
    \{i,\}    i or more of the previous char
    \{i,j\}   i to j of the previous character

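The modifiers above can be tried out with a short sketch like the following. The four sample lines are made up for illustration, and `grep -c` (count matching lines) is used only to keep the output compact. Note that `\+` and `\?` in basic regular expressions are GNU extensions, so this assumes GNU grep.

```shell
# Made-up sample input: "c", then zero to three "a"s, then "t".
lines='ct
cat
caat
caaat'

# ca*t: zero or more "a" between c and t (matches every line).
star=$(printf '%s\n' "$lines" | grep -c 'ca*t')

# ca\+t: one or more "a" (skips "ct").
plus=$(printf '%s\n' "$lines" | grep -c 'ca\+t')

# ca\?t: zero or one "a" (only "ct" and "cat").
optional=$(printf '%s\n' "$lines" | grep -c 'ca\?t')

# ca\{2,3\}t: two to three "a" (only "caat" and "caaat").
counted=$(printf '%s\n' "$lines" | grep -c 'ca\{2,3\}t')

echo "star=$star plus=$plus optional=$optional counted=$counted"
```
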
Anchors
• Anchors match the beginning or end of a line or word
    ^         line begins with
    $         line ends with
    \<        word begins with
    \>        word ends with

regex Combinations
• Regular expressions are most useful in combination with each other
    .*        zero or more of any character
    [a-z]*    zero or more lowercase letters
    \<cat\>   the word 'cat'
    ab..ef    ab and ef separated by two chars
    .\{32\}   32 of any character
    \*        a literal asterisk

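A minimal sketch of three of these combinations in action, assuming GNU grep (the word anchors `\<` and `\>` are not available in every grep). The sample words are invented for the example.

```shell
# Made-up sample input, one candidate string per line.
words='cat
concatenate
abXYef
abef
a*b'

# \<cat\> matches "cat" only as a whole word, so "concatenate" is skipped.
whole_word=$(printf '%s\n' "$words" | grep '\<cat\>')

# ab..ef needs exactly two characters between "ab" and "ef".
two_between=$(printf '%s\n' "$words" | grep 'ab..ef')

# \* is a literal asterisk rather than a modifier.
literal_star=$(printf '%s\n' "$words" | grep 'a\*b')

printf '%s\n' "$whole_word" "$two_between" "$literal_star"
```
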
Regular Expressions - Examples
What do the following match?
    1. Sm.th
    2. Sm[iy]th
    3. www\.redhat\.com
    4. ^#!
    5. \<the
    6. ^[a-z0-9 ]\{28\}$
    7. ^^Yipes!$$

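One way to check an answer is to feed grep a few candidate strings and see which survive. The sketch below does this for examples 1, 2, and 4 only; the candidate strings are invented for illustration.

```shell
# Made-up candidates for examples 1 and 2.
names='Smith
Smyth
Smooth
Sm4th'

# Sm.th: any single character may sit between "Sm" and "th".
dot_count=$(printf '%s\n' "$names" | grep -c 'Sm.th')

# Sm[iy]th: only "i" or "y" may sit there.
set_count=$(printf '%s\n' "$names" | grep -c 'Sm[iy]th')

# ^#!: the line must begin with "#!"; a "#!" later in the line does not count.
shebang_count=$(printf '%s\n' '#!/bin/sh' 'echo #!' | grep -c '^#!')

echo "Sm.th=$dot_count Sm[iy]th=$set_count ^#!=$shebang_count"
```
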
Quote your regexes!
• On the command line, quote regular expressions
• File name generation characters must remain unquoted
• Do not use quotes in regular expressions within commands

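The reason for quoting can be seen by letting `echo` show what the shell actually hands to a command. This is a sketch only: the scratch directory and the file names `aab` and `ab` are hypothetical, chosen so that the unquoted pattern `a*b` happens to match them as a file name pattern.

```shell
# Hypothetical scratch directory holding two files whose names fit a*b.
scratch=$(mktemp -d)
touch "$scratch/aab" "$scratch/ab"

# Unquoted: the shell expands a*b into file names before the command runs.
unquoted=$(cd "$scratch" && echo a*b)

# Quoted: the pattern reaches the command unchanged.
quoted=$(cd "$scratch" && echo 'a*b')

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"
rm -rf "$scratch"
```

Had the command been grep instead of echo, the unquoted form would have searched for the first file name inside the second file rather than for the intended pattern.
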
grep
• global regular expression print (from the ed command g/re/p)
• Prints lines of files where a pattern is matched
    $ grep john /etc/passwd
    john:x:500:500:john Doe:/home/john:/bin/bash
• Also used as filter in pipelines
    ls | grep .c
• Uses regular expressions
    grep '[0-9][A-Z]\{3\}[0-9]\{3\}' cars

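The last pattern above looks for a digit, three capital letters, and three more digits. The sketch below runs it against made-up data standing in for the `cars` file, whose real contents are not shown in this unit. `LC_ALL=C` is an added precaution so that `[A-Z]` means plain ASCII capitals regardless of locale.

```shell
# Made-up stand-in for the contents of the "cars" file.
cars='1ABC234 sedan
ABC1234 truck
7XYZ89 coupe
4DEF567 wagon'

# Keep only lines containing digit + three capitals + three digits.
plates=$(printf '%s\n' "$cars" | LC_ALL=C grep '[0-9][A-Z]\{3\}[0-9]\{3\}')

printf '%s\n' "$plates"
```
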
sed
• stream editor
• Reads a file or stream of data; writes out the data, performing search and replace instruction(s)
• Uses regular expressions in search string (but not replace string)

Using sed
• Quote search and replace instructions!
• sed addresses
    sed 's/dog/cat/g' pets
    sed '1,50s/dog/cat/g' pets
    sed '/digby/,/duncan/s/dog/cat/g' pets
• Multiple sed instructions
    sed -e 's/dog/cat/' -e 's/hi/lo/' pets
    sed -f myedits pets

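The effect of an address is easiest to see on a small input. The sketch below uses invented lines in place of the `pets` file and applies a line-number address and a pattern-range address like those above.

```shell
# Made-up stand-in for the contents of the "pets" file.
pets='dog one
digby dog
dog two
duncan dog
dog three'

# Line-number address: substitute only on lines 1 through 2.
first_two=$(printf '%s\n' "$pets" | sed '1,2s/dog/cat/g')

# Pattern-range address: substitute from the first line matching
# "digby" through the next line matching "duncan".
ranged=$(printf '%s\n' "$pets" | sed '/digby/,/duncan/s/dog/cat/g')

printf '%s\n' "$ranged"
```

Lines outside the addressed range pass through unchanged, which is why the first and last lines still read "dog" in the second result.
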
less and slocate
• Searches in less use regular expressions
    /h[aeiou]t
• Searches in slocate can use regular expressions
    slocate -r 'tig.*png'

Regular Expressions in vi and vim
• Regular expressions operate in less-like search operations
    Example: /RKZ[68][0-9]3
• And in sed-like search and replace commands
    Example: :1,$s/\<[Cc]at\>/& and dog/g

Extended Regular Expressions
• An extension of the regular expression set
• Tools that use extended regexes:
    egrep
    grep -E (same as egrep)
    awk

Extended regex Syntax
• Most basic regular expressions are supported
• Basic regular expressions requiring a preceding backslash no longer require backslash
    a{0,2}    counter: 0, 1, or 2 of the letter a
• Exception: word anchors ( \< and \> ) still require backslashes

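The difference between the two syntaxes can be sketched by writing the same counter both ways. The sample lines are invented, and the word-anchor check assumes GNU grep.

```shell
# Made-up sample input: zero to three "a"s followed by "b".
lines='b
ab
aab
aaab'

# Basic syntax: the braces need backslashes.
basic=$(printf '%s\n' "$lines" | grep -c '^a\{0,2\}b$')

# Extended syntax: the same counter without backslashes.
extended=$(printf '%s\n' "$lines" | grep -E -c '^a{0,2}b$')

# Word anchors keep their backslashes even in extended syntax.
word=$(printf '%s\n' 'cat' 'concatenate' | grep -E '\<cat\>')

echo "basic=$basic extended=$extended word=$word"
```
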
awk
• Programming language for editing text
• Searches a file for lines matching a pattern or patterns
• Performs specified actions on matching lines
• Search patterns are extended regular expressions

Using awk
• awk programs are data-driven
• awk rules contain a pattern and an action in curly braces
    pattern { action }
• The action is taken on any line matching the pattern
    awk '/bash/ { print }' /etc/passwd
    awk '/[2-5]+/ { print }' /etc/inittab

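A small sketch of the pattern { action } rule, run on invented passwd-style lines rather than the real /etc/passwd. The second command goes one step beyond the slide: it assumes the `-F:` option to split each line on colons so that the action can print just the first field.

```shell
# Made-up passwd-style sample data.
accounts='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
john:x:500:500:john Doe:/home/john:/bin/bash'

# Print every line that matches the pattern /bash/.
bash_lines=$(printf '%s\n' "$accounts" | awk '/bash/ { print }')

# Same pattern, different action: print only the login name (field 1).
bash_users=$(printf '%s\n' "$accounts" | awk -F: '/bash/ { print $1 }')

printf '%s\n' "$bash_users"
```
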
End of Unit 4
• Questions and answers
• Summary
    • Basic string processing
    • Simple regular expressions