Module 8 Pipes, Redirection and REGEX


Exam Objective 3.2 Searching and Extracting Data from Files Objective Summary Piping and redirection Partial POSIX

Command Line and Redirection

Command Line Pipes The pipe character (|) can be used between two commands to send the output of the first as input to the second: ls /etc | head The output of ls /etc is sent to head as input.

Command Line Pipelines Multiple commands can be combined to form pipelines. The order in which commands are added to the pipeline can affect the output:
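A minimal sketch of why pipeline order matters. The four names are made-up sample data, and the variable names are only for illustration:

```shell
# Hypothetical sample data: four names in unsorted order.
names='delta
alpha
charlie
bravo'

# Sort everything first, then keep the first two lines of the sorted result.
sorted_then_head=$(printf '%s\n' "$names" | sort | head -n 2)

# Keep the first two lines first, then sort only those two lines.
head_then_sorted=$(printf '%s\n' "$names" | head -n 2 | sort)

printf 'sort then head:\n%s\n' "$sorted_then_head"
printf 'head then sort:\n%s\n' "$head_then_sorted"
```

The same two commands produce different results because each stage only sees what the previous stage passed along.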

I/O Redirection Three Input/Output (I/O) streams are associated with every command: Standard Input (STDIN) is normally provided by the user via the keyboard. Standard Output (STDOUT) is the output produced by the command when operating correctly. STDOUT normally appears in the same window where the command was executed. Standard Error (STDERR) is the output produced by the command when an error has occurred. STDERR normally appears in the same window where the command was executed.

I/O Redirection Symbols Summary of redirection possible with the bash shell:
< /path/to/file (Redirect STDIN from file)
> /path/to/file (Redirect STDOUT overwriting file)
>> /path/to/file (Redirect STDOUT appending file)
2> /path/to/file (Redirect STDERR overwriting file)
2>> /path/to/file (Redirect STDERR appending file)
&> /path/to/file (Redirect STDERR and STDOUT overwriting file)
&>> /path/to/file (Redirect STDERR and STDOUT appending file)

The null device The null device is represented by the /dev/null file (otherwise known as the "bit bucket"). This file is very useful in redirection of input and output. This file serves two purposes: any output redirected to /dev/null is discarded, and /dev/null can be used for input to provide an empty stream, because reading from it returns end-of-file immediately.
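Both uses of /dev/null can be checked with a short sketch. The variable names here are hypothetical:

```shell
# Output sent to /dev/null is thrown away, so nothing is captured here.
discarded=$(echo "this text is discarded" > /dev/null)

# Reading from /dev/null reaches end-of-file at once: zero bytes of input.
bytes_read=$(wc -c < /dev/null | tr -d ' ')

echo "captured text: '$discarded'"
echo "bytes read from /dev/null: $bytes_read"
```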

STDIN, STDOUT, and STDERR

STDIN or 0 Standard Input (STDIN) normally is provided by the keyboard but can be redirected with the < symbol. STDIN can be read by programs to get data for them to process. To signal a program that you wish to stop providing data by the keyboard via STDIN, type CTRL-D. The tr command reads its data from STDIN. It translates from one set of characters to another. If you were the user typing the data to be translated by the tr command, you would type CTRL-D when finished.

STDIN from keyboard In the following example, the tr command translates from lowercase to uppercase after the user typed the command and pressed Enter. Then, "alpha" was typed and Enter pressed. Finally, the user typed CTRL-D.

Redirecting STDIN from file The tr command translates from lowercase to uppercase with STDIN being redirected from the /etc/hosts file:
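As a sketch of the same idea with a predictable input, the following uses a small temporary file instead of /etc/hosts. The file name words.txt and its contents are assumptions made for illustration:

```shell
# Create a hypothetical input file in a temporary directory.
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/words.txt"

# tr reads only from STDIN, so the file is supplied with the < symbol.
upper=$(tr 'a-z' 'A-Z' < "$workdir/words.txt")
echo "$upper"

rm -r "$workdir"
```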

STDOUT or 1 Standard Out (STDOUT) is the output from the command when operating correctly. It is usually displayed in the same window where the command is executed. The echo command is used to print messages to STDOUT. It can be used to demonstrate how STDOUT can be redirected, as shown on the following slide.

Redirecting STDOUT In the example below, the echo Linux 1 command is executed and the output appears on STDOUT. Then, the echo Linux 1 > a.txt command redirects the output to the file a.txt. Finally, the command cat a.txt sends the file contents to STDOUT, so the output is shown.

Appending STDOUT redirection Using a single arrow > for STDOUT redirection will clobber, or overwrite, the file specified. Using the double arrow >> for STDOUT redirection will either create a new file or append to an existing one:
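A short sketch of the difference, written to a temporary directory so that no real file is clobbered. The file name a.txt follows the earlier slide; the echoed text is arbitrary:

```shell
workdir=$(mktemp -d)
out="$workdir/a.txt"

echo "Linux 1" > "$out"    # creates a.txt
echo "Linux 2" > "$out"    # overwrites: "Linux 1" is gone
echo "Linux 3" >> "$out"   # appends after "Linux 2"

contents=$(cat "$out")
echo "$contents"

rm -r "$workdir"
```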

STDERR or 2 Standard Error (STDERR) is the output of a command after an error has occurred. It is normally sent to the console/terminal where the command is executed. ls /fake is a command that will cause an error to be output to STDERR because the /fake file does not exist.

Redirecting STDERR ls /fake 2> /tmp/err.msg is a command that would cause an error to be sent to STDERR which is then redirected to the /tmp/err.msg file. The cat /tmp/err.msg command sends the contents of the file to STDOUT to display the file:

Disposing of STDERR ls /fake 2> /dev/null is a command that would cause STDERR to be redirected to the /dev/null file, in effect disposing of the error message. Notice cat /dev/null displays no visible output.
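The two STDERR redirections above can be sketched as follows. The path under the temporary directory stands in for /fake and is guaranteed not to exist; the exact wording of the error message varies between systems, so only its presence is checked:

```shell
workdir=$(mktemp -d)
missing="$workdir/fake"    # hypothetical path that does not exist

# The error message goes to err.msg instead of the terminal.
status=0
ls "$missing" 2> "$workdir/err.msg" || status=$?
err_size=$(wc -c < "$workdir/err.msg" | tr -d ' ')

# Same failing command, but this time the message is discarded.
shown=$(ls "$missing" 2> /dev/null) || true

echo "ls exit status: $status, bytes of error text saved: $err_size"
rm -r "$workdir"
```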

Working with STDERR and STDOUT find is a command that searches the filesystem. It sends output to STDOUT when it successfully locates a file that matches your criteria. It sends output to STDERR when it fails to access a directory. The find command will be used to demonstrate redirecting both STDOUT and STDERR on the following slides. More detail about the find command

STDERR and STDOUT Example The following example demonstrates the find command searching recursively the /etc/pki directory for any files matching "*.crt". Two lines of STDERR and two lines of STDOUT messages appear:

Isolating STDERR In the next example, the STDOUT output is redirected to the /dev/null file, so the STDERR output alone is sent to the terminal window:

Isolating STDOUT In the next example, the STDERR output is now redirected to the /dev/null file, so the STDOUT output alone is sent to the terminal window:

Redirecting Multiple Streams Separately In the next example, the STDERR output is sent to the crt.err file and the STDOUT output is sent to the crt.txt file:

Redirecting Multiple Streams Combined In this example, both STDOUT and STDERR are redirected into the same file, crt.all:
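A sketch of separate and combined redirection that does not depend on the contents of /etc/pki. The emit function is a hypothetical stand-in for find that writes one line to each stream, and the file names are arbitrary. The combined case uses > file 2>&1, which bash also accepts in the shorter &> file form:

```shell
# Hypothetical helper that writes one line to each output stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

workdir=$(mktemp -d)

# Separate files for each stream.
emit > "$workdir/out.txt" 2> "$workdir/err.txt"

# Both streams in one file.
emit > "$workdir/all.txt" 2>&1

out=$(cat "$workdir/out.txt")
err=$(cat "$workdir/err.txt")
all_lines=$(wc -l < "$workdir/all.txt" | tr -d ' ')

echo "out.txt: $out / err.txt: $err / lines in all.txt: $all_lines"
rm -r "$workdir"
```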

find Command

Searching with find command The filesystem has hundreds of directories with thousands of files, which makes finding files challenging. The find command is a powerful tool for searching for files in different ways, including by name, size, date, and ownership.

Syntax of find command The find command has the following syntax: find [start_dir] [search_op] [criteria] [result] If the starting directory (start_dir) is not specified, then the current directory is assumed. The search option (search_op) is how the search will be done. For example, use the -name option to search by name.

Syntax of find command (cont'd) The search criteria (criteria) is the data to be used with the search option. So, if the search option was -name, then the search criteria would be the name of the file to find. The result option (result) defaults to -print, which will output the names of the files that are found. Other result options can perform actions on the files that are found.

Searching by file name Consider the following command: find /etc/pki -name "*.crt" It begins searching recursively from the /etc/pki directory and outputs any file names that match "*.crt" (anything that ends in ".crt").
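The same search can be tried on a throwaway directory tree. The directory and file names below are invented for the sketch:

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/certs/private"
touch "$workdir/certs/ca.crt" \
      "$workdir/certs/private/host.crt" \
      "$workdir/certs/notes.txt"

# Quote the pattern so the shell passes *.crt to find unexpanded.
matches=$(find "$workdir" -name "*.crt" | sort)
match_count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')

echo "$matches"
rm -r "$workdir"
```

The search is recursive, so host.crt in the private subdirectory is found along with ca.crt, while notes.txt is skipped.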

Displaying file detail The option -ls will create output similar to the ls -l command. The columns output are: inode, blocks used, permissions, link count, user owner, group owner, size, date/time, and file name.

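Searching by file size The -size option can be used by find to search for files by size. Large units can be specified with suffixes such as k, M, and G (GNU find expects a lowercase k). Using +1M means more than one megabyte. Note that GNU find rounds file sizes up to whole units before comparing, so -1M matches only empty files; to find files smaller than about one megabyte, use a smaller unit, as in -1024k.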

Useful options for find command
Option      Example          Meaning
-maxdepth   -maxdepth 1      Only search the specified directory and its immediate contents, without descending into subdirectories
-group      -group payroll   Find any files owned by the payroll group
-iname      -iname hosts     Case insensitive search by filename
-mmin       -mmin -10        Find any files modified less than ten minutes ago
-type       -type f          Find only regular files
-user       -user bob        Find any files owned by the user bob
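A sketch of three of these options on a small temporary tree. The file names are made up, and -maxdepth and -iname are assumed to be available, as they are in GNU and BSD find (they are not part of POSIX):

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/sub"
touch "$workdir/Hosts" "$workdir/sub/hosts" "$workdir/sub/other"

# -maxdepth 1 with -type f: regular files directly inside the start directory.
shallow=$(find "$workdir" -maxdepth 1 -type f | wc -l | tr -d ' ')

# -iname: case-insensitive name match at any depth.
insensitive=$(find "$workdir" -iname hosts | wc -l | tr -d ' ')

# -type d: directories only (the start directory itself and sub).
dirs=$(find "$workdir" -type d | wc -l | tr -d ' ')

echo "shallow files: $shallow, iname matches: $insensitive, directories: $dirs"
rm -r "$workdir"
```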

less Command

Viewing files with less command The less command is a pager command designed to display only one page of data at a time. The more command is another pager command that has fewer features than the less command. Both commands allow the user to move back and forth with movement commands to view one page at a time.

The help screen in less Once in the less program, pressing the "h" key will display the help screen:

less movement commands As seen in the help screen, the less command has many movement commands. The most common commands are:
Movement          Key
Window forward    Spacebar
Window backward   b
Line forward      Enter
Exit              q
Help              h

less searching commands Type / to search from the cursor to the end of the file. Type ? to search from the cursor to the beginning of the file. Then type the pattern to search for and press Enter. If more than one match is found, press n to go to the next match or N to go to the previous match.

head or tail

Filtering with head The head command displays the first ten lines of a file by default. The -n option allows for the number of lines to be displayed to be specified.

head with negative lines Normally the head command displays the number of lines specified from the top of the file. Using -n with a negative value (a GNU extension) indicates how many lines at the bottom of the file to leave out. This example shows all lines from /etc/passwd except the last thirty-two.

Filtering with tail The tail command displays the last ten lines of a file by default. The -n option allows for the number of lines to be displayed to be specified:

tail with positive lines If the -n option specifies the number of lines to be displayed with a plus ( + ) prefix, then the tail command interprets that to mean to display from that line number to the end of the file:
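The -n forms of head and tail can be compared on a five-line sample. The input below is invented for the sketch:

```shell
# Hypothetical five-line input.
lines='one
two
three
four
five'

first_two=$(printf '%s\n' "$lines" | head -n 2)      # top two lines
last_two=$(printf '%s\n' "$lines" | tail -n 2)       # bottom two lines
from_third=$(printf '%s\n' "$lines" | tail -n +3)    # line 3 to the end

printf 'head -n 2:\n%s\n' "$first_two"
printf 'tail -n 2:\n%s\n' "$last_two"
printf 'tail -n +3:\n%s\n' "$from_third"
```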

Following with tail The tail command is able to monitor changes to a file and print them out as they occur by using the -f option. System administrators frequently follow log files in order to troubleshoot system problems. When following with the -f option, the user must terminate the tail command by using CTRL-C.

sort Command

Sorting files or input The sort command rearranges the lines of its input according to one or more fields you specify for sorting. Fields are separated by whitespace, although with the -t option, you can specify the delimiter. The default sort is in ascending order, but you can use the -r option to reverse the sort order. The default sort is a dictionary sort, but you can use the -n option to make it a numeric sort.

Example of sort In the following example, the /etc/passwd file is sorted using a : character as a delimiter, by the fourth field numerically and then the third field numerically in reverse:
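One way to write such a sort is sketched below on four made-up records in /etc/passwd format rather than the real file. The key syntax -k4,4n and -k3,3nr is an assumption about how to express the description above, limiting each key to a single field:

```shell
# Hypothetical records in /etc/passwd format (name:password:uid:gid:...).
passwd_like='root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
alice:x:1001:100:Alice:/home/alice:/bin/sh
bob:x:1002:100:Bob:/home/bob:/bin/sh'

# Field 4 (gid) ascending and numeric, then field 3 (uid) numeric in reverse.
order=$(printf '%s\n' "$passwd_like" | sort -t: -k4,4n -k3,3nr | cut -d: -f1)
echo "$order"
```

Here alice and bob share gid 100, so the reversed uid key places bob (1002) ahead of alice (1001).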

File Statistics

File statistics with wc command The wc command outputs up to three statistics for each file it is given as an argument. By default, wc displays the lines, words and bytes contained in each file. If provided more than one file, then it also calculates the totals of all files. To view individual statistics, specify -l for lines, -w for words or -c for bytes.

Example of wc command To analyze the number of lines, words and bytes in the /etc/passwd and /etc/passwd- files, the following wc command could be executed:

Using wc with pipes The wc command is often used with pipes so that the output of a command can be analyzed. Using wc -l as the final command in the pipe will count how many lines of output were produced. For example, to determine how many files and directories are in the /etc directory, you could execute: ls /etc | wc -l
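The counting pattern can be sketched on a temporary directory with a known number of entries. The names a, b, and c are arbitrary, and tr -d ' ' strips the padding that some wc implementations add:

```shell
workdir=$(mktemp -d)
touch "$workdir/a" "$workdir/b" "$workdir/c"

# ls prints one entry per line when writing to a pipe; wc -l counts them.
entries=$(ls "$workdir" | wc -l | tr -d ' ')

# wc -w counts words instead of lines.
words=$(printf 'one two\nthree\n' | wc -w | tr -d ' ')

echo "entries: $entries, words: $words"
rm -r "$workdir"
```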

cut Command

Filtering with cut command If you want to extract columns of text, then the cut command provides two simple techniques: By delimiter, where the tab character is the default. The -d option lets you specify another delimiter and -f indicates which fields to extract. By character position, using the -c option with the range of the column to extract.

Example of cut command The /etc/passwd file is delimited by colon with these fields: account:password:uid:gid:gecos:directory:shell To extract the first, and fifth through seventh fields: cut -d: -f1,5-7 /etc/passwd
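Both cut techniques can be sketched on a single made-up record in the same format:

```shell
# Hypothetical record in /etc/passwd format.
record='alice:x:1001:100:Alice Example:/home/alice:/bin/bash'

# By delimiter: fields 1 and 5 through 7, separated by colons.
fields=$(printf '%s\n' "$record" | cut -d: -f1,5-7)

# By character position: characters 1 through 5.
chars=$(printf '%s\n' "$record" | cut -c1-5)

echo "$fields"
echo "$chars"
```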

grep Command

Filtering with grep command The grep command can be used to filter standard input or the contents of a file for lines matching a specified pattern. If you want to see where a pattern, or perhaps a word, appears within a file, then the grep command is useful for that purpose.

Common grep options
Option    Purpose
--color   Color the matches found
-v        Reverse (negate) matches
-c        Count matches
-n        Number matching lines
-l        List matching files
-i        Match case insensitive
-w        Match pattern as a word
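A sketch of several of these options against four made-up lines, read from STDIN:

```shell
# Hypothetical sample input.
text='Cat
cat
concatenate
dog'

count_cs=$(printf '%s\n' "$text" | grep -c cat)     # lines containing "cat"
count_ci=$(printf '%s\n' "$text" | grep -c -i cat)  # same, ignoring case
word=$(printf '%s\n' "$text" | grep -w cat)         # "cat" as a whole word
inverted=$(printf '%s\n' "$text" | grep -v cat)     # lines without "cat"
numbered=$(printf '%s\n' "$text" | grep -n dog)     # prefix with line number

echo "case-sensitive: $count_cs, case-insensitive: $count_ci"
echo "word match: $word"
echo "numbered: $numbered"
```

Note that -c counts matching lines, not individual occurrences within a line.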

Basic Regular Expressions

Basic Regular Expressions Basic Regular Expressions (BRE) can be used with the grep command without requiring an option (unlike Extended Regular Expressions, shown later). The simplest regular expressions are just alphabetic or numeric characters that match themselves. The backslash \ can be used to escape the meaning of regular expression metacharacters, including the backslash itself.

BRE: the . example The . (period) character matches exactly one character. The example below shows the grep command matching the 'a' followed by two characters. The results show it matched 'abc'.

BRE: the [ ] example The [ ] (brackets) characters are used to match exactly one character. The characters can be listed or given as a range. If the first character listed is ^ (caret), then it means not the characters bracketed.

BRE: the * example The * (asterisk) character will match zero or more of the previous character. Matching "a*" is not very useful because it might match zero a's (matches every line). Matching "abcd*" would be more useful, since you would need an "abc" followed by zero or more d's.

BRE: the ^ example The ^ (caret) character, when appearing at the beginning of the pattern, means that pattern must appear at the beginning of the line. The ^ not at the beginning of a pattern matches itself.

BRE: the $ example The $ (dollar sign), when appearing at the end of the pattern, means that pattern must appear at the end of the line. The $ not at the end of a pattern matches itself.

BRE: Combining ^ and $ Combining both the ^ and $ characters allows for two special matches: '^$' is a blank line match. '^pattern$' matches if the whole line contains only the "pattern".
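The BRE metacharacters covered so far can be tried together on a small sample file. The file contents are invented for the sketch, and the fifth line is deliberately blank:

```shell
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
# Hypothetical sample lines; the fifth line is intentionally blank.
printf 'abc\nabcddd\nab\nxyz\n\na1c\n' > "$sample"

dot=$(grep -c 'a..' "$sample")         # a followed by any two characters
bracket=$(grep 'a[0-9]c' "$sample")    # a, one digit, then c
star=$(grep -c 'abcd*' "$sample")      # abc followed by zero or more d
at_start=$(grep '^x' "$sample")        # line begins with x
at_end=$(grep -c 'c$' "$sample")       # line ends with c
blank=$(grep -c '^$' "$sample")        # empty line
whole=$(grep '^abc$' "$sample")        # line is exactly abc

echo "dot=$dot bracket=$bracket star=$star at_start=$at_start"
echo "at_end=$at_end blank=$blank whole=$whole"
rm -r "$workdir"
```

The patterns are in single quotes so that the shell does not interpret *, $, or the brackets before grep sees them.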

Extended Regular Expressions

Extended Regular Expressions The use of Extended Regular Expressions (ERE) requires the -E option when using the grep command. Extended Regular Expressions can be combined with Basic Regular Expressions. The following are ERE characters: ?, +, and |

ERE: the + example The + (plus) character will match one or more of the previous character. Matching "a+" is useful because it can match one or more a's, ensuring only lines that have at least one a are matched.

ERE: the ? example The ? (question mark) character will optionally match one of the previous character. The ? character is useful for matching characters that only occasionally appear in a word. The following example illustrates this:

ERE: the | example The | (vertical bar) character will act like an or operator between two regular expressions. This alternation operator is useful to be able to match multiple patterns:
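A sketch of the three ERE characters with grep -E, using made-up sample words:

```shell
# Hypothetical sample input.
sample='xz
xyz
xyyz
color
colour
gray
grey'

plus=$(printf '%s\n' "$sample" | grep -E -c 'xy+z')         # one or more y
optional=$(printf '%s\n' "$sample" | grep -E -c 'colou?r')  # u is optional
either=$(printf '%s\n' "$sample" | grep -E -c 'gray|grey')  # either spelling

echo "xy+z: $plus, colou?r: $optional, gray|grey: $either"
```

The line xz is not counted by 'xy+z' because + requires at least one y, whereas 'xy*z' would have matched it.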

The xargs command The xargs command helps complex piped command sets execute more efficiently. It attempts to build the longest command line possible, with as many arguments as possible, so that the command is not executed once for every argument.
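The batching behavior can be seen in a minimal sketch that uses echo as the command being run. The three arguments are arbitrary, and -n 1 is included only to show the one-invocation-per-argument case for comparison:

```shell
# Three arguments arrive on STDIN, one per line.
# xargs packs them onto a single echo command line.
batched=$(printf 'a\nb\nc\n' | xargs echo)

# -n 1 limits each invocation to one argument, so echo runs three times.
one_each=$(printf 'a\nb\nc\n' | xargs -n 1 echo)

printf 'batched:\n%s\n' "$batched"
printf 'one per invocation:\n%s\n' "$one_each"
```

The single line of batched output shows that echo ran once with all three arguments.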