Finding files and directories (advanced), standard streams, piping

Finding files and directories (advanced), standard streams, piping Laboratory of Genomics & Bioinformatics in Parasitology Department of Parasitology, ICB, USP

Finding files or directories
When you have lots of files (potentially thousands!) in your system, finding that one file that you know you have, but can't remember where, can be a daunting task.
There are two different commands for finding files in Linux systems: locate and find.
The find command is probably always present, while locate might or might not be installed (although it is common).
As The Linux Command Line book says:
    locate: Find Files The Easy Way
    find: Find Files The Hard Way

locate
locate finds files exclusively by name.
The locate program performs a rapid database search of path names, and then outputs every name that matches a given query.
Let's say we want to find every file that contains .zip in the name (including directories in its path); the command for locate would be:
    locate .zip
locate, as implied above, uses a pre-made database to look up names.
The problem with that is that a newly created file will not be found before the database is updated.
Want proof? List the contents like this:
    ls ~jmalves

locate
As you can see, there is a file (testtest.zip) with .zip in the name that was not found by locate.
That is because the database for locate is only updated at certain intervals, typically once a day.
Thus, any files created, deleted, renamed, etc. before the next update to the database will not be seen by locate.
That, of course, can be a disadvantage.
The advantage of using the database is the speed of the lookup.
This was last week! How about now?

find
The expression that find uses to select files consists of one or more primaries, each of which is a separate command line argument.
find evaluates the expression each time it processes a file.
An expression can contain any of the following types of primaries: Options, Tests, Actions, Operators.
For example:
    find ~ -maxdepth 3 -name '*.pdf' -and -perm 777 -delete

find
    find ~ -maxdepth 3 -name '*.pdf' -and -perm 777 -delete
command: find
path where the search should start: ~ (user's home)
option: -maxdepth 3
tests: -name '*.pdf' and -perm 777
logical operator: -and
action: -delete

find
    find ~ -maxdepth 3 -name '*.pdf' -and -perm 777 -delete
This search will:
Look for files and directories in the user's home directory (and its subdirectories, but...)
Go down at most three levels below the starting point (e.g., it will examine everything inside ~/dir1/subdir1/, including the entry ~/dir1/subdir1/subsub2/ itself, but nothing deeper, such as ~/dir1/subdir2/subsub1/subsubsubx/).
Look for files whose names end in .pdf AND whose permissions are exactly 777.
Finally, find will delete files that satisfy those conditions.
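Because -delete is irreversible, it can help to run the same expression with -print first. The following is a minimal sketch under stated assumptions: it builds a throwaway tree with mktemp, and every file and directory name in it is hypothetical. It shows which entries survive the -maxdepth 3 and -perm 777 tests.

```shell
# Build a scratch tree; all names below are made up for illustration.
top=$(mktemp -d)
mkdir -p "$top/dir1/subdir1/subsub2"
touch "$top/dir1/subdir1/keep.pdf"          # depth 3
touch "$top/dir1/subdir1/other.pdf"         # depth 3
touch "$top/dir1/subdir1/subsub2/deep.pdf"  # depth 4, beyond -maxdepth 3
chmod 777 "$top/dir1/subdir1/keep.pdf" "$top/dir1/subdir1/subsub2/deep.pdf"
chmod 644 "$top/dir1/subdir1/other.pdf"     # fails the -perm 777 test

# Dry run: -print instead of -delete, so nothing is removed.
matches=$(find "$top" -maxdepth 3 -name '*.pdf' -and -perm 777 -print)
echo "$matches"

rm -rf "$top"
```

Only keep.pdf is reported: other.pdf has the wrong permissions and deep.pdf sits one level too deep. Once the printed list looks right, -print can be swapped back for -delete.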

find: some tests
    -amin, -cmin, -mmin
    -anewer, -cnewer, -mnewer
    -atime, -ctime, -mtime
    -empty
    -executable
    -group
    -name, -iname
    -inum
    -newer
    -nogroup, -nouser
    -path
    -perm
    -readable
    -regex
    -samefile
    -size
    -type
    -user
    -writable
    etc. etc. etc.

find: actions
    -delete
    -exec, -execdir
    -fls, -ls
    -print, -fprint
    -print0, -fprint0
    -printf, -fprintf
    -ok, -okdir
    -prune
    -quit

find: operators
    \( \)
    -not, !
    -a, -and
    -o, -or

find
Let's try! First, log into the remote server (200.144.244.172).
Use find to perform the following searches:
    find /data/genomas -name '*contigs*'
    find /data/genomas -iname '*contigs*'
    find /data/genomas -name 'Try*'
    find /data/genomas -iname 'Try*' -type f
    find /data/genomas -iname 'Try*' -type f -exec ls -l {} \;
    find /data/genomas -iname 'Try*' -type f -exec ls -l {} +
Characters like ; and ( and ) have special meaning to the shell, so they must be escaped with a preceding \ (backslash): \; \( \)
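The last two searches differ only in how -exec is terminated. As a rough sketch (the scratch directory and file names are hypothetical), echo can stand in for ls -l to make the difference countable: with \; the command runs once per file, while with + the file names are batched into as few invocations as possible.

```shell
top=$(mktemp -d)
touch "$top/Try_a.txt" "$top/Try_b.txt" "$top/Try_c.txt"

# \; : echo is started once per matching file, so three lines come out.
per_file=$(find "$top" -type f -iname 'try*' -exec echo {} \; | wc -l)

# +  : all matching names are handed to a single echo, so one line comes out.
batched=$(find "$top" -type f -iname 'try*' -exec echo {} + | wc -l)

printf 'once per file: %s line(s), batched: %s line(s)\n' "$per_file" "$batched"
rm -rf "$top"
```

With many files, the + form usually finishes faster because far fewer processes are started.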

find
We can group different tests in a search command.
To do that, we use the logical operators mentioned earlier (-and, -or, -not).
If no operator is given, -and is applied by default in most cases.
Grouping is performed with parentheses, which have to be escaped with a backslash (since they have special meaning for the shell).
Examples:
    find /data/ -name '*contigs*' -and -type d
    find /data/ -name '*contigs*' -type d                (same as the previous one)
    find /data/ \( -name 'Try*' -or -name 'Lc*' \) -and -type f
    find /data/ -iname 'Try*' -not -type d
https://www.gnu.org/software/findutils/manual/html_mono/find.html
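The parentheses in the third example matter because -and binds more tightly than -or. The sketch below, run on a hypothetical scratch directory, compares the grouped expression with the same tests written without parentheses.

```shell
top=$(mktemp -d)
mkdir "$top/Try_dir"                       # a directory whose name starts with Try
touch "$top/Try_1.fa" "$top/Lc_1.fa" "$top/other.fa"

# Grouped: (name is Try* OR Lc*) AND it is a regular file.
grouped=$(find "$top" \( -name 'Try*' -or -name 'Lc*' \) -and -type f | wc -l)

# Ungrouped: read as  name is Try*  OR  (name is Lc* AND it is a regular file),
# so the directory Try_dir is matched as well.
ungrouped=$(find "$top" -name 'Try*' -or -name 'Lc*' -and -type f | wc -l)

printf 'grouped: %s match(es), ungrouped: %s match(es)\n' "$grouped" "$ungrouped"
rm -rf "$top"
```

The grouped search reports the two regular files; the ungrouped one also reports Try_dir.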

Quiz time! Go to the Moodle site and choose Quiz 17 (beware time limits!)

Now you do it! Go to the Moodle site, Practical Exercise 17. Follow the instructions to answer the questions in the exercise (and beware time limits!). Remember: in the PE, you should do things in practice before answering the question!

Standard streams
Standard streams (for data flow) are automatically connected input and output communication channels between a computer program and its environment, available when the program begins execution.
They are considered a special kind of virtual text file.
One of the most important concepts in the use of the command line!
There are three such streams:
    Standard input (stdin)
    Standard output (stdout)
    Standard error (stderr)

Standard streams
In most operating systems before UNIX, programs had to be explicitly connected (by the programmer) to the appropriate input and output devices.
One of the advances introduced by UNIX was abstract devices, which eliminated the need for the program to know where data was coming from (or going to).
UNIX also implemented automatic connection of each running program to the standard data streams (which tie the program to actual physical devices).
    Standard input (stdin): where data comes from to enter the program
    Standard output (stdout): where data goes when it gets out of the program
    Standard error (stderr): where program errors (or warnings or diagnostic messages) go when issued

Standard streams
Some programs do not require standard input, e.g., ls, pwd.
Some others do not require standard output, e.g., mkdir, cd.
Standard input is represented by number 0.
Standard output is represented by number 1.
Standard error is represented by number 2.

Standard streams
By default:
    Standard input (stdin): keyboard
    Standard output (stdout): screen
    Standard error (stderr): screen
[Figure: a text terminal. The keyboard feeds the process through #0 stdin; the process writes to the display through #1 stdout and #2 stderr.]

Standard streams
Let's try! Log into the remote server, in case you were not there already.
If you were in class previously, you must have a program called average whose file has been placed in your $HOME/bin directory.
If you don't have that file, copy it from ~dummy/bin/ to your home directory.
The average program accepts data from the standard input and sends its output to standard output.
Start the program (remember: if it's not in a directory from your $PATH, you must give the relative or absolute path to be able to run it! Also make sure your copy of the program has execute permissions for your user):
    average        (or ./average or bin/average etc.)
Notice that the program started, and it is now waiting for data!

Data is coming from the standard input, which is the keyboard, by default.
So, type a number, and then ENTER. Keep doing that until you are done.
To signal the end of file (after all, STDIN is a virtual file), press Ctrl+d by itself, at the start of a new line.
Since the average program waits for the whole input before performing any calculation, no output appears until the end of input.
Other programs could behave differently; for example:
    head
This program returns the top 10 (by default) lines of a file. Try it with STDIN! Enter lines until the program exits.

Standard streams
Just by existing, standard streams are already very useful.
But the capability of redirection makes them even more versatile.
That way, data can come from (or go to) different places than just the keyboard (or the screen).
Redirection can be done between a program and input and output files, or between different programs.
This is the main enabler of the modularity displayed by UNIX, especially at the command line!
Remember the first lecture? The Unix philosophy: combining small programs that each do only one thing (but do it well), instead of having large programs that do a lot of things (but not as well done):
    "the power of a system comes more from the relationships among programs than from the programs themselves"
    Brian W. Kernighan and Rob Pike, 1984

Quiz time! Go to the Moodle site and choose Quiz 18 (beware time limits!)

Redirecting to (and from) files
Typing a lot of data for the program would not be practical.
Reading a large amount of output on the screen wouldn't either; actually, it would be impossible in many cases.
To redirect the streams, we use redirection operators.
The redirection operators are:
    >   >>   <   <<   <<<
To determine which stream you are redirecting, you can prepend its number to the operator; e.g., 2> (will redirect STDERR).

Redirecting to (and from) files
The redirection operators are:
    >   : redirect STDOUT to file named on the right
    2>  : redirect STDERR to file named on the right
    >>  : redirect, appending STDOUT to file named on the right
    2>> : redirect, appending STDERR to file named on the right
    <   : redirect STDIN from file named on the right
    <<  : redirect STDIN as a here-document
    <<< : redirect STDIN as a here-string
The operators that redirect STDOUT and STDERR create the file if it does not exist.
Careful! The single version of the operator (e.g., 2>) will always overwrite the file named on the right (if it exists)!

Redirecting to (and from) files
Notice that it is not necessary to use the file handle numbers for the STDIN and STDOUT streams.
If nothing is given, these are the default choices for the < and > redirection operators.
It is possible to merge STDOUT and STDERR and send them to the same file by using either construct:
    command > file 2>&1
    command &> file
Both versions do the same thing: send the two streams to the same file (overwriting the file!).
To add to the file without overwriting, use the >> file 2>&1 and &>> versions.
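In the first construct, the order of the two redirections matters, because the shell processes them from left to right. The sketch below illustrates this with a hypothetical helper function, talk, that writes one line to each output stream; the file names are made up as well.

```shell
work=$(mktemp -d)

# Hypothetical helper: one line to STDOUT, one line to STDERR.
talk() {
    echo "to stdout"
    echo "to stderr" >&2
}

# STDOUT is sent to the file first, then STDERR is pointed at the same place.
talk > "$work/both.log" 2>&1

# Reversed order: STDERR is duplicated onto the current STDOUT (the screen)
# before STDOUT is redirected, so only "to stdout" lands in the file.
talk 2>&1 > "$work/only_out.log"

both_lines=$(wc -l < "$work/both.log")
only_lines=$(wc -l < "$work/only_out.log")
printf 'both.log: %s line(s), only_out.log: %s line(s)\n' "$both_lines" "$only_lines"
rm -rf "$work"
```

The first file receives both lines, the second only one, which is why 2>&1 is written after the file redirection.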

Redirecting to (and from) files
Let's try! In the remote server, run the following:
    ls -l /usr/local/bin/ /usr/blah
If you typed correctly, you should see one error and the listing of a directory.
Now run:
    ls -l /usr/local/bin/ /usr/blah > ls_f1 2> ls_f2
Where did all the stuff go? List the contents of your directory and see that you now have two new files:
    ls -ltr
(That is: list with long format, sorting by time, with newest files last)
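The same experiment can be reproduced anywhere, without the course server. This is a sketch that assumes nothing but a scratch directory; real_dir, no_such_dir, ls_out and ls_err are hypothetical names standing in for /usr/local/bin/, /usr/blah, ls_f1 and ls_f2.

```shell
work=$(mktemp -d)
mkdir "$work/real_dir"
touch "$work/real_dir/a_file"

# One existing directory and one that does not exist.
# The listing goes to ls_out, the error message goes to ls_err;
# "|| true" only keeps a script going after ls reports failure.
ls -l "$work/real_dir" "$work/no_such_dir" > "$work/ls_out" 2> "$work/ls_err" || true

out_text=$(cat "$work/ls_out")
err_text=$(cat "$work/ls_err")
echo "STDOUT file holds:"; echo "$out_text"
echo "STDERR file holds:"; echo "$err_text"
rm -rf "$work"
```

Nothing from ls itself reaches the screen: the listing and the complaint about the missing directory each end up in their own file.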

Redirecting to (and from) files
Now, try:
    ls -l /usr/local/bin/ > ls_f3
Use the more command to see what is inside of the file ls_f3 that you just created.
Now run:
    ls -l /usr/local/lib/ > ls_f3
Look again at the contents of file ls_f3. Where did all the data from the first run go?
We actually wanted to append the results from the second run to the file!
    ls -l /usr/local/bin > ls_f3
    ls -l /usr/local/lib >> ls_f3
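A compact way to see the difference between > and >> is to count lines after each step. The sketch below uses echo and a hypothetical file named log in a scratch directory.

```shell
work=$(mktemp -d)

echo "first run"  > "$work/log"     # creates the file
echo "second run" > "$work/log"     # overwrites it: "first run" is gone
after_overwrite=$(wc -l < "$work/log")

echo "third run" >> "$work/log"     # appends: "second run" is kept
after_append=$(wc -l < "$work/log")

cat "$work/log"
printf 'lines after >: %s, lines after >>: %s\n' "$after_overwrite" "$after_append"
rm -rf "$work"
```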

Here-documents
Here-documents are multi-line string literals.
That is, they are a way of passing multiple lines of text to standard input.
The << operator specifies that a here-document is about to start.
Here-docs are of the following general format:
    command << MARK
    ...
    MARK
Everything between the two instances of the word MARK (or whatever you choose) will be redirected to the STDIN of the command to the left of <<.
Variables can be expanded inside the block of text, or not (depending on whether we use quotes around the delimiter word).

Here-documents
Example:
    wc << EOF
    A quick brown fox jumps over the lazy dog $PATH
    EOF
Since the delimiting identifier (in this case, EOF) appearing by itself on a line marks the end of the here-doc, it is a good idea to choose something that is not a real word.
The output of the command above will be something like:
    1 10 300
Now, put quotation marks (single or double, it does not matter) around the first EOF and see what happens.
No more expansion!
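The effect of quoting the delimiter can also be checked with a short variable instead of $PATH. In this sketch, animal is a hypothetical variable and cat simply echoes back what arrived on its STDIN.

```shell
animal="dog"    # hypothetical variable, used only to show expansion

# Unquoted delimiter: $animal is expanded before the text reaches cat.
expanded=$(cat << EOF
The lazy $animal
EOF
)

# Quoted delimiter: the text is passed through literally.
literal=$(cat << 'EOF'
The lazy $animal
EOF
)

echo "$expanded"
echo "$literal"
```

The first echo shows the variable's value; the second shows the dollar sign and name untouched.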

Here-strings
Here-strings are a shortened version of a here-doc.
Here-strings are limited to one line (containing one or more words).
The <<< operator specifies that a here-string follows.
Here-strings have a very simple format:
    command <<< STRING
Variables can be expanded inside the string, or not (depending on what kind of quotes, single or double, we use around the string).
For example:
    wc <<< "A quick brown fox jumps over the lazy dog $PATH"
Run the command like that, with double quotes, and then with single quotes. Different output? Why?

Quiz time! Go to the Moodle site and choose Quiz 19 (beware time limits!)

Now you do it! Go to the Moodle site, Practical Exercise 18. Follow the instructions to answer the questions in the exercise (and beware time limits!). Remember: in the PE, you should do things in practice before answering the question!

Piping
Another pioneering UNIX concept, the pipeline is a sequence of processes chained together by their standard streams.
The standard output of the first process goes directly into the standard input of the second process, then the STDOUT of the second goes into the STDIN of the third, and so on and so forth.

Piping
The output of one process... ...becomes the input to another.

Piping
The operator for the pipe is the vertical bar (|):
    command1 | command2 | command3
The STDERR does not get in the pipe, by default.
To have STDERR go along with STDOUT in the pipe, use the |& construct:
    command1 |& command2 |& command3
No space between | and & there! This construct is not used much, though.
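To see that STDERR bypasses the pipe, the lines arriving at the right-hand side can be counted. This sketch uses a hypothetical helper function, talk, and writes the merge as 2>&1 |, the longhand that bash abbreviates as |& and that also works in shells without |&.

```shell
# Hypothetical helper: one line to STDOUT, one line to STDERR.
talk() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Plain pipe: only STDOUT reaches wc; "to stderr" still shows up on the screen.
plain=$(talk | wc -l)

# STDERR merged into STDOUT before the pipe: wc now receives both lines.
merged=$(talk 2>&1 | wc -l)

printf 'plain pipe: %s line(s), with 2>&1: %s line(s)\n' "$plain" "$merged"
```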

Piping
The pipeline is the crucial feature enabling the UNIX philosophy.
The chaining allowed by standard streams and pipe redirection is what leads to the combination of small, generic, single-purpose command line tools into very specific, sophisticated commands.

Piping
The two different kinds of redirection can be used in the same chained command, of course.
For example:
    ls -l /usr/bin | wc -l > out_file
This command will:
List all files from /usr/bin (left side of the pipe)
Count how many lines that listing has, which is the number of files plus one, since ls -l starts with a "total" line (right side of the pipe)
Save the results in file out_file (redirection of STDOUT on the right)
Right now, we haven't seen enough data-munching commands to be able to explore the full power of piping.
That's for after the midterm exam!
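The combination of a pipe and a file redirection can be tried on a small directory where the expected count is known in advance. The sketch below assumes a scratch directory with three hypothetical files and also shows that ls -l adds a "total" line to what wc -l counts.

```shell
work=$(mktemp -d)
mkdir "$work/bin"
touch "$work/bin/prog1" "$work/bin/prog2" "$work/bin/prog3"

# ls -l prints a "total" line before the entries, so wc counts 4 lines.
ls -l "$work/bin" | wc -l > "$work/out_long"

# One name per line and no header: wc counts exactly the 3 entries.
ls -1 "$work/bin" | wc -l > "$work/out_names"

long_count=$(cat "$work/out_long")
name_count=$(cat "$work/out_names")
printf 'ls -l lines: %s, entries: %s\n' "$long_count" "$name_count"
rm -rf "$work"
```

In both commands the redirection applies only to wc, the last command of the pipeline, so the files hold just the counts.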

Now you do it! Go to the Moodle site, Practical Exercise 19. Follow the instructions to answer the questions in the exercise (and beware time limits!). Remember: in the PE, you should do things in practice before answering the question!

Recap
The find program is very powerful and can find files based on a large number of criteria, and also includes logical operators for greater flexibility.
Standard streams (STDIN, STDOUT, and STDERR) are an essential feature of UNIX, and make it very easy to redirect data flows between programs and files or programs and other programs.
The main redirection operators are <, >, 2>, >>, and 2>>.
Redirection of standard streams between programs, called piping, allows us to concatenate different programs to create more specific ones.
The pipe character, |, redirects STDOUT from the command to its left to STDIN of the command to its right (STDERR by default goes to the screen).
Standard streams and piping are responsible for most of the UNIX philosophy.