The Linux Command Line & Shell Scripting
[web] portal.biohpc.swmed.edu
[email] biohpc-help@utsouthwestern.edu
Updated for 2017-11-18
Study Resources: A Free Book (500+ pages)
* Some of the material covered in today's training is from this book.
Study Resources: Tutorial Website
Study Resources: Cheat Sheets
On the portal: Training -> Slides & Handouts
Study Resources: Man Pages
Get a command's help page: man <command>
Press q to exit the man page.
Study Resources: Follow Along
You can follow along using:
1. The Nucleus web terminal on the BioHPC portal: https://portal.biohpc.swmed.edu/terminal/ssh/
2. PuTTY or SSH from your PC
3. Terminal from your MacBook
   ssh <username>@nucleus.biohpc.swmed.edu
The Shell and Command Line Interfaces
The interaction between the user and the operating system is provided by a shell.
There are two kinds: Graphical User Interfaces and Command Line Interfaces.
By far the most common shell on Linux is bash (Bourne Again SHell).
It has a lot of built-in commands.
[Figure: an annotated shell prompt and command line, labeling the user name, machine name, current directory, name of the command, options/switches, and arguments]
Linux Basics: The File System
Files on a Linux system are arranged in a hierarchical directory structure.
[Figure: directory tree rooted at /, with top-level directories apps, bin, tmp, home2, project, and work; the lower levels are labeled {uid}, {department}, {PI_lab}, shared, and {uid or project_name}]
Linux Command Line: Files and Directories
Relative path vs. absolute path (full path)

  /
    home2
      ydu   (I am here)
        Training

Relative path: a path relative to the present working directory (pwd).
  If my current directory is /home2/ydu, the relative path is: Training
    pwd will return: /home2/ydu
  If I change directory to /home2, the relative path becomes: ydu/Training
    pwd will return: /home2
Absolute path: specifies the location of a file or directory from the root directory (/).
  /home2/ydu/Training
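The difference between the two kinds of path can be tried out safely in a throwaway folder. The sketch below is only an illustration: it assumes `mktemp` is available and uses a temporary sandbox directory as a stand-in for /home2, so the variable names and the sandbox location are not part of the training material.

```shell
#!/bin/bash
# Build a small sandbox tree so the example does not depend on /home2 existing.
sandbox=$(mktemp -d)              # hypothetical stand-in for /home2
mkdir -p "$sandbox/ydu/Training"

cd "$sandbox/ydu"                 # "I am here"
cd Training                       # relative path, resolved against the current directory
via_relative=$(pwd -P)

cd "$sandbox"                     # move up one level
cd ydu/Training                   # the relative path to the same folder is now longer
via_longer_relative=$(pwd -P)

cd /                              # go somewhere unrelated
cd "$sandbox/ydu/Training"        # an absolute path works from anywhere
via_absolute=$(pwd -P)

# All three lines print the same location.
echo "$via_relative"
echo "$via_longer_relative"
echo "$via_absolute"

cd / && rm -r "$sandbox"          # clean up the sandbox
```

A relative path changes meaning whenever the working directory changes, which is why scripts that may be started from anywhere usually prefer absolute paths.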
Linux Command Line: Files and Directories
Editors
vi / vim
  Cryptic commands! Cheat sheet on the portal.
  Quick tutorial: http://www.washington.edu/computing/unix/vi.html
Emacs
  An extensible, customizable text editor.
  Quick tutorial: http://www.gnu.org/software/emacs/tour/
nano
  Easier to use.
  Quick tutorial: http://mintaka.sdsu.edu/reu/nano.html
Any text editor from your PC or Mac
  Mount your directories as network drives:
  https://portal.biohpc.swmed.edu/content/guides/biohpc-cloud-storage/
Demo
Example 1: Manage Files/Folders
Example 2: Wildcards
Example 3: Disk Usage
Example 4: Create Shared Folder/Files
Example 5: Text File Manipulation with Linux Tools (view/find patterns/sort/replace)
  - as important as knowing how to use Linux itself
  - popular and convenient in biological data processing and analysis
Example 1: Manage Files/Folders

Command                                  What does it do?
pwd                                      Print working directory (where am I?)
ls                                       List files in the current directory
ls -lah                                  List files with detail (l = long format, a = all files including hidden, h = human-readable file sizes)
cd Training                              Change directory to today's training folder
mkdir example1                           Make a directory called example1
touch example1/newfile.txt               Create an empty new file inside the example1 folder
ls -lah example1                         List files inside the example1 directory
mkdir new_example                        Make the directory new_example
cp example1/newfile.txt new_example/     Copy newfile.txt from example1 into new_example
cp -r example1 newer_example             Copy example1 recursively (the directory and everything inside) to a directory called newer_example
Example 1: Manage Files/Folders

Command                          What does it do?
mv new_example old_example       Move/rename the new_example directory to old_example
rmdir old_example                Try to remove old_example (fails while the directory is not empty)
cd old_example                   Change directory into old_example
ls -alh                          List files with detail
cd ..                            Change directory to the parent folder
rmdir old_example                Delete an empty folder (rmdir only works once the folder is empty)
rm -r newer_example              Recursively delete all files and folders inside newer_example, and the folder itself
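The whole Example 1 sequence can be rehearsed in a temporary sandbox without touching real data. This is a sketch under assumptions: the sandbox directory comes from `mktemp`, and the `rm old_example/newfile.txt` step is added here (it is not listed on the slide) because rmdir refuses to delete a folder that still contains a file.

```shell
#!/bin/bash
sandbox=$(mktemp -d)              # throwaway working area (assumption, not from the slides)
cd "$sandbox"

mkdir example1
touch example1/newfile.txt
mkdir new_example
cp example1/newfile.txt new_example/
cp -r example1 newer_example
mv new_example old_example

# rmdir refuses to delete a directory that still has files in it.
if rmdir old_example 2>/dev/null; then
    first_attempt="removed"
else
    first_attempt="refused"
fi

rm old_example/newfile.txt        # empty the directory first (assumed step)
rmdir old_example                 # now succeeds
rm -r newer_example               # removes the directory and its contents in one go

remaining=$(ls)                   # only example1 is left
echo "first rmdir: $first_attempt; remaining: $remaining"

cd / && rm -r "$sandbox"          # clean up
```

Because `rm -r` deletes without asking and there is no recycle bin on the command line, it is worth double-checking the path before pressing Enter.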
Example 2: Wildcards

*    Match any number of characters
       ls *                          Any file
       ls notes*                     Any file beginning with notes
       ls *.txt                      Any file ending in .txt
       ls *2015*                     Any file with 2015 somewhere in its name
?    Match a single character
       ls data_00?.txt               Matches data_001.txt, data_002.txt, data_00a.txt, etc.
[]   Match a set of characters (bracket expression)
       ls data_00[0123456789].txt
       ls data_00[0-9].txt           Matches data_001.txt ... data_009.txt, not data_00a.txt
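To see exactly which names each pattern picks up, the wildcards can be expanded with echo in a folder of empty test files. The file names below are made up for this illustration and the sandbox directory is an assumption.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
# Hypothetical file names chosen to exercise each pattern.
touch data_001.txt data_009.txt data_00a.txt data_010.txt notes_2015.txt

# The shell expands the pattern before the command runs,
# so echo simply prints the list of matching names.
single_char=$(echo data_00?.txt)       # ? matches exactly one character
digits_only=$(echo data_00[0-9].txt)   # [0-9] matches one digit only
year_files=$(echo *2015*)              # * matches any run of characters

echo "data_00?.txt      -> $single_char"
echo "data_00[0-9].txt  -> $digits_only"
echo "*2015*            -> $year_files"

cd / && rm -r "$sandbox"
```

Note that data_010.txt matches neither data_00 pattern, because its fourth digit position is not preceded by "00".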
Example 3: Disk Usage
quota: display disk usage and limits (How much space do I have?)
Example 3: Disk Usage
du: estimate file space usage (How much space am I taking up?)
Total folder size:
Detailed usage for each sub-folder:
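The commands behind the two headings above did not survive in the text, so the following is only a plausible sketch: it assumes `du -sh` for the total and `du -h -d 1` for the per-sub-folder breakdown, run on a small made-up folder tree.

```shell
#!/bin/bash
sandbox=$(mktemp -d)                       # made-up folder tree for the demo
mkdir -p "$sandbox/project_a" "$sandbox/project_b"
echo "some data" > "$sandbox/project_a/a.txt"
echo "more data" > "$sandbox/project_b/b.txt"

# -s: a single summary line for the whole folder; -h: human-readable sizes
du -sh "$sandbox"

# -d 1: one line per immediate sub-folder, followed by the total
du -h -d 1 "$sandbox"

summary_lines=$(du -sh "$sandbox" | wc -l)
detail_lines=$(du -h -d 1 "$sandbox" | wc -l)

rm -r "$sandbox"
```

The sizes printed depend on the file system, so only the number of output lines is predictable: one for the summary, and one per sub-folder plus one for the total in the detailed view.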
Example 3: Disk Usage
Archiving files and folders reduces both space usage and the number of files.
We plan to limit the number of files for each user in the future.

                    mysandybox    mysandybox.tar.gz
Space usage:        0.07T         0.02T
Number of files:    306           13
Example 3: Disk Usage

Command                                  What does it do?
gunzip mydata.txt.gz                     Extract a .gz gzipped file
tar xvf archive.tar                      Extract (x) everything from a tar archive file (f). Show verbose output (v).
cd ..                                    Change directory into the parent directory (CLI)
du -sh example3                          Estimate space usage of the folder example3
cd example3                              Change directory into example3
tar cvzf new.tar.gz file1 folder1        Create (c) a compressed tar.gz archive file (f), adding files and folders to it
gzip mydata.txt                          Compress a file using gzip; it becomes a .gz file
cd ..                                    Change directory into the parent directory (CLI)
du -sh example3                          Estimate space usage of the folder example3
tar zxvf example3/archive.tar.gz         Extract a compressed gzipped tar archive in one step
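A full round trip (create an archive, list it, extract it, then compress and restore a single file) can be sketched as follows. The file and folder names mirror the slide; the sandbox and the `restored` directory are assumptions added for the demo.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir folder1
echo "hello" > file1
echo "world" > folder1/file2

tar czf new.tar.gz file1 folder1      # c: create, z: gzip-compress, f: archive file name
tar tzf new.tar.gz                    # t: list the contents without extracting

mkdir restored                        # hypothetical target folder
tar xzf new.tar.gz -C restored        # x: extract, -C: into the given directory
restored_text=$(cat restored/folder1/file2)

gzip file1                            # replaces file1 with file1.gz
gunzip file1.gz                       # restores file1 and removes file1.gz
roundtrip_text=$(cat file1)

cd / && rm -r "$sandbox"
```

Listing an archive with `t` before extracting it is a cheap way to check where the files will land.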
ls -l: Permissions

drwxr-xr-x  4 dtrudgian biohpc_admin   58 Feb 16 15:13 all_training
drwxr-xr-x  7 dtrudgian biohpc_admin  140 Feb 12 10:36 Apps
drwxr-xr-x  2 dtrudgian biohpc_admin   26 Feb 16 15:12 cli_training
drwxr-xr-x  8 dtrudgian biohpc_admin 4.0K Feb 16 14:25 Cluster_Installs
drwxr-xr-x  3 dtrudgian biohpc_admin 4.0K Feb 16 11:49 Desktop
drwxr-xr-x  2 dtrudgian biohpc_admin   10 Feb 16 14:10 Documents
drwxr-xr-x  9 dtrudgian biohpc_admin  135 Feb 16 14:32 Downloads
-rw-r--r--  1 dtrudgian biohpc_admin  336 Feb 16 15:16 error.txt
drwxr-xr-x 10 dtrudgian biohpc_admin 4.0K Feb  9 12:45 Git
drwxr-xr-x 17 dtrudgian biohpc_admin 4.0K Feb 16 15:17 owncloud
drwxr-xr-x  2 dtrudgian biohpc_admin   10 Feb 16 14:18 Pictures
drwxr-xr-x  5 dtrudgian biohpc_admin  102 Feb  4 11:19 portal_jobs

The first column shows the permissions, the third column the owner, and the fourth column the group.
File Permissions
Octal Permissions
r = 4, w = 2, x = 1
Add up the permissions you need for each class, e.g. rx = 5, rw = 6, rwx = 7

-rw-r--r-- 1 dtrudgian biohpc_admin 336 Feb 16 15:16 error.txt
 rw-  r--  r--
  6    4    4
Owner can read+write; group can read; others can read.
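The addition rule can be written out as a few lines of shell. The function name `to_octal` is a hypothetical helper invented for this illustration; it takes the nine permission characters (without the leading file-type character) and adds 4, 2, and 1 for each class.

```shell
#!/bin/bash
# to_octal: hypothetical helper that converts e.g. rw-r--r-- into 644.
to_octal() {
    perms=$1
    result=""
    for start in 1 4 7; do                          # owner, group, others
        triplet=$(printf '%s' "$perms" | cut -c "$start-$((start + 2))")
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # read
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute
        result="$result$digit"
    done
    echo "$result"
}

to_octal rw-r--r--     # prints 644
to_octal rwxr-xr-x     # prints 755
```

The second call corresponds to the directories in the ls -l listing: the owner has full access, while group and others can read and enter but not modify.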
Permissions
chmod u/g/a +/- r/w/x filename

Class:       u = user (owner), g = group, a = all
Operation:   + add permission, - remove permission
Permission:  r = read, w = write, x = execute

Command                    What does it do?
chmod g+rw test.txt        Add read/write permissions for the group
chmod a+x script.sh        Add execute permission for everyone
chmod g-x script.sh        Remove execute permission for the group
chmod 700 script.sh        Result: -rwx------
chmod 640 script.sh        Result: -rw-r-----
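The effect of each chmod call can be checked by reading the mode string back from ls -l. In this sketch, `mode_of` is a hypothetical helper that prints only the first ten characters of the listing, and the file lives in a temporary sandbox.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch script.sh

# mode_of: hypothetical helper, prints the 10-character mode string of a file.
mode_of() { ls -l "$1" | cut -c 1-10; }

chmod 700 script.sh; mode_700=$(mode_of script.sh)   # -rwx------
chmod 640 script.sh; mode_640=$(mode_of script.sh)   # -rw-r-----
chmod g+w script.sh; mode_g_w=$(mode_of script.sh)   # -rw-rw----
chmod a+x script.sh; mode_a_x=$(mode_of script.sh)   # -rwxrwx--x
chmod g-x script.sh; mode_g_x=$(mode_of script.sh)   # -rwxrw---x

echo "$mode_700 $mode_640 $mode_g_w $mode_a_x $mode_g_x"

cd / && rm -r "$sandbox"
```

The octal form sets all nine bits at once, while the symbolic form (g+w, a+x, g-x) changes only the bits it names and leaves the rest alone.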
Example 4: Create Shared Folder

Command                                   What does it do?
cd /project/biohpcadmin/shared/           Change directory to your lab's shared folder
ls -al                                    List files with detail to check permissions
mkdir shared_data                         Create the shared folder
cd shared_data                            Change directory to the newly created shared_data folder
cp /project/biohpcadmin/ydu/ideas.txt .   Copy the file to the shared folder; your group members can read and copy it
chmod g+w ideas.txt                       Grant write permission so group members can edit ideas.txt
cd ..                                     Change directory to the parent folder
chmod g+w shared_data                     Grant write permission so group members can create new files inside the shared_data folder

You can only apply the chmod command to files/folders owned by you.
Send email to biohpc-help@utsouthwestern.edu if you need help setting up a shared folder.
Example 5: Text File Manipulation

Command                                   What does it do?
cat story.txt                             Show the content of story.txt in the terminal
less story.txt                            Show the content of story.txt in a pager that allows scrolling & searching
head -n 15 story.txt                      Show the first 15 lines of story.txt
tail -n 5 story.txt                       Show the last 5 lines of story.txt
head -n 15 story.txt | tail -n 5          Show lines 11-15 of story.txt. We take the first 15 lines from story.txt using head, and extract the bottom of that selection by piping it through tail.
grep "elephant" story.txt                 Find all lines in story.txt that contain elephant (case-sensitive)
grep -i "mouse" story.txt                 Find all lines in story.txt that contain mouse (case-insensitive)
grep -c "at" story.txt                    Count the number of lines in story.txt that contain at
grep "^The" story.txt                     Find lines beginning with The
grep 'water\.$' story.txt                 Find lines ending with "water." Note that . is a special character, so we have to escape it, and the pattern is quoted so the shell passes the backslash through to grep.

grep searches for patterns using regular expressions - http://www.robelle.com/smugbook/regexpr.html
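These commands are easiest to understand on a file small enough to check by eye. The four-line story below is invented for this sketch (it is not the story.txt used in the training), and the variable names are illustrative.

```shell
#!/bin/bash
story=$(mktemp)                    # temporary stand-in for story.txt
cat > "$story" <<'EOF'
The elephant walked to the river.
A mouse sat on a mat.
The Mouse drank the water.
An elephant sprayed water!
EOF

# head keeps the first 3 lines, tail keeps the last 2 of those: lines 2-3.
lines_2_to_3=$(head -n 3 "$story" | tail -n 2)

elephant_lines=$(grep -c "elephant" "$story")        # lines 1 and 4
mouse_any_case=$(grep -ci "mouse" "$story")          # "mouse" and "Mouse"
starts_with_the=$(grep -c "^The" "$story")           # lines 1 and 3
ends_with_water_dot=$(grep -c 'water\.$' "$story")   # line 3 only, not "water!"

echo "$lines_2_to_3"
echo "$elephant_lines $mouse_any_case $starts_with_the $ends_with_water_dot"

rm "$story"
```

Without the backslash, the pattern `water.$` would also match the line ending in "water!", because an unescaped dot stands for any single character.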
Example 5: Text File Manipulation

Command                                   What does it do?
sort fruits.txt                           Sort the lines in a file alphabetically
sort -r fruits.txt                        Sort the lines in a file in reverse alphabetical order
sort -nr numbers.txt                      Sort the lines in a file in reverse numerical order
sort fruits.txt > sorted_fruits.txt       Send the output of the sort command into sorted_fruits.txt, overwriting existing content
sort fruits.txt >> sorted_fruits.txt      Send the output of the sort command into sorted_fruits.txt, appending to the end of the file
sort fruits.txt 2> error.txt              Send the error output of the sort command into error.txt
sort fruits.txt 2> /dev/null              Discard the error output of the command
sort fruits.txt > output.txt 2>&1         Combine the error output into the standard output, and direct both into the file output.txt. The order matters: redirect standard output to the file first, then point the error output at it with 2>&1.
sed "s/dog/cat/g" story.txt               Change all instances of dog to cat, printing the result: s(ubstitute)/old/new/g(lobal)

Write 2> with no space between the 2 and the >; with a space, the 2 is treated as a file name argument.
sed can do a lot - http://www.grymoire.com/unix/sed.html
awk can do even more - http://www.grymoire.com/unix/awk.html
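Redirection order is a common stumbling block, so here is a small sketch that makes it visible. The function `noisy` is a hypothetical stand-in for any command that writes to both standard output and standard error; the file names and sandbox are also assumptions.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
printf 'pear\napple\nfig\n' > fruits.txt

sort fruits.txt > sorted_fruits.txt         # overwrite: 3 lines
sort -r fruits.txt >> sorted_fruits.txt     # append: 6 lines in total
line_count=$(wc -l < sorted_fruits.txt)
first_line=$(head -n 1 sorted_fruits.txt)

# noisy: hypothetical command that writes one line to each output stream.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

noisy > both.txt 2>&1                       # file first, then 2>&1: both lines land in the file
both_lines=$(wc -l < both.txt)

# Reversed order: 2>&1 copies the current stdout (captured here), and only
# afterwards is stdout moved to the file, so the error line is not in the file.
escaped=$(noisy 2>&1 > stdout_only.txt)
in_file=$(cat stdout_only.txt)

replaced=$(echo "the dog chased the dog" | sed "s/dog/cat/g")

echo "$line_count $first_line $both_lines | $escaped | $in_file | $replaced"
cd / && rm -r "$sandbox"
```

Redirections are processed left to right, so `2>&1` means "send errors wherever standard output points right now", not "wherever it ends up".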
Bash Shell Scripting
In addition to the interactive mode (one command at a time), bash is a programming language in itself: it can run an entire script of commands.
A little knowledge can make difficult things easy, and time-consuming things quick.
Bash Shell Scripting: Variables
script_variables.sh
A variable holds information to be used later.
Set them using:          name=value
Get their value using:   $name

#!/bin/bash               # Specify we are using the bash shell
MY_NAME=dave              # Set the variable MY_NAME
echo Hello $MY_NAME       # Run echo using the value in MY_NAME
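A few details about variables trip up most beginners: spaces around the equals sign, and the difference between single and double quotes. The short sketch below illustrates them; the variable names other than MY_NAME are made up for this example.

```shell
#!/bin/bash
MY_NAME=dave                      # no spaces around the = sign
GREETING="Hello $MY_NAME"         # double quotes: the variable is expanded
LITERAL='Hello $MY_NAME'          # single quotes: the text is kept as-is
FILE="${MY_NAME}_notes.txt"       # braces mark where the variable name ends

echo "$GREETING"                  # prints: Hello dave
echo "$LITERAL"                   # prints: Hello $MY_NAME
echo "$FILE"                      # prints: dave_notes.txt
```

Without the braces, `$MY_NAME_notes` would be read as a different, unset variable, and the file name would come out as just `.txt`.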
Bash Shell Scripting: How to Run
How to run a bash script

Method 1: make the script an executable file
  chmod +x script_variables.sh
  ./script_variables.sh

Method 2: use bash or sh
  bash script_variables.sh
  or
  sh script_variables.sh
Bash Shell Scripting: Input/Output
script_io.sh
We can assign the output of programs/commands to variables:

#!/bin/bash
NOW=$(date +%Y-%m-%d)        # Put the date (YYYY-MM-DD) into NOW
DATA_DIR="data_$NOW"
mkdir "$DATA_DIR"            # Create a directory with a name incorporating the date
module load matlab
# Run matlab with input hello.m, and send the output into our data directory
matlab -nodisplay -nosplash < hello.m > "$DATA_DIR/output.txt"
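The same pattern can be tried without MATLAB installed. In the sketch below an echo command is a placeholder for the matlab step, and everything happens in a temporary sandbox; both are assumptions made so the example runs anywhere.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"

NOW=$(date +%Y-%m-%d)             # command substitution: capture the output of date
DATA_DIR="data_$NOW"
mkdir "$DATA_DIR"

# Placeholder for the matlab step: any command's output can be
# redirected into the dated folder in the same way.
echo "hello from a placeholder command" > "$DATA_DIR/output.txt"

created=$(ls)                     # the single dated directory
saved=$(cat "$DATA_DIR/output.txt")
echo "created $created containing: $saved"

cd / && rm -r "$sandbox"
```

Dates in year-month-day order sort correctly with a plain `ls`, which is one reason this format is popular for folder names.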
Bash Shell Scripting: If-else Statement
script_if.sh

#!/bin/bash
echo "This script checks for the dummy file."
echo "Checking..."
# The "file exists" test is -f
if [ -f "dummy" ]; then
    echo "dummy exists."
else
    echo "Could not find it!"
fi

Read about the available tests here: http://tldp.org/ldp/bash-beginners-guide/html/sect_07_01.html
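To watch both branches fire, the test can be wrapped in a function and called before and after the file is created. `check_dummy` is a hypothetical name, and the extra elif branch with -d (the "is a directory" test) is an addition for illustration.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"

# check_dummy: hypothetical wrapper around the test so it can be run more than once.
check_dummy() {
    if [ -f "dummy" ]; then                 # -f: exists and is a regular file
        echo "dummy exists."
    elif [ -d "dummy" ]; then               # -d: exists and is a directory
        echo "dummy is a directory, not a file."
    else
        echo "Could not find it!"
    fi
}

before=$(check_dummy)             # nothing there yet
touch dummy
after=$(check_dummy)              # now the file exists

echo "before: $before"
echo "after:  $after"

cd / && rm -r "$sandbox"
```

The spaces inside the square brackets are required: `[` is itself a command, and `-f`, the file name, and `]` are its arguments.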
Bash Shell Scripting: Loops
script_loops.sh
This script lists all of the files in the current directory using a for loop:

#!/bin/bash
# Set variable i to each value in the output of the ls command
for i in $( ls ); do
    echo "item: $i"       # Echo the current value of $i
done

Read about the other loops here: http://tldp.org/howto/bash-prog-intro-howto-7.html
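Looping over the output of ls works for simple names, but it splits any file name that contains a space into separate items. The sketch below compares it with looping over the `*` wildcard, which is offered here as a common alternative rather than as part of the original script; the file names are made up.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch "notes.txt" "my data.txt"   # one name contains a space

ls_count=0
for i in $(ls); do                # "my data.txt" is split into "my" and "data.txt"
    ls_count=$((ls_count + 1))
done

glob_count=0
for i in *; do                    # the wildcard yields one item per file
    glob_count=$((glob_count + 1))
done

echo "ls loop saw $ls_count items, wildcard loop saw $glob_count"

cd / && rm -r "$sandbox"
```

With the wildcard form, remember to quote the variable when using it ("$i"), otherwise the name is split again at the point of use.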
Bash Shell Scripting: Pass Command Line Arguments
script_arguments.sh

#!/bin/bash
echo "Script Name: $0"
echo "Total Number of Arguments: $#"
echo "1st Argument: $1"
echo "2nd Argument: $2"
echo "All Arguments are: $*"

$*    Stores all command line arguments
$#    Stores the count of command line arguments
$0    Stores the name of the script itself
$1    Stores the first command line argument
$2    Stores the second command line argument
$3    Stores the third command line argument
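Positional parameters behave the same way inside a shell function, which makes them easy to experiment with in a single file. `show_args` is a hypothetical function name, and the use of "$@" (a close relative of $* that keeps each argument intact) is an addition to what the slide lists.

```shell
#!/bin/bash
# show_args: hypothetical function; inside it, $1, $# and "$@" refer to the
# function's own arguments, just as they refer to a script's arguments.
show_args() {
    echo "count: $#"
    echo "first: $1"
    for arg in "$@"; do           # "$@" keeps each argument whole, even with spaces
        echo "arg: $arg"
    done
}

show_args alpha "beta gamma"
# prints:
#   count: 2
#   first: alpha
#   arg: alpha
#   arg: beta gamma
```

Quoting an argument on the command line ("beta gamma") is what makes it arrive as a single item instead of two.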
Bash Shell Scripting: Reorganize data to fit the software requirement
Input (TEST_INPUT): images generated by a Total Internal Reflection Fluorescence (TIRF) microscope
Output (TEST_OUTPUT): input data structure required by DeBias
Script: folder_reorganize.sh

TEST_INPUT                                    TEST_OUTPUT
  1_10                                          1
    img_000000000_TIRF_488_000.tif                488.tif
    img_000000000_TIRF_561_000.tif                561.tif
  1_11                                          2
  1_12                                          3
Bash Shell Scripting: Reorganize data to fit the software requirement

#!/bin/bash
INPUT_FOLDER=$1      # input folder name, passed as the 1st argument from the terminal
OUTPUT_FOLDER=$2     # output folder name, passed as the 2nd argument from the terminal
i=1                  # initialize the counter at 1

for SUBFOLDER in $(ls "$INPUT_FOLDER"); do       # loop through all sub-folders
    echo "$SUBFOLDER"                            # display the current sub-folder name
    IMG488_ORG="$INPUT_FOLDER/$SUBFOLDER/img_000000000_TIRF_488_000.tif"   # define the 488 image name
    IMG561_ORG="$INPUT_FOLDER/$SUBFOLDER/img_000000000_TIRF_561_000.tif"   # define the 561 image name
    IMG_OUTPUT="$OUTPUT_FOLDER/$i"               # define the output folder for this sub-folder
    i=$((i + 1))                                 # increment the counter
    mkdir -p "$IMG_OUTPUT"                       # create the output folder (and its parents, if missing)

    if [ -f "$IMG488_ORG" ]; then                # if the 488 image file exists,
        cp "$IMG488_ORG" "$IMG_OUTPUT/488.tif"   # copy it to the output folder and rename it
    else                                         # if the 488 image file does not exist,
        echo "$IMG488_ORG does not exist"        # display an error message
    fi

    if [ -f "$IMG561_ORG" ]; then                # same steps for the 561 image
        cp "$IMG561_ORG" "$IMG_OUTPUT/561.tif"
    else
        echo "$IMG561_ORG does not exist"
    fi
done
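The reorganization can be rehearsed on fake data before pointing it at real images. The sketch below is a variation on the script, not the original: it wraps the steps in a hypothetical function called `reorganize`, loops over the two channels instead of repeating the if block, and builds small text files in a temporary sandbox as stand-ins for the .tif images.

```shell
#!/bin/bash
# reorganize: hypothetical function applying the same steps as folder_reorganize.sh.
reorganize() {
    input=$1
    output=$2
    i=1
    for sub in "$input"/*/; do                    # each sub-folder, in sorted order
        [ -d "$sub" ] || continue                 # skip if the pattern matched nothing
        mkdir -p "$output/$i"
        for channel in 488 561; do
            src="${sub}img_000000000_TIRF_${channel}_000.tif"
            if [ -f "$src" ]; then
                cp "$src" "$output/$i/$channel.tif"
            else
                echo "$src does not exist" >&2    # report missing images on stderr
            fi
        done
        i=$((i + 1))
    done
}

# Build fake input data: text files standing in for the microscope images.
sandbox=$(mktemp -d)
for name in 1_10 1_11 1_12; do
    mkdir -p "$sandbox/TEST_INPUT/$name"
    echo "$name-488" > "$sandbox/TEST_INPUT/$name/img_000000000_TIRF_488_000.tif"
    echo "$name-561" > "$sandbox/TEST_INPUT/$name/img_000000000_TIRF_561_000.tif"
done
rm "$sandbox/TEST_INPUT/1_12/img_000000000_TIRF_561_000.tif"   # simulate one missing image

reorganize "$sandbox/TEST_INPUT" "$sandbox/TEST_OUTPUT" 2> "$sandbox/errors.txt"

first_488=$(cat "$sandbox/TEST_OUTPUT/1/488.tif")              # content copied from 1_10
copied=$(find "$sandbox/TEST_OUTPUT" -type f | wc -l)          # 5 of the 6 images
missing=$(wc -l < "$sandbox/errors.txt")                       # 1 message for the deleted file

echo "copied $copied files, $missing missing; TEST_OUTPUT/1/488.tif holds: $first_488"
rm -r "$sandbox"
```

Sending the "does not exist" messages to standard error keeps them separate from normal output, so they can be collected in their own file as shown.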