A Genomics View of Unix

Similar documents
A Short Introduction to Unix. Adapted from:

Introduction to Linux

Using LINUX a BCMB/CHEM 8190 Tutorial Updated (1/17/12)

Unix tutorial. Thanks to Michael Wood-Vasey (UPitt) and Beth Willman (Haverford) for providing Unix tutorials on which this is based.

The Directory Structure

History. Terminology. Opening a Terminal. Introduction to the Unix command line GNOME

Unix Tutorial Haverford Astronomy 2014/2015

Lecture 1. A. Sahu and S. V. Rao. Indian Institute of Technology Guwahati

Scripting Languages Course 1. Diana Trandabăț

Mills HPC Tutorial Series. Linux Basics I

Physics REU Unix Tutorial

Week 2 Lecture 3. Unix

Introduction to UNIX. Logging in. Basic System Architecture 10/7/10. most systems have graphical login on Linux machines

Getting Started with UNIX

Getting Started With UNIX Lab Exercises

Tutorial 1: Unix Basics

Introduction to Linux (Part I) BUPT/QMUL 2018/03/14

Introduction to Unix and Linux. Workshop 1: Directories and Files

Linux File System and Basic Commands

Tiny Instruction Manual for the Undergraduate Mathematics Unix Laboratory

Using the Zoo Workstations

Outline. Structure of a UNIX command

Introduction. File System. Note. Achtung!

Operating Systems and Using Linux. Topics What is an Operating System? Linux Overview Frequently Used Linux Commands

This lab exercise is to be submitted at the end of the lab session! passwd [That is the command to change your current password to a new one]

Examples: Directory pathname: File pathname: /home/username/ics124/assignments/ /home/username/ops224/assignments/assn1.txt

Introduction to Unix - Lab Exercise 0

Parts of this tutorial has been adapted from M. Stonebank s UNIX Tutorial for Beginners (

UNIT 9 Introduction to Linux and Ubuntu

Linux Training. for New Users of Cluster. Georgia Advanced Computing Resource Center University of Georgia Suchitra Pakala

Oxford University Computing Services. Getting Started with Unix

Introduction to the UNIX command line

INSE Lab 1 Introduction to UNIX Fall 2017

CENG 334 Computer Networks. Laboratory I Linux Tutorial

Chapter-3. Introduction to Unix: Fundamental Commands

Unix/Linux Primer. Taras V. Pogorelov and Mike Hallock School of Chemical Sciences, University of Illinois

Unix File System. Learning command-line navigation of the file system is essential for efficient system usage

CS 215 Fundamentals of Programming II Spring 2019 Very Basic UNIX

Command Line Interface The basics

CS Fundamentals of Programming II Fall Very Basic UNIX

Lab Working with Linux Command Line

DATA 301 Introduction to Data Analytics Command Line. Dr. Ramon Lawrence University of British Columbia Okanagan

Why learn the Command Line? The command line is the text interface to the computer. DATA 301 Introduction to Data Analytics Command Line

Introduction to Linux Workshop 1

THE HONG KONG POLYTECHNIC UNIVERSITY Department of Electronic and Information Engineering

No lectures on Monday!!

Helsinki 19 Jan Practical course in genome bioinformatics DAY 0

Introduction to Linux

Week Overview. Unix file system File types and file naming Basic file system commands: pwd,cd,ls,mkdir,rmdir,mv,cp,rm man pages

Basic Survival UNIX.

Common UNIX Commands. Unix. User Interfaces. Unix Commands Winter COMP 1270 Computer Usage II 9-1. Using UNIX. Unix has a command line interface

Unix/Linux Basics. Cpt S 223, Fall 2007 Copyright: Washington State University

Introduction to Linux Environment. Yun-Wen Chen

EKT332 COMPUTER NETWORK

Part I. Introduction to Linux

Introduction p. 1 Who Should Read This Book? p. 1 What You Need to Know Before Reading This Book p. 2 How This Book Is Organized p.

Helpful Tips for Labs. CS140, Spring 2015

Bioinformatics? Reads, assembly, annotation, comparative genomics and a bit of phylogeny.

CSE 303 Lecture 2. Introduction to bash shell. read Linux Pocket Guide pp , 58-59, 60, 65-70, 71-72, 77-80

Introduction to Unix: Fundamental Commands

Contents. Note: pay attention to where you are. Note: Plaintext version. Note: pay attention to where you are... 1 Note: Plaintext version...

Introduction: What is Unix?

Linux Command Line Primer. By: Scott Marshall

An Introduction to Using the Command Line Interface (CLI) to Work with Files and Directories

LAB #7 Linux Tutorial

Introduction to UNIX command-line

Unix File System. Class Meeting 2. * Notes adapted by Joy Mukherjee from previous work by other members of the CS faculty at Virginia Tech

Section 2: Developer tools and you. Alex Mariakakis (staff-wide)

This is Lab Worksheet 3 - not an Assignment

UNLV Computer Science Department CS 135 Lab Manual

Table Of Contents. 1. Zoo Information a. Logging in b. Transferring files 2. Unix Basics 3. Homework Commands

Using UNIX. -rwxr--r-- 1 root sys Sep 5 14:15 good_program

AN INTRODUCTION TO UNIX

Short Read Sequencing Analysis Workshop

Perl and R Scripting for Biologists

Introduction to Linux. Fundamentals of Computer Science

Unix L555. Dept. of Linguistics, Indiana University Fall Unix. Unix. Directories. Files. Useful Commands. Permissions. tar.

Practical Session 0 Introduction to Linux

Shell Programming Overview

Introduction to the Linux Command Line

Appendix B WORKSHOP. SYS-ED/ Computer Education Techniques, Inc.

Using Linux as a Virtual Machine

Introduction to Linux for BlueBEAR. January

Unix Filesystem. January 26 th, 2004 Class Meeting 2

CS CS Tutorial 2 2 Winter 2018

CS246 Spring14 Programming Paradigm Notes on Linux

User Guide Version 2.0

Lab 2: Linux/Unix shell

First of all, these notes will cover only a small subset of the available commands and utilities, and will cover most of those in a shallow fashion.

CMSC 104 Lecture 2 by S Lupoli adapted by C Grasso

Introduction to Linux Part 1. Anita Orendt and Wim Cardoen Center for High Performance Computing 24 May 2017

Read the relevant material in Sobell! If you want to follow along with the examples that follow, and you do, open a Linux terminal.

Operating System Interaction via bash

Introduction to Linux Basics

Linux/Cygwin Practice Computer Architecture

Chap2: Operating-System Structures

Session 1: Accessing MUGrid and Command Line Basics

genome[phd14]:/home/people/phd14/alignment >

First of all, these notes will cover only a small subset of the available commands and utilities, and will cover most of those in a shallow fashion.

This section reviews UNIX file, directory, and system commands, and is organized as follows:

Transcription:

A Genomics View of Unix Genomics Requires Powerful Computers Genomic data sets are large and complex, the requirements for computing power are also quite large Computers that can provide this power generally use the Unix operating system - so you must learn Unix The mac OS X is based on Unix most of the Unix information is true for your mac simply open Terminal inside X11 and you can control your powerbook with unix commands Unix Advantages It is very popular, so it is easy to find information and get help Lots of books plenty of helpful websites USENET discussions (google groups) and e-mail lists most Comp. Sci. students know Unix Unix can run on virtually any computer (IBM, Sun, Compaq, Macintosh,etc) Unix is free or nearly free Linux/open source software movement Suse, FreeBSD, Red Hat, LinuxPPC, etc. Stable and Efficient Unix is very stable - computers running Unix almost never crash Unix is very efficient it gets maximum number crunching power out of your processor (and multiple processors) it can smoothly manage very large amounts of data Most bioinformatics software is created for Unix - its easy for the programmers 1

Unix has some Drawbacks Unix computers are controlled by a command line interface NOT user-friendly difficult to learn, even more difficult to truly master Hackers love Unix there are lots of security holes most computers on the Internet run Unix, so hackers can apply the same tricks to many different computers There are many different versions of Unix with subtle (or not so subtle) differences General Unix Tips UNIX is case sensitive!! myfile.txt and MyFile.txt do not mean the same thing You communicate with a Unix computer through a command program known as a shell. The shell interprets the commands that you type on the keyboard Advanced users can write shell scripts to automate program execution, important for large (genomic size) datasets Tips..Control Characters You type Control characters by holding down the ctrl key while also pressing the specified character. While you are typing a command: ctrl-u erases the whole command line Control commands that work (almost) any time ctrl-s suspends (halts) output scrolling up on your terminal screen ctrl-q resumes the display of output on your screen ctrl-c will abort any program ctrl-z will pause any program Getting Help in Unix Unix is not a user-friendly computer system. While not actively user-hostile, it is perfectly happy to sit there and taunt you with a blank screen and a blinking > cursor. There is a rudimentary Help system which consists of a set of "manual pages for every Unix command. The man pages tell you which options a particular command can take, and how each option modifies the behavior of the command. Type man and the name of a command to read the manual page for that command. 2

Apple s OS X man page for ls More Help (?) LS(1) System General Commands Manual LS(1) NAME ls - list directory contents SYNOPSIS ls [-ACFLRSTWacdfgiklnoqrstux1] [file...] DESCRIPTION For each operand that names a file of a type other than directory, ls displays its name as well as any requested, associated information. For each operand that names a file of type directory, ls displays the names of files contained within that directory, as well as any requested, associated information. If no operands are given, the contents of the current directory are displayed. If more than one operand is given, non-directory operands are displayed first; directory and non-directory operands are sorted separately and in lexicographical order. The following options are available: -A List all entries except for `.' and `..'. Always set for the So what if you don t know what command you need? You can search the manual pages for a given keyword in the header: man -k E.g. to get help on commands that deal with passwords you would type: man -k password Apropos works on some unix systems as well, try man apropos on other unix systems for help Get yourself a good "Intro to Unix" book Unix Help on the Web Here is a list of a few online Unix tutorials: Unix for Beginners http://www.ee.surrey.ac.uk/teaching/unix/ Getting Started in Unix http://www2.ucsc.edu/cats/sc/help/unix/start.shtml Unix Guru Universe http://www.ugu.com/sui/ugu/show?help.beginners Getting Started With The Unix Operating System http://www.leeds.ac.uk/iss/documentation/beg/beg8/beg8.html You will need to be able to: Type and understand basic commands Manipulate files (copy, delete, examine etc) Run Bioinformatic programs (consed, etc) Understand files and directories 3

Unix Commands Most commands have this basic structure: command -a -b -c input.file Program modifiers input Program: the name of the program you want to run (blast, consed, etc.) Modifiers (optional, usually in any order): a CASE SENSITIVE letter which controls exactly how the program will run Some will be plain -l some will have values -l guest Input (optional): the file(s) you wish the program to act upon. Unix Commands (cont) Many Unix commands are short and cryptic like vi or rm. Computer geeks like it that way. The more you use it the more you will like it Every command has a host of modifiers which are generally single letters preceded by a hyphen: ls -l or mv -R Capital letters have different functions than small letters, often completely unrelated. A command also generally requires one or more arguments, meaning some file on which it will act: cat -n mygene.seq Lets change your Password 1. Start X11 click on icon at bottom 2. You should see a window xterm show up where you can type commands 3. Type: passwd This command has no arguments 4. enter current password (dna50) 5. enter new password 2X 6. DO NOT FORGET! Unix Filenames Unix is case sensitive UNIX filenames should contain only letters, numbers, and the _ (underscore),. (dot), and - (dash) characters. Unix does not allow two files to exist in the same directory with the same name. Whenever a situation occurs where a file is about to be created or copied into a directory where another file has that exact same name, the new file will overwrite (and delete) the older file. Unix will generally alert you when this is about to happen, but it is easy to ignore the warning. 4

Filename Extensions By convention most UNIX filenames start with a lower case letter and end with a dot followed by one, two, or three letters (or more): myfile.txt However, this is just a common convention and is not required. It is also possible to have additional dots in the filename. The part of the name following the dot is called the extension. The extension is used to designate the file type This convention is used to help you remember what the file is. Unix does not require extensions. Some Common Extensions By convention files that end in :.txt are text files.html are HTML files for the Web.zip or.gz are compressed files.fasta are DNA or peptide sequences in fasta format.seq or.pep are DNA or protein sequences Files can have more than one extension. Compressed text files would end.txt.gz Unix directories Unix systems can have many thousands of files Directories are a means of organizing your files on a Unix computer. They are equivalent to folders in Windows and Macintosh computers Start with the root directory: / There are files and directories inside root There are files and other directories inside those There are files and other directories inside those.. Unix directories (cont) When logged into a computer there is always one directory called the current working directory. The computer will look for files in the current working dir, unless told specifically where to look. 5

Unix directories (cont) A list of all the directories to a file is called the path. The path can either start at root / /Users/chris/Documents/4342files/2004/lecture1.ppt Or the current working directory Documents/4342files/2004/lecture1.ppt Your Home Directory When you login to any Unix server, you always start in your Home directory. On Mac my home is /Users/chris Create sub-directories to store specific projects or groups of information, just as you would place folders in a filing cabinet. Do not accumulate thousands of files with cryptic names in your Home directory File & Directory Commands This is a minimal list of Unix commands that you must know for file management: ls (list files) cd (change directory) mkdir (make directory) rmdir (remove directory) cp (copy file) pwd (present working directory) mv (move file) rm (remove) more (view by page) cat (dump file on screen) All of these commands can be modified with many options. Learn to use Unix man pages for more information. Navigation pwd (present working directory) shows the name of the current working directory: > pwd /Users/chris ls (list) gives you a list of the files in the current directory: Desktop Library Public Documents Movies Sites Use the ls -l (long) option to get more information ~/ >ls -l total 0 drwx------ 10 chris staff 340 Mar 9 11:11 Desktop drwx------ 13 chris staff 442 Mar 9 11:11 Documents drwx------ 28 chris staff 952 Feb 21 10:42 Library drwx------ 3 chris staff 102 Dec 20 21:16 Movies 6

Sub-directories cd (change directory) set a new current working dir >cd Documents > pwd /Users/chris/Documents mkdir (make directory) creates a new sub-directory inside of the current directory Leah.cvs splitgti.pl > mkdir subdir Leah.cvs splitgti.pl subdir rmdir (remove directory) deletes a sub-directory, but the sub-directory must be empty > rmdir subdir Leah.cvs splitgti.pl Shortcuts There are some important shortcuts in Unix for specifying directories. (dot) means "the current directory".. means "the parent directory" - the directory one level above the current directory, so cd.. will move you up one level ~ (tilde) means your Home directory, so cd ~ will move you back to your Home. Just typing a plain cd will also bring you back to your home directory Commands for Files Files are used to store information, for example, DNA sequence data or the results of some analysis. You will mostly deal with text files cat dumps the entire contents of a file onto the screen. For a long file this can be annoying, but it can also be helpful if you want to copy and paste from your screen More Use the command more to view the contents of a file one screen at a time: > more t27054_cel.pep!!aa_sequence 1.0 P1;T27054 - protein Y49E10.20 - Caenorhabditis elegans Length: 534 May 30, 2000 13:49 Type: P Check: 1278.. 1 MLKKAPCLFG SAIILGLLLA AAGVLLLIGI PIDRIVNRQV IDQDFLGYTR 51 DENGTEVPNA MTKSWLKPLY AMQLNIWMFN VTNVDGILKR HEKPNLHEIG 101 PFVFDEVQEK VYHRFADNDT RVFYKNQKLY HFNKNASCPT CHLDMKVTIP t27054_cel.pep (87%) Hit the spacebar to page down through the file b moves back up a page At the bottom of the screen, more shows how much of the file has been displayed /pattern searches ahead for pattern 7

Copy & Move cp lets you copy a file from any directory to any other directory, or create a copy of a file with a new name in one directory cp file.ext newname.ext cp file.ext subdir/newname.ext cp /Users/chris/file.ext./subdir/newfilename.ext mv allows you to move files to other directories, but it is also used to rename files. Filename and directory syntax for mv is exactly the same as for the cp command. mv filename.ext subdir/newfilename.ext NOTE: When you use mv to move a file into another directory, the current file is deleted. Delete Use the command rm (remove) to delete files There is no way to undo this command!!! some Unix setup s will ask you to confirm others will not. Be aware of this when deleting af151074.gb_pr5 test.seq > rm test.seq af151074.gb_pr5 Wildcards You may often want to work on more than one file at a time. You can use wildcard symbols to do this. You can substitute the * as a wildcard symbol for any number of characters in any filename. If you type just * after a command, it stands for all files in the current directory: cat * will dump all files to the screen You can mix the * with other characters to form a search pattern: a*.txt would be a list all files that start with a and end in.txt The? wildcard stands for any single character: draft?.doc would match draft1.doc, draft2.doc, draftb.doc, etc. Two Tricks for everyone Command History Filename completion 8

Command history The computer remembers you last 100 or so commands. Use the up arrow to scroll back through these commands. You can use right and left arrow to move around inside the command to edit it Very nice for long commands (e.g. blast searches) Command history The computer remembers you last 100 or so commands. Use the up arrow to scroll back through these commands. You can use right and left arrow to move around inside the command to edit it Very nice for long commands (e.g. blast searches) Files name completion Since the computer will only look in the current working dir for files when you want to type in a file name all you have to type is enough characters at the beginning to uniquely define the file When you hit <TAB> the shell will finish typeing the name for you af151074.gb_pr5 test.seq apt.txt > rm af <hit tab> > rm af151074.gb_pr5 Have long names but keep the differences at the beginning to facilitate using this feature 9