Unix Essentials. BaRC Hot Topics Bioinformatics and Research Computing Whitehead Institute October 12 th

Similar documents
Introduction to UNIX command-line

Introduction to UNIX command-line II

Introduction to Unix The Windows User perspective. Wes Frisby Kyle Horne Todd Johansen

Introduction: What is Unix?

Using Linux as a Virtual Machine

Computer Systems and Architecture

Carnegie Mellon. Linux Boot Camp. Jack, Matthew, Nishad, Stanley 6 Sep 2016

Using UNIX. -rwxr--r-- 1 root sys Sep 5 14:15 good_program

Where can UNIX be used? Getting to the terminal. Where are you? most important/useful commands & examples. Real Unix computers

Perl and R Scripting for Biologists

Quick Start Guide. by Burak Himmetoglu. Supercomputing Consultant. Enterprise Technology Services & Center for Scientific Computing

UoW HPC Quick Start. Information Technology Services University of Wollongong. ( Last updated on October 10, 2011)

CSCI 2132 Software Development. Lecture 4: Files and Directories

Computer Systems and Architecture

Quick Start Guide. by Burak Himmetoglu. Supercomputing Consultant. Enterprise Technology Services & Center for Scientific Computing

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK

Connecting to ICS Server, Shell, Vim CS238P Operating Systems fall 18

Bioinformatics. Computational Methods I: Genomic Resources and Unix. George Bell WIBR Biocomputing Group

Files and Directories

Unix Workshop Aug 2014

Introduction to Linux for BlueBEAR. January

Introduction to UNIX I: Command Line 1 / 21

Research. We make it happen. Unix Basics. User Support Group help-line: personal:

Introduction to Unix. University of Massachusetts Medical School. October, 2014

Introduction to Linux

A Brief Introduction to the Linux Shell for Data Science

Unix/Linux Operating System. Introduction to Computational Statistics STAT 598G, Fall 2011

Unix Tutorial Haverford Astronomy 2014/2015

Unix/Linux Basics. Cpt S 223, Fall 2007 Copyright: Washington State University

CSE 303 Lecture 2. Introduction to bash shell. read Linux Pocket Guide pp , 58-59, 60, 65-70, 71-72, 77-80

Arkansas High Performance Computing Center at the University of Arkansas

Chapter-3. Introduction to Unix: Fundamental Commands

Introduction to remote command line Linux. Research Computing Team University of Birmingham

Unix L555. Dept. of Linguistics, Indiana University Fall Unix. Unix. Directories. Files. Useful Commands. Permissions. tar.

Short Read Sequencing Analysis Workshop

Introduction to the Linux Command Line

Linux command line basics II: downloading data and controlling files. Yanbin Yin

Working With Unix. Scott A. Handley* September 15, *Adapted from UNIX introduction material created by Dr. Julian Catchen

Unix Tools / Command Line

Read the relevant material in Sobell! If you want to follow along with the examples that follow, and you do, open a Linux terminal.

Linux Bootcamp Fall 2015

Contents. Note: pay attention to where you are. Note: Plaintext version. Note: pay attention to where you are... 1 Note: Plaintext version...

acmteam/unix.pdf How to manage your account (user ID, password, shell); How to compile C, C++, and Java programs;

CS CS Tutorial 2 2 Winter 2018

Lab Working with Linux Command Line

Session 1: Accessing MUGrid and Command Line Basics

Introduction to UNIX. Logging in. Basic System Architecture 10/7/10. most systems have graphical login on Linux machines

Recap From Last Time:

No Food or Drink in this room. Logon to Windows machine

Unix tutorial. Thanks to Michael Wood-Vasey (UPitt) and Beth Willman (Haverford) for providing Unix tutorials on which this is based.

BGGN 213 Working with UNIX Barry Grant

First of all, these notes will cover only a small subset of the available commands and utilities, and will cover most of those in a shallow fashion.

Introduction to UNIX. Introduction. Processes. ps command. The File System. Directory Structure. UNIX is an operating system (OS).

Introduction to UNIX. CSE 2031 Fall November 5, 2012

The Linux Command Line & Shell Scripting

INTRODUCTION TO BIOINFORMATICS

Introduction to Linux Part 1. Anita Orendt and Wim Cardoen Center for High Performance Computing 24 May 2017

User Guide Version 2.0

Linux at the Command Line Don Johnson of BU IS&T

Linux Command Line Primer. By: Scott Marshall

Introduction. File System. Note. Achtung!

Introduction to Unix: Fundamental Commands

Unix Basics. Benjamin S. Skrainka University College London. July 17, 2010

CS197U: A Hands on Introduction to Unix

When talking about how to launch commands and other things that is to be typed into the terminal, the following syntax is used:

Introduction in Unix. Linus Torvalds Ken Thompson & Dennis Ritchie

Computational Skills Primer. Lecture 2

Introduction To. Barry Grant

Linux for Beginners Part 1

Practical Session 0 Introduction to Linux

Introduction to Linux Workshop 1

EE516: Embedded Software Project 1. Setting Up Environment for Projects

Linux/Cygwin Practice Computer Architecture

Introduction to Linux

CENG 334 Computer Networks. Laboratory I Linux Tutorial

CS 460 Linux Tutorial

Introduction to Linux Environment. Yun-Wen Chen

Introduction to Linux Organizing Files

AMS 200: Working on Linux/Unix Machines

Introduction to Linux. Roman Cheplyaka

VERY SHORT INTRODUCTION TO UNIX

CHAPTER 1 UNIX FOR NONPROGRAMMERS

The Unix Shell. Pipes and Filters

Introduction To Linux. Rob Thomas - ACRC

BIOINFORMATICS POST-DIPLOMA PROGRAM SUBJECT OUTLINE Subject Title: OPERATING SYSTEMS AND PROJECT MANAGEMENT Subject Code: BIF713 Subject Description:

CS Fundamentals of Programming II Fall Very Basic UNIX

Helsinki 19 Jan Practical course in genome bioinformatics DAY 0

Unix basics exercise MBV-INFX410

Working with Basic Linux. Daniel Balagué

Introduction to UNIX. SURF Research Boot Camp April Jeroen Engelberts Consultant Supercomputing

Lab 1 Introduction to UNIX and C

Week 2 Lecture 3. Unix

Principles of Bioinformatics. BIO540/STA569/CSI660 Fall 2010

Lecture # 2 Introduction to UNIX (Part 2)

Introduction p. 1 Who Should Read This Book? p. 1 What You Need to Know Before Reading This Book p. 2 How This Book Is Organized p.

CS 215 Fundamentals of Programming II Spring 2019 Very Basic UNIX

Getting Started with UNIX

commandname flags arguments

Recitation #1 Boot Camp. August 30th, 2016

Transcription:

Unix Essentials BaRC Hot Topics Bioinformatics and Research Computing Whitehead Institute October 12 th 2016 http://barc.wi.mit.edu/hot_topics/ 1

Outline Unix overview Logging in to tak Directory structure Accessing Whitehead drives/finding your lab share Basic commands Exercises BaRC and Whitehead resources LSF 2

What is unix? and Why unix? Unix is a family of operating systems that are related to the original UNIX operating system It is powerful, so large datasets can be analyzed Many repetitive analyses or tasks can be easily automated Some computer programs only run on the unix operation system TAK (our unix server): lots of software and databases already installed or downloaded 3

Where can unix be used? Mac computers Come with unix Windows computers need Cygwin: \\grindhouse\software\cygwin\wibr-cygwin-installer-1010.exe Dedicated unix server tak, the Whitehead Scientific Linux 4

Logging in to tak Requesting a tak account http://iona.wi.mit.edu/bio/software/unix/bioinfoaccount.php Windows PuTTY or Cygwin Xming: setup X-windows for graphical display Macs Access through Terminal 5

Connecting to tak for Windows 3) Click Open 1) Open putty 2.1) Write the address of the Host 2.2) Click on SSH->X11-> Enable X11 forwarding Note: When you write the password you won t see any characters being typed. Command Prompt user@tak ~$ 6

Log in to tak for Mac ssh Y username@tak.wi.mit.edu username@tak.wi.mit.edu s 7

Unix Directory Structure root / home usr bin nfs lab... jdoe BaRC_Public BaRC_training solexa_public solexa_lodish page /home/jdoe /lab/page See http://www.thegeekstuff.com/2010/09/linux-file-system-structure/ for more detailed info. nfs and lab are directories specific to Whitehead 8

Accessing Shared Resources at Whitehead Unix /nfs/barc_public /nfs/barc_training /lab/solexa_public Windows (access using Start Menu Search) \\wi-files1\barc_public \\wi-files1\barc_training \\wi-htdata\solexa_public Macs (access using Go Connect to Server ) cifs://wi-files1/barc_public cifs://wi-files1/barc_training cifs://wi-htdata/solexa_public Where s my lab s share? http://it.wi.mit.edu/systems/file-storage/lab-share-paths 9

Get ready for the exercises Use this link to copy paste the commands for the exercises http://jura.wi.mit.edu/bio/education/hot_topics/unix_essentials _2016/UnixEssentials_HandsOn.txt These folders contain today s slides and the material for the exercises \\wi-files1\barc_public\hot_topics\unixessentials_2016 cifs://wi-files1/barc_public/hot_topics/unixessentials_2016 10

Unix Tips Use to reuse previous commands Ctrl-c: stop a process that is running Tab-completion: Complete commands/file names Unix is case-sensitive 11

Basic commands listing and organizing files/folders list the contents of a directory: ls [only show names] ls l [long listing: show other information too] ls h [human readable] make a directory: mkdir dirname change directory: cd dirname The directories. and.. The current directory (.) cd. The parent directory (..) cd.. (go to the parent directory) ls.. (list to the parent directory) print working directory: pwd Go to your home directory: cd or cd ~ Go to the directory you where last: cd - 12

Paths Absolute path example /nfs/barc_training/username/mywork /nfs/barc_training/username/myfig cd /nfs/barc_training/username/mywork cd /nfs/barc_training/username/myfig Relative path example../dirname if I am in /nfs/barc_training/username/mywork I can do: cd../username/myfig 13

Unix Directory Structure root / home usr bin nfs lab... jdoe BaRC_Public BaRC_training solexa_public solexa_lodish username page mywork myfig cd /nfs/barc_training/username/myfig cd../username/myfig 14

Permissions drwxrwxr-x Type: directory (d) symbolic link(l) User Group Others r read w write x execute Use chmod to change permissions user(u), group(g), others(o), all(a) chmod u+x foo.pl (user can execute) chmod g-w foo.pl (group can t write) thiruvil@tak /nfs/barc_public$ ls -l myfile.txt -rw-r--r-- 1 thiruvil barc 0 2012-10-10 13:32 myfile.txt thiruvil@tak /nfs/barc_public$ chmod g+w myfile.txt thiruvil@tak /nfs/barc_public$ ll myfile.txt -rw-rw-r-- 1 thiruvil barc 0 2012-10-10 13:32 myfile.txt 15

Basic commands copying, moving files, getting help copy: cp move: mv remove: rm remove directory: rmdir get help on a command: man {command} Do Exercises 1, 2 and 3 16

Displaying the contents of a file on the screen cat filename: Dump a file to the screen more filename: Progressively dump a file to the screen: ENTER = one line down SPACEBAR = page down q=quit less filename: like more but with extended capabilities head filename: Show the first few lines of a file head -n filename: Show the first n lines of a file tail filename: Show the last few lines of a file tail -n filename: Show the last n lines of a file clear: clear screen 17

Editing a File Command-line editors pico nano emacs (emacs nw) vi Graphical editors (Windows users need an X-windows emulator) Note: may not be part of standard installation nedit gedit xemacs Put an & at the end of command line to run it in the background when using a graphical editor so that you can continue to use the terminal window eg. gedit myfile.txt& 18

Parsing a File: cut Word count: wc Select columns of interest cut f 9,12-15 mygenevalues.txt > col_9.12to15.txt Options: -f output only these fields -d field delimiter Count number of lines/words/characters in file wc myfile wc -w (count words only) wc -l (count lines only) 19

Output Redirection and Piping Write output of a command to file Write to output file sort myfile.txt > myfile_sorted.txt Over-write output file (if it exists) sort myfile.txt > myfile_sorted.txt Append to output file sort myfile.txt >> myfile_sorted.txt Piping : use output of one command as input for another command sort myfile.txt more Do Exercises 4, 5, 6 and 7 20

Sorting and removing redundancy sort and uniq Sort on column(s) sort -k 3,3 mygeneexpression.txt more Options: -n, --numeric-sort compare according to string numerical value -g, --general-numeric-sort compare according to general numerical value -r reverse -k pos1,pos2 start sorting at pos1, end it at pos2 Get only unique entries uniq mysortedgenes.txt > myuniqgenes.txt Options: -c count entries -d duplicate counts make sure that the file is sorted before running uniq Do Exercises 8 and 9 21

Searching the contents of a file grep (global regular expression print) Find words, or patterns, occurring in lines of a file grep TMEM genelist.txt TMEM131 TMEM9B TMEM14C TMEM66 TMEM49 Options: -v select non-matching lines -i ignore case -n print line number Example: get TMEM but exclude TMEM14C grep TMEM genelist.txt grep -v "TMEM14C" more Do Exercise 10 22

Getting Files Getting files or directories Files wget http://www.broadinstitute.org/igv/projects/downloads/igv_2.1.17.zip Directories from (outside) servers scp -r jdoe@copper.broadinstitute.org:/broad/lab/works. scp r origin destination scp -r username@severtocopyfrom:/pathtofoldertocopyfrom. 23

.gz file (Un)Compressing Files Compress: gzip file.txt (file.txt.gz will be created) Uncompress: gunzip file.txt.gz (file.txt will be created).tar.gz file Compress: tar czvf myfiles.tar.gz myfiles Uncompress: tar xzvf myfiles.tar.gz Options -c create an archive -x extract an archive -f FILE name of archive -v be verbose, list all files being archived/extracted -z create/extract archive with gzip/gunzip View compressed files using: zmore,zgrep 24

BaRC Resources jura.wi.mit.edu 25

BaRC SOP http://barcwiki.wi.mit.edu/wiki/sops 26

Unix Commands and BaRC Scripts UNIX R Perl 27

Running Scripts on Unix Perl bed2gff.pl or./bed2gff.pl or path/bed2gff.pl R run_rma_customcdf.r Python myscript.py or./myscript.py Matlab matlab -nodesktop -nosplash myscript.m Java Archive (JAR) java -Xmx1000m -jar /usr/local/share/igvtools/igv.jar 28

Running Programs/Tools on Unix bedtools bedtools intersect -a mygenes1.bed b mygenes2.bed Other utilities: http://code.google.com/p/bedtools/wiki/usage samtools samtools view myfile.bam Other utilities: http://samtools.sourceforge.net/samtools.shtml Fastx toolkit fastx_quality_stats -i myseq.fastq -o fastxstats_myseq FastQC fastqc myseq.fastq BLAST blastp task blastp -db myprotdb.fa q myprot.fa out out.txt 29

Commonly Used Data Locations at Whitehead Location /nfs/genomes /nfs/seq/data /nfs/barc_datasets Description Genome data: gff, gtf, fasta, bowtie indexed files, blat indexed file, etc. for several organisms Sequence data, including blast databases, for several organisms Large (array/ngs) datasets: HBI, HBM 2.0 30

LSF cluster jobs https://tak.wi.mit.edu 31

LSF Commands bsub to submit jobs bsub wc l reads.fq bsub sort foo.txt > sorted.txt Options: -e error_file -o standard_out_file -m machine (send the job to that machine) -n number (use that many processors in the cluster) bsub -n 4 -R "span[hosts=1] (use 4 processors all in the same machine in the cluster) bjobs to view your jobs bjobs bkill to kill a job bkill 237878 32

BaRC: Unix Info Further Reading http://iona.wi.mit.edu/bio/education/unix_intro.php LSF Cluster (incl. examples) http://iona.wi.mit.edu/bio/bioinfo/docs/lsf_help.php Whitehead IT Scientific Computing Tutorials http://it.wi.mit.edu/ 33

Upcoming Hot Topics Unix advanced (Oct 31st) Introduction to R (Nov) Introduction to R graphics (Nov) 34

Survey https://www.surveymonkey.com/r/tp5lbgj 35