INTRODUCTION TO BIOINFORMATICS

Introducing the LINUX Operating System
BecA-ILRI INTRODUCTION TO BIOINFORMATICS
Mark Wamalwa, BecA-ILRI Hub, Nairobi, Kenya
http://hub.africabiosciences.org/ http://www.ilri.org/ m.wamalwa@cgiar.org

What is UNIX?
A family of operating systems: IRIX, SOLARIS, AIX, LINUX, Digital UNIX, HP-UX ...
Multitasking: Runs more than one program at the same time. A busy system can be running several hundred or even thousands of programs at the same time.
Multiuser: Many different people can use the system at the same time.
Networked: It is designed to be linked to other computers and to allow people to work over a network. The network IS the computer.

What is LINUX?
- A freely available clone of the UNIX operating system for personal computers, created by Linus Torvalds
- Linux and Unix are time-sharing operating systems: they allow multiple users to use the system simultaneously
- Unix was developed in 1969 at Bell Labs
- Linux is similar to Unix in some respects

What does UNIX do?
    unix> help users
    Press ENTER to continue:
The Computer: console, screen, keyboard, disk storage, memory, network adapter, modem.
Kernel
- Controls access to the hardware.
- Prevents programs interfering with each other.
- Provides an easy way for programmers to talk to the electronics.
- Controls data storage and protection.
Shell (or command line)
- Interaction program. Allows the user to interact directly with the computer by typing commands.
- The shell interprets these and instructs the kernel accordingly.
- Is a separate program.
- Very powerful but can be intimidating.
X Window System
- The pointy, clicky program. Graphical interface (point, click, drag, drop etc.).
- Run from the shell.
- Easier to use than the shell but less powerful.
User programs
- Any number of users, typically accessing the system from remote machines.
- Can use many different programs at once.
Network enabled
- Many different methods to access the system.
- Can use from any number of remote machines at the same time.

Logging in
You must have a username (login id) to use a unix/linux system. This identifies you to the system so it can manage your work properly.
Log in from anywhere you have permission.
Have graphical output sent anywhere you have permission.
Every user is a member of one or more groups of users. This helps the system manage different types of user properly.

Logging in
Connect to the linux machine using:
- Putty
- WinSCP: open source SFTP (SSH File Transfer Protocol) and SCP (Secure CoPy) client for Windows using SSH (Secure SHell)
- Xterm, Telnet, Secure Shell, Kermit, other terminal emulators
A typical login session:
    Connecting to http://hpc.ilri.cgiar.org
    Connected.
    Welcome to Genotyping by Sequencing (GBS) workshop
    Login: username
    Password:
    The system will be unavailable during Ramadhan.
    You have new mail.
    username@hpc~>
unix/linux is case sensitive: username is not the same as Username or USERNAME.
linux doesn't show the password on the screen as you type it.
You may get some messages from the system administrator after logging in.

Accessing HPC from Windows systems
- Two stage process:
    1. Connecting to the system via secure shell (ssh) login
    2. Getting a graphical connection that supports X-Windows
- ssh connection: needs third party software. Local suggestion: use putty
- Process is slightly more awkward than ideal because local putty is configured for the Sun UNIX environment.
- Better: download putty.exe from http://www.chiark.greenend.org.uk/~sgtatham/putty/ It just runs from your desktop.
- Alternative: cygwin, a Linux-like environment for Windows (www.cygwin.com)

Using Local PuTTY - 1 (screenshot; annotations: "Better choice", "This is necessary for all PuTTY installs.")

Using Local PuTTY - 2 (screenshot; annotation: "linux")

Using PuTTY - 3 (screenshot)

PuTTY Terminal Screen (screenshot)

The shell or command line
Several different shells exist, but they behave more or less the same when used interactively.
1. The Prompt.
    username@hpc/home~>
- username: your username
- hpc: the machine you are logged in to
- /home~: your present location
The prompt can be customised to look how you wish.

The shell or command line
2. Commands
    username@hpc~> ls -ald *.txt
The shell breaks the command up into individual words. The boundary between words is a space. For the shell to treat a phrase that includes spaces as a single word, put it in quotes: 'my word' or "my word".
The first word is a command. The subsequent words form a list of arguments to the command.
Arguments beginning with - are options. Options control how the program runs. '-a -l -d' is equivalent to '-ald'.
* is a special character. It means any group of characters (including none). The shell finds all the filenames that match anything.txt and adds them to the list of arguments.
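The word-splitting, quoting and * rules above can be tried out safely in a scratch directory. This is only a sketch: the directory comes from mktemp and the file names are made up for illustration.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# "my notes.txt" contains a space, so it is quoted to reach touch as ONE word.
touch seq1.txt seq2.txt "my notes.txt" readme.md

# The shell expands *.txt before the command runs.
# 'set --' stores the resulting words so that we can count them with $#.
set -- *.txt
txt_count=$#
echo "*.txt expanded to $txt_count words"

# Inside quotes the * loses its special meaning and stays a literal word.
set -- '*.txt'
quoted_word=$1
echo "quoted pattern stays as: $quoted_word"

# Separate options and combined options mean the same thing.
ls -a -l -d *.txt
ls -ald *.txt
```

The two ls commands at the end should print identical listings of the three .txt files.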

More Special Characters
    *      Any group of characters including none.
    ?      Any single character.
    " '    Word delineation.
    |      Pipe. Pass the output of the command on the left as the input to the command on the right.
    &      Cause a process to run in the background.
    >      Redirect the output, eg. to a file.
    <      Redirect the input, eg. from a file instead of the keyboard.
    ``     Backticks (not '). Take the output of the command as an argument.
    $      Dollar (or string). Treat the next word as a variable and write out its value.
    \      Backslash. Change the meaning of the next character.
    ;      Semicolon. Separate commands typed in together.
Some special characters can lose their special meaning if they are inside quotes.
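A short sketch that uses each of these characters once may help. The file name names.txt and the variable names are invented for the example, and everything happens in a temporary directory.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# > sends output to a file; ; separates two commands typed on one line.
printf 'seq3\nseq1\nseq2\n' > names.txt ; echo "wrote names.txt"

# < reads input from a file instead of the keyboard;
# | passes the output of sort on to head.
sorted_first=$(sort < names.txt | head -n 1)

# Another pipe: count the lines that cat prints.
line_count=$(cat names.txt | wc -l)

# Backticks put the output of a command into the command line.
here=`pwd`
echo "working in $here"

# $ writes out the value of a variable;
# \ removes the special meaning of the character that follows it.
greeting="hello"
literal=\$greeting
echo "$greeting vs $literal"

# & runs a process in the background; wait pauses until it has finished.
sleep 1 &
wait
```

The echo near the end prints "hello vs $greeting": the first $ was interpreted by the shell, the escaped one was not.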

Organisation
"Everything is a file"
An ordinary file contains data. The data could be an image, a document, a set of instructions (a program) or any fixed information.
A directory contains other files. This is a folder on windows. A directory can contain other directories (sub-directories).
A link is a file that is a shortcut to another file. Files can have more than one name, and be in different directories at the same time.
There are many other types of file.

Organisation of the file system
    /
        bin
        usr
        home
            username
                prot
                letter
                project
                    seq1 seq2 seq3 seq4
        etc
The top of the file system is the directory '/', commonly known as the root directory.
There are several subdirectories under the root directory.
username is an example user's home directory with a subdirectory and several files.
Any file in the file system can be uniquely identified by describing the path to it from the root directory, for example /home/username/prot

Organisation of the file system
Any process is located somewhere in the filesystem. The command 'pwd' (print working dir) will tell you where.
    username@hpc ~> pwd
    /home/username
'~' is a linux shortcut for 'your home directory'.

Looking at the file system
'ls' lists the files in a directory or directories.
Without an argument, ls lists all the files that don't start with '.' in the current directory.
There are many options to ls that allow you to select and control the information it presents.
    username@hpc ~> ls
    letter project prot
    username@hpc ~> ls project
    seq1 seq2 seq3 seq4

Moving around the file system
You can move to a different directory with the command 'cd directory'.
'directory' is the directory to which you want to move. The name can be written as the full path (from root) or as the relative path (from your current directory).
'..' means the parent directory.
'.' means the current directory.
    username@hpc ~> cd /home/username/project
    username@hpc ~/project> pwd
    /home/username/project
    username@hpc ~/project> cd ..
    username@hpc ~> pwd
    /home/username
Repeat using the relative path.
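To practise full and relative paths without depending on any particular home directory, the following sketch builds a tiny tree under a temporary directory. The directory and variable names are made up; only cd, pwd, mkdir and basename are used.

```shell
# Build a small practice tree: <top>/project/seq1
top=$(mktemp -d)
mkdir "$top/project"
touch "$top/project/seq1"

# Full path: works from wherever you currently are.
cd "$top/project" || exit 1
in_project=$(basename "$(pwd)")
echo "now in: $in_project"

# Relative path: .. is the parent of the current directory.
cd .. || exit 1
after_up=$(pwd)
echo "moved up to: $after_up"

# Relative path downwards; ./project and project mean the same thing.
cd ./project || exit 1
back_again=$(basename "$(pwd)")
echo "back in: $back_again"
```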

Changing the file system
You can create a new subdirectory in the current directory with the command 'mkdir directory'.
    username@hpc ~> mkdir model
    username@hpc ~>
The home directory now contains prot, letter, project and the new directory model.

Changing the file system
You can delete an empty subdirectory with the command 'rmdir directory'.
You can delete a file with the command 'rm file'.
You can delete a subdirectory and its contents with the command 'rm -rf directory'.
    username@hpc ~> rmdir model
    username@hpc ~> rm prot
    username@hpc ~> rm -rf directory
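The difference between rmdir and rm can be checked in a scratch directory, where a mistake costs nothing. This is a sketch with invented names; the point it illustrates is that rmdir refuses to remove a directory that still has contents, while rm -r removes the directory and everything in it.

```shell
playground=$(mktemp -d)
cd "$playground" || exit 1

mkdir model                  # an empty subdirectory
mkdir project                # a second one, which gets some files
touch prot project/seq1 project/seq2

rmdir model                  # works: model is empty
rm prot                      # deletes a single file

# rmdir fails on a directory that is not empty.
if rmdir project 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi
echo "rmdir project: $rmdir_result"

# rm -r removes the directory together with its contents.
# Adding -f (rm -rf) would also suppress all questions and error messages.
rm -r project
remaining=$(ls | wc -l)
echo "entries left: $remaining"
```

There is no undo for rm, so it is worth running ls on the target first.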

More about files: filenames
Filenames can contain any normal text character including spaces and special characters.
Filenames can be almost any length.
It is best to stick to a-z, A-Z, _, -, and numbers.
It is best to keep them short as it saves typing.
If a filename contains a special character or a space you may need to put quotes around the whole path.
Special characters in filenames can cause problems with some programs.

More about files: reading files
You can print the contents of one or more files to the screen with the command: 'cat file1 file2 ...'
cat prints the whole file at once, so a file longer than just a few lines will run off the top of your screen.
You can view the contents of one or more files a page at a time on the screen with the command: 'more file1 file2 ...'
more will let you search through a file, go backwards and forwards and has many other functions.
You can print the first few lines of a file with the command: 'head file1 file2 ...'
The last few lines can be viewed with 'tail'.
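A quick way to see how cat, head and tail differ is to try them on a file that is too long for one screen. The sketch below generates a 100-line practice file (the name long.txt is made up). more is left out because it waits for key presses.

```shell
reading_dir=$(mktemp -d)
cd "$reading_dir" || exit 1

# Make a 100-line practice file: "line 1", "line 2", ...
i=1
while [ "$i" -le 100 ]; do
    echo "line $i"
    i=$((i + 1))
done > long.txt

first=$(head -n 1 long.txt)        # just the first line
last=$(tail -n 1 long.txt)         # just the last line
top_ten=$(head long.txt | wc -l)   # with no option, head prints 10 lines

# cat joins its arguments end to end, in the order given.
cat long.txt long.txt > twice.txt
twice_lines=$(wc -l < twice.txt)

echo "first: $first, last: $last"
```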

More about files: editing files
You can change the content of text files and create new files with a text editor.
Text editors edit text. They do not try to format the text like word processors.
PICO: A novice friendly basic text editor used as standard on many systems. Start with the command 'pico filename'
EMACS: A powerful editing environment which can be programmed. It has many modes for auto layout of program code. Start with the command 'emacs filename'
VI: A powerful editor which can be somewhat confusing for newcomers. It is designed for rapid editing of text files and programming. Start with the command 'vi filename'
Others: kedit, gedit, kwrite etc.

More about files: copying files
You can copy a file with the command 'cp oldfilename newfilename'.
If newfilename is a directory, then the file will be copied to 'newfilename/oldfilename'.
    username@hpc ~> ls
    letter project
    username@hpc ~> cp letter draft
    username@hpc ~> ls
    draft letter project
    username@hpc ~> mv oldfilename newfilename
Warning: If a file called newfilename already exists then it will be overwritten.
The command 'mv oldfilename newfilename' can be used to rename a file.
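The copy and rename rules, including the overwrite warning, can be rehearsed as follows. This is a sketch in a temporary directory; the file names follow the slide, and the existence check is one possible way to guard against overwriting, not the only one.

```shell
copy_dir=$(mktemp -d)
cd "$copy_dir" || exit 1
echo "Dear reader" > letter
mkdir project

cp letter draft       # copy under a new name
cp letter project     # the target is a directory, so this creates project/letter
mv draft final        # rename draft to final

# cp and mv overwrite an existing target, so check before copying.
if [ -e final ]; then
    copy_result="kept existing final"
else
    cp letter final
    copy_result="copied letter to final"
fi
echo "$copy_result"
ls
```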

More about files: permissions
Every file is protected. Permissions determine who can read, write, or execute a given file.
Owner: The user who owns the file.
Group: Other users in the same group as the user who owns the file.
World: All the other users in the system.
Files can have read (r), write (w) or execute (x) permission for each of the three types of user.

More about files: permissions
You can view the permissions for a file by listing it in long format with the command 'ls -l filename' (the option is the letter l).
    username@hpc ~> ls -l letter
    -rwxr--r-- 1 username users 6048 Aug 17 16:07 letter
Reading the line from left to right:
    -               The file type: - ordinary file, d directory, l link (shortcut)
    rwx             Permissions for the owner
    r--             Permissions for the owner's group
    r--             Permissions for everyone else
    username        The user who owns the file
    users           The file's group
    6048            The file's size
    Aug 17 16:07    The date the file was last modified
    letter          The file's name

More about files: permissions
You can change the permissions for a file with the command 'chmod change filename'.
'change' is the modification you want to make to the file's permissions.
    username@hpc ~> ls -l letter
    -rwxr--r-- 1 username users 6048 Aug 17 16:07 letter
    username@hpc ~> chmod o-r letter
    username@hpc ~> ls -l letter
    -rwxr----- 1 username users 6048 Aug 17 16:07 letter
    username@hpc ~>
For whom you are changing permissions:
    u    user
    g    group
    o    other
    a    all
How you are changing permissions:
    -    remove these permissions
    +    add these permissions
    =    set permissions to this
Permissions being changed:
    r    read permission
    w    write permission
    x    execute (run) permission
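The who/how/what pieces of a chmod change can be combined freely. The sketch below applies a few combinations to a scratch file and records the permission string after each one. show_mode is a helper defined only for this example; it cuts the first ten characters out of the ls -l line.

```shell
perm_dir=$(mktemp -d)
cd "$perm_dir" || exit 1
echo "Dear reader" > letter

# Helper for this example only: print the type-and-permission field.
show_mode() { ls -l "$1" | cut -c1-10; }

chmod u=rwx,g=r,o=r letter    # set all three classes explicitly
start_mode=$(show_mode letter)

chmod o-r letter              # others lose read permission
after_remove=$(show_mode letter)

chmod g+w letter              # the group gains write permission
after_add=$(show_mode letter)

chmod a=r letter              # everyone gets read permission only
read_only=$(show_mode letter)

echo "$start_mode -> $after_remove -> $after_add -> $read_only"
```

The second step is the same change as on the slide: -rwxr--r-- becomes -rwxr-----.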

Introduction to Awk
Awk is a convenient and expressive programming language that can be applied to a wide variety of computing and data manipulation tasks.

Awk
- Works well on record-type data
- Reads input file(s) a line at a time
- Parses each line into fields
- Performs user-defined tests against each line, performs actions on matches

Other Common Uses
- Input validation
    Does every record have the same # of fields?
    Do values make sense (negative time, hourly wage > $100, etc.)?
- Filtering out certain fields
- Searches
    Who got a zero on lab 3?
    Who got the highest grade?
- Many others (it's late)

Invocation
- Can write little one-liners on the command line (very handy). Print the 3rd field of every line:
    $ awk '{ print $3 }' input.txt
- Execute an awk script file:
    $ awk -f script.awk input.txt
- Or, use this shebang as the first line, and give your script execute permissions:
    #!/bin/awk -f
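The three ways of running awk can be compared side by side on a two-line input file. This is a sketch: input.txt and the script names are made up, and because awk is not installed at /bin/awk on every system, the shebang line here is built from whatever path 'command -v awk' reports.

```shell
awk_dir=$(mktemp -d)
cd "$awk_dir" || exit 1
printf 'a b c\nd e f\n' > input.txt

# 1. One-liner on the command line.
one_liner=$(awk '{ print $3 }' input.txt)

# 2. The same program stored in a file and run with -f.
printf '{ print $3 }\n' > script.awk
from_file=$(awk -f script.awk input.txt)

# 3. An executable script whose first line names the interpreter.
awk_path=$(command -v awk)
printf '#!%s -f\n{ print $3 }\n' "$awk_path" > third.awk
chmod u+x third.awk
./third.awk input.txt

echo "$one_liner"
echo "$from_file"
```

All three should print the third field of each line, c and then f.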

Form of an AWK program
- AWK programs are entries of the form:
    pattern { action }
- pattern: some test, looking for a pattern (regular expressions) or C-like conditions
    if null, actions are applied to every line
- action: a statement or set of statements
    if not provided, the default action is to print the entire line, much like grep

Awk Features
- Patterns can be regular expressions or C-like conditions.
- Each line of the input is matched against the patterns, one after the next. If a match occurs the corresponding action is performed.
- Input lines are parsed and split into fields, which are accessed by $1, ..., $NF, where NF is a variable set to the number of fields. The variable $0 contains the entire line, and by default lines are split by white space (blanks, tabs).
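The pattern and action rules are easiest to see on a few lines of data. In this sketch the file reads.txt and its contents are invented for illustration. Each awk call shows one case: a pattern with no action, a condition with an action, and an action with no pattern.

```shell
feat_dir=$(mktemp -d)
cd "$feat_dir" || exit 1
printf 'seq1 120 pass\nseq2 45 fail\nseq3 300 pass\n' > reads.txt   # made-up data

# Regular-expression pattern, no action: matching lines are printed whole, like grep.
regex_hits=$(awk '/pass/' reads.txt | wc -l)

# C-like condition as the pattern, with an explicit action.
long_names=$(awk '$2 > 100 { print $1 }' reads.txt)

# No pattern: the action runs on every line. $NF is the last field of the line.
last_fields=$(awk '{ print $NF }' reads.txt)

echo "$long_names"
echo "$last_fields"
```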

Variables
- Not declared, nor typed
- No character type. Only strings and floats (support for ints)
- $n refers to the nth field (where n is some integer value)
    # prints each field on the line
    for( i=1; i<=NF; ++i ) print $i
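The field loop above only runs inside an action, so a complete command looks like the sketch below. The input strings are made up. The last command shows why the capital letters in NF matter: a lowercase nf is just another undeclared variable with the value 0, so the loop body never runs.

```shell
# Print every field with its position; i is never declared.
fields=$(echo 'alpha beta gamma' | awk '{ for (i = 1; i <= NF; ++i) print i ": " $i }')
echo "$fields"

# Fields are used as numbers when the context asks for it.
sum=$(echo '1.5 2 3' | awk '{ total = 0; for (i = 1; i <= NF; ++i) total += $i; print total }')
echo "sum: $sum"

# nf (lowercase) is an unset variable, treated as 0, so nothing is printed.
nothing=$(echo 'alpha beta gamma' | awk '{ for (i = 1; i <= nf; ++i) print $i }')
echo "lowercase nf printed: '$nothing'"
```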

Some Built-in Variables
- FS: the input field separator
- OFS: the output field separator
- NF: # of fields; changes w/each record
- NR: the # of records read (so far). So, the current record #.
- $0: the entire input line
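A small sketch of the built-in variables at work, on made-up colon-separated records. It relies on two standard awk features that the slides do not cover: a BEGIN block, which runs before any input is read, and an END block, which runs after the last record.

```shell
# Two invented records with ':' between the fields.
snp_lines='1:rs100:12345
2:rs200:67890'

# Read ':'-separated input, write tab-separated output.
# For each record print its number (NR), its field count (NF) and field 2.
reformatted=$(echo "$snp_lines" | awk 'BEGIN { FS = ":"; OFS = "\t" } { print NR, NF, $2 }')
echo "$reformatted"

# In the END block, NR holds the total number of records read.
record_count=$(echo "$snp_lines" | awk 'END { print NR }')
echo "records: $record_count"

# -F on the command line is a shorthand for setting FS.
first_column=$(echo "$snp_lines" | awk -F: '{ print $1 }')
```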

Getting help
You can get help on a command by using the command 'man command'. This will bring up the manual page and show it to you screen by screen.
If you do not know what a command is called, use the option '-k' to get a list of commands that may be relevant: 'man -k word'. This will find all manual pages containing word in the short description of the command.
Try using the options '-h', '-help', or '--help' if you can't find the man page.

Exercise: Filter SNPS
Go to http://hpc.ilri.cgiar.org/beca/gbs/ and run these commands in your home directory:
a) mkdir snp_data
b) cd snp_data
c) wget http://hpc.ilri.cgiar.org/beca/gbs/africa55k_10pops.bim
d) wget http://hpc.ilri.cgiar.org/beca/gbs/emp.data
e) ls -alh
f) grep '^23\|^25\|^26' Africa55K_10Pops.bim > AfricaAll_Pops_non_autosomal.rsids
g) awk '{if ($1 > 22) print $2}' Africa55K_10Pops.bim > Africa55K_10Pops.xchrsnps
Note: file names are case sensitive, so in f) and g) type the .bim file name exactly as 'ls' shows it.
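Before running steps f) and g) on the real download, it can help to see what they do on a file small enough to check by eye. The sketch below uses five made-up rows in the six-column layout of a PLINK .bim file (chromosome, SNP id, genetic distance, position, allele 1, allele 2). The toy file names are invented. grep -E is used as a portable spelling of the same alternation, and the [[:space:]] at the end makes sure the whole chromosome field is matched.

```shell
bim_dir=$(mktemp -d)
cd "$bim_dir" || exit 1

# Five invented rows: chromosomes 1 and 22 are autosomes, 23, 25 and 26 are not.
printf '1\trs0001\t0\t1000\tA\tG\n'   > toy.bim
printf '22\trs0022\t0\t2000\tC\tT\n' >> toy.bim
printf '23\trs0023\t0\t3000\tA\tC\n' >> toy.bim
printf '25\trs0025\t0\t4000\tG\tT\n' >> toy.bim
printf '26\trs0026\t0\t5000\tA\tT\n' >> toy.bim

# Step f): keep whole lines whose chromosome field is 23, 25 or 26.
grep -E '^(23|25|26)[[:space:]]' toy.bim > toy_non_autosomal.rsids

# Step g): print only the SNP id (field 2) where the chromosome number is above 22.
awk '$1 > 22 { print $2 }' toy.bim > toy.xchrsnps

kept_lines=$(wc -l < toy_non_autosomal.rsids)
cat toy.xchrsnps
```

Note that f) writes complete lines while g) writes only the SNP ids, so the two output files differ even though they select the same rows here.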

Example
    $ cat emp.data
    Beth 4.00 0
    Dan 3.75 0
    Kathy 4.00 10
    Mark 5.00 20
    Mary 5.50 22
    Susie 4.25 18
Print those employees who actually worked:
    $ awk '$3>0 {print $1, $2*$3}' emp.data
    Kathy 40
    Mark 100
    Mary 121
    Susie 76.5
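The same emp.data file supports a couple of natural follow-up questions. The sketch below recreates the file from the listing on the slide, then asks who did not work and what the total pay is. The END block (run once after the last line) is standard awk but is not covered in the slides.

```shell
emp_dir=$(mktemp -d)
cd "$emp_dir" || exit 1

# Recreate emp.data: name, hourly rate, hours worked.
printf 'Beth 4.00 0\nDan 3.75 0\nKathy 4.00 10\n' > emp.data
printf 'Mark 5.00 20\nMary 5.50 22\nSusie 4.25 18\n' >> emp.data

# Employees who did not work: hours (field 3) equal to zero.
idle=$(awk '$3 == 0 { print $1 }' emp.data)

# Total pay: add rate * hours for every line, print once at the end.
total=$(awk '{ pay += $2 * $3 } END { print pay }' emp.data)

echo "did not work:"
echo "$idle"
echo "total pay: $total"
```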

Acknowledgement
- SANBI (David Martin)
- BSK
Adapted from SANBI & Bioinformatics Society of Kenya/BSK

Useful literature
'Learning the UNIX operating system', O'Reilly press.
'UNIX Quickguide hpc
Questions?