COMPARATIVE MICROBIAL GENOMICS ANALYSIS WORKSHOP. Exercise 3: Pan- and Core- genome analysis, Pan-genome tree


1. Pan- and Core- genome plot construction

Pan- and core-genome plots are graphs that display to what extent gene families are conserved within a set of genomes. Conservation is evaluated by first BLASTing the proteomes of the genomes against each other. This is done in a fixed order: each proteome is searched with BLAST against all previous proteomes in the input list. The result, for each proteome in the order of the input list, is a set of numbers showing:

    Number of new genes
    Number of new families
    Size of core genome
    Size of pan genome

Two genes are considered to belong to the same gene family if they are more than 50% identical over more than 50% of the length of the longer of the two genes.

You will use the script coregenome-2.7c.pl. This script takes a list of proteomes, uses BLAST to search for gene families, and derives the number of core and pan proteins for each proteome. The output is then passed to an R script, which plots the core/pan values as a function of the proteome number. Just like the BLAST matrix script you tried the other day, this script caches all the BLAST results.

If you change the order of the analysed proteomes in the plot, the BLAST searches must be carried out again. However, the BLAST results are stored after you have run them once, so producing a plot with a new order of proteomes should not take much time.

Log in to the CBS computers as you did two days ago: Open the ssh shell client. Press Quick connect.
Host name: organism.cbs.dtu.dk
User name: studxx

    stud137    Escherichia
    stud138    Salmonella
    stud139    Cyanobacteria
    stud140    Bifidobacterium
    stud141    Lactic Acid Bacteria
    stud143    Bacteroidetes
    stud144    Chlorobi
    stud145    Clostridium
    stud146    Bacillus
    stud147    Mycobacterium
    stud148    Archaea
    stud149    Brucella
    stud150    Pseudomonas
    stud151    Rhodobacter and Rhizobium
    stud152    Staphylococcus

Password: V2Gubeso

Press the New Terminal Window icon to open a working terminal window. Connect to another server called life:

    ssh -X -Y life

The password is always the same. Change directory to the one containing all the .orf.fsa files:

    cd proteins

Make a list of genomes for analysis:

    foreach i ( *.fsa )
        echo $i:r:r ../proteins/$i >> ../blast/PanCorePlot.cf
    end

Change to the directory where you will store the analysis results. Let us say it is the same blast directory you created in the last exercise. Find the configuration file called PanCorePlot.cf (less PanCorePlot.cf). It should contain two columns, one with the name of the strain and the other with the proteome file. The commands below convert the separator to a tab and shorten the names to nicer ones:

    cd ../blast/
    perl -pi -e 's/ /\t/g' PanCorePlot.cf
    perl -pi -e 's/\_prodigal\t/\t/g' PanCorePlot.cf
    less PanCorePlot.cf

Did anything change?

The final step is to run the pan-/core-genome plot construction. Once again we will run it on the server, not on the queuing system. The command looks like this. Note that the command is long, so it did not fit on one line in the handout:

    ~carsten/scripts/coregenome/coregenome-2.7c.pl -keep PanCorePlot_v27_memory PanCorePlot.cf > PanCorePlot_v27.ps

The -keep option saves all the all-against-all proteome BLAST results to a directory, which can be used afterwards for microarray probe design. When the job is done, open the result with gs, gv or ghostview, as you did for the BLAST matrix:

    ghostview PanCorePlot_v27.ps

The names on the figure might not fit and may be difficult to read. In this case you can produce the plot again using an older version of the script, coregenome-2.3.pl. The command for this looks like this:

    ~carsten/scripts/coregenome/coregenome-2.3.pl PanCorePlot.cf > PanCorePlot_v23.ps

It should not take much time, since the BLAST results will be taken from the cache.

Question: Look at the plot. Can you tell approximately how many gene families your genomes have in common? How many gene families are there in total for your genomes?

2. Pan-genome tree

A pan-genome tree is a tree based on the presence or absence of genes in the pan-genome across all the genomes. There are several ways of building one. Return to your home directory:

    cd

Copy the pan-genome tree package and extract the files from it:

    cp ~stud137/pantree_organism.tar .
    tar xvf pantree_organism.tar

Enter the directory containing the scripts and files you will need:

    cd pantree_organism/organism/blast

Copy all the previously predicted proteome files here:

    cp /home/people/studxx/proteins/*.fsa .

Here you will use a small script that makes a tab-separated configuration file. The configuration file should contain 3 columns: GID, File and Name, where GID (Genome ID) is used for computational purposes, File is the .fsa file containing the proteome of each genome, and Name is the genome name that will be shown on the resulting tree.

    perl ~oksana/perl/pantree_mapping.pl

Look at the resulting configuration file to make sure it contains the columns described above and that all the genomes are present:

    less organism_mapping.txt

PLEASE DO NOT DO THE NEXT STEPS. JUST READ WHAT YOU WOULD BE DOING IF YOU HAD MUCH MORE TIME.

You would start by running an all-against-all BLAST search to obtain information on the differences and similarities between the proteomes. R scripts were written for this purpose. Start R as installed on the CBS machines:

    R-2.9

script_fastaprep.r renames the proteome files so that each name contains the GID of its genome, and script_blastall.r prepares the files for submission to the queueing system.

    source("script_fastaprep.r")
    source("script_blastall.r")
    q()

    mkdir database
    mv *blastdb* database/

The following foreach loops prepare the queueing-system submission files for each proteome and submit them. This runs the actual all-against-all BLAST.

    foreach i ( *.fsa_xmsub )
        perl ~oksana/perl/pantree_qsub.pl $i > $i:r.qsub
    end
    foreach i ( *.qsub )
        msub -d /home/people/studxx/pantree_organism/organism/blast/ -l ncpus=2,mem=10gb,walltime=5:00:00 /home/people/studxx/pantree_organism/organism/blast/$i
    end

INSTEAD, YOU WILL COPY THE BLAST FILES FROM THE PREVIOUSLY MADE DATABASE AND CONTINUE WITH THE TREE-MAKING PLAN. Believe me, you want to do it like this, in order not to spend time on the dark and cold campus, next to ponds full of strange creatures.

    cp /home/people/studxx/organism/pantree/organism/blast/res/*.txt /home/people/studxx/pantree_organism/organism/blast/res

Check the number of BLAST results you copied. It should equal the square of the number of analysed genomes. If it does not, ask Oksana for help.

    lt res/*.txt | wc

Now you are ready to build the pan-matrix. The matrix contains information about the presence or absence of each gene in each genome: if the gene is present the matrix cell is 1, otherwise it is 0. Once again, several R scripts have been prepared for this purpose.

    R-2.9
    source("script_panmatrix.r")

You will see the progress of the pan-matrix calculation in the window. Wait until it is done and you see the > sign at the beginning of the command line again. When it is done, run another R script, which builds a tree

from the pan-matrix. Make sure Xming (for Windows) or X11 (for Mac) is installed and running on your computer. First the script will visualize the tree in a newly opened X11 window, and then bootstrapping will be done.

    source("script_pantree.r")

The tree is finished when the red bootstrap numbers have appeared on it. To save the tree, do the following:

    dev.copy(device = postscript, file = "pangenomeplot.ps")
    dev.off()

Quit R:

    q()

Questions: Compare the pan-genome tree to the 16S rRNA tree you made before. What does each tree illustrate? Are they different? If yes, why?
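The bookkeeping behind the pan- and core-genome plot of Section 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the coregenome scripts: the function names, the toy strains and the family identifiers are all hypothetical, and the gene families are assumed to have been assigned already (for example with the 50% identity over 50% of the longer gene rule quoted in Section 1).

```python
def same_family(identity_pct, aligned_len, len_a, len_b):
    """Workshop rule: more than 50% identity over more than 50%
    of the length of the longer of the two genes."""
    longest = max(len_a, len_b)
    return identity_pct > 50.0 and aligned_len > 0.5 * longest


def pan_core_curve(genomes):
    """genomes: ordered list of (name, set_of_family_ids).

    Returns one row per genome, in input order:
    (name, number of new families, core size, pan size).
    """
    pan = set()    # families seen in at least one genome so far
    core = None    # families seen in every genome so far
    rows = []
    for name, families in genomes:
        new = families - pan
        pan |= families
        core = set(families) if core is None else core & families
        rows.append((name, len(new), len(core), len(pan)))
    return rows


# Hypothetical toy input: three strains and their gene families.
toy = [
    ("strainA", {"f1", "f2", "f3"}),
    ("strainB", {"f1", "f2", "f4"}),
    ("strainC", {"f1", "f4", "f5"}),
]

for row in pan_core_curve(toy):
    print(row)
```

Because the core can only shrink and the pan can only grow as genomes are added, the two curves in the plot move apart, and changing the input order changes the intermediate values but not the final core and pan sizes.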
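The pan-matrix described in Section 2 can be pictured with a small Python sketch. The handout does not show what script_panmatrix.r and script_pantree.r do internally, so everything here is an assumption made for illustration: the function names and toy data are hypothetical, and the Manhattan distance between presence/absence rows is just one common way to compare genomes before building a tree. The last helper mirrors the check that the number of BLAST result files equals the square of the number of genomes.

```python
def pan_matrix(genomes):
    """genomes: dict mapping genome name -> set of gene family ids.

    Returns (families, matrix), where families is a sorted list of all
    family ids and matrix[name] is a list of 1/0 flags (present/absent)
    in the same order.
    """
    families = sorted(set().union(*genomes.values()))
    matrix = {
        name: [1 if fam in present else 0 for fam in families]
        for name, present in genomes.items()
    }
    return families, matrix


def manhattan(row_a, row_b):
    """Number of families present in one genome but absent in the other."""
    return sum(abs(a - b) for a, b in zip(row_a, row_b))


def expected_blast_results(n_genomes):
    """All-against-all BLAST yields one result file per ordered pair."""
    return n_genomes ** 2


# Hypothetical toy input.
toy_genomes = {
    "strainA": {"f1", "f2", "f3"},
    "strainB": {"f1", "f2", "f4"},
    "strainC": {"f1", "f4", "f5"},
}

families, matrix = pan_matrix(toy_genomes)
print(families)
for name, row in matrix.items():
    print(name, row)
print(manhattan(matrix["strainA"], matrix["strainC"]))
```

A pairwise distance table computed this way could then be passed to any hierarchical clustering routine, which is why a pan-genome tree reflects shared gene content rather than sequence divergence in a single marker such as 16S rRNA.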

COMPARATIVE MICROBIAL GENOMICS ANALYSIS WORKSHOP. Exercise 2: Predicting Protein-encoding Genes, BlastMatrix, BlastAtlas

COMPARATIVE MICROBIAL GENOMICS ANALYSIS WORKSHOP. Exercise 2: Predicting Protein-encoding Genes, BlastMatrix, BlastAtlas COMPARATIVE MICROBIAL GENOMICS ANALYSIS WORKSHOP Exercise 2: Predicting Protein-encoding Genes, BlastMatrix, BlastAtlas First of all connect once again to the CBS system: Open ssh shell client. Press Quick

More information

Proteome Comparison: A fine-grained tool for comparative genomics

Proteome Comparison: A fine-grained tool for comparative genomics Proteome Comparison: A fine-grained tool for comparative genomics In addition to the Protein Family Sorter that allows researchers to examine up to the protein families from up to 500 genomes at a time,

More information

Contents. Note: pay attention to where you are. Note: Plaintext version. Note: pay attention to where you are... 1 Note: Plaintext version...

Contents. Note: pay attention to where you are. Note: Plaintext version. Note: pay attention to where you are... 1 Note: Plaintext version... Contents Note: pay attention to where you are........................................... 1 Note: Plaintext version................................................... 1 Hello World of the Bash shell 2 Accessing

More information

When talking about how to launch commands and other things that is to be typed into the terminal, the following syntax is used:

When talking about how to launch commands and other things that is to be typed into the terminal, the following syntax is used: Linux Tutorial How to read the examples When talking about how to launch commands and other things that is to be typed into the terminal, the following syntax is used: $ application file.txt

More information

UoW HPC Quick Start. Information Technology Services University of Wollongong. ( Last updated on October 10, 2011)

UoW HPC Quick Start. Information Technology Services University of Wollongong. ( Last updated on October 10, 2011) UoW HPC Quick Start Information Technology Services University of Wollongong ( Last updated on October 10, 2011) 1 Contents 1 Logging into the HPC Cluster 3 1.1 From within the UoW campus.......................

More information

Links, basic file manipulation, environmental variables, executing programs out of $PATH

Links, basic file manipulation, environmental variables, executing programs out of $PATH Links, basic file manipulation, environmental variables, executing programs out of $PATH Laboratory of Genomics & Bioinformatics in Parasitology Department of Parasitology, ICB, USP The $PATH PATH (which

More information

Module 1. - System set-up and data-set construction. Center for Biological Sequence Analysis. Tammi Vesth, PhD student

Module 1. - System set-up and data-set construction. Center for Biological Sequence Analysis. Tammi Vesth, PhD student Module 1 - System set-up and data-set construction Tammi Vesth, PhD student E-mail address: tammi@cbs.dtu.dk Building/Room: 208/061 Center for Biological Sequence Analysis Department of Systems Biology,

More information

Introduction to UNIX. Logging in. Basic System Architecture 10/7/10. most systems have graphical login on Linux machines

Introduction to UNIX. Logging in. Basic System Architecture 10/7/10. most systems have graphical login on Linux machines Introduction to UNIX Logging in Basic system architecture Getting help Intro to shell (tcsh) Basic UNIX File Maintenance Intro to emacs I/O Redirection Shell scripts Logging in most systems have graphical

More information

Practical Linux examples: Exercises

Practical Linux examples: Exercises Practical Linux examples: Exercises 1. Login (ssh) to the machine that you are assigned for this workshop (assigned machines: https://cbsu.tc.cornell.edu/ww/machines.aspx?i=87 ). Prepare working directory,

More information

When you first log in, you will be placed in your home directory. To see what this directory is named, type:

When you first log in, you will be placed in your home directory. To see what this directory is named, type: Chem 7520 Unix Crash Course Throughout this page, the command prompt will be signified by > at the beginning of a line (you do not type this symbol, just everything after it). Navigation When you first

More information

Unix basics exercise MBV-INFX410

Unix basics exercise MBV-INFX410 Unix basics exercise MBV-INFX410 In order to start this exercise, you need to be logged in on a UNIX computer with a terminal window open on your computer. It is best if you are logged in on freebee.abel.uio.no.

More information

The Shell. EOAS Software Carpentry Workshop. September 20th, 2016

The Shell. EOAS Software Carpentry Workshop. September 20th, 2016 The Shell EOAS Software Carpentry Workshop September 20th, 2016 Getting Started You need to download some files to follow this lesson. These files are found on the shell lesson website (see etherpad) 1.

More information

Introduction to Unix - Lab Exercise 0

Introduction to Unix - Lab Exercise 0 Introduction to Unix - Lab Exercise 0 Along with this document you should also receive a printout entitled First Year Survival Guide which is a (very) basic introduction to Unix and your life in the CSE

More information

A Hands-On Tutorial: RNA Sequencing Using High-Performance Computing

A Hands-On Tutorial: RNA Sequencing Using High-Performance Computing A Hands-On Tutorial: RNA Sequencing Using Computing February 11th and 12th, 2016 1st session (Thursday) Preliminaries: Linux, HPC, command line interface Using HPC: modules, queuing system Presented by:

More information

GPI, Exercise #1. Part 1

GPI, Exercise #1. Part 1 GPI, Exercise #1 In this exercise you will gain some experience with GPI data and the basic reduction steps. Start by reading the three papers related to GPI s commissioning, first- light and observations

More information

An Introduction to Cluster Computing Using Newton

An Introduction to Cluster Computing Using Newton An Introduction to Cluster Computing Using Newton Jason Harris and Dylan Storey March 25th, 2014 Jason Harris and Dylan Storey Introduction to Cluster Computing March 25th, 2014 1 / 26 Workshop design.

More information

Lab 4: Bash Scripting

Lab 4: Bash Scripting Lab 4: Bash Scripting February 20, 2018 Introduction This lab will give you some experience writing bash scripts. You will need to sign in to https://git-classes. mst.edu and git clone the repository for

More information

Phylogeny Yun Gyeong, Lee ( )

Phylogeny Yun Gyeong, Lee ( ) SpiltsTree Instruction Phylogeny Yun Gyeong, Lee ( ylee307@mail.gatech.edu ) 1. Go to cygwin-x (if you don t have cygwin-x, you can either download it or use X-11 with brand new Mac in 306.) 2. Log in

More information

Unix/Linux Basics. Cpt S 223, Fall 2007 Copyright: Washington State University

Unix/Linux Basics. Cpt S 223, Fall 2007 Copyright: Washington State University Unix/Linux Basics 1 Some basics to remember Everything is case sensitive Eg., you can have two different files of the same name but different case in the same folder Console-driven (same as terminal )

More information

STARTING THE DDT DEBUGGER ON MIO, AUN, & MC2. (Mouse over to the left to see thumbnails of all of the slides)

STARTING THE DDT DEBUGGER ON MIO, AUN, & MC2. (Mouse over to the left to see thumbnails of all of the slides) STARTING THE DDT DEBUGGER ON MIO, AUN, & MC2 (Mouse over to the left to see thumbnails of all of the slides) ALLINEA DDT Allinea DDT is a powerful, easy-to-use graphical debugger capable of debugging a

More information

Practical Unix exercise MBV INFX410

Practical Unix exercise MBV INFX410 Practical Unix exercise MBV INFX410 We will in this exercise work with a practical task that, it turns out, can easily be solved by using basic Unix. Let us pretend that an engineer in your group has spent

More information

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK Cloud Computing and Unix: An Introduction Dr. Sophie Shaw University of Aberdeen, UK s.shaw@abdn.ac.uk Aberdeen London Exeter What We re Going To Do Why Unix? Cloud Computing Connecting to AWS Introduction

More information

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK

Cloud Computing and Unix: An Introduction. Dr. Sophie Shaw University of Aberdeen, UK Cloud Computing and Unix: An Introduction Dr. Sophie Shaw University of Aberdeen, UK s.shaw@abdn.ac.uk Aberdeen London Exeter What We re Going To Do Why Unix? Cloud Computing Connecting to AWS Introduction

More information

Intro to Linux. this will open up a new terminal window for you is super convenient on the computers in the lab

Intro to Linux. this will open up a new terminal window for you is super convenient on the computers in the lab Basic Terminal Intro to Linux ssh short for s ecure sh ell usage: ssh [host]@[computer].[otheripstuff] for lab computers: ssh [CSID]@[comp].cs.utexas.edu can get a list of active computers from the UTCS

More information

RUNNING MOLECULAR DYNAMICS SIMULATIONS WITH CHARMM: A BRIEF TUTORIAL

RUNNING MOLECULAR DYNAMICS SIMULATIONS WITH CHARMM: A BRIEF TUTORIAL RUNNING MOLECULAR DYNAMICS SIMULATIONS WITH CHARMM: A BRIEF TUTORIAL While you can probably write a reasonable program that carries out molecular dynamics (MD) simulations, it s sometimes more efficient

More information

Anthill User Group Meeting, 2015

Anthill User Group Meeting, 2015 Agenda Anthill User Group Meeting, 2015 1. Introduction to the machines and the networks 2. Accessing the machines 3. Command line introduction 4. Setting up your environment to see the queues 5. The different

More information

RDAV Tutorial: Hands-on with VisIt on Nautilus If you want to work hands-on, you will need to install VisIt and

RDAV Tutorial: Hands-on with VisIt on Nautilus  If you want to work hands-on, you will need to install VisIt and RDAV Tutorial: Hands-on with VisIt on Nautilus http://rdav.nics.tennessee.edu/ If you want to work hands-on, you will need to install VisIt and register a password token. The data that we are using today

More information

Introduction to Linux for BlueBEAR. January

Introduction to Linux for BlueBEAR. January Introduction to Linux for BlueBEAR January 2019 http://intranet.birmingham.ac.uk/bear Overview Understanding of the BlueBEAR workflow Logging in to BlueBEAR Introduction to basic Linux commands Basic file

More information

Using LINUX a BCMB/CHEM 8190 Tutorial Updated (1/17/12)

Using LINUX a BCMB/CHEM 8190 Tutorial Updated (1/17/12) Using LINUX a BCMB/CHEM 8190 Tutorial Updated (1/17/12) Objective: Learn some basic aspects of the UNIX operating system and how to use it. What is UNIX? UNIX is the operating system used by most computers

More information

Bioinformatics? Reads, assembly, annotation, comparative genomics and a bit of phylogeny.

Bioinformatics? Reads, assembly, annotation, comparative genomics and a bit of phylogeny. Bioinformatics? Reads, assembly, annotation, comparative genomics and a bit of phylogeny stefano.gaiarsa@unimi.it Linux and the command line PART 1 Survival kit for the bash environment Purpose of the

More information

Introduction to UNIX Command Line

Introduction to UNIX Command Line Introduction to UNIX Command Line Files and directories Some useful commands (echo, cat, grep, find, diff, tar) Redirection Pipes Variables Background processes Remote connections (e.g. ssh, curl) Scripts

More information

Introduction to Linux/Unix. Xiaoge Wang, ICER Jan. 14, 2016

Introduction to Linux/Unix. Xiaoge Wang, ICER Jan. 14, 2016 Introduction to Linux/Unix Xiaoge Wang, ICER wangx147@msu.edu Jan. 14, 2016 How does this class work We are going to cover some basics with hands on examples. Exercises are denoted by the following icon:

More information

Basic Unix and Matlab Logging in from another Unix machine, e.g. ECS lab Dells

Basic Unix and Matlab Logging in from another Unix machine, e.g. ECS lab Dells Basic Unix and Matlab 101 1 Logging in from another Unix machine, e.g. ECS lab Dells The computer we will be using for our assignments is called malkhut.engr.umbc.edu which is a Unix/Linux machine that

More information

OrthoMCL v1.4. Recall: Web Service: Datadoc v.1 1/29/ Algorithm Description (SCIENCE)

OrthoMCL v1.4. Recall: Web Service: Datadoc v.1 1/29/ Algorithm Description (SCIENCE) OrthoMCL v1.4 Datadoc v.1 1/29/2007 1. Algorithm Description (SCIENCE) Summary: OrthoMCL is a method that calculates the closest relative to a gene within another species set. For example, protein kinase

More information

Finding Selection in All the Right Places TA Notes and Key Lab 9

Finding Selection in All the Right Places TA Notes and Key Lab 9 Objectives: Finding Selection in All the Right Places TA Notes and Key Lab 9 1. Use published genome data to look for evidence of selection in individual genes. 2. Understand the need for DNA sequence

More information

STA 303 / 1002 Using SAS on CQUEST

STA 303 / 1002 Using SAS on CQUEST STA 303 / 1002 Using SAS on CQUEST A review of the nuts and bolts A.L. Gibbs January 2012 Some Basics of CQUEST If you don t already have a CQUEST account, go to www.cquest.utoronto.ca and request one.

More information

Advanced Linux Commands & Shell Scripting

Advanced Linux Commands & Shell Scripting Advanced Linux Commands & Shell Scripting Advanced Genomics & Bioinformatics Workshop James Oguya Nairobi, Kenya August, 2016 Man pages Most Linux commands are shipped with their reference manuals To view

More information

Distributed Memory Programming With MPI Computer Lab Exercises

Distributed Memory Programming With MPI Computer Lab Exercises Distributed Memory Programming With MPI Computer Lab Exercises Advanced Computational Science II John Burkardt Department of Scientific Computing Florida State University http://people.sc.fsu.edu/ jburkardt/classes/acs2

More information

Unix Essentials. BaRC Hot Topics Bioinformatics and Research Computing Whitehead Institute October 12 th

Unix Essentials. BaRC Hot Topics Bioinformatics and Research Computing Whitehead Institute October 12 th Unix Essentials BaRC Hot Topics Bioinformatics and Research Computing Whitehead Institute October 12 th 2016 http://barc.wi.mit.edu/hot_topics/ 1 Outline Unix overview Logging in to tak Directory structure

More information

Exercise 1. RNA-seq alignment and quantification. Part 1. Prepare the working directory. Part 2. Examine qualities of the RNA-seq data files

Exercise 1. RNA-seq alignment and quantification. Part 1. Prepare the working directory. Part 2. Examine qualities of the RNA-seq data files Exercise 1. RNA-seq alignment and quantification Part 1. Prepare the working directory. 1. Connect to your assigned computer. If you do not know how, follow the instruction at http://cbsu.tc.cornell.edu/lab/doc/remote_access.pdf

More information

Introduction to HPC Resources and Linux

Introduction to HPC Resources and Linux Introduction to HPC Resources and Linux Burak Himmetoglu Enterprise Technology Services & Center for Scientific Computing e-mail: bhimmetoglu@ucsb.edu Paul Weakliem California Nanosystems Institute & Center

More information

I. Creating a group with a private genome- Global Search

I. Creating a group with a private genome- Global Search Protein Family Analysis: Creating genome groups I. Creating a group with a private genomes- Global Search II. Creating a genome group with private genomes- Taxon landing page I. Creating a group with a

More information

Computer Systems and Architecture

Computer Systems and Architecture Computer Systems and Architecture Introduction to UNIX Stephen Pauwels University of Antwerp October 2, 2015 Outline What is Unix? Getting started Streams Exercises UNIX Operating system Servers, desktops,

More information

CS 3410 Intro to Unix, shell commands, etc... (slides from Hussam Abu-Libdeh and David Slater)

CS 3410 Intro to Unix, shell commands, etc... (slides from Hussam Abu-Libdeh and David Slater) CS 3410 Intro to Unix, shell commands, etc... (slides from Hussam Abu-Libdeh and David Slater) 28 January 2013 Jason Yosinski Original slides available under Creative Commons Attribution-ShareAlike 3.0

More information

A Guide to Condor. Joe Antognini. October 25, Condor is on Our Network What is an Our Network?

A Guide to Condor. Joe Antognini. October 25, Condor is on Our Network What is an Our Network? A Guide to Condor Joe Antognini October 25, 2013 1 Condor is on Our Network What is an Our Network? The computers in the OSU astronomy department are all networked together. In fact, they re networked

More information

Part I. Introduction to Linux

Part I. Introduction to Linux Part I Introduction to Linux 7 Chapter 1 Linux operating system Goal-of-the-Day Familiarisation with basic Linux commands and creation of data plots. 1.1 What is Linux? All astronomical data processing

More information

Using Linux as a Virtual Machine

Using Linux as a Virtual Machine Intro to UNIX Using Linux as a Virtual Machine We will use the VMware Player to run a Virtual Machine which is a way of having more than one Operating System (OS) running at once. Your Virtual OS (Linux)

More information

Recap From Last Time:

Recap From Last Time: Recap From Last Time: BGGN 213 Working with UNIX Barry Grant http://thegrantlab.org/bggn213 Motivation: Why we use UNIX for bioinformatics. Modularity, Programmability, Infrastructure, Reliability and

More information

Lab 4: Shell Scripting

Lab 4: Shell Scripting Lab 4: Shell Scripting Nathan Jarus June 12, 2017 Introduction This lab will give you some experience writing shell scripts. You will need to sign in to https://git.mst.edu and git clone the repository

More information

BGGN 213 Working with UNIX Barry Grant

BGGN 213 Working with UNIX Barry Grant BGGN 213 Working with UNIX Barry Grant http://thegrantlab.org/bggn213 Recap From Last Time: Motivation: Why we use UNIX for bioinformatics. Modularity, Programmability, Infrastructure, Reliability and

More information

AN INTRODUCTION TO UNIX

AN INTRODUCTION TO UNIX AN INTRODUCTION TO UNIX Paul Johnson School of Mathematics September 18, 2011 OUTLINE 1 INTRODUTION Unix Common Tasks 2 THE UNIX FILESYSTEM Moving around Copying, deleting File Permissions 3 SUMMARY OUTLINE

More information

Data Walkthrough: Background

Data Walkthrough: Background Data Walkthrough: Background File Types FASTA Files FASTA files are text-based representations of genetic information. They can contain nucleotide or amino acid sequences. For this activity, students will

More information

ENCM 339 Fall 2017: Editing and Running Programs in the Lab

ENCM 339 Fall 2017: Editing and Running Programs in the Lab page 1 of 8 ENCM 339 Fall 2017: Editing and Running Programs in the Lab Steve Norman Department of Electrical & Computer Engineering University of Calgary September 2017 Introduction This document is a

More information

Please include the following sentence in any works using center resources.

Please include the following sentence in any works using center resources. The TCU High-Performance Computing Center The TCU HPCC currently maintains a cluster environment hpcl1.chm.tcu.edu. Work on a second cluster environment is underway. This document details using hpcl1.

More information

Introduction to remote command line Linux. Research Computing Team University of Birmingham

Introduction to remote command line Linux. Research Computing Team University of Birmingham Introduction to remote command line Linux Research Computing Team University of Birmingham Linux/UNIX/BSD/OSX/what? v All different v UNIX is the oldest, mostly now commercial only in large environments

More information

Using CSC Environment Efficiently,

Using CSC Environment Efficiently, Using CSC Environment Efficiently, 13.2.2017 1 Exercises a) Log in to Taito either with your training or CSC user account, either from a terminal (with X11 forwarding) or using NX client b) Go to working

More information

Bioinformatics. Computational Methods I: Genomic Resources and Unix. George Bell WIBR Biocomputing Group

Bioinformatics. Computational Methods I: Genomic Resources and Unix. George Bell WIBR Biocomputing Group Bioinformatics Computational Methods I: Genomic Resources and Unix George Bell WIBR Biocomputing Group Human genome databases Human Genome Sequencing Consortium Major annotators: NCBI Ensembl (EMBL-EBI

More information

CLC Server. End User USER MANUAL

CLC Server. End User USER MANUAL CLC Server End User USER MANUAL Manual for CLC Server 10.0.1 Windows, macos and Linux March 8, 2018 This software is for research purposes only. QIAGEN Aarhus Silkeborgvej 2 Prismet DK-8000 Aarhus C Denmark

More information

ECE 331: Electronics Principles I Fall 2014

ECE 331: Electronics Principles I Fall 2014 ECE 331: Electronics Principles I Fall 2014 Lab #0: Introduction to Computer Modeling and Laboratory Measurements Report due at your registered lab period on the week of Sept. 8-12 Week 1 Accessing Linux

More information

Linux Operating System Environment Computadors Grau en Ciència i Enginyeria de Dades Q2

Linux Operating System Environment Computadors Grau en Ciència i Enginyeria de Dades Q2 Linux Operating System Environment Computadors Grau en Ciència i Enginyeria de Dades 2017-2018 Q2 Facultat d Informàtica de Barcelona This first lab session is focused on getting experience in working

More information

Annotating a Genome in PATRIC

Annotating a Genome in PATRIC Annotating a Genome in PATRIC The following step-by-step workflow is intended to help you learn how to navigate the new PATRIC workspace environment in order to annotate and browse your genome on the PATRIC

More information

MetaPhyler Usage Manual

MetaPhyler Usage Manual MetaPhyler Usage Manual Bo Liu boliu@umiacs.umd.edu March 13, 2012 Contents 1 What is MetaPhyler 1 2 Installation 1 3 Quick Start 2 3.1 Taxonomic profiling for metagenomic sequences.............. 2 3.2

More information
