Anthill User Group Meeting, 2015
- Angel Ramsey
Transcription
Agenda: Anthill User Group Meeting

1. Introduction to the machines and the networks
2. Accessing the machines
3. Command line introduction
4. Setting up your environment to see the queues
5. The different queues on the system
6. Queues and jobs
7. Policies to ensure equitable availability of computes to all
8. Submitting simple jobs to the queue
9. Submitting array jobs to the queue
10. Local data vs. scratch data
11. Blast, lastal, rapsearch, diamond and similar programs
12. Assembly
13. prinseq
14. focus
15. Moving data on and off the cluster
16. Backing up data

Introduction to the machines and the networks

There are only a few machines that you need to worry about: anthill.sdsu.edu, rambox, edwards-data.sdsu.edu and possibly rohan.sdsu.edu. I'll explain how everything fits together and when to use each. For most things, you will just want to use anthill.sdsu.edu for all your work.

Accessing the machines

SSH
    Mac OS X / Linux: use Terminal
    Windows: use PuTTY

Accessing from on campus:

    ssh username@anthill.sdsu.edu

or, in PuTTY, fill in your username and anthill.sdsu.edu.
Accessing from off campus, use edwards-data.sdsu.edu or rohan.sdsu.edu:

    ssh -p 7010 username@edwards-data.sdsu.edu

Note: you have to use port 7010 to access edwards-data from off campus.

For the more advanced people (don't worry about this if it is too confusing): use screen to run virtual terminals, so that you can start a command, go away, and come back later to find it still running. Use screen -DR to reconnect to an existing screen. The main screen commands you need to know are:

    ctrl-a ctrl-c    create a new screen
    ctrl-a ctrl-n    move to the next screen
    ctrl-a ctrl-p    move to the previous screen

Basic Linux Commands

All our servers run Linux (almost all run CentOS version 6), so you have to get used to moving around in Linux. We have a couple of cheat sheets to share with you. Here are a few resources that are worth working through:

    Learn UNIX in 10 minutes
    UNIX Tutorials for beginners
    Linux Commands

We will work through some of these commands in the workshop.
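The handout does not list the commands covered in the workshop. As an assumption, a first practice session might look something like the sketch below; the directory and file names are made up for illustration, and everything happens in a temporary directory so nothing in your home directory is touched.

```shell
# Hypothetical practice session: names like "practice" and "notes.txt" are made up.
workdir="$(mktemp -d)"              # temporary area to play in
cd "$workdir" || exit 1

pwd                                 # print the directory you are in
mkdir practice                      # make a new directory
cd practice || exit 1               # move into it
echo "hello anthill" > notes.txt    # create a small text file
ls -l                               # list files with sizes and dates
cat notes.txt                       # print the contents of the file
cp notes.txt copy.txt               # copy a file
mv copy.txt renamed.txt             # rename (move) a file
rm renamed.txt                      # delete a file
```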
A simple text editor available on almost every system is nano.

Setting up your environment to see the queues

We use queues to run work on the cluster, and you need to set up your environment so that you can see the queue. The main command to see what is running on the queue is

    qstat

or, for a more detailed view, you can use

    qstat -f

The different queues on the system

Anthill has three queues that you can use:

default
    The default queue has 35 machines, each with 16 processors and 128 GB RAM. Each processor runs an independent job, so you can run 560 jobs simultaneously on these machines. This queue is in eternally friendly mode, and all jobs are run on a first-in, first-out basis.

important
    The important queue has 4 machines, each with 16 processors and 128 GB RAM. This queue is for single jobs only. Do not run array jobs on this queue or they will be terminated! The queue is for testing and running individual programs.

smallmem
    This queue has 9 machines, each with 8 processors (72 computes total). Each machine has 14 GB RAM, except node1, which has 24 GB RAM. People often forget about this queue, so sometimes it is worth checking!
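The job counts quoted for each queue follow from machines times processors per machine. A small sketch of that arithmetic, using the machine counts given above (queue_slots is a hypothetical helper, not a command on the cluster; the figure for the important queue is derived here rather than stated in the handout):

```shell
# Job slots in a queue = machines x processors per machine.
# queue_slots is a made-up helper for illustration only.
queue_slots() {
    local machines="$1" procs="$2"
    echo $(( machines * procs ))
}

echo "default:   $(queue_slots 35 16) slots"   # 35 machines x 16 processors
echo "important: $(queue_slots 4 16) slots"    # 4 machines x 16 processors
echo "smallmem:  $(queue_slots 9 8) slots"     # 9 machines x 8 processors
```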
Policies to ensure equitable availability of computes to all

Be mellow. Be nice to others. Listen to your mother.

Submitting simple jobs to the queue

You need a simple shell script (which can be very easy), and then you use this command to submit the job:

    qsub -cwd ./scriptname.sh

You can also add other options. Common options include:

    -q    use a different queue (e.g. -q important or -q smallmem)
    -o    where to put the output file
    -e    where to put the error file

Examples:

Submit a job to the cluster using the default queue:

    qsub -cwd ./scriptname.sh

Submit a job to the cluster using the important queue:

    qsub -cwd -q important ./scriptname.sh

Submitting array jobs to the queue

To submit an array job we add the flag -t to our qsub command:

    -t 1-100:1    submit an array job. This will submit jobs numbered from 1 to 100 in increments of 1

With an array job, scriptname.sh gets passed a special variable called $SGE_TASK_ID that is the number of the job it is running. I will provide you with some template code to process every file in a directory.

Submit an array job to the cluster, and redirect output and error files to a directory:

    mkdir sge
    qsub -cwd -e sge -o sge -t 1-540:1 ./scriptname.sh
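The template code mentioned above is not included in this handout. As a rough sketch of the idea, and not the author's actual template, an array-job script can use $SGE_TASK_ID to pick the Nth file in a directory. The names INPUT_DIR, OUTPUT_DIR and nth_file, and the wc -l stand-in command, are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical array-job template: each task handles one file from a directory.
# INPUT_DIR, OUTPUT_DIR and the "wc -l" step are placeholders, not the author's template.

# Print the path of the Nth file (1-based, sorted by name) in a directory.
# Returns non-zero if there is no such file.
nth_file() {
    local dir="$1" n="$2"
    local files=("$dir"/*)              # glob results come back sorted by name
    local f="${files[$((n - 1))]}"
    [ -e "$f" ] && printf '%s\n' "$f"
}

INPUT_DIR="${INPUT_DIR:-fasta}"
OUTPUT_DIR="${OUTPUT_DIR:-results}"

# SGE sets SGE_TASK_ID for each task of an array job.
# Fall back to 1 when it is missing or not a number, so the script can be tried by hand.
TASK="${SGE_TASK_ID:-1}"
case "$TASK" in
    ''|*[!0-9]*) TASK=1 ;;
esac

if [ -d "$INPUT_DIR" ] && FILE="$(nth_file "$INPUT_DIR" "$TASK")"; then
    mkdir -p "$OUTPUT_DIR"
    # Replace this line with the real command (blast, lastal, rapsearch, ...)
    wc -l < "$FILE" > "$OUTPUT_DIR/$(basename "$FILE").count"
fi
```

Saved as, say, s_count.sh, with 540 files in the input directory, a script like this would be submitted with a command of the form qsub -cwd -e sge -o sge -t 1-540:1 ./s_count.sh, so that task 1 handles the first file, task 2 the second, and so on.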
Naming files

SGE submission scripts:
    Please don't use run.sh or job.sh
    Limit file names to 10 meaningful characters
    Include some notion of the command (rapsearch, blast, etc.)
    Potentially start everything with s_

Fasta files:
    Use .fna for nucleotide
    Use .faa for protein

Output files:
    Append .blastp, .blastn, .rapsearch2, .lastal etc. so that you remember what you did

Remembering what you did

Make a file called how-to.txt and copy and paste your commands into it. Annotate the file so you remember what the commands are doing. If you forget to do that, you can make one like this:

    history > how-to.txt

and then edit the how-to.txt file:

    nano how-to.txt

NFS data vs. scratch data

There are two ways of housing your data. Your home directory is on a different file server machine, so anything in there needs to be transported to the machine where the computations are done, and then the results need to be moved back again. As an alternative, each of our computers has a local hard drive with space in /scratch. This is common space that anyone can use, and thus has the problem that it fills up.

Opinions vary on the importance of using your home directory versus /scratch. I keep everything on the file server and compute off of that. Kate and Rob S. move their data to the /scratch space and compute off of that. Certainly if you have something large that you routinely compute against (e.g. the nr database) it would be good to house it on /scratch. Also, if you have a lot of jobs running that require network IO, occasionally it is better to use /scratch to avoid network issues. However, if you don't know what you are doing, don't worry about it. The time saving will be very small (I think insignificant in the overall scheme of things), so don't waste your time!

Blast, lastal, rapsearch, diamond and similar programs

It's time to move away from BLAST; there are some really good alternatives that we have installed on the cluster.

    For DNA-DNA (instead of blastn) comparisons we (currently) recommend:
    For protein-protein (instead of blastp) comparisons we (currently) recommend: LASTAL
    For protein-DNA (instead of blastx) comparisons we (currently) recommend: rapsearch2

We will work through examples of all of these.

BLAST

Even though it is time to move away from it, I suppose people still want to use it. You will need to format your database using the makeblastdb command (unless you use one that is already formatted). The commands that you want to run are: /usr/local/blast+/bin/blastn, /usr/local/blast+/bin/blastp, /usr/local/blast+/bin/blastx, etc.

We have a trivial blast solution for you: a script that takes your fasta file, splits it up into a series of smaller files, and then runs your specified blast program against each of those files.

    split_blast_queries_blastplus

If you run the command without any options you get this help output:
/home3/redwards/bioinformatics/cluster/split_lastal_queries.pl <options>
    -f      file to split
    -n      number to break into
    -d      destination directory (default = ".")
    -p      matrix: BL62 (for protein/protein or BL80 for DNA/protein)
    -db     lastal database
    -ex     lastal executable location (default is /usr/local/last/bin/)
    -N      job name (default is s_lastal)
    -rev    reverse the order of files that are submitted to the queue. (i.e. so you can run twice and start from the end backwards!)
    -v      verbose
    Other things will be used as lastal options. Unless -db is provided we will just split and stop

Basically, the main options you need are -f for the fasta file, -n for the number to break it into, -d for a directory to put the results, -p for the blast program to run, and -db for the database to compare to. If you want to add options to blast+ you can add them at the end of the command. For example:

    split_blast_queries_blastplus -f Nudibranch_S7_L001_R1_001.fasta -n 200 -d nudi -p blastx -db /home/db/blast/nr/nr -evalue 1e-5

This creates a directory called nudi and outputs files into there, including the blast output files. Since this is blastx, the blast output files all end with blastx. (That will change if you use blastn or blastp, of course.)

Once the blast is complete you can concatenate all the blast output files using the cat command:

    cat nudi/*.blastx > nudibranch.blastx

Now I have a single file called nudibranch.blastx that has all the blast output.

LASTAL

To use LASTAL you first need to format the database. If you have a protein database you need to specify that the input file is protein with "-p". Then specify the new database name and the location of the fasta-formatted input file.

    lastdb -p nr nr.faa
This will create several files, and also possibly break up the database into ~20 GB blocks.

To run LASTAL you need the location of the database, followed by the QUERYFILE, followed by which score matrix to use (I originally was using BL62, but the developer recommended BL80 for short sequences), followed by the output format (0 for tabular).

    lastal nr QUERYFILE.faa -p BL80 -f 0

A sample SGE script looks like:

    #!/bin/bash
    lastal /usr/data/kate/nr/nr QUERYFILE.faa -p BL80 -f 0

We also have a trivial lastal command that is based on the blast command above:

    split_lastal_queries

which has a similar help profile:

/home3/redwards/bin/split_lastal_queries <options>
    -f      file to split
    -n      number to break into
    -d      destination directory (default = ".")
    -p      matrix: BL62 (for protein/protein or BL80 for DNA/protein)
    -db     lastal database
    -ex     lastal executable location (default is /usr/local/last/bin/)
    -N      job name (default is s_lastal)
    -rev    reverse the order of files that are submitted to the queue. (i.e. so you can run twice and start from the end backwards!)
    -v      verbose
    Other things will be used as lastal options. Unless -db is provided we will just split and stop
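Both split scripts start by cutting the query fasta file into a number of smaller files (the -n option). A minimal sketch of that first step is below. It assumes a round-robin split, and the real scripts may divide the file differently; split_fasta and the part_N.fna file names are hypothetical.

```shell
# Hypothetical helper: deal the sequences of a fasta file into N smaller files.
# Usage: split_fasta input.fna N outdir
split_fasta() {
    local fasta="$1" n="$2" outdir="$3"
    mkdir -p "$outdir"
    awk -v n="$n" -v dir="$outdir" '
        # A header line starts a new sequence: pick the next output file in turn
        /^>/ { file = sprintf("%s/part_%d.fna", dir, (count++ % n) + 1) }
        # Write every line (header or sequence) to the current output file
        { print > file }
    ' "$fasta"
}
```

For example, split_fasta reads.fna 200 pieces would write pieces/part_1.fna through pieces/part_200.fna (assuming at least 200 sequences), and each piece could then be searched by its own job on the queue.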
You use split_lastal_queries in the same way, although the options are slightly different (-p for the pairwise matrix):

    split_lastal_queries -f Nudibranch_S7_L001_R1_001.fasta -n 200 -d nudi -p BL80 -db /home/db/lastal/nr-lastal/nr

Again, this will result in a directory of output files, and you can concatenate them as before.

RAPSearch2, Reduced Alphabet based Protein similarity Search

RAPSearch is about 100 times faster than BLAST and in single-thread mode requires up to 2 GB of memory. It's a great replacement for blastx, but it is slightly less sensitive than blastx (especially in the fast mode), so you may miss some rare matches.

There are two steps to running rapsearch2:

Formatting the database

    prerapsearch -d Fasta_File -n DatabaseName

Running RAPSearch2

    rapsearch -a 1 -q FASTA/FASTQ -d DatabaseName -o OUTPUT -v NumberDbSequences -z NumberThreads -e evalue -b 0 -s f

    -a 1    sets the program to its fast mode
    -a 0    runs the program in its sensitive mode
    -b 0    sets the program to not write any sequence alignments
    -s f    sets the program to use E value in the same format as BLAST

Sequence assembly

We currently recommend (and use) the St. Petersburg assembler, SPAdes. This runs fine on the cluster for most sequences, including metagenomes. If you run out of memory we can run it on rambox; contact Rob and I'll work with you on that.

To run this on the cluster, I put this in my script file and then submit it to the queue:

    /home3/redwards/bin/spades/spades linux/bin/spades.py -o spades.assembly --careful --dataset files.yaml
and files.yaml contains:

    [
      {
        orientation: "fr",
        type: "paired-end",
        right reads: [
          "/home3/redwards/johnkirby/rob/1291/fastq/sample4-1291wt_s4_l001_r1_001.fastq",
          "/home3/redwards/johnkirby/rob/1291/fastq/1_1291wt_4_cttgta_l001_r1_001.fastq",
          "/home3/redwards/johnkirby/rob/1291/fastq/2_1291_0177_4_cgatgt_l001_r1_001.fastq"
        ],
        left reads: [
          "/home3/redwards/johnkirby/rob/1291/fastq/sample4-1291wt_s4_l001_r2_001.fastq",
          "/home3/redwards/johnkirby/rob/1291/fastq/1_1291wt_4_cttgta_l001_r2_001.fastq",
          "/home3/redwards/johnkirby/rob/1291/fastq/2_1291_0177_4_cgatgt_l001_r2_001.fastq"
        ]
      }
    ]
Prinseq

You can run prinseq-lite.pl on the cluster. For example, to generate the report, you need to put this in your script file:

    perl prinseq-lite.pl -verbose -fastq test.fq -graph_data test.gd -out_good null -out_bad null

You can then upload the test.gd file to the website to see the report.

focus

Geni will show you!

Moving data on and off the cluster

Use scp from the command line. For Windows try WinSCP or SSH Secure. For Mac try CyberDuck or Rbrowser.

Backing Up Data

Your data is NOT backed up. It is your responsibility to back it up to an external hard drive or another source. DO NOT RELY ON US TO PRESERVE YOUR DATA!!!

Parting thoughts!

Be mellow, be nice to others, everyone uses the resources.
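For the scp step in "Moving data on and off the cluster" and for making your own backups, the sketch below shows the usual shape of the commands when run from your own computer. The user name, file names and backup path are placeholders, and the commands are echoed rather than run so that the example is safe to paste; remove the leading "echo" to run them for real.

```shell
# Hypothetical examples: "username", the file names and the backup path are placeholders.
# Commands are echoed rather than run; remove "echo" to run them for real.
REMOTE="username@anthill.sdsu.edu"

# Copy a file from your computer to your home directory on the cluster
echo scp reads.fna "$REMOTE:"

# Copy a results file from the cluster back to the current directory
echo scp "$REMOTE:nudibranch.blastx" .

# Copy a whole directory from the cluster to an external drive (-r = recursive)
echo scp -r "$REMOTE:nudi" /path/to/external/drive/
```

Note that scp takes the port as -P (capital), unlike ssh, which uses -p.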
BLAST Jon-Michael Deldin Dept. of Computer Science University of Montana jon-michael.deldin@mso.umt.edu 2011-09-19 Mon Jon-Michael Deldin (UM) BLAST 2011-09-19 Mon 1 / 23 Outline 1 Goals 2 Setting up your
More informationMERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced
MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced Sarvani Chadalapaka HPC Administrator University of California
More informationIntroduction to Discovery.
Introduction to Discovery http://discovery.dartmouth.edu The Discovery Cluster 2 Agenda What is a cluster and why use it Overview of computer hardware in cluster Help Available to Discovery Users Logging
More informationIntroduction to Discovery.
Introduction to Discovery http://discovery.dartmouth.edu The Discovery Cluster 2 Agenda What is a cluster and why use it Overview of computer hardware in cluster Help Available to Discovery Users Logging
More informationEssential Skills for Bioinformatics: Unix/Linux
Essential Skills for Bioinformatics: Unix/Linux SHELL SCRIPTING Overview Bash, the shell we have used interactively in this course, is a full-fledged scripting language. Unlike Python, Bash is not a general-purpose
More informationdiamond Requirements Time Torque/PBS Examples Diamond with single query (simple)
diamond Diamond is a sequence database searching program with the same function as BlastX, but 1000X faster. A whole transcriptome search of the NCBI nr database, for instance, may take weeks using BlastX,
More informationIntroduction to UNIX. SURF Research Boot Camp April Jeroen Engelberts Consultant Supercomputing
Introduction to UNIX SURF Research Boot Camp April 2018 Jeroen Engelberts jeroen.engelberts@surfsara.nl Consultant Supercomputing Outline Introduction to UNIX What is UNIX? (Short) history of UNIX Cartesius
More informationIntroduction to UNIX
PURDUE UNIVERSITY Introduction to UNIX Manual Michael Gribskov 8/21/2016 1 Contents Connecting to servers... 4 PUTTY... 4 SSH... 5 File Transfer... 5 scp secure copy... 5 sftp
More informationIntroduction to Scripting using bash
Introduction to Scripting using bash Scripting versus Programming (from COMP10120) You may be wondering what the difference is between a script and a program, or between the idea of scripting languages
More informationGenomic Files. University of Massachusetts Medical School. October, 2015
.. Genomic Files University of Massachusetts Medical School October, 2015 2 / 55. A Typical Deep-Sequencing Workflow Samples Fastq Files Fastq Files Sam / Bam Files Various files Deep Sequencing Further
More informationCommand-Line Data Analysis INX_S17, Day 10,
Command-Line Data Analysis INX_S17, Day 10, 2017-05-01 Assignment 4 (quiz). sort, head, tail Learning Outcome(s): Use `sort` to build filtering pipelines for bioinformatics data Matthew Peterson, OSU CGRB,
More informationHigh Performance Computing (HPC) Using zcluster at GACRC
High Performance Computing (HPC) Using zcluster at GACRC On-class STAT8060 Georgia Advanced Computing Resource Center University of Georgia Zhuofei Hou, HPC Trainer zhuofei@uga.edu Outline What is GACRC?
More informationCSC209. Software Tools and Systems Programming. https://mcs.utm.utoronto.ca/~209
CSC209 Software Tools and Systems Programming https://mcs.utm.utoronto.ca/~209 What is this Course About? Software Tools Using them Building them Systems Programming Quirks of C The file system System
More informationIntroduction to Linux Environment. Yun-Wen Chen
Introduction to Linux Environment Yun-Wen Chen 1 The Text (Command) Mode in Linux Environment 2 The Main Operating Systems We May Meet 1. Windows 2. Mac 3. Linux (Unix) 3 Windows Command Mode and DOS Type
More informationIntroduction to HPC Using zcluster at GACRC
Introduction to HPC Using zcluster at GACRC On-class PBIO/BINF8350 Georgia Advanced Computing Resource Center University of Georgia Zhuofei Hou, HPC Trainer zhuofei@uga.edu Outline What is GACRC? What
More informationIntroduction to HPC Using zcluster at GACRC On-Class GENE 4220
Introduction to HPC Using zcluster at GACRC On-Class GENE 4220 Georgia Advanced Computing Resource Center University of Georgia Suchitra Pakala pakala@uga.edu Slides courtesy: Zhoufei Hou 1 OVERVIEW GACRC
More informationAssessing Transcriptome Assembly
Assessing Transcriptome Assembly Matt Johnson July 9, 2015 1 Introduction Now that you have assembled a transcriptome, you are probably wondering about the sequence content. Are the sequences from the
More informationWorking with GIT. Florido Paganelli Lund University MNXB Florido Paganelli MNXB Working with git 1/47
Working with GIT MNXB01 2017 Florido Paganelli Lund University florido.paganelli@hep.lu.se Florido Paganelli MNXB01-2017 - Working with git 1/47 Required Software Git - a free and open source distributed
More informationBioinformatics Facility at the Biotechnology/Bioservices Center
Bioinformatics Facility at the Biotechnology/Bioservices Center Co-Heads : J.P. Gogarten, Paul Lewis Facility Scientist : Pascal Lapierre Hardware/Software Manager: Jeff Lary Mandate of the Facility: To
More informationVERY SHORT INTRODUCTION TO UNIX
VERY SHORT INTRODUCTION TO UNIX Tore Samuelsson, Nov 2009. An operating system (OS) is an interface between hardware and user which is responsible for the management and coordination of activities and
More informationRuby on Rails Welcome. Using the exercise files
Ruby on Rails Welcome Welcome to Ruby on Rails Essential Training. In this course, we're going to learn the popular open source web development framework. We will walk through each part of the framework,
More informationUsing the Yale HPC Clusters
Using the Yale HPC Clusters Stephen Weston Robert Bjornson Yale Center for Research Computing Yale University Oct 2015 To get help Send an email to: hpc@yale.edu Read documentation at: http://research.computing.yale.edu/hpc-support
More informationUnix basics exercise MBV-INFX410
Unix basics exercise MBV-INFX410 In order to start this exercise, you need to be logged in on a UNIX computer with a terminal window open on your computer. It is best if you are logged in on freebee.abel.uio.no.
More informationIntroduction to HPC Using zcluster at GACRC
Introduction to HPC Using zcluster at GACRC Georgia Advanced Computing Resource Center University of Georgia Zhuofei Hou, HPC Trainer zhuofei@uga.edu Outline What is GACRC? What is HPC Concept? What is
More informationIntroduction to Linux for BlueBEAR. January
Introduction to Linux for BlueBEAR January 2019 http://intranet.birmingham.ac.uk/bear Overview Understanding of the BlueBEAR workflow Logging in to BlueBEAR Introduction to basic Linux commands Basic file
More informationUsing UNIX. -rwxr--r-- 1 root sys Sep 5 14:15 good_program
Using UNIX. UNIX is mainly a command line interface. This means that you write the commands you want executed. In the beginning that will seem inferior to windows point-and-click, but in the long run the
More information