Genome Assembly. 2 Sept. Groups. Wiki. Job files Read cleaning Other cleaning Genome Assembly
- Anis Carson
- 5 years ago
Transcription
1 2 Sept
Groups
- Group 5 was down to 3 people, so I merged it into the other groups
- Group 1 is now 6 people; anyone want to change?
- The initial drafter is not the official leader; use any management structure you like
Wiki
- Use the wiki as your group notebook
- Share your job files
- Need to see your results
Job files
Read cleaning
Other cleaning
Genome Assembly
2 RCAC Job files
Why?
When you log into an RCAC server you are using a special server designed for multiple users. This is called a frontend node (or sometimes a head node). There are (I think) three frontend nodes, and they are often very busy.
Frontend node: edit files, send mail, back up data, compile programs. No computing.
The other nodes are called compute nodes. They are allocated and run by a system called PBS/Torque. The preferred way to use PBS is to submit a job file with the command qsub.
When you run a job with qsub, all of the normal output (STDOUT) and error output (STDERR) is sent to files called jobname.o<jobnumber> and jobname.e<jobnumber>, respectively. For example:
    check_clip_man.o
    check_clip_man.e
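The split between normal output and error output can be tried out without PBS. The sketch below is only an illustration: the job name demo_job, the job number 12345, and the function fake_job are all made up, and the redirections imitate by hand what PBS does for you when it creates jobname.o<jobnumber> and jobname.e<jobnumber>.

```shell
#!/bin/sh
# Imitate by hand how a job's STDOUT and STDERR end up in separate files.
# demo_job and 12345 are hypothetical; under PBS the scheduler picks the names.
jobname=demo_job
jobnumber=12345

# A stand-in for the real work: one line to STDOUT, one line to STDERR.
fake_job() {
    echo "normal output"
    echo "error output" >&2
}

# > captures STDOUT, 2> captures STDERR.
fake_job > "$jobname.o$jobnumber" 2> "$jobname.e$jobnumber"

stdout_text=$(cat "$jobname.o$jobnumber")
stderr_text=$(cat "$jobname.e$jobnumber")
echo "STDOUT file contains: $stdout_text"
echo "STDERR file contains: $stderr_text"
```

If a job appears to have done nothing, the .e file is usually the first place to look.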
3 RCAC Job files
Example: Running seqyclean (a module)

    #!/bin/sh -l
    #PBS -N seqyclean_monpu1
    #PBS -q scholar
    #PBS -l nodes=1:ppn=16
    #PBS -l walltime=168:00:00

    module load seqyclean
    cd $PBS_O_WORKDIR
    pwd
    cat seqyclean.job
    date +"%d %B %Y %H:%M:%S"
    echo " "

    seqyclean -t 16 \
        -1 ../../data/Monpu1.genome.rawReads.r1.fq \
        -2 ../../data/Monpu1.genome.rawReads.r2.fq \
        -v adapter.fa \
        -qual \
        -minimum_read_length 30 \
        -o Monpu1.genome.rawReads.seqyclean.stats \
        > seqyclean.log

    echo " "
    date +"%d %B %Y %H:%M:%S"

Notes on the job file:
- #!/bin/sh -l : the shebang tells unix this is a shell file. It could be a Perl file.
- #PBS -N : job name (seen in qstat).
- #PBS -q : queue; use scholar unless otherwise instructed.
- #PBS -l nodes=1:ppn=16 : number of nodes and CPUs (ppn) to reserve. Usually ppn will be 1 or 16 on scholar.
- #PBS -l walltime : maximum wall-clock time the job will run. The scholar queue is limited to 168 hours.
- $PBS_O_WORKDIR is a predefined symbol that means the directory from which you submitted the job with qsub.
- pwd (optional) prints the directory after the cd; useful for debugging.
- cat <filename> copies the command file to the output.
- The date and echo commands write the date, followed by a blank line, into the output.
- The backslash, \, is a line continuation character in unix. It makes it easier to write and understand very long command lines.
- The greater-than symbol, >, redirects output in unix, i.e., everything written to STDOUT is sent to the file seqyclean.log.
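The plumbing of a job file (the cd, the time stamps, the line continuation, and the redirect) can be checked before anything is submitted. The following is a minimal sketch, not the course's job file: run_step is a hypothetical placeholder for the real program, step.log and example.stats are invented names, and the fallback to the current directory is an assumption that lets the script run outside PBS, where $PBS_O_WORKDIR is not set.

```shell
#!/bin/sh
# Dry-run skeleton of a job file. PBS sets PBS_O_WORKDIR for real jobs;
# outside PBS the variable is unset, so fall back to the current directory.
workdir=${PBS_O_WORKDIR:-$PWD}
cd "$workdir" || exit 1
pwd                               # show where the job is running

date +"%d %B %Y %H:%M:%S"         # start time stamp

# Hypothetical placeholder for the real program (for example a read cleaner).
# It only prints its arguments, one per line.
run_step() {
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

# The backslash continues the command on the next line;
# > sends everything written to STDOUT into step.log.
run_step -t 16 \
    -o example.stats \
    > step.log

arg_count=$(wc -l < step.log | tr -d ' ')
echo "arguments logged: $arg_count"

date +"%d %B %Y %H:%M:%S"         # end time stamp
```

Comparing the two time stamps in the output gives a rough run time, which helps when choosing a walltime for the next job.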
4 RCAC Job files

    #PBS -N seqyclean_monpu1
    #PBS -q scholar
    #PBS -l nodes=1:ppn=16
    #PBS -l walltime=168:00:00

PBS commands can also be entered on the command line when you run qsub:

    qsub -N seqyclean_monpu1 -q scholar -l nodes=1:ppn=16 -l walltime=168:00:00

I like the PBS commands in the job file:
- so I have a record
- it's easy to make a mistake
- can save a lot of work by copying from old jobs
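The correspondence between #PBS lines and qsub options can be made concrete with a few lines of shell. This is a sketch under stated assumptions: demo.job is a throwaway file written by the script itself, and the qsub command is only printed, never run, so no scheduler is needed.

```shell
#!/bin/sh
# Turn the #PBS lines of a job file into the equivalent qsub options.
# demo.job is a throwaway file written here so the sketch is self-contained.
jobfile=demo.job
cat > "$jobfile" <<'EOF'
#!/bin/sh -l
#PBS -N seqyclean_monpu1
#PBS -q scholar
#PBS -l nodes=1:ppn=16
#PBS -l walltime=168:00:00
echo "the job itself would go here"
EOF

# Keep only lines that start with "#PBS ", drop that prefix,
# then join the remaining option lines with single spaces.
pbs_options=$(sed -n 's/^#PBS //p' "$jobfile" | paste -s -d ' ' -)

# Print the command rather than running it.
qsub_command="qsub $pbs_options $jobfile"
echo "$qsub_command"
```

The same sed line is a quick way to review what resources an old job file asked for before copying it.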
5 RCAC Job files

    # (header stuff removed for example)
    ~/src/btrim/btrim64 \
        -3 \
        -p adapter2.fa \
        -t ../monpu1.genome.rawreads.fastq \
        -o Monpu1.trimmed \
        -s Monpu1.btrim.summary \
        >btrim.log

    echo " "
    date +"%d %B %Y %H:%M:%S"

    # Btrim64: -q -p <pattern file> -t <fastq file> -o <trim file> [-u 5'-error -v 3'-error -l minlen -b <5'-cut> -e <3'-cut> \
    #          -w <window> -a <average> -f <5'-trim> -I]
    #
    # Required for pattern trimming:
    #   -p <pattern file>    each line contains one pair of 5'- and 3'-adaptors; ignored if -q in effect
    #   -t <sequence file>   fastq file to be trimmed
    #   -o <output file>     fastq file of trimmed sequences
    #
    # Required for quality trimming (-q in effect):
    #   -t <sequence file>   fastq file to be trimmed
    #   -o <output file>     fastq file of trimmed sequences
    #
    # Optional:
    #   -q                   toggle to quality trimming [default=adaptor trimming]
    #   -3                   3'-adaptor trimming only [default=off]
    #   -P                   pass if no adaptor is found [default=off]
    #   -Q                   do a quality trimming even if adaptor is found [default=off]
    #   -s <summary file>    detailed trimming info for each sequence
    #   -u <5'-error>        maximum number of errors in 5'-adaptor [default=3]
    #   -v <3'-error>        maximum number of errors in 3'-adaptor [default=4]
    #   -l <minimal length>  minimal insert size [default=25]
    #   -b <5'-range>        the length of sequence to look for 5'-adaptor at the beginning of the sequence [default=1.3 x adaptor length]
    #

Notes on the job file:
- The btrim64 program is in /home/mgribsko/src. In unix, ~ is a symbol for your home directory. ~<username>, for instance ~mgribsko, is a symbol for the named user's home directory.
- I often copy the help for the command into the job file as a comment. Comments begin with #. This makes it much easier to change the command later.
- Notice that the PBS commands are comments as far as unix is concerned.
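Copying a program's help into a job file does not have to be done by hand. The sketch below is one way to do it, assuming the program prints its help to STDOUT or STDERR; show_help is a made-up stand-in for a real command's help output and help_demo.job is an invented file name.

```shell
#!/bin/sh
# Append a program's help text to a job file as comments.
# show_help is a made-up stand-in for a real command's help output.
show_help() {
    echo "usage: tool -t <fastq file> -o <trim file>"
    echo "  -s <summary file>   detailed trimming info"
}

jobfile=help_demo.job
printf '%s\n' '#!/bin/sh -l' 'echo "command goes here"' > "$jobfile"

# 2>&1 merges STDERR into STDOUT, because many tools print help on STDERR.
# sed then prefixes every line with "# " so the shell treats it as a comment.
show_help 2>&1 | sed 's/^/# /' >> "$jobfile"

comment_count=$(grep -c '^# ' "$jobfile")
echo "comment lines added: $comment_count"
cat "$jobfile"
```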
6 RCAC Job files
Time for an Example
- Job files
- Grep tricks
7 Adapter trimming
Over the summer I tried many methods:
- AdapterRemoval
- AlienTrimmer
- Btrim
- Cutadapt
- Fastx_clip
- Fastqmcf
- Flexbar
- Reaper
- Scythe
- Seqprep
- Seqyclean
- Skewer
- Trimmomatic
8 Adapter trimming
Quick and dirty test: use grep to check for the first 14 bases of the universal and index adapters, and their reverse complements.
Why 14? Long enough that you don't expect to see (many) matches by chance.
Why quick and dirty?
- Only exact matches will be found
- Quality is not considered
- Matches may be cut off by the end of the read
This test will UNDERESTIMATE the number of adapters.
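The grep test can be sketched on a toy file. Everything specific here is an assumption for illustration: ADAPTER is a made-up 14-base sequence rather than a real Illumina adapter, and toy.fq holds three invented reads. The reverse complement is built with awk and tr so the script needs nothing beyond standard unix tools.

```shell
#!/bin/sh
# Quick and dirty adapter check on a toy FASTQ file.
# ADAPTER is a made-up 14-base sequence, not a real Illumina adapter.
ADAPTER=ACGTTGCAAGGCTA

# Reverse complement: reverse the string with awk, then swap A<->T and C<->G.
ADAPTER_RC=$(printf '%s\n' "$ADAPTER" |
    awk '{ out = ""; for (i = length($0); i > 0; i--) out = out substr($0, i, 1); print out }' |
    tr ACGT TGCA)

# Three toy reads: one with the adapter, one with its reverse complement, one clean.
cat > toy.fq <<EOF
@read1
GGGG${ADAPTER}TTTT
+
IIIIIIIIIIIIIIIIIIIIII
@read2
CC${ADAPTER_RC}AA
+
IIIIIIIIIIIIIIIIII
@read3
GATTACAGATTACAGATTACA
+
IIIIIIIIIIIIIIIIIIIII
EOF

# In a 4-line FASTQ record the sequence is line 2, so test only those lines.
# grep -c counts matching lines; only exact matches are found.
forward=$(awk 'NR % 4 == 2' toy.fq | grep -c "$ADAPTER")
reverse=$(awk 'NR % 4 == 2' toy.fq | grep -c "$ADAPTER_RC")
total_reads=$(awk 'NR % 4 == 2' toy.fq | wc -l | tr -d ' ')

echo "forward: $forward  reverse complement: $reverse  reads: $total_reads"
```

Because grep -c counts lines rather than occurrences, a read carrying two copies of the adapter is counted once, which is one more reason the test underestimates.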
9 Adapter trimming
Columns: Index Forward, Index Reverse, Universal Forward, Universal Reverse, Total Adapters, reads remain, adapters remain

Monpu1.genome.rawReads.r1.fq
Monpu1.genome.rawReads.r2.fq
Monpu1.genome.rawReads.both.fq       %   %
Monpu1.genome.filteredReads.fastq    %   34.11%
adapterremoval                       %   3.16%
alientrimmer                         %   5.46%
cutadapt                             %   65.96%
fastqmcf                             %   17.39%
flexbar                              %   1.72%
reaper                               %   2.66%
scythe                               %   4.21%
seqprep                              %   3.26%
skewer                               %   2.14%
seqyclean all                        %   0.11%
trimmomatic paired.r1.fq
trimmomatic unpaired.r1.fq
trimmomatic paired.r2.fq
trimmomatic unpaired.r2.fq
trimmomatic all                      %   20.45%
10 Adapter trimming Group 1- trimmomatic
11 Adapter trimming
12 Adapter trimming
13 Adapter trimming
14 Adapter trimming
15 Adapter trimming
16 Adapter trimming
17 Other Cleaning
- Mitochondrial
- Phi-X174
Match to reads using Bowtie2 (or any other mapper); use the very sensitive local preset (--very-sensitive-local), which allows matches with small gaps.
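A contaminant screen along these lines might look like the command built below. This is a hedged sketch, not the command used in the course: the index name phix_index and all file names are hypothetical, the index would have to be built beforehand with bowtie2-build, and the script only prints the command so it can be inspected without Bowtie2 installed.

```shell
#!/bin/sh
# Build (but do not run) a Bowtie2 command for screening paired reads
# against a contaminant reference. Index and file names are hypothetical.
index=phix_index            # assumed to exist, built with bowtie2-build
reads1=reads.r1.fq
reads2=reads.r2.fq

# --very-sensitive-local : local alignment with the most sensitive preset
# -p 16                  : number of threads
# --un-conc              : write pairs that did NOT align concordantly,
#                          i.e. the reads to keep after the screen
# -S                     : SAM output for the alignments
screen_command="bowtie2 --very-sensitive-local -p 16 -x $index -1 $reads1 -2 $reads2 --un-conc clean.fq -S contaminant.sam"

echo "$screen_command"
```

Pasting the printed line into a job file, after the usual #PBS header and cd $PBS_O_WORKDIR, would turn it into a real job.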
18 De Bruijn Graphs (from Homolog.us Bioinformatics)
19 De Bruijn Graph
20 De Bruijn Graph Repeats
21 De Bruijn Graph reads
22 Velvet
One of the first De Bruijn assemblers
Pruning
- tips: a chain of nodes disconnected on one end
  - caused by sequencing errors OR coverage gaps
  - errors tend to be short (rule: trim if shorter than 2 k-mer lengths)
  - errors tend to have low multiplicity at the junction
- bubbles: paths that leave and return
  - caused by sequence variation (SNPs)
  - length/multiplicity rule: shorter, higher-multiplicity paths are preferred
- erroneous connections: duplicate sequences + errors
  - errors will have low coverage, but so will areas with low coverage
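The two tip criteria (short, and low multiplicity at the junction) can be combined into a single filter. The sketch below is a toy reading of the rule on this slide, not Velvet's actual code: the table tips.txt, its column layout, and the value k=31 are invented for illustration.

```shell
#!/bin/sh
# Toy version of the tip-pruning rule: drop a tip when it is shorter than
# 2 * k AND its multiplicity is lower than the competing path at the junction.
# The table and the value of k are invented for illustration.
k=31

# columns: tip_id  length  tip_multiplicity  other_path_multiplicity
cat > tips.txt <<'EOF'
tipA 40 2 25
tipB 40 30 25
tipC 90 2 25
EOF

# tipA is short and weak, so it is pruned.
# tipB is short but well supported, tipC is weak but long; both are kept.
pruned=$(awk -v k="$k" '$2 < 2 * k && $3 < $4 { print $1 }' tips.txt | paste -s -d ' ' -)
echo "pruned tips: $pruned"
```

Requiring both conditions is what protects a genuine low-coverage region from being trimmed away just because it is thinly covered.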