New High Performance Computing Cluster For Large Scale Multi-omics Data Analysis. 28 February 2018 (Wed), 2:30pm to 3:30pm, Seminar Room 1A, G/F


1 New High Performance Computing Cluster For Large Scale Multi-omics Data Analysis
  28 February 2018 (Wed), 2:30pm to 3:30pm
  Seminar Room 1A, G/F

2 The Team (Bioinformatics & Information Technology)
  Eunice, Kelvin, Juilian, Kevin, Nick

3 HPCF High Performance Computing Facility

  Existing HPC (2011)
    Master / Login Node: hpcf.cgs.hku.hk
    Hostname: statgenpro
    15 Compute Nodes / 376 cores / 1.5TB RAM
    Operating System: Ubuntu
    Job Scheduler: Torque / MAUI
    Tool: None

  New HPC (2018)
    Master / Login Node: hpcf2.cgs.hku.hk
    Hostname: omics
    13 (8+5) Compute Nodes / 640 cores / 3.6TB RAM
    Operating System: CentOS 7.3
    Job Scheduler: PBS Pro
    Tool: Environment Modules

4 What's New: Environment Modules
  Environment Modules is a tool to help you manage your shell environment by allowing groups of related environment-variable settings to be made or removed dynamically.

  What does it mean to you?
  - No need to change your path or type in the full path to access a piece of software or a specific version
  - No need to worry about the dependencies of a piece of software
  - Simply load and unload the desired version of a piece of software
  - Minor changes to scripts are required, but job submission is basically the same
  - Software to be used by all users will be installed by IT using Modules
  - You can continue to install your own software in your /home directory
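As a rough illustration of what loading and unloading a module amounts to, the sketch below prepends a tool's bin directory to PATH and removes it again. It does not call the real module command; the helper names path_prepend and path_remove and the directory /software/cutadapt/1.15/bin are hypothetical and chosen only for the example.

```shell
#!/bin/bash
# Sketch only: mimics the PATH change that loading/unloading a module performs.
# path_prepend, path_remove and the directory below are hypothetical.

# Put a directory at the front of PATH unless it is already there.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                       # already present, nothing to do
        *) PATH="$1${PATH:+:$PATH}" ;;
    esac
}

# Remove every occurrence of a directory from PATH.
path_remove() {
    local dir="$1" entry new=""
    local IFS=':'
    for entry in $PATH; do
        [ "$entry" = "$dir" ] && continue
        new="${new:+$new:}$entry"
    done
    PATH="$new"
}

tool_dir=/software/cutadapt/1.15/bin       # hypothetical install location
path_prepend "$tool_dir"                   # roughly: module load cutadapt/1.15
echo "after load:   ${PATH%%:*}"
path_remove "$tool_dir"                    # roughly: module unload cutadapt/1.15
echo "after unload: ${PATH%%:*}"
```

A real modulefile can set other variables as well; only the PATH part is shown here.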

5 What's New: Updated Queue parameters

  Queue name    Max. processors  Max. memory (GB)  Max. jobs running  Max. jobs queuing  Max. walltime (hr)
                per job          per job           per user           per user
  small         2                10 (8)            18 (16)            40 (30)            6
  small_ext     2                10 (8)            6 (4)              12 (10)            60 (48)
  medium        12 (10)          50 (40)           12 (10)            25 (22)            24 (18)
  medium_ext    12 (10)          50 (40)           6 (4)              8                  60 (48)
  large                          (90)              3 (2)              4                  84 (72)
  test default  NA               NA                NA                 NA                 NA
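One way to use these limits when preparing a job is to check a request against them before submitting. The sketch below is a hypothetical helper, pick_queue, which is not part of PBS Pro. It assumes that the first number in each cell is the limit in force (the slide does not say what the parenthesised numbers mean), and it leaves out the large queue because its processor and memory limits are not legible in the table.

```shell
#!/bin/bash
# Sketch only: print the first queue whose limits cover a request.
# Assumption: the first number in each table cell is the active limit.
# pick_queue is a hypothetical helper, not a PBS Pro command.

pick_queue() {
    local ppn="$1" mem_gb="$2" wall_hr="$3"
    local name max_ppn max_mem max_wall
    # Columns: name, max processors, max memory (GB), max walltime (hr)
    while read -r name max_ppn max_mem max_wall; do
        if [ "$ppn" -le "$max_ppn" ] && [ "$mem_gb" -le "$max_mem" ] &&
           [ "$wall_hr" -le "$max_wall" ]; then
            echo "$name"
            return 0
        fi
    done <<'EOF'
small 2 10 6
small_ext 2 10 60
medium 12 50 24
medium_ext 12 50 60
EOF
    echo "no listed queue fits: ${ppn} cores, ${mem_gb} GB, ${wall_hr} hr" >&2
    return 1
}

# A request of 2 processors, 10 GB and 60 hours of walltime.
pick_queue 2 10 60    # prints small_ext
```

The per-user limits on running and queuing jobs are not checked here, since they depend on what else the user has submitted.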

6 What's New: Some commonly used genome references / databases have been added

  Genome References  Database  Software Bundle Databases
  Homo_sapiens       mirbase   ANNOVAR
  Mus_musculus       ncrna     CellRanger
  Gallus_gallus      SILVA_DB  GATK_bundle
  Equus_caballus               mutect
                               oncotator
                               igenomes
  More coming...

7 Performance benchmark (HPC 2011 vs HPC 2018)
  Speed up (values not preserved in this transcript):
    GFLOPS: x
    Passmark Performance Test: x
  FLOPS: floating point operations per second
  1 GFLOPS = 10^9 FLOPS

8 Quick demo
  Basic use of module
    module avail
    module list
    module load cutadapt
    module help cutadapt
    module unload cutadapt
    module purge
  Sample script modification
    Add module load commands / remove full path
    Log location

9
    -bash-4.2$ module avail

    /software/modules/modulefiles
    ANNOVAR/2017Jul16         HTSeq/0.9.1 (D)  STAR/2.5.2a            bwa/
    BEDTools/                 MACS/            STAR/2.5.3a (D)        bwa/ (D)
    BEDTools/                 MACS/ (D)        TrimGalore/0.4.1       cutadapt/1.8.1
    BEDTools/ (D)             MUMmer/3.22      TrimGalore/0.4.5 (D)   cutadapt/1.15 (D)
    BioPerl/1.7.2             MUMmer/3.23 (D)  Trimmomatic/0.33       idba/1.1.3
    Canu/1.5                  NCBI-blast/      Trimmomatic/0.36 (D)   java/7.0_25
    Canu/1.6 (D)              NCBI-blast/ (D)  VerifyBamID/1.1.2      java/7.0_80
    CellRanger/2.0.1          Oncotator/       VerifyBamID/1.1.3 (D)  java/8.0_161 (D)
    CellRanger/2.1.0 (D)      PEAR/            bamutil/               java/9.0.4
    DESeq2/                   PEAR/ (D)        bamutil/ (D)           miniconda2/
    DESeq2/ (D)               Perl/            bamtools/2.3.0         mutect/1.1.4
    EBSeq/1.9.3               Picard/2.0.1     bamtools/2.5.1 (D)     mutect/1.1.5 (D)
    EBSeq/1.18 (D)            Picard/ (D)      bcl2fastq/2.19         python2/
    FASTX-toolkit/            QIIME/1.9.1      bcl2fastq/2.20 (D)     python3/3.6.4
    FASTX-toolkit/ (D)        QIIME2/          bedgraphtobigwig/4     samtools/
    FastQC/                   R/3.2.5          bismark/               samtools/1.3
    FastQC/ (D)               R/3.4.3 (D)      bismark/ (D)           samtools/1.6 (D)
    GenomeAnalysisTK/3.5      RNAmmer/1.2      bowtie/1.0.0           strelka/
    GenomeAnalysisTK/3.7      RSEM/            bowtie/1.2.2 (D)       strelka/2.8.4 (D)
    GenomeAnalysisTK/3.8 (D)  RSEM/1.3.0 (D)   bowtie2/2.2.5          trnascan-se/1.3.1
    HOMER/4.9                 SPAdes/          bowtie2/2.3.4 (D)
    HTSeq/0.6.1               SPAdes/ (D)      bwa/

    /opt/lmod/7.7.14/lmod/lmod/modulefiles/core
    lmod/    settarg/

    Where:
     D: Default Module

10
    -bash-4.2$ module list
    No modules loaded
    -bash-4.2$
    -bash-4.2$ module load cutadapt
    cutadapt/1.15 is loaded
    -bash-4.2$
    -bash-4.2$ module list
    Currently Loaded Modules:
      1) cutadapt/1.15
    -bash-4.2$
    -bash-4.2$ module load cutadapt/1.8.1
    cutadapt/1.15 is unloaded
    cutadapt/1.8.1 is loaded
    The following have been reloaded with a version change:
      1) cutadapt/1.15 => cutadapt/
    -bash-4.2$

11
    -bash-4.2$ module load GenomeAnalysisTK
    Lmod has detected the following error: Cannot load module "GenomeAnalysisTK/3.8" without these module(s) loaded:
       java/8.0_161

    While processing the following module(s):
        Module fullname       Module Filename
        GenomeAnalysisTK/3.8  /software/modules/modulefiles/genomeanalysistk/3.8.lua
    -bash-4.2$
    -bash-4.2$ module list
    Currently Loaded Modules:
      1) cutadapt/   2) java/8.0_161   3) GenomeAnalysisTK/3.8
    -bash-4.2$ module unload cutadapt/1.8.1
    cutadapt/1.8.1 is unloaded
    java/8.0_161 is loaded
    GenomeAnalysisTK/3.8 is loaded
    -bash-4.2$ module list
    Currently Loaded Modules:
      1) java/8.0_161   2) GenomeAnalysisTK/3.8
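The error above shows that GenomeAnalysisTK/3.8 will not load until java/8.0_161 is loaded. A job script can make that order explicit. The sketch below uses a hypothetical wrapper, load_in_order, around the real module load command; it assumes that module load returns a non-zero status on failure, which may depend on the Lmod version installed.

```shell
#!/bin/bash
# Sketch only: load a dependency before the module that needs it.
# load_in_order is a hypothetical wrapper around "module load".
# Assumption: "module load" returns non-zero when a module cannot be loaded.

load_in_order() {
    local m
    # On the cluster, "module" is a shell function provided by Lmod.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this on the cluster" >&2
        return 1
    fi
    for m in "$@"; do
        module load "$m" || { echo "failed to load $m" >&2; return 1; }
    done
}

# java/8.0_161 goes first because GenomeAnalysisTK/3.8 requires it.
load_in_order java/8.0_161 GenomeAnalysisTK/3.8 || echo "modules not loaded" >&2
```

Naming the versions explicitly, rather than relying on the default (D) version, keeps the script's behaviour stable if the defaults change later.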

12
  # EXISTING SCRIPT
    #!/bin/bash
    #PBS -l nodes=1:ppn=2
    #PBS -l mem=10gb
    #PBS -l walltime=60:00:00
    #PBS -m abe
    #PBS -q test_queue
    #PBS -N Cutadapt
    #PBS -e err.$PBS_JOBNAME."$file".$PBS_JOBID
    #PBS -o out.$PBS_JOBNAME."$file".$PBS_JOBID
    echo '[MSG] Start filtering adapters...'
    cd $PBS_O_WORKDIR/RAW
    /home/someone/software/cutadapt-1.8.1/bin/cutadapt $(<cutadapt_truseqpe.conf) -o "$file"_1_temp.fastq -p "$file"_2_temp.fastq "$file"_1.fastq "$file"_2.fastq
    # run more commands below
    echo '[MSG] Adapters Filtering Completed!'

  # UPDATED SCRIPT
    #!/bin/bash
    #PBS -l nodes=1:ppn=2
    #PBS -l mem=10gb
    #PBS -l walltime=60:00:00
    #PBS -m abe
    #PBS -q test_queue
    #PBS -N Cutadapt
    ##PBS -e err.$PBS_JOBNAME."$file".$PBS_JOBID
    ##PBS -o out.$PBS_JOBNAME."$file".$PBS_JOBID
    echo '[MSG] Start filtering adapters...'
    cd $PBS_O_WORKDIR/RAW
    module load cutadapt/1.8.1
    cutadapt $(<cutadapt_truseqpe.conf) -o "$file"_1_temp.fastq -p "$file"_2_temp.fastq "$file"_1.fastq "$file"_2.fastq
    # run more commands below
    echo '[MSG] Adapters Filtering Completed!'
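The sample script reads a variable named file, but the slide does not show where it is set. One common approach, assumed here, is to pass it at submission time with qsub -v. The sketch below only prints the qsub commands (a dry run) so they can be checked first; the script name cutadapt.pbs and the sample names are hypothetical.

```shell
#!/bin/bash
# Sketch only: print one qsub command per sample (dry run, nothing is submitted).
# Assumption: the job script reads $file, passed in with "qsub -v".
# cutadapt.pbs and the sample names are hypothetical.

print_submissions() {
    local script="$1" sample
    shift
    for sample in "$@"; do
        printf 'qsub -v file=%s %s\n' "$sample" "$script"
    done
}

print_submissions cutadapt.pbs sampleA sampleB
```

Once the printed commands look right, they can be run by hand or piped to bash to submit the jobs.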

13 Potential / Known issues
  - Software currently installed in the existing HPC may have to be recompiled/reinstalled in the new HPC
  - Existing scripts have to be modified
  - gatk 3.5: specifically for gatk in the new environment, you have to add the following parameter in order to prevent an error when calling variants

    java -Xmx80g -jar $GATK_DIR/GenomeAnalysisTK.jar -T HaplotypeCaller \
        -R $ref_file \
        -I sampleA.bam \
        -D $GATK_bundle/dbsnp.vcf \
        -o $OUT_GVCFDIR/sampleA.g.vcf \
        --emitRefConfidence GVCF \
        -log $OUT_GVCFDIR/"$file".g.vcf.log \
        --pair_hmm_implementation LOGLESS_CACHING \
        -nct 12

14 Data backup
  - You own the data and are responsible for your own data backup
  - There is no centralized data backup solution, but we're looking into the feasibility of offering one as a future service

15 What happens to the Existing HPC
  - Will be gradually phased out by Dec 2018
  - Co-located servers will need to migrate to the new HPC by Dec

16 Moving Forward
  Better visibility of the HPC
  - Job queues and resources status
  Capacity expansion $$$
  - More Compute Nodes
  - More disk capacity
  New user orientation sessions, seminars and workshops
  - Orientation session will be compulsory for new users
  - Introduction to HPCF / PBS Pro / module technical session on 27 Mar 2018 (10:00am to 12:00pm)
  Better Utilisation of the HPC

17 FAQ: Commonly Asked Questions
  - How do I apply for a new user account, or for removal of one?
  - Can I share my account with others?
  - What are the CGS-HPCF charges?
  - How do I apply for an increase or reduction in disk quota?
  - What should I do when the usual queue / resource doesn't satisfy my analysis job?
  - What is the co-location service?

18 Good Citizen / Fair Use of resource
  Most people requested MORE than what they need
  [Chart: number of jobs by requested CPU versus used CPU]

19 Example Usage (8am)
  [Table: columns jobid, username, queue, jobname, ES, End Time, cpu%, mem%, wtime%. The rows list jobs run by user cgs: on omics, canu_cre (queue small) and meryl_cre (queue medium), ending 14 2018; on statgenpro, pbalign, pbalign_contig, pbalign_unitig and pbalign_fl (queue cgs, ending 10/31/2017 and 11/8/2017), STARlong (queue large, ending 9/28) and insertsize (queue cgs, ending 4/22/2017). ES values shown are 0, 271 and -11. The job IDs, end times and percentage columns were not preserved in this transcript.]
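The wtime% column compares the walltime a job used with the walltime it requested. As a hedged sketch of how such a figure could be derived, the helper below reads text in the attribute = value layout printed by qstat -f and reports used walltime as a percentage of the request. walltime_pct is a hypothetical name, the sample values are made up, and whether finished jobs are still visible to qstat depends on how the scheduler is configured.

```shell
#!/bin/bash
# Sketch only: used walltime as a percentage of requested walltime,
# read from "qstat -f" style text on standard input.
# walltime_pct is a hypothetical helper; it expects HH:MM:SS values.

walltime_pct() {
    awk '
        # Convert HH:MM:SS to seconds.
        function secs(t,   p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
        $1 == "resources_used.walltime" { used = secs($3) }
        $1 == "Resource_List.walltime"  { req  = secs($3) }
        END {
            if (req > 0) printf "%d\n", used * 100 / req
            else exit 1
        }
    '
}

# Made-up example: 6 hours used out of 60 hours requested.
walltime_pct <<'EOF'
    resources_used.walltime = 06:00:00
    Resource_List.walltime = 60:00:00
EOF
```

A low percentage suggests the next submission could request less walltime, and possibly fit a shorter queue.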

20 Scheduled Maintenance
  Periodically, usually quarterly, maintenance of the HPCF will be arranged for the following purposes, with some downtime:
  - OS update / upgrade
  - Patches installation
  - Fine-tuning
  - Health check
  - Housekeeping
  - Hardware upgrade
  - Firmware update
  - Configuration changes

21 The day is today
  hpcf2.cgs.hku.hk

    ************************
    *   Welcome to Omics   *
    ************************

22 Questions / Suggestions
  Website:
  Wiki:
  Email: itsupport.cgs@hku.hk


More information

PACE Orientation. Research Scientist, PACE

PACE Orientation. Research Scientist, PACE PACE Orientation Mehmet (Memo) Belgin, PhD Research Scientist, PACE www.pace.gatech.edu What is PACE A Partnership for an Advanced Computing Environment Provides faculty and researchers vital tools to

More information

NBIC TechTrack PBS Tutorial. by Marcel Kempenaar, NBIC Bioinformatics Research Support group, University Medical Center Groningen

NBIC TechTrack PBS Tutorial. by Marcel Kempenaar, NBIC Bioinformatics Research Support group, University Medical Center Groningen NBIC TechTrack PBS Tutorial by Marcel Kempenaar, NBIC Bioinformatics Research Support group, University Medical Center Groningen 1 NBIC PBS Tutorial This part is an introduction to clusters and the PBS

More information

PACE. Instructional Cluster Environment (ICE) Orientation. Research Scientist, PACE

PACE. Instructional Cluster Environment (ICE) Orientation. Research Scientist, PACE PACE Instructional Cluster Environment (ICE) Orientation Mehmet (Memo) Belgin, PhD Research Scientist, PACE www.pace.gatech.edu What is PACE A Partnership for an Advanced Computing Environment Provides

More information

Introduction to GALILEO

Introduction to GALILEO November 27, 2016 Introduction to GALILEO Parallel & production environment Mirko Cestari m.cestari@cineca.it Alessandro Marani a.marani@cineca.it SuperComputing Applications and Innovation Department

More information

Using the computational resources at the GACRC

Using the computational resources at the GACRC An introduction to zcluster Georgia Advanced Computing Resource Center (GACRC) University of Georgia Dr. Landau s PHYS4601/6601 course - Spring 2017 What is GACRC? Georgia Advanced Computing Resource Center

More information

root.smart Power NET Bandwidth Usage CPU Usage

root.smart Power NET Bandwidth Usage CPU Usage root.smart Power NET Bandwidth Usage CPU Usage Free Space on Disks Patch Status RAM Usage Pagefile Usage Backup Status Script Time eventtime Script Name Status smartmelb_smart-dc01 4/15/2010 7:03:32

More information

Cerebro Quick Start Guide

Cerebro Quick Start Guide Cerebro Quick Start Guide Overview of the system Cerebro consists of a total of 64 Ivy Bridge processors E5-4650 v2 with 10 cores each, 14 TB of memory and 24 TB of local disk. Table 1 shows the hardware

More information

Getting started with the CEES Grid

Getting started with the CEES Grid Getting started with the CEES Grid October, 2013 CEES HPC Manager: Dennis Michael, dennis@stanford.edu, 723-2014, Mitchell Building room 415. Please see our web site at http://cees.stanford.edu. Account

More information

Big Data Analytics with Hadoop and Spark at OSC

Big Data Analytics with Hadoop and Spark at OSC Big Data Analytics with Hadoop and Spark at OSC 09/28/2017 SUG Shameema Oottikkal Data Application Engineer Ohio SuperComputer Center email:soottikkal@osc.edu 1 Data Analytics at OSC Introduction: Data

More information

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved.

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved. Minnesota Supercomputing Institute Introduction to MSI Systems Andrew Gustafson The Machines at MSI Machine Type: Cluster Source: http://en.wikipedia.org/wiki/cluster_%28computing%29 Machine Type: Cluster

More information

Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine

Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine Partners Funding Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike

More information

Using ITaP clusters for large scale statistical analysis with R. Doug Crabill Purdue University

Using ITaP clusters for large scale statistical analysis with R. Doug Crabill Purdue University Using ITaP clusters for large scale statistical analysis with R Doug Crabill Purdue University Topics Running multiple R jobs on departmental Linux servers serially, and in parallel Cluster concepts and

More information

Slurm basics. Summer Kickstart June slide 1 of 49

Slurm basics. Summer Kickstart June slide 1 of 49 Slurm basics Summer Kickstart 2017 June 2017 slide 1 of 49 Triton layers Triton is a powerful but complex machine. You have to consider: Connecting (ssh) Data storage (filesystems and Lustre) Resource

More information

Please include the following sentence in any works using center resources.

Please include the following sentence in any works using center resources. The TCU High-Performance Computing Center The TCU HPCC currently maintains a cluster environment hpcl1.chm.tcu.edu. Work on a second cluster environment is underway. This document details using hpcl1.

More information

Introduction to High Performance Computing at UEA. Chris Collins Head of Research and Specialist Computing ITCS

Introduction to High Performance Computing at UEA. Chris Collins Head of Research and Specialist Computing ITCS Introduction to High Performance Computing at UEA. Chris Collins Head of Research and Specialist Computing ITCS Introduction to High Performance Computing High Performance Computing at UEA http://rscs.uea.ac.uk/hpc/

More information

Introduction to NCAR HPC. 25 May 2017 Consulting Services Group Brian Vanderwende

Introduction to NCAR HPC. 25 May 2017 Consulting Services Group Brian Vanderwende Introduction to NCAR HPC 25 May 2017 Consulting Services Group Brian Vanderwende Topics we will cover Technical overview of our HPC systems The NCAR computing environment Accessing software on Cheyenne

More information

For Dr Landau s PHYS8602 course

For Dr Landau s PHYS8602 course For Dr Landau s PHYS8602 course Shan-Ho Tsai (shtsai@uga.edu) Georgia Advanced Computing Resource Center - GACRC January 7, 2019 You will be given a student account on the GACRC s Teaching cluster. Your

More information

Introduction to UNIX

Introduction to UNIX PURDUE UNIVERSITY Introduction to UNIX Manual Michael Gribskov 8/21/2016 1 Contents Connecting to servers... 4 PUTTY... 4 SSH... 5 File Transfer... 5 scp secure copy... 5 sftp

More information

GACRC User Training: Migrating from Zcluster to Sapelo

GACRC User Training: Migrating from Zcluster to Sapelo GACRC User Training: Migrating from Zcluster to Sapelo The GACRC Staff Version 1.0 8/28/2017 GACRC Zcluster-Sapelo Migrating Training 1 Discussion Points I. Request Sapelo User Account II. III. IV. Systems

More information

Knights Landing production environment on MARCONI

Knights Landing production environment on MARCONI Knights Landing production environment on MARCONI Alessandro Marani - a.marani@cineca.it March 20th, 2017 Agenda In this presentation, we will discuss - How we interact with KNL environment on MARCONI

More information

The study of microbial communities: Bioinformatics applications within the UL HPC environment

The study of microbial communities: Bioinformatics applications within the UL HPC environment The study of microbial communities: Bioinformatics applications within the UL HPC environment UL HPC school 2017 13 June 2017 Shaman Narayanasamy Eco-Systems Biology group of LCSB The subject: microbial

More information

and how to use TORQUE & Maui Piero Calucci

and how to use TORQUE & Maui Piero Calucci Queue and how to use & Maui Scuola Internazionale Superiore di Studi Avanzati Trieste November 2008 Advanced School in High Performance and Grid Computing Outline 1 We Are Trying to Solve 2 Using the Manager

More information

New User Seminar: Part 2 (best practices)

New User Seminar: Part 2 (best practices) New User Seminar: Part 2 (best practices) General Interest Seminar January 2015 Hugh Merz merz@sharcnet.ca Session Outline Submitting Jobs Minimizing queue waits Investigating jobs Checkpointing Efficiency

More information

Sequence Mapping and Assembly

Sequence Mapping and Assembly Practical Introduction Sequence Mapping and Assembly December 8, 2014 Mary Kate Wing University of Michigan Center for Statistical Genetics Goals of This Session Learn basics of sequence data file formats

More information

Analyzing ChIP-Seq Data at the Command Line

Analyzing ChIP-Seq Data at the Command Line Analyzing ChIP-Seq Data at the Command Line Quick UNIX Introduction: UNIX is an operating system like OSX or Windows. The interface between you and the UNIX OS is called the shell. There are a few flavors

More information