Introduction to BioHPC

Introduction to BioHPC New User Training
Web: portal.biohpc.swmed.edu
Email: biohpc-help@utsouthwestern.edu
Updated for 2015-06-03

Overview
Today we're going to cover:
- What is BioHPC?
- How do I access BioHPC resources?
- How can I be a good user? (some basic rules)
- How do I get effective help?
If you remember only one thing: if you have any question, ask us via biohpc-help@utsouthwestern.edu

What is HPC, and why do we need it?
High-performance computing (HPC) is the use of parallel processing to run advanced application programs efficiently, reliably, and quickly. In practice, it is any computing that isn't possible on a standard system.
Problems: huge datasets, complex algorithms, difficult or inefficient software.
BioHPC solutions: batch HPC jobs, interactive GUI sessions, visualization with GPUs, Windows sessions on the cluster, a wide range of software, and easy web access to services.

What is BioHPC? - An Overview
BioHPC is:
- A 74-node compute cluster.
- More than 1 petabyte (1,000 terabytes) of storage across various systems.
- A network of thin-client and workstation machines.
- A large number of installed software packages.
- Cloud services to access these facilities easily.
- A dedicated team to help you use these resources efficiently for your research.

Who is BioHPC?
- Liqiang Wang - Director; 13 years of experience in IT infrastructure and HPC.
- Yi Du - Computational Scientist; experience in parallel software design and large-scale data analysis.
- David Trudgian - Computational Scientist; Ph.D. in Computer Science and 10 years of experience in bioinformatics (sequence classification and computational proteomics).
- Ross Bateman - Technical Support Specialist; experienced in maintaining user systems and troubleshooting.
We are biohpc-help@utsouthwestern.edu
https://portal.biohpc.swmed.edu/content/about/staff/

What is BioHPC? - Nucleus Compute Cluster
Nucleus is our compute cluster:
- 74 nodes (128GB, 256GB, 384GB, and GPU)
- CPU cores: 3,296
- GPU cores: 19,968
- Memory: 16TB
- Network: 56Gb/s per node (internal), 40Gb/s to campus
Log in via SSH to nucleus.biohpc.swmed.edu, or use the web portal.
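For example, from a Linux or Mac terminal you can log in with a standard SSH client (replace username with your own BioHPC username):

    ssh username@nucleus.biohpc.swmed.edu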

What can I do on Nucleus?
Run any computationally intensive work:
- Linux HPC jobs
- GPU visualization
- Interactive sessions
- Windows with GPU visualization

What is BioHPC? - Storage
Every user receives space in 3 file systems: /home2, /project, and /work.
- Your home directory has a 50GB quota for private files, code, settings, etc.: /home2/username
- Your project directory is for your large data; its quota is set per lab/group: /project/department/group/user
- Your work quota is set per lab/group: /work/department/user
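As a concrete sketch (the department, lab, and username below are hypothetical placeholders), one user's three storage locations might look like:

    /home2/s123456
    /project/biochemistry/smithlab/s123456
    /work/biochemistry/s123456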

What is BioHPC? - Lamella Storage Gateway
Lamella is our storage gateway: access your files easily, from anywhere.
- Web interface
- Windows / Mac drive mounts (SMB / WebDAV)
- FTP
lamella.biohpc.swmed.edu

What is BioHPC? - Thin Client & Workstation Systems
Desktop computers directly connected to the BioHPC systems:
- Run the same version of Linux as the cluster, but with a graphical desktop.
- Log in with your BioHPC details; direct access to storage, as on the cluster.
- Same software available as on the cluster.
- Will make up a distributed compute resource in the future, using up to 50% of CPU to run distributed jobs.
- A thin client is less powerful than a workstation, but cheaper and smaller.
Separate training session: October 14th.

What is BioHPC? - Software
A wide range of packages is available as modules. You can ask biohpc-help@utsouthwestern.edu for additions, upgrades, etc.

What is BioHPC? - Cloud Services
A big focus at BioHPC is easy access to our systems. Our cloud services provide web-based access to resources, with only a browser. All are accessible via portal.biohpc.swmed.edu

Okay, sounds great. But how do I use all of this?

If you don't know anything about using a Linux command line...
We provide simple web-based access, so you don't have to be an expert. Web tutorials are your friend when you want to learn more: http://linuxcommand.org/lc3_learning_the_shell.php
Linux command-line topics that are useful for BioHPC, plus writing shell scripts, were covered in a previous course; see Portal -> Training for the slides. We won't be offering an introduction to very basic use of the Linux command line; there's plenty on the web.

Lamella / Cloud Storage Gateway
Our cloud storage gateway is web-based:
- https://lamella.biohpc.swmed.edu - 100GB separate space, plus mounts of home / project / work; internal only.
- https://cloud.biohpc.swmed.edu - 50GB space for external file transfer; accessible from the internet.
Separate training session: Wed Sep 9th.

Setting up Lamella to access your BioHPC home, project, and work space
https://lamella.biohpc.swmed.edu
When adding a storage mount:
- For home, leave the folder field blank.
- For private project space, use: department/lab/user
- For lab shared project space, use: department/lab/shared
(The slide shows the Lamella external-storage form: backend BioHPC Endosome/Lysosome, with username, password, and home/project mount fields.)

Accessing BioHPC Storage Directly from Windows
Computer -> Map Network Drive. The folder is:
\\lamella.biohpc.swmed.edu\username (home dir)
\\lamella.biohpc.swmed.edu\project
\\lamella.biohpc.swmed.edu\work
Check "Connect using different credentials", then enter your BioHPC username and password when prompted.

Accessing BioHPC Storage Directly from Mac OS X
Finder -> Go -> Connect to Server. The folder is:
smb://lamella.biohpc.swmed.edu/username (home dir)
smb://lamella.biohpc.swmed.edu/project
smb://lamella.biohpc.swmed.edu/work
Enter your BioHPC username and password when prompted.

Web Job Script Generator
https://portal.biohpc.swmed.edu -> Cloud Services -> Web Job Submission

Web Visualization - Graphical Interactive Session via the Web Portal / VNC Client
https://portal.biohpc.swmed.edu -> Cloud Services -> Web Visualization
Connects to a GUI running on a cluster node. WebGPU sessions have access to a GPU card for 3D rendering.

Software Modules
module list - Show loaded modules
module avail - Show available modules
module load <module name> - Load a module
module unload <module name> - Unload a module
module help <module name> - Help notes for a module
module -H - Help for the module command
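A typical session might look like this (the module name and version are hypothetical; run module avail to see what is actually installed):

    module avail                 # list everything installed
    module load bowtie2/2.2.3    # load a package (hypothetical version)
    module list                  # confirm it is loaded
    module unload bowtie2/2.2.3  # unload it when finished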


SSH Cluster Login via the Web Portal
https://portal.biohpc.swmed.edu -> Cloud Services -> Nucleus Web Terminal
Connects to the login node, not a cluster node.

Connecting from Home
Windows: follow the IR VPN instructions at http://www.utsouthwestern.net/intranet/administration/information-resources/network/vpn/
Mac: try the IR instructions first. If they don't work:
- On campus: Go -> Connect to Server; server address smb://swnas.swmed.org/data/installs; then Connect VPN Client (Juniper) -> Juniper Mac VPN Client Installer -> JunosPulse.dmg. Install the software from the .dmg file. You cannot test it on campus.
- At home: start Junos Pulse and add a connection to the server utswra.swmed.edu. When connecting you must enter a secondary password, which is obtained using the key icon in the Duo Mobile two-factor authentication smartphone app.
We can help at a surgery session, or in NL05.108.

How To Be a Good User
HPC systems are crowded, shared resources, so co-operation is necessary. The BioHPC team has a difficult job to do:
- Balance the requirements of a diverse group of users, running very different types of jobs.
- Make sure user actions don't adversely affect others using the systems.
- Keep the environment secure.
- Ensure resources are being used efficiently.
The web-based cloud services are designed to avoid problems.

All we ask is...
1. If you have any question, or are unsure about something, please ask us: biohpc-help@utsouthwestern.edu
2. When running jobs on the cluster, request the least amount of resources you know you need: job time, memory limit, smallest node that will work, etc. Up to a 2x margin of safety is appropriate. (A minimal job script illustrating this is sketched below.)
3. Make reasonable attempts to use the resources efficiently. Run multiple small tasks on a node if you can. Cancel or close any jobs or sessions you no longer need.
4. Keep notes in case you need our help troubleshooting, and keep old versions of scripts and job files.
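As a sketch of point 2, a minimal SLURM batch script that requests modest, explicit resources might look like this (the partition name, module version, and limits are illustrative assumptions; check sinfo for the partitions that actually exist):

    #!/bin/bash
    #SBATCH --job-name=align_sample       # short, descriptive job name
    #SBATCH --partition=128GB             # assumed name; smallest node type that fits
    #SBATCH --nodes=1                     # one node is enough for this task
    #SBATCH --time=0-04:00:00             # 4 hours: expected runtime plus ~2x safety margin
    #SBATCH --output=align_sample.%j.out  # %j is replaced by the job ID

    module load bowtie2/2.2.3             # hypothetical module/version
    bowtie2 -x genome_index -U reads.fastq -S output.sam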

Currently Enforced Policy
- Don't run complex things on the login node (web terminal or nucleus.biohpc.swmed.edu).
- Maximum of 16 nodes in use concurrently by any single user.
- Maximum of 2 GPU nodes per user.
- Interactive use of cluster nodes via the web visualization or remotegui/remotegpu scripts only.
- Logging in to a compute node not allocated to you will disable your account.

Getting Effective Help
Email the ticket system: biohpc-help@utsouthwestern.edu
- What is the problem? Provide any error message and diagnostic output you have.
- When did it happen? What time? Cluster or client? What job ID?
- How did you run it? What did you run, with what parameters, and what do they mean?
- Any unusual circumstances? Have you compiled your own software? Do you customize startup scripts?
- Can we look at your scripts and data? Tell us if you are happy for us to access your scripts/data to help troubleshoot.

Next Steps
- New users: wait for confirmation that your account is activated.
- Spend some time experimenting with our systems.
- Check the training schedule and attend relevant sessions.
- Join us for coffee on the 4th Wednesday of each month.

REFERENCE SLIDES

Essential BioHPC Commands
quota -ugs - Show home directory and project directory quota and usage
panfs-quota -G /work - Show work directory quota and usage
du -sh <directory> - Show the size of a specific directory and its contents
squeue - Show cluster job information
sinfo - Show cluster node status
sbatch myscript.sh - Submit a cluster batch job using a script file
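For example, to check your quotas and see how much space one directory is using (the path is a hypothetical placeholder):

    quota -ugs                      # home and project quota/usage
    du -sh /home2/username/my_data  # size of one directory and its contents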

Essential BioHPC Commands - Viewing Files / Text Editors
cat <filename> - Display a file on the screen
less <filename> - Display a file so that you can scroll up and down; q or Ctrl-C quits
vi or vim - Powerful editors, with a cryptic set of commands! See http://www.webmonkey.com/2010/02/vi_tutorial_for_beginners/
nano - Simpler and easier to use! See http://mintaka.sdsu.edu/reu/nano.html

Cluster Status - sinfo
(The slide shows a screenshot of sinfo output.) Notes: * marks the default partition, and nodes can be in more than one partition.
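Illustrative output only; the partition names, node counts, and node names below are invented for the example:

    $ sinfo
    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    128GB*       up   infinite     32  alloc Nucleus[010-041]
    256GB        up   infinite      8   idle Nucleus[042-049]
    GPU          up   infinite      4    mix Nucleus[050-053]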

Cluster Status - squeue
(The slide shows a screenshot of squeue output.) Notes: in the ST column, PD = pending and R = running; the TIME column is in days-hours:minutes:seconds.
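Illustrative output only; the job IDs, names, and user below are invented for the example:

    $ squeue
    JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    10231     128GB    align  s123456  R    1:02:33      1 Nucleus014
    10232       GPU   render  s123456 PD       0:00      1 (Resources)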

squeue -l shows more information.

Cancelling a job
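The original slide was a screenshot; the usual SLURM workflow is to find the job ID with squeue, then cancel with scancel (the job ID below is a placeholder):

    squeue -u username   # list your own jobs
    scancel 10231        # cancel the job with this ID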

Submitting a job with sbatch
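These slides were screenshots; in outline, submission looks like this (the script name and job ID are placeholders):

    $ sbatch myscript.sh
    Submitted batch job 10233
    $ squeue -u username   # check that the job is pending or running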