Beacon Quickstart Guide at AACE/NICS


Beacon Intel MIC Cluster

Beacon Overview
- Each compute node has two 8-core Intel Xeon E5 processors and 256 GB of RAM.
- All compute nodes also contain 4 KNC cards (mic0/1/2/3) with 8 GB of RAM each, which can be accessed directly using micssh.
- Throughout this document, "beacon#" represents the name of a generic Beacon compute node and should be replaced with the actual node name, while "beacon#-mic#" represents the name of a MIC on a compute node.
- A queuing system is in place that gives users their own compute nodes, which helps prevent users from accessing the same MIC resource at the same time.

MIC Programming Models

Native Mode
- All code runs directly on the MIC card.
- Any libraries used need to be recompiled for native mode.
- To compile a program for native mode, use the compiler flag -mmic.
- Parallelism across the cores is typically achieved through threads.
- The executable, input files, and all libraries need to be copied over to the MIC card.
- The location of all native mode libraries, custom or provided by a module, needs to be added to the LD_LIBRARY_PATH environment variable.

Offload Mode
- Code starts running on the host.
- Parallel regions of code are specified to run on the MIC using pragmas/directives.
- Data is either copied to the card explicitly or implicitly (implicit copying is used for complex data types involving pointers and is only available in C++).
- Automatic Offload (AO) is available for certain MKL functions:
  ?gemm, ?trsm, ?trmm, ?potrf, ?geqrf, and ?getrf
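Automatic Offload needs no source changes beyond calling a supported MKL routine on a sufficiently large problem; it is switched on at run time with the MKL_MIC_ENABLE environment variable. The sketch below is illustrative only (the file name ao_dgemm.c and the matrix size are choices made here, not part of this guide) and assumes an MKL-enabled Intel compiler environment:

    /* ao_dgemm.c - exercises MKL Automatic Offload with a large DGEMM.
       Build a host binary:   icc -mkl -o ao_dgemm ao_dgemm.c
       Enable AO and run:     export MKL_MIC_ENABLE=1
                              ./ao_dgemm                                   */
    #include <stdio.h>
    #include <stdlib.h>
    #include <mkl.h>

    int main(void)
    {
        const int n = 4096;   /* illustrative size; AO only triggers for large problems */
        double *a = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
        double *b = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
        double *c = (double *)mkl_malloc((size_t)n * n * sizeof(double), 64);
        size_t i;

        if (a == NULL || b == NULL || c == NULL) {
            fprintf(stderr, "allocation failed\n");
            return 1;
        }
        for (i = 0; i < (size_t)n * n; i++) {
            a[i] = 1.0;
            b[i] = 2.0;
            c[i] = 0.0;
        }

        /* dgemm is one of the routines eligible for Automatic Offload */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, a, n, b, n, 0.0, c, n);

        printf("c[0] = %f\n", c[0]);

        mkl_free(a);
        mkl_free(b);
        mkl_free(c);
        return 0;
    }

Setting OFFLOAD_REPORT, described later in this guide, should also show whether any work was actually sent to a coprocessor.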

Access and Login

Beacon can be accessed via SSH:

    ssh username@beacon.nics.utk.edu

NOTE: there is only one production Beacon system; beacon-login1.nics.utk.edu is now Beacon. The PASSCODE is your 4-digit PIN followed by the 6-digit number displayed on the OTP token. Once connected to Beacon, you will be placed on the login node.

Preventative Maintenance

Unless otherwise stated in the Message of the Day upon logging into Beacon, Beacon is scheduled for preventative maintenance every Wednesday from 8am to 12pm EST/EDT.

Compiling

Upon connecting to beacon-lgn, all Intel compilers should be available for use immediately. Only the Intel compilers support the Intel MIC architecture at this time.

    Language    Intel Compiler / MPI Wrapper
    C           icc / mpiicc
    C++         icpc / mpiicpc
    Fortran     ifort / mpiifort

Notes about configure scripts: when trying to build an application/library for native mode use with configure, the environment should be set up to use the proper compiler flag:

    export CC="icc -mmic"     (or "mpiicc -mmic")
    export CXX="icpc -mmic"   (or "mpiicpc -mmic")
    export F77="ifort -mmic"  (or "mpiifort -mmic")
    export F90="ifort -mmic"  (or "mpiifort -mmic")

Sometimes the script tries to run test binaries on the host. Since native MIC binaries cannot run on the host, the script will complain and exit. One workaround is to force a cross-compilation using --host=x86_64-k1om-linux. If the --host option is not available, or the script still exits, then the configure script needs to be fooled with a dummy flag first, and the flag then needs to be changed back before running make.
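When a build mixes host and native objects, it can be hard to tell which target a given translation unit was compiled for. The Intel compiler defines the __MIC__ preprocessor macro whenever it compiles for the coprocessor (for example under -mmic). The minimal sketch below uses this; the file name which_target.c is just an illustration:

    /* which_target.c - report whether this binary was built for the MIC or the host */
    #include <stdio.h>

    int main(void)
    {
    #ifdef __MIC__
        /* __MIC__ is defined by the Intel compiler when compiling for the coprocessor */
        printf("Built for native execution on the Intel MIC coprocessor.\n");
    #else
        printf("Built for the Xeon host.\n");
    #endif
        return 0;
    }

Compiled with icc -mmic, the resulting binary prints the first message when run on a KNC card; compiled with plain icc, it prints the second message on the host.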

3 export CC="icc -DMMIC" export CXX="icpc -DMMIC" export F77="ifort -DMMIC" export F90="ifort -DMMIC" Run configure files=$(find./* -name Makefile) perl -p -i -e 's/-dmmic/-mmic/g' $files export CC="icc -mmic" export CXX="icpc -mmic" export F77="ifort -mmic" export F90="ifort -mmic" Run make The compiler flag may affect files other than the Makefiles, and you may need to adjust them manually. You can find them using grep grep -R DMMIC./ In some instances, you may need to specify the native mode linker and archiver export LD="/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-ld" export AR="/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-ar"

Requesting Compute Nodes with the Scheduler

While compiling should be done on the login nodes, actual computations should be done on the compute nodes. Compute nodes are named beacon#, and their corresponding MICs are beacon#-mic0 through beacon#-mic1.

- Users can request an interactive session on a compute node using
    qsub -I -A ACCOUNT_NAME
- Users can get more than one node by using
    qsub -I -A ACCOUNT_NAME -l nodes=#
- By default, interactive jobs last 1 hour, but users can request more time by using
    qsub -I -A ACCOUNT_NAME -l nodes=#,walltime=hh:mm:ss
- An example ACCOUNT_NAME would be UT-AACE.
- Users can also submit jobs using a submission script.

Sample Submission Script

    #!/bin/bash
    #PBS -N jobname
    #PBS -A ACCOUNT_NAME
    #PBS -l nodes=1
    #PBS -l walltime=2:00:00

    # Change to the directory from which the script was submitted
    cd $PBS_O_WORKDIR

    # Run the executable
    ./program arguments

File Systems

There are 3 file systems on Beacon:
1. NFS home space at /nics/[a-d]/home/$USER
2. Lustre scratch space at /lustre/medusa/$USER
3. Local SSD scratch space at $TMPDIR

- The login node only has access to the NFS and Lustre scratch spaces.
- All compute nodes have access to all three file systems, with each compute node having its own unique local SSD scratch space.
- The MICs only have access to the Lustre and local SSD scratch spaces.

TMPDIR

The environment variable TMPDIR is created for users once compute nodes have been allocated through the queuing system. Its absolute path is determined by the job id assigned by the scheduler. The compute nodes mount the local SSD scratch space at $TMPDIR. Given the speed of the SSD drives, using $TMPDIR is preferable to using the Lustre scratch space.

On compute nodes, unique temporary directories are found at $TMPDIR/mic0, $TMPDIR/mic0/lib, and $TMPDIR/mic0/bin to aid in copying files to the KNC cards. A similar directory structure exists for mic1: $TMPDIR/mic1, $TMPDIR/mic1/lib, and $TMPDIR/mic1/bin. The first and second KNC cards mount these directories, respectively.

For mic0, if the local SSD scratch space is to be used, the native mode binary should be copied to $TMPDIR/mic0, and native mode libraries should be copied to $TMPDIR/mic0/lib. Native mode MPI and OpenMP libraries are copied by default; all other libraries, including those from modules, need to be copied manually. If there are additional utility binaries, they can be copied to $TMPDIR/mic0/bin. Similar file transfers can be made to mic1, if necessary. Alternatively, the Lustre scratch space can be used directly if any issues are found using $TMPDIR.

[Diagram: the compute node beacon# mounts the local SSD at $TMPDIR; the coprocessors beacon#-mic0 and beacon#-mic1 mount $TMPDIR/mic0 and $TMPDIR/mic1, respectively, as their own $TMPDIR.]
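Programs can also locate the per-job SSD scratch area themselves by reading TMPDIR from the environment. The following sketch is only an illustration (the output file name results.txt is hypothetical); remember that anything left in $TMPDIR disappears when the job ends:

    /* write_to_tmpdir.c - write an output file into the per-job SSD scratch space */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *tmpdir = getenv("TMPDIR");   /* set by the scheduler for the job */
        char path[1024];
        FILE *fp;

        if (tmpdir == NULL) {
            fprintf(stderr, "TMPDIR is not set; run this inside a job\n");
            return 1;
        }
        snprintf(path, sizeof(path), "%s/results.txt", tmpdir);   /* hypothetical file name */

        fp = fopen(path, "w");
        if (fp == NULL) {
            perror(path);
            return 1;
        }
        fprintf(fp, "output written to the local SSD scratch space\n");
        fclose(fp);

        printf("wrote %s\n", path);
        return 0;
    }

Copy anything you need to keep from $TMPDIR back to your home directory or the Lustre scratch space before the job finishes.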

Custom Beacon Scripts

Any secure communication with a MIC requires unique ssh keys, which are automatically generated once the scheduler assigns compute nodes. Custom scripts have been created to use these ssh keys, which prevents prompts asking users for passwords.

    Traditional Command    Custom Beacon Script
    ssh                    micssh
    scp                    micscp
    mpirun/mpiexec         micmpiexec

Running Jobs and Copying Files to the KNC Cards

After compiling source code, request a compute node to run the executable. Once connected to a compute node, offload mode executables can be run directly. Native mode executables require manual copying of libraries, binaries, and input data to either the SSD scratch space

    cp native_program.mic $TMPDIR/mic0
    cp necessary_library.so $TMPDIR/mic0/lib

or the Lustre scratch space, with a folder_name of your choice:

    mkdir /lustre/medusa/$USER/folder_name
    cp native_program.mic /lustre/medusa/$USER/folder_name
    mkdir /lustre/medusa/$USER/folder_name/lib
    cp necessary_library.so /lustre/medusa/$USER/folder_name/lib

Once files are copied over, direct access to a KNC card is available through the micssh command:

    micssh beacon#-mic0

To see the files that were copied to the local SSD scratch space, change directory to $TMPDIR:

    cd $TMPDIR
    ls

If native mode libraries were copied to the Lustre scratch space, then LD_LIBRARY_PATH needs to be modified accordingly:

    export LD_LIBRARY_PATH=/lustre/medusa/$USER/folder_name/lib:$LD_LIBRARY_PATH

After the native mode application is run, type exit to return to the compute node host. Output files located on the local SSD scratch space can then be copied from $TMPDIR/mic0 and/or $TMPDIR/mic1 to the user's home directory or to the Lustre scratch space. Files not copied from the local SSD scratch space will be lost once the interactive session is over.

If you are planning to run MPI on MICs on multiple nodes using the local SSD scratch space, you also need to copy files to the MICs you plan to use on the other assigned compute nodes. This can be done manually by first determining which nodes you have been assigned using cat $PBS_NODEFILE and then, for each assigned node, copying the necessary files using micssh or micscp:

    micssh beacon# cp absolute_path/file_to_copy $TMPDIR/mic#

or

    micscp absolute_path/file_to_copy beacon#:$TMPDIR/mic#

Instead of doing this manually, the custom allmicput script can be used.

Allmicput

The allmicput script can easily copy files to $TMPDIR on all assigned MICs.

Usage: allmicput [[-t] FILE...] [-l LIBRARY...] [-x BINARY...] [-d DIR FILE...]

Copy the listed files to the corresponding directory on every MIC card in the current PBS job.

    [-t] FILE...      the specified file(s) are copied to $TMPDIR on each MIC
    -T LISTFILE       the files in LISTFILE are copied to $TMPDIR on each MIC
    -l LIBRARY...     the specified file(s) are copied to $TMPDIR/lib on each MIC
    -L LISTFILE       the files in LISTFILE are copied to $TMPDIR/lib on each MIC
    -x BINARY...      the specified file(s) are copied to $TMPDIR/bin on each MIC
    -X LISTFILE       the files in LISTFILE are copied to $TMPDIR/bin on each MIC
    -d DIR FILE...    the specified file(s) are copied to $TMPDIR/DIR on each MIC
    -D DIR LISTFILE   the files in LISTFILE are copied to $TMPDIR/DIR on each MIC

Native Mode Shared Libraries

Unless provided by a module, all shared libraries need to be recompiled for native mode use.

1. Compile the library source code:
    icc -mmic -c -fpic mylib.c
    icpc -mmic -c -fpic mylib.cpp
    ifort -mmic -c -fpic mylib.f90
2. Use the -shared compiler flag to create the library from the object file:
    icc -mmic -shared -o libmylib.so mylib.o
    icpc -mmic -shared -o libmylib.so mylib.o
    ifort -mmic -shared -o libmylib.so mylib.o
3. Compile and link the native application code with the native shared object:
    icc -mmic main.c libmylib.so
    icpc -mmic main.cpp libmylib.so
    ifort -mmic main.f90 libmylib.so
4. Copy the binary and library over to the MIC before executing:
    cp a.out $TMPDIR/mic#
    cp libmylib.so $TMPDIR/mic#/lib

(A minimal example of what mylib.c and main.c might contain is sketched at the end of this section.)

The location of all native mode libraries, custom or provided by a module, needs to be added to the LD_LIBRARY_PATH environment variable.

Debugging

Intel debuggers are available for both the host and the KNC cards.
- idbc is the command line debugger for the host.
- micidbc is the command line debugger for the KNC cards.
    Usage: micidbc -wdir $TMPDIR -tco -rconnect=tcpip:mic0:2000
    Refer to the Debugging on Beacon Lab for further details.
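The shared-library steps above refer to mylib.c and main.c without listing their contents. A minimal, hypothetical pair that is consistent with those commands (the function add_numbers is invented here purely for illustration) could be:

    /* mylib.c - trivial library source compiled into libmylib.so */
    double add_numbers(double a, double b)
    {
        return a + b;
    }

    /* main.c - application code linked against libmylib.so */
    #include <stdio.h>

    double add_numbers(double a, double b);   /* would normally come from a header */

    int main(void)
    {
        printf("2 + 3 = %f\n", add_numbers(2.0, 3.0));
        return 0;
    }

Built with the C commands in steps 1-3 and copied over as in step 4, the resulting a.out runs on the coprocessor provided the directory holding libmylib.so is on the coprocessor's LD_LIBRARY_PATH, as noted above.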

Modules

The modules software package is installed on Beacon; it allows you to dynamically modify your user environment by using modulefiles. Typical uses of modulefiles include adjusting the PATH and LD_LIBRARY_PATH environment variables for use with a particular module. Below are some commands for working with modules:

    Command          Description
    module list      Show what modules are currently loaded
    module avail     Show what modules can be loaded
    module load      Load a module
    module unload    Unload a module
    module swap      Swap a currently loaded module for an unloaded module
    module help      Display a description of the module
    module show      Display how a module would affect the environment if it were loaded

Documentation and Sample Code

Official Intel documentation can always be found at /global/opt/intel/composerxe/documentation/en_us. Intel's sample codes can always be found at /global/opt/intel/composerxe/samples/en_us. More detailed information on how to program for the MIC can be found at Intel's website.

Native Mode Example

We will take a simple OpenMP code that calculates pi and run it on a MIC card on Beacon.

1. SSH into Beacon.
2. Request a compute node.
3. Make a folder if you wish and change directory to it.
4. Using your favorite text editor, create the file omp_pi_native.c from the following:

    #include <stdio.h>

    int main ()
    {
        int num_steps = 1000000;   /* number of integration steps (value chosen here; any large value works) */
        double step;
        int i;
        double pi, sum = 0.0;

        step = 1.0/(double) num_steps;

        #pragma omp parallel for reduction(+:sum)
        for (i = 1; i <= num_steps; i++) {
            double x = (i - 0.5)*step;
            sum = sum + 4.0/(1.0 + x*x);
        }
        pi = step * sum;

        printf("pi is calculated to be = %f\n", pi);
        return 0;
    }

5. Compile for native use on the MIC:
    icc -openmp -mmic -o omp_pi_native omp_pi_native.c
6. Copy omp_pi_native to mic0:
    cp omp_pi_native $TMPDIR/mic0
7. SSH to mic0 via micssh:
    micssh beacon#-mic0
8. Change to the TMPDIR directory:
    cd $TMPDIR
9. Set the environment variable OMP_NUM_THREADS to specify the number of threads to be used:
    export OMP_NUM_THREADS=<number of threads>
10. Run the executable:
    ./omp_pi_native
11. Exit the ssh session and return to the host:
    exit

Offload Mode Example

We will take the previous OpenMP code that calculates pi and specify a parallel region to run on a MIC card on Beacon via the offload target(mic) pragma.

1. SSH into Beacon.
2. Make a folder if you wish and change directory to it.
3. Using your favorite text editor, create the file omp_pi_offload.c from the following:

    #include <stdio.h>

    int main ()
    {
        int num_steps = 1000000;   /* number of integration steps (value chosen here; any large value works) */
        double step;
        int i;
        double pi, sum = 0.0;

        step = 1.0/(double) num_steps;

        #pragma offload target(mic)
        #pragma omp parallel for reduction(+:sum)
        for (i = 1; i <= num_steps; i++) {
            double x = (i - 0.5)*step;
            sum = sum + 4.0/(1.0 + x*x);
        }
        pi = step * sum;

        printf("pi is calculated to be = %f\n", pi);
        return 0;
    }

4. Compile for offload mode use on the MIC:
    icc -openmp -o omp_pi_offload omp_pi_offload.c
5. Request a compute node.
6. Change to the directory containing omp_pi_offload.
7. Set the environment variables MIC_ENV_PREFIX and MIC_OMP_NUM_THREADS to specify the number of threads to be used:
    export MIC_ENV_PREFIX=MIC
    export MIC_OMP_NUM_THREADS=30
8. Run the binary:
    ./omp_pi_offload
9. Try setting the environment variable OFFLOAD_REPORT to 2 and run it again:
    export OFFLOAD_REPORT=2
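To confirm that the MIC_ environment settings from step 7 actually reach the coprocessor, a small offload region can report how many OpenMP threads it runs with. This is only a sketch (the file name check_threads.c is illustrative) and is compiled the same way as step 4 (icc -openmp):

    /* check_threads.c - report the OpenMP thread count inside an offloaded region */
    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int nthreads = 0;

        /* nthreads is a scalar, so it is copied back from the coprocessor by default */
        #pragma offload target(mic)
        {
            #pragma omp parallel
            {
                #pragma omp single
                nthreads = omp_get_num_threads();
            }
        }

        printf("offloaded region ran with %d OpenMP threads\n", nthreads);
        return 0;
    }

With MIC_ENV_PREFIX and MIC_OMP_NUM_THREADS set as in step 7, the printed count should match MIC_OMP_NUM_THREADS.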

Offload Mode Example Using Shared Libraries

This example is similar to the previous one, only now the function calc_pi will be called from a library.

First, make a file named calc_pi.h from the following:

    #ifndef CALC_PI_H
    #define CALC_PI_H

    // function prototype for calc_pi
    // Note how this function is marked for use with the Intel MIC coprocessor
    __attribute__((target(mic))) double calc_pi(int num_steps);

    #endif

Next, make a file named calc_pi.c from the following:

    #include "calc_pi.h"

    double calc_pi(int num_steps)
    {
        double step;
        int i;
        double pi, sum = 0.0;

        step = 1.0/(double) num_steps;

        #pragma omp parallel for
        for (i = 1; i <= num_steps; i++) {
            double x = (i - 0.5)*step;
            #pragma omp critical
            sum = sum + 4.0/(1.0 + x*x);
        }
        pi = step * sum;
        return pi;
    }

Now make a file named calc_pi_shrd.c from the following:

    #include <stdio.h>
    #include "calc_pi.h"

    int main ()
    {
        int num_steps = 1000000;   /* number of integration steps (value chosen here; any large value works) */
        double pi;

        #pragma offload target(mic) out(pi)
        {
            // Call the function found in the libcalc_pi.so library
            pi = calc_pi(num_steps);
        }

        printf("pi is calculated to be = %f\n", pi);
        return 0;
    }

On the login node, compile using:

    1. icc -c -fpic -openmp calc_pi.c
    2. icc -shared -o libcalc_pi.so calc_pi.o
    3. icc -L. -openmp -o calc_pi_shrd calc_pi_shrd.c -lcalc_pi

Request a compute node, change directory to where calc_pi_shrd is located, and then run using:

    1. export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:.
    2. ./calc_pi_shrd

Asynchronous Offload Example

Create the file async_offload_example.c from the following:

    #include <stdio.h>
    #include <stdlib.h>

    // NOTE: You will want to set OFFLOAD_REPORT=2 to see the offload in action

    #define N 1000000   /* array length (value chosen here for illustration) */

    int main()
    {
        char sv;
        int i;
        int *p;

        p = (int*)calloc(N, sizeof(int));

        // initialize p
        for (i = 0; i < N; i++)
            p[i] = -i;

        printf("\non host p[20] = %d\n", p[20]);
        fflush(stdout);

        // offload function to MIC
        // can use in, out, inout, nocopy
        #pragma offload target(mic:0) inout(p:length(N)) signal(&sv)
        for (i = 0; i < N; i++)
            p[i] = 2*p[i];

        // immediately returns to do CPU calcs
        for (i = 0; i < N; i++)
            p[i] = -p[i];

        printf("\non host after computation p[20] = %d\n", p[20]);
        fflush(stdout);

        // stops until offload completes and sends back
        #pragma offload_wait target(mic:0) wait(&sv)

        // now, the host's value should change
        printf("\non host after offload completes p[20] = %d\n", p[20]);
        fflush(stdout);

        // free mem on host
        free(p);
        return(0);
    }

This example allows the CPU and MIC to do work simultaneously. On the login node, compile using

    icc -o async_offload_example async_offload_example.c

and run the binary.

Note: The Fortran equivalent needs to initialize the signal variable sv to a unique integer value (one that is not the same as any other signal variable) greater than or equal to 1.

Asynchronous Offload Transfer Example

Create the file async_offload_transfer_example.c from the following:

    #include <stdio.h>
    #include <stdlib.h>

    #define ALLOC alloc_if(1) free_if(0)
    #define FREE  alloc_if(0) free_if(1)
    #define REUSE alloc_if(0) free_if(0)

    // NOTE: You will want to set OFFLOAD_REPORT=2 to see the offload in action

    #define N 1000000   /* array length (value chosen here for illustration) */

    int main()
    {
        char sv, sv1;
        int i;
        int *p, *q;

        p = (int*)calloc(N, sizeof(int));
        q = (int*)calloc(N, sizeof(int));

        // now, let's look at allocating and freeing memory on the MIC,
        // which can also be done asynchronously

        // will allocate mem on mic:0 without data copy
        #pragma offload_transfer target(mic:0) nocopy(p,q:length(N) ALLOC) signal(&sv)

        // now, you can be doing CPU work, since this returns immediately
        // initialize p,q
        for (i = 0; i < N; i++) {
            p[i] = -i;
            q[i] = -i;
        }

        printf("\non host during data transfer p[20] = %d, q[20] = %d\n", p[20], q[20]);
        fflush(stdout);

        // now, do offload, copy in p, out q
        #pragma offload target(mic:0) in(p:length(N) REUSE) out(q:length(N) REUSE) wait(&sv) signal(&sv1)
        for (i = 0; i < N; i++)
            q[i] = -p[i];

        // will free mem on mic:0 without data copy
        #pragma offload_transfer target(mic:0) nocopy(p,q:length(N) FREE) wait(&sv1)

        printf("\non host after offload p[20] = %d, q[20] = %d\n", p[20], q[20]);
        fflush(stdout);

        // free mem on host
        free(p);
        free(q);
        return(0);
    }

Compile similarly to the previous example and then run the resulting binary. In this example, the second offload waits until the first one finishes; during that time the CPU is initializing the values of p and q locally, while mic0 is allocating memory.

Intel MPI on the MIC Architecture

Access to the Intel MPI tools and libraries on Beacon is managed through the "module" system. The intel-mpi module is loaded by default upon login. Part of the Intel MPI environment is the "mpiicc" command, which ensures that icc is invoked with the necessary options for MPI. Note: for Fortran, do NOT use mpif90, but rather mpiifort.

The simple mpi_hello.c example below demonstrates how to compile and run MPI applications on Beacon:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        char name[64];
        int rank, size, length;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &length);

        printf("hello, World. I am %d of %d on %s\n", rank, size, name);

        MPI_Finalize();
        return 0;
    }

To compile a version that runs on the host system:

    mpiicc -o mpi_hello mpi_hello.c

To compile a version that runs on a MIC card:

    mpiicc -mmic -o mpi_hello.mic mpi_hello.c

The .mic suffix is there to indicate that the binary is to be executed on a MIC card. Any suffix can be used, but a separate binary must be compiled.

Now that the binaries are created, compute nodes can be requested and the MPI applications can be launched. The command "micmpiexec" can be used to launch the MPI program on the host (Xeon) node by specifying the host node:

    micmpiexec -n 2 -host beacon# ./mpi_hello

In order to run it on a MIC, you must first copy the correct binary to the card(s):

    cp mpi_hello.mic $TMPDIR/mic0/
    cp mpi_hello.mic $TMPDIR/mic1/

and/or, if using MICs not on the working node:

    micssh beacon# cp mpi_hello.mic $TMPDIR/mic0
    micssh beacon# cp mpi_hello.mic $TMPDIR/mic1

In order to tell "mpiexec" to use the "micssh" command to access the MIC, we use the "micmpiexec" command to run the MPI program:

    micmpiexec -n 2 -host beacon#-mic0 -wdir $TMPDIR $TMPDIR/mpi_hello.mic

And to use both MIC cards:

    micmpiexec -n 2 -wdir $TMPDIR -host beacon#-mic0 $TMPDIR/mpi_hello.mic : -n 2 -wdir $TMPDIR -host beacon#-mic1 $TMPDIR/mpi_hello.mic

MPI can also be used in heterogeneous mode, utilizing both the Xeon host and one or more MIC cards from any node you have been allocated:

    micmpiexec -n 2 -wdir $TMPDIR -host beacon#-mic0 $TMPDIR/mpi_hello.mic : -n 2 -host beacon# ./mpi_hello

    micmpiexec -n 2 -wdir $TMPDIR -host beacon#-mic0 $TMPDIR/mpi_hello.mic : -n 2 -wdir $TMPDIR -host beacon#-mic1 $TMPDIR/mpi_hello.mic : -n 2 -host beacon# ./mpi_hello

    micmpiexec -n 2 -host beacon# ./mpi_hello.mic : -n 2 -host beacon# ./mpi_hello.mic

Using Custom Native Libraries with MPI

If custom native libraries are to be used, they should be copied to $TMPDIR/mic#/lib as described earlier. If the application is then launched using micmpiexec, the environment should already be properly set to use these libraries.

MPI Machine File

Instead of listing all the MPI hosts by hand on the command line, a machine file can be created and used. The contents of the machine file should be of the form

    <host>:<number of ranks>

The following is an example machine file named hosts_file:

    beacon11:8
    beacon12:8
    beacon11-mic0:2
    beacon11-mic1:2
    beacon12-mic0:2
    beacon12-mic1:2

This machine file could be used to launch an MPI application with

    micmpiexec -machinefile hosts_file -n 16 ./application : -wdir $TMPDIR -n 8 $TMPDIR/application.mic

generate-mic-hostlist

A custom script named generate-mic-hostlist has been created for Beacon that generates machine files for you:

    generate-mic-hostlist TYPE NUM_MIC NUM_XEON > machines

where
- TYPE is offload, micnative, or hybrid
  (Note: if TYPE=offload, then the generated machine file is simply all the nodes the scheduler has assigned, each listed once.)
- NUM_MIC is the number of MPI ranks to place on each MIC
- NUM_XEON is the number of MPI ranks to place on each CPU host
- machines is the name of the machine file to be created

Getting Help

If you need assistance using the Beacon resources, or have any questions, comments, suggestions, or concerns regarding the use of Beacon, please send an email to help@nics.utk.edu.

Issues to Look Out For

If, at any time, you experience any of the following issues, please report them in an email to help@nics.utk.edu. In most cases, simply resubmitting your job will work, but we still need to know about the issues encountered.

- Failure to mount any directory. Example:

    mount: mounting beacon1:/lustre/medusa/user on /lustre/medusa/user failed: Device or resource busy

- The number of available nodes is 0. Use showq to see the status of nodes and any reservations that might be present. Example:

    [user@beacon-lgn lib]$ showq
    active jobs
    JOBID  USERNAME  STATE    PROCS  REMAINING
    ####   user1     Running  16     1:42:17
    ####   user2     Running  16     18:22:25
    ####   user3     Running  64     5:34:07
    ####   user4     Running  32     11:34:07
    ####   user5     Running  32     5:48:04

    5 active jobs
    80 of 80 processors in use by local jobs (100.00%)
    4 of 5 nodes active (80.00%)

- Your job is sitting in the queue or an interactive job is not starting. Please refer to the MOTD for information about maintenance (PM or EM). This is usually on Wednesday and occasionally on Tuesday when the file system is down. If you are submitting an interactive job with a walltime that crosses into the reservation for PM, it will not give you a node. Try submitting with a shorter walltime.

  Example MOTD:

    Preventative maintenance (PM) will be performed on Beacon every Wednesday from 8am to noon Eastern time unless noted otherwise.
    The MIC driver on the compute nodes has been updated to a new version.
    The PM for 4/3 has been cancelled.

- You get the following message when a job is submitted:

    Warning: Cannot access allocation software. Please contact help@xsede.org if you need assistance.

  This is due to a scheduler/batch incompatibility issue that will be fixed once we obtain a new license. This does NOT mean your job didn't go through. Please ignore this message.

This material is based upon work supported by the National Science Foundation under Grant Number 1137097. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
