Practice of Software Development: Dynamic scheduler for scientific simulations

Size: px
Start display at page:

Download "Practice of Software Development: Dynamic scheduler for scientific simulations"

Transcription

1 Practice of Software Development: Dynamic scheduler for scientific SimLab EA Teilchen STEINBUCH CENTRE FOR COMPUTING - SCC KIT Universität des Landes Baden-Württemberg und nationales Forschungszentrum in der Helmholtz-Gemeinschaft

2 Scheduler is a user-friendly middleware between application and cluster scheduling system with graphical (in- and) output interfaces; applicable for scientific applications with high number of identical independent tasks in one computational job; can be integrated into particular scientific applications: CORSIKA (COsmic Ray SImulations for KAscade) code for simulation of extensive air showers initiated by high energy cosmic ray particles; Simulation code for Atomic Clusters; PolGrawAllSky code searching for periodic gravitational wave signals. 2 Steinbuch Centre for Computing

3 communicator Scheduler modules executer interface code interface scheduler Scheduling Data Mining queue bookkeeping statistics visualization system Possible system events: executer interface: send: execute task; receive: task finished. code interface: receive: new task. Coupling types: - task data; - schedule information; - communication. where task is a pair of: Data Type (e.g. structure, array); Input Data of the specified type. 3 Steinbuch Centre for Computing

4 communicator Scheduler modules: advanced version GUI executer interface code interface scheduler Scheduling queue Data Mining bookkeeping statistics visualization system GUI Graphical user interface for the input data entering and batch script generation for particular batch system; can be based on X-Win API and batch system API e.g. MOAB API; 4 Steinbuch Centre for Computing

5 communicator System events 3.1* executer interface 4 Scheduling queue code interface scheduler 4 Data Mining 3 bookkeeping statistics visualization system * - advanced: if local queue is empty check other schedulers for a task; new task signal Possible system events: 1. receive new task from code interface; 2. send task to data mining module; 3. analyse task according to collected statistics; 4. send estimated task weight to scheduler; 5. schedule task (place it in the queue); task finished signal 1. receive finished task from executer interface; 2. store task information to the database; 3. take new task from the queue; 4. send execute task signal to executer interface; 5 Steinbuch Centre for Computing

6 communicator Module scheduling + communicator executer interface code interface Scheduling Module: Scheduling queue contains a variety of scheduling algorithms and strategies; queries Data Mining module, how much resources the task needs; Data Mining bookkeeping statistics makes a decision how to place a task in a schedule (queue / run immediately); does bookkeeping; Communicator: has to organize a multi-level communication among multiple parallel schedulers (one computation several processes multiple tasks); communication can be organized e.g. as following: submit new multiprocessor jobs; change the size of your parallel world; fix the size of parallel world and distribute tasks among reserved processes; 6 Steinbuch Centre for Computing

7 Ordering resources when required submit new multiprocessor jobs using shell script generation or cluster submission system API (e.g. Moab-API): batching clusters; Interfacing with Moab (APIs); use MPI 3.x mechanism to change the size of you parallel world (the number of processes you are using for computation) according to current requirements: MPI 3.1 Standard; MPI-dynamicprocesses.pdf; advanced topics in MPI chapter 9; 7 Steinbuch Centre for Computing

8 Fixed amount of resources (size of MPI World) Single-queue master-managed scheduling: one process is a master and does the scheduling, others are slaves and do the work; - master is a bottleneck and can + relatively easy to implement; serve only one slave at a time; Multiple-queues scheduling (preferable) each process has its own queue of tasks, tasks can be re-distributed among queues; - difficult to implement synchronisation + no bottlenecks all processes many-to-many should be organized; are doing equal algorithm; 8 Steinbuch Centre for Computing

9 Single-queue master-managed scheduling Each process can request master to access the queue Rules can be:... tasks queue LIFO, FIFO; Long-first, Short-first; p0 master Round-Robin; Statistics-based. p1 p2 p3 pn slaves See more about scheduling strategies in the following Doctoral theses: chapter 2.2 from shed2.pdf (good basic overview); chapters 2 and 3 from shed1.pdf; chapters from shed3.pdf; 9 Steinbuch Centre for Computing

10 Multiple-queues scheduling Each process has its own queue, and does the same task: task queues p0 p1 p2 p3 pn processes The queue managing can be one of the following: Random or rule-based task distribution; Statistics-based task distribution; Task stealing (see next 2 slides). 10 Steinbuch Centre for Computing

11 Task stealing algorithm (equal for all processes) begin task check own q no task Blocking operation on own queue. Check whether there is a task and if yes - take it (and remove form q). no task for Ɐ p in other processes check p s q all procs are idle yes no Blocking operation on p s queue. Check whether there is a task and if yes - take it (and remove form q). task work (task) end Work cycle: the execute function of the executer interface with input parameters taken from task is called here. During the work cycle one or several new tasks may be added to the own queue. 11 Steinbuch Centre for Computing

12 Task stealing algorithm intermediate state Some processes are working, some are stealing; task queues p0 p1 p2 p3 pn processes pi - working process - own queue request - task in the queue pj - idle process - other queue request - empty queue slot How to organize stealing on distributed-memory systems see in: Cornell Virtual Workshop on MPI one-sided communication ; The article One-Sided Communications with MPI-2 in Linux Magazine; 12 Steinbuch Centre for Computing

13 Module Data mining Data Mining Advanced: statistics is called from Scheduling Module; analyses the statistics to define resource requirements for a task; keeps statistics for completed tasks; accumulates statistics of the equal computations (same scientific code) in a single database. polynomial approximation of collected statistics (e.g. once at the end of computation); using single function instead of on-line access to the database and data analysis on each step; 13 Steinbuch Centre for Computing

14 Module Database bookkeeping statistics is used to keep statistics, bookkeeping and local schedule; can either be a parallel-accessed or multiplied among processes with synchronisation during the computational process or accumulation at the end of it; Bookkeeping is done for the following events: task appears (timestamp, task parameters, parent process); task started (timestamp, task parameters, hosting process); task finished (timestamp, task parameters, hosting process); interconnection between processes (timestamp, processes ranks, result); Statistics collected for each task is: the runtime of the task depending on task parameters. 14 Steinbuch Centre for Computing

15 Module Visualization system visualization system Visualization system is used to visually analyse the collected data. Either or exports obtained results into selected format (e.g. Gnuplot format,.vtk format for Paraview or any other preferred format, used by opensource visualization software); works with visualization system API (e.g. Gnuplot API). 15 Steinbuch Centre for Computing

16 Code and executer interfaces The task is a set of two: Data_Type (e.g. structure or array); the Data of this type; Within one computation the Data_Type is fixed, differs only the Data. However, interfaces to different simulation software have different task types. Therefore template programming usage is highly recommended. Simple example can be found at: 16 Steinbuch Centre for Computing

Database Driven Processing and Slow Control Monitoring in EDELWEISS

Database Driven Processing and Slow Control Monitoring in EDELWEISS Database Driven Processing and Slow Control Monitoring in EDELWEISS CRESST / EDELWEISS / EURECA Software Workshop München, 21. Mai 2014 Lukas Hehn, KIT Universität des Landes Baden-Württemberg und nationales

More information

Gatlet - a Grid Portal Framework

Gatlet - a Grid Portal Framework Gatlet - a Grid Portal Framework Stefan Bozic stefan.bozic@kit.edu STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association

More information

Extraordinary HPC file system solutions at KIT

Extraordinary HPC file system solutions at KIT Extraordinary HPC file system solutions at KIT Roland Laifer STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State Roland of Baden-Württemberg Laifer Lustre and tools for ldiskfs investigation

More information

Using file systems at HC3

Using file systems at HC3 Using file systems at HC3 Roland Laifer STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association www.kit.edu Basic Lustre

More information

The KASCADE Cosmic ray Data Center - providing open access to astroparticle physics research data

The KASCADE Cosmic ray Data Center - providing open access to astroparticle physics research data The KASCADE Cosmic ray Data Center - providing open access to astroparticle physics research data Dr. Benjamin Fuchs Karlsruhe Institute of Technology Today s menu The right tool for the job Why a web

More information

Hands-On Workshop bwunicluster June 29th 2015

Hands-On Workshop bwunicluster June 29th 2015 Hands-On Workshop bwunicluster June 29th 2015 Agenda Welcome Introduction to bwhpc and the bwunicluster Modules - Software Environment Management Job Submission and Monitoring Interactive Work and Remote

More information

Parallel Tools Platform for Judge

Parallel Tools Platform for Judge Parallel Tools Platform for Judge Carsten Karbach, Forschungszentrum Jülich GmbH September 20, 2013 Abstract The Parallel Tools Platform (PTP) represents a development environment for parallel applications.

More information

Linked Data for Data Integration based on SWIMing Guideline: Use Cases in DAREED Project

Linked Data for Data Integration based on SWIMing Guideline: Use Cases in DAREED Project Linked Data for Data Integration based on SWIMing Guideline: Use Cases in DAREED Project Hendro Wicaksono, Kiril Tonev, Preslava Krahtova 4th International Workshop on Linked Data in Architecture and Construction

More information

A Framework for a Comprehensive Evaluation of Ant-Inspired Peer-to-Peer Protocols

A Framework for a Comprehensive Evaluation of Ant-Inspired Peer-to-Peer Protocols A Framework for a Comprehensive Evaluation of Ant-Inspired Peer-to-Peer Protocols Amos Brocco Department of Innovative Technologies, University of Applied Science of Southern Switzerland Ingmar Baumgart,

More information

Assistance in Lustre administration

Assistance in Lustre administration Assistance in Lustre administration Roland Laifer STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association www.kit.edu

More information

MDHIM: A Parallel Key/Value Store Framework for HPC

MDHIM: A Parallel Key/Value Store Framework for HPC MDHIM: A Parallel Key/Value Store Framework for HPC Hugh Greenberg 7/6/2015 LA-UR-15-25039 HPC Clusters Managed by a job scheduler (e.g., Slurm, Moab) Designed for running user jobs Difficult to run system

More information

LAMBDA The LSDF Execution Framework for Data Intensive Applications

LAMBDA The LSDF Execution Framework for Data Intensive Applications LAMBDA The LSDF Execution Framework for Data Intensive Applications Thomas Jejkal 1, V. Hartmann 1, R. Stotzka 1, J. Otte 2, A. García 3, J. van Wezel 3, A. Streit 3 1 Institute for Dataprocessing and

More information

Virtualization of the ATLAS Tier-2/3 environment on the HPC cluster NEMO

Virtualization of the ATLAS Tier-2/3 environment on the HPC cluster NEMO Virtualization of the ATLAS Tier-2/3 environment on the HPC cluster NEMO Ulrike Schnoor (CERN) Anton Gamel, Felix Bührer, Benjamin Rottler, Markus Schumacher (University of Freiburg) February 02, 2018

More information

Operating Systems. Computer Science & Information Technology (CS) Rank under AIR 100

Operating Systems. Computer Science & Information Technology (CS) Rank under AIR 100 GATE- 2016-17 Postal Correspondence 1 Operating Systems Computer Science & Information Technology (CS) 20 Rank under AIR 100 Postal Correspondence Examination Oriented Theory, Practice Set Key concepts,

More information

Lessons learned from Lustre file system operation

Lessons learned from Lustre file system operation Lessons learned from Lustre file system operation Roland Laifer STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association

More information

bwfortreff bwhpc user meeting

bwfortreff bwhpc user meeting bwfortreff bwhpc user meeting bwhpc Competence Center MLS&WISO Universitätsrechenzentrum Heidelberg Rechenzentrum der Universität Mannheim Steinbuch Centre for Computing (SCC) Funding: www.bwhpc-c5.de

More information

Reliability, load-balancing, monitoring and all that: deployment aspects of UNICORE. Bernd Schuller UNICORE Summit 2016 June 23, 2016

Reliability, load-balancing, monitoring and all that: deployment aspects of UNICORE. Bernd Schuller UNICORE Summit 2016 June 23, 2016 Mitglied der Helmholtz-Gemeinschaft Reliability, load-balancing, monitoring and all that: deployment aspects of UNICORE Bernd Schuller UNICORE Summit 2016 Outline Clustering recent progress Monitoring

More information

Lecture Topics. Announcements. Today: Advanced Scheduling (Stallings, chapter ) Next: Deadlock (Stallings, chapter

Lecture Topics. Announcements. Today: Advanced Scheduling (Stallings, chapter ) Next: Deadlock (Stallings, chapter Lecture Topics Today: Advanced Scheduling (Stallings, chapter 10.1-10.4) Next: Deadlock (Stallings, chapter 6.1-6.6) 1 Announcements Exam #2 returned today Self-Study Exercise #10 Project #8 (due 11/16)

More information

Table of Contents. Table of Contents Job Manager for remote execution of QuantumATK scripts. A single remote machine

Table of Contents. Table of Contents Job Manager for remote execution of QuantumATK scripts. A single remote machine Table of Contents Table of Contents Job Manager for remote execution of QuantumATK scripts A single remote machine Settings Environment Resources Notifications Diagnostics Save and test the new machine

More information

bwunicluster Tutorial Access, Data Transfer, Compiling, Modulefiles, Batch Jobs

bwunicluster Tutorial Access, Data Transfer, Compiling, Modulefiles, Batch Jobs bwunicluster Tutorial Access, Data Transfer, Compiling, Modulefiles, Batch Jobs Frauke Bösert, SCC, KIT 1 Material: Slides & Scripts https://indico.scc.kit.edu/indico/event/263/ @bwunicluster/forhlr I/ForHLR

More information

bwunicluster Tutorial Access, Data Transfer, Compiling, Modulefiles, Batch Jobs

bwunicluster Tutorial Access, Data Transfer, Compiling, Modulefiles, Batch Jobs bwunicluster Tutorial Access, Data Transfer, Compiling, Modulefiles, Batch Jobs Frauke Bösert, SCC, KIT 1 Material: Slides & Scripts https://indico.scc.kit.edu/indico/event/263/ @bwunicluster/forhlr I/ForHLR

More information

SUG Breakout Session: OSC OnDemand App Development

SUG Breakout Session: OSC OnDemand App Development SUG Breakout Session: OSC OnDemand App Development Basil Mohamed Gohar Web and Interface Applications Manager Eric Franz Senior Engineer & Technical Lead This work is supported by the National Science

More information

Basic Concepts of the Energy Lab 2.0 Co-Simulation Platform

Basic Concepts of the Energy Lab 2.0 Co-Simulation Platform Basic Concepts of the Energy Lab 2.0 Co-Simulation Platform Jianlei Liu KIT Institute for Applied Computer Science (Prof. Dr. Veit Hagenmeyer) KIT University of the State of Baden-Wuerttemberg and National

More information

Tutorial 4: Condor. John Watt, National e-science Centre

Tutorial 4: Condor. John Watt, National e-science Centre Tutorial 4: Condor John Watt, National e-science Centre Tutorials Timetable Week Day/Time Topic Staff 3 Fri 11am Introduction to Globus J.W. 4 Fri 11am Globus Development J.W. 5 Fri 11am Globus Development

More information

Installing and running COMSOL 4.3a on a Linux cluster COMSOL. All rights reserved.

Installing and running COMSOL 4.3a on a Linux cluster COMSOL. All rights reserved. Installing and running COMSOL 4.3a on a Linux cluster 2012 COMSOL. All rights reserved. Introduction This quick guide explains how to install and operate COMSOL Multiphysics 4.3a on a Linux cluster. It

More information

OpenCL: History & Future. November 20, 2017

OpenCL: History & Future. November 20, 2017 Mitglied der Helmholtz-Gemeinschaft OpenCL: History & Future November 20, 2017 OpenCL Portable Heterogeneous Computing 2 APIs and 2 kernel languages C Platform Layer API OpenCL C and C++ kernel language

More information

2/26/2017. For instance, consider running Word Count across 20 splits

2/26/2017. For instance, consider running Word Count across 20 splits Based on the slides of prof. Pietro Michiardi Hadoop Internals https://github.com/michiard/disc-cloud-course/raw/master/hadoop/hadoop.pdf Job: execution of a MapReduce application across a data set Task:

More information

Visualization and Data Analysis using VisIt - In Situ Visualization -

Visualization and Data Analysis using VisIt - In Situ Visualization - Mitglied der Helmholtz-Gemeinschaft Visualization and Data Analysis using VisIt - In Situ Visualization - Jens Henrik Göbbert 1, Herwig Zilken 1 1 Jülich Supercomputing Centre, Forschungszentrum Jülich

More information

Workflow System for Data-intensive Many-task Computing

Workflow System for Data-intensive Many-task Computing Workflow System for Data-intensive Many-task Computing Masahiro Tanaka and Osamu Tatebe University of Tsukuba, JST/CREST Japan Science and Technology Agency ISP2S2 Dec 4, 2014 1 Outline Background Pwrake

More information

CS6401- Operating System QUESTION BANK UNIT-I

CS6401- Operating System QUESTION BANK UNIT-I Part-A 1. What is an Operating system? QUESTION BANK UNIT-I An operating system is a program that manages the computer hardware. It also provides a basis for application programs and act as an intermediary

More information

AstroGrid-D. Advanced Prototype Implementation of Monitoring & Steering Methods. Documentation and Test Report 1. Deliverable D6.5

AstroGrid-D. Advanced Prototype Implementation of Monitoring & Steering Methods. Documentation and Test Report 1. Deliverable D6.5 AstroGrid-D Deliverable D6.5 Advanced Prototype Implementation of Monitoring & Steering Methods Documentation and Test Report 1 Deliverable D6.5 Authors Thomas Radke (AEI) Editors Thomas Radke (AEI) Date

More information

Workshop Agenda Feb 25 th 2015

Workshop Agenda Feb 25 th 2015 Workshop Agenda Feb 25 th 2015 Time Presenter Title 09:30 T. König Talk bwhpc Concept & bwhpc-c5 - Federated User Support Activities 09:45 R. Walter Talk bwhpc architecture (bwunicluster, bwforcluster

More information

ICAT Job Portal. a generic job submission system built on a scientific data catalog. IWSG 2013 ETH, Zurich, Switzerland 3-5 June 2013

ICAT Job Portal. a generic job submission system built on a scientific data catalog. IWSG 2013 ETH, Zurich, Switzerland 3-5 June 2013 ICAT Job Portal a generic job submission system built on a scientific data catalog IWSG 2013 ETH, Zurich, Switzerland 3-5 June 2013 Steve Fisher, Kevin Phipps and Dan Rolfe Rutherford Appleton Laboratory

More information

Today s class. Scheduling. Informationsteknologi. Tuesday, October 9, 2007 Computer Systems/Operating Systems - Class 14 1

Today s class. Scheduling. Informationsteknologi. Tuesday, October 9, 2007 Computer Systems/Operating Systems - Class 14 1 Today s class Scheduling Tuesday, October 9, 2007 Computer Systems/Operating Systems - Class 14 1 Aim of Scheduling Assign processes to be executed by the processor(s) Need to meet system objectives regarding:

More information

Grid Computing Activities at KIT

Grid Computing Activities at KIT Grid Computing Activities at KIT Meeting between NCP and KIT, 21.09.2015 Manuel Giffels Karlsruhe Institute of Technology Institute of Experimental Nuclear Physics & Steinbuch Center for Computing Courtesy

More information

DPA CONTEST 08/09 A SIMPLE IMPROVEMENT OF CLASSICAL CORRELATION POWER ANALYSIS ATTACK ON DES

DPA CONTEST 08/09 A SIMPLE IMPROVEMENT OF CLASSICAL CORRELATION POWER ANALYSIS ATTACK ON DES DPA CONTEST 08/09 A SIMPLE IMPROVEMENT OF CLASSICAL CORRELATION POWER ANALYSIS ATTACK ON DES, Fakultät für Informatik Antonio Almeida Prof. Dejan Lazich Karlsruhe, 9 th June 2010 KIT Universität des Landes

More information

HPC-Reuse: efficient process creation for running MPI and Hadoop MapReduce on supercomputers

HPC-Reuse: efficient process creation for running MPI and Hadoop MapReduce on supercomputers Chung Dao 1 HPC-Reuse: efficient process creation for running MPI and Hadoop MapReduce on supercomputers Thanh-Chung Dao and Shigeru Chiba The University of Tokyo, Japan Chung Dao 2 Running Hadoop and

More information

Improving Hadoop MapReduce Performance on Supercomputers with JVM Reuse

Improving Hadoop MapReduce Performance on Supercomputers with JVM Reuse Thanh-Chung Dao 1 Improving Hadoop MapReduce Performance on Supercomputers with JVM Reuse Thanh-Chung Dao and Shigeru Chiba The University of Tokyo Thanh-Chung Dao 2 Supercomputers Expensive clusters Multi-core

More information

CS418 Operating Systems

CS418 Operating Systems CS418 Operating Systems Lecture 9 Processor Management, part 1 Textbook: Operating Systems by William Stallings 1 1. Basic Concepts Processor is also called CPU (Central Processing Unit). Process an executable

More information

OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP

OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP (extended abstract) Mitsuhisa Sato 1, Motonari Hirano 2, Yoshio Tanaka 2 and Satoshi Sekiguchi 2 1 Real World Computing Partnership,

More information

Big Data for Engineers Spring Resource Management

Big Data for Engineers Spring Resource Management Ghislain Fourny Big Data for Engineers Spring 2018 7. Resource Management artjazz / 123RF Stock Photo Data Technology Stack User interfaces Querying Data stores Indexing Processing Validation Data models

More information

Getting Started with Serial and Parallel MATLAB on bwgrid

Getting Started with Serial and Parallel MATLAB on bwgrid Getting Started with Serial and Parallel MATLAB on bwgrid CONFIGURATION Download either bwgrid.remote.r2014b.zip (Windows) or bwgrid.remote.r2014b.tar (Linux/Mac) For Windows users, unzip the download

More information

Reduces latency and buffer overhead. Messaging occurs at a speed close to the processors being directly connected. Less error detection

Reduces latency and buffer overhead. Messaging occurs at a speed close to the processors being directly connected. Less error detection Switching Operational modes: Store-and-forward: Each switch receives an entire packet before it forwards it onto the next switch - useful in a general purpose network (I.e. a LAN). usually, there is a

More information

For Dr Landau s PHYS8602 course

For Dr Landau s PHYS8602 course For Dr Landau s PHYS8602 course Shan-Ho Tsai (shtsai@uga.edu) Georgia Advanced Computing Resource Center - GACRC January 7, 2019 You will be given a student account on the GACRC s Teaching cluster. Your

More information

Using KVK as a technical base for "Virtual Catalogue of European Historical Bibliographies"

Using KVK as a technical base for Virtual Catalogue of European Historical Bibliographies Using KVK as a technical base for "Virtual Catalogue of European Historical Bibliographies" Uwe Dierolf Library of the Karlsruhe Institute of Technology (KIT) KIT-BIBLIOTHEK KIT Universität des Landes

More information

College on Multiscale Computational Modeling of Materials for Energy Applications: Tutorial

College on Multiscale Computational Modeling of Materials for Energy Applications: Tutorial College on Multiscale Computational Modeling of Materials for Energy Applications: Tutorial Ivan Kondov STEINBUCH CENTRE FOR COMPUTING - SCC KIT The Research University in the Helmholtz Association www.kit.edu

More information

Mitglied der Helmholtz-Gemeinschaft. Eclipse Parallel Tools Platform (PTP)

Mitglied der Helmholtz-Gemeinschaft. Eclipse Parallel Tools Platform (PTP) Mitglied der Helmholtz-Gemeinschaft Eclipse Parallel Tools Platform (PTP) April 25, 2013 Carsten Karbach Content 1 Parallel Tools Platform (PTP) 2 Eclipse Plug-In Development April 25, 2013 Carsten Karbach

More information

Scheduling. Jesus Labarta

Scheduling. Jesus Labarta Scheduling Jesus Labarta Scheduling Applications submitted to system Resources x Time Resources: Processors Memory Objective Maximize resource utilization Maximize throughput Minimize response time Not

More information

Supercomputing in Plain English Exercise #6: MPI Point to Point

Supercomputing in Plain English Exercise #6: MPI Point to Point Supercomputing in Plain English Exercise #6: MPI Point to Point In this exercise, we ll use the same conventions and commands as in Exercises #1, #2, #3, #4 and #5. You should refer back to the Exercise

More information

Workflows and Scheduling

Workflows and Scheduling Workflows and Scheduling Frank Röder Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften Universität Hamburg 14-12-2015 Frank Röder

More information

OBTAINING AN ACCOUNT:

OBTAINING AN ACCOUNT: HPC Usage Policies The IIA High Performance Computing (HPC) System is managed by the Computer Management Committee. The User Policies here were developed by the Committee. The user policies below aim to

More information

Accelerate HPC Development with Allinea Performance Tools

Accelerate HPC Development with Allinea Performance Tools Accelerate HPC Development with Allinea Performance Tools 19 April 2016 VI-HPS, LRZ Florent Lebeau / Ryan Hulguin flebeau@allinea.com / rhulguin@allinea.com Agenda 09:00 09:15 Introduction 09:15 09:45

More information

Press Release Page 1 / 5 Monika Landgraf Press Officer (acting)

Press Release Page 1 / 5 Monika Landgraf Press Officer (acting) When the Washing Machine Talks to the Power Plant Peer Energy Cloud Wins Trusted Cloud Competition of the Federal Ministry of Economics. Innovative Development of Cloud Enabled Smart Energy Micro Grids

More information

CSCE Operating Systems Scheduling. Qiang Zeng, Ph.D. Fall 2018

CSCE Operating Systems Scheduling. Qiang Zeng, Ph.D. Fall 2018 CSCE 311 - Operating Systems Scheduling Qiang Zeng, Ph.D. Fall 2018 Resource Allocation Graph describing the traffic jam CSCE 311 - Operating Systems 2 Conditions for Deadlock Mutual Exclusion Hold-and-Wait

More information

Load Balancing and Termination Detection

Load Balancing and Termination Detection Chapter 7 Load Balancing and Termination Detection 1 Load balancing used to distribute computations fairly across processors in order to obtain the highest possible execution speed. Termination detection

More information

Practical Introduction to

Practical Introduction to 1 2 Outline of the workshop Practical Introduction to What is ScaleMP? When do we need it? How do we run codes on the ScaleMP node on the ScaleMP Guillimin cluster? How to run programs efficiently on ScaleMP?

More information

Design and Implementation of a Monitoring and Scheduling System for Multiple Linux PC Clusters*

Design and Implementation of a Monitoring and Scheduling System for Multiple Linux PC Clusters* Design and Implementation of a Monitoring and Scheduling System for Multiple Linux PC Clusters* Chao-Tung Yang, Chun-Sheng Liao, and Ping-I Chen High-Performance Computing Laboratory Department of Computer

More information

MONTE CARLO SIMULATION FOR RADIOTHERAPY IN A DISTRIBUTED COMPUTING ENVIRONMENT

MONTE CARLO SIMULATION FOR RADIOTHERAPY IN A DISTRIBUTED COMPUTING ENVIRONMENT The Monte Carlo Method: Versatility Unbounded in a Dynamic Computing World Chattanooga, Tennessee, April 17-21, 2005, on CD-ROM, American Nuclear Society, LaGrange Park, IL (2005) MONTE CARLO SIMULATION

More information

CODE-DE.org Copernicus Data and Exploitation Platform Deutsch. CODE-DE Processing Integration Workshop. 25. & 26. June 2018 DLR Oberpfaffenhofen

CODE-DE.org Copernicus Data and Exploitation Platform Deutsch. CODE-DE Processing Integration Workshop. 25. & 26. June 2018 DLR Oberpfaffenhofen CODE-DE.org Copernicus Data and Exploitation Platform Deutsch CODE-DE Processing Integration Workshop 25. & 26. June 2018 DLR Oberpfaffenhofen DLR.de Folie 3 Outline Purpose of the CODE-DE processing portal

More information

A Web-based Prophesy Automated Performance Modeling System

A Web-based Prophesy Automated Performance Modeling System A Web-based Prophesy Automated Performance Modeling System Xingfu Wu, Valerie Taylor Department of Computer Science, Texas A & M University, College Station, TX 77843, USA Email: {wuxf, taylor}@cs.tamu.edu

More information

ESO Reflex (FinReflex)

ESO Reflex (FinReflex) ESO Reflex (FinReflex) A Graphical Workflow Engine for Data Reduction Tero Oittinen Observatory, University of Helsinki The Sampo Team CSC - Scientific Computing Ltd Observatory, University of Helsinki

More information

CSinParallel Workshop. OnRamp: An Interactive Learning Portal for Parallel Computing Environments

CSinParallel Workshop. OnRamp: An Interactive Learning Portal for Parallel Computing Environments CSinParallel Workshop : An Interactive Learning for Parallel Computing Environments Samantha Foley ssfoley@cs.uwlax.edu http://cs.uwlax.edu/~ssfoley Josh Hursey jjhursey@cs.uwlax.edu http://cs.uwlax.edu/~jjhursey/

More information

Course Syllabus. Operating Systems

Course Syllabus. Operating Systems Course Syllabus. Introduction - History; Views; Concepts; Structure 2. Process Management - Processes; State + Resources; Threads; Unix implementation of Processes 3. Scheduling Paradigms; Unix; Modeling

More information

Multiprocessor and Real- Time Scheduling. Chapter 10

Multiprocessor and Real- Time Scheduling. Chapter 10 Multiprocessor and Real- Time Scheduling Chapter 10 Classifications of Multiprocessor Loosely coupled multiprocessor each processor has its own memory and I/O channels Functionally specialized processors

More information

PBS PROFESSIONAL VS. MICROSOFT HPC PACK

PBS PROFESSIONAL VS. MICROSOFT HPC PACK PBS PROFESSIONAL VS. MICROSOFT HPC PACK On the Microsoft Windows Platform PBS Professional offers many features which are not supported by Microsoft HPC Pack. SOME OF THE IMPORTANT ADVANTAGES OF PBS PROFESSIONAL

More information

Moab Passthrough. Administrator Guide February 2018

Moab Passthrough. Administrator Guide February 2018 Moab Passthrough Administrator Guide 9.1.2 February 2018 2018 Adaptive Computing Enterprises, Inc. All rights reserved. Distribution of this document for commercial purposes in either hard or soft copy

More information

User interface for a computational cluster: resource description approach

User interface for a computational cluster: resource description approach User interface for a computational cluster: resource description approach A. Bogdanov 1,a, V. Gaiduchok 1,2, N. Ahmed 2, P. Ivanov 2, M. Kamande 2, A. Cubahiro 2 1 Saint Petersburg State University, 7/9,

More information

The OpenCirrus TM Project: A global Testbed for Cloud Computing R&D

The OpenCirrus TM Project: A global Testbed for Cloud Computing R&D The OpenCirrus TM Project: A global Testbed for Cloud Computing R&D Marcel Kunze Steinbuch Centre for Computing (SCC) Karlsruhe Institute of Technology (KIT) Germany KIT The cooperation of Forschungszentrum

More information

Moab Workload Manager on Cray XT3

Moab Workload Manager on Cray XT3 Moab Workload Manager on Cray XT3 presented by Don Maxwell (ORNL) Michael Jackson (Cluster Resources, Inc.) MOAB Workload Manager on Cray XT3 Why MOAB? Requirements Features Support/Futures 2 Why Moab?

More information

UL HPC Monitoring in practice: why, what, how, where to look

UL HPC Monitoring in practice: why, what, how, where to look C. Parisot UL HPC Monitoring in practice: why, what, how, where to look 1 / 22 What is HPC? Best Practices Getting Fast & Efficient UL HPC Monitoring in practice: why, what, how, where to look Clément

More information

NovoalignMPI User Guide

NovoalignMPI User Guide MPI User Guide MPI is a messaging passing version of that allows the alignment process to be spread across multiple servers in a cluster or other network of computers 1. Multiple servers can be used to

More information

Computing Infrastructure for Online Monitoring and Control of High-throughput DAQ Electronics

Computing Infrastructure for Online Monitoring and Control of High-throughput DAQ Electronics Computing Infrastructure for Online Monitoring and Control of High-throughput DAQ S. Chilingaryan, M. Caselle, T. Dritschler, T. Farago, A. Kopmann, U. Stevanovic, M. Vogelgesang Hardware, Software, and

More information

MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced

MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced MERCED CLUSTER BASICS Multi-Environment Research Computer for Exploration and Discovery A Centerpiece for Computational Science at UC Merced Sarvani Chadalapaka HPC Administrator University of California

More information

The coolest place on earth

The coolest place on earth The coolest place on earth Large Scale Messaging with ActiveMQ for Particle Accelerators at CERN 2 Overview Examples 30min Introduction to CERN Operation Usage of ActiveMQ 3 About the Speaker Member of

More information

Tutorial: Compiling, Makefile, Parallel jobs

Tutorial: Compiling, Makefile, Parallel jobs Tutorial: Compiling, Makefile, Parallel jobs Hartmut Häfner Steinbuch Centre for Computing (SCC) Funding: www.bwhpc-c5.de Outline Compiler + Numerical Libraries commands Linking Makefile Intro, Syntax

More information

TotalView. Debugging Tool Presentation. Josip Jakić

TotalView. Debugging Tool Presentation. Josip Jakić TotalView Debugging Tool Presentation Josip Jakić josipjakic@ipb.ac.rs Agenda Introduction Getting started with TotalView Primary windows Basic functions Further functions Debugging parallel programs Topics

More information

Big Data 7. Resource Management

Big Data 7. Resource Management Ghislain Fourny Big Data 7. Resource Management artjazz / 123RF Stock Photo Data Technology Stack User interfaces Querying Data stores Indexing Processing Validation Data models Syntax Encoding Storage

More information

Cluster Upgrade Procedure with Job Queue Migration.

Cluster Upgrade Procedure with Job Queue Migration. Cluster Upgrade Procedure with Job Queue Migration. Zend Server 5.6 Overview Zend Server 5.6 introduces a new highly-reliable Job Queue architecture, based on a MySQL database storage backend. This document

More information

RWTH GPU-Cluster. Sandra Wienke March Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky

RWTH GPU-Cluster. Sandra Wienke March Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky RWTH GPU-Cluster Fotos: Christian Iwainsky Sandra Wienke wienke@rz.rwth-aachen.de March 2012 Rechen- und Kommunikationszentrum (RZ) The GPU-Cluster GPU-Cluster: 57 Nvidia Quadro 6000 (29 nodes) innovative

More information

SNiPER: an offline software framework for non-collider physics experiments

SNiPER: an offline software framework for non-collider physics experiments SNiPER: an offline software framework for non-collider physics experiments J. H. Zou 1, X. T. Huang 2, W. D. Li 1, T. Lin 1, T. Li 2, K. Zhang 1, Z. Y. Deng 1, G. F. Cao 1 1 Institute of High Energy Physics,

More information

Thrill : High-Performance Algorithmic

Thrill : High-Performance Algorithmic Thrill : High-Performance Algorithmic Distributed Batch Data Processing with C++ Timo Bingmann, Michael Axtmann, Peter Sanders, Sebastian Schlag, and 6 Students 2016-12-06 INSTITUTE OF THEORETICAL INFORMATICS

More information

Mitglied der Helmholtz-Gemeinschaft. System Monitoring: LLview

Mitglied der Helmholtz-Gemeinschaft. System Monitoring: LLview Mitglied der Helmholtz-Gemeinschaft System Monitoring: LLview November 27, 2015 Carsten Karbach and Julia Valder Content 1 Overview 2 Components 3 Customization November 27, 2015 Carsten Karbach and Julia

More information

A short Overview of KIT and SCC and the Future of Grid Computing

A short Overview of KIT and SCC and the Future of Grid Computing A short Overview of KIT and SCC and the Future of Grid Computing Frank Schmitz@kit.edu Die Kooperation von Forschungszentrum Karlsruhe GmbH und Universität Karlsruhe (TH) Agenda Introduction KIT and SCC,

More information

Compiling applications for the Cray XC

Compiling applications for the Cray XC Compiling applications for the Cray XC Compiler Driver Wrappers (1) All applications that will run in parallel on the Cray XC should be compiled with the standard language wrappers. The compiler drivers

More information

NUSGRID a computational grid at NUS

NUSGRID a computational grid at NUS NUSGRID a computational grid at NUS Grace Foo (SVU/Academic Computing, Computer Centre) SVU is leading an initiative to set up a campus wide computational grid prototype at NUS. The initiative arose out

More information

User Guide of High Performance Computing Cluster in School of Physics

User Guide of High Performance Computing Cluster in School of Physics User Guide of High Performance Computing Cluster in School of Physics Prepared by Sue Yang (xue.yang@sydney.edu.au) This document aims at helping users to quickly log into the cluster, set up the software

More information

SCRIPT: An Architecture for IPFIX Data Distribution

SCRIPT: An Architecture for IPFIX Data Distribution SCRIPT Public Workshop January 20, 2010, Zurich, Switzerland SCRIPT: An Architecture for IPFIX Data Distribution Peter Racz Communication Systems Group CSG Department of Informatics IFI University of Zürich

More information

Setup InstrucIons. If you need help with the setup, please put a red sicky note at the top of your laptop.

Setup InstrucIons. If you need help with the setup, please put a red sicky note at the top of your laptop. Setup InstrucIons Please complete these steps for the June 26 th workshop before the lessons start at 1:00 PM: h;p://hcc.unl.edu/june-workshop-setup#weekfour And make sure you can log into Crane. OS-specific

More information

Principles of Parallel Algorithm Design: Concurrency and Mapping

Principles of Parallel Algorithm Design: Concurrency and Mapping Principles of Parallel Algorithm Design: Concurrency and Mapping John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 3 28 August 2018 Last Thursday Introduction

More information

Grid Experiment and Job Management

Grid Experiment and Job Management Grid Experiment and Job Management Week #6 Basics of Grid and Cloud computing University of Tartu March 20th 2013 Hardi Teder hardi@eenet.ee Overview Grid Jobs Simple Jobs Pilot Jobs Workflows Job management

More information

Data reduction for CORSIKA. Dominik Baack. Technical Report 06/2016. technische universität dortmund

Data reduction for CORSIKA. Dominik Baack. Technical Report 06/2016. technische universität dortmund Data reduction for CORSIKA Technical Report Dominik Baack 06/2016 technische universität dortmund Part of the work on this technical report has been supported by Deutsche Forschungsgemeinschaft (DFG) within

More information

If you log into your account directly, please go to your personal startpage. You find the submission to review under the status Active.

If you log into your account directly, please go to your personal startpage. You find the submission to review under the status Active. 1. Submission of a review If you received an email with the request to review an article, you can click on the link in the email to enter the page dedicated to reviewers (please see the next page of these

More information

Scalable, Automated Parallel Performance Analysis with TAU, PerfDMF and PerfExplorer

Scalable, Automated Parallel Performance Analysis with TAU, PerfDMF and PerfExplorer Scalable, Automated Parallel Performance Analysis with TAU, PerfDMF and PerfExplorer Kevin A. Huck, Allen D. Malony, Sameer Shende, Alan Morris khuck, malony, sameer, amorris@cs.uoregon.edu http://www.cs.uoregon.edu/research/tau

More information

Blue Gene/Q User Workshop. User Environment & Job submission

Blue Gene/Q User Workshop. User Environment & Job submission Blue Gene/Q User Workshop User Environment & Job submission Topics Blue Joule User Environment Loadleveler Task Placement & BG/Q Personality 2 Blue Joule User Accounts Home directories organised on a project

More information

Operating two InfiniBand grid clusters over 28 km distance

Operating two InfiniBand grid clusters over 28 km distance Operating two InfiniBand grid clusters over 28 km distance Sabine Richling, Steffen Hau, Heinz Kredel, Hans-Günther Kruse IT-Center University of Heidelberg, Germany IT-Center University of Mannheim, Germany

More information

Remaining Contemplation Questions

Remaining Contemplation Questions Process Synchronisation Remaining Contemplation Questions 1. The first known correct software solution to the critical-section problem for two processes was developed by Dekker. The two processes, P0 and

More information

A Laconic HPC with an Orgone Accumulator. Presentation to Multicore World Wellington, February 15-17,

A Laconic HPC with an Orgone Accumulator. Presentation to Multicore World Wellington, February 15-17, A Laconic HPC with an Orgone Accumulator Presentation to Multicore World 2016 Wellington, February 15-17, 2016 http://levlafayette.com Edward - University of Melbourne Cluster - System Installed and operational

More information

SAP CERTIFIED APPLICATION ASSOCIATE - SAP HANA 2.0 (SPS01)

SAP CERTIFIED APPLICATION ASSOCIATE - SAP HANA 2.0 (SPS01) SAP EDUCATION SAMPLE QUESTIONS: C_HANAIMP_13 SAP CERTIFIED APPLICATION ASSOCIATE - SAP HANA 2.0 (SPS01) Disclaimer: These sample questions are for self-evaluation purposes only and do not appear on the

More information

Memory Footprint of Locality Information On Many-Core Platforms Brice Goglin Inria Bordeaux Sud-Ouest France 2018/05/25

Memory Footprint of Locality Information On Many-Core Platforms Brice Goglin Inria Bordeaux Sud-Ouest France 2018/05/25 ROME Workshop @ IPDPS Vancouver Memory Footprint of Locality Information On Many- Platforms Brice Goglin Inria Bordeaux Sud-Ouest France 2018/05/25 Locality Matters to HPC Applications Locality Matters

More information

Informatica Data Explorer Performance Tuning

Informatica Data Explorer Performance Tuning Informatica Data Explorer Performance Tuning 2011 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)

More information