Grid Computing Competence Center Large Scale Computing Infrastructures (MINF 4526 HS2011)


1 Grid Computing Competence Center Large Scale Computing Infrastructures (MINF 4526 HS2011) Sergio Maffioletti Grid Computing Competence Centre, University of Zurich March 18, 2012

2 Overview of the course
Theme 1: From local execution to distributed computing through clouds
Theme 2: Overview of large scale infrastructures for scientific computing (grids and clouds)
Theme 3: Scientific application's challenges
Theme 4: Security
Theme 5: Python basics (yes, you need it)
Theme 6: Data handling and information processing
Theme 7: Let's put everything together and solve some real problems

3 What will you learn here?
What characterizes a large scale computing infrastructure?
Why could such an infrastructure be beneficial for scientific research?
Why does scientific research demand large amounts of computing resources?
How do we map a scientific use case to infrastructure requirements?
What challenges need to be addressed when porting a scientific use case to a large scale infrastructure?

4 Practical information
Dates and location: Wednesday 12:00-14:00, Room BIN-2.A.10
Exercises: Thursday 16:00-18:00, Room BIN-1.D.12
Course web link: HS11/suche/e details.html
Individual projects: During the course, each attendee will develop an individual project centered on one of the themes we will discuss throughout the course
Exam: dates and modalities still to be defined
Learning material: At the end of each class we will provide pointers to online documentation

5 Lecture 1: from Local computing to Distributed systems through Clouds
Let's look at the application's execution profile
1. Local systems
2. Cluster systems
3. Distributed systems
Slides available for download from:

6 Local System
Execute an application on your personal computer.
Everything is locally available: application, input data and results
Dedicated system (most of the time, the user has 100% control)
Performance depends on the local machine
Reliability depends on the application
Sequential execution mode
No scalability issues (provided one has time to wait until all data are processed sequentially)
Exclusive access (own account): 1 username + 1 password

7 Local System
Single resource
Single owner
No particular security requirements or access policies
Reliable environment (you know your laptop!)
Resource is homogeneous
Local resource (you're sitting in front of it!)
No resource management policies
No specific network connectivity

8 Question 1 Given a single-threaded application and 10 input files to analyze (1 application execution per input file), and a 4-core machine with a single SATA disk, how can we reduce the overall execution time?
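One possible approach to Question 1 is to keep all 4 cores busy by running up to 4 independent executions at a time. The following Python sketch illustrates this under some assumptions: the application is an external program that takes the input file as its last argument, and the names `run_all`, `demo_command` and `demo_inputs` are made up for this example. The demo command is a stand-in, not the real application. Note that the single SATA disk may limit how much of this parallelism actually pays off.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(command, input_files, max_workers=4):
    """Run `command <input_file>` once per input file, at most max_workers at a time.

    Returns a dict mapping each input file to the exit code of its run.
    """
    def run_one(path):
        # Each call launches one independent, single-threaded process.
        return subprocess.run(command + [path]).returncode

    # Threads are only used to wait on the child processes; the actual
    # work happens in the external processes, one per core at most.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        exit_codes = list(pool.map(run_one, input_files))
    return dict(zip(input_files, exit_codes))


# Hypothetical stand-in for the real application: a Python one-liner
# that ignores its argument and exits successfully.
demo_command = [sys.executable, "-c", "import sys; sys.exit(0)"]
demo_inputs = ["input_%02d.dat" % i for i in range(10)]
results = run_all(demo_command, demo_inputs, max_workers=4)
```

With 10 inputs and 4 workers, the runs complete in roughly 3 rounds instead of 10 sequential executions, provided the executions are CPU-bound and of similar length.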

9 Question 2 If each application execution generates an I/O throughput of 50 MB/s r/w (and assuming the disk sustains 100 MB/s r/w), how should we distribute the load to optimize the throughput? Note: the same exercise can also be done considering memory bandwidth.
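The arithmetic behind Question 2 can be sketched as follows. This is a deliberately simple model and not a measurement: it assumes every execution performs I/O continuously at its stated rate and that the disk bandwidth is shared evenly without degrading under concurrent access. The function names are made up for this example.

```python
def max_concurrent_jobs(cores, disk_mb_s, job_io_mb_s):
    """Largest number of simultaneous runs that oversubscribes neither cores nor disk."""
    io_limit = int(disk_mb_s // job_io_mb_s)  # runs the disk can feed at full speed
    return max(1, min(cores, io_limit))


def rounds_needed(n_jobs, concurrency):
    """Number of rounds needed to process n_jobs, `concurrency` at a time (ceiling division)."""
    return -(-n_jobs // concurrency)


# Figures from the two questions: 4 cores, 100 MB/s disk, 50 MB/s per run, 10 inputs.
slots = max_concurrent_jobs(cores=4, disk_mb_s=100, job_io_mb_s=50)
n_rounds = rounds_needed(10, slots)
print(slots, n_rounds)  # 2 5
```

Under this model the disk, not the CPU, is the limiting resource: running more than 2 executions at once would only make them compete for the same 100 MB/s.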

10 What do we learn here? From these exercises we learn that we need to profile the application, as well as the entire experiment (e.g. the 10 input files to analyze), against the available platform to understand which HTC (high-throughput computing) approach to follow. This also applies the other way round: the better we understand the behaviour of the application and the experiment, the better we can plan the computing and data infrastructure.

11 Cluster System What is a cluster? A collection of parallel and distributed processing systems that are interconnected by a high-speed network and work as a single integrated computing resource.

12 Cluster System [Diagram: Applications are submitted to Queue 1, Queue 2, ..., Queue N of the Local Resource Management System, which runs them on Node1, Node2, ..., NodeK, connected by the Cluster Interconnection Network/Switch]

13 Example of PBS structure

14 Cluster System
Most of the time, everything needed is available on the cluster: application, input data and results
Minimal control over the system
Network file server involved (NFS, Lustre, GPFS, ...)
Execution needs to be described (i.e. resource requirements)
Performance can be tuned by adapting the execution to the hosting environment (e.g. local storage vs. network file server)

15 Cluster System, cont.
Shared access
Own account (configured by a system administrator): 1 username + 1 password (the same on each node of the cluster)
Reliability I: also depends on how the application behaves during execution
Reliability II: may be affected by the reliability of the execution node(s)
Asynchronous execution (controlled by a Local Resource Management System)
Parallel execution (when more nodes are available)
Scalability is measured against the entire system

16 Cluster System
Multiple resources
Owned by a single institution
Single security and access policies
Volatile environment (it is always better to check before starting execution)
Resources are homogeneous
Resources are within your institution's campus
Single resource management policies
May have structured network connectivity within the university campus and to the Internet

17 Goals of a batch management system
Administrative goals:
  Maximize utilization and cluster responsiveness
  Tune fairness policies and workload distribution
  Automate time-consuming tasks
  Troubleshoot job and resource failures
  Integrate new hardware and cluster services into the batch system
User goals:
  Manage current workload
  Identify available resources
  Minimize workload response time
  Track historical usage
  Identify effectiveness of prior submissions

18 Example of Resource requirements
cput: max CPU time used by all processes in the job
pcput: max CPU time used by any single process in the job
mem: max amount of physical memory used by the job
pmem: max amount of physical memory used by any process of the job
vmem: max amount of virtual memory used by the job
pvmem: max amount of virtual memory used by any process of the job

19 Example of Resource requirements cont.
walltime: max wall clock time during which the job can be in the running state
file: the largest size of any single file that may be created by the job
host: name of the host on which the job should be run
nodes: number and/or type of nodes to be reserved for exclusive use by the job
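To make the resource names above concrete, here is a minimal Python sketch that writes a PBS job script requesting some of them through `#PBS -l` directives. The helper `pbs_script`, the job name, the application command `./analyze input_001.dat` and the resource values are all hypothetical and chosen only for illustration; actual limits and accepted values depend on how the cluster's batch system is configured.

```python
def pbs_script(job_name, command, resources):
    """Return the text of a PBS job script requesting the given resources.

    `resources` maps PBS resource names (e.g. walltime, mem) to their values.
    """
    lines = ["#!/bin/sh", "#PBS -N %s" % job_name]
    for name, value in resources.items():
        # One "-l name=value" resource request per directive line.
        lines.append("#PBS -l %s=%s" % (name, value))
    # PBS starts the job in the home directory; move to the submission directory.
    lines.append("cd $PBS_O_WORKDIR")
    lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical request: 1 node, 1 hour of wall clock time,
# 50 minutes of CPU time and 2 GB of physical memory.
script = pbs_script(
    job_name="analyze_001",
    command="./analyze input_001.dat",
    resources={
        "nodes": "1",
        "walltime": "01:00:00",
        "cput": "00:50:00",
        "mem": "2gb",
    },
)
print(script)
```

The resulting text would typically be saved to a file and handed to the batch system with `qsub`.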

20 Question 1 Given a homogeneous cluster with 4 nodes (4 cores per node), a pre-installed application binary, and 100 input files to analyze (1 application run per input file), how do we distribute the load?
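One way to reason about this question is to count execution slots: 4 nodes with 4 cores each give 16 slots for 100 runs. The Python sketch below spreads the inputs round-robin over those slots. It is a static plan meant only to illustrate the arithmetic; the function name and file names are made up, and in practice one would more likely submit 100 independent jobs and let the Local Resource Management System place them.

```python
def partition(input_files, nodes=4, cores_per_node=4):
    """Assign input files round-robin to (node, core) slots.

    Returns a dict mapping each (node, core) pair to its list of input files.
    """
    slots = [(n, c) for n in range(nodes) for c in range(cores_per_node)]
    plan = {slot: [] for slot in slots}
    for i, path in enumerate(input_files):
        plan[slots[i % len(slots)]].append(path)
    return plan


inputs = ["input_%03d.dat" % i for i in range(100)]  # hypothetical file names
plan = partition(inputs)
sizes = sorted(len(files) for files in plan.values())
print(len(plan), sizes[0], sizes[-1])  # 16 6 7
```

Since 100 = 6 x 16 + 4, four slots process 7 files and the other twelve process 6, so the whole experiment takes about 7 run lengths if all runs take similar time.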

21 Question 2 If the application and data are available on a network filesystem (let's say NFS) and each execution node has a local disk large enough to hold 4 input files, how can we improve the overall performance?
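A common answer to this question is to stage data: copy a batch of inputs from the shared filesystem to the node's local disk, run the application against the local copies, then copy the results back. The following Python sketch illustrates that pattern under stated assumptions. `process_staged`, `demo_analyze` and the file names are hypothetical, the "application" is a placeholder function, and a temporary directory stands in for both the NFS share and the local scratch disk.

```python
import shutil
import tempfile
from pathlib import Path


def process_staged(shared_inputs, shared_results, analyze, batch_size=4, scratch_root=None):
    """Process inputs in batches via local scratch space.

    Each batch of at most batch_size files is copied to a local scratch
    directory, analyzed there, and the outputs are copied back to shared_results.
    """
    shared_results = Path(shared_results)
    shared_results.mkdir(parents=True, exist_ok=True)
    inputs = [Path(p) for p in shared_inputs]
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start:start + batch_size]
        # The scratch directory is removed when the batch is done,
        # freeing the local disk for the next 4 files.
        with tempfile.TemporaryDirectory(dir=scratch_root) as scratch:
            scratch = Path(scratch)
            # Stage in: one sequential copy per file from the shared filesystem.
            local_inputs = [Path(shutil.copy(src, scratch)) for src in batch]
            for local_in in local_inputs:
                local_out = scratch / (local_in.name + ".out")
                analyze(local_in, local_out)            # all I/O stays on the local disk
                shutil.copy(local_out, shared_results)  # stage out the result


def demo_analyze(src, dst):
    # Hypothetical stand-in for the real application: upper-cases the input text.
    dst.write_text(src.read_text().upper())


with tempfile.TemporaryDirectory() as shared:
    shared = Path(shared)
    files = []
    for i in range(6):
        f = shared / ("input_%d.txt" % i)
        f.write_text("data %d" % i)
        files.append(f)
    process_staged(files, shared / "results", demo_analyze)
    produced = sorted(p.name for p in (shared / "results").iterdir())
```

Whether staging pays off depends on the access pattern: it tends to help when the application reads its input repeatedly or in small random chunks, and helps less when each file is read once sequentially.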


Computer Science 4500 Operating Systems Computer Science 4500 Operating Systems Module 6 Process Scheduling Methods Updated: September 25, 2014 2008 Stanley A. Wileman, Jr. Operating Systems Slide 1 1 In This Module Batch and interactive workloads

More information

Application of Virtualization Technologies & CernVM. Benedikt Hegner CERN

Application of Virtualization Technologies & CernVM. Benedikt Hegner CERN Application of Virtualization Technologies & CernVM Benedikt Hegner CERN Virtualization Use Cases Worker Node Virtualization Software Testing Training Platform Software Deployment }Covered today Server

More information

Sami Saarinen Peter Towers. 11th ECMWF Workshop on the Use of HPC in Meteorology Slide 1

Sami Saarinen Peter Towers. 11th ECMWF Workshop on the Use of HPC in Meteorology Slide 1 Acknowledgements: Petra Kogel Sami Saarinen Peter Towers 11th ECMWF Workshop on the Use of HPC in Meteorology Slide 1 Motivation Opteron and P690+ clusters MPI communications IFS Forecast Model IFS 4D-Var

More information

STUDENT NAME: STUDENT ID: Problem 1 Problem 2 Problem 3 Problem 4 Problem 5 Total

STUDENT NAME: STUDENT ID: Problem 1 Problem 2 Problem 3 Problem 4 Problem 5 Total University of Minnesota Department of Computer Science CSci 5103 - Fall 2016 (Instructor: Tripathi) Midterm Exam 1 Date: October 17, 2016 (4:00 5:15 pm) (Time: 75 minutes) Total Points 100 This exam contains

More information

Lecture: Benchmarks, Pipelining Intro. Topics: Performance equations wrap-up, Intro to pipelining

Lecture: Benchmarks, Pipelining Intro. Topics: Performance equations wrap-up, Intro to pipelining Lecture: Benchmarks, Pipelining Intro Topics: Performance equations wrap-up, Intro to pipelining 1 Measuring Performance Two primary metrics: wall clock time (response time for a program) and throughput

More information

The cluster system. Introduction 22th February Jan Saalbach Scientific Computing Group

The cluster system. Introduction 22th February Jan Saalbach Scientific Computing Group The cluster system Introduction 22th February 2018 Jan Saalbach Scientific Computing Group cluster-help@luis.uni-hannover.de Contents 1 General information about the compute cluster 2 Available computing

More information

HPC Downtime Budgets: Moving SRE Practice to the Rest of the World

HPC Downtime Budgets: Moving SRE Practice to the Rest of the World LA-UR-16-24361 HPC Downtime Budgets: Moving SRE Practice to the Rest of the World SREcon Europe 2016 Cory Lueninghoener July 12, 2016 Operated by Los Alamos National Security, LLC for the U.S. Department

More information

Largest dedicated HPC operation in Europe!

Largest dedicated HPC operation in Europe! Holger Berger, NE HE, Service&Delivery hberger@hpce.nec.com Newly (in 2003) created NE subsidiary Dedicated to H business Headquarters in Düsseldorf, Germany Serving the European Market Branch offices

More information

Main Points of the Computer Organization and System Software Module

Main Points of the Computer Organization and System Software Module Main Points of the Computer Organization and System Software Module You can find below the topics we have covered during the COSS module. Reading the relevant parts of the textbooks is essential for a

More information

QLIKVIEW SCALABILITY BENCHMARK WHITE PAPER

QLIKVIEW SCALABILITY BENCHMARK WHITE PAPER QLIKVIEW SCALABILITY BENCHMARK WHITE PAPER Hardware Sizing Using Amazon EC2 A QlikView Scalability Center Technical White Paper June 2013 qlikview.com Table of Contents Executive Summary 3 A Challenge

More information

SLIDE 1 - COPYRIGHT 2015 ELEPHANT FLOWS IN THE ROOM: SCIENCEDMZ NATIONALLY DISTRIBUTED

SLIDE 1 - COPYRIGHT 2015 ELEPHANT FLOWS IN THE ROOM: SCIENCEDMZ NATIONALLY DISTRIBUTED SLIDE 1 - COPYRIGHT 2015 ELEPHANT FLOWS IN THE ROOM: SCIENCEDMZ NATIONALLY DISTRIBUTED SLIDE 2 - COPYRIGHT 2015 Do you know what your campus network is actually capable of? (i.e. have you addressed your

More information

Slurm basics. Summer Kickstart June slide 1 of 49

Slurm basics. Summer Kickstart June slide 1 of 49 Slurm basics Summer Kickstart 2017 June 2017 slide 1 of 49 Triton layers Triton is a powerful but complex machine. You have to consider: Connecting (ssh) Data storage (filesystems and Lustre) Resource

More information

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions How to run applications on Aziz supercomputer Mohammad Rafi System Administrator Fujitsu Technology Solutions Agenda Overview Compute Nodes Storage Infrastructure Servers Cluster Stack Environment Modules

More information

Lecture 23 Database System Architectures

Lecture 23 Database System Architectures CMSC 461, Database Management Systems Spring 2018 Lecture 23 Database System Architectures These slides are based on Database System Concepts 6 th edition book (whereas some quotes and figures are used

More information

Resource Management on a Mixed Processor Linux Cluster. Haibo Wang. Mississippi Center for Supercomputing Research

Resource Management on a Mixed Processor Linux Cluster. Haibo Wang. Mississippi Center for Supercomputing Research Resource Management on a Mixed Processor Linux Cluster Haibo Wang Mississippi Center for Supercomputing Research Many existing clusters were built as a small test-bed for small group of users and then

More information

Dell EMC Ready Bundle for HPC Digital Manufacturing Dassault Systѐmes Simulia Abaqus Performance

Dell EMC Ready Bundle for HPC Digital Manufacturing Dassault Systѐmes Simulia Abaqus Performance Dell EMC Ready Bundle for HPC Digital Manufacturing Dassault Systѐmes Simulia Abaqus Performance This Dell EMC technical white paper discusses performance benchmarking results and analysis for Simulia

More information

Block Device Scheduling. Don Porter CSE 506

Block Device Scheduling. Don Porter CSE 506 Block Device Scheduling Don Porter CSE 506 Logical Diagram Binary Formats Memory Allocators System Calls Threads User Kernel RCU File System Networking Sync Memory Management Device Drivers CPU Scheduler

More information

DELL EMC ISILON F800 AND H600 I/O PERFORMANCE

DELL EMC ISILON F800 AND H600 I/O PERFORMANCE DELL EMC ISILON F800 AND H600 I/O PERFORMANCE ABSTRACT This white paper provides F800 and H600 performance data. It is intended for performance-minded administrators of large compute clusters that access

More information

Block Device Scheduling

Block Device Scheduling Logical Diagram Block Device Scheduling Don Porter CSE 506 Binary Formats RCU Memory Management File System Memory Allocators System Calls Device Drivers Interrupts Net Networking Threads Sync User Kernel

More information

Parallel & Scalable Machine Learning Introduction to Machine Learning Algorithms

Parallel & Scalable Machine Learning Introduction to Machine Learning Algorithms Parallel & Scalable Machine Learning Introduction to Machine Learning Algorithms Dr. Ing. Morris Riedel Adjunct Associated Professor School of Engineering and Natural Sciences, University of Iceland Research

More information

Improve Web Application Performance with Zend Platform

Improve Web Application Performance with Zend Platform Improve Web Application Performance with Zend Platform Shahar Evron Zend Sr. PHP Specialist Copyright 2007, Zend Technologies Inc. Agenda Benchmark Setup Comprehensive Performance Multilayered Caching

More information

Guillimin HPC Users Meeting. Bart Oldeman

Guillimin HPC Users Meeting. Bart Oldeman June 19, 2014 Bart Oldeman bart.oldeman@mcgill.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Outline Compute Canada News Upcoming Maintenance Downtime in August Storage System

More information

I/O Systems. 04/16/2007 CSCI 315 Operating Systems Design 1

I/O Systems. 04/16/2007 CSCI 315 Operating Systems Design 1 I/O Systems Notice: The slides for this lecture have been largely based on those accompanying the textbook Operating Systems Concepts with Java, by Silberschatz, Galvin, and Gagne (2007). Many, if not

More information

BRC HPC Services/Savio

BRC HPC Services/Savio BRC HPC Services/Savio Krishna Muriki and Gregory Kurtzer LBNL/BRC kmuriki@berkeley.edu, gmk@lbl.gov SAVIO - The Need Has Been Stated Inception and design was based on a specific need articulated by Eliot

More information