Content. Execution Modes SMP, DUAL, VN Partition Types HPC VS HTC HTC Compute Node Linux (CNL) IBM PSSC Montpellier Customer Center


Transcription

1 Content (IBM PSSC Montpellier Customer Center): Execution Modes (SMP, DUAL, VN); Partition Types (HPC vs. HTC); HTC Compute Node Linux (CNL)

2 Execution Modes
Possibilities: single node / multi node; 1, 2 or 4 processes per node; 1, 2 or 4 threads per process.
Notation (default: 1 thread per core):
Virtual Mode (VN): 4 MPI processes per node.
Dual Mode (DUAL): 2 MPI processes + 2 threads per process.
Shared Memory (SMP): 1 MPI process + 4 threads per process.
Limitation: one user process or thread per core.

3 Blue Gene/P Execution Modes
Quad Mode (previously called Virtual Node Mode): all four cores run one MPI process each; no threading; memory per MPI process = ¼ of node memory; MPI programming model.
Dual Mode: two cores run one MPI process each; each process may spawn one thread on the core not used by the other process; memory per MPI process = ½ of node memory; hybrid MPI/OpenMP programming model.
SMP Mode: one core runs one MPI process; the process may spawn threads on each of the other cores; memory per MPI process = full node memory; hybrid MPI/OpenMP programming model.
[Figure: three diagrams, one per mode, mapping the application onto Core 0 to Core 3 (threads marked T) and the memory address space.]

4 Symmetrical Multi-Processing Mode (SMP): 1 process/node; 4 threads/process; 2 GB/process. pthreads and OpenMP are supported. 4 MB default stack for all new threads.
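
As an illustration of the threading model on this slide, here is a minimal pthreads sketch in which one process starts up to three extra threads next to its main thread (one thread per core) and requests a 4 MB stack for each explicitly. The names run_workers, worker and work_item are hypothetical, the squared-index "work" is a stand-in for real computation, and setting the stack size explicitly is only a way of making the quoted default visible, not something the slides prescribe.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_EXTRA   3                       /* main thread + 3 = one thread per core */
#define STACK_BYTES (4UL * 1024UL * 1024UL) /* 4 MB, the default quoted on the slide */

/* Hypothetical unit of work handed to each thread. */
typedef struct {
    int  id;      /* worker index, 1-based */
    long result;  /* filled in by the worker */
} work_item;

static void *worker(void *arg)
{
    work_item *w = arg;
    w->result = (long)w->id * w->id;  /* stand-in for real computation */
    return NULL;
}

/* Start `extra` threads with a 4 MB stack each, wait for them, and
 * return the sum of their results (-1 if a pthread call fails). */
long run_workers(int extra)
{
    pthread_t      tid[MAX_EXTRA];
    work_item      item[MAX_EXTRA];
    pthread_attr_t attr;
    long sum = 0;
    int  started = 0, failed = 0, i;

    assert(extra >= 0 && extra <= MAX_EXTRA);

    if (pthread_attr_init(&attr) != 0)
        return -1;
    if (pthread_attr_setstacksize(&attr, STACK_BYTES) != 0) {
        pthread_attr_destroy(&attr);
        return -1;
    }

    for (i = 0; i < extra; i++) {
        item[i].id = i + 1;
        item[i].result = 0;
        if (pthread_create(&tid[i], &attr, worker, &item[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    pthread_attr_destroy(&attr);

    /* Join every thread that was actually started. */
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        sum += item[i].result;
    }
    return failed ? -1 : sum;
}
```

Calling run_workers(3) corresponds to the SMP layout (main thread plus three workers); run_workers(1) would correspond to one process of a DUAL-mode node.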

5 Dual Mode (DUAL): 2 processes/node; 2 threads/process; 1 GB/process. pthreads and OpenMP are supported. 4 MB default stack for all new threads.

6 Virtual Node Mode (VN): 4 processes/node; 1 thread/process; 512 MB/process. 512 MB/process versus Shared Memory support.
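
The per-mode figures on the three slides above follow from dividing one node's four cores and its memory by the number of processes per node. A small sketch of that arithmetic, assuming a node with 2 GB of memory (consistent with the 2 GB/process quoted for SMP mode); the enum and function names are hypothetical:

```c
#include <assert.h>

#define CORES_PER_NODE 4u

/* The enum value is the number of MPI processes per node in that mode. */
typedef enum { MODE_SMP = 1, MODE_DUAL = 2, MODE_VN = 4 } exec_mode;

unsigned processes_per_node(exec_mode m)
{
    return (unsigned)m;
}

/* One user thread per core, so the cores are split evenly among processes. */
unsigned threads_per_process(exec_mode m)
{
    return CORES_PER_NODE / (unsigned)m;
}

/* Node memory is split evenly among the processes on the node. */
unsigned memory_per_process_mb(unsigned node_mb, exec_mode m)
{
    assert(m == MODE_SMP || m == MODE_DUAL || m == MODE_VN);
    return node_mb / (unsigned)m;
}
```

With node_mb = 2048 this gives 2048 MB for SMP, 1024 MB for DUAL and 512 MB for VN, matching the slides.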

7 Multiple Threads Per Core
Possibility to have 1-3 threads per core. New in V1R4M0.
Not exactly the same behavior as on Linux.
Mainly useful for switching between programming models in phases (OpenMP / pthreads): the application puts one set of threads to sleep and wakes the other set of threads.
A core does not automatically switch between threads on a timed basis. Switches occur only through a sched_yield() system call, signal delivery, or futex wakeup.
Managed through the environment variable BG_APPTHREADDEPTH.
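
The phase-switching pattern described above (one set of threads sleeps while the other runs, with an explicit wakeup at the hand-over) can be sketched with a mutex and a condition variable. This is a generic pthreads illustration under stated assumptions, not Blue Gene specific code: each "set" is reduced to a single thread, and phase_ctl, thread_set and run_two_phases are hypothetical names. On Linux with glibc, waiting on and signalling a condition variable is implemented on top of futexes, which is the kind of wakeup the slide refers to.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int phase;     /* id of the thread set currently allowed to run */
    int order[2];  /* records which set ran first and which second */
    int ran;       /* number of sets that have run so far */
} phase_ctl;

typedef struct {
    phase_ctl *ctl;
    int        my_phase;
} set_arg;

static void *thread_set(void *p)
{
    set_arg   *a = p;
    phase_ctl *c = a->ctl;

    pthread_mutex_lock(&c->lock);
    while (c->phase != a->my_phase)          /* sleep until it is our phase */
        pthread_cond_wait(&c->changed, &c->lock);
    assert(c->ran < 2);
    c->order[c->ran++] = a->my_phase;        /* stand-in for the phase's work */
    c->phase = a->my_phase + 1;              /* hand over to the next set */
    pthread_cond_broadcast(&c->changed);     /* explicit wakeup of the sleepers */
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

static void destroy_ctl(phase_ctl *c)
{
    pthread_cond_destroy(&c->changed);
    pthread_mutex_destroy(&c->lock);
}

/* Returns 1 if set 0 ran before set 1, 0 otherwise, -1 on a pthread error. */
int run_two_phases(void)
{
    phase_ctl c;
    set_arg   a0 = { &c, 0 }, a1 = { &c, 1 };
    pthread_t t0, t1;
    int ok;

    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.changed, NULL);
    c.phase = 0;
    c.ran = 0;

    /* Start set 1 first to show that it really waits for set 0. */
    if (pthread_create(&t1, NULL, thread_set, &a1) != 0) {
        destroy_ctl(&c);
        return -1;
    }
    if (pthread_create(&t0, NULL, thread_set, &a0) != 0) {
        pthread_mutex_lock(&c.lock);
        c.phase = 1;                         /* release set 1 so it can exit */
        pthread_cond_broadcast(&c.changed);
        pthread_mutex_unlock(&c.lock);
        pthread_join(t1, NULL);
        destroy_ctl(&c);
        return -1;
    }
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);

    ok = (c.ran == 2 && c.order[0] == 0 && c.order[1] == 1);
    destroy_ctl(&c);
    return ok;
}
```

Because nothing pre-empts a running thread on a timed basis in the model described above, the hand-over has to be explicit, as in the broadcast here; code that relies on time slicing to make progress would need a sched_yield() or a similar call instead.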

8 Shared Memory Support
Shared memory is supported in Dual and Virtual Node Modes.
The BG_SHAREDMEMPOOLSIZE environment variable specifies, in MB, the amount of memory to be allocated; it can be set through mpirun.
Shared memory is allocated using the standard Linux methods shm_open() / mmap().
Allocation:
    fd = shm_open(SHM_FILE, O_RDWR, 0600);
    ftruncate(fd, MAX_SHARED_SIZE);
    shmptr1 = mmap(NULL, MAX_SHARED_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
Deallocation:
    munmap(shmptr1, MAX_SHARED_SIZE);
    close(fd);
    shm_unlink(SHM_FILE);
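
The allocation and deallocation calls on this slide can be combined into one function with error checks. The following is a sketch for a generic POSIX/Linux system rather than a verified Blue Gene/P program: the object name /bgp_example_shm, the 4096-byte size and the function name shm_roundtrip are hypothetical, and O_CREAT is added (an assumption not on the slide) so that the example works where the object does not yet exist. On older glibc versions it may need to be linked with -lrt.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SHM_FILE        "/bgp_example_shm"  /* hypothetical object name */
#define MAX_SHARED_SIZE 4096                /* hypothetical size in bytes */

/* Write `value` into a shared memory segment, read it back, and clean up.
 * Returns the value read, or -1 if any step fails. */
int shm_roundtrip(int value)
{
    int result = -1;
    int fd;

    assert(value >= 0);  /* keeps -1 unambiguous as the error code */

    /* Allocation: create/open the object, size it, map it. */
    fd = shm_open(SHM_FILE, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return -1;

    if (ftruncate(fd, MAX_SHARED_SIZE) == 0) {
        int *shmptr = mmap(NULL, MAX_SHARED_SIZE, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
        if (shmptr != MAP_FAILED) {
            shmptr[0] = value;   /* visible to every process mapping the object */
            result = shmptr[0];
            munmap(shmptr, MAX_SHARED_SIZE);
        }
    }

    /* Deallocation: close the descriptor and remove the name. */
    close(fd);
    shm_unlink(SHM_FILE);
    return result;
}
```

In a real DUAL or VN run, each process on the node would open the same name and map it; only one of them should call shm_unlink() once all are done.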

9 High Throughput Computing (HTC) Mode
Many applications that run on Blue Gene today are embarrassingly (pleasantly) parallel. They do not fully exploit the torus for MPI communication, since that is not needed for their problem. They just want a very large number of small tasks, with a coordinator of results.
High Throughput Computing Mode on Blue Gene enables a new class of workloads that use many single-node jobs.
It leverages the low-cost, low-energy, small footprint of a rack of 1,024 compute nodes.
Capacity machine ("cluster buster"): run 4,096 jobs on a single rack in virtual node mode (VN).
New HTC CNL mode with a full Linux kernel on each Compute Node (from BG/P driver V1R3).

10 HTC Value for "Pleasantly Parallel" Codes
Application resiliency: a single node failure ends the entire application in the MPI model. For HTC, only the job running on the failed node is ended, while the other single-node jobs continue to run. For long-running jobs that require many tasks, this can mean the difference between having to start from scratch and just being able to proceed on the remaining nodes.
The front-end node has more memory, better performance, and more functionality than a single compute node.
Code that runs on the compute nodes is much cleaner: it only contains the work to be performed, and leaves the coordination to a script or scheduler. This also eliminates the need to sacrifice one node as the master node.
The coordinator functionality can be anything that runs on Linux: Perl script, Python, compiled program.
The coordinator can interact directly with a database, either to get the inputs for the application or to store the results. This can eliminate the need to create a flat-file input for the application, or to generate the results in an output file.
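
The resiliency argument above can be illustrated with a small coordinator that launches independent tasks and records which ones failed, so that only those need to be resubmitted. This is a sketch under stated assumptions: local child processes created with fork() stand in for single-node jobs, the failure of one task is simulated through the hypothetical parameter bad, and coordinate and run_task are made-up names rather than part of any Blue Gene tooling.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_TASKS 64

/* Hypothetical task body: exit status 0 on success, 1 on failure.
 * The task whose index equals `bad` simulates a failed node. */
static int run_task(int index, int bad)
{
    return index == bad ? 1 : 0;
}

/* Launch `ntasks` independent tasks as child processes, wait for all of
 * them, and set failed[i] to 1 for every task that did not exit cleanly.
 * Returns the number of failed tasks. */
int coordinate(int ntasks, int bad, int failed[])
{
    pid_t pid[MAX_TASKS];
    int nfailed = 0;
    int i;

    assert(ntasks > 0 && ntasks <= MAX_TASKS);

    for (i = 0; i < ntasks; i++) {
        pid[i] = fork();
        if (pid[i] == 0)
            _exit(run_task(i, bad));  /* child: do the work, report via status */
        /* pid[i] < 0 means the launch itself failed; handled below. */
    }

    for (i = 0; i < ntasks; i++) {
        int status = 0;

        failed[i] = 1;
        if (pid[i] > 0 && waitpid(pid[i], &status, 0) == pid[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            failed[i] = 0;
        nfailed += failed[i];
    }
    return nfailed;
}
```

One failing task leaves the results of the others intact; the coordinator only has to rerun the indices flagged in failed[], instead of restarting the whole application as in the MPI case.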

11 High Performance vs. High Throughput Modes
High Performance Computing (HPC) Mode: best for capability computing. Parallel, tightly coupled applications. Single Instruction, Multiple Data (SIMD) architecture. Programming model: typically MPI. Applications need a tremendous amount of computational power over a short time period.
High Throughput Computing (HTC) Mode: best for capacity computing. Large number of independent tasks. Multiple Instruction, Multiple Data (MIMD) architecture. Programming model: non-MPI. Applications need a large amount of computational power over a long time period. Traditionally run on large clusters.
HTC and HPC modes co-exist on Blue Gene. The mode is determined when the resource pool (partition) is allocated.

12 HTC Compute Node Linux (CNL)
Feature description: brings full Linux kernel functionality onto the Compute Node, substituting the minimal Compute Node Kernel. Allows any serial Linux workload to be executed on Blue Gene/P. In particular, brings support for scripted workloads (Shell, Perl).
Characteristics: new feature introduced by Blue Gene driver V1R3. Light Linux kernel, but with full compatibility. Still a limited memory footprint. One single CNL per Compute Node. The Compute Node is seen as a regular Linux SMP system. The number of processes and/or threads is under user control. An SSH session on the Compute Node becomes possible.

IBM PSSC Montpellier Customer Center. Blue Gene/P ASIC IBM Corporation

IBM PSSC Montpellier Customer Center. Blue Gene/P ASIC IBM Corporation Blue Gene/P ASIC Memory Overview/Considerations No virtual Paging only the physical memory (2-4 GBytes/node) In C, C++, and Fortran, the malloc routine returns a NULL pointer when users request more memory

More information

! How is a thread different from a process? ! Why are threads useful? ! How can POSIX threads be useful?

! How is a thread different from a process? ! Why are threads useful? ! How can POSIX threads be useful? Chapter 2: Threads: Questions CSCI [4 6]730 Operating Systems Threads! How is a thread different from a process?! Why are threads useful?! How can OSIX threads be useful?! What are user-level and kernel-level

More information

!! How is a thread different from a process? !! Why are threads useful? !! How can POSIX threads be useful?

!! How is a thread different from a process? !! Why are threads useful? !! How can POSIX threads be useful? Chapter 2: Threads: Questions CSCI [4 6]730 Operating Systems Threads!! How is a thread different from a process?!! Why are threads useful?!! How can OSIX threads be useful?!! What are user-level and kernel-level

More information

Cray XE6 Performance Workshop

Cray XE6 Performance Workshop Cray XE6 erformance Workshop odern HC Architectures David Henty d.henty@epcc.ed.ac.uk ECC, University of Edinburgh Overview Components History Flynn s Taxonomy SID ID Classification via emory Distributed

More information

Lecture 28 Introduction to Parallel Processing and some Architectural Ramifications. Flynn s Taxonomy. Multiprocessing.

Lecture 28 Introduction to Parallel Processing and some Architectural Ramifications. Flynn s Taxonomy. Multiprocessing. 1 2 Lecture 28 Introduction to arallel rocessing and some Architectural Ramifications 3 4 ultiprocessing Flynn s Taxonomy Flynn s Taxonomy of arallel achines How many Instruction streams? How many Data

More information

CSCE 313 Introduction to Computer Systems. Instructor: Dezhen Song

CSCE 313 Introduction to Computer Systems. Instructor: Dezhen Song CSCE 313 Introduction to Computer Systems Instructor: Dezhen Song Programs, Processes, and Threads Programs and Processes Threads Programs, Processes, and Threads Programs and Processes Threads Processes

More information

CSCE 313: Intro to Computer Systems

CSCE 313: Intro to Computer Systems CSCE 313 Introduction to Computer Systems Instructor: Dr. Guofei Gu http://courses.cse.tamu.edu/guofei/csce313/ Programs, Processes, and Threads Programs and Processes Threads 1 Programs, Processes, and

More information

Introduction to Operating Systems (Part II)

Introduction to Operating Systems (Part II) Introduction to Operating Systems (Part II) Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Introduction 1393/6/24 1 / 45 Computer

More information

Scalable Flash Architectures Meet Instant On

Scalable Flash Architectures Meet Instant On calable lash Architectures eet Instant On Jackson Huang Vice resident egment and Ecosystem arketing Cypress Aug 13, 2015 lash emory ummit 2015 anta Clara, CA Explosion of ore Intelligent and Connected

More information

Lecture Topics. Announcements. Today: Advanced Scheduling (Stallings, chapter ) Next: Deadlock (Stallings, chapter

Lecture Topics. Announcements. Today: Advanced Scheduling (Stallings, chapter ) Next: Deadlock (Stallings, chapter Lecture Topics Today: Advanced Scheduling (Stallings, chapter 10.1-10.4) Next: Deadlock (Stallings, chapter 6.1-6.6) 1 Announcements Exam #2 returned today Self-Study Exercise #10 Project #8 (due 11/16)

More information

Name Department/Research Area Have you used the Linux command line?

Name Department/Research Area Have you used the Linux command line? Please log in with HawkID (IOWA domain) Macs are available at stations as marked To switch between the Windows and the Mac systems, press scroll lock twice 9/27/2018 1 Ben Rogers ITS-Research Services

More information

High Performance Computing IBM collaborations with EDF R&D on IBM Blue Gene system

High Performance Computing IBM collaborations with EDF R&D on IBM Blue Gene system IBM eserver pseries Sciomp Meeting- July 16-20 2007 High Performance Computing IBM collaborations with EDF R&D on IBM Blue Gene system Jean-Yves Berthou EDF R&D Pascal Vezolle / Olivier Hess - IBM Deep

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information

Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago

Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago Running 1 Million Jobs in 10 Minutes via the Falkon Fast and Light-weight Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago In Collaboration with: Ian Foster,

More information

CS 470 Spring Mike Lam, Professor. Performance Analysis

CS 470 Spring Mike Lam, Professor. Performance Analysis CS 470 Sring 2018 Mike Lam, Professor Performance Analysis Performance analysis Why do we arallelize our rograms? Performance analysis Why do we arallelize our rograms? So that they run faster! Performance

More information

ARM Vision for Thermal Management and Energy Aware Scheduling on Linux

ARM Vision for Thermal Management and Energy Aware Scheduling on Linux ARM Vision for Management and Energy Aware Scheduling on Linux Charles Garcia-Tobin, Software Power Architect, ARM Thomas Molgaard, Director of Product Management, ARM ARM Tech Symposia China 2015 November

More information

mos: An Architecture for Extreme Scale Operating Systems

mos: An Architecture for Extreme Scale Operating Systems mos: An Architecture for Extreme Scale Operating Systems Robert W. Wisniewski, Todd Inglett, Pardo Keppel, Ravi Murty, Rolf Riesen Presented by: Robert W. Wisniewski Chief Software Architect Extreme Scale

More information

MATE-EC2: A Middleware for Processing Data with Amazon Web Services

MATE-EC2: A Middleware for Processing Data with Amazon Web Services MATE-EC2: A Middleware for Processing Data with Amazon Web Services Tekin Bicer David Chiu* and Gagan Agrawal Department of Compute Science and Engineering Ohio State University * School of Engineering

More information

Compute Node Linux (CNL) The Evolution of a Compute OS

Compute Node Linux (CNL) The Evolution of a Compute OS Compute Node Linux (CNL) The Evolution of a Compute OS Overview CNL The original scheme plan, goals, requirements Status of CNL Plans Features and directions Futures May 08 Cray Inc. Proprietary Slide

More information

Multi-core Programming Evolution

Multi-core Programming Evolution Multi-core Programming Evolution Based on slides from Intel Software ollege and Multi-ore Programming increasing performance through software multi-threading by Shameem Akhter and Jason Roberts, Evolution

More information

Operating System Review

Operating System Review COP 4225 Advanced Unix Programming Operating System Review Chi Zhang czhang@cs.fiu.edu 1 About the Course Prerequisite: COP 4610 Concepts and Principles Programming System Calls Advanced Topics Internals,

More information

First Experiences with Intel Cluster OpenMP

First Experiences with Intel Cluster OpenMP First Experiences with Intel Christian Terboven, Dieter an Mey, Dirk Schmidl, Marcus Wagner surname@rz.rwth aachen.de Center for Computing and Communication RWTH Aachen University, Germany IWOMP 2008 May

More information

Arachne. Core Aware Thread Management Henry Qin Jacqueline Speiser John Ousterhout

Arachne. Core Aware Thread Management Henry Qin Jacqueline Speiser John Ousterhout Arachne Core Aware Thread Management Henry Qin Jacqueline Speiser John Ousterhout Granular Computing Platform Zaharia Winstein Levis Applications Kozyrakis Cluster Scheduling Ousterhout Low-Latency RPC

More information

Making System z the Center of Enterprise Computing

Making System z the Center of Enterprise Computing 8471 - Making System z the Center of Enterprise Computing Presented By: Mark Neft Accenture Application Modernization & Optimization Strategy Lead Mark.neft@accenture.com March 2, 2011 Session 8471 Presentation

More information

Operating System. Chapter 4. Threads. Lynn Choi School of Electrical Engineering

Operating System. Chapter 4. Threads. Lynn Choi School of Electrical Engineering Operating System Chapter 4. Threads Lynn Choi School of Electrical Engineering Process Characteristics Resource ownership Includes a virtual address space (process image) Ownership of resources including

More information

Scalable Multiprocessors

Scalable Multiprocessors arallel Computer Organization and Design : Lecture 7 er Stenström. 2008, Sally A. ckee 2009 Scalable ultiprocessors What is a scalable design? (7.1) Realizing programming models (7.2) Scalable communication

More information

Sharing High-Performance Devices Across Multiple Virtual Machines

Sharing High-Performance Devices Across Multiple Virtual Machines Sharing High-Performance Devices Across Multiple Virtual Machines Preamble What does sharing devices across multiple virtual machines in our title mean? How is it different from virtual networking / NSX,

More information

CS30002: Operating Systems. Arobinda Gupta Spring 2017

CS30002: Operating Systems. Arobinda Gupta Spring 2017 CS30002: Operating Systems Arobinda Gupta Spring 2017 General Information Textbook: Operating System Concepts, 8 th or 9 th Ed, by Silberschatz, Galvin, and Gagne I will use materials from other books

More information

ESL-Based Full System Simulation Platform

ESL-Based Full System Simulation Platform EL-Based Full ystem imulation latform 陳中和 Department of Electrical Engineering Institute of Computer and Communication Engineering National Cheng Kung University NCKU-CALab Term roject-reparation Lab1:

More information

Threads. CS3026 Operating Systems Lecture 06

Threads. CS3026 Operating Systems Lecture 06 Threads CS3026 Operating Systems Lecture 06 Multithreading Multithreading is the ability of an operating system to support multiple threads of execution within a single process Processes have at least

More information

Executing Message-Passing Programs. Mitesh Meswani

Executing Message-Passing Programs. Mitesh Meswani Executing Message-assing rograms Mitesh Meswani resentation Outline Introduction to Top Gun (eserver pseries 690) MI on Top Gun (AIX/Linux) Itanium2 (Linux) Cluster Sun (Solaris) Workstation Cluster Environment

More information

IBM Blue Gene/Q solution

IBM Blue Gene/Q solution IBM Blue Gene/Q solution Pascal Vezolle vezolle@fr.ibm.com Broad IBM Technical Computing portfolio Hardware Blue Gene/Q Power Systems 86 Systems idataplex and Intelligent Cluster GPGPU / Intel MIC PureFlexSystems

More information

Project 2 Overview: Part A: User space memory allocation

Project 2 Overview: Part A: User space memory allocation Project 2 Overview: Once again, this project will have 2 parts. In the first part, you will get to implement your own user space memory allocator. You will learn the complexities and details of memory

More information

Using Industry Standards to Exploit the Advantages and Resolve the Challenges of Multicore Technology

Using Industry Standards to Exploit the Advantages and Resolve the Challenges of Multicore Technology Using Industry Standards to Exploit the Advantages and Resolve the Challenges of Multicore Technology September 19, 2007 Markus Levy, EEMBC and Multicore Association Enabling the Multicore Ecosystem Multicore

More information

Operating System. Operating System Overview. Layers of Computer System. Operating System Objectives. Services Provided by the Operating System

Operating System. Operating System Overview. Layers of Computer System. Operating System Objectives. Services Provided by the Operating System Operating System Operating System Overview Chapter 2 A program that controls the execution of application programs An interface between applications and hardware 1 2 Operating System Objectives Layers

More information

Operating System Overview. Operating System

Operating System Overview. Operating System Operating System Overview Chapter 2 1 Operating System A program that controls the execution of application programs An interface between applications and hardware 2 1 Operating System Objectives Convenience

More information

Automated Configuration and Administration of a Storage-class Memory System to Support Supercomputer-based Scientific Workflows

Automated Configuration and Administration of a Storage-class Memory System to Support Supercomputer-based Scientific Workflows Automated Configuration and Administration of a Storage-class Memory System to Support Supercomputer-based Scientific Workflows J. Bernard 1, P. Morjan 2, B. Hagley 3, F. Delalondre 1, F. Schürmann 1,

More information

Threads, SMP, and Microkernels

Threads, SMP, and Microkernels Threads, SMP, and Microkernels Chapter 4 E&CE 354: Processes 0 Multithreaded Programming So what is multithreaded programming? Basically, multithreaded programming is implementing software so that two

More information

Introduction to Joker Cyber Infrastructure Architecture Team CIA.NMSU.EDU

Introduction to Joker Cyber Infrastructure Architecture Team CIA.NMSU.EDU Introduction to Joker Cyber Infrastructure Architecture Team CIA.NMSU.EDU What is Joker? NMSU s supercomputer. 238 core computer cluster. Intel E-5 Xeon CPUs and Nvidia K-40 GPUs. InfiniBand innerconnect.

More information

Porting Applications to Blue Gene/P

Porting Applications to Blue Gene/P Porting Applications to Blue Gene/P Dr. Christoph Pospiech pospiech@de.ibm.com 05/17/2010 Agenda What beast is this? Compile - link go! MPI subtleties Help! It doesn't work (the way I want)! Blue Gene/P

More information

Cisco HyperFlex and the F5 BIG-IP Platform Accelerate Infrastructure and Application Deployments

Cisco HyperFlex and the F5 BIG-IP Platform Accelerate Infrastructure and Application Deployments OVERVIEW + Cisco and the F5 BIG-IP Platform Accelerate Infrastructure and Application Deployments KEY BENEFITS Quickly create private clouds Tested with industry-leading BIG-IP ADC platform Easily scale

More information

Today s class. Scheduling. Informationsteknologi. Tuesday, October 9, 2007 Computer Systems/Operating Systems - Class 14 1

Today s class. Scheduling. Informationsteknologi. Tuesday, October 9, 2007 Computer Systems/Operating Systems - Class 14 1 Today s class Scheduling Tuesday, October 9, 2007 Computer Systems/Operating Systems - Class 14 1 Aim of Scheduling Assign processes to be executed by the processor(s) Need to meet system objectives regarding:

More information

An introduction to checkpointing. for scientifc applications

An introduction to checkpointing. for scientifc applications damien.francois@uclouvain.be UCL/CISM An introduction to checkpointing for scientifc applications November 2016 CISM/CÉCI training session What is checkpointing? Without checkpointing: $./count 1 2 3^C

More information

Resource allocation and utilization in the Blue Gene/L supercomputer

Resource allocation and utilization in the Blue Gene/L supercomputer Resource allocation and utilization in the Blue Gene/L supercomputer Tamar Domany, Y Aridor, O Goldshmidt, Y Kliteynik, EShmueli, U Silbershtein IBM Labs in Haifa Agenda Blue Gene/L Background Blue Gene/L

More information

Evolution and Convergence of Parallel Architectures

Evolution and Convergence of Parallel Architectures History Evolution and Convergence of arallel Architectures Historically, parallel architectures tied to programming models Divergent architectures, with no predictable pattern of growth. Todd C. owry CS

More information

High Performance Computing

High Performance Computing High erformance Computing ADVANCED SCIENTIFIC COUTING Dr. Ing. orris Riedel Adjunct Associated rofessor School of Engineering and Natural Sciences, University of Iceland Research Group Leader, Juelich

More information

NTRDMA v0.1. An Open Source Driver for PCIe NTB and DMA. Allen Hubbe at Linux Piter 2015 NTRDMA. Messaging App. IB Verbs. dmaengine.h ntb.

NTRDMA v0.1. An Open Source Driver for PCIe NTB and DMA. Allen Hubbe at Linux Piter 2015 NTRDMA. Messaging App. IB Verbs. dmaengine.h ntb. Messaging App IB Verbs NTRDMA dmaengine.h ntb.h DMA DMA DMA NTRDMA v0.1 An Open Source Driver for PCIe and DMA Allen Hubbe at Linux Piter 2015 1 INTRODUCTION Allen Hubbe Senior Software Engineer EMC Corporation

More information

Misc. Third Generation Batch Multiprogramming. Fourth Generation Time Sharing. Last Time Evolution of OSs

Misc. Third Generation Batch Multiprogramming. Fourth Generation Time Sharing. Last Time Evolution of OSs Third Generation Batch Multiprogramming Misc. Problem: but I/O still expensive; can happen in middle of job Idea: have a pool of ready jobs in memory, switch to one when another needs I/O When one job

More information

High Performance Computing Cluster Advanced course

High Performance Computing Cluster Advanced course High Performance Computing Cluster Advanced course Jeremie Vandenplas, Gwen Dawes 9 November 2017 Outline Introduction to the Agrogenomics HPC Submitting and monitoring jobs on the HPC Parallel jobs on

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Spring 2018 Lecture 8 Threads and Scheduling Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 FAQ How many threads

More information

YOUR APPLICATION S JOURNEY TO THE CLOUD. What s the best way to get cloud native capabilities for your existing applications?

YOUR APPLICATION S JOURNEY TO THE CLOUD. What s the best way to get cloud native capabilities for your existing applications? YOUR APPLICATION S JOURNEY TO THE CLOUD What s the best way to get cloud native capabilities for your existing applications? Introduction Moving applications to cloud is a priority for many IT organizations.

More information

The rcuda technology: an inexpensive way to improve the performance of GPU-based clusters Federico Silla

The rcuda technology: an inexpensive way to improve the performance of GPU-based clusters Federico Silla The rcuda technology: an inexpensive way to improve the performance of -based clusters Federico Silla Technical University of Valencia Spain The scope of this talk Delft, April 2015 2/47 More flexible

More information

More Types of Synchronization 11/29/16

More Types of Synchronization 11/29/16 More Types of Synchronization 11/29/16 Today s Agenda Classic thread patterns Other parallel programming patterns More synchronization primitives: RW locks Condition variables Semaphores Message passing

More information

Three basic multiprocessing issues

Three basic multiprocessing issues Three basic multiprocessing issues 1. artitioning. The sequential program must be partitioned into subprogram units or tasks. This is done either by the programmer or by the compiler. 2. Scheduling. Associated

More information

History of Distributed Systems. Joseph Cordina

History of Distributed Systems. Joseph Cordina History of Distributed Systems Joseph Cordina joseph.cordina@um.edu.mt otivation Computation demands were always higher than technological status quo Obvious answer Several computing elements working in

More information

Operating Systems (2INC0) 2017/18

Operating Systems (2INC0) 2017/18 Operating Systems (2INC0) 2017/18 Memory Management (09) Dr. Courtesy of Dr. I. Radovanovic, Dr. R. Mak (figures from Bic & Shaw) System Architecture and Networking Group Agenda Reminder: OS & resources

More information

Ron Kalla, Balaram Sinharoy, Joel Tendler IBM Systems Group

Ron Kalla, Balaram Sinharoy, Joel Tendler IBM Systems Group Simultaneous Multi-threading Implementation in POWER5 -- IBM's Next Generation POWER Microprocessor Ron Kalla, Balaram Sinharoy, Joel Tendler IBM Systems Group Outline Motivation Background Threading Fundamentals

More information

NLUUG, Bunnik CloudABI: safe, testable and maintainable software for UNIX Speaker: Ed Schouten,

NLUUG, Bunnik CloudABI: safe, testable and maintainable software for UNIX Speaker: Ed Schouten, NLUUG, Bunnik 2015-05-28 CloudABI: safe, testable and maintainable software for UNIX Speaker: Ed Schouten, ed@nuxi.nl Programme What is wrong with UNIX? What is CloudABI? Use cases for CloudABI Links 2

More information

Unit 8: Superscalar Pipelines

Unit 8: Superscalar Pipelines A Key Theme: arallelism reviously: pipeline-level parallelism Work on execute of one instruction in parallel with decode of next CIS 501: Computer Architecture Unit 8: Superscalar ipelines Slides'developed'by'Milo'Mar0n'&'Amir'Roth'at'the'University'of'ennsylvania'

More information

Looking ahead with IBM i. 10+ year roadmap

Looking ahead with IBM i. 10+ year roadmap Looking ahead with IBM i 10+ year roadmap 1 Enterprises Trust IBM Power 80 of Fortune 100 have IBM Power Systems The top 10 banking firms have IBM Power Systems 9 of top 10 insurance companies have IBM

More information

Operating Systems Fundamentals. What is an Operating System? Focus. Computer System Components. Chapter 1: Introduction

Operating Systems Fundamentals. What is an Operating System? Focus. Computer System Components. Chapter 1: Introduction Operating Systems Fundamentals Overview of Operating Systems Ahmed Tawfik Modern Operating Systems are increasingly complex Operating System Millions of Lines of Code DOS 0.015 Windows 95 11 Windows 98

More information

Veeam with Cohesity Data Platform

Veeam with Cohesity Data Platform Veeam with Cohesity Data Platform Table of Contents About This Guide: 2 Data Protection for VMware Environments: 2 Benefits of using the Cohesity Data Platform with Veeam Backup & Replication: 4 Appendix

More information

Parallel Computing Basics, Semantics

Parallel Computing Basics, Semantics 1 / 15 Parallel Computing Basics, Semantics Landau s 1st Rule of Education Rubin H Landau Sally Haerer, Producer-Director Based on A Survey of Computational Physics by Landau, Páez, & Bordeianu with Support

More information

Process Environment. Pradipta De

Process Environment. Pradipta De Process Environment Pradipta De pradipta.de@sunykorea.ac.kr Today s Topic Program to process How is a program loaded by the kernel How does kernel set up the process Outline Review of linking and loading

More information

Before We Start. Sign in hpcxx account slips Windows Users: Download PuTTY. Google PuTTY First result Save putty.exe to Desktop

Before We Start. Sign in hpcxx account slips Windows Users: Download PuTTY. Google PuTTY First result Save putty.exe to Desktop Before We Start Sign in hpcxx account slips Windows Users: Download PuTTY Google PuTTY First result Save putty.exe to Desktop Research Computing at Virginia Tech Advanced Research Computing Compute Resources

More information

Limitations of Memory System Performance

Limitations of Memory System Performance Slides taken from arallel Computing latforms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar! " To accompany the text ``Introduction to arallel Computing'', Addison Wesley, 2003. Limitations

More information

Parallelism and Concurrency. COS 326 David Walker Princeton University

Parallelism and Concurrency. COS 326 David Walker Princeton University Parallelism and Concurrency COS 326 David Walker Princeton University Parallelism What is it? Today's technology trends. How can we take advantage of it? Why is it so much harder to program? Some preliminary

More information

PxP: Performance x Power Optimizations for Sparse Scientific Computing

PxP: Performance x Power Optimizations for Sparse Scientific Computing x: erformance x ower Optimizations for Sparse Scientific Computing adma Raghavan, Mary Jane Irwin, Mahmut Kandemir Suzanne Shontz, and Jia Li Sarah Conner, Yang Ding, and Konrad Malkowski (h.d.) Department

More information

The cow and Zaphod... Virtual Memory #2 Feb. 21, 2007

The cow and Zaphod... Virtual Memory #2 Feb. 21, 2007 15-410...The cow and Zaphod... Virtual Memory #2 Feb. 21, 2007 Dave Eckhardt Bruce Maggs 1 L16_VM2 Wean Synchronization Watch for exam e-mail Please answer promptly Computer Club demo night Thursday (2/22)

More information

CPU-GPU Heterogeneous Computing

CPU-GPU Heterogeneous Computing CPU-GPU Heterogeneous Computing Advanced Seminar "Computer Engineering Winter-Term 2015/16 Steffen Lammel 1 Content Introduction Motivation Characteristics of CPUs and GPUs Heterogeneous Computing Systems

More information

System Call. Preview. System Call. System Call. System Call 9/7/2018

System Call. Preview. System Call. System Call. System Call 9/7/2018 Preview Operating System Structure Monolithic Layered System Microkernel Virtual Machine Process Management Process Models Process Creation Process Termination Process State Process Implementation Operating

More information

Bringing OpenStack to the Enterprise. An enterprise-class solution ensures you get the required performance, reliability, and security

Bringing OpenStack to the Enterprise. An enterprise-class solution ensures you get the required performance, reliability, and security Bringing OpenStack to the Enterprise An enterprise-class solution ensures you get the required performance, reliability, and security INTRODUCTION Organizations today frequently need to quickly get systems

More information
