RiSE: Relaxed Systems Engineering? Christoph Kirsch University of Salzburg

1 RiSE: Relaxed Systems Engineering? Christoph Kirsch University of Salzburg

10 Application: >10k #threads, producer/consumer, blocking
allocate, access, share, deallocate
throughput, scalability, latency, consumption
Hardware: CPUs, cores, MMUs, caches
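The workload on this slide can be illustrated with a small sketch: producer threads allocate and write a buffer, share it through a blocking queue, and consumer threads read and deallocate it. This is only an illustration under stated assumptions. `CountingPool`, `run`, and all parameters are hypothetical names that do not come from the talk, the thread count is far below the slide's >10k, and `bytearray` stands in for a real allocator.

```python
import queue
import threading


class CountingPool:
    """Hypothetical stand-in for an allocator: hands out buffers and counts calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self.allocated = 0
        self.freed = 0

    def allocate(self, size):
        with self._lock:
            self.allocated += 1
        return bytearray(size)

    def deallocate(self, buf):
        # The freeing thread differs from the allocating thread here,
        # which is the case a real allocator has to treat specially.
        with self._lock:
            self.freed += 1


def run(num_producers=4, num_consumers=4, items_per_producer=100, size=64):
    pool = CountingPool()
    shared = queue.Queue(maxsize=16)  # bounded, so put() and get() can block
    totals = []                       # one partial checksum per consumer

    def producer(pid):
        for i in range(items_per_producer):
            buf = pool.allocate(size)   # allocate
            buf[0] = (pid + i) % 256    # access
            shared.put(buf)             # share (blocks while the queue is full)

    def consumer():
        total = 0
        while True:
            buf = shared.get()          # blocks while the queue is empty
            if buf is None:             # sentinel: no more work
                break
            total += buf[0]             # access
            pool.deallocate(buf)        # deallocate
        totals.append(total)

    producers = [threading.Thread(target=producer, args=(p,))
                 for p in range(num_producers)]
    consumers = [threading.Thread(target=consumer)
                 for _ in range(num_consumers)]
    for t in producers + consumers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        shared.put(None)                # one sentinel per consumer
    for t in consumers:
        t.join()
    return pool, sum(totals)
```

Calling `run()` returns the pool and a checksum. Every buffer allocated by a producer is freed by some consumer, so the two counters should match once all threads have joined.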

20 free lists
core-local, thread-local, CPU-local
global: lock-based, lock-free
is it a stack? is it a queue?
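The design space named on this slide can be sketched as a two-level free list: each thread keeps a private list and falls back to a lock-based global list when it runs empty. The `lifo` flag switches the local reuse order between stack-like and queue-like, which is one way to read the slide's two questions. All class names, the `limit` parameter, and the spill policy are assumptions made for illustration. The lock-free variant is not shown because it needs an atomic compare-and-swap, which plain Python does not expose.

```python
import threading
from collections import deque


class GlobalFreeList:
    """Lock-based global free list shared by all threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._slots = deque()

    def put_many(self, slots):
        with self._lock:
            self._slots.extend(slots)

    def take(self):
        with self._lock:
            return self._slots.popleft() if self._slots else None


class ThreadLocalFreeList:
    """Per-thread free list that refills from and spills to a global list."""

    def __init__(self, global_list, lifo=True, limit=4):
        self._global = global_list
        self._local = threading.local()
        self._lifo = lifo      # True: reuse like a stack, False: like a queue
        self._limit = limit    # maximum number of slots kept locally

    def _slots(self):
        if not hasattr(self._local, "slots"):
            self._local.slots = deque()
        return self._local.slots

    def allocate(self):
        slots = self._slots()
        if slots:
            # Fast path: no lock is taken.
            return slots.pop() if self._lifo else slots.popleft()
        # Slow path: the global list is protected by a lock.
        return self._global.take()  # None means "nothing free" in this sketch

    def free(self, slot):
        slots = self._slots()
        slots.append(slot)
        if len(slots) > self._limit:
            # Hand the oldest half back so other threads can reuse it.
            spill = [slots.popleft() for _ in range(len(slots) // 2)]
            self._global.put_many(spill)
```

With `lifo=True` the most recently freed slot is reused first, which tends to favor cache locality. With `lifo=False` slots are reused in the order they were freed.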

21 Figure 5: State machine for thread limited to one size class. States: start, live, dead. Transition labels (allocation and deallocation transitions): TryReviveSlow, StealFailed, ReviveFailed, ReviveOkSwapHot, NoBlockFastAlloc, BlockDoNothing, NoHotNoBlock, OthersNotSlow, NoHotButBlock, FastAlloc, MineNotSlow, StealOkSwapHot, TryStealSpan, Terminate, RemoteFree, StillFast, FastFree, MakeSlow, TryMakeSafe, SlowFree, Emptied, ReviveSwapHot, StaySlow, NotEmpty.

24 Relaxed Semantics [PaCT13] [CF13] [POPL13] vs. Operational Performance vs. [RACES12] Denotational Performance
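One possible reading of "relaxed semantics" is a queue that gives up strict FIFO order in exchange for less contention. The slide does not spell out a construction, so the following is an assumption: a sequential sketch of a queue built from several partial FIFO queues, where each operation picks a partial queue at random. `RelaxedQueue` and its parameters are hypothetical names, and this is not the implementation from the cited papers.

```python
import random
from collections import deque


class RelaxedQueue:
    """Sequential sketch: several partial FIFO queues, one picked per operation."""

    def __init__(self, partial_queues=4, seed=0):
        self._queues = [deque() for _ in range(partial_queues)]
        self._rng = random.Random(seed)

    def enqueue(self, item):
        # Any partial queue may receive the item, so global order is relaxed.
        self._rng.choice(self._queues).append(item)

    def dequeue(self):
        # Start at a random partial queue and scan until one is non-empty.
        start = self._rng.randrange(len(self._queues))
        for offset in range(len(self._queues)):
            q = self._queues[(start + offset) % len(self._queues)]
            if q:
                return q.popleft()
        return None  # every partial queue is empty
```

No element is lost or duplicated, and order is preserved within each partial queue, but elements from different partial queues may overtake each other. With `partial_queues=1` the sketch degenerates to a strict FIFO queue. In a concurrent setting each partial queue would carry its own synchronization, which is where the hoped-for scalability would come from.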

25 Legend: jemalloc, llalloc, ptmalloc2, nedmalloc, tbb, tcmalloc, streamflow, hoard, compact, scalloc, scalloc-eager, scalloc-reuse.
Figure 7: ACDC for increasing object sizes. Panels: (a) Allocation time, (b) Deallocation time, (c) Memory consumption. x-axis: object size in bytes (logscale). y-axes: total allocation time in seconds, total deallocation time in seconds, average consumption in MB (all logscale, less is better).
Figure 8: ACDC for an increasing number of threads allocating thread-local objects from a large size range. Panels: (a) Allocation time, (b) Deallocation time, (c) Memory consumption. x-axis: number of threads. y-axes: per-thread total allocation time in seconds (logscale, less is better), per-thread total deallocation time in seconds (logscale, less is better), per-thread average consumption in KB (less is better).
Figure 9: ACDC for an increasing number of threads allocating shared objects from a large size range. Panels and axes as in Figure 8.

26 Scalable Concurrent Data Structures: scal.cs.uni-salzburg.at, github.com/cksystemsgroup/scal
Scalable Concurrent Memory Allocator: github.com/cksystemsgroup/scalloc
Allocator Benchmarking: acdc.cs.uni-salzburg.at, github.com/cksystemsgroup/acdc

Design of Concurrent and Distributed Data Structures

Design of Concurrent and Distributed Data Structures METIS Spring School, Agadir, Morocco, May 2015 Design of Concurrent and Distributed Data Structures Christoph Kirsch University of Salzburg Joint work with M. Dodds, A. Haas, T.A. Henzinger, A. Holzer,

More information

Short-term Memory for Self-collecting Mutators. Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg

Short-term Memory for Self-collecting Mutators. Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg Short-term Memory for Self-collecting Mutators Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg CHESS Seminar, UC Berkeley, September 2010 Heap Management explicit heap

More information

Performance of Various Levels of Storage. Movement between levels of storage hierarchy can be explicit or implicit

Performance of Various Levels of Storage. Movement between levels of storage hierarchy can be explicit or implicit Memory Management All data in memory before and after processing All instructions in memory in order to execute Memory management determines what is to be in memory Memory management activities Keeping

More information

StreamBox: Modern Stream Processing on a Multicore Machine

StreamBox: Modern Stream Processing on a Multicore Machine StreamBox: Modern Stream Processing on a Multicore Machine Hongyu Miao and Heejin Park, Purdue ECE; Myeongjae Jeon and Gennady Pekhimenko, Microsoft Research; Kathryn S. McKinley, Google; Felix Xiaozhu

More information

Concurrency and Scalability versus Fragmentation and Compaction with Compact-fit

Concurrency and Scalability versus Fragmentation and Compaction with Compact-fit Concurrency and Scalability versus Fragmentation and Compaction with Compact-fit Silviu S. Craciunas Christoph M. Kirsch Hannes Payer Harald Röck Ana Sokolova Technical Report 2009-02 April 2009 Department

More information

A Comprehensive Complexity Analysis of User-level Memory Allocator Algorithms

A Comprehensive Complexity Analysis of User-level Memory Allocator Algorithms 2012 Brazilian Symposium on Computing System Engineering A Comprehensive Complexity Analysis of User-level Memory Allocator Algorithms Taís Borges Ferreira, Márcia Aparecida Fernandes, Rivalino Matias

More information

Lecture 13 Condition Variables

Lecture 13 Condition Variables Lecture 13 Condition Variables Contents In this lecture, you will learn Condition Variables And how to use CVs to solve The Producer/Consumer (Bounded Buffer) Problem Review Thus far we have developed

More information

The Next Frontier of Cloud Computing is in the Clouds, Literally

The Next Frontier of Cloud Computing is in the Clouds, Literally The Next Frontier of Cloud Computing is in the Clouds, Literally Silviu Craciunas, Andreas Haas Christoph Kirsch, Hannes Payer Harald Röck, Andreas Rottmann Ana Sokolova, Rainer Trummer Joshua Love Raja

More information

PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES

PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES Anish Athalye and Patrick Long Mentors: Austin Clements and Stephen Tu 3 rd annual MIT PRIMES Conference Sequential

More information

Distributed caching for cloud computing

Distributed caching for cloud computing Distributed caching for cloud computing Maxime Lorrillere, Julien Sopena, Sébastien Monnet et Pierre Sens February 11, 2013 Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, 2013 1 / 16 Introduction Context

More information

Allocating memory in a lock-free manner

Allocating memory in a lock-free manner Allocating memory in a lock-free manner Anders Gidenstam, Marina Papatriantafilou and Philippas Tsigas Distributed Computing and Systems group, Department of Computer Science and Engineering, Chalmers

More information

Scalable Concurrent Hash Tables via Relativistic Programming

Scalable Concurrent Hash Tables via Relativistic Programming Scalable Concurrent Hash Tables via Relativistic Programming Josh Triplett September 24, 2009 Speed of data < Speed of light Speed of light: 3e8 meters/second Processor speed: 3 GHz, 3e9 cycles/second

More information

Utilizing the IOMMU scalably

Utilizing the IOMMU scalably Utilizing the IOMMU scalably Omer Peleg, Adam Morrison, Benjamin Serebrin, and Dan Tsafrir USENIX ATC 15 2017711456 Shin Seok Ha 1 Introduction What is an IOMMU? Provides the translation between IO addresses

More information

Preview. Memory Management

Preview. Memory Management Preview Memory Management With Mono-Process With Multi-Processes Multi-process with Fixed Partitions Modeling Multiprogramming Swapping Memory Management with Bitmaps Memory Management with Free-List Virtual

More information

Deterministic Memory Allocation for Mission-Critical Linux

Deterministic Memory Allocation for Mission-Critical Linux Deterministic Memory Allocation for Mission-Critical Linux Jim Huang (黃敬群) & Keng-Fu Hsu (許耕福) Open Source Summit North America / Sep 11, 2017 Background: Phoenix CubeSat 2U-CubeSat for upper atmosphere

More information

Performance, Scalability, and Semantics of Concurrent FIFO Queues

Performance, Scalability, and Semantics of Concurrent FIFO Queues Performance, Scalability, and Semantics of Concurrent FIFO Queues Christoph M. Kirsch Hannes Payer Harald Röck Ana Sokolova Technical Report 2011-03 September 2011 Department of Computer Sciences Jakob-Haringer-Straße

More information

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant Big and Fast Anti-Caching in OLTP Systems Justin DeBrabant Online Transaction Processing transaction-oriented small footprint write-intensive 2 A bit of history 3 OLTP Through the Years relational model

More information

UNIT III MEMORY MANAGEMENT

UNIT III MEMORY MANAGEMENT UNIT III MEMORY MANAGEMENT TOPICS TO BE COVERED 3.1 Memory management 3.2 Contiguous allocation i Partitioned memory allocation ii Fixed & variable partitioning iii Swapping iv Relocation v Protection

More information

Performance, Scalability, and Semantics of Concurrent FIFO Queues

Performance, Scalability, and Semantics of Concurrent FIFO Queues Performance, Scalability, and Semantics of Concurrent FIFO Queues Christoph M. Kirsch Hannes Payer Harald Röck Ana Sokolova Department of Computer Sciences University of Salzburg, Austria firstname.lastname@cs.uni-salzburg.at

More information

Capriccio : Scalable Threads for Internet Services

Capriccio : Scalable Threads for Internet Services Capriccio : Scalable Threads for Internet Services - Ron von Behren &et al - University of California, Berkeley. Presented By: Rajesh Subbiah Background Each incoming request is dispatched to a separate

More information

CS420: Operating Systems

CS420: Operating Systems Main Memory James Moscola Department of Engineering & Computer Science York College of Pennsylvania Based on Operating System Concepts, 9th Edition by Silberschatz, Galvin, Gagne Background Program must

More information

Memory Management. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Memory Management. Frédéric Haziza Spring Department of Computer Systems Uppsala University Memory Management Frédéric Haziza Department of Computer Systems Uppsala University Spring 2008 Operating Systems Process Management Memory Management Storage Management Compilers Compiling

More information

WORKLOAD CHARACTERIZATION OF INTERACTIVE CLOUD SERVICES BIG AND SMALL SERVER PLATFORMS

WORKLOAD CHARACTERIZATION OF INTERACTIVE CLOUD SERVICES BIG AND SMALL SERVER PLATFORMS WORKLOAD CHARACTERIZATION OF INTERACTIVE CLOUD SERVICES ON BIG AND SMALL SERVER PLATFORMS Shuang Chen*, Shay Galon**, Christina Delimitrou*, Srilatha Manne**, and José Martínez* *Cornell University **Cavium

More information

@ Massachusetts Institute of Technology All rights reserved.

@ Massachusetts Institute of Technology All rights reserved. ..... CPHASH: A Cache-Partitioned Hash Table with LRU Eviction by Zviad Metreveli MASSACHUSETTS INSTITUTE OF TECHAC'LOGY JUN 2 1 2011 LIBRARIES Submitted to the Department of Electrical Engineering and

More information

Hierarchical PLABs, CLABs, TLABs in Hotspot

Hierarchical PLABs, CLABs, TLABs in Hotspot Hierarchical s, CLABs, s in Hotspot Christoph M. Kirsch ck@cs.uni-salzburg.at Hannes Payer hpayer@cs.uni-salzburg.at Harald Röck hroeck@cs.uni-salzburg.at Abstract Thread-local allocation buffers (s) are

More information

Memory Management. Dr. Yingwu Zhu

Memory Management. Dr. Yingwu Zhu Memory Management Dr. Yingwu Zhu Big picture Main memory is a resource A process/thread is being executing, the instructions & data must be in memory Assumption: Main memory is infinite Allocation of memory

More information

Operating systems. Part 1. Module 11 Main memory introduction. Tami Sorgente 1

Operating systems. Part 1. Module 11 Main memory introduction. Tami Sorgente 1 Operating systems Module 11 Main memory introduction Part 1 Tami Sorgente 1 MODULE 11 MAIN MEMORY INTRODUCTION Background Swapping Contiguous Memory Allocation Noncontiguous Memory Allocation o Segmentation

More information

Hoard: A Fast, Scalable, and Memory-Efficient Allocator for Shared-Memory Multiprocessors

Hoard: A Fast, Scalable, and Memory-Efficient Allocator for Shared-Memory Multiprocessors Hoard: A Fast, Scalable, and Memory-Efficient Allocator for Shared-Memory Multiprocessors Emery D. Berger Robert D. Blumofe femery,rdbg@cs.utexas.edu Department of Computer Sciences The University of Texas

More information

Predicting Null-Pointer Dereferences in Concurrent Programs. Parthasarathy Madhusudan Niloofar Razavi Francesco Sorrentino

Predicting Null-Pointer Dereferences in Concurrent Programs. Parthasarathy Madhusudan Niloofar Razavi Francesco Sorrentino Predicting Null-Pointer Dereferences in Concurrent Programs After work by: Azadeh Farzan Parthasarathy Madhusudan Niloofar Razavi Francesco Sorrentino Overview The problem The idea The solution The evaluation

More information

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

IX: A Protected Dataplane Operating System for High Throughput and Low Latency IX: A Protected Dataplane Operating System for High Throughput and Low Latency Belay, A. et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Reviewed by Chun-Yu and Xinghao Li Summary In this

More information

Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen

Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services Presented by: Jitong Chen Outline Architecture of Web-based Data Center Three-Stage framework to benefit

More information

Accelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation Kevin Hsieh

Accelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation Kevin Hsieh Accelerating Pointer Chasing in 3D-Stacked : Challenges, Mechanisms, Evaluation Kevin Hsieh Samira Khan, Nandita Vijaykumar, Kevin K. Chang, Amirali Boroumand, Saugata Ghose, Onur Mutlu Executive Summary

More information

Designing a Compositional Real-Time Operating System. Christoph Kirsch Universität Salzburg

Designing a Compositional Real-Time Operating System. Christoph Kirsch Universität Salzburg Designing a Compositional Real-Time Operating System Christoph Kirsch Universität Salzburg ARTIST Summer School Shanghai July 2008 tiptoe.cs.uni-salzburg.at # Silviu Craciunas* (Programming Model) Hannes

More information

2 nd Half. Memory management Disk management Network and Security Virtual machine

2 nd Half. Memory management Disk management Network and Security Virtual machine Final Review 1 2 nd Half Memory management Disk management Network and Security Virtual machine 2 Abstraction Virtual Memory (VM) 4GB (32bit) linear address space for each process Reality 1GB of actual

More information

CS 261 Fall Mike Lam, Professor. Virtual Memory

CS 261 Fall Mike Lam, Professor. Virtual Memory CS 261 Fall 2016 Mike Lam, Professor Virtual Memory Topics Operating systems Address spaces Virtual memory Address translation Memory allocation Lingering questions What happens when you call malloc()?

More information

Local Linearizability

Local Linearizability Local Linearizability joint work with: Andreas Haas Andreas Holzer Michael Lippautz Ali Sezgin Tom Henzinger Christoph Kirsch Hannes Payer Helmut Veith Concurrent Data Structures Correctness and Performance

More information

SE Memory Consumption

SE Memory Consumption Page 1 of 5 view online Overview Calculating the utilization of memory within a Service Engine (SE) is useful to estimate the number of concurrent connections or the amount of memory that may be allocated

More information

Performance and Optimization Issues in Multicore Computing

Performance and Optimization Issues in Multicore Computing Performance and Optimization Issues in Multicore Computing Minsoo Ryu Department of Computer Science and Engineering 2 Multicore Computing Challenges It is not easy to develop an efficient multicore program

More information

SE-292 High Performance Computing. Memory Hierarchy. R. Govindarajan

SE-292 High Performance Computing. Memory Hierarchy. R. Govindarajan SE-292 High Performance Computing Memory Hierarchy R. Govindarajan govind@serc Reality Check Question 1: Are real caches built to work on virtual addresses or physical addresses? Question 2: What about

More information

CS 333 Introduction to Operating Systems. Class 11 Virtual Memory (1) Jonathan Walpole Computer Science Portland State University

CS 333 Introduction to Operating Systems. Class 11 Virtual Memory (1) Jonathan Walpole Computer Science Portland State University CS 333 Introduction to Operating Systems Class 11 Virtual Memory (1) Jonathan Walpole Computer Science Portland State University Virtual addresses Virtual memory addresses (what the process uses) Page

More information

An efficient Unbounded Lock-Free Queue for Multi-Core Systems

An efficient Unbounded Lock-Free Queue for Multi-Core Systems An efficient Unbounded Lock-Free Queue for Multi-Core Systems Authors: Marco Aldinucci 1, Marco Danelutto 2, Peter Kilpatrick 3, Massimiliano Meneghin 4 and Massimo Torquati 2 1 Computer Science Dept.

More information

SE Memory Consumption

SE Memory Consumption Page 1 of 5 SE Memory Consumption view online Calculating the utilization of memory within a Service Engine is useful to estimate the number of concurrent connections or the amount of memory that may be

More information

GPUfs: Integrating a file system with GPUs

GPUfs: Integrating a file system with GPUs GPUfs: Integrating a file system with GPUs Mark Silberstein (UT Austin/Technion) Bryan Ford (Yale), Idit Keidar (Technion) Emmett Witchel (UT Austin) 1 Traditional System Architecture Applications OS CPU

More information

Operating Systems, Fall

Operating Systems, Fall Operating Systems: Memory management Fall 2008 Basic Memory Management: One program Monoprogramming without Swapping or Paging Tiina Niklander No memory abstraction, no address space, just an operating

More information

Operating Systems, Fall

Operating Systems, Fall Operating Systems: Memory management Fall 2008 Tiina Niklander Memory Management Programmer wants memory to be Indefinitely large Indefinitely fast Non volatile Memory hierarchy Memory manager handles

More information

Memory Management. Dr. Yingwu Zhu

Memory Management. Dr. Yingwu Zhu Memory Management Dr. Yingwu Zhu Big picture Main memory is a resource A process/thread is being executing, the instructions & data must be in memory Assumption: Main memory is super big to hold a program

More information

Chapter 8: Memory Management. Operating System Concepts with Java 8 th Edition

Chapter 8: Memory Management. Operating System Concepts with Java 8 th Edition Chapter 8: Memory Management 8.1 Silberschatz, Galvin and Gagne 2009 Background Program must be brought (from disk) into memory and placed within a process for it to be run Main memory and registers are

More information

Disruptor Using High Performance, Low Latency Technology in the CERN Control System

Disruptor Using High Performance, Low Latency Technology in the CERN Control System Disruptor Using High Performance, Low Latency Technology in the CERN Control System ICALEPCS 2015 21/10/2015 2 The problem at hand 21/10/2015 WEB3O03 3 The problem at hand CESAR is used to control the

More information

Parallel Processing SIMD, Vector and GPU s cont.

Parallel Processing SIMD, Vector and GPU s cont. Parallel Processing SIMD, Vector and GPU s cont. EECS4201 Fall 2016 York University 1 Multithreading First, we start with multithreading Multithreading is used in GPU s 2 1 Thread Level Parallelism ILP

More information

Dongjun Shin Samsung Electronics

Dongjun Shin Samsung Electronics 2014.10.31. Dongjun Shin Samsung Electronics Contents 2 Background Understanding CPU behavior Experiments Improvement idea Revisiting Linux I/O stack Conclusion Background Definition 3 CPU bound A computer

More information

Uni-Address Threads: Scalable Thread Management for RDMA-based Work Stealing

Uni-Address Threads: Scalable Thread Management for RDMA-based Work Stealing Uni-Address Threads: Scalable Thread Management for RDMA-based Work Stealing Shigeki Akiyama, Kenjiro Taura The University of Tokyo June 17, 2015 HPDC 15 Lightweight Threads Lightweight threads enable

More information

OPERATING SYSTEM. PREPARED BY : DHAVAL R. PATEL Page 1. Q.1 Explain Memory

OPERATING SYSTEM. PREPARED BY : DHAVAL R. PATEL Page 1. Q.1 Explain Memory Q.1 Explain Memory Data Storage in storage device like CD, HDD, DVD, Pen drive etc, is called memory. The device which storage data is called storage device. E.g. hard disk, floppy etc. There are two types

More information

Distributed Queues in Shared Memory

Distributed Queues in Shared Memory Distributed Queues in Shared Memory Multicore Performance and Scalability through Quantitative Relaxation Andreas Haas University of Salzburg ahaas@cs.unisalzburg.at Michael Lippautz University of Salzburg

More information

MARACAS: A Real-Time Multicore VCPU Scheduling Framework

MARACAS: A Real-Time Multicore VCPU Scheduling Framework : A Real-Time Framework Computer Science Department Boston University Overview 1 2 3 4 5 6 7 Motivation platforms are gaining popularity in embedded and real-time systems concurrent workload support less

More information

Machine-Independent Virtual Memory Management for Paged June Uniprocessor 1st, 2010and Multiproce 1 / 15

Machine-Independent Virtual Memory Management for Paged June Uniprocessor 1st, 2010and Multiproce 1 / 15 Machine-Independent Virtual Memory Management for Paged Uniprocessor and Multiprocessor Architectures Matthias Lange TU Berlin June 1st, 2010 Machine-Independent Virtual Memory Management for Paged June

More information

Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet

Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet Pilar González-Férez and Angelos Bilas 31 th International Conference on Massive Storage Systems

More information

CS 433 Homework 5. Assigned on 11/7/2017 Due in class on 11/30/2017

CS 433 Homework 5. Assigned on 11/7/2017 Due in class on 11/30/2017 CS 433 Homework 5 Assigned on 11/7/2017 Due in class on 11/30/2017 Instructions: 1. Please write your name and NetID clearly on the first page. 2. Refer to the course fact sheet for policies on collaboration.

More information

Shaping Process Semantics

Shaping Process Semantics Shaping Process Semantics [Extended Abstract] Christoph M. Kirsch Harald Röck Department of Computer Sciences University of Salzburg, Austria {ck,hroeck}@cs.uni-salzburg.at Analysis. Composition of virtually

More information

AutoStream: Automatic Stream Management for Multi-stream SSDs in Big Data Era

AutoStream: Automatic Stream Management for Multi-stream SSDs in Big Data Era AutoStream: Automatic Stream Management for Multi-stream SSDs in Big Data Era Changho Choi, PhD Principal Engineer Memory Solutions Lab (San Jose, CA) Samsung Semiconductor, Inc. 1 Disclaimer This presentation

More information

Advanced file systems: LFS and Soft Updates. Ken Birman (based on slides by Ben Atkin)

Advanced file systems: LFS and Soft Updates. Ken Birman (based on slides by Ben Atkin) : LFS and Soft Updates Ken Birman (based on slides by Ben Atkin) Overview of talk Unix Fast File System Log-Structured System Soft Updates Conclusions 2 The Unix Fast File System Berkeley Unix (4.2BSD)

More information

RxNetty vs Tomcat Performance Results

RxNetty vs Tomcat Performance Results RxNetty vs Tomcat Performance Results Brendan Gregg; Performance and Reliability Engineering Nitesh Kant, Ben Christensen; Edge Engineering updated: Apr 2015 Results based on The Hello Netflix benchmark

More information

Improving Scalability of Processor Utilization on Heavily-Loaded Servers with Real-Time Scheduling

Improving Scalability of Processor Utilization on Heavily-Loaded Servers with Real-Time Scheduling Improving Scalability of Processor Utilization on Heavily-Loaded Servers with Real-Time Scheduling Eiji Kawai, Youki Kadobayashi, Suguru Yamaguchi Nara Institute of Science and Technology JAPAN Motivation

More information

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1 Distributed File Systems Issues NFS (Network File System) Naming and transparency (location transparency versus location independence) Host:local-name Attach remote directories (mount) Single global name

More information

Lecture 15: I/O Devices & Drivers

Lecture 15: I/O Devices & Drivers CS 422/522 Design & Implementation of Operating Systems Lecture 15: I/O Devices & Drivers Zhong Shao Dept. of Computer Science Yale University Acknowledgement: some slides are taken from previous versions

More information

Parallel storage allocator

Parallel storage allocator CSE 539 02/7/205 Parallel storage allocator Lecture 9 Scribe: Jing Li Outline of this lecture:. Criteria and definitions 2. Serial storage allocators 3. Parallel storage allocators Criteria and definitions

More information

Research on the Implementation of MPI on Multicore Architectures

Research on the Implementation of MPI on Multicore Architectures Research on the Implementation of MPI on Multicore Architectures Pengqi Cheng Department of Computer Science & Technology, Tshinghua University, Beijing, China chengpq@gmail.com Yan Gu Department of Computer

More information

Scalable, multithreaded, shared memory machine Designed for single word random global access patterns Very good at large graph problems

Scalable, multithreaded, shared memory machine Designed for single word random global access patterns Very good at large graph problems Cray XMT Scalable, multithreaded, shared memory machine Designed for single word random global access patterns Very good at large graph problems Next Generation Cray XMT Goals Memory System Improvements

More information

Device-Functionality Progression

Device-Functionality Progression Chapter 12: I/O Systems I/O Hardware I/O Hardware Application I/O Interface Kernel I/O Subsystem Transforming I/O Requests to Hardware Operations Incredible variety of I/O devices Common concepts Port

More information

Chapter 12: I/O Systems. I/O Hardware

Chapter 12: I/O Systems. I/O Hardware Chapter 12: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem Transforming I/O Requests to Hardware Operations I/O Hardware Incredible variety of I/O devices Common concepts Port

More information

Design challenges of Highperformance. MPI over InfiniBand. Presented by Karthik

Design challenges of Highperformance. MPI over InfiniBand. Presented by Karthik Design challenges of Highperformance and Scalable MPI over InfiniBand Presented by Karthik Presentation Overview In depth analysis of High-Performance and scalable MPI with Reduced Memory Usage Zero Copy

More information

Overview. CMSC 330: Organization of Programming Languages. Concurrency. Multiprocessors. Processes vs. Threads. Computation Abstractions

Overview. CMSC 330: Organization of Programming Languages. Concurrency. Multiprocessors. Processes vs. Threads. Computation Abstractions CMSC 330: Organization of Programming Languages Multithreaded Programming Patterns in Java CMSC 330 2 Multiprocessors Description Multiple processing units (multiprocessor) From single microprocessor to

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2017 Lecture 21 Main Memory Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 FAQ Why not increase page size

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science Performance of Main : Latency: affects cache miss

More information

High Performance Computing Lecture 21. Matthew Jacob Indian Institute of Science

High Performance Computing Lecture 21. Matthew Jacob Indian Institute of Science High Performance Computing Lecture 21 Matthew Jacob Indian Institute of Science Semaphore Examples Semaphores can do more than mutex locks Example: Consider our concurrent program where process P1 reads

More information

[MS10987A]: Performance Tuning and Optimizing SQL Databases

[MS10987A]: Performance Tuning and Optimizing SQL Databases [MS10987A]: Performance Tuning and Optimizing SQL Databases Length : 4 Days Audience(s) : IT Professionals Level : 300 Technology : Microsoft SQL Server Delivery Method : Instructor-led (Classroom) Course

More information

Chapter 9 Memory Management

Chapter 9 Memory Management Contents 1. Introduction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threads 6. CPU Scheduling 7. Process Synchronization 8. Deadlocks 9. Memory Management 10. Virtual

More information
