A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention
Jonatan Lindén and Bengt Jonsson, Uppsala University, Sweden. December 18, 2013.
Contributions
Motivation: improve the performance of a concurrent discrete event simulator.
Outcome: a new lock-free, skiplist-based priority queue.
- New representation for logically deleted nodes.
- Minimizes contention.
- Improves performance over existing algorithms by 30-80% on multiprocessors.
- Linearizable.
Outline
- Contributions
- Background: priority queue; skiplist; standard solution to increase concurrency; the problem: contention
- Our algorithm
- Correctness
- Evaluation
Priority Queues
Priority queue: a set of (key, value) pairs with two operations:
- Insert(key, value)
- DeleteMin(), which removes and returns the pair with the smallest key
Applications: discrete event simulation, numerical algorithms.
Implementations:
- Traditionally implemented on top of heaps or tree data structures.
- Skiplists [Pugh:1990] have been used for several parallel implementations.
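As a point of reference for the two operations above, here is a minimal sequential sketch built on a binary heap, the traditional implementation the slide mentions. The class name `SequentialPriorityQueue` and its method names are illustrative assumptions, not taken from the talk, and nothing here is concurrent.

```python
import heapq
import itertools


class SequentialPriorityQueue:
    """Sequential reference model: a set of (key, value) pairs (hypothetical name)."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so that two entries with equal keys never compare values.
        self._tie = itertools.count()

    def insert(self, key, value):
        heapq.heappush(self._heap, (key, next(self._tie), value))

    def delete_min(self):
        """Remove and return the (key, value) pair with the smallest key, or None."""
        if not self._heap:
            return None
        key, _, value = heapq.heappop(self._heap)
        return key, value


# Usage in the style of a discrete event simulation: keys are timestamps.
events = SequentialPriorityQueue()
events.insert(3.0, "depart")
events.insert(1.0, "arrive")
events.insert(2.0, "serve")
print(events.delete_min())
```

A concurrent queue is expected to behave like this model when its operations are viewed in their linearization order.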
Skiplist
- Layered linked list.
- The lowest-level list is ordered and defines the logical state.
- Higher-level lists are shortcuts.
- Probabilistic guarantee of logarithmic search time.
[Figure: skiplist from head H to tail T, illustrating Insert(4).]
The smallest element is at the beginning of the lowest-level list, which is the entry point for DeleteMin.
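The structure described above can be sketched sequentially as follows. This is a single-threaded illustration only; the names `SkipList`, `MAX_HEIGHT` and the coin-flip probability of 0.5 are assumptions chosen for the example rather than details from the talk.

```python
import random


class Node:
    def __init__(self, key, value, height):
        self.key = key
        self.value = value
        self.next = [None] * height  # next[0] belongs to the lowest-level list


class SkipList:
    MAX_HEIGHT = 16  # assumed bound on the number of levels

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.tail = Node(float("inf"), None, self.MAX_HEIGHT)
        self.head = Node(float("-inf"), None, self.MAX_HEIGHT)
        self.head.next = [self.tail] * self.MAX_HEIGHT

    def _random_height(self):
        # Each extra level is kept with probability 1/2.
        height = 1
        while height < self.MAX_HEIGHT and self._rng.random() < 0.5:
            height += 1
        return height

    def insert(self, key, value):
        # Search from the top level down, remembering the predecessor per level.
        preds = [None] * self.MAX_HEIGHT
        node = self.head
        for level in reversed(range(self.MAX_HEIGHT)):
            while node.next[level].key < key:
                node = node.next[level]
            preds[level] = node
        new = Node(key, value, self._random_height())
        for level in range(len(new.next)):
            new.next[level] = preds[level].next[level]
            preds[level].next[level] = new

    def delete_min(self):
        # The minimum is always the first node of the lowest-level list.
        first = self.head.next[0]
        if first is self.tail:
            return None
        for level in range(len(first.next)):
            self.head.next[level] = first.next[level]
        return first.key, first.value
```

Note that every `delete_min` writes to the head node's pointers, which is the location that becomes contended once several threads run the operation at the same time.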
Concurrent skiplist-based priority queues
- Skiplists are easy to make concurrent.
- Skiplists scale well when concurrent threads access different parts of the structure.
- Bottleneck: concurrent DeleteMin operations in a priority queue all try to remove the same element.
Standard solution
Standard solution to increase concurrency:
- Logical deletion: set a delete flag on the node.
- Physical deletion: unlink the node after the logical deletion has succeeded.
Lock-free:
- Perform all writes using Compare-and-Swap (CAS).
- Co-locate the delete flag with each next pointer, e.g., in the lowest-order bit [Harris 2001], to make physical deletion safe.
[Figure: skiplist from head H to tail T with a node's delete flag set.]
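The two-step deletion above can be sketched on a single-level list. Python has no hardware CAS, so the hypothetical `MarkableRef` class below emulates one atomic word holding a (successor, delete flag) pair, with a lock standing in for the atomicity of the instruction. All names are assumptions made for the illustration.

```python
import threading


class MarkableRef:
    """Emulates one word holding (successor, delete_flag), updated only via CAS."""

    def __init__(self, ref=None, marked=False):
        self._pair = (ref, marked)
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def get(self):
        return self._pair

    def compare_and_set(self, exp_ref, exp_mark, new_ref, new_mark):
        with self._lock:
            ref, marked = self._pair
            if ref is exp_ref and marked == exp_mark:
                self._pair = (new_ref, new_mark)
                return True
            return False


class Node:
    def __init__(self, key, succ=None):
        self.key = key
        self.next = MarkableRef(succ)


def logical_delete(node):
    """Set the delete flag stored with the node's own next pointer."""
    while True:
        succ, marked = node.next.get()
        if marked:
            return False  # some other thread deleted the node first
        if node.next.compare_and_set(succ, False, succ, True):
            return True


def physical_delete(pred, node):
    """Unlink an already logically deleted node by swinging pred's pointer."""
    succ, marked = node.next.get()
    assert marked, "physical deletion must follow logical deletion"
    return pred.next.compare_and_set(node, False, succ, False)


def insert_after(pred, new_node):
    """Link new_node after pred; fails if pred has been logically deleted."""
    succ, marked = pred.next.get()
    if marked:
        return False
    new_node.next = MarkableRef(succ)
    return pred.next.compare_and_set(succ, False, new_node, False)
```

Because the flag shares a word with the pointer, an insertion that tries to link a node after a deleted one fails its CAS, so unlinking the deleted node cannot lose a concurrently inserted successor.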
Contention
Bottleneck: CAS in DeleteMin. Several types of contention:
(i) Multiple CASes compete.
(ii) Updates must be propagated to reads.
[Figure: skiplist from H to T; annotations: "modified by CAS", "Insert(6)", "serialize".]
Our algorithm
Our solution
Key idea: no physical deletion after each logical deletion! Instead, delete nodes in batches:
- by updating the pointers in the head node;
- this requires that the logically deleted nodes form a prefix of the lowest-level list.
[Figure: skiplist from H to T with a concurrent Insert(1); annotations: "Does not work!" and "should be inserted here".]
Our solution
To guarantee the prefix property, store the delete flag together with the predecessor's next pointer.
[Figure: skiplist from H to T with a concurrent Insert(1); annotation: "Still a prefix!".]
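A single-threaded model may help show why moving the flag preserves the prefix. In the sketch below the flag `succ_deleted` stored in a node describes that node's successor, so the last deleted node's own pointer stays unflagged and a new node can still be linked directly after it. The class and field names are assumptions, and the plain assignments stand in for what would have to be atomic updates in a lock-free implementation.

```python
class Node:
    def __init__(self, key, succ=None):
        self.key = key
        self.next = succ           # successor in the lowest-level list
        self.succ_deleted = False  # delete flag for the SUCCESSOR, kept with next


class PrefixQueue:
    def __init__(self):
        self.tail = Node(float("inf"))
        self.head = Node(float("-inf"), self.tail)

    def delete_min(self):
        x = self.head
        while True:
            if x.next is self.tail:
                return None
            if not x.succ_deleted:
                x.succ_deleted = True  # logical deletion of x.next
                return x.next.key
            x = x.next                 # x.next is already deleted: keep walking

    def insert(self, key):
        x = self.head
        while x.succ_deleted:          # skip the logically deleted prefix
            x = x.next
        while x.next.key < key:        # then find the ordered position
            x = x.next
        x.next = Node(key, x.next)

    def keys_with_flags(self):
        """List (key, is_deleted) for every node between head and tail."""
        out = []
        x = self.head
        while x.next is not self.tail:
            out.append((x.next.key, x.succ_deleted))
            x = x.next
        return out


q = PrefixQueue()
for k in [2, 3, 5, 7]:
    q.insert(k)
q.delete_min()
q.delete_min()
q.insert(1)  # smaller than the deleted keys, yet it lands after the prefix
print(q.keys_with_flags())
```

In this model the deleted nodes 2 and 3 remain a prefix, and the newly inserted 1 becomes the first live node, which matches the "Still a prefix!" annotation.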
Our solution
Resolving conflicts between Insert and DeleteMin.
[Figure: step-by-step animation of a concurrent Insert(1) and DeleteMin on the skiplist from H to T.]
Physical batch deletion
- Performed by updating the pointers in the head node.
- Done by DeleteMin when the prefix of deleted nodes exceeds a threshold.
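The batching rule can be illustrated with the same kind of single-threaded model. The `threshold` parameter, its default value and the `head_updates` counter are hypothetical additions for the example. A real lock-free version would perform the head update with CAS and would also repair the higher-level head pointers, which this one-level sketch leaves out.

```python
class Node:
    def __init__(self, key, succ=None):
        self.key = key
        self.next = succ
        self.succ_deleted = False  # delete flag for the successor


class BatchQueue:
    def __init__(self, threshold=3):
        self.threshold = threshold  # hypothetical tuning parameter
        self.tail = Node(float("inf"))
        self.head = Node(float("-inf"), self.tail)
        self.head_updates = 0       # counts writes to the head pointer

    def insert(self, key):
        x = self.head
        while x.succ_deleted:       # skip the logically deleted prefix
            x = x.next
        while x.next.key < key:
            x = x.next
        x.next = Node(key, x.next)

    def delete_min(self):
        x, offset = self.head, 0
        while x.succ_deleted:       # walk the deleted prefix and measure it
            x = x.next
            offset += 1
        if x.next is self.tail:
            return None
        x.succ_deleted = True       # logical deletion only
        victim = x.next
        offset += 1
        if offset > self.threshold:
            # Physical batch deletion: one head write drops the whole prefix.
            self.head.next = victim.next
            self.head.succ_deleted = False
            self.head_updates += 1
        return victim.key


q = BatchQueue(threshold=3)
for k in range(1, 9):
    q.insert(k)
removed = [q.delete_min() for _ in range(8)]
print(removed, q.head_updates)
```

In this run, eight deletions cause only two writes to the head pointer, whereas unlinking after every deletion would write it eight times.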
Correctness
- The algorithm is a linearizable concurrent priority queue.
- The paper gives a correctness proof based on assertional reasoning.
- The proof follows rather easily after establishing that the lowest-level list consists of a deleted prefix followed by a sorted list, which defines the logical state of the queue.
- We have also modeled the algorithm in SPIN and performed extensive state-space exploration.
Evaluation
Comparison of maximal throughput
Compared against other lock-free skiplist-based priority queues:
- Sundell & Tsigas (ST): only a single logically deleted node is allowed in the lowest-level list.
- Herlihy & Shavit (HS): lock-free adaptation of [Lotan & Shavit]; logically deleted nodes need not form a prefix.
Comparison of maximal throughput
Benchmark: 50% Insert, 50% DeleteMin, on a 4-socket Intel Sandy Bridge machine.
[Figure: throughput (M operations/s) versus number of threads for New, HS and ST.]
- 30-80% improvement in comparison to HS.
- 1-8 cores: single socket (shared L3 cache).
Evaluation
Benchmark: DES workload, on a 4-socket AMD Bulldozer machine.
[Figure: throughput (M operations/s) versus number of threads for New, HS and ST.]
- 30-80% improvement in comparison to HS.
- 1-2 cores: shared L2 cache.
More resources
- BSD-licensed implementation
- SPIN model
- Extended technical report
The version in the proceedings has a performance bug in the Restructure algorithm, which updates the head pointers; please see the extended technical report.
Conclusions
- A new linearizable, lock-free, skiplist-based priority queue algorithm.
- A new representation of logical deletion, which reduces the number of CASes to critical shared memory locations.
- 30-80% performance improvement over existing such algorithms.
Thank you.
Locking Granularity CS 475, Spring 2019 Concurrent & Distributed Systems With material from Herlihy & Shavit, Art of Multiprocessor Programming Discussion: HW1 Part 4 addtolist(key1, newvalue) Thread 1
More informationPrimeBase XT. A transactional engine for MySQL. Paul McCullagh SNAP Innovation GmbH
PrimeBase XT A transactional engine for MySQL Paul McCullagh SNAP Innovation GmbH Our Company SNAP Innovation GmbH was founded in 1996, currently 25 employees. Purpose: develop and support PrimeBase database,
More informationJava Performance: The Definitive Guide
Java Performance: The Definitive Guide Scott Oaks Beijing Cambridge Farnham Kbln Sebastopol Tokyo O'REILLY Table of Contents Preface ix 1. Introduction 1 A Brief Outline 2 Platforms and Conventions 2 JVM
More informationWorkload Characterization and Optimization of TPC-H Queries on Apache Spark
Workload Characterization and Optimization of TPC-H Queries on Apache Spark Tatsuhiro Chiba and Tamiya Onodera IBM Research - Tokyo April. 17-19, 216 IEEE ISPASS 216 @ Uppsala, Sweden Overview IBM Research
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 3 Nov 2017
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 3 Nov 2017 Lecture 1/3 Introduction Basic spin-locks Queue-based locks Hierarchical locks Reader-writer locks Reading without locking Flat
More informationConcurrent Preliminaries
Concurrent Preliminaries Sagi Katorza Tel Aviv University 09/12/2014 1 Outline Hardware infrastructure Hardware primitives Mutual exclusion Work sharing and termination detection Concurrent data structures
More informationThe SkipTrie: Low-Depth Concurrent Search without Rebalancing
The SkipTrie: Low-Depth Concurrent Search without Rebalancing Rotem Oshman University of Toronto rotem@cs.toronto.edu Nir Shavit MIT shanir@csail.mit.edu ABSTRACT To date, all concurrent search structures
More informationHigh Performance Transactions in Deuteronomy
High Performance Transactions in Deuteronomy Justin Levandoski, David Lomet, Sudipta Sengupta, Ryan Stutsman, and Rui Wang Microsoft Research Overview Deuteronomy: componentized DB stack Separates transaction,
More informationFrom Lock-Free to Wait-Free: Linked List. Edward Duong
From Lock-Free to Wait-Free: Linked List Edward Duong Outline 1) Outline operations of the locality conscious linked list [Braginsky 2012] 2) Transformation concept from lock-free -> wait-free [Timnat
More informationCharacterizing the Performance and Energy Efficiency of Lock-Free Data Structures
Characterizing the Performance and Energy Efficiency of Lock-Free Data Structures Nicholas Hunt Paramjit Singh Sandhu Luis Ceze University of Washington {nhunt,paramsan,luisceze}@cs.washington.edu Abstract
More information6.852: Distributed Algorithms Fall, Class 15
6.852: Distributed Algorithms Fall, 2009 Class 15 Today s plan z z z z z Pragmatic issues for shared-memory multiprocessors Practical mutual exclusion algorithms Test-and-set locks Ticket locks Queue locks
More informationEnergy-centric DVFS Controlling Method for Multi-core Platforms
Energy-centric DVFS Controlling Method for Multi-core Platforms Shin-gyu Kim, Chanho Choi, Hyeonsang Eom, Heon Y. Yeom Seoul National University, Korea MuCoCoS 2012 Salt Lake City, Utah Abstract Goal To
More informationAlgorithms and Data Structures
Algorithms and Data Structures Dr. Malek Mouhoub Department of Computer Science University of Regina Fall 2002 Malek Mouhoub, CS3620 Fall 2002 1 6. Priority Queues 6. Priority Queues ffl ADT Stack : LIFO.
More informationHigh-Performance Composable Transactional Data Structures
University of Central Florida Electronic Theses and Dissertations Doctoral Dissertation (Open Access) High-Performance Composable Transactional Data Structures 2016 Deli Zhang University of Central Florida
More informationStamp-it: A more Thread-efficient, Concurrent Memory Reclamation Scheme in the C++ Memory Model
Stamp-it: A more Thread-efficient, Concurrent Memory Reclamation Scheme in the C++ Memory Model Manuel Pöter TU Wien, Faculty of Informatics Vienna, Austria manuel@manuel-poeter.at Jesper Larsson Träff
More informationDistributed Computing Group
Distributed Computing Group HS 2009 Prof. Dr. Roger Wattenhofer, Thomas Locher, Remo Meier, Benjamin Sigg Assigned: December 11, 2009 Discussion: none Distributed Systems Theory exercise 6 1 ALock2 Have
More informationMultiprocessor Scheduling. Multiprocessor Scheduling
Multiprocessor Scheduling Will consider only shared memory multiprocessor or multi-core CPU Salient features: One or more caches: cache affinity is important Semaphores/locks typically implemented as spin-locks:
More informationImportant Lessons. A Distributed Algorithm (2) Today's Lecture - Replication
Important Lessons Lamport & vector clocks both give a logical timestamps Total ordering vs. causal ordering Other issues in coordinating node activities Exclusive access to resources/data Choosing a single
More informationHeap Model. specialized queue required heap (priority queue) provides at least
Chapter 6 Heaps 2 Introduction some systems applications require that items be processed in specialized ways printing may not be best to place on a queue some jobs may be more small 1-page jobs should
More informationIntroduction. CS3026 Operating Systems Lecture 01
Introduction CS3026 Operating Systems Lecture 01 One or more CPUs Device controllers (I/O modules) Memory Bus Operating system? Computer System What is an Operating System An Operating System is a program
More informationHigh-Performance Key-Value Store on OpenSHMEM
High-Performance Key-Value Store on OpenSHMEM Huansong Fu*, Manjunath Gorentla Venkata, Ahana Roy Choudhury*, Neena Imam, Weikuan Yu* *Florida State University Oak Ridge National Laboratory Outline Background
More informationLock Oscillation: Boosting the Performance of Concurrent Data Structures
Lock Oscillation: Boosting the Performance of Concurrent Data Structures Panagiota Fatourou FORTH ICS & University of Crete Nikolaos D. Kallimanis FORTH ICS The Multicore Era The dominance of Multicore
More informationCourse Syllabus. Operating Systems
Course Syllabus. Introduction - History; Views; Concepts; Structure 2. Process Management - Processes; State + Resources; Threads; Unix implementation of Processes 3. Scheduling Paradigms; Unix; Modeling
More informationTransactional Memory. Concurrency unlocked Programming. Bingsheng Wang TM Operating Systems
Concurrency unlocked Programming Bingsheng Wang TM Operating Systems 1 Outline Background Motivation Database Transaction Transactional Memory History Transactional Memory Example Mechanisms Software Transactional
More informationApplication Programming
Multicore Application Programming For Windows, Linux, and Oracle Solaris Darryl Gove AAddison-Wesley Upper Saddle River, NJ Boston Indianapolis San Francisco New York Toronto Montreal London Munich Paris
More information