Report on Cache-Oblivious Priority Queue and Graph Algorithm Applications[1]


Marc André Tanner

May 30, 2014

Abstract

This report consists of two main sections: Section 1 motivates and introduces the cache-oblivious computational model. Section 2 describes the main contribution of the original paper, an optimal cache-oblivious priority queue.

1 Background and Models of Computation

Memory systems of modern computers consist of complex multilevel memory hierarchies with several layers of cache, main memory and disk. Since access times between different layers of the hierarchy can vary by several orders of magnitude, it is increasingly important to obtain high data locality in memory access patterns. These developments have led to new theoretical models of computation which allow algorithms to be analyzed with regard to their memory behaviour.

1.1 Random Access Machine (RAM) Model

In the Random Access Machine (RAM) model memory is assumed to be infinite, i.e. the relevant data always fits into main memory. Furthermore the memory is considered to have uniform access time, which is why it is also referred to as flat memory. Clearly these assumptions are not suitable for use cases in which the memory system becomes the bottleneck, especially in a Big Data context with graphs consisting of millions of vertices and billions of edges. However, developing a model which is both simple and realistic is a challenging task. The external memory model, which has gained widespread use due to its simplicity, is described next.

1.2 External Memory Model

This model is also known as the I/O model or the cache-aware model (to contrast it with cache obliviousness) and was introduced in 1988 by Aggarwal and Vitter [2]. In order to avoid the complications of multilevel memory models, the memory hierarchy consists of only two levels: an internal memory (often called cache) of size M which is fast to access but limited in space, and an arbitrarily large external memory (referred to as disk), partitioned into blocks of size B, with significantly slower access times. The efficiency of an algorithm is measured in terms of the number of block or memory transfers it performs between these two levels. What follows are a few important bounds which characterize the I/O model and will be used later in the analysis of the priority queue.

- Linear or scanning bound: scan(N) = Θ(N/B) is the number of memory transfers needed to read N contiguous items from disk.
- Sorting bound: sort(N) = Θ((N/B) log_{M/B}(N/B)) refers to the fact that sort(N) memory transfers are both necessary and sufficient to sort N elements.
- Finding the median of N elements is possible in O(N/B) memory transfers.
- Searching bound: the number of memory transfers needed to search for an element among a set of N elements is Ω(log_B N).

The searching bound is matched by the B-tree, which also supports updates in O(log_B N). Notice however that unlike in the RAM model, one cannot sort optimally with a search tree: inserting N elements into a B-tree takes O(N log_B N) memory transfers, which is a factor of (B log_B N)/(log_{M/B}(N/B)) from optimal.

While the external memory model is reasonably simple, its algorithms still crucially depend on the parameters M and B. Furthermore the algorithms have (at least in principle) to explicitly issue read/write requests to the disk as well as explicitly manage the cache.
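These bounds are easy to evaluate for concrete parameters. The following sketch is illustrative only; the function names and the choice N = 2^30, M = 2^20, B = 2^12 (all measured in items) are assumptions, not values from the report. It compares the scanning and sorting bounds with naive element-by-element B-tree insertion:

```python
import math

def scan(N, B):
    """Scanning bound: Theta(N / B) memory transfers to read N contiguous items."""
    return math.ceil(N / B)

def sort_bound(N, M, B):
    """Sorting bound: Theta((N / B) * log_{M/B}(N / B)) memory transfers."""
    return (N / B) * max(1.0, math.log(N / B, M / B))

def btree_insert(N, B):
    """Inserting N elements one by one into a B-tree: O(N * log_B N) transfers."""
    return N * math.log(N, B)

N, M, B = 2**30, 2**20, 2**12
print(scan(N, B))                        # 262144: one full scan
print(round(sort_bound(N, M, B)))        # ~589824: sorting costs barely more than 2 scans
print(round(btree_insert(N, B) / sort_bound(N, M, B)))  # ~4551: the B-tree gap
```

The last line makes the (B log_B N)/(log_{M/B}(N/B)) factor from the text concrete: for these parameters, sorting via repeated B-tree insertion is thousands of times more expensive than the sorting bound.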
1.3 Cache-Oblivious Model

The main idea of the cache-oblivious model, introduced in 1999 by Frigo et al. [3], is to design and analyze algorithms in the external memory model, but without using the parameters M and B in the algorithm description, thus combining the simplicity of a two-level model with the realism of more complicated hierarchical models. The cache-oblivious model is based on a few assumptions which might seem unrealistic at first, but Frigo et al. showed with a couple of reductions, based on the least recently used (LRU) replacement strategy and 2-universal hash functions, that such a model can be simulated by essentially any memory system with only a small constant-factor overhead.
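The accounting that this simulation argument relies on can be mimicked by a toy transfer counter. The sketch below is a simplification for illustration (not code from the paper): it charges one memory transfer per block fetch in a fully associative cache of M/B blocks with automatic replacement and optimal, farthest-in-future eviction:

```python
def memory_transfers(trace, M, B):
    """Count block transfers for an address trace in the ideal-cache model:
    cache of M/B blocks, fully associative, automatic replacement,
    and optimal (evict the block used farthest in the future) paging."""
    capacity = M // B
    blocks = [addr // B for addr in trace]
    cache, transfers = set(), 0
    for i, b in enumerate(blocks):
        if b in cache:
            continue                      # cache hit: no transfer
        transfers += 1                    # miss: the block is fetched automatically
        if len(cache) == capacity:
            def next_use(c):              # index of the block's next access
                for j in range(i + 1, len(blocks)):
                    if blocks[j] == c:
                        return j
                return float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(b)
    return transfers

# A sequential scan of 64 addresses with B = 4 costs exactly 64/4 = 16 transfers.
print(memory_transfers(range(64), M=16, B=4))  # -> 16
```

Counters like this make it easy to check that an access pattern really achieves the scanning bound regardless of the particular M and B chosen.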

These assumptions are:

- There exist exactly two levels of memory.
- Optimal paging strategy: if main memory is full, the ideal block, i.e. the one that will be accessed farthest in the future, is evicted.
- Automatic replacement: when an algorithm accesses an element that is not currently stored in main memory, the relevant block is automatically fetched with a memory transfer.
- Full associativity: any block can be stored anywhere in memory.
- Tall cache assumption: the number of blocks M/B is larger than the size of each block B, or equivalently M ≥ B^2.

It is important to realize that if a cache-oblivious algorithm performs well between two levels of the hierarchy, then it must automatically work well between any two adjacent levels of the memory hierarchy. This is implied by the fact that a cache-oblivious algorithm depends neither on the memory size nor on the block size. Therefore if the algorithm is optimal in the two-level model, it is optimal on all levels of the hierarchy. This is the reason why the cache-oblivious model is useful: it allows convenient algorithm analysis in a simple two-level model while still deriving reasonable conclusions about the much more complex, multilayer memory systems found in contemporary computers.

2 Optimal Cache-Oblivious Priority Queue

2.1 Background and Motivation

A priority queue maintains a set of elements, each with a priority (or key), under the operations insert, delete and deletemin, where a deletemin operation finds and deletes the minimum key element in the queue. The goal is now to design such a data structure which is both optimal (that is, the number of memory transfers matches the sorting bound) as well as cache-oblivious (i.e. it should work without any knowledge of the memory and block size). These criteria are not satisfied by an implementation based on a heap or a balanced search tree as known from the RAM model.
The authors point out that even though several efficient priority queues for the I/O model are known, none of them can readily be made cache-oblivious. Since there exist cache-oblivious B-tree implementations supporting all operations in O(log_B N) memory transfers, this immediately implies the existence of an O(log_B N) cache-oblivious priority queue. However, as discussed above, in order to sort optimally a data structure performing all operations in O((1/B) log_{M/B}(N/B)) amortized memory transfers and O(log_2 N) amortized computation time is needed. This is exactly what the presented cache-oblivious priority queue achieves.
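For intuition, the gap between these two per-operation bounds is large even for moderate parameters. The numbers below use assumed values N = 2^30, M = 2^20, B = 2^12 (all in items); they are not taken from the paper:

```python
import math

N, M, B = 2**30, 2**20, 2**12
btree_pq = math.log(N, B)                 # O(log_B N) transfers per operation
optimal_pq = math.log(N / B, M / B) / B   # O((1/B) log_{M/B}(N/B)) per operation
print(btree_pq)     # ~2.5: a few transfers for every single operation
print(optimal_pq)   # ~0.00055: far less than one transfer per operation
```

In other words, the optimal structure amortizes so well that thousands of operations share a single block transfer, which is exactly what the sorting bound demands.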

2.2 Structure

In order to minimize the number of memory transfers, a technique which could be described as lazy evaluation using buffers is employed. The idea is to keep keys with similar priorities grouped into buffers in such a way that random I/O is avoided as much as possible. Intuitively a buffer holds a certain interval of elements and is used to move elements between levels. As a consequence all elements of a buffer can be processed sequentially, in one operation, thus amortizing the cache misses among all involved elements. To efficiently support the deletemin operation, an order among the elements - or at least among the buffers - needs to be maintained. Therefore the priority queue is structured in terms of levels which contain various buffers. The general idea is that smaller elements are stored in lower levels, and as the levels grow larger, so do the elements they contain. In particular, all insert and deletemin operations are performed on the lowest level; over time the larger elements rise up, whereas the smaller ones trickle down. During this process a level might reach its maximum capacity, in which case elements of a buffer need to be pushed one level up. Similarly, if a level becomes too empty, elements are pulled from the next higher level. As will be shown, the structure is carefully designed in such a way that these operations can be performed efficiently. The whole data structure is statically pre-allocated and is completely rebuilt after a certain number of operations. The following sections formally introduce this multilevel structure, the contained buffers as well as the maintained invariants.

Levels

The priority queue is built of Θ(log log N) levels. The largest level has size N and all subsequent levels decrease by the power 2/3 until a constant size c is reached. The levels are referred to by their respective sizes, and the i-th level from above has size N^{(2/3)^{i-1}}.
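The level sizes shrink doubly exponentially, which is where the Θ(log log N) level count comes from. A small sketch (the cutoff c = 32 and the rounding are assumptions for illustration):

```python
def level_sizes(N, c=32):
    """Level sizes from largest to smallest: N, N^(2/3), N^(4/9), ...
    down to the first size at or below the constant c."""
    sizes, x = [], float(N)
    while x > c:
        sizes.append(round(x))
        x = x ** (2 / 3)
    sizes.append(round(x))
    return sizes

sizes = level_sizes(2**30)
print(len(sizes))                          # only 6 levels for a billion elements
print(sum(3 * s for s in sizes) / 2**30)   # reserving 3X per level costs ~3.0 N overall
```

The second printed value previews the space bound shown later: since each level reserves three times its size and the sizes decay doubly exponentially, the whole structure fits in O(N) space.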
Thus the levels from largest to smallest are: level N, level N^{2/3}, ..., level X^{3/2}, level X, level X^{2/3}, ..., level c^{3/2}, level c.

Buffers

In order to efficiently move elements between different levels there exist two types of buffers. Up buffers store elements which are on their way up (i.e. they have not yet found the buffer they belong to) and will be stored in one of the down buffers higher up in the hierarchy. Similarly, down buffers store elements which are on their way down. Their size is chosen in such a way that the up buffer one level down can quickly be filled with the smallest elements among the down buffers of this level. Level X consists of one up buffer u^X that can store up to X elements, and at most X^{1/3} down buffers d^X_1, ..., d^X_{X^{1/3}}, each containing between (1/2)X^{2/3} and 2X^{2/3} elements. Notice that this means that each down buffer is at all times at least a quarter full. The element of each down buffer with the largest key is called the pivot element. In total the maximum capacity of level X is X + X^{1/3} · 2X^{2/3} = 3X.

[Figure 1: Levels X^{3/2}, X and X^{2/3} of the priority queue data structure with some example elements illustrating Invariants 1-3. Pivot elements are highlighted.]

The size of the down buffers is twice as large as the size of the up buffer one level down. As an example consider the down buffers on level X^{3/2}, which have size 2(X^{3/2})^{2/3} = 2X, whereas the up buffer u^X one level below has size X. Furthermore the following three invariants about the relationship between the elements in buffers of various levels are maintained.

Invariant 1. At level X, elements are sorted among the down buffers, that is, elements in d^X_i have smaller keys than elements in d^X_{i+1}, but the elements within a single d^X_i are unordered.

Invariant 2. At level X, the elements in the down buffers have smaller keys than the elements in the up buffer.

Invariant 3. The elements in the down buffers at level X have smaller keys than the elements in the down buffers at the next higher level X^{3/2}.

These invariants define intervals for the various buffers and ensure that the elements get larger as the levels grow.

Layout

The priority queue is stored in a contiguous array, where levels are placed consecutively from smallest to largest. Each level reserves space for its maximal capacity of 3X. The up buffer is stored first, followed by all down buffers in an arbitrary order, but linked together to form an ordered linked list.
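The reserved layout of a single level can be computed directly. The helper below is hypothetical (offsets are in array slots): it reserves the up buffer of size X followed by X^{1/3} down-buffer slots of maximal size 2X^{2/3} each, confirming the 3X capacity:

```python
def level_layout(X):
    """Return (up_buffer_range, down_buffer_ranges, total) for level X.
    The up buffer occupies [0, X); then at most X^(1/3) down buffers of
    maximal size 2 * X^(2/3) each. The logical down-buffer order lives in
    a separate linked list, not in the physical placement."""
    X = round(X)
    n_down = round(X ** (1 / 3))
    d_size = 2 * round(X ** (2 / 3))
    up = (0, X)
    down, off = [], X
    for _ in range(n_down):
        down.append((off, off + d_size))
        off += d_size
    return up, down, off

up, down, total = level_layout(4096)   # X = 4096: X^(1/3) = 16, X^(2/3) = 256
print(up)            # (0, 4096)
print(len(down))     # 16 down-buffer slots
print(down[0])       # (4096, 4608): each slot holds up to 512 elements
print(total)         # 12288 = 3 * 4096
```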

[Figure 2: Physical storage layout of level X, which has size 3X: the up buffer u^X of size X, followed by the at most X^{1/3} down-buffer slots of size 2X^{2/3} each. Notice the arbitrary, but linked-together, order of the down buffers (here d^X_1, d^X_3, d^X_2, d^X_4, ...).]

Summing up over all levels, Σ_{i=0}^{log_{3/2} log_c N} 3N^{(2/3)^i} = O(N), which leads to the following space requirement.

Lemma 1. The cache-oblivious priority queue uses O(N) space.

2.3 Operations

The priority queue works with two main operations: push, which inserts X elements into the next higher level X^{3/2}, and pull, which moves the X elements with smallest keys from level X^{3/2} down to the next lower one. Thus inserting an element into the priority queue corresponds to a push into the lowest level c. Similarly, a deletemin is implemented by a pull from the lowest level c.

Push

A push is used when level X is full, in which case the largest X elements are moved from level X into the level above, X^{3/2}. As a first step the X elements which are to be inserted into level X^{3/2} are sorted cache-obliviously using O(1 + (X/B) log_{M/B}(X/B)) memory transfers. Now that these X elements are sorted, they are distributed into the X^{1/2} down buffers of the next higher level X^{3/2}. Remember that the elements are sorted among the down buffers (Invariant 1), and each down buffer stores its largest element as a pivot element. Therefore distributing the elements works by visiting the down buffers in linked order and appending elements as long as they are smaller than the current down buffer's pivot element. Elements with keys larger than the pivot element of the last down buffer d^{X^{3/2}}_{X^{1/2}} are inserted into the up buffer u^{X^{3/2}} of the same level. While this process is fairly straightforward, a few corner cases need to be handled carefully:

Down buffer overflows: Remember that a down buffer on level X^{3/2} is twice as large as the up buffer on level X and thus has a maximal capacity of 2X. If during the distribution of elements this capacity is reached, the down buffer is split into two new down buffers and the elements are evenly distributed such that both contain X elements.

Algorithm 1: push X elements into level X^{3/2}

Input: an array A of size X
B := d^{X^{3/2}}_1
sort(A)
for all e ∈ A do
    {find the correct down buffer to insert the current element}
    while B ≠ nil and B ≠ u^{X^{3/2}} and e > pivot(B) do
        B := B.next
    end while
    if B = nil then                    {the element is too large for the down buffers}
        B := u^{X^{3/2}}               {hence prepare insertion into the up buffer}
    end if
    if B = u^{X^{3/2}} then
        insert-into-up-buffer({e})     {see Algorithm 2}
    else
        if |B| = 2X then               {down buffer full}
            {check whether there is space left for a new down buffer}
            if number-of-down-buffers-on-level(X^{3/2}) = X^{1/2} then
                {if not, move the content of the last down buffer to the up buffer}
                insert-into-up-buffer(d^{X^{3/2}}_{X^{1/2}})   {see Algorithm 2}
                d^{X^{3/2}}_{X^{1/2}} := ∅
            end if
            B_new := ∅                 {allocate a new down buffer}
            m := median(B)
            for all e' ∈ B do          {split the current buffer based on its median}
                if e' > m then
                    B := B \ {e'}
                    B_new := B_new ∪ {e'}
                end if
            end for
            {chain the new buffer into the linked list}
            B_new.next := B.next
            B.next := B_new
            {make sure the current element will be placed into the correct buffer}
            if e > pivot(B) then
                B := B_new
            end if
        end if
        B := B ∪ {e}                   {add the element to the buffer}
    end if
end for
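The buffer-splitting step of Algorithm 1 can also be written out in runnable form. The `Buffer` class and the list representation below are simplifications for illustration (the real structure stores elements in pre-allocated array slots and assumes distinct keys):

```python
import statistics

class Buffer:
    """A down buffer: an unordered collection of keys plus a link to its successor."""
    def __init__(self, elems=()):
        self.elems = list(elems)
        self.next = None
    def pivot(self):
        return max(self.elems)   # largest key of a down buffer

def split(buf):
    """Split a full down buffer around its median: the smaller half stays in
    place, the larger half moves to a fresh buffer chained right after it."""
    m = statistics.median_low(buf.elems)
    new = Buffer(e for e in buf.elems if e > m)
    buf.elems = [e for e in buf.elems if e <= m]
    new.next = buf.next          # chain the new buffer into the linked list
    buf.next = new
    return new

b = Buffer([7, 1, 9, 4, 8, 2])
split(b)
print(sorted(b.elems), sorted(b.next.elems))  # [1, 2, 4] [7, 8, 9]
```

Since only a median computation and one scan over the buffer are needed, this matches the median(X) + scan(X) + O(1) cost claimed in the text.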

Algorithm 2: insert a set of elements into the up buffer of level X^{3/2}, used by push

Input: an array A
for all e ∈ A do
    if |u^{X^{3/2}}| = X^{3/2} then    {check whether the up buffer is full}
        push(u^{X^{3/2}})              {if so, push all elements into the next higher level X^{9/4}}
        u^{X^{3/2}} := ∅
    end if
    u^{X^{3/2}} := u^{X^{3/2}} ∪ {e}
end for

This splitting step is a two-phase process: first the median of the elements is calculated, based on which the elements are partitioned into their respective buffers in a simple scan. When calculating the median it is assumed that the priority queue contains no duplicate keys, that is, no elements with the same priority. Since the down buffers are linked together to form an ordered list, the newly allocated buffer can be placed at the end (after all already existing down buffers), where space is reserved. All in all this case can be implemented in median(X) + scan(X) + O(1) memory transfers.

Level X^{3/2} already contains the maximum number X^{1/2} of down buffers: This case is problematic if the above splitting procedure happens when there is no space left to allocate a new down buffer. In this case the fewer than 2X elements of the last down buffer d^{X^{3/2}}_{X^{1/2}}, which by Invariant 1 are larger than all elements of the other down buffers, are moved into the up buffer u^{X^{3/2}}. Since the number of elements involved is bounded by 2X, this case can be handled in scan(X) + O(1) memory transfers.

Up buffer u^{X^{3/2}} overflows: If the up buffer reaches its maximum capacity of X^{3/2}, all of its elements are recursively pushed into the next higher level. Notice that after such a recursive push the up buffer is empty, and X^{3/2} elements have to be inserted before another recursive push is needed. The cost of this recursive push is for now ignored; it will be taken into account when doing an amortized analysis over all levels.

Let us now do an analysis with regard to the number of memory transfers needed to perform such a push operation.
First the X elements are sorted; during the distribution step X elements are scanned and in the worst case each of the X^{1/2} down buffers is visited. The special cases listed above can all be dealt with in scan(X) memory transfers, which means a push can be performed in

O(1) + sort(X) + scan(X) + X^{1/2} = O(1 + (X/B) log_{M/B}(X/B) + X^{1/2})

memory transfers. However, upon closer inspection the X^{1/2} term, which stems from the fact that during the distribution step non-full buffers might have to be written back, can be eliminated. To see this, a case distinction on X, the number of elements involved, is performed.

B^2 < X: or equivalently X^{1/2} < X/B, which immediately leads to O(1 + (X/B) log_{M/B}(X/B)).

B ≤ X ≤ B^2: in this case the X^{1/2} term could possibly dominate. The problem is that during the distribution step a down buffer could have to be written back even though its data does not amount to a full block. However, since X^{1/2} ≤ B ≤ M/B, where the second inequality is justified by the tall-cache assumption (M ≥ B^2), a block for each of the X^{1/2} down buffers can fit into memory. Notice that the operations take place on level X^{3/2} and since B^{3/2} ≤ X^{3/2} ≤ B^3 there exists only one such level. Therefore a fraction of the main memory can be reserved to hold such partially filled blocks until they become full and are written back to disk. Since the assumed optimal paging strategy will perform at least as well as the strategy outlined above, the X^{1/2} term can be eliminated.

X < B: this case induces no costs, since all levels of size less than B^{3/2} can be kept in main memory at all times.

Ignoring the cost of the recursion for now, it can be concluded that:

Lemma 2. A push of X elements from level X up to level X^{3/2} can be performed in O(1 + (X/B) log_{M/B}(X/B)) memory transfers amortized while maintaining Invariants 1-3.

Pull

A pull operation removes the X elements with smallest keys from level X^{3/2} and returns them in sorted order. It is used when there are not enough elements in the down buffers of level X. Recall that each down buffer at all times needs to be at least 1/4 full, which amounts to X/2 elements. During a pull X elements will be removed, but this invariant still has to be fulfilled. Therefore a case distinction on the number of elements contained within all down buffers is performed.
In the first case it is assumed that the down buffers contain at least (3/2)X elements. Since the maximal capacity of a down buffer at level X^{3/2} is 2X, the first three down buffers contain the smallest between (3/2)X and 6X elements. These elements are sorted using O(1 + (X/B) log_{M/B}(X/B)) memory transfers. The smallest X elements are removed, while the remaining between X/2 and 5X elements are left in one, two, or three down buffers of size between X/2 and 2X. These buffers can be constructed in O(1 + X/B) memory transfers, which means the sorting dominates. This procedure maintains Invariants 1-3.
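This first case can be sketched in runnable form. The list-of-lists representation and the even chunking are simplifications; a real implementation would balance the last chunks so that every buffer stays at least a quarter full:

```python
def pull_first_case(down, X):
    """Sort the contents of the first three down buffers, hand back the X
    smallest keys, and repack the remaining elements into at most three
    down buffers of capacity 2*X each."""
    rest = sorted(down[0] + down[1] + down[2])
    smallest, rest = rest[:X], rest[X:]
    cap = 2 * X
    repacked = [rest[i:i + cap] for i in range(0, len(rest), cap)]
    return smallest, repacked + down[3:]

d = [[9, 2, 5, 7], [11, 14, 10], [21, 17, 19], [30, 28, 33]]
out, d = pull_first_case(d, X=4)
print(out)  # [2, 5, 7, 9]: the four smallest keys, in sorted order
print(d)    # [[10, 11, 14, 17, 19, 21], [30, 28, 33]]
```

Note how the untouched fourth buffer keeps its larger keys, so the ordering among down buffers (Invariant 1) is preserved.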

Algorithm 3: pull from level X^{3/2}, remove the X smallest elements

Output: the X smallest elements in sorted order
{check whether the first three down buffers contain enough elements}
if |d^{X^{3/2}}_1 ∪ d^{X^{3/2}}_2 ∪ d^{X^{3/2}}_3| < (3/2)X then
    P := pull from X^{9/4}             {if not, pull X^{3/2} elements from the level above}
    U := |u^{X^{3/2}}|
    M := merge(P, sort(u^{X^{3/2}}))
    u^{X^{3/2}} := the U largest elements of M
    {distribute the remaining smaller elements into the down buffers}
    d^{X^{3/2}}_i := M \ u^{X^{3/2}}
end if
{sort the first 3 down buffers, return the X smallest elements and distribute the remaining ones}
T := sort(d^{X^{3/2}}_1 ∪ d^{X^{3/2}}_2 ∪ d^{X^{3/2}}_3)
S := the X smallest elements of T
d^{X^{3/2}}_1, d^{X^{3/2}}_2, d^{X^{3/2}}_3 := T \ S
return S

In the second case, where the down buffers contain fewer than (3/2)X elements, a recursive pull of X^{3/2} elements is performed on the next higher level. Recall that the keys of the elements in an up buffer are unordered relative to the keys of the elements in the down buffers one level up. Assume the up buffer u^{X^{3/2}} contains U elements; these are sorted and then merged with the already sorted elements pulled from the level above. Now that all elements are sorted, the U elements with the largest keys are inserted back into the up buffer; thus the number of elements in the up buffer is the same as before the pull operation. The remaining between X^{3/2} and X^{3/2} + (3/2)X elements are distributed into the X^{1/2} down buffers of size between X and X + (3/2)X^{1/2}. This procedure maintains the three invariants, and the down buffers now contain at least X^{3/2} elements, which means the previously discussed first case applies. As for the cost, it requires one sort and one scan of X^{3/2} elements, which is negligible compared to the cost of the recursive pull operation on the next level up. Ignoring these costs for now, it can be concluded that:

Lemma 3. A pull of X elements from level X^{3/2} down to level X can be performed in O(1 + (X/B) log_{M/B}(X/B)) memory transfers amortized while maintaining Invariants 1-3.

Total cost

In order to analyze the amortized cost of an insert or deletemin, a sequence of N/2 operations is considered with regard to the memory transfers performed in their respective push and pull invocations. After N/2 operations the structure is

completely rebuilt, such that all up buffers are empty and level X has X^{1/3} down buffers each containing X^{2/3} elements. Notice that this ensures that the largest level N is always of size Θ(N). The rebuilding can be performed in a sorting step using sort(N) memory transfers, or O((1/B) log_{M/B}(N/B)) transfers per operation.

A push of X elements from level X up to level X^{3/2} is charged to level X, because after such a push the up buffer u^X is completely empty and X elements will have to be inserted before another recursive push is needed. Similarly, a pull of X elements from level X^{3/2} down to level X is charged to level X, because X elements will have to be deleted from level X before another recursive pull is needed. During the N/2 operations, at most O(N/X) pushes and pulls are charged to level X. According to Lemmas 2 and 3, a push or pull charged to level X uses O(1 + (X/B) log_{M/B}(X/B)) memory transfers. Altogether, the amortized memory transfers during the N/2 operations charged to level X are bounded by O(1 + (1/B) log_{M/B}(X/B)). Thus, summing up over all levels, it can be concluded that the total amortized transfer cost of an insert or deletemin operation in the sequence of N/2 such operations is:

Σ_{i=0}^{O(log log N)} O((1/B) log_{M/B}(N^{(2/3)^i}/B)) = O((1/B) log_{M/B}(N/B))

The paper briefly mentions that a delete operation can be implemented in the same bounds and concludes with:

Theorem 1. A set of N elements can be maintained in a linear-space cache-oblivious priority queue data structure supporting each insert, deletemin and delete operation in O((1/B) log_{M/B}(N/B)) amortized memory transfers and O(log_2 N) amortized computing time.

3 Conclusion

On a personal note, I find the simplicity of the cache-oblivious model quite appealing. It is remarkable that such a universal, hardware-independent model can be used to predict certain algorithm characteristics of real-world systems. More concretely, the main insight I got from the paper is the idea of lazy evaluation using buffers.
That is, the technique of carefully crafting a data structure in such a way that just about enough data can be kept in memory, and of organizing the data such that the required operations can always be performed in a sequential fashion, thus yielding excellent I/O behaviour regardless of the underlying memory system. As for further information, the interested reader can find a more detailed description and analysis of the cache-oblivious priority queue in a follow-up paper by the same authors [4].

References

[1] L. Arge, M. A. Bender, E. D. Demaine, B. Holland-Minkley, and J. I. Munro. Cache-Oblivious Priority Queue and Graph Algorithm Applications. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), 2002.

[2] A. Aggarwal and J. S. Vitter. The Input/Output Complexity of Sorting and Related Problems. Communications of the ACM, 31(9), 1988.

[3] M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-Oblivious Algorithms. In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 1999.

[4] L. Arge, M. A. Bender, E. D. Demaine, B. Holland-Minkley, and J. I. Munro. An Optimal Cache-Oblivious Priority Queue and Its Application to Graph Algorithms. SIAM Journal on Computing, 36(6), 2007.


More information

Lecture 22 November 19, 2015

Lecture 22 November 19, 2015 CS 229r: Algorithms for ig Data Fall 2015 Prof. Jelani Nelson Lecture 22 November 19, 2015 Scribe: Johnny Ho 1 Overview Today we re starting a completely new topic, which is the external memory model,

More information

Multi-core Computing Lecture 2

Multi-core Computing Lecture 2 Multi-core Computing Lecture 2 MADALGO Summer School 2012 Algorithms for Modern Parallel and Distributed Models Phillip B. Gibbons Intel Labs Pittsburgh August 21, 2012 Multi-core Computing Lectures: Progress-to-date

More information

Cache-Oblivious String Dictionaries

Cache-Oblivious String Dictionaries Cache-Oblivious String Dictionaries Gerth Stølting Brodal University of Aarhus Joint work with Rolf Fagerberg #"! Outline of Talk Cache-oblivious model Basic cache-oblivious techniques Cache-oblivious

More information

l Heaps very popular abstract data structure, where each object has a key value (the priority), and the operations are:

l Heaps very popular abstract data structure, where each object has a key value (the priority), and the operations are: DDS-Heaps 1 Heaps - basics l Heaps very popular abstract data structure, where each object has a key value (the priority), and the operations are: l insert an object, find the object of minimum key (find

More information

Cache-Aware and Cache-Oblivious Adaptive Sorting

Cache-Aware and Cache-Oblivious Adaptive Sorting Cache-Aware and Cache-Oblivious Adaptive Sorting Gerth Stølting rodal 1,, Rolf Fagerberg 2,, and Gabriel Moruz 1 1 RICS, Department of Computer Science, University of Aarhus, IT Parken, Åbogade 34, DK-8200

More information

Database System Concepts

Database System Concepts Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth

More information

Unit 6 Chapter 15 EXAMPLES OF COMPLEXITY CALCULATION

Unit 6 Chapter 15 EXAMPLES OF COMPLEXITY CALCULATION DESIGN AND ANALYSIS OF ALGORITHMS Unit 6 Chapter 15 EXAMPLES OF COMPLEXITY CALCULATION http://milanvachhani.blogspot.in EXAMPLES FROM THE SORTING WORLD Sorting provides a good set of examples for analyzing

More information

External Sorting. Why We Need New Algorithms

External Sorting. Why We Need New Algorithms 1 External Sorting All the internal sorting algorithms require that the input fit into main memory. There are, however, applications where the input is much too large to fit into memory. For those external

More information

Treaps. 1 Binary Search Trees (BSTs) CSE341T/CSE549T 11/05/2014. Lecture 19

Treaps. 1 Binary Search Trees (BSTs) CSE341T/CSE549T 11/05/2014. Lecture 19 CSE34T/CSE549T /05/04 Lecture 9 Treaps Binary Search Trees (BSTs) Search trees are tree-based data structures that can be used to store and search for items that satisfy a total order. There are many types

More information

Chapter 13: Query Processing

Chapter 13: Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Algorithms for dealing with massive data

Algorithms for dealing with massive data Computer Science Department Federal University of Rio Grande do Sul Porto Alegre, Brazil Outline of the talk Introduction Outline of the talk Algorithms models for dealing with massive datasets : Motivation,

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

I/O Model. Cache-Oblivious Algorithms : Algorithms in the Real World. Advantages of Cache-Oblivious Algorithms 4/9/13

I/O Model. Cache-Oblivious Algorithms : Algorithms in the Real World. Advantages of Cache-Oblivious Algorithms 4/9/13 I/O Model 15-853: Algorithms in the Real World Locality II: Cache-oblivious algorithms Matrix multiplication Distribution sort Static searching Abstracts a single level of the memory hierarchy Fast memory

More information

Simple and Semi-Dynamic Structures for Cache-Oblivious Planar Orthogonal Range Searching

Simple and Semi-Dynamic Structures for Cache-Oblivious Planar Orthogonal Range Searching Simple and Semi-Dynamic Structures for Cache-Oblivious Planar Orthogonal Range Searching ABSTRACT Lars Arge Department of Computer Science University of Aarhus IT-Parken, Aabogade 34 DK-8200 Aarhus N Denmark

More information

The History of I/O Models Erik Demaine

The History of I/O Models Erik Demaine The History of I/O Models Erik Demaine MASSACHUSETTS INSTITUTE OF TECHNOLOGY Memory Hierarchies in Practice CPU 750 ps Registers 100B Level 1 Cache 100KB Level 2 Cache 1MB 10GB 14 ns Main Memory 1EB-1ZB

More information

Effect of memory latency

Effect of memory latency CACHE AWARENESS Effect of memory latency Consider a processor operating at 1 GHz (1 ns clock) connected to a DRAM with a latency of 100 ns. Assume that the processor has two ALU units and it is capable

More information

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 13: Query Processing Basic Steps in Query Processing

Chapter 13: Query Processing Basic Steps in Query Processing Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Final Exam in Algorithms and Data Structures 1 (1DL210)

Final Exam in Algorithms and Data Structures 1 (1DL210) Final Exam in Algorithms and Data Structures 1 (1DL210) Department of Information Technology Uppsala University February 0th, 2012 Lecturers: Parosh Aziz Abdulla, Jonathan Cederberg and Jari Stenman Location:

More information

Lecture 7 8 March, 2012

Lecture 7 8 March, 2012 6.851: Advanced Data Structures Spring 2012 Lecture 7 8 arch, 2012 Prof. Erik Demaine Scribe: Claudio A Andreoni 2012, Sebastien Dabdoub 2012, Usman asood 2012, Eric Liu 2010, Aditya Rathnam 2007 1 emory

More information

COMPUTER SCIENCE 4500 OPERATING SYSTEMS

COMPUTER SCIENCE 4500 OPERATING SYSTEMS Last update: 3/28/2017 COMPUTER SCIENCE 4500 OPERATING SYSTEMS 2017 Stanley Wileman Module 9: Memory Management Part 1 In This Module 2! Memory management functions! Types of memory and typical uses! Simple

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Cache Oblivious Matrix Transposition: Simulation and Experiment

Cache Oblivious Matrix Transposition: Simulation and Experiment Cache Oblivious Matrix Transposition: Simulation and Experiment Dimitrios Tsifakis, Alistair P. Rendell * and Peter E. Strazdins Department of Computer Science Australian National University Canberra ACT0200,

More information

* (4.1) A more exact setting will be specified later. The side lengthsj are determined such that

* (4.1) A more exact setting will be specified later. The side lengthsj are determined such that D D Chapter 4 xtensions of the CUB MTOD e present several generalizations of the CUB MTOD In section 41 we analyze the query algorithm GOINGCUB The difference to the CUB MTOD occurs in the case when the

More information

Cache-Adaptive Analysis

Cache-Adaptive Analysis Cache-Adaptive Analysis Michael A. Bender1 Erik Demaine4 Roozbeh Ebrahimi1 Jeremy T. Fineman3 Rob Johnson1 Andrea Lincoln4 Jayson Lynch4 Samuel McCauley1 1 3 4 Available Memory Can Fluctuate in Real Systems

More information

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics 1 Sorting 1.1 Problem Statement You are given a sequence of n numbers < a 1, a 2,..., a n >. You need to

More information

Cache-oblivious comparison-based algorithms on multisets

Cache-oblivious comparison-based algorithms on multisets Cache-oblivious comparison-based algorithms on multisets Arash Farzan 1, Paolo Ferragina 2, Gianni Franceschini 2, and J. Ian unro 1 1 {afarzan, imunro}@uwaterloo.ca School of Computer Science, University

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

UNIT III BALANCED SEARCH TREES AND INDEXING

UNIT III BALANCED SEARCH TREES AND INDEXING UNIT III BALANCED SEARCH TREES AND INDEXING OBJECTIVE The implementation of hash tables is frequently called hashing. Hashing is a technique used for performing insertions, deletions and finds in constant

More information

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,

More information

Scan and its Uses. 1 Scan. 1.1 Contraction CSE341T/CSE549T 09/17/2014. Lecture 8

Scan and its Uses. 1 Scan. 1.1 Contraction CSE341T/CSE549T 09/17/2014. Lecture 8 CSE341T/CSE549T 09/17/2014 Lecture 8 Scan and its Uses 1 Scan Today, we start by learning a very useful primitive. First, lets start by thinking about what other primitives we have learned so far? The

More information

II (Sorting and) Order Statistics

II (Sorting and) Order Statistics II (Sorting and) Order Statistics Heapsort Quicksort Sorting in Linear Time Medians and Order Statistics 8 Sorting in Linear Time The sorting algorithms introduced thus far are comparison sorts Any comparison

More information

Removing Belady s Anomaly from Caches with Prefetch Data

Removing Belady s Anomaly from Caches with Prefetch Data Removing Belady s Anomaly from Caches with Prefetch Data Elizabeth Varki University of New Hampshire varki@cs.unh.edu Abstract Belady s anomaly occurs when a small cache gets more hits than a larger cache,

More information

Soft Heaps And Minimum Spanning Trees

Soft Heaps And Minimum Spanning Trees Soft And George Mason University ibanerje@gmu.edu October 27, 2016 GMU October 27, 2016 1 / 34 A (min)-heap is a data structure which stores a set of keys (with an underlying total order) on which following

More information

Hashing Based Dictionaries in Different Memory Models. Zhewei Wei

Hashing Based Dictionaries in Different Memory Models. Zhewei Wei Hashing Based Dictionaries in Different Memory Models Zhewei Wei Outline Introduction RAM model I/O model Cache-oblivious model Open questions Outline Introduction RAM model I/O model Cache-oblivious model

More information

Database Technology. Topic 7: Data Structures for Databases. Olaf Hartig.

Database Technology. Topic 7: Data Structures for Databases. Olaf Hartig. Topic 7: Data Structures for Databases Olaf Hartig olaf.hartig@liu.se Database System 2 Storage Hierarchy Traditional Storage Hierarchy CPU Cache memory Main memory Primary storage Disk Tape Secondary

More information

1 Motivation for Improving Matrix Multiplication

1 Motivation for Improving Matrix Multiplication CS170 Spring 2007 Lecture 7 Feb 6 1 Motivation for Improving Matrix Multiplication Now we will just consider the best way to implement the usual algorithm for matrix multiplication, the one that take 2n

More information

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Greedy Algorithms (continued) The best known application where the greedy algorithm is optimal is surely

More information

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang CHAPTER 6 Memory 6.1 Memory 341 6.2 Types of Memory 341 6.3 The Memory Hierarchy 343 6.3.1 Locality of Reference 346 6.4 Cache Memory 347 6.4.1 Cache Mapping Schemes 349 6.4.2 Replacement Policies 365

More information

FOUR EDGE-INDEPENDENT SPANNING TREES 1

FOUR EDGE-INDEPENDENT SPANNING TREES 1 FOUR EDGE-INDEPENDENT SPANNING TREES 1 Alexander Hoyer and Robin Thomas School of Mathematics Georgia Institute of Technology Atlanta, Georgia 30332-0160, USA ABSTRACT We prove an ear-decomposition theorem

More information

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings On the Relationships between Zero Forcing Numbers and Certain Graph Coverings Fatemeh Alinaghipour Taklimi, Shaun Fallat 1,, Karen Meagher 2 Department of Mathematics and Statistics, University of Regina,

More information

Seminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm

Seminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Seminar on A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Mohammad Iftakher Uddin & Mohammad Mahfuzur Rahman Matrikel Nr: 9003357 Matrikel Nr : 9003358 Masters of

More information

Computational Geometry in the Parallel External Memory Model

Computational Geometry in the Parallel External Memory Model Computational Geometry in the Parallel External Memory Model Nodari Sitchinava Institute for Theoretical Informatics Karlsruhe Institute of Technology nodari@ira.uka.de 1 Introduction Continued advances

More information

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 6. Sorting Algorithms

SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 6. Sorting Algorithms SAMPLE OF THE STUDY MATERIAL PART OF CHAPTER 6 6.0 Introduction Sorting algorithms used in computer science are often classified by: Computational complexity (worst, average and best behavior) of element

More information

CACHE-OBLIVIOUS MAPS. Edward Kmett McGraw Hill Financial. Saturday, October 26, 13

CACHE-OBLIVIOUS MAPS. Edward Kmett McGraw Hill Financial. Saturday, October 26, 13 CACHE-OBLIVIOUS MAPS Edward Kmett McGraw Hill Financial CACHE-OBLIVIOUS MAPS Edward Kmett McGraw Hill Financial CACHE-OBLIVIOUS MAPS Indexing and Machine Models Cache-Oblivious Lookahead Arrays Amortization

More information

External Memory. Philip Bille

External Memory. Philip Bille External Memory Philip Bille Outline Computationals models Modern computers (word) RAM I/O Cache-oblivious Shortest path in implicit grid graphs RAM algorithm I/O algorithms Cache-oblivious algorithm Computational

More information

L9: Storage Manager Physical Data Organization

L9: Storage Manager Physical Data Organization L9: Storage Manager Physical Data Organization Disks and files Record and file organization Indexing Tree-based index: B+-tree Hash-based index c.f. Fig 1.3 in [RG] and Fig 2.3 in [EN] Functional Components

More information

Algorithms and Data Structures

Algorithms and Data Structures Algorithms and Data Structures Spring 2019 Alexis Maciel Department of Computer Science Clarkson University Copyright c 2019 Alexis Maciel ii Contents 1 Analysis of Algorithms 1 1.1 Introduction.................................

More information

arxiv: v1 [cs.ds] 1 May 2015

arxiv: v1 [cs.ds] 1 May 2015 Strictly Implicit Priority Queues: On the Number of Moves and Worst-Case Time Gerth Stølting Brodal, Jesper Sindahl Nielsen, and Jakob Truelsen arxiv:1505.00147v1 [cs.ds] 1 May 2015 MADALGO, Department

More information

Basic Data Structures (Version 7) Name:

Basic Data Structures (Version 7) Name: Prerequisite Concepts for Analysis of Algorithms Basic Data Structures (Version 7) Name: Email: Concept: mathematics notation 1. log 2 n is: Code: 21481 (A) o(log 10 n) (B) ω(log 10 n) (C) Θ(log 10 n)

More information

Sorting and Selection

Sorting and Selection Sorting and Selection Introduction Divide and Conquer Merge-Sort Quick-Sort Radix-Sort Bucket-Sort 10-1 Introduction Assuming we have a sequence S storing a list of keyelement entries. The key of the element

More information

Optimal Parallel Randomized Renaming

Optimal Parallel Randomized Renaming Optimal Parallel Randomized Renaming Martin Farach S. Muthukrishnan September 11, 1995 Abstract We consider the Renaming Problem, a basic processing step in string algorithms, for which we give a simultaneously

More information

ICS 691: Advanced Data Structures Spring Lecture 8

ICS 691: Advanced Data Structures Spring Lecture 8 ICS 691: Advanced Data Structures Spring 2016 Prof. odari Sitchinava Lecture 8 Scribe: Ben Karsin 1 Overview In the last lecture we continued looking at arborally satisfied sets and their equivalence to

More information

Worst-case running time for RANDOMIZED-SELECT

Worst-case running time for RANDOMIZED-SELECT Worst-case running time for RANDOMIZED-SELECT is ), even to nd the minimum The algorithm has a linear expected running time, though, and because it is randomized, no particular input elicits the worst-case

More information

Joint Entity Resolution

Joint Entity Resolution Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute

More information

CS Operating Systems

CS Operating Systems CS 4500 - Operating Systems Module 9: Memory Management - Part 1 Stanley Wileman Department of Computer Science University of Nebraska at Omaha Omaha, NE 68182-0500, USA June 9, 2017 In This Module...

More information

CS Operating Systems

CS Operating Systems CS 4500 - Operating Systems Module 9: Memory Management - Part 1 Stanley Wileman Department of Computer Science University of Nebraska at Omaha Omaha, NE 68182-0500, USA June 9, 2017 In This Module...

More information

CHAPTER 6 Memory. CMPS375 Class Notes Page 1/ 16 by Kuo-pao Yang

CHAPTER 6 Memory. CMPS375 Class Notes Page 1/ 16 by Kuo-pao Yang CHAPTER 6 Memory 6.1 Memory 233 6.2 Types of Memory 233 6.3 The Memory Hierarchy 235 6.3.1 Locality of Reference 237 6.4 Cache Memory 237 6.4.1 Cache Mapping Schemes 239 6.4.2 Replacement Policies 247

More information

Cache-Oblivious Algorithms and Data Structures

Cache-Oblivious Algorithms and Data Structures Cache-Oblivious Algorithms and Data Structures Erik D. Demaine MIT Laboratory for Computer Science, 200 Technology Square, Cambridge, MA 02139, USA, edemaine@mit.edu Abstract. A recent direction in the

More information

CSE 638: Advanced Algorithms. Lectures 18 & 19 ( Cache-efficient Searching and Sorting )

CSE 638: Advanced Algorithms. Lectures 18 & 19 ( Cache-efficient Searching and Sorting ) CSE 638: Advanced Algorithms Lectures 18 & 19 ( Cache-efficient Searching and Sorting ) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2013 Searching ( Static B-Trees ) A Static

More information

CSE 332: Data Structures & Parallelism Lecture 12: Comparison Sorting. Ruth Anderson Winter 2019

CSE 332: Data Structures & Parallelism Lecture 12: Comparison Sorting. Ruth Anderson Winter 2019 CSE 332: Data Structures & Parallelism Lecture 12: Comparison Sorting Ruth Anderson Winter 2019 Today Sorting Comparison sorting 2/08/2019 2 Introduction to sorting Stacks, queues, priority queues, and

More information

3 Competitive Dynamic BSTs (January 31 and February 2)

3 Competitive Dynamic BSTs (January 31 and February 2) 3 Competitive Dynamic BSTs (January 31 and February ) In their original paper on splay trees [3], Danny Sleator and Bob Tarjan conjectured that the cost of sequence of searches in a splay tree is within

More information

I/O-Algorithms Lars Arge Aarhus University

I/O-Algorithms Lars Arge Aarhus University I/O-Algorithms Aarhus University April 10, 2008 I/O-Model Block I/O D Parameters N = # elements in problem instance B = # elements that fits in disk block M = # elements that fits in main memory M T =

More information

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.

More information

Heap-on-Top Priority Queues. March Abstract. We introduce the heap-on-top (hot) priority queue data structure that combines the

Heap-on-Top Priority Queues. March Abstract. We introduce the heap-on-top (hot) priority queue data structure that combines the Heap-on-Top Priority Queues Boris V. Cherkassky Central Economics and Mathematics Institute Krasikova St. 32 117418, Moscow, Russia cher@cemi.msk.su Andrew V. Goldberg NEC Research Institute 4 Independence

More information

3.2 Cache Oblivious Algorithms

3.2 Cache Oblivious Algorithms 3.2 Cache Oblivious Algorithms Cache-Oblivious Algorithms by Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. In the 40th Annual Symposium on Foundations of Computer Science,

More information

Faster parameterized algorithms for Minimum Fill-In

Faster parameterized algorithms for Minimum Fill-In Faster parameterized algorithms for Minimum Fill-In Hans L. Bodlaender Pinar Heggernes Yngve Villanger Technical Report UU-CS-2008-042 December 2008 Department of Information and Computing Sciences Utrecht

More information

Applied Algorithm Design Lecture 3

Applied Algorithm Design Lecture 3 Applied Algorithm Design Lecture 3 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 3 1 / 75 PART I : GREEDY ALGORITHMS Pietro Michiardi (Eurecom) Applied Algorithm

More information

From Static to Dynamic Routing: Efficient Transformations of Store-and-Forward Protocols

From Static to Dynamic Routing: Efficient Transformations of Store-and-Forward Protocols SIAM Journal on Computing to appear From Static to Dynamic Routing: Efficient Transformations of StoreandForward Protocols Christian Scheideler Berthold Vöcking Abstract We investigate how static storeandforward

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part V Lecture 15, March 15, 2015 Mohammad Hammoud Today Last Session: DBMS Internals- Part IV Tree-based (i.e., B+ Tree) and Hash-based (i.e., Extendible

More information

File Structures and Indexing

File Structures and Indexing File Structures and Indexing CPS352: Database Systems Simon Miner Gordon College Last Revised: 10/11/12 Agenda Check-in Database File Structures Indexing Database Design Tips Check-in Database File Structures

More information

Massive Data Algorithmics. Lecture 1: Introduction

Massive Data Algorithmics. Lecture 1: Introduction . Massive Data Massive datasets are being collected everywhere Storage management software is billion-dollar industry . Examples Phone: AT&T 20TB phone call database, wireless tracking Consumer: WalMart

More information

Approximation Algorithms

Approximation Algorithms Chapter 8 Approximation Algorithms Algorithm Theory WS 2016/17 Fabian Kuhn Approximation Algorithms Optimization appears everywhere in computer science We have seen many examples, e.g.: scheduling jobs

More information