Lecture 5: Using Data Structures to Improve Dijkstra's Algorithm (AM&O Sections 4.6-4.8 and Appendix A)

Manipulating the Data in Dijkstra's Algorithm

The bottleneck operation in Dijkstra's Algorithm is finding the minimum temporary label to use in extending the shortest path. This is an example of a data manipulation problem. In this variant, we maintain a HEAP (also called a PRIORITY QUEUE) of items, each item x having a value key(x). While this collection is maintained, the following operations are performed:

insert(x): add an item x with value key(x) to HEAP
delete min: find and delete the item on HEAP having the minimum key
decrease key(x, v): replace the key of an item x on HEAP by a smaller value v

More specifically, in the course of Dijkstra's Algorithm these operations are called with the following frequencies:

insert: O(n) times
delete min: O(n) times
decrease key: O(m) times

Question: How fast can we get Dijkstra's algorithm to run by executing these operations efficiently?
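The call frequencies can be checked on a small instance. The sketch below is not from the lecture: `ArrayHeap`, its `calls` counter, and the adjacency-dictionary input format are illustrative assumptions. It uses the simplest possible "heap" (an unordered dictionary, so delete min scans every item) and counts how often Dijkstra's algorithm invokes each operation.

```python
class ArrayHeap:
    """Unordered 'heap' (hypothetical helper): O(1) insert and
    decrease_key, O(n) delete_min, with a counter per operation."""

    def __init__(self):
        self.key = {}
        self.calls = {"insert": 0, "delete_min": 0, "decrease_key": 0}

    def insert(self, x, k):
        self.calls["insert"] += 1
        self.key[x] = k

    def decrease_key(self, x, k):
        self.calls["decrease_key"] += 1
        self.key[x] = k

    def delete_min(self):
        self.calls["delete_min"] += 1
        x = min(self.key, key=self.key.get)   # linear scan for the minimum
        return x, self.key.pop(x)


def dijkstra(adj, s):
    """adj maps a node to a list of (neighbor, nonnegative cost) pairs."""
    heap, dist = ArrayHeap(), {}
    heap.insert(s, 0)
    while heap.key:
        u, d = heap.delete_min()              # u becomes permanently labeled
        dist[u] = d
        for v, c in adj.get(u, []):
            if v in dist:
                continue                      # v is already permanent
            if v not in heap.key:
                heap.insert(v, d + c)
            elif d + c < heap.key[v]:
                heap.decrease_key(v, d + c)
    return dist, heap.calls


adj = {"s": [("a", 2), ("b", 5)], "a": [("b", 1), ("t", 6)], "b": [("t", 2)]}
dist, calls = dijkstra(adj, "s")
print(dist, calls)
```

On this 4-node, 5-arc network each node is inserted and deleted once, and decrease key is called at most once per arc, in line with the O(n), O(n), O(m) counts.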

Data-Independent Data Manipulations: Tree Heaps

rooted tree: a directed tree with root node r in which all edges are directed toward r.
children of a node v: the nodes at the tails of arcs pointing into v; v is the parent of each of its children.
leaf: any node having no children.
depth of a node v: the number of edges in the path from v to the root.
heap: a rooted tree in which each node v has an associated value key(v), such that the key of any node is the smallest among the keys of the nodes in its subtree.

binary heap: a heap in which every node has at most 2 children.
data structure for binary heaps: an array in which nodes are stored in order from top to bottom. The root has position 1, its children have positions 2 and 3, and in general the nodes of depth i occupy positions $2^i$ to $2^{i+1} - 1$.
assumption: the binary heap is always maintained so that its r elements occupy the first r positions in the array.
consequence: every element of the tree has depth at most $\log_2 r$.
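With the root stored at position 1, the array layout reduces to index arithmetic. The helpers below are a minimal sketch under that 1-based assumption; the function names are illustrative and do not come from the lecture.

```python
def parent(i):
    """Position of the parent of the node at position i (i >= 2)."""
    return i // 2


def children(i, r):
    """Positions of the children of node i in a heap holding r elements."""
    return [c for c in (2 * i, 2 * i + 1) if c <= r]


def depth(i):
    """Depth of position i: positions 2**k .. 2**(k+1) - 1 have depth k."""
    return i.bit_length() - 1


print(parent(7), children(3, 6), depth(6))
```

Because the r elements fill positions 1 through r with no gaps, the deepest element sits at position r, so no depth exceeds the floor of log2(r).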

[Figure: A Binary Heap]

Elementary Operations

siftup(x): takes an element x whose key has been decreased and moves it up the tree to the appropriate place.
siftdown(x): takes an element x whose key has been increased and moves it down the tree to the appropriate place.

The Procedures

In both procedures x denotes the element being moved, so x changes position with each swap.

siftup(x)
    while x is not the root
        let y be the parent of x
        if key(y) <= key(x) then stop
        swap the entries in x and y
    end while

siftdown(x)
    while x is not a leaf
        let y be the child of x with minimum key
        if key(y) >= key(x) then stop
        swap the entries in x and y
    end while

Complexity: each procedure moves x along a single path between the root and a leaf, so it is linear in the depth of the tree = O(log n)
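The two procedures can be sketched on a plain Python list. This is an illustration under stated assumptions, not the lecture's code: the list stores the keys themselves, index 0 is left unused so the root sits at position 1, and each function returns the final position of the moved element.

```python
def siftup(heap, i):
    """Move heap[i] up until its parent's key is no larger."""
    while i > 1 and heap[i // 2] > heap[i]:
        heap[i // 2], heap[i] = heap[i], heap[i // 2]
        i //= 2
    return i


def siftdown(heap, i):
    """Move heap[i] down until both children have keys at least as large."""
    r = len(heap) - 1                      # number of stored elements
    while 2 * i <= r:                      # while position i is not a leaf
        c = 2 * i
        if c + 1 <= r and heap[c + 1] < heap[c]:
            c += 1                         # c is the child with minimum key
        if heap[c] >= heap[i]:
            break
        heap[c], heap[i] = heap[i], heap[c]
        i = c
    return i


up = [None, 3, 5, 8, 9, 1]                 # key 1 at position 5 is too small
siftup(up, 5)
down = [None, 10, 3, 8, 9, 5]              # key 10 at the root is too large
siftdown(down, 1)
print(up, down)
```

Each loop iteration moves the element one level, so the number of swaps is bounded by the depth of the tree.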

[Figure: A deletemin (siftdown) Operation]

[Figure: An insert (siftup) Operation]

Heap Operations

Suppose the heap currently has r elements in it.

insert(x): place element x in the (r + 1)th position in the tree and perform siftup(x).
delete min: remove the root of the tree, move the r-th element y to the root position, and perform siftdown(y).
decrease key(x, v): change the key of element x to v and perform siftup(x).

Complexity: every operation has complexity O(log n)
Total complexity of Dijkstra: O(n log n + m log n) = O(m log n) (assuming m >= n - 1)
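A compact sketch of the three operations, together with a Dijkstra loop that uses them, might look as follows. The class name `BinaryHeap`, the `pos` dictionary that records each item's array position (needed so decrease key can find x), and the adjacency-dictionary input are assumptions made for this example.

```python
class BinaryHeap:
    """Array-based binary min-heap; the root sits at position 1."""

    def __init__(self):
        self.items = [None]      # items[0] is unused
        self.key = {}            # item -> current key
        self.pos = {}            # item -> position in self.items

    def __len__(self):
        return len(self.items) - 1

    def _place(self, i, x):
        self.items[i] = x
        self.pos[x] = i

    def _siftup(self, i):
        x = self.items[i]
        while i > 1 and self.key[self.items[i // 2]] > self.key[x]:
            self._place(i, self.items[i // 2])
            i //= 2
        self._place(i, x)

    def _siftdown(self, i):
        x, r = self.items[i], len(self)
        while 2 * i <= r:
            c = 2 * i
            if c + 1 <= r and self.key[self.items[c + 1]] < self.key[self.items[c]]:
                c += 1
            if self.key[self.items[c]] >= self.key[x]:
                break
            self._place(i, self.items[c])
            i = c
        self._place(i, x)

    def insert(self, x, k):
        self.key[x] = k
        self.items.append(x)             # position r + 1
        self._siftup(len(self))

    def delete_min(self):
        root = self.items[1]
        last = self.items.pop()          # the r-th element
        del self.pos[root]
        if len(self) >= 1:
            self._place(1, last)
            self._siftdown(1)
        return root, self.key.pop(root)

    def decrease_key(self, x, k):
        if k > self.key[x]:
            raise ValueError("new key must not exceed the old key")
        self.key[x] = k
        self._siftup(self.pos[x])


def dijkstra(adj, s):
    """adj maps a node to a list of (neighbor, nonnegative cost) pairs."""
    dist, heap = {}, BinaryHeap()
    heap.insert(s, 0)
    while len(heap):
        u, d = heap.delete_min()
        dist[u] = d
        for v, c in adj.get(u, []):
            if v in dist:
                continue
            if v not in heap.pos:
                heap.insert(v, d + c)
            elif d + c < heap.key[v]:
                heap.decrease_key(v, d + c)
    return dist


adj = {"s": [("a", 2), ("b", 5)], "a": [("b", 1), ("t", 6)], "b": [("t", 2)]}
print(dijkstra(adj, "s"))
```

Tracking positions in a dictionary is one design choice among several; a node-indexed array would serve equally well when nodes are numbered.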

d-heaps

For any integer d >= 2, the d-heap structure is similar to the binary heap structure, except that each node has at most d children.

complexity of siftup: $O(\log_d n)$
complexity of siftdown: $O(d \log_d n)$
total complexity of Dijkstra: $O(nd \log_d n + m \log_d n)$

Best choice of d: $d = m/n$ (rounded up, and at least 2). This makes the complexity $O(m \log_{m/n} n)$, which is linear time for even slightly dense graphs ($m = \Omega(n^{1+\epsilon})$ for fixed $\epsilon > 0$).
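The linear-time claim for slightly dense graphs follows from a short calculation. The derivation below is a sketch that assumes $m \ge n$ and writes the density condition as $m \ge c\,n^{1+\epsilon}$ for some constant $c > 0$.

```latex
\begin{align*}
d = \max\{2, \lceil m/n \rceil\}
  &\;\Rightarrow\; nd = O(m), \\
O(nd \log_d n + m \log_d n)
  &= O(m \log_d n) = O\!\left(\frac{m \log n}{\log d}\right), \\
m \ge c\, n^{1+\epsilon}
  &\;\Rightarrow\; \log d \ge \log(m/n) \ge \epsilon \log n + \log c, \\
  &\;\Rightarrow\; \log_d n = O(1/\epsilon), \\
\text{total} &= O(m/\epsilon) = O(m) \quad \text{for fixed } \epsilon > 0 .
\end{align*}
```

Choosing d near m/n balances the two terms: the n delete min operations cost about as much in total as the m decrease key operations.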

Fibonacci Heaps

Fibonacci heaps maintain a forest of heaps, cutting and regrouping heaps in the course of the heap operations. Complexities are average ("amortized") times per operation, taken over all operations of that type performed in the algorithm.

insert: O(1)
decrease key: O(1)
delete min: O(log n)
total complexity of Dijkstra: O(m + n log n)

This is the best known bound for any data-independent algorithm on sparse graphs, and theoretically the best possible for any such algorithm that outputs nodes in order of increasing key. Unfortunately, the structure is complicated to implement and only outperforms simpler data structures on very large networks.

Data-Dependent Data Manipulation

Recall the linear-time algorithm for sorting n items whose keys are known to take on a fixed number of values $v_1, v_2, \ldots, v_r$. First you initialize r BUCKETS, and when an item with key $v_i$ comes in you put it in BUCKET(i). By reading the buckets in order, you get the sorted list.

radix heaps: a method of manipulating heaps that uses the keys as the storage locations.
additional problem parameter: assume all data are integer. Let C be the largest arc cost.
Fact: Every finite label used in Dijkstra's Algorithm appears in the set {0, 1, ..., nC}.
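The bucket sort recalled above can be sketched in a few lines. The function name `bucket_sort` and its parameters are illustrative; `values` is assumed to list the r possible key values in increasing order.

```python
def bucket_sort(items, key, values):
    """Sort items whose keys all come from the known sorted list `values`."""
    index = {v: i for i, v in enumerate(values)}   # key value -> bucket number
    buckets = [[] for _ in values]                 # initialize r buckets
    for x in items:
        buckets[index[key(x)]].append(x)           # drop x into BUCKET(i)
    return [x for b in buckets for x in b]         # read the buckets in order


pairs = [("c", 3), ("a", 1), ("b", 3), ("d", 2)]
print(bucket_sort(pairs, key=lambda p: p[1], values=[1, 2, 3]))
```

The running time is O(n + r), with no comparisons between keys: the key value itself determines where an item is stored.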

Using Bucket Sorts: Dial's Implementation

Initialize an array of nC + 1 buckets indexed by {0, 1, ..., nC}, and place the marker min value at the 0th bucket. Now the heap operations are as follows.

insert(x): place node x with initial (finite) key key(x) into the key(x)th bucket.
decrease key(x, v): take x out of the key(x)th bucket and put it into the vth bucket.
delete min: increase min value successively until a nonempty bucket is reached. (Why?) Remove an element from that bucket.
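The three bucket operations can be combined into a short sketch of Dijkstra's algorithm. The function name `dial_dijkstra`, the node numbering 0..n-1, and the arc-list input are assumptions made for this example; arc costs are taken to be nonnegative integers.

```python
def dial_dijkstra(n, arcs, s):
    """Shortest path labels from s; arcs is a list of (u, v, cost) triples."""
    adj = [[] for _ in range(n)]
    for u, v, c in arcs:
        adj[u].append((v, c))
    C = max((c for _, _, c in arcs), default=0)    # largest arc cost
    buckets = [set() for _ in range(n * C + 1)]    # one bucket per possible label
    INF = float("inf")
    key = [INF] * n
    done = [False] * n

    key[s] = 0
    buckets[0].add(s)                              # insert(s) with key 0
    min_value = 0
    remaining = 1                                  # nodes currently in buckets
    while remaining:
        while not buckets[min_value]:              # delete min: advance marker
            min_value += 1
        u = buckets[min_value].pop()
        remaining -= 1
        done[u] = True
        for v, c in adj[u]:
            new = key[u] + c
            if done[v] or new >= key[v]:
                continue
            if key[v] == INF:
                remaining += 1                     # insert(v)
            else:
                buckets[key[v]].remove(v)          # decrease key: leave old bucket
            key[v] = new
            buckets[new].add(v)
    return key


arcs = [(0, 1, 2), (0, 2, 5), (1, 2, 1), (1, 3, 6), (2, 3, 2)]
print(dial_dijkstra(4, arcs, 0))
```

The marker only moves forward because every new label is at least the label of the node just removed, which is why scanning upward from min value always finds the minimum. Unreachable nodes keep the label infinity.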

[Figure: Dial Example (Stage 1): network and bucket array with node and pred rows]

[Figure: Dial Example (Stage 2): network and bucket array with node and pred rows]

[Figure: Dial Example (Stage 3): network and bucket array with node and pred rows]

[Figure: Dial Example (Stages 4 & 5): network and bucket array with node and pred rows]

[Figure: Final Network]

complexity of operations: O(1) for insert and decrease key, O(nC) total for all delete min operations (O(C) amortized complexity).
Complexity of Dial's Algorithm: O(m + nC)
The number of buckets can be reduced from nC + 1 to C + 1 by observing that there are never more than C + 1 distinct key values in the set at any one time.
Not polynomial ("pseudo-polynomial"), but the fastest Dijkstra implementation in empirical tests.

Radix Heaps

Idea: Group the keys into about log(nC) buckets, each bucket containing a range of keys, with the range in each succeeding bucket having twice as many numbers as the last. This requires searching through a large bucket to find the minimum element, while redistributing the searched elements into several more finely graded buckets.

Initialization: Bucket 0 has all nodes with value 0, and for $i = 1, \ldots, \lceil \log(nC) \rceil$, bucket i has all nodes with values from $2^{i-1}$ to $2^i - 1$. Each bucket is labeled with the range of distance values it contains. (This can be done simply by marking the binary place values of the numbers.) The marker min value is placed at the lowest bucket.

insert, decrease key: Simply pull the node out of its current bucket (decrease key only) and place it into the bucket whose range contains the new key.
delete min: Starting at min value, the buckets are searched until the first nonempty bucket b is found. If this bucket has a range of size greater than 1, say it is bucket i with range size $2^{i-1}$, it is then expanded into i sub-buckets of sizes $1, 1, 2, 4, \ldots, 2^{i-2}$. (These can take the place of the i buckets before b, since they are now empty.) As the elements are searched for the minimum value, they are redistributed into the appropriate newly created bucket.
Complexity: whenever an element is moved for decrease key or delete min, it always goes down the list of buckets. At any stage there are only O(log(nC)) buckets, and thus the total time taken by all of these moves is O(n log(nC)).
Total complexity of Dijkstra using radix heaps: O(m + n log(nC)), which is polynomial in the input size.
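A radix heap can be sketched compactly if the bucket ranges are kept implicit. The code below is an illustration, not the lecture's bookkeeping: instead of storing explicit range labels, it uses a common formulation in which an item with key k lives in bucket number bit_length(k XOR last), where `last` is the most recently deleted minimum. The class name and the `max_key` parameter are assumptions, and keys are assumed to be integers between `last` and `max_key` (which holds in Dijkstra's algorithm with max_key = nC).

```python
class RadixHeap:
    """Monotone priority queue for integer keys in 0..max_key."""

    def __init__(self, max_key):
        self.last = 0                                  # last deleted minimum
        self.buckets = [set() for _ in range(max_key.bit_length() + 1)]
        self.key = {}

    def _index(self, k):
        # Bucket 0 holds keys equal to last; bucket i holds keys that first
        # differ from last at binary place i - 1 (a range of width 2**(i-1)).
        return (k ^ self.last).bit_length()

    def insert(self, x, k):
        if k < self.last:
            raise ValueError("key below the last deleted minimum")
        self.key[x] = k
        self.buckets[self._index(k)].add(x)

    def decrease_key(self, x, k):
        if not self.last <= k <= self.key[x]:
            raise ValueError("new key must lie between last minimum and old key")
        self.buckets[self._index(self.key[x])].remove(x)
        self.insert(x, k)                              # never lands in a higher bucket

    def delete_min(self):
        if not self.key:
            raise IndexError("delete_min from an empty heap")
        if not self.buckets[0]:
            b = next(b for b in self.buckets if b)     # first nonempty bucket
            self.last = min(self.key[x] for x in b)    # search it for the minimum
            moved = list(b)
            b.clear()
            for x in moved:                            # redistribute into lower buckets
                self.buckets[self._index(self.key[x])].add(x)
        x = self.buckets[0].pop()
        return x, self.key.pop(x)


h = RadixHeap(31)
h.insert("a", 13)
h.insert("b", 9)
h.insert("c", 20)
first = h.delete_min()
h.decrease_key("c", 11)
print(first, h.delete_min(), h.delete_min())
```

Every move sends an item to a bucket with an equal or smaller index, so each item is moved at most once per bucket level; this is the source of the O(n log(nC)) bound on redistribution work.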

[Figure: Radix Example (Stage 1): network and bucket ranges with their nodes]

[Figure: Radix Example (Stage 2): network and bucket ranges with their nodes]

[Figure: Radix Example (Stage 3): network and bucket ranges with their nodes]

[Figure: Radix Example (Stages 4 & 5): network and bucket ranges with their nodes]

Complexity of Dijkstra's Algorithm Using Various Data Structures

n = number of nodes
m = number of arcs
C = $\max c_{ij}$
T = total over all operations

data structure   insert             decrease key       delete min         total complexity
basic algorithm  O(1)               O(1)               O(n)               $O(n^2)$
binary heaps     O(log n)           O(log n)           O(log n)           O(m log n)
d-heaps          $O(\log_d n)$      $O(\log_d n)$      $O(d \log_d n)$    $O(m \log_{m/n} n)$
Dial             O(1)               O(1)               O(nC) T            O(m + nC)
radix heaps      O(n log nC) T      O(n log nC) T      O(n log nC) T      O(m + n log nC)
Fibonacci heaps  O(m) T             O(m) T             O(n log n) T       O(m + n log n)

Other Implementations

improved priority queue: O(m log log C)
radix + Fibonacci: $O(m + n\sqrt{\log C})$

Best Known Complexity of Dijkstra (depending upon parameters)

$O(\min\{m + n \log n,\ m \log\log C,\ m + n\sqrt{\log C}\})$