Lecture Summary
CSC 263H
August 5, 2016

This document is a very brief overview of what we did in each lecture; it is by no means a replacement for attending lecture or doing the readings.

1. Week 1

This week was just an introduction to the course. The main idea is to measure the complexity of various functions and come up with smart ways (data structures) to store information. We reviewed asymptotic notation (Big-Oh, Big-Omega, Big-Theta) and the difference between worst case and best case. The big takeaway was that Big-Oh and Big-Omega do not mean Best Case and Worst Case. Instead they serve as upper and lower bounds on any function's runtime; Best Case and Worst Case are simply functions. We used Insertion Sort as a running example: we showed its worst case running time is Θ(n^2) and its best case running time is Θ(n).

We did a brief review of probability. The important concepts were: independence, mutually exclusive events, conditional probability, and how to find the intersection and union of events.

2. Week 2

This week our ADT was Priority Queues and our Data Structure was Heaps. Priority Queues support Insert(x), Max(), and ExtractMax(). We talked about how one could implement a Priority Queue using sorted and unsorted linked lists, but in the worst case many of the operations required Θ(n) time.

Heaps are a special type of binary tree. They are nearly complete (each row is full except the last, which may have a gap on the right end) and each element has a key at least as large as its children's keys. Because of this structure we can represent them without pointers, using only an array. Heaps support Insert(x) and ExtractMax() in O(log n) time and Max() in O(1) time. We saw how to build heaps via repeated insertion in Θ(n log n) time, and the smarter method that bubbles keys down in O(n) time. We also saw how to sort an array in place using heaps in O(n log n) time.

3. Week 3

This week our ADT was Dictionaries and our Data Structure was AVL trees.
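As a rough illustration of Week 2's array-based heap, here is a minimal max-heap sketch in Python (the class and method names are my own, not from lecture):

```python
class MaxHeap:
    """Array-based binary max-heap: the parent of index i is (i-1)//2."""
    def __init__(self):
        self.a = []

    def insert(self, x):
        # O(log n): append at the end, then bubble up.
        self.a.append(x)
        i = len(self.a) - 1
        while i > 0 and self.a[(i - 1) // 2] < self.a[i]:
            self.a[(i - 1) // 2], self.a[i] = self.a[i], self.a[(i - 1) // 2]
            i = (i - 1) // 2

    def max(self):
        # O(1): the root holds the largest key.
        return self.a[0]

    def extract_max(self):
        # O(log n): move the last element to the root, then bubble down.
        top, last = self.a[0], self.a.pop()
        if self.a:
            self.a[0] = last
            i = 0
            while True:
                l, r, big = 2 * i + 1, 2 * i + 2, i
                if l < len(self.a) and self.a[l] > self.a[big]:
                    big = l
                if r < len(self.a) and self.a[r] > self.a[big]:
                    big = r
                if big == i:
                    break
                self.a[i], self.a[big] = self.a[big], self.a[i]
                i = big
        return top
```

Repeatedly calling extract_max yields the keys in decreasing order, which is the idea behind the in-place HeapSort mentioned above.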
Dictionaries support Insert(x), Search(k), and Delete(k). AVL trees are a special type of BST where each node stores the height of the subtree it roots. Using this we can determine when a subtree becomes unbalanced: the heights of its two children's subtrees differ by 2 or more. AVL trees are balanced, and because of this their height is O(log n). To do AVL operations like Insert/Delete/Search we just do normal BST operations, then fix the height information, bubbling our way towards the root. Whenever the balance condition breaks we identify one of 4 cases and do the appropriate rotation to fix it. AVL rotations require only O(1) time; combined with the O(log n) height, this means AVL operations require only O(log n) time.

We saw how to further augment an AVL tree by storing at each node information related to the sum of all keys under that node. For more information see the sample solutions posted on the website.

4. Week 4

This week we looked at hashing and QuickSort. Hash tables are an efficient way to implement Dictionaries. We looked at direct access tables, a fancy name for arrays: we associate each potential input with a slot in the array and keep some information about that input in its slot. The problem is that this is very memory inefficient, and potentially impossible if the input space is unbounded. Hash tables are a compromise: we have m slots and a hash function h() which maps potential keys K to a slot in {0, 1, ..., m−1}. If two keys end up in the same slot we resolve this with chaining (the slot contains a linked list of all of the elements hashed to it). We usually assume that the hash function is easy to compute and spreads its keys out evenly (SUHA). Under these assumptions (and if we are careful with how we construct the table) we can search for a key in O(1) expected time.

We saw QuickSort, another sorting algorithm. In the worst case it requires Θ(n^2) comparisons.
It does much better in expectation, requiring only O(n log n) comparisons on average. We analyzed QuickSort, proving the average case bound.

5. Week 5

We finished the analysis of QuickSort (showing how to use randomization to get around any input distribution). We showed how to use hashing to solve a problem presented earlier in the course (see the sample solutions). We introduced the ADT of mergeable heaps. These are regular priority queues, but they additionally support merging two structures into one quickly.
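The randomized QuickSort analyzed above can be sketched as follows (a non-in-place version for clarity; picking the pivot uniformly at random is what defeats any fixed bad input order):

```python
import random

def quicksort(a):
    """Randomized QuickSort: expected O(n log n) comparisons,
    worst case Theta(n^2). Returns a new sorted list."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)  # random pivot: no input distribution can force the worst case
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```

The in-place partitioning version from lecture has the same expected comparison count; only the memory usage differs.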
6. Week 6

Mergeable heaps support all of the Priority Queue operations plus a special Union(X1, X2) operation, which merges two queues together. We saw how to implement this with Binomial Heaps. A Binomial Heap is a collection of Binomial Trees; each tree is itself a heap (it satisfies a max or min level-by-level ordering). We saw how to build Binomial Trees by combining two Binomial Trees of the same order to get one of the next order. A Binomial Heap of n nodes has one tree for each 1 in the binary representation of n: if there is a 1 in position i, we have a tree of order i. All operations were based around the Union operation, which can be done in O(log n) time in a fashion similar to binary addition. Because of this, all operations took O(log n) time.

7. Week 7

Midterm.

8. Week 8

We introduced Graphs (an ADT for representing various problems: routes between cities, links on Facebook). A graph is comprised of n nodes and m edges. The edges (or arcs) link the nodes (or vertices) together. An edge can be directed or undirected, and edges can be weighted or unweighted. We saw two ways of representing a graph. In an adjacency list each node points to a list of the nodes it is adjacent to; that is, there is an outgoing edge from the node to each node in its list. In an adjacency matrix we have an n by n matrix where cell (i, j) records whether there is an edge going from i to j. The space required for a list is O(n + m); the space required for a matrix is O(n^2). The matrix supports checking for an edge in constant time, while the list requires us to scan each element adjacent to that particular node. We saw several different types of graphs, such as ones that are connected (each node can reach each other node via a path of edges) or acyclic (no node can reach itself via a nontrivial path of edges). Trees are a special type of connected acyclic graph, with exactly n − 1 edges and a designated root node.
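The two graph representations above can be sketched as follows (a directed, unweighted graph on nodes 0..n−1; the function names are my own):

```python
def adjacency_list(n, edges):
    """O(n + m) space: each node maps to the list of its out-neighbors."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    return adj

def adjacency_matrix(n, edges):
    """O(n^2) space, but edge lookup (i, j) is O(1)."""
    mat = [[False] * n for _ in range(n)]
    for u, v in edges:
        mat[u][v] = True
    return mat
```

For a sparse graph (m much smaller than n^2) the list saves a lot of space; for edge-existence queries the matrix wins.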
9. Week 9

We saw how to compute the shortest path from a special root node to every other node in an unweighted graph using Breadth First Search. This algorithm returns a tree which allows us to trace the shortest path from the source node to any node (along with the cost of this path). The runtime depends on the graph representation: for lists the runtime is O(n + m); for a matrix it is O(n^2). When we have non-negative edge weights, we saw how to use Dijkstra's algorithm to calculate the lowest cost path (where the cost is the sum of edge weights) from a source node to each other node in the graph. Like BFS, Dijkstra's returns a tree along with the cost to each node from the source. We proved that BFS was correct.
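A minimal BFS sketch over the adjacency-list representation, returning the distances and the parent pointers that encode the BFS tree described above (names are my own):

```python
from collections import deque

def bfs(adj, source):
    """Unweighted shortest paths from source.
    Returns (dist, parent); following parent pointers back to the
    source traces a shortest path. O(n + m) with adjacency lists."""
    n = len(adj)
    dist = [None] * n     # None marks "not yet discovered"
    parent = [None] * n
    dist[source] = 0
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if dist[v] is None:      # first discovery = shortest distance
                dist[v] = dist[u] + 1
                parent[v] = u
                q.append(v)
    return dist, parent
```

Dijkstra's algorithm has the same shape, but replaces the FIFO queue with a priority queue keyed on tentative path cost.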
10. Week 10

We introduced Depth First Search. Unlike BFS, DFS returns a DFS forest (a collection of DFS trees) since it explores nodes in a somewhat arbitrary order. The forest is not always the same; it depends on the order in which the algorithm initializes the nodes. The actual returned value is simply a pointer from each node to its parent (if there is one) in the DFS forest, plus an annotation of each node with a start time (when the node is first explored) and a finish time (when the node is closed). The runtime of the algorithm is the same as BFS (and depends on the graph representation). We saw several theorems related to DFS. Perhaps the most important was the Parentheses Theorem, which showed us how to determine (in constant time) the ancestor/descendant relation in the DFS forest using start and finish times. We also showed how to label the edges of the graph using the start and finish times of the nodes. Tree Edges are edges in the DFS forest. Back Edges point from a node to one of its ancestors in the DFS forest. Forward Edges point from a node to a descendant (other than a direct child) in the DFS forest. Cross Edges are all other edges. Labeling edges on the fly as we execute DFS incurs only constant extra work per edge, and thus does not affect the asymptotic runtime. We showed that a graph is cycle free iff it contains no Back Edges in any DFS forest; that is, to find whether a graph contains cycles we simply run a DFS and check that we never label an edge as a Back Edge. A little later we showed how ordering nodes by reverse finish time allows us to Topologically Sort a Directed Acyclic Graph: if an edge (u, v) represents that task v requires task u to be completed, Topological Sort gives an ordering where u comes before v.

11. Week 11

We studied Amortized Analysis this week, using dynamic arrays as an example.
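The DFS-based topological sort described above can be sketched as follows (it assumes the input is a DAG given as an adjacency list; names are my own):

```python
def topological_sort(adj):
    """DFS every unvisited node; a node finishes only after all of its
    descendants, so reversing the finish order gives a topological order.
    Assumes adj describes a DAG on nodes 0..n-1."""
    n = len(adj)
    visited = [False] * n
    order = []

    def dfs(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                dfs(v)
        order.append(u)      # u's finish time: after all its descendants

    for u in range(n):
        if not visited[u]:
            dfs(u)
    return order[::-1]       # reverse finish order
```

A full implementation from lecture would also record explicit start/finish counters and label edges; this sketch keeps only what the sort needs.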
When inserting into a full array, any further insertion would be a problem since we would be writing into memory we don't own. Instead, at a large cost, we must copy all the contents of the array into a bigger one. While most insert operations are cheap, some (but few) are expensive. Inspired by this we studied the Amortized Cost of a sequence of operations: the worst case cost of any sequence of m operations, divided by m. That is, it is the average cost per operation over a worst case sequence. We showed that if the array is doubled in size when full, the Amortized Cost is O(1); that is, on average the insertion operation is cheap. We showed this with two techniques, the Aggregate and Accounting methods. In the Aggregate method we simply write out the worst case sequence (which is sometimes very hard to do) and sum up the costs. In the Accounting method we devise a pretend charge for each operation. If we slightly overestimate the time it takes to do the cheap inserts, we can use our surplus pretend time to pay for the expensive operations when they come. At a charge of 3 units of time per insert we always had enough banked time to stay ahead of the actual time. We also showed how to contract arrays that become too empty: if the array ever becomes only 1/4 full we shrink its size by 1/2. In conjunction with doubling a full array this leads to an Amortized Cost of O(1), which we showed with the Accounting Method.
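A minimal dynamic-array sketch with the doubling and quarter-full shrinking rules above (Python lists already resize themselves internally, so this is purely illustrative; names are my own):

```python
class DynamicArray:
    """Fixed-capacity backing store, doubled when full and halved when
    only 1/4 full; both rules give O(1) amortized append/pop."""
    def __init__(self):
        self.cap = 1
        self.n = 0
        self.buf = [None] * self.cap

    def _resize(self, new_cap):
        # The expensive step: copy every element into a new buffer.
        new_buf = [None] * new_cap
        new_buf[:self.n] = self.buf[:self.n]
        self.buf, self.cap = new_buf, new_cap

    def append(self, x):
        if self.n == self.cap:           # full: double the capacity
            self._resize(2 * self.cap)
        self.buf[self.n] = x
        self.n += 1

    def pop(self):
        self.n -= 1
        x = self.buf[self.n]
        self.buf[self.n] = None
        if self.cap > 1 and self.n <= self.cap // 4:   # 1/4 full: halve
            self._resize(self.cap // 2)
        return x
```

Shrinking only at 1/4 full (rather than 1/2) is what prevents a pathological sequence of alternating appends and pops from triggering a resize every time.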
12. Week 12

We studied the Disjoint Sets ADT this week. Disjoint Sets are useful to some algorithms which calculate minimum spanning trees (below) and for representing connected components in a graph. A Disjoint Set structure maintains a partition of n elements into non-overlapping sets. It supports inserting a new element into its own singleton set, MakeSet(x); merging the sets which contain elements x and y into one new set, Union(x, y); and finding a special representative element of the set containing x, FindSet(x). The main use of FindSet(x) is to determine whether two elements are in the same set: two elements have the same representative iff they are in the same set. We saw 7 different ways of implementing Disjoint Sets. The most effective uses trees with Path Compression and Union by Rank. This method led to an amortized cost for m operations, n of which are MakeSet, of O(m α(n)), where α is an incredibly slowly growing function: for example α(10^80) ≤ 5, and 10^80 is more than the number of atoms in the observable universe.

We also studied Minimum Spanning Trees this week. In an undirected and connected graph, a spanning tree is a tree which connects each node to each other node. While there are many trees that connect the entire graph, a MST is one with the lowest overall weight (the sum of the taken edge weights). There can be several MSTs, but if the edge weights are unique there is only one minimum spanning tree. There are two techniques for building MSTs: starting with no edges and adding edges until we have a MST, or starting with all the edges and removing edges until we have a MST. Which is faster depends on how many edges are in the graph. Since a MST must have n − 1 edges, if we have only a few more edges than n − 1 we should use the removal algorithm, but if the graph is very dense we should use the adding algorithm. We focused on the adding algorithm, presenting a general technique that adds safe edges to the tree until we have a MST.
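The tree-based Disjoint Sets implementation with Path Compression and Union by Rank described above can be sketched as follows (names are my own):

```python
class DisjointSets:
    """Forest-based disjoint sets. Union by rank keeps trees shallow;
    path compression flattens them on every find, giving the
    O(m * alpha(n)) amortized bound."""
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find_set(self, x):
        if self.parent[x] != x:
            # Path compression: point x directly at the root.
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:   # attach shorter tree under taller
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
```

Either heuristic alone already helps; it is the combination that yields the near-constant amortized cost quoted above.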
We explored the idea of a safe edge using the idea of cuts and edges crossing a cut; this was called the MST Cut Theorem. While we never actually used this idea explicitly, it was useful for showing that later algorithms were correct. We saw two edge-adding algorithms: Prim's and Kruskal's. Prim's algorithm starts from a source node and builds one tree outwards from it. That is, it keeps track of one connected component (initially just the source) and repeatedly adds the smallest edge with one end in the component and the other end outside it. We didn't see how to do this, but it can be implemented efficiently with Fibonacci Heaps. Kruskal's algorithm keeps several disjoint components (initially just the individual nodes). At each round it joins together two disjoint components, the two connected by the smallest remaining edge. When there is only one component left we have found the MST. Using Disjoint Sets the runtime of Kruskal's is O(m log m); the runtime is entirely dominated by the time required to sort the edges by weight.
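A minimal sketch of Kruskal's algorithm as described above, with a small inline union-find tracking the disjoint components (names are my own):

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) tuples on nodes 0..n-1.
    Returns a list of MST edges; O(m log m), dominated by sorting."""
    parent = list(range(n))

    def find(x):
        # Find the component representative, halving paths as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):        # smallest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                     # edge joins two disjoint components
            parent[ru] = rv
            mst.append((w, u, v))
    return mst
```

For a connected input graph the returned list has exactly n − 1 edges, matching the tree fact from Week 8.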