Dynamic Data Structures. John Ross Wallrabenstein


Recursion Review In the previous lecture, I asked you to find a recursive definition for the following problem: Given: A base-10 number X Assume X has N digits Output: X printed vertically

Print Vertically Did anyone come up with a solution?

Print Vertically

Print Vertically What is our base case?

Print Vertically What is our base case? When X is a single digit: print X

Print Vertically What is our base case? When X is a single digit: print X What is our recursive call?

Print Vertically What is our base case? When X is a single digit: print X What is our recursive call? When X has more than one digit:

Print Vertically What is our base case? When X is a single digit: print X What is our recursive call? When X has more than one digit: Print the N-1 most significant digits

Print Vertically What is our base case? When X is a single digit: print X What is our recursive call? When X has more than one digit: Print the N-1 most significant digits Print the least significant digit

Print Vertically

public class PrintVertically {
    public static void main(String[] args) {
        int x = 1234;
        System.out.println("writeVertical(" + x + "):");
        writeVertical(x);
    }

    public static void writeVertical(int x) {
        // If X is a single digit
        if (x < 10) {
            System.out.println(x);
        }
        // If X has more than a single digit
        else {
            // Print the most significant digits vertically
            writeVertical(x / 10);
            // Print the least significant digit vertically
            System.out.println(x % 10);
        }
    }
}

Asymptotic Notation When analyzing an algorithm, it is convenient to have a simple way to express the efficiency of the algorithm This allows us to compare the efficiency of two different algorithms

Big-Oh Notation Assumption: Our algorithm is defined in terms of functions whose domains are the set of natural numbers N={0,1,2,...}

Big-Oh Notation We would like to bound the running time of our algorithm by a measure of the number of atomic (constant time) operations it performs

Big-Oh Notation We say that a function f(n) is O(g(n)) iff (not a typo!) there exist positive constants c and n0 such that f(n) <= c*g(n) for all n >= n0

Big-Oh Notation Formally: O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 <= f(n) <= c*g(n) for all n >= n0 } For example, 3n + 10 is O(n): taking c = 4 and n0 = 10 gives 3n + 10 <= 4n for all n >= 10

Big-Oh Notation

Big-Oh Notation Example: This program runs in O(n) time

for (int i = 0; i < n; ++i) {
    // Atomic Operations
}

Big-Oh Notation Example: This program runs in O(n^2) time

for (int i = 0; i < n; ++i) {
    // Atomic Operations
    for (int j = 0; j < n; ++j) {
        // Atomic Operations
    }
}

Asymptotic Analysis The smaller the asymptotic bound, the faster the algorithm e.g. O(n) is preferable to O(n^2)

Data Structures

Data Structures What are they?

Data Structures What are they? A data structure is an abstraction used to model a set of related data

Data Structures What are they? A data structure is an abstraction used to model a set of related data What data structures are you familiar with?

Data Structures What are they? A data structure is an abstraction used to model a set of related data What data structures are you familiar with? Most Common Example: Arrays

Data Structures What are they? A data structure is an abstraction used to model a set of related data What data structures are you familiar with? Most Common Example: Arrays Food for Thought: File Systems

Static Data Structures Static Data Structures Data Structures that do not change* over the lifetime of their use *This usually refers to the size of the structure

Static Data Structures Benefits: Fast Access to Individual Elements Drawbacks: Expensive to update e.g. Insertion, Deletion Finite size known at compile time e.g. Arrays

Dynamic Data Structures Dynamic Data Structures Data Structures that change over their lifetime, allowing for efficient updates

Dynamic Data Structures Benefits: Fast Updates (Insertion, Deletion) Flexible Size Drawbacks: Slow access to individual elements

Arrays Recall that the size of an array must be declared at compile time How does this affect insertion/deletion?

Array Deletion When we delete an element from an array, we usually waste memory If we want to reclaim this memory, we have to create a new array (of size n-1) and copy all n-1 elements This is inefficient!
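
As a rough Java sketch of why this is O(n) (removeAt is a hypothetical helper, not from the slides):

// Remove the element at 'index' by copying the remaining n-1 elements
// into a fresh array -- every deletion costs O(n) copies.
static int[] removeAt(int[] array, int index) {
    int[] smaller = new int[array.length - 1];
    for (int i = 0, j = 0; i < array.length; i++) {
        if (i != index) {
            smaller[j++] = array[i];
        }
    }
    return smaller;
}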

Array Insertion Case: Insert an element between existing elements We have to copy the entire array to a new location, leaving space for the new element This is inefficient!

Array Insertion Case: Insert (push) an element onto the end of a full array We have to: Allocate space for an array of size n+1 Copy n elements into the new array Add new element to the end Deallocate memory for old array (For C Programmers)

Array Insertion Can we do better than O(n) time? How?

Array Insertion Improving Insertion Efficiency Rule: When the array is full and a new element is to be added, proceed by: Allocating space for an array of size 2n Copy original n elements Add new element

Array Insertion How does this help? In the amortized case, insertion now takes O(1)! Amortized analysis averages the total cost of a sequence of operations over the number of operations performed In this case: A single push is effectively O(1)
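
A minimal sketch of the doubling strategy in Java (the class and method names are illustrative, not from the slides):

class DynamicIntArray {
    private int[] data = new int[1];  // underlying fixed-size array
    private int size = 0;             // number of elements actually stored

    // Amortized O(1): most pushes just write into a free slot;
    // the occasional O(n) resize doubles the capacity.
    void push(int value) {
        if (size == data.length) {
            int[] bigger = new int[2 * data.length];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = value;
    }

    int get(int index) {
        return data[index];
    }
}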

ArrayLists You have probably seen ArrayLists An ArrayList is a hybrid between a static data structure (arrays) and a fully dynamic structure ArrayLists use the previous optimization when inserting elements into a full array

Linked Lists A linked list is a dynamic data structure composed of nodes Each node contains: Data Pointer to next node in list* *In the case of a singly linked list

Linked Lists How is the order of the elements in an array determined? By the index of the element How is the order of the elements in a linked list determined? By the next pointer in each node

Linked Lists What are the benefits of this structure? Insertion: Point the current node's next to the predecessor's next Point the predecessor's next to the current node No resizing!

Linked Lists What are the benefits of this structure? Deletion: Point the previous node's (predecessor) next to the current node's next (successor) No inefficient use of memory!
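
A minimal singly linked list sketch in Java illustrating these two operations (the class and method names are illustrative, not from the slides):

class IntNode {
    int data;
    IntNode next;
    IntNode(int data) { this.data = data; }
}

class IntLinkedList {
    IntNode head;

    // Insert a new node right after 'predecessor': no resizing, O(1).
    void insertAfter(IntNode predecessor, int value) {
        IntNode node = new IntNode(value);
        node.next = predecessor.next;   // point the new node's next to the predecessor's next
        predecessor.next = node;        // point the predecessor's next to the new node
    }

    // Delete the node right after 'predecessor': just bypass it, O(1).
    void deleteAfter(IntNode predecessor) {
        if (predecessor.next != null) {
            predecessor.next = predecessor.next.next;  // predecessor now points to the successor
        }
    }
}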

Linked Lists What are the drawbacks of this structure? Representation on disk may be fragmented Why? Linked Lists do not need contiguous blocks of memory Thus, nodes can be stored anywhere

Dynamic Sets When building algorithms, sets of data must be able to change over time* We must be able to: Insert Delete Test Membership *Have we seen this phrase before?...

Dynamic Sets What dynamic sets have you seen before? Stacks Queues Linked Lists Etc... Let s look at a particular application of sets

Disjoint Sets Two sets A and B are said to be disjoint if their intersection is empty It is often useful to partition a larger set into smaller, disjoint sets How can we keep track (Find) of the elements and combine (Union) the smaller sets into larger sets efficiently? Dynamic Data Structures to the rescue!

Disjoint Sets A disjoint-set data structure maintains a collection S={S1,S2,...,Sk} of disjoint dynamic sets Each set is identified by a representative

Disjoint Set Operations We require the following operations to be supported: Make-Set(x) - create a new set whose only member is x Union(x,y) - unites the dynamic sets that contain elements x and y Find-Set(x) - return a pointer to the representative of the set containing x

Why We Care Before proceeding, why should we worry about how to use dynamic data structures to handle dynamic disjoint sets? Common Application: Determining the Connected Components of a Graph See Board

Union / Find We will focus on constructing a dynamic data structure that computes Union and Find efficiently

Union / Find To support the operations Union and Find on disjoint sets, we will use a disjoint set forest of trees Setup: Each member points only to its parent The root is the representative of the set (and also its own parent)

Union / Find

Find To implement Find(x) and return its representative element: Follow parent pointers until the root is reached Return the root (representative)

Union To implement Union(x,y) Set the parent of one of the set s representatives to the root of the other set
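
A bare-bones array-based sketch of this forest in Java, without any of the optimizations discussed next (the class and field names are illustrative, not from the slides):

class DisjointSetForest {
    private final int[] parent;

    // Make-Set for elements 0..n-1: each element starts as its own root.
    DisjointSetForest(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    // Find: follow parent pointers until we reach the root (representative).
    int find(int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Union: point one set's root at the other set's root.
    void union(int x, int y) {
        parent[find(x)] = find(y);
    }
}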

Union / Find Consider the following scenario: We have objects x1, x2, ..., xn We make each xi a set with Make-Set(xi) We Union each xi into one large set See Board Does order matter?

Union / Find Consider the running time of Find if we only use the tools we have seen so far: To Find element X in a set that forms a linear chain of n nodes takes O(n) time If we have n calls to Find, our running time is then O(n^2) Can we do better?

Union / Find What is the source of inefficiency? We are appending a larger set onto a smaller set (xi) How can we fix this? Maintain the size of each set (easy) Always append the smaller set onto the larger set Use path compression to reduce tree height

Union By Rank To always append the smaller set to the larger set, we maintain a rank for each set The rank is an upper bound on the height of the set's tree During a Union, the root with the smaller rank is made to point to the root with the larger rank

Union By Rank See Board for Example Performance Analysis: With union by rank alone, a sequence of m operations on n elements runs in O(m log n) time The proof of this bound is beyond the scope of this course

Path Compression To lower the number of parent nodes we have to pass through to reach the root in a Find(x) operation: Run Find(x) once to retrieve the root Run Find(x) again, pointing each node along the way directly to the root See Board Running time:

Combining Both When both Union By Rank and Path Compression are used: Running Time: O(m α(n)) for a sequence of m operations on n elements Here, α(n) is the inverse Ackermann function, which grows so slowly that it is effectively constant
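
Putting the two optimizations together, one possible Java sketch looks like this (the class and method names are illustrative, not from the slides):

class UnionFind {
    private final int[] parent;
    private final int[] rank;  // upper bound on each tree's height

    UnionFind(int n) {
        parent = new int[n];
        rank = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;  // Make-Set for every element
    }

    // Find with path compression: after the recursion unwinds,
    // every node on the path points directly at the root.
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }

    // Union by rank: the root with the smaller rank is attached to the
    // root with the larger rank; equal ranks attach either way and bump the rank.
    void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx == ry) return;
        if (rank[rx] < rank[ry]) {
            parent[rx] = ry;
        } else if (rank[rx] > rank[ry]) {
            parent[ry] = rx;
        } else {
            parent[ry] = rx;
            rank[rx]++;
        }
    }
}

For the connected-components application mentioned earlier, each edge (u, v) triggers union(u, v), and two vertices end up in the same component exactly when find(u) == find(v).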

Questions