Algorithms and Data Structures

Similar documents
Algorithms and Data Structures

Binary Trees, Binary Search Trees

Algorithms and Data Structures

Trees. (Trees) Data Structures and Programming Spring / 28

Algorithms and Data Structures

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

Computational Optimization ISE 407. Lecture 16. Dr. Ted Ralphs

Sorted Arrays. Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min

Lec 17 April 8. Topics: binary Trees expression trees. (Chapter 5 of text)

) $ f ( n) " %( g( n)

Introduction to Computers and Programming. Concept Question

TREES. Trees - Introduction

Red-Black trees are usually described as obeying the following rules :

INSTITUTE OF AERONAUTICAL ENGINEERING

Trees. Eric McCreath

Data Structure. IBPS SO (IT- Officer) Exam 2017

Bioinformatics Programming. EE, NCKU Tien-Hao Chang (Darby Chang)

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

Advanced Set Representation Methods

Algorithms in Systems Engineering ISE 172. Lecture 16. Dr. Ted Ralphs

8. Write an example for expression tree. [A/M 10] (A+B)*((C-D)/(E^F))

Algorithms and Data Structures

(2,4) Trees Goodrich, Tamassia (2,4) Trees 1


CS301 - Data Structures Glossary By

CS24 Week 8 Lecture 1

March 20/2003 Jayakanth Srinivasan,

Sorting and Selection

CS102 Binary Search Trees

Binary Trees

Binary Heaps in Dynamic Arrays

(2,4) Trees. 2/22/2006 (2,4) Trees 1

logn D. Θ C. Θ n 2 ( ) ( ) f n B. nlogn Ο n2 n 2 D. Ο & % ( C. Θ # ( D. Θ n ( ) Ω f ( n)

Algorithms and Data Structures

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Introduction and Overview

CS-301 Data Structure. Tariq Hanif

Fundamentals of Data Structure

Data Structures Question Bank Multiple Choice

Course Review for Finals. Cpt S 223 Fall 2008

FINALTERM EXAMINATION Fall 2009 CS301- Data Structures Question No: 1 ( Marks: 1 ) - Please choose one The data of the problem is of 2GB and the hard

COSC 2007 Data Structures II Final Exam. Part 1: multiple choice (1 mark each, total 30 marks, circle the correct answer)

( ) n 3. n 2 ( ) D. Ο

& ( D. " mnp ' ( ) n 3. n 2. ( ) C. " n

CSCI2100B Data Structures Trees

Topics. Trees Vojislav Kecman. Which graphs are trees? Terminology. Terminology Trees as Models Some Tree Theorems Applications of Trees CMSC 302

Treaps. 1 Binary Search Trees (BSTs) CSE341T/CSE549T 11/05/2014. Lecture 19

A6-R3: DATA STRUCTURE THROUGH C LANGUAGE

Design and Analysis of Algorithms - - Assessment

A set of nodes (or vertices) with a single starting point

Trees. Russell Impagliazzo and Miles Jones Thanks to Janine Tiefenbruck. April 29, 2016

CSE 373 MAY 10 TH SPANNING TREES AND UNION FIND

DATA STRUCTURE : A MCQ QUESTION SET Code : RBMCQ0305

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

Tree: non-recursive definition. Trees, Binary Search Trees, and Heaps. Tree: recursive definition. Tree: example.

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

MID TERM MEGA FILE SOLVED BY VU HELPER Which one of the following statement is NOT correct.

( ) + n. ( ) = n "1) + n. ( ) = T n 2. ( ) = 2T n 2. ( ) = T( n 2 ) +1

Binary Tree. Preview. Binary Tree. Binary Tree. Binary Search Tree 10/2/2017. Binary Tree

EE 368. Weeks 5 (Notes)

SFU CMPT Lecture: Week 9

Computer Science 210 Data Structures Siena College Fall Topic Notes: Trees

Outline. Preliminaries. Binary Trees Binary Search Trees. What is Tree? Implementation of Trees using C++ Tree traversals and applications

CS 8391 DATA STRUCTURES

CS 171: Introduction to Computer Science II. Binary Search Trees

Suppose that the following is from a correct C++ program:

Week 2. TA Lab Consulting - See schedule (cs400 home pages) Peer Mentoring available - Friday 8am-12pm, 12:15-1:30pm in 1289CS

Graphs & Digraphs Tuesday, November 06, 2007

Analysis of Algorithms

Trees (Part 1, Theoretical) CSE 2320 Algorithms and Data Structures University of Texas at Arlington

Algorithms. Deleting from Red-Black Trees B-Trees

Trees! Ellen Walker! CPSC 201 Data Structures! Hiram College!

Direct Addressing Hash table: Collision resolution how handle collisions Hash Functions:

Algorithms and complexity: 2 The dictionary problem: search trees

Algorithms. AVL Tree

Balanced Search Trees. CS 3110 Fall 2010

TREES Lecture 12 CS2110 Spring 2019

9. The expected time for insertion sort for n keys is in which set? (All n! input permutations are equally likely.)

EXAM ELEMENTARY MATH FOR GMT: ALGORITHM ANALYSIS NOVEMBER 7, 2013, 13:15-16:15, RUPPERT D

IX. Binary Trees (Chapter 10)

Algorithms and Data S tructures Structures Optimal S earc Sear h ch Tr T ees; r T ries Tries Ulf Leser

Exercise 1 : B-Trees [ =17pts]

Randomized incremental construction. Trapezoidal decomposition: Special sampling idea: Sample all except one item

Trees and Graphs Shabsi Walfish NYU - Fundamental Algorithms Summer 2006

Chapter 9. Priority Queue

CS302 Data Structures using C++

quiz heapsort intuition overview Is an algorithm with a worst-case time complexity in O(n) data structures and algorithms lecture 3

( D. Θ n. ( ) f n ( ) D. Ο%

Balanced Binary Search Trees. Victor Gao

CE 221 Data Structures and Algorithms

CS8391-DATA STRUCTURES QUESTION BANK UNIT I

CSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms. Lecturer: Shi Li

! Tree: set of nodes and directed edges. ! Parent: source node of directed edge. ! Child: terminal node of directed edge

3137 Data Structures and Algorithms in C++

Week 8. BinaryTrees. 1 Binary trees. 2 The notion of binary search tree. 3 Tree traversal. 4 Queries in binary search trees. 5 Insertion.

Lecture 9 March 4, 2010

Data Structures and Algorithms Week 4

Operations on Heap Tree The major operations required to be performed on a heap tree are Insertion, Deletion, and Merging.

CS8391-DATA STRUCTURES

Week 8. BinaryTrees. 1 Binary trees. 2 The notion of binary search tree. 3 Tree traversal. 4 Queries in binary search trees. 5 Insertion.

Course Review. Cpt S 223 Fall 2009

Transcription:

Algorithms and Data Structures (Search) Trees Ulf Leser

Source: whmsoft.net/ Ulf Leser: Alg&DS, Summer semester 2011 2

Content of this Lecture Trees Search Trees Short Analysis Ulf Leser: Alg&DS, Summer semester 2011 3

Motivation A tree is a list in which every element may have more than one successor Traversing such lists has degrees-of-freedom we may choose the next element from the set of all successors Wonderful structure to represent Partitions of sets and their characteristics Order between elements of a set Common parts of elements Trees are everywhere in computer science Ulf Leser: Alg&DS, Summer semester 2011 4

Already Seen? Example Divide-and-conquer call stack in max-subarray Also: Merge-Sort, QuickSort -2 3 1 3 4-3 -4 2-2 3 1 3 4-3 -4 2 Solution 11 Solutions 7, 4 rmax/lmax: 7, 4-2 3 1 3 4-3 -4 2 Solutions 3, 4, 4, 2 rmax/lmax: 3, 4, 1, 0 Ulf Leser: Alg&DS, Summer semester 2011 22 Data A Tree XML depth-first vs breadth-first traversal The data items of an XML database form a tree <customers> <customer> <last_name> Müller </last_name> <first_name> Peter </last_name> <age> 2 </age> </customer> <customer> <last_name> Meier </last_name> <first_name> Stefanie </last_name> <age> 27 </age> </customer> </customers> last_name Müller customer first_name Peter age 2 customers last_name Meier customer first_name Stefanie age 27 Ulf Leser: Alg&DS, Summer semester 2011 10 Ulf Leser: Alg&DS, Summer semester 2011

Already Seen? Full Decision Tree Decision trees for proving the lower bound for sorting S[i3]<S[j3]? 3 1 6 8 9 4 1 2 S[i2]<S[j2]? 6 7 1 3 4 8 9 S[i4]<S[j4]? S[i1]<S[j1]? S[i]<S[j]? 7 1 6 2 9 4 3 S[i6]<S[j6]? S[i7]<S[j7]? 1 8 3 9 3 6 3 9 6 4 2 4 4 8 6 7 2 9 2 7 2 8 3 1 1 7 6 7 4 Ulf Leser: Alg&DS, Summer semester 2011 8 9 1 29 1 7 8 3 2 9 4 3 8 3 1 4 9 7 1 6 Heaps Heaps for priority queues Definition A heap is a labeled binary tree for which the following holds Form-constraint (FC): The tree is complete except the last layer I.e.: Every node has exactly two children Heap-constraint (HC): The value of any node is smaller than that of its children 3 Layer 1 11 18 8 10 9 12 1 Layer 2 Layer 3 Layer 4 (last) Ulf Leser: Alg&DS, Summer semester 2011 32 Ulf Leser: Alg&DS, Summer semester 2011 6

Machine Learning Want to go to a football game? Might be canceled depends on the whether Let s learn from examples Ulf Leser: Alg&DS, Summer semester 2011 7

Decision Trees Temperature sunny Outlook overcast Temperature rainy Temperature hot mild Windy Humidity high true false No No Ulf Leser: Alg&DS, Summer semester 2011 8

Many Applications The decision tree partitions the set of all possible situations based on predefined characteristics (attributes) Challenge: Which tree leads to the best decisions as soon as possible? Ulf Leser: Alg&DS, Summer semester 2011 9

Suffix-Trees Recall the problem to find all occurrences of a (short) string P in a (long) string T Fastest way (O( P )): Suffix Trees Loot at all suffixes of T (there are T many) Construct a tree Every edge is labeled with a letter from T All edges emitting from a node are labeled differently Every path from root to a leaf is uniquely labeled All suffixes of T are represented as leaves Every occurrence of P must be the prefix of a suffix of T Thus, every occurrence of P must map to a path starting at the root of the suffix tree Ulf Leser: Alg&DS, Summer semester 2011 10

Example 1234678901 S= BANANARAMA$ 1 bananarama$ 3 narama$ 11 $ na rama$ 2 narama$ 4 $ 8 a 10 9 ma$ na rama$ rama$ ma$ rama$ 6 7 Ulf Leser: Alg&DS, Summer semester 2011 11

Searching in the Suffix Tree 2 narama$ 10 9 ma$ 4 8 rama$ 6 7 P = na 11 bananarama$ a rama$ The suffix tree for T represents $ all ma$ 10 ma$ 9 common prefixes 1 of suffixes of T as a na bananarama$ 3 8 rama$ unique path. narama$ 6 narama$ 2 rama$ 11 rama$ $ na 4 Challenge: Construction of a suffix tree a rama$ $ ma$ in linear time. na rama$ P = an $ na 1 rama$ 7 3 narama$ Ulf Leser: Alg&DS, Summer semester 2011 12

Classifications Phylogenetic Trees Eukaryoten Tiere diverse Zwischenstufen Chraniata (Schädelknochen) Vertebraten (Wirbeltier) Viele Zwischenstufen Mammals (Säugetiere) Eutheria (Placenta) Primaten (Affen) Catarrhini Hominidae (Mensch, Schimpanse, Orang- Utan, Gorilla) Homo (erectus, sapiens...) Homo Sapiens Ulf Leser: Alg&DS, Summer semester 2011 13

Tree Construction Phylogenetic Trees determine the most E likely order of speciation events. A: AGGTCTCGAGA AGTCTAC B: GTCCGTAGC_AGGTGTC_ C: AGGTGAGCGACCAGG TCA D: AT_TCAG_AG TCTAC E: _GGTCACCGACCAGGTCTA_ Challenge: Find the tree which assumes the least number of mutations (along an edge) D B D B A C E A C Ulf Leser: Alg&DS, Summer semester 2011 14

Graphs Definition A graph G=(V, E) consists of a set of vertices (nodes) V and a set of edges (E VxV). A sequence of edges e 1, e 2,.., e n is called a path iff 1 i<n: e i =(v, v) and e i+1 =(v, v``); the length of this path is n A path (v 1,v 2 ), (v 2,v 3 ),, (v n-1,v n ) is acyclic iff all v i are different G is connected if every pair v i, v j is connected by at least one path G is undirected, if (v,v ) E (v,v) E. Otherwise it is directed Let G=(V, E) be a directed graph and let v,v V. Every edge (v,v ) E is called outgoing for v Every edge (v,v) E is called incoming for v Ulf Leser: Alg&DS, Summer semester 2011 1

Trees as Graphs Definition A undirected, connected acyclic graph is called a undirected tree A directed acyclic graph in which every node has at most one incoming edge is called a directed tree Lemma In a undirected tree, there exists exactly one path between any pair of nodes Ulf Leser: Alg&DS, Summer semester 2011 16

Rooted Trees Definition A directed tree with exactly one vertex v with no incoming edges is called a rooted tree; v is called the root of the tree From now on: Tree means a directed, rooted tree Lemma 1 In a rooted tree, there exists exactly one path between root and any other node 2 3 8 7 6 4 4 7 3 2 1 Ulf Leser: Alg&DS, Summer semester 2011 17 6 8

Terminology (mostly informal, not axiomatic) Definition. Let T be a tree. Then A node with no outgoing edge is a leaf; other nodes are inner nodes The depth of a node p is the length of the (only) path from root to p The height of T is the depth of its deepest leaf The order of T is the maximal number of children of its nodes Level i denotes all nodes with depth i T is ordered if children of all inner nodes are ordered Ulf Leser: Alg&DS, Summer semester 2011 18

More Terminology Definition. Let T be a tree and v a node of T. Then All nodes adjacent to an outgoing edge of v are its children v is called the parent of all its children All nodes on the path from root to v are the ancestors of v All nodes reachable from v are its successors The rank of a node v is its number of children Ulf Leser: Alg&DS, Summer semester 2011 19

Two More Definition. Let T be a directed tree of order k. T is complete if all its inner nodes have rank k and all leaves have the same depth In this lecture, we will mostly consider rooted ordered trees of order two Inner nodes have a left / right child Ulf Leser: Alg&DS, Summer semester 2011 20

Recursive Definition of Trees We defined trees as graphs with certain constraints However, we will mostly traverse trees using recursive functions The relationship may become clearer when using a recursive definition of (binary) trees Definition A tree is a structure defined as follows: A single node is a tree with height 0 If T 1 and T 2 are trees, then the structure formed by a new node v and edges from v to T 1 and from v to T 2 is a tree with height max(height(t 1 ), height(t 2 ))+1 Ulf Leser: Alg&DS, Summer semester 2011 21

Some Properties (without proofs) Lemma Let T=(V, E) be a tree of order k. Then V = E +1 If T is complete, T has k height(t) leaves If T is a complete binary tree, T has k height(t)+1-1 nodes If T is a binary tree with n leaves, then height(t) [floor(log(n)), n-1] Ulf Leser: Alg&DS, Summer semester 2011 22

Content of this Lecture Trees Search Trees Definition Searching Inserting Deleting Short Analysis Ulf Leser: Alg&DS, Summer semester 2011 23

Search Trees Definition A search tree T=(V,E) is a rooted binary tree for n= V different keys with key-labeled nodes such that v V label(v)>max(label(left_child(v)), label(successors(left_child(v))) label(v)<min(label(right_child(v)), label(successors(right_child(v))) Remarks We will use integer labels node ~ label of a node We only consider trees without duplicate keys (easy to fix) Search trees are used to manage and search a list of keys Operations: search, insert, delete 8 12 18 13 22 21 24 Ulf Leser: Alg&DS, Summer semester 2011 24

Complete Trees Conceptually, we pad search trees to full rank in all nodes padded leaves are usually neither drawn not implemented (NULL) A padded leaf represents the interval of values that would be below this node (but none of its values is a key) 12 12 8 22 8 22 18 24 9-11 18 24 13 21-4 6-7 13 21 23-23 2- - 14-17 19-20 - Ulf Leser: Alg&DS, Summer semester 2011 2

What For? For a search tree T=(V,E), we will reach O(log( V )) for testing whether k T Compared to binsearch, search trees are a dynamically growing / shrinking data structure But need to store pointers Ulf Leser: Alg&DS, Summer semester 2011 26

Searching Straight-forward Comparing the search key to the label of a node tells whether we have to look into the left or into the right subtree If there is no child left, k T Complexity In the worst case we need to traverse the longest path in T to show k T Thus: O( V ) Wait a bit func node search( T search_tree, k integer) { v := root(t); while v!=null do if label(v)>k then v := v.left_child(); else if label(v)<k then v := v.right_child(); else return v; end while; return null; } 8 12 18 22 24 Ulf Leser: Alg&DS, Summer semester 2011 27 13 21

Insertion func bool insert( T search_tree, k integer) { v := root(t); while v!=null do p := v; if label(v)>k then v := v.left_child(); else if label(v)<k then v := v.right_child(); else return false; end while; if label(p)>k then p.left_child := new node(k); else p.right_child := new node(k); end if; return true; } We search the new key k If k T, we do nothing If k T, the search must finish at a null pointer in a node p A right pointer if label(p)<k, otherwise a left pointer We replace the null with a pointer to a new node k This creates a new search tree which contains k Complexity: Same as search Ulf Leser: Alg&DS, Summer semester 2011 28

Example 12 12 8 18 22 24 Insert 19 8 22 18 24 13 21 13 21 12 8 22 19 11 18 24 13 21 19 Ulf Leser: Alg&DS, Summer semester 2011 29

Deletion Again, we first search k. If k T, we are done Assume k T. The following situations are possible k is stored in a leaf. Then simple remove this leaf k is stored in an inner node q with only one child. Then remove q and connect parent(q) to child(q) Clearly, the new structure still is a search tree and does not contain k k is stored in an inner node q with two children Ulf Leser: Alg&DS, Summer semester 2011 30

Observations We cannot remove q, but we can replace the label of q with another label - and remove this node We need a node q which can 7 be removed and whose label k can replace k without hurting the search tree constraints SC label(k )>max(label(left_child(k )), label(successors(left_child(k ))) label(k )<min(label(right_child(k )), label(successors(right_child(k ))) 12 10 22 9 11 13 18 21 24 28 Ulf Leser: Alg&DS, Summer semester 2011 31

Observations Two candidates 12 Largest value in the left subtree (symmetric predecessor of k) 10 22 Smallest value in the right subtree (symmetric successor of k) We can choose any of those 9 11 18 13 21 24 28 Let s use the symmetric predecessor 7 This is either a leaf no problem Ulf Leser: Alg&DS, Summer semester 2011 32

Observations Two candidates Largest value in the left subtree (symmetric predecessor of k) Smallest value in the right subtree (symmetric successor of k) We can choose any of those Let s use the symmetric predecessor 7 This is either a leaf Or an inner node; but since its label is larger than that of all other labels in the left subtree of q, it can only have a left child Thus it is a node with one child - can be removed 12 10 22 9 11 13 18 21 24 28 Ulf Leser: Alg&DS, Summer semester 2011 33

Example 12 7 12 10 22 11 18 9 13 21 24 28 Remove 10 7 9 22 11 Remove 22 18 13 21 24 28 11 12 9 21 18 24 Remove 12 9 21 11 18 24 7 13 28 7 13 28 Ulf Leser: Alg&DS, Summer semester 2011 34

Enumeration One more operation: Print all keys from T in sorted order In which order do we traverse? Prefix, postfix, infix? Clearly, we must first print all left successors, then print a node, then print all right successors Thus, infix order Ulf Leser: Alg&DS, Summer semester 2011 3

Content of this Lecture Trees Search Trees Definition Searching Inserting Deleting Short Analysis Ulf Leser: Alg&DS, Summer semester 2011 36

Natural Tree A search tree created by inserting and deleting keys in arbitrary order is called a natural tree A natural tree T=(V,E) has height(t) [ V -1, log( V )] The concrete height for a set of keys depends on the order in which keys were inserted Example 11,9,10,,21,13,24,18,9,10,11,13,18,21,24 9 11 21 9 10 10 13 24 11 18 Ulf Leser: Alg&DS, Summer semester 2011 37

Average Case Analysis We have seen that a natural tree with n nodes has a maximal height of n-1 Thus, searching will need O(n) comparisons in worst-case Nevertheless, natural trees are not bad on average The average case is O(log(n)) More precisely, a natural tree is on average only ~1.4 times larger than the optimal search tree (with height O(log(n)) We skip the proof (argue over all possible orders of inserting n keys), because balanced search trees (AVL trees) are O(log(n)) also in worst-case and are not much harder to implement Ulf Leser: Alg&DS, Summer semester 2011 38

Example Source: cg.scs.carleton.ca/ Ulf Leser: Alg&DS, Summer semester 2011 39