CS/COE 1501 cs.pitt.edu/~bill/1501/ Searching

Review: Searching through a collection
- Given a collection of keys C, how do we search for a given key k?
- Store the collection in:
  - An array (unsorted or sorted)
  - A linked list (unsorted or sorted)
  - A binary search tree
- Differences? Runtimes?

Symbol tables
- Abstract structures that link keys to values
  - The key is used to search the data structure for a value
- Described as a class in the text, but it is probably more accurate to think of a symbol table in general as an interface
- Key functions:
  - put()
  - contains()
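As a minimal sketch of the "symbol table as an interface" idea, the block below pairs a small interface with the simplest implementation from the review slide, an unsorted linked list. The names SymbolTable and SequentialSearchST and the get() method are assumptions made for illustration; they are not taken from the course code.

```java
// Hypothetical names: SymbolTable and SequentialSearchST are illustrative only.
interface SymbolTable<K, V> {
    void put(K key, V value);   // associate value with key, replacing any old value
    V get(K key);               // value for key, or null if the key is absent
    default boolean contains(K key) { return get(key) != null; }
}

// Unsorted linked list: every search scans the list from the front.
class SequentialSearchST<K, V> implements SymbolTable<K, V> {
    private Node first;

    private class Node {
        K key;
        V val;
        Node next;
        Node(K key, V val, Node next) { this.key = key; this.val = val; this.next = next; }
    }

    public void put(K key, V value) {
        for (Node x = first; x != null; x = x.next) {
            if (x.key.equals(key)) { x.val = value; return; }  // key present: overwrite
        }
        first = new Node(key, value, first);                   // key absent: add at front
    }

    public V get(K key) {
        for (Node x = first; x != null; x = x.next) {
            if (x.key.equals(key)) return x.val;
        }
        return null;
    }
}
```

Any of the structures discussed later (BST, DST, trie, DLB) could sit behind the same interface; only the cost of put() and contains() changes.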

A closer look
- BinarySearchST.java and BST.java present symbol tables based on sorted arrays and binary search trees, respectively
- Can we do better than these?
- Both methods depend on comparisons against other keys
  - i.e., k is compared against other keys in the data structure
- 4 options at each node in a BST:
  - Node ref is null: k not found
  - k is equal to the current node's key: k is found
  - k is less than the current key: continue to the left child
  - k is greater than the current key: continue to the right child
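The four options at each BST node can be sketched directly as a search loop. This is an illustrative sketch with int keys and String values, not the textbook's BST.java; the class name BstSearchSketch is an assumption.

```java
// Hypothetical class for illustration; the text's BST.java is generic and more complete.
class BstSearchSketch {
    static class Node {
        int key;
        String val;
        Node left, right;
        Node(int key, String val) { this.key = key; this.val = val; }
    }

    Node root;

    // Returns the value stored for k, or null if k is not in the tree.
    String get(int k) {
        Node x = root;
        while (true) {
            if (x == null) return null;       // option 1: null ref, k not found
            if (k == x.key) return x.val;     // option 2: equal, k found
            if (k < x.key) x = x.left;        // option 3: less, go to left child
            else x = x.right;                 // option 4: greater, go to right child
        }
    }

    void put(int k, String v) { root = put(root, k, v); }

    private Node put(Node x, int k, String v) {
        if (x == null) return new Node(k, v);
        if (k == x.key) x.val = v;
        else if (k < x.key) x.left = put(x.left, k, v);
        else x.right = put(x.right, k, v);
        return x;
    }
}
```

Every step compares k against a full stored key, which is the cost the following structures try to avoid.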

Digital Search Trees (DSTs)
- Instead of looking at less than/greater than, let's go left or right based on the bits of the key, so we again have 4 options:
  - Node ref is null: k not found
  - k is equal to the current node's key: k is found
  - Current bit of k is 0: continue to the left child
  - Current bit of k is 1: continue to the right child
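A minimal sketch of a DST under stated assumptions: keys are non-negative ints that fit in BITS bits (4 here), and bits are consumed from the most significant end. The class name DstSketch and the constant BITS are hypothetical.

```java
// Hypothetical sketch of a digital search tree over fixed-width integer keys.
class DstSketch {
    static final int BITS = 4;  // assumed key width; keys must lie in [0, 2^BITS)

    static class Node {
        int key;
        String val;
        Node left, right;
        Node(int key, String val) { this.key = key; this.val = val; }
    }

    Node root;

    // Bit d of key, where d = 0 is the most significant of the BITS bits.
    static int bit(int key, int d) { return (key >> (BITS - 1 - d)) & 1; }

    String get(int k) {
        Node x = root;
        for (int d = 0; x != null; d++) {
            if (k == x.key) return x.val;              // full-key comparison at every node
            x = (bit(k, d) == 0) ? x.left : x.right;   // branch on the current bit
        }
        return null;                                   // fell off the tree: k not found
    }

    void put(int k, String v) {
        if (root == null) { root = new Node(k, v); return; }
        Node x = root;
        for (int d = 0; ; d++) {
            if (k == x.key) { x.val = v; return; }
            if (bit(k, d) == 0) {
                if (x.left == null) { x.left = new Node(k, v); return; }
                x = x.left;
            } else {
                if (x.right == null) { x.right = new Node(k, v); return; }
                x = x.right;
            }
        }
    }
}
```

Inserting 4, 3, 2, 6, 5 in that order places 4 at the root, 3 as its left child, 2 and 6 as the children of 3, and 5 as the left child of 6.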

DST example
[Figure: digital search tree built by inserting 4 (0100), 3 (0011), 2 (0010), 6 (0110), 5 (0101), followed by searches for 3 (0011) and 7 (0111). Edges are labeled 0 (left) and 1 (right).]

Analysis of digital search trees
- Runtime?
- We end up doing many comparisons against the full key. Can we improve on this?

Radix search tries (RSTs)
- "Trie" as in retrieve, pronounced the same as "try"
- Instead of storing keys as nodes in the tree, we store them implicitly as paths down the tree
  - Interior nodes of the tree only serve to direct us according to the bit string of the key
  - Values can then be stored at the end of the key's bit string path
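A minimal sketch of a radix search trie, again assuming fixed-width keys of BITS bits read from the most significant end. Keys are never stored in nodes; a key is just the path of 0/1 branches, and the value sits at the end of that path. The names RstSketch, zero, and one are hypothetical, and this simple version always walks the full BITS levels.

```java
// Hypothetical sketch of a binary radix search trie over fixed-width integer keys.
class RstSketch {
    static final int BITS = 4;  // assumed key width; keys must lie in [0, 2^BITS)

    static class Node {
        String val;      // non-null only at the end of some key's bit path
        Node zero, one;  // children for bit 0 and bit 1
    }

    Node root = new Node();

    static int bit(int key, int d) { return (key >> (BITS - 1 - d)) & 1; }

    void put(int k, String v) {
        Node x = root;
        for (int d = 0; d < BITS; d++) {           // create the path one bit at a time
            if (bit(k, d) == 0) {
                if (x.zero == null) x.zero = new Node();
                x = x.zero;
            } else {
                if (x.one == null) x.one = new Node();
                x = x.one;
            }
        }
        x.val = v;                                 // value lives at the end of the path
    }

    String get(int k) {
        Node x = root;
        for (int d = 0; d < BITS && x != null; d++) {
            x = (bit(k, d) == 0) ? x.zero : x.one; // no key comparisons, only bit tests
        }
        return (x == null) ? null : x.val;
    }
}
```

Unlike the DST, a search here never compares against a full key: a hit or miss is decided purely by whether the bit path exists and ends in a value.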

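RST example
[Figure: radix search trie built by inserting 4 (0100), 3 (0011), 2 (0010), 6 (0110), 5 (0101), followed by searches for 3 (0011) and 7 (0111). Edges are labeled 0 and 1; a value (V) is stored at the end of each key's path.]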

RST analysis
- Runtime?
- Would this structure work as well for other key data types?
  - Characters?
  - Strings?

Larger branching factor tries
- In our binary-based radix search trie, we considered one bit at a time
- What if we applied the same method to characters in a string?
  - What would this new structure look like?
- Let's try inserting the following strings into a trie:
  - she, sells, sea, shells, by, the, sea, shore

Another trie example
[Figure: character trie for she, sells, sea, shells, by, the, sea, shore. The root branches on b, s, and t.]

Analysis
- Runtime?

Further analysis
- Miss times
  - Require an average of log_R(n) nodes to be examined
    - Where R is the size of the alphabet being considered
  - Proof in Proposition H of Section 5.2 of the text
- Average # of checks with 2^20 keys in an RST?
- With 2^20 keys in a large branching factor trie, assuming 8 bits at a time?
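Plugging the two cases into the average search-miss estimate gives a quick worked comparison. This only evaluates the log_R(n) formula stated above for n = 2^20, taking R = 2 for the binary RST and R = 2^8 = 256 for a trie that consumes 8 bits at a time.

```latex
\begin{align*}
R = 2:   \quad & \log_{2} 2^{20} = 20 \\
R = 2^8: \quad & \log_{256} 2^{20} = \frac{\log_2 2^{20}}{\log_2 2^{8}} = \frac{20}{8} = 2.5
\end{align*}
```

So on average a miss examines about 20 nodes in the binary trie versus about 2.5 nodes with the larger branching factor.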

Implementation Concerns
- See TrieST.java
  - Implements an R-way trie
- Basic node object, where R is the branching factor:

    private static class Node {
        private Object val;
        private Node[] next = new Node[R];
    }

- Non-null val means we have traversed to a valid key
- Again, note that keys are not directly stored in the trie at all
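Building on that node layout, put and get for an R-way trie can be sketched as follows. This is an illustrative sketch rather than the text's implementation: the class name RWayTrieSketch is hypothetical, and R = 256 assumes every character in a key has a code below 256.

```java
// Hypothetical sketch of an R-way trie using the Node layout described above.
class RWayTrieSketch {
    static final int R = 256;  // assumed alphabet size (8-bit characters)

    static class Node {
        Object val;                  // non-null means the path to this node is a key
        Node[] next = new Node[R];   // one slot per possible next character
    }

    Node root;

    void put(String key, Object val) { root = put(root, key, val, 0); }

    // Walks (and creates) the path for key, one character per level.
    private Node put(Node x, String key, Object val, int d) {
        if (x == null) x = new Node();
        if (d == key.length()) { x.val = val; return x; }
        char c = key.charAt(d);      // assumes c < R
        x.next[c] = put(x.next[c], key, val, d + 1);
        return x;
    }

    Object get(String key) {
        Node x = root;
        for (int d = 0; d < key.length() && x != null; d++) {
            x = x.next[key.charAt(d)];
        }
        return (x == null) ? null : x.val;  // path may exist without being a key
    }
}
```

Inserting she, sells, sea, shells, by, the, sea, shore with their positions as values leaves get("sea") returning the later position, while get("sh") returns null because that path exists only as a prefix.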

R-way trie example
[Figure: R-way trie nodes, each drawn as a val field plus a next array with one slot per letter A through Z. Most nodes have an empty val; two nodes hold the values 0 and 1.]

So what's the catch?
- Space!
  - Considering 8-bit ASCII, each node contains 2^8 references!
- This is especially problematic because, in many cases, a lot of this space is wasted
  - Common paths or prefixes, for example: if all keys begin with "key", that's 255*3 wasted references!
  - At the lower levels of the trie, most keys have probably been separated out and reference lists will be sparse
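To put rough numbers on the waste, the arithmetic below assumes 8-byte references, which is an assumption about the runtime and not something stated in the slides. Each of the three nodes along a shared three-character prefix uses 1 of its 256 slots.

```latex
\begin{align*}
\text{references per node} &= 2^{8} = 256 \\
\text{bytes per node (8-byte refs, assumed)} &= 256 \times 8 = 2048 \\
\text{wasted references on a shared 3-character prefix} &= 3 \times 255 = 765 \\
\text{wasted bytes (8-byte refs, assumed)} &= 765 \times 8 = 6120
\end{align*}
```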

De La Briandais tries (DLBs)
- Replace the .next array of the R-way trie with a linked list
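One way to sketch a DLB is to give each node a character, a sibling link (the next alternative at the same level), and a child link (the start of the next level). The names below are hypothetical, and this sketch marks the end of a key with a non-null val on the last character's node instead of a separate terminator node; that is a simplification chosen for brevity, not necessarily how the course implementation does it.

```java
// Hypothetical sketch of a De La Briandais trie: linked lists replace the next[] arrays.
class DlbSketch {
    static class Node {
        char c;        // character represented by this node
        Object val;    // non-null if the path ending here spells a key
        Node sibling;  // next alternative character at the same level
        Node child;    // first node in the list for the next level
        Node(char c) { this.c = c; }
    }

    Node root;  // head of the top-level sibling list

    // Linear scan of one sibling list for character c.
    private static Node find(Node head, char c) {
        for (Node n = head; n != null; n = n.sibling) {
            if (n.c == c) return n;
        }
        return null;
    }

    void put(String key, Object val) {
        if (key.isEmpty()) return;               // empty keys are not supported here
        Node prev = null;                        // node matched for the previous character
        for (int d = 0; d < key.length(); d++) {
            char c = key.charAt(d);
            Node head = (prev == null) ? root : prev.child;
            Node n = find(head, c);
            if (n == null) {                     // character missing: add to front of list
                n = new Node(c);
                n.sibling = head;
                if (prev == null) root = n; else prev.child = n;
            }
            prev = n;
        }
        prev.val = val;
    }

    Object get(String key) {
        if (key.isEmpty()) return null;
        Node x = null;
        Node head = root;
        for (int d = 0; d < key.length(); d++) {
            x = find(head, key.charAt(d));
            if (x == null) return null;          // character not present at this level
            head = x.child;
        }
        return x.val;
    }
}
```

Each level now costs a list scan of up to R nodes instead of one array index, but a node only exists for characters that actually occur.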

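DLB Example
[Figure: De La Briandais trie drawn as linked lists of character nodes, with S, B, and T in the top-level list and ^ marking the end of each key.]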

DLB analysis
- How does DLB performance differ from R-way tries?
- Which should you use?

Searching
- So far we've continually assumed each search would only look for the presence of a whole key
- What if we wanted to know whether our search term was a prefix of a valid key?
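In a trie, the prefix question falls out of the same walk as a whole-key search: follow the search term's characters and look at where the walk ends. The sketch below is illustrative; the class and method names are hypothetical, it stores keys only (no values), and it assumes character codes below 256 and no deletions.

```java
// Hypothetical sketch: whole-key search versus prefix search in an R-way trie.
class PrefixTrieSketch {
    static final int R = 256;  // assumed alphabet size

    static class Node {
        boolean isKey;               // true if the path to this node is a stored key
        Node[] next = new Node[R];
    }

    Node root = new Node();

    void add(String key) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            char c = key.charAt(d);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
        }
        x.isKey = true;
    }

    // Node reached by following s from the root, or null if the path breaks.
    private Node walk(String s) {
        Node x = root;
        for (int d = 0; d < s.length() && x != null; d++) {
            x = x.next[s.charAt(d)];
        }
        return x;
    }

    private static boolean hasChild(Node x) {
        for (Node n : x.next) {
            if (n != null) return true;
        }
        return false;
    }

    // Whole-key search: the path must exist and end at a key.
    boolean contains(String key) {
        Node x = walk(key);
        return x != null && x.isKey;
    }

    // Prefix search: the path must exist and lead to at least one key
    // (a stored key counts as a prefix of itself).
    boolean isPrefix(String s) {
        Node x = walk(s);
        return x != null && (x.isKey || hasChild(x));
    }
}
```

With she, sells, sea, shells, by, the, shore stored, "sh" is a prefix but not a key, "she" is both, and "sha" is neither.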

Final notes
- This lecture does not present an exhaustive look at search trees/tries, just the sampling that we're going to focus on
- Many variations on these techniques exist and perform quite well in different circumstances
  - Red/black BSTs
  - Ternary search tries
  - R-way tries without 1-way branching
- See the table at the end of Section 5.2 of the text