Range Searching and Windowing

Similar documents
Orthogonal range searching. Range Trees. Orthogonal range searching. 1D range searching. CS Spring 2009

Computational Geometry

Computational Geometry. Lecture 17

Computational Geometry

CS Fall 2010 B-trees Carola Wenk

Computational Geometry

Computational Geometry

Binary Search Trees > = 2014 Goodrich, Tamassia, Goldwasser. Binary Search Trees 1

CMSC 754 Computational Geometry 1

kd-trees Idea: Each level of the tree compares against 1 dimension. Let s us have only two children at each node (instead of 2 d )

Computational Geometry

CS 3343 Fall 2007 Red-black trees Carola Wenk

Orthogonal range searching. Orthogonal range search

Binary Trees. Recursive definition. Is this a binary tree?

Advanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2005 Simonas Šaltenis E1-215b

Lecture 3 February 9, 2010

Lecture 6: Analysis of Algorithms (CS )

CMPS 2200 Fall 2017 Red-black trees Carola Wenk

Range Reporting. Range Reporting. Range Reporting Problem. Applications

CMPS 2200 Fall 2017 B-trees Carola Wenk

CPS 231 Exam 2 SOLUTIONS

Algorithms. Red-Black Trees

CMPS 2200 Fall 2015 Red-black trees Carola Wenk

Design and Analysis of Algorithms Lecture- 9: Binary Search Trees

Binary search trees. Support many dynamic-set operations, e.g. Search. Minimum. Maximum. Insert. Delete ...

Binary Search Trees, etc.

Lecture: Analysis of Algorithms (CS )

Computational Geometry

Orthogonal Range Search and its Relatives

Balanced Binary Search Trees. Victor Gao

COT 5407: Introduction to Algorithms. Giri Narasimhan. ECS 254A; Phone: x3748

CSE Data Structures and Introduction to Algorithms... In Java! Instructor: Fei Wang. Binary Search Trees. CSE2100 DS & Algorithms 1

Linked Structures Songs, Games, Movies Part III. Fall 2013 Carola Wenk

CSE2331/5331. Topic 6: Binary Search Tree. Data structure Operations CSE 2331/5331

10/9/17. Using recursion to define objects. CS 220: Discrete Structures and their Applications

Balanced search trees

Binary search trees :

Range Reporting. Range reporting problem 1D range reporting Range trees 2D range reporting Range trees Predecessor in nested sets kd trees

January 10-12, NIT Surathkal Introduction to Graph and Geometric Algorithms

Friday Four Square! 4:15PM, Outside Gates

2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS

CSE 530A. B+ Trees. Washington University Fall 2013

CS 220: Discrete Structures and their Applications. Recursive objects and structural induction in zybooks

GEOMETRIC SEARCHING PART 2: RANGE SEARCH

1D Range Searching (cont) 5.1: 1D Range Searching. Database queries

Balanced Search Trees. CS 3110 Fall 2010

Augmenting Data Structures

Planar Point Location

Algorithm Theory. 8 Treaps. Christian Schindelhauer

Priority Queues. T. M. Murali. January 29, 2009

Multi-way Search Trees. (Multi-way Search Trees) Data Structures and Programming Spring / 25

8. Binary Search Tree

Module 4: Index Structures Lecture 13: Index structure. The Lecture Contains: Index structure. Binary search tree (BST) B-tree. B+-tree.

AVL Trees. (AVL Trees) Data Structures and Programming Spring / 17

COMP171. AVL-Trees (Part 1)

WINDOWING PETR FELKEL. FEL CTU PRAGUE Based on [Berg], [Mount]

Ch04 Balanced Search Trees

Advanced Data Structures Summary

Problem Set 6 Solutions

Data Structures Week #6. Special Trees

Priority Queues. T. M. Murali. January 26, T. M. Murali January 26, 2016 Priority Queues

CISC 235: Topic 4. Balanced Binary Search Trees

Search Trees (Ch. 9) > = Binary Search Trees 1

External-Memory Algorithms with Applications in GIS - (L. Arge) Enylton Machado Roberto Beauclair

Problem Set 5 Solutions

1 The range query problem

Animations. 01/29/04 Lecture

Trees and Graphs Shabsi Walfish NYU - Fundamental Algorithms Summer 2006

Some Search Structures. Balanced Search Trees. Binary Search Trees. A Binary Search Tree. Review Binary Search Trees

Priority Queues. T. M. Murali. January 23, T. M. Murali January 23, 2008 Priority Queues

Algorithms. AVL Tree

Binary Search Trees. 1. Inorder tree walk visit the left subtree, the root, and right subtree.

Range Queries. Kuba Karpierz, Bruno Vacherot. March 4, 2016

Priority queues. Priority queues. Priority queue operations

Lecture 8: Orthogonal Range Searching

13.4 Deletion in red-black trees

We assume uniform hashing (UH):

Chapter 2: Basic Data Structures

Dynamic trees (Steator and Tarjan 83)

How fast can we sort? Sorting. Decision-tree example. Decision-tree example. CS 3343 Fall by Charles E. Leiserson; small changes by Carola

Balanced Search Trees

CS 350 : Data Structures Binary Search Trees

Trapezoidal decomposition:

Lecture 5: Scheduling and Binary Search Trees

Balanced search trees. DS 2017/2018

Extra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University

An AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time

Binary Tree. Preview. Binary Tree. Binary Tree. Binary Search Tree 10/2/2017. Binary Tree

Data Structures Week #6. Special Trees

CS 315 Data Structures mid-term 2

AVL Trees / Slide 2. AVL Trees / Slide 4. Let N h be the minimum number of nodes in an AVL tree of height h. AVL Trees / Slide 6

13.4 Deletion in red-black trees

Introduction to Indexing R-trees. Hong Kong University of Science and Technology

Dictionaries. Priority Queues

Algorithms. Deleting from Red-Black Trees B-Trees

Data structures for totally monotone matrices

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

COMP251: Red-black trees

Search Trees. Undirected graph Directed graph Tree Binary search tree

Transcription:

CS 6463 -- Fall 2010 Range Searching and Windowing Carola Wenk 1

Orthogonal range searching Input: n points in d dimensions E.g., representing a database of n records each with d numeric fields Query: Axis-aligned box (in 2D, a rectangle) Report on the points inside the box: Are there any points? How many are there? List the points. 2

Orthogonal range searching Input: n points in d dimensions Query: Axis-aligned box (in 2D, a rectangle) Report on the points inside the box Goal: Preprocess points into a data structure to support fast queries Primary goal: Static data structure In 1D, we will also obtain a dynamic data structure supporting insert and delete 3

1D range searching In 1D, the query is an interval: First solution: Sort the points and store them in an array Solve query by binary search on endpoints. Obtain a static structure that can list k answers in a query in O(k +logn) time. Goal: Obtain a dynamic structure that can list k answers in a query in O(k +l log n) ) time. 4

1D range searching In 1D, the query is an interval: New solution that extends to higher dimensions: Balanced binary search tree New organization principle: Store points in the leaves of the tree. Internal nodes store copies of the leaves to satisfy binary search property: Node x stores in key[x] the maximum key of any leaf in the left subtree of x. 5

Example of a 1D range tree 1 17 43 6 8 12 14 26 35 41 42 59 61 key[x] is the maximum key of any leaf in the left subtree of x. 6

Example of a 1D range tree 17 x 8 42 x > x 1 14 35 43 1 6 12 17 26 41 43 59 6 8 12 14 26 35 41 42 59 61 key[x] is the maximum key of any leaf in the left subtree of x. 7

Example of a 1D range query 17 x 8 42 x > x 1 14 35 43 1 6 12 17 26 41 43 59 6 8 12 14 26 35 41 42 59 61 RANGE-QUERY([7, 41]) 8

General 1D range query root split node 9

Pseudocode, part 1: Find the split node 1D-RANGE-QUERY(T, [x 1, x 2 ]) w root[t] while w is not a leaf and (x 2 key[w] or key[w] < x 1 ) do if x 2 key[w] then w lf left[w] else w right[w] // w is now the split node [traverse left and right from w and report relevant subtrees] 10

Pseudocode, part 2: Traverse left and right from split node 1D-RANGE-QUERY(T, [x 1, x 2 ]) [find the split node] // w is now the split node if w is a leaf then output the leaf w if x 1 key[w] x 2 else v left[w] // Left traversal while v is not a leaf do if x 1 key[v] then output the subtree rooted at right[v] v left[v] w else v right[v] output the leaf v if x 1 key[v] x 2 [symmetrically i for right traversal] 11

Analysis of 1D-RANGE-QUERYQ Query time: Answer to range query represented by O(log n) ) subtrees found in O(log n) ) time. Thus: Can test for points in interval in O(log n) ) time. Can report all k points in interval in O(k + log n) time. Can count points in interval in O(log n) time Space: O(n) Preprocessing time: O(n log n) 12

2D range trees 13

2D range trees Store a primary 1D range tree for all the points based on x-coordinate. Thus in O(log n) time we can find O(log n) subtrees representing the points with proper x-coordinate. t How to restrict to points with proper y-coordinate? 14

2D range trees Idea: In primary 1D range tree of x-coordinate, everyer node stores a secondary 1D range tree based on y-coordinate for all points in the subtree of the node. Recursively el search within each. 15

2D range tree example Secondary trees 2 6/6 5 5/8 3/5 1 2/7 5/8 8 5/8 2/7 2/7 7 6/6 6 3/5 5 3/5 8 7 6/6 6 7/2 1/1 9/3 7/2 3 1/1 2 1/1 5 9/3 7/2 3 5 2 7 1 3 6 9/3 1/1 2/7 3/5 5/8 6/6 7/2 Primary tree 16

Analysis of 2D range trees Query time: In O(log 2 n) = O((log n) 2 ) time, we can represent answer to range query by O(log 2 n) ) subtrees. Total cost for reporting k points: O(k + (log n) 2 ). Space: The secondary trees at each level of the primary tree together store a copy of the points. Also, each point is present in each secondary tree along the path from the leaf to the root. Either way, we obtain that the space is O(n log n). Preprocessing time: O(n log n) ) 17

d-dimensional range trees Each node of the secondary y-structure stores a tertiary z-structure representing the points in the subtree rooted at the node, etc. Save one log factor using fractional cascading Query time: O(k + log d n) to report k points. Space: O(n log d 1 n) Preprocessing time: O(n log d 1 n) 18

Search in Subsets Given: Two sorted arrays A 1 and A, with A 1 A A query interval [l,r] Task: Report all elements e in A 1 and A with l e r Idea: Add pointers from A to A 1 : For each a A add a pointer to the smallest element b A 1 with b a Query: Find l A,, follow pointer to A 1. Both in A and A 1 sequentially output all elements in [l,r]. Query: [15,40] A 3 10 19 23 30 37 59 62 80 90 A 1 10 19 30 62 80 Runtime: O((log n + k) + (1 + k)) = O(log n + k)) 19

Search in Subsets (cont.) Given: Three sorted arrays A 1, A 2, and A, with A 1 A and A 2 A 1 2 Query: [15,40] A 3 10 19 23 30 37 59 62 80 90 A 2 A 1 10 19 30 62 80 3 23 37 62 90 Runtime: O((log n + k) + (1+k) + (1+k)) = O(log n + k)) Range trees: Y 1 Y 2 X Y 1 Y 2 20

Fractional Cascading: Layered Range Tree Replace 2D range tree with a layered range tree, using sorted arrays and pointers instead of the secondary range trees. Preprocessing: O(n log n) Query: O(log n + k) 21

d-dimensional range trees Query time: O(k + log d-1 n) ) to report k points, uses fractional cascading in the last dimension Space: O(n log d 1 n) Preprocessing time: O(n log d 1 n) Best data structure to date: Query time: O(k + log d 1 n) to report k points. Space: O(n (log n / log log n) d 1 ) Preprocessing time: O(n log d 1 n) ) 22

Windowing Input: A set S of n line segments in the plane Query: Report all segments in S that intersect a given query window Subproblem: Process a set of intervals on the line into a data structure which supports queries of the type: Report all intervals that contain a query point. 23

Interval trees Goal: To maintain a dynamic set of intervals, such as time intervals. i = [7, 10] low[i] = 7 10 = high[i] 5 11 17 19 4 8 15 18 22 23 Query: For a given query interval i, find an interval in the set that overlaps i. 24

Following the methodology 1. Choose an underlying data structure. Red-black tree keyed on low (left) endpoint. 2. Determine additional information to be stored in the data structure. Store in each node x the interval int[x] corresponding to the key, as well as the largest value m[x] of all right interval endpoints stored in the subtree rooted at x. int m 25

Example interval tree int m red 17,19 23 5,11 18 22,23 23 4,8 15,18 8 18 7,10 10 high[int[x]] m[x] [ ] = max m[left[x]] [ [ m[right[x]] 26

Modifying operations 3. Verify that this information can be maintained for modifying operations. INSERT: Fix m s on the way down. Rotations Fixup = O(1) () time per rotation: 11,15 30 6,20 30 19 30 6,20 30 11,15 19 30 14 14 19 Total INSERT time = O(log n); DELETE similar. 27

New operations 4. Develop new dynamic-set operations that use the information. INTERVAL-SEARCH(i) x root while x NIL and (low[i] > high[int[x]] or low[int[x]] > high[i]) do i and int[x] don t overlap if left[x] NIL and low[i] m[left[x]] then x left[x] else x right[x] return x 28

Example 1: INTERVAL-SEARCH([14,16]) x 17,19 23 14 16 5,11 18 22,23 23 4,8 15,18 8 18 7,10 10 x root [14,16] and [17,19] don t overlap 14 18 x lft[ left[x] ] 29

Example 1: INTERVAL-SEARCH([14,16]) 17,19 23 23 14 16 x 5,11 18 22,23 23 4,8 15,18 8 18 7,10 10 [14,16] and [5,11] don t overlap 14 > 8 x right[x] ] 30

Example 1: INTERVAL-SEARCH([14,16]) 17,19 23 23 14 16 5,11 18 22,23 23 4,8 15,18 x 8 18 7,10 10 [14,16] and [15,18] overlap return [15,18] 18] 31

Example 2: INTERVAL-SEARCH([12,14]) x 17,19 23 12 14 5,11 18 22,23 23 4,8 15,18 8 18 7,10 10 x root [12,14] and [17,19] don t overlap 12 18 x lft[ left[x] ] 32

Example 2: INTERVAL-SEARCH([12,14]) 17,19 23 23 12 14 x 5,11 18 22,23 23 4,8 15,18 8 18 7,10 10 [12,14] and [5,11] don t overlap 12 > 8 x right[x] ] 33

Example 2: INTERVAL-SEARCH([12,14]) 17,19 23 23 12 14 5,11 18 22,23 23 4,8 15,18 x 8 18 7,10 10 [12,14] and [15,18] don t overlap 12 > 10 x right[x] ] 34

Example 2: INTERVAL-SEARCH([12,14]) 17,19 23 23 12 14 5,11 18 22,23 23 4,8 15,18 8 18 7,10 x 10 x = NIL no interval that overlaps [12,14] 14] exists 35

Analysis Time = O(h) = O(log n), since INTERVAL- SEARCH does constant work at each level as it follows a simple path down the tree. List all overlapping intervals: Search, list, delete, repeat. Insert them all again at the end. Time = O(k log n), where k is the total number of overlapping intervals. This is an output-sensitive bound. Best algorithm to date: O(k + log n). 36

Correctness Theorem. Let L be the set of intervals in the left subtree of node x, and let R be the set of intervals in x s right subtree. If the search goes right, then { i L : i overlaps i } =. If the search goes left, then {i L : i overlaps i } = {i R : i overlaps i } =. In other words, it s always safe to take only 1 of the 2 children: we ll either find something, or nothing was to be found. 37

Correctness proof Proof. Suppose first that the search goes right. If lf[ left[x] ] = NIL, then we re done, since L =. Otherwise, the code dictates that we must have low[i] [i]> m[left[x]]. The value m[left[x]] corresponds to the right endpoint of some interval j L, and no other interval in L can have a larger right endpoint than high( j). i L high( j) =m[left[x]] low(i) Therefore, {i L : i overlaps i } =. 38

Proof (continued) Suppose that the search goes left, and assume that {i L : i overlaps i } =. Then, the code dictates that low[i] m[left[x]] = high[ j] for some j L. Since j L, it does not overlap i, and hence high[i] [ ] < low[ [ j]. But, the binary-search-tree property implies that for all i R, we have low[ j] low[i ]. But then {i R : i overlaps i } =. i j i L 39