Advanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2005 Simonas Šaltenis E1-215b

Advanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2005 Simonas Šaltenis E1-215b simas@cs.aau.dk

Range Searching in 2D
Main goals of the lecture:
  to understand and be able to analyze kd-trees and range trees;
  to see how data structures can be used to trade space for the running time of queries.
AALG, lecture 12, Simonas Šaltenis, 2005

Range queries
How do you efficiently find the points that lie inside a rectangle?
Orthogonal range query ($[x_1, x_2]$, $[y_1, y_2]$): find all points $(x, y)$ such that $x_1 \le x \le x_2$ and $y_1 \le y \le y_2$.
Also useful as a multi-attribute database query.
[Figure: points in the plane with a query rectangle spanning $x_1$ to $x_2$ on the x-axis and $y_1$ to $y_2$ on the y-axis.]

Preprocessing
How much time would such a query take?
Rules of the game:
  We preprocess the data into a data structure.
  Then we perform queries and updates on the data structure.
Analysis:
  Preprocessing time
  Efficiency of queries (and updates)
  The size of the structure
Assumption: no two points have the same x-coordinate, and no two points have the same y-coordinate.

1D range query
How do we do a 1D range query $[x_1, x_2]$?
Use a balanced BST $T$ in which all data points are stored in the leaves.
What is its size?
Where do we find the answer to a query?
[Figure: tree $T$ with the search path for $x_1$ and the search path for $x_2$, ending at leaves $q_1$ and $q_2$; below the leaves, the total order of data points with labels $b_1$, $q_1$, $a_1$ and $b_2$, $q_2$, $a_2$.]

1D range query
How do we find all these leaf nodes?
A possibility: keep a linked list of the leaves and traverse it from $q_1$ to $q_2$. This will not work in more dimensions, however.
Sketch of the algorithm:
  Find the split node, where the search paths for $x_1$ and $x_2$ diverge.
  Continue searching for $x_1$; whenever the path goes left, report the whole right subtree.
  Continue searching for $x_2$; whenever the path goes right, report the whole left subtree.
  When leaves $q_1$ and $q_2$ are reached, check whether they belong to the range.

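1DRangeSearch
(The key of an internal node v is taken to be the largest key in its left subtree, so a search for x goes left when x ≤ v.key.)

1DRangeSearch(T, x1, x2)
01 v ← FindSplit(T, x1, x2)
02 if v is leaf then
03    if x1 ≤ v.key ≤ x2 then return {v} else return ∅
04 return DoLeft(v.leftChild, x1, x2) ∪ DoRight(v.rightChild, x1, x2)

DoLeft(v, x1, x2)
01 if v is leaf then
02    if x1 ≤ v.key ≤ x2 then return {v} else return ∅
03 else
04    if x1 ≤ v.key then return ReportSubtree(v.rightChild) ∪ DoLeft(v.leftChild, x1, x2)
05    else return DoLeft(v.rightChild, x1, x2)

DoRight(v, x1, x2)
// similar to DoLeft, but with modified lines 04-05

Note that DoLeft and DoRight start at the children of the split node. Started at the split node itself, line 04 of DoLeft would report the split node's entire right subtree, which may contain points larger than x2.

Why is this correct?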
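The pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than the lecture's own implementation: the names `Node`, `build`, and `range_search_1d` are made up for the example, an internal node's key is assumed to be the largest key in its left subtree, and the input is assumed to be a non-empty list of distinct keys in increasing order.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: float                      # leaf: the data point; internal: max key of the left subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def build(sorted_keys: List[float]) -> Node:
    """Build a balanced leaf-storing BST from distinct keys in increasing order."""
    if len(sorted_keys) == 1:
        return Node(sorted_keys[0])
    mid = (len(sorted_keys) + 1) // 2          # the left half gets the extra key
    return Node(sorted_keys[mid - 1], build(sorted_keys[:mid]), build(sorted_keys[mid:]))


def report_subtree(v: Node) -> List[float]:
    if v.is_leaf():
        return [v.key]
    return report_subtree(v.left) + report_subtree(v.right)


def check_leaf(v: Node, x1: float, x2: float) -> List[float]:
    return [v.key] if x1 <= v.key <= x2 else []


def find_split(v: Node, x1: float, x2: float) -> Node:
    """Descend while both search paths agree; stop at a leaf or where they diverge."""
    while not v.is_leaf() and (x2 <= v.key or x1 > v.key):
        v = v.left if x2 <= v.key else v.right
    return v


def do_left(v: Node, x1: float, x2: float) -> List[float]:
    if v.is_leaf():
        return check_leaf(v, x1, x2)
    if x1 <= v.key:                             # path goes left: the right subtree is in range
        return do_left(v.left, x1, x2) + report_subtree(v.right)
    return do_left(v.right, x1, x2)


def do_right(v: Node, x1: float, x2: float) -> List[float]:
    if v.is_leaf():
        return check_leaf(v, x1, x2)
    if x2 > v.key:                              # path goes right: the left subtree is in range
        return report_subtree(v.left) + do_right(v.right, x1, x2)
    return do_right(v.left, x1, x2)


def range_search_1d(root: Node, x1: float, x2: float) -> List[float]:
    v = find_split(root, x1, x2)
    if v.is_leaf():
        return check_leaf(v, x1, x2)
    return do_left(v.left, x1, x2) + do_right(v.right, x1, x2)


keys = [3, 10, 19, 23, 30, 37, 49, 59, 62, 70, 80, 89, 93, 97]
print(range_search_1d(build(keys), 18, 77))    # [19, 23, 30, 37, 49, 59, 62, 70]
```

Here `do_right` mirrors `do_left` with the two comparisons swapped, which is one way to read "modified lines 04-05".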

Analysis of 1D range query
What is the worst-case running time of a query?
  It is output-sensitive: two traversals down the tree plus $O(k)$ for reporting, where $k$ is the number of reported data points: $O(\log n + k)$.
What is the time of construction?
  Sort the points, then build the tree by divide and conquer: split the sorted sequence in two, create the root, and recurse on the two parts. $O(n \log n)$ in total.
Size: $O(n)$

2D range query
How can we solve a 2D range query?
Observation: a 2D range query is a conjunction of two 1D range queries: $x_1 \le x \le x_2$ and $y_1 \le y \le y_2$.
Naïve idea:
  Have two BSTs (one on the x-coordinate and one on the y-coordinate).
  Ask two 1D range queries.
  Return the intersection of their results.
What is the worst-case running time (and when does it happen)? Is it output-sensitive?
[Figure: points in the plane with a query rectangle spanning $x_1$ to $x_2$ and $y_1$ to $y_2$.]
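One way to explore the question above is a small experiment. In this sketch, sorted lists searched with `bisect` stand in for the two BSTs (an assumption made for brevity, not the structure from the slides), and the function name `naive_range_query` is made up. The sorting is done inside the call only to keep the example short; it belongs to preprocessing.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

Point = Tuple[float, float]


def naive_range_query(points: List[Point], x1, x2, y1, y2):
    """Return (matching points, number of candidates produced by the two 1D queries)."""
    by_x = sorted(points)                          # stands in for the BST on x
    by_y = sorted(points, key=lambda p: p[1])      # stands in for the BST on y
    xs = [p[0] for p in by_x]
    ys = [p[1] for p in by_y]
    in_x = by_x[bisect_left(xs, x1):bisect_right(xs, x2)]   # 1D range query on x
    in_y = by_y[bisect_left(ys, y1):bisect_right(ys, y2)]   # 1D range query on y
    result = sorted(set(in_x) & set(in_y))                  # intersection of the two answers
    return result, len(in_x) + len(in_y)


# Points on a diagonal: the x-range selects the lower half, the y-range the upper half.
n = 1000
diagonal = [(i, i) for i in range(n)]
result, touched = naive_range_query(diagonal, 0, 499, 500, 999)
print(result, touched)   # [] 1000
```

With this input both 1D queries return $n/2$ points each while the answer is empty, so the work is linear in $n$ even though $k = 0$.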

Range tree
Idea: when performing the search on the x-coordinate, we need to start filtering points on the y-coordinate earlier!
The canonical subset $P(v)$ of a node $v$ in a BST is the set of points (leaves) stored in the subtree rooted at $v$.
The range tree is a multi-level data structure:
  The main tree is a BST $T$ on the x-coordinate of the points.
  Every node $v$ of $T$ stores a pointer to a BST $T_a(v)$ (the associated structure of $v$), which stores the canonical subset $P(v)$ organized on the y-coordinate.
  2D points are stored in all leaves!
[Figure: the main tree $T$ (BST on x-coords) with a node $v$ and its canonical subset $P(v)$; $v$ points to the associated structure $T_a(v)$ (BST on y-coords), which stores $P(v)$.]

Querying the range tree
How do we query such a tree?
Use 1DRangeSearch on $T$, but replace ReportSubtree(w) with 1DRangeSearch($T_a(w)$, $y_1$, $y_2$).
What is the worst-case running time?
  Worst case: we query the associated structures at all nodes on the path down the tree.
  On level $j$, the depth of the associated structure is $\log(n/2^j) = \log n - j$.
  Total running time: $O(\log^2 n + k)$
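The total can be checked by summing over the levels of the main tree. The sketch below assumes that at most two associated structures are queried per level (one hanging off each search path) and that the $k$ reported points are counted once overall:

```latex
\begin{align*}
T_{\text{query}}(n)
  &= O(\log n) + \sum_{j=0}^{\log n} 2 \cdot O\!\left(\log \frac{n}{2^j}\right) + O(k) \\
  &= O(\log n) + O\!\left(\sum_{j=0}^{\log n} (\log n - j)\right) + O(k) \\
  &= O(\log n) + O\!\left(\frac{\log n \,(\log n + 1)}{2}\right) + O(k) \\
  &= O(\log^2 n + k).
\end{align*}
```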

Size of the range tree
What is the size of the range tree?
At each level of the main tree, the associated structures store all the data points once (with constant overhead): $O(n)$ per level. (Why?)
There are $O(\log n)$ levels.
Thus, the total size is $O(n \log n)$.
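As a worked version of this argument, assume for simplicity that $n$ is a power of two. The canonical subsets of the nodes on one level partition the point set, so each level contributes exactly $n$ stored points:

```latex
\[
\sum_{v \in T} |P(v)|
  \;=\; \sum_{j=0}^{\log n} \; \sum_{v \text{ on level } j} |P(v)|
  \;=\; \sum_{j=0}^{\log n} n
  \;=\; n(\log n + 1)
  \;=\; O(n \log n).
\]
```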

Building the range tree
How do we efficiently build the range tree?
  Sort the points on x and on y (two arrays: $X$, $Y$).
  Take the median $v$ of $X$ and create a root; build its associated structure using $Y$.
  Split $X$ into sorted $X_L$ and $X_R$, and split $Y$ into sorted $Y_L$ and $Y_R$, such that $p.x < v.x$ for any $p \in X_L$ or $p \in Y_L$, and $p.x \ge v.x$ for any $p \in X_R$ or $p \in Y_R$.
  Recursively build the left child from $X_L$ and $Y_L$ and the right child from $X_R$ and $Y_R$.
What is the running time of this? $O(n \log n)$
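The construction and the query can be sketched together in Python. Several choices here are assumptions for illustration and not taken from the slides: the associated structure of each node is a y-sorted list searched with `bisect` (standing in for the BST $T_a(v)$, with the same $O(\log m + k)$ query cost), the median point is sent to the left child (so the test is `<=` / `>`), and the names `RNode`, `build`, and `range_query` are made up. Points are assumed to have distinct x- and distinct y-coordinates.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class RNode:
    x: float                       # leaf: the point's x; internal: max x in the left subtree
    assoc: List[Point]             # canonical subset P(v), sorted by y
    ys: List[float]                # y-coordinates of assoc, used for binary search
    left: Optional["RNode"] = None
    right: Optional["RNode"] = None


def build(X: List[Point], Y: List[Point]) -> RNode:
    """X: the points sorted by x; Y: the same points sorted by y."""
    ys = [p[1] for p in Y]
    if len(X) == 1:
        return RNode(X[0][0], Y, ys)
    mid = (len(X) + 1) // 2
    split_x = X[mid - 1][0]
    YL = [p for p in Y if p[0] <= split_x]     # linear-time split keeps the y-order
    YR = [p for p in Y if p[0] > split_x]
    return RNode(split_x, Y, ys, build(X[:mid], YL), build(X[mid:], YR))


def query_assoc(v: RNode, y1: float, y2: float) -> List[Point]:
    """1D range query on the associated structure of v."""
    return v.assoc[bisect_left(v.ys, y1):bisect_right(v.ys, y2)]


def range_query(root: RNode, x1, x2, y1, y2) -> List[Point]:
    def check_leaf(v: RNode) -> List[Point]:
        px, py = v.assoc[0]
        return [v.assoc[0]] if x1 <= px <= x2 and y1 <= py <= y2 else []

    v = root                                   # find the split node on x
    while v.left is not None and (x2 <= v.x or x1 > v.x):
        v = v.left if x2 <= v.x else v.right
    if v.left is None:
        return check_leaf(v)

    out: List[Point] = []
    u = v.left                                 # follow the search path for x1
    while u.left is not None:
        if x1 <= u.x:
            out += query_assoc(u.right, y1, y2)
            u = u.left
        else:
            u = u.right
    out += check_leaf(u)

    u = v.right                                # follow the search path for x2
    while u.left is not None:
        if x2 > u.x:
            out += query_assoc(u.left, y1, y2)
            u = u.right
        else:
            u = u.left
    out += check_leaf(u)
    return out


def total_size(v: Optional[RNode]) -> int:
    """Number of point copies stored in all associated structures."""
    return 0 if v is None else len(v.assoc) + total_size(v.left) + total_size(v.right)


points = [(i, (i * 7) % 31) for i in range(31)]        # distinct x and distinct y
tree = build(sorted(points), sorted(points, key=lambda p: p[1]))
print(sorted(range_query(tree, 3, 6, 0, 20)))          # [(5, 4), (6, 11)]
```

Because `Y` arrives sorted and is split by a linear scan, each level of the recursion costs $O(n)$ after the initial sorts, which matches the $O(n \log n)$ bound.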

Range trees: summary
Range trees:
  Building (preprocessing time): $O(n \log n)$
  Size: $O(n \log n)$
  Range queries: $O(\log^2 n + k)$
The query time can be improved to $O(\log n + k)$ without sacrificing the preprocessing time or size:
  Layered range trees (use fractional cascading)
  Priority range trees (use priority search trees as associated structures)

Kd-trees
What if we want linear space?
Idea: partition trees, a generalization of binary search trees.
Kd-tree: a binary tree
  Data points are at the leaves.
  For each internal node $v$:
    x-coords of left subtree $\le v <$ x-coords of right subtree, if the depth of $v$ is even (split with a vertical line);
    y-coords of left subtree $\le v <$ y-coords of right subtree, if the depth of $v$ is odd (split with a horizontal line).
Space: $O(n)$, since points are stored once.

Example kd-tree
[Figure: seven points, a to g, on an 8 × 8 grid, partitioned by alternating vertical and horizontal split lines, next to the corresponding kd-tree with leaves d, e, b, a, c, g, f from left to right.]

Draw a kd-tree
Draw a kd-tree storing the following data points.
[Figure: eight data points, a to h, on an 8 × 8 grid.]

Querying the kd-tree
How do we answer a range query?
Observation: each internal node $v$ corresponds to a region($v$), which contains the regions of all its children.
We can maintain region($v$) as we traverse down the tree.
[Figure: the example points a to g on the 8 × 8 grid with their split lines, next to the kd-tree with leaves d, e, b, a, c, g, f.]

Querying the kd-tree
The range query algorithm (range $R$):
  If region($v$) does not intersect $R$, do not go deeper into the subtree rooted at $v$.
  If region($v$) is fully contained in $R$, report all points in the subtree rooted at $v$.
  If region($v$) only partially intersects $R$, recurse into $v$'s children.
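The three cases can be sketched in Python as below. The representation is an assumption made for the example: regions and the query range are closed rectangles `(x_lo, x_hi, y_lo, y_hi)`, the names `KdNode`, `build`, and `search` are made up, and `build` simply re-sorts the points at every level, which is simpler but slower than presorting. Points are assumed to have distinct x- and distinct y-coordinates.

```python
from dataclasses import dataclass
from math import inf
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]     # (x_lo, x_hi, y_lo, y_hi), closed


@dataclass
class KdNode:
    point: Optional[Point] = None            # set only at leaves
    split: float = 0.0                       # internal: value of the splitting line
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None


def build(points: List[Point], depth: int = 0) -> KdNode:
    if len(points) == 1:
        return KdNode(point=points[0])
    axis = depth % 2                          # even depth: split on x, odd depth: split on y
    pts = sorted(points, key=lambda p: p[axis])   # simplification: re-sort at every level
    mid = (len(pts) + 1) // 2
    return KdNode(split=pts[mid - 1][axis],
                  left=build(pts[:mid], depth + 1),
                  right=build(pts[mid:], depth + 1))


def report_subtree(v: KdNode) -> List[Point]:
    if v.point is not None:
        return [v.point]
    return report_subtree(v.left) + report_subtree(v.right)


def contained(region: Rect, R: Rect) -> bool:
    return R[0] <= region[0] and region[1] <= R[1] and R[2] <= region[2] and region[3] <= R[3]


def intersects(region: Rect, R: Rect) -> bool:
    return region[0] <= R[1] and R[0] <= region[1] and region[2] <= R[3] and R[2] <= region[3]


def search(v: KdNode, R: Rect, region: Rect = (-inf, inf, -inf, inf), depth: int = 0) -> List[Point]:
    if v.point is not None:                   # leaf: test the point itself
        x, y = v.point
        return [v.point] if R[0] <= x <= R[1] and R[2] <= y <= R[3] else []
    if not intersects(region, R):             # case 1: prune
        return []
    if contained(region, R):                  # case 2: report the whole subtree
        return report_subtree(v)
    x_lo, x_hi, y_lo, y_hi = region           # case 3: maintain region(v) for the children
    if depth % 2 == 0:
        left_region, right_region = (x_lo, v.split, y_lo, y_hi), (v.split, x_hi, y_lo, y_hi)
    else:
        left_region, right_region = (x_lo, x_hi, y_lo, v.split), (x_lo, x_hi, v.split, y_hi)
    return (search(v.left, R, left_region, depth + 1)
            + search(v.right, R, right_region, depth + 1))


points = [(i, (i * 7) % 31) for i in range(31)]
root = build(points)
print(sorted(search(root, (3, 6, 0, 20))))    # [(5, 4), (6, 11)]
```

Treating the child regions as closed on both sides is conservative: it can only cause an extra recursive call, never a wrong report, since leaves are tested exactly.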

Analysis of the search algorithm
What is the worst-case running time of the search?
Traversing the subtrees rooted at nodes $v$ such that region($v$) is fully contained in $R$ adds up to $O(k)$.
We need to find the number of regions that intersect $R$ without being contained in it: the regions that are crossed by some border of $R$.
As an upper bound for that, let's find how many regions are crossed by a vertical (or horizontal) line.
What recurrence can we write for it? $T(n) = 2 + 2T(n/4)$
Solution: $O(\sqrt{n})$
Total time: $O(\sqrt{n} + k)$
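The solution of the recurrence can be verified by unrolling it. The sketch assumes $n$ is a power of four and $T(1) = O(1)$; two levels down the tree a vertical line crosses at most two of the four grandchild regions, each holding $n/4$ points.

```latex
\begin{align*}
T(n) &= 2 + 2\,T(n/4) \\
     &= 2 + 4 + 4\,T(n/16) \\
     &= \sum_{i=1}^{m} 2^{i} + 2^{m}\, T\!\left(n/4^{m}\right) \\
     &= \left(2^{m+1} - 2\right) + 2^{m}\, T(1) \qquad \text{for } m = \log_4 n \\
     &= O(\sqrt{n}), \qquad \text{since } 2^{\log_4 n} = n^{1/2}.
\end{align*}
```

A query rectangle has four sides, so at most $4 \cdot O(\sqrt{n}) = O(\sqrt{n})$ regions are crossed by its border, which gives the total of $O(\sqrt{n} + k)$.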

Building the kd-tree
How do we build the kd-tree?
  Sort the points on x and on y (two arrays: $X$, $Y$).
  Take the median $v$ of $X$ (if the depth is even) or of $Y$ (if the depth is odd) and create a root.
  Split $X$ into sorted $X_L$ and $X_R$, and split $Y$ into sorted $Y_L$ and $Y_R$, such that
    for any $p \in X_L$ or $p \in Y_L$: $p.x < v.x$ (if the depth is even) or $p.y < v.y$ (if the depth is odd);
    for any $p \in X_R$ or $p \in Y_R$: $p.x \ge v.x$ (if the depth is even) or $p.y \ge v.y$ (if the depth is odd).
  Recursively build the left child from $X_L$ and $Y_L$ and the right child from $X_R$ and $Y_R$.
What is the running time of this? $O(n \log n)$
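A possible Python rendering of these steps is shown below. It is a sketch under stated assumptions: the median point goes to the left child (so the split test is `<=` / `>`), coordinates are distinct, and the names `KdNode`, `build_kd`, `leaves`, and `height` are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class KdNode:
    point: Optional[Point] = None            # set only at leaves
    split: Optional[float] = None            # internal: value of the splitting line
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None


def build_kd(X: List[Point], Y: List[Point], depth: int = 0) -> KdNode:
    """X: the points sorted by x; Y: the same points sorted by y."""
    if len(X) == 1:
        return KdNode(point=X[0])
    axis = depth % 2                          # even depth: split on x, odd depth: split on y
    order = X if axis == 0 else Y
    split = order[(len(order) - 1) // 2][axis]    # median coordinate; the median goes left
    # Linear scans keep both arrays sorted, so no re-sorting is needed below the root.
    XL = [p for p in X if p[axis] <= split]
    XR = [p for p in X if p[axis] > split]
    YL = [p for p in Y if p[axis] <= split]
    YR = [p for p in Y if p[axis] > split]
    return KdNode(split=split,
                  left=build_kd(XL, YL, depth + 1),
                  right=build_kd(XR, YR, depth + 1))


def leaves(v: KdNode) -> List[Point]:
    return [v.point] if v.point is not None else leaves(v.left) + leaves(v.right)


def height(v: KdNode) -> int:
    return 0 if v.point is not None else 1 + max(height(v.left), height(v.right))


pts = [(1, 3), (2, 7), (3, 1), (4, 6), (5, 2), (6, 8), (7, 4), (8, 5)]
root = build_kd(sorted(pts), sorted(pts, key=lambda p: p[1]))
print(root.split, height(root))               # 4 3
```

Each call does $O(n)$ work besides the two recursive calls on halves, so the recurrence is $T(n) = 2T(n/2) + O(n)$, in line with the $O(n \log n)$ bound.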

Kd-trees: summary
Kd-tree:
  Building (preprocessing time): $O(n \log n)$
  Size: $O(n)$
  Range queries: $O(\sqrt{n} + k)$
What about point queries? $O(\log n)$

Quadtrees
Quadtree: a four-way partition tree
  Region quadtrees vs. point quadtrees
  Kd-trees can also be point or region
Linear space
Good average query performance
[Figure: points a to f in a square that is recursively divided into four numbered quadrants, next to the corresponding quadtree.]