Indexing of moving objects. Simonas Šaltenis Aalborg University
|
|
- Arthur Wood
- 5 years ago
- Views:
Transcription
1 Indexing of moving objects Simonas Šaltenis Aalborg University
2 Welcome! The audience of the course, what this course is about and what it is not about: research results not how to use commercial DBMSs, but we will talk about them too Important: ideas Agenda: Indexing of current time and future: TPR-trees, TPR*-trees, STRIPES Indexing of the past: B x -trees, BB x -trees, 3D R-trees, Partial persistence Acknowledgment Yufei Tao (TPR*-tree slides) Simonas Šaltenis, 006
3 Outline Motivation Background Multidimensional index structures R-tree Indexing the current and future positions: TPR-tree and TPR*-tree STRIPES B x -tree Trajectory indexing: Straight-forward R-tree-based approaches BB x -tree Partial persistence Conclusion of the course Simonas Šaltenis, 006 3
4 Why Moving Objects? Position-aware, online, moving objects are enabled by the following trends. Miniaturization of electronics Advances in positioning systems (e.g., GPS, assisted GPS,...) Advances in wireless communications Examples of position-aware online moving objects GPS-enabled mobile-phones, as well as diverse types of personal digital assistants (online cameras, wrist watches, etc.). The coming years will witness very large quantities of these. Vehicles, including cars, public transportation, recreational vehicles, sea vessels, airplanes, etc. Simonas Šaltenis, 006 4
5 Sample Location-Based Services Context-aware Location-Based Services: Location-aware advertising Integrated tourist services Traffic coordination and management Identification of impending traffic jams and expected fastest routes between positions Safety-related services It is possible to monitor tourists traveling in dangerous environments, and then react to emergencies. Games! Sensor-networks also generates MO data! Monitoring of any kind-of continuous variables, e.g., temperature, pressure (1D, nd). Simonas Šaltenis, 006 5
6 Requirements There is a need to track the positions of large amounts of objects in a database main-memory solutions may not always suffice, especially if history needs to be recorded There is a need for efficient processing of queries: Find all mobile-phones that are currently no further than 500m from the supermarket Find all mobile-phones that will be no further than 500m from the supermarket in 10 minutes (according to the current knowledge of their movement) Find ten objects that are currently closest to the supermarket Find all mobile-phones that were no further than 500m from the supermarket last Saturday between 10 and 11 a.m. Simonas Šaltenis, 006 6
7 Rules of the game Measure of efficiency number of I/O operations One I/O operation reads or writes a page of data (~Kb 16Kb) from the disk to memory A page of data = a node of data structure (index) consisting of a number B of entries The size of the data set in disk blocks n = N/B CPU cost is mostly neglected Operations that have to be supported: Different kind of queries (that's why we want an index) Insertions Deletions Update = Deletion + Insertion Bulk-loading the index Simonas Šaltenis, 006 7
8 Multidimensional DS Multidimensional Data Structures are used to answer queries on the set of stationary multidimensional points Range query ([x 1, x ], [y 1,y ]): find all points (x, y) such that x 1 <x<x and y 1 <y<y, and do it faster than a linear scan (Θ(n)) preferably O(log n) in worst-case or at least on average Simonas Šaltenis, 006 8
9 1D Range-Searching: B-trees B-trees [Bayer & McCreight, 197] a balanced diskbased search tree B+-trees All data entries (keys) are in leaf-nodes Based on an ordering < Simonas Šaltenis, 006 9
10 B-trees (cont.) Internal (directory) nodes direct queries and point to lower-level nodes Leaf nodes include actual keys and pointers to data items Guaranteed 50% capacity by insert and delete algorithms split / merge routines Range query: find x, such that x 1 <x<x Can be done in O(log n + k) worst-case time, where k is the size of the answer in disk blocks x 1 x Simonas Šaltenis,
11 Multidimensional Range Searching Much harder than one-dimensional problem A lot of multidimensional index methods have been proposed The theoretically optimal ones [Arge et al. 1999] that achieve O(log n + k) worst-case query performance: are unpractically complex, require a little bit more than linear-space Easy to implement, linear-space methods: K-D-B-trees, Quadtrees, Grid Files (for points) R-trees and variations (work for non-point objects as well) Concurrency control, recovery protocols for R-trees are well researched Implemented in Oracle, IBM Informix, PostgreSQL, MySQL Not only range queries, but a wide variety of other types of queries are supported by R-trees (nearest-neighbor queries, joins) Simonas Šaltenis,
12 Spatial Indexing With the R-Tree Example R1 p6 Query R3 p1 p R4 p4 R5 p7 R1 R p3 p5 p8 R3 R4 R5 R6 R7 R R7 p1 R6 p9 p11 p10 p13 p1 p p3 p4 Pointers to data items p5 p6 p7 p8 p9 p10p11 p1p13 Simonas Šaltenis, 006 1
13 R-tree Properties Leaf entry = <n-dimensional point, rid> Internal entry = <n-dim MBR, ptr to a child node> MBR a Minimum Bounding Rectangle of all points in the sub-tree pointed to by ptr R-tree is a balanced tree all leaves are at same depth from the root Nodes are kept at least m% full (except root) through insertion and deletion algorithms m is usually chosen to be 40%. m is the minimum fill factor, depending on the workload the average fill factor is usually 70%. Simonas Šaltenis,
14 Range Query in R-trees Range query: RangeQuery(Q, root) RangeQuery(Q, node) If node is internal, for each entry <MBR, ptr> in node: if MBR overlaps Q, RangeQuery(Q, ReadNode (ptr)) If node is leaf, for each entry <E, rid> in node: if E overlaps Q, report rid Note: We may have to search several subtrees at each node! What about B-trees? Worst-case performance O(n)! But in practice, R-trees exhibit good query performance for various data sets How do we insert and delete? Simonas Šaltenis,
15 Grow-Post trees Grow-Post trees: generalized R-tree-type indexes Bounding predicate (BP) = something that describes entries in a subtree Building blocks of algorithms: Consistent(BP, Q) returns true if results of query Q can be under BP (in the R-tree, MBR intersects Q) PickSplit(node) splits a page of entries into two groups Penalty(BP, E) returns an estimate how worse BP becomes if E is inserted under it Union(node) computes a BP of a collection of entries (in the R-tree, computes an MBR minimum and maximum in all dimensions ) Internal Nodes Leaf Nodes BP 1 BP..... BP n Simonas Šaltenis,
16 Insertion Insert(E) leaf = ChoosePath(E, root) Insert E into leaf PropogateUp (leaf) ChoosePath(E, node) If node is leaf, return node. From all entries in node, choose entry <MBR, ptr> with the smallest Penalty (MBR, E). ChoosePath(E, ReadNode(ptr)). PropogateUp(node) If node is overfull, call PickSplit(node) to produce n1 and n, replace node s old entry in its parent by e1 = Union(n1), e = Union(n), call PropogateUp(node's parent) Else if e = Union(node) is different from node s old entry in its parent, replace the old entry with e, callpropogateup(node's parent). Create a new root with two entries whenever a root is split. Simonas Šaltenis,
17 Heuristics for Penalty Heuristics of least area enlargement and smallest area are used in the R-tree s Penalty. R1 p6 R3 p1 p R4 p4 R5 p7 R1 R p3 p5 p8 R3 R4 R5 R6 R7 R R7 p1 R6 p9 p11 p10 p13 p14 p1 p p3 p4 Pointers to data tuples p5 p6 p7 p8 p9 p10p11 p1p13 Simonas Šaltenis,
18 Heuristics for Penalty Heuristics of least area enlargement and smallest area are used in the R-tree s Penalty. R1 p6 R3 p1 p R4 p4 R5 p7 R1 R p3 p5 p8 R3 R4 R5 R6 R7 R R7 p1 R6 p9 p11 p10 p13 p14 p1 p p3 p4 Pointers to data tuples p5 p6 p7 p8 p9 p10p11 p1p13 p14 Simonas Šaltenis,
19 PickSplit The entries in a node node plus the newly inserted entry must be distributed between n1 and n The goal is to reduce a likelihood of both n1 and n being searched on subsequent queries Again we use heuristics of minimum area. Redistribute so as to minimize the area of n1 plus the area of n. Problem there is an exponential number of possible distributions of entries to consider. Solutions: Guttman s quadratic gready algorithm Good split R*-tree s algrithm Bad split Simonas Šaltenis,
20 Deletion Delete(E) Using the search procedure, find a leaf where entry E is located Remove E from leaf. Call PropogateUp(leaf). PropogateUp(leaf) 1. If leaf is underfull, deallocate the node leaf remove leaf s entry in its parent, call PropogateUp(leaf s parent), and reinsert all leaf s entries.. Else if e = Union(leaf) is different from leaf s old entry in its parent, replace the old entry with e, call PropogateUp (leaf s parent). No additional heuristics are involved in Delete, underfull nodes are handled using Insert as a subroutine. Simonas Šaltenis, 006 0
21 Comments on R-Trees Works well for - 4 D datasets. Several variants (notably, R + and R*-trees) have been proposed; widely used Supports a wide variety of queries Point / range queries Spatial join queries [Brinkhoff et al., 1993] Direction, topological, distance queries [Papadias et al., 1995] k- Nearest neighbor queries [Roussopoulos et al., 1995] Simonas Šaltenis, 006 1
22 Exercises R-trees Insert points p1, p, p1, p13, p5, p14, p8, p7, p11, p9 from slide 17 into an initially empty R-tree with the node capacity of 3 entries and minimum allowed fan-out of. Use Guttman s quadratic split algorithm. Would you build a better tree from scratch if all points were known in advance? What is the difference between the B-tree and the 1D R-tree? What kind of data the 1D R-tree can store that the B-tree can not? What is the bounding predicate in B-trees? Consider nearest neighbor queries ( find a data point that is nearest to the query point ) in R-trees: Can they be implemented redefining the Consistent(E, Q) method, do you need something more? Simonas Šaltenis, 006
23 Outline Motivation Background Multidimensional index structures R-tree Indexing the current and future positions: TPR-tree and TPR*-tree STRIPES B x -tree Trajectory indexing: Straight-forward R-tree-based approaches BB x -tree Partial persistence Conclusion of the course Simonas Šaltenis, 006 3
24 Problem Statement We address the problem of indexing the ever-changing current and predicted future positions of point objects moving in one, two, and three-dimensional space. Indexing challenges specific to MOs continuous change of positions extrapolation between the last update and the current time must be supported hyper-dynamic workloads high-rates of updates Simonas Šaltenis, 006 4
25 Modeling Continuous Movement In conventional databases, data is assumed constant unless explicitly modified. With continuous movement, this is problematic. Too frequent updates Outdated, inacurate data Simonas Šaltenis, 006 5
26 Modeling Continuous Movement In conventional databases, data is assumed constant unless explicitly modified. With continuous movement, this is problematic. Too frequent updates Outdated, inacurate data Instead of storing position values, we store positions as functions of time, yielding time-parameterized positions. We use linear functions to capture the present and future positions. x( t) = x( t0) + v( t t0), where t now Updates are less frequent Tentative future queries are supported t 0 For example, given, the current and anticiapted, future position of a twodimensional point can be described by four parameters. x ( t0), y( t0), v x, v y Simonas Šaltenis, 006 6
27 Modeling Continuous Movement Three ways to think about continuously moving points in d-dimensional space: Lines in (d+1)-dimensional space d spatial dimensions and 1 time dimension Points in d-dimensional space d spatial and d velocity dimensions (function parameters: ) Time-parameterized points in d-dimensional space x t ), v ( 0 x x(t 0 ) o 3 o o 1 o o 1 o t v Simonas Šaltenis, 006 7
28 Queries Type 1: objects that intersect a given rectangle at t x Type : objects that 6 o 3 intersect a given 5 4 rectangle sometime 3 o from t1 to t o 1 o Type 3: objects that 1 1 o intersect a given 4 moving rectangle t sometime between t 1 andt We can expect, that most queries will be consentrated in the sliding window [CT, CT+W], i.e. CT <= t, t 1, t <= CT + W Simonas Šaltenis, 006 8
29 Time-Parameterized Rectangles The TPR-tree [Šaltenis et al. 000] is based on the R- tree. Moving points are bounded with time-parameterized rectangles. Are bounding from now on. The R-tree allows overlap. The tree employs conservative bounding rectangles. Union( node) : min i c i c max i c i c { } x ( t ) = min o. x ( t ) o node { } x ( t ) = max o. x ( t ) o node { } v = min o. v o node { } i Simonas Šaltenis, min i v = max o. v o node max i i
30 Entry Structure, Querying Do we need to store t c in each entry? No we use one common reference time t r (the same for data points): x = x ( t ) = x ( t ) + v ( t t ) Entry structure: <TPBR, ptr> min min min min i i r i c i r c x = x ( t ) = x ( t ) + v ( t t ) max max max max i i r i c i r c min max min max min max min max TPBR = MBR,VBR = ( x, x, x, x ),( v, v, v, v ) At any t > CT we can get a valid R-tree: TPR-tree(t) = R-tree x () t = x ( t ) + v ( t t ) min min min i i r i r x () t = x ( t ) + v ( t t ) max max max i i r i r Simonas Šaltenis,
31 Tightening Ideally, bounding rectangles should be always minimal. Excessive storage cost x t TPBRs are tightened (i.e., recomputed) whenever a bounded node is modified Simonas Šaltenis,
32 Tightening Ideally, bounding rectangles should be always minimal. Excessive storage cost x t TPBRs are tightened (i.e., recomputed) whenever a bounded node is modified Simonas Šaltenis, 006 3
33 Simonas Šaltenis, Insertion: Grouping Points How to group moving points (Penalty and PickSplit)? The R-tree s algorithms minimize characteristics of MBRs such as area, overlap, and margin. How does that work for moving points?
34 Insertion in the TPR-Tree The bounding rectangle characteristics (area, overlap, and margin) are functions of time. The goal is to minimize these for all time points from now to now+h. Minimizing the characteristics for time now + H/ does not work (e.g., the area of a conservative bounding rectangle is not linear). We use the regular R*-tree algorithms, but all bounding rectangle characteristics are replaced by their integrals. now+ H now A ( t) dt, where A(t) is, e.g., the area of an MBR What H to use? H depends on the update rate, and on how far queries may reach into the future (W). Simonas Šaltenis,
35 Example I We illustrate the working of the TPR-tree by means of an example. The subsequent figures are generated automatically, by the index code used for performance experiments. Data 0 one-dimensional points are used. Index Parameters Page size = 64 (5 entries in leaf nodes and 3 in non-leaf nodes). H = 8. Simonas Šaltenis,
36 Example II CT = 0 At CT=1, the point at x = 0, v = 0 is updated to have x = 18.5, v = Simonas Šaltenis,
37 Example III CT = 1 Inserting a moving point at position 14 with v = 0.5. Simonas Šaltenis,
38 Example IV After insertion Simonas Šaltenis,
39 What H to use? Intuitivly: we want H to be equal to the time during which queries will see the node that we consider modifying: H depends on the update rate, and on how far queries may reach into the future (W) Experiments show that H = UI + W consistently gives good query performance (UI average update interval) The system can track UI automatically (How?) W is usually smaller than UI can be tracked automatically too Simonas Šaltenis,
40 TPR-Tree Bulk-Loading I Regular R-tree bulk loading aims at dividing the space into roughly equal, square MBRs (e.g., STR). It is a straightforward approach to use STR, interpreting the d-dimensional moving points as d-dimensional points. The ratio between the spatial and velocity extents of BRs is important. α We introduce a parameter that expresses how many times the velocity extents of BRs are longer than their spatial extents. Given a fixed number of BRs to produce, we can make them large in the spatial dimensions, but small in the velocity dimensions (i.e., growing slowly). This corresponds to a small α. make them small in the spatial dimensions, but large in the velocity dimensions (i.e., fast growing). This corresponds to a large. α Simonas Šaltenis,
41 TPR-Tree Bulk Loading II Example: Using either initially large, but slow-growing, or initially small, but fast-growing, time-parameterized BRs. x(t 0 ) V s S v x t x(t 0 ) t α = 1 (good for a large H) 4 α = 1 Goal: find an optimal α as a function of H. v x (good for a small H) Simonas Šaltenis,
42 TPR-Tree Bulk Loading III One-dimensional uniform data Let k be the number of nodes and the space-time ratio S V SV k = s =. Area (length) A( t) = s + tsα. s sα kα H SV We integrate A( t) : I = s (1 + tα ) dt = ( H + 0 kα I To minimize I, we solve = 0 α = α H For D data,, and for 3D data, α α = H H gets smaller with increasing dimensionality Intuitive as hyper-volumes grow faster in more dimensions α H α). α H Simonas Šaltenis, 006 4
43 TPR-Tree Bulk Loading IV A modified STR is used that achieves the desired -ratio. Example: 000 points, 0 points per leaf-node. α H = 4, α = 1 H = 16, α = 1 8 Simonas Šaltenis,
44 What about Expanding BRs? Will the expanding TPBRs ruin the performance? That's what the experiments show: Settings: objects update only once per hour! D data, with W = 40. Due to the constant influx of updates, the performance of the TPR-tree does not degrade after reaching a certain level. Simonas Šaltenis,
45 Summary The TPR-tree indexes the current and predicted future positions of moving objects. The TPR-tree is based on the proven, widely used R-tree technology The tree extends the R*-tree by introducing conservative, timeparameterized bounding rectangles, which are tightened regularly. The tree s algorithms use integrals of area, overlap, etc. The tree can be tuned to take advantage of a specific update rate and querying window length. Other types of queries that are supported by the R-tree can be supported by the TPR-tree, e.g., nearest-neighbor queries. Simonas Šaltenis,
46 Exercises TPR-tree Write a Consistent (Q, E) function for a d-dimensional TPR-tree. E = [ x, x ],...,[ x, x ];[ v, v ],...,[ v, v ], 1 1 d d 1 1 d d Q= [ q, q ],...,[ q, q ];[ w, w ],...,[ w, w ];[ t, t ] 1 1 d d 1 1 d d where [ x, x ], and [ q, q ] are TPBRs' extents at time t. i i i i Consider SR-tree, where instead of MBR s minimum bounding circles (spheres) are used. Define the bounding predicate (and a Consistent (Q, E) function) of a time parameterized SR-tree. Simonas Šaltenis,
47 TPR*-tree Observation: TPR-tree integrated heuristics optimize the tree for time-slice queries (Why?) What if we want to target window and moving queries? A query is given as a TPBR plus a time window: Q = ( qx, qx, qx, qx ),( qv, qv, qv, qv ),( t, t ) min max min max min max min max min max TPR*-tree [Tao et al., 003]: Uses heuristics targeted for window and moving queries Uses other (structural) improvements of algorithms: Modified ChoosePath Modified PickSplit and PickWorst (for reinsertions) Active tightening during deletions Simonas Šaltenis,
48 Sweeping Region Heuristics Once again: What does it mean for a static rectangle to intersect with a static query? y axis y axis O Q O' Q center x axis i x axis Simonas Šaltenis,
49 Sweeping Region Heuristics What does it mean for a TPBR O to intersect with a moving query at some time point t? The same as for the TPBR O' to intersect the static query centroid: MBR of O' on i-th axis: VBR of O' on i-th axis: q q x x + min max max min ( v qv, v qv ) min max ( i, i i i ) i i i i y axis O -1 Q y axis O' 3 3 Q center x axis i x axis Simonas Šaltenis,
50 Sweeping Region Heuristics What does it mean for a TPBR O to intersect with a moving query at any point in time interval (t min,t max )? The same as for the sweeping region of TPBR O' to intersect the static query centroid y axis O' 3 Q center i x axis Simonas Šaltenis,
51 Sweeping Region Heuristics Instead of integrals of TPR-tree, areas of sweeping regions are used: E.g.: enlargment of the integral of area of O is replaced by the enlargment of the area of the sweeping region of O' Assumption about query: static point Compare these two heuristics: What happens if we have a static MBR? What happens if we have a moving MBR that does not change area? Does this confirm the intuition that these heuristics optimize for differnet queries? Simonas Šaltenis,
52 TPR Problem: ChoosePath To insert an entry, the TPR-tree picks the sub-tree incurring the minimum penalty y axis the (absolute) values of all velocities are 1 b a g y axis b g p d 4 i (static) e f c time 0 d h x axis What would a TPR-tree do? Simonas Šaltenis, i e f c inserting p at time May result in inserting an entry into a bad sub-tree a h x axis
53 TPR* solution: ChoosePath Aims at finding the best insertion path globally, namely, among all possible paths. Observation: We can find this path by accessing only a few more nodes (than the TPR-tree algorithm) y axis i b e f c inserting p at time g p a h d x axis Maintain a heap: [(g),0], [(h),0], [(i),0] the path expanded so far the accumulated penalty so far Simonas Šaltenis,
54 TPR* solution: ChoosePath Aims at finding the best insertion path globally, namely, among all possible paths. Observation: We can find this path by accessing only a few more nodes (than the TPR-tree algorithm). 10 y axis Visit node g: 8 6 b g p d [(h),0], [(a,g),3], [(i),0], [(b,g),3] 4 i e f c inserting p at time a h x axis complete paths already although nodes a and b are not visited Simonas Šaltenis,
55 TPR* solution: ChoosePath Aims at finding the best insertion path globally, namely, among all possible paths. Observation: We can find this path by accessing only a few more nodes (than the TPR-tree algorithm) y axis i b e f c inserting p at time g p a h d x axis Visit node h: [(a,g),3], [(d,h),9], [(c,h),17], [(i),0], [(b,g),3] The algorithm stops now. Simonas Šaltenis,
56 TPR* solution: ChoosePath It is more expensive. Why does it make sense to do? Query performance is obviously improved Insertions are more expensive But overall update improves: Remember: update is deletion + insertion The main part of the cost of deletion (and all update) is search, which we improved with the new ChoosePath! Simonas Šaltenis,
57 TPR: Tightening MBRs in deletion Entry deletion requires first finding the entry, which accesses many nodes of the tree. The TPR-tree uses this fact to tighten the MBR of non-leaf entries. Assume nodes h and i are accessed before e is found; then the TPR-tree will tighten the MBR of i only (enclosing g and f) y axis the (absolute) values of all velocities are 1 f h g a b i e c time 0 d j x axis y axis h i g b f a e deleting e at time 1 d c j x axis Simonas Šaltenis,
58 TPR: Tightening MBRs in deletion Entry deletion requires first finding the entry, which accesses many nodes of the tree. The TPR-tree uses this fact to tighten the MBR of non-leaf entries. Assume nodes h and i are accessed before e is found; then the TPR-tree will tighten the MBR of i only (enclosing g and f) y axis the (absolute) values of all velocities are 1 f h g a b i e c time 0 d j x axis y axis h i g b f a d c j x axis after deleting e at time 1 Simonas Šaltenis,
59 TPR* solution: Active tightening Tightening more entries for free. Assume nodes h and i are accessed before e is found; then the TPR*-tree will tighten the MBR of both h and i y axis the (absolute) values of all velocities are 1 f h g a b i e c time 0 d j x axis y axis h i g b f a e deleting e at time 1 d c j x axis Simonas Šaltenis,
60 TPR* solution: Active tightening Tightening more entries for free. Assume nodes h and i are accessed before e is found; then the TPR*-tree will tighten the MBR of both h and i y axis the (absolute) values of all velocities are 1 f g i e d y axis i g f 4 h a b c j 4 b h a d c j time 0 x axis x axis after deleting e at time 1 Simonas Šaltenis,
61 TPR* solution: Active tightening Another example: Assume the shaded nodes are accessed to find e. The active tightening can tighten the MBR of n 5, n 6, n 3, and n 4. But not n 1 and n. Simonas Šaltenis,
62 Outline Motivation Background Multidimensional index structures R-tree Indexing the current and future positions: TPR-tree and TPR*-tree STRIPES B x -tree Trajectory indexing: Straight-forward R-tree-based approaches BB x -tree Partial persistence Conclusion of the course Simonas Šaltenis, 006 6
63 Modeling Continuous Movement Three ways to think about continuously moving points in d-dimensional space: Lines in (d+1)-dimensional space d spatial dimensions and 1 time dimension Points in d-dimensional space d spatial and d velocity dimensions (function parameters: x ( t 0 ), v ) Time-parameterized points in d-dimensional space our approach x o 3 o o t o x(t 0 ) o 1 o v Simonas Šaltenis,
64 STRIPES Using Duality The main idea of STRIPES [Patel et al., 004] is to transform queries and points using the duality transformation: Lines become points, points become lines Trajectories become points Queries become region queries Use Quadtrees to index points That's it! Simonas Šaltenis,
65 Quadtrees Quadtree a four-way partition tree Linear space Good average query performance Orcale has them a b 1 c d e c f 3 f 4 d e b a Simonas Šaltenis,
66 Query Transformation Exercise Let's see how a window query Q = ([x l, x u ],[t l, t u ]) is transformed: x u x l 1 x x(t 0 ) o ? t v t i t u 1 o 1 o Simonas Šaltenis,
67 STRIPES: Summary Avoids the problem of expanding TPBRs Uses reference positions: No tightening is performed Has to assume L (the longest update interval) and to maintain two indixes: from t ref to t ref + L, and from t ref + L to t ref + L Index is not tuned to a specific horizon H (determined by updating and querying pattern) Remember TPR-trees bulk-loaded with different values of the α parameter (corresponding to the different values of H)! Simonas Šaltenis,
68 Outline Motivation Background Multidimensional index structures R-tree Indexing the current and future positions: TPR-tree and TPR*-tree STRIPES B x -tree Trajectory indexing: Straight-forward R-tree-based approaches BB x -tree Partial persistence Conclusion of the course Simonas Šaltenis,
69 B x -tree [Jensen et al., 004] The goal: to index the current and future positions of moving objects using B-trees B-trees are at the core of every DBMS Very good concurrency-protocols are available for B-trees (and are implemented in all DBMS) improtant when we track a lot of objects B-tree updates are very efficient and we have a lot of them But how do we store and query these future linear trajectories of objects? Key ideas used: Space-filling curves Query expansion Simonas Šaltenis,
70 Space-filling curves How can we index multi-dimensional points with B-trees? Space-filling curve: An infinite curve that visits all points of a discretized space (usually one quadrant (if D) of space) In this way it uniquely maps multi-dimensional points to natural numbers Building Hilbert curve: (3,4) Simonas Šaltenis,
71 Transforming Queries Other space-filling curves can be constructed, e.g., Peano curve (Z-ordering) How do we perform range queries on the B + - tree storing the transformed points? Multi-dimensional range query is transformed into a set of intervals Simonas Šaltenis,
72 Enlarging Queries What about moving points? Just index their positions re-computed at some reference time point (t ref ) How do we ask range queries then? Let's assume we know the VBR of all points, i.e., maximum velocities of objects in all four directions: v l 1,v u 1,v l,v u Insight: If the query time is t q, we can expand the query to catch all necessary objects, knowing that objects traveling at maximum speed v, could have moved not further than v t ref t q. Simonas Šaltenis, 006 7
73 Multiple B-trees Problem: we may need to expand queries too much if the difference t ref t q is large. Idea: divide objects into n + 1 groups use a separate B- tree with a separate t ref for each group! Example: n + 1 = Only half of the points are queried with the maximum enlargment. What n to choose? t 1 t t q Large n: Smaller average query enlargment But, more queries are performed in more B-trees Authors propose to use n = Simonas Šaltenis,
74 Index Structure All three B + -trees are combined into one that has three partitions: 0, 1, and. Objects are assigned to partitions based on their last update time-stamps t u. Let's assume that all objects update more frequently than every Δt mu time units: t lab =t u t mu /n l partition= t lab / t mu /nmod n1 xrep= xvaluexv t lab t u The key then is (partition,xrep) Simonas Šaltenis,
75 Technicalities If an object does not update itself in Δt mu, it is migrated from partition 0 to partition n. Instead of using maximum VBR of all objects for expansion: The space is divided into a grid of cells, each sell maintains a VBR of objects in that cell (in main memory). The global VBR is used to expand the query, then local VBR is computed from the covered cells. The local VBR is used to recompute a possibly smaller expansion of query. The filtering of query results has to be done at the end. Simonas Šaltenis,
76 Exercise Which objects will be returned by the query q? Which are false drops? What is the precision of the query? Assume global VBR for query expansions Positions of objects are given at update times (given after comma) E, C,4 q B,4 A,1 D, Δt mu =6 t q =7 0 3 CT 6 9 time Simonas Šaltenis,
77 Outline Motivation Background Multidimensional index structures R-tree Indexing the current and future positions: TPR-tree and TPR*-tree STRIPES B x -tree Trajectory indexing: Straight-forward R-tree-based approaches BB x -tree Partial persistence Conclusion of the course Simonas Šaltenis,
78 Trajectory data Positions between measurements are linearly interpolated. What we have are 3D polypines Two spatial dimensions + one time dimension The same types of queries as for the current-future data Simonas Šaltenis,
79 Recording of Movement How do we get the data? It is given as a static file of positions or new data is constantly appended to the end If new data is appended to the end: The last segment is probably an infinite trajectory based on the given velocity vector We have to assume that an object is somewhere between the last update and the current time Then, whenever an update happens, we have to update the last infinite peace of trajectory into a new peace of trajectory and insert a new infinite peace of trajectory Simonas Šaltenis,
80 Recording of Movement: Example pos Linear prediction of future movement Recorded polyline u1 u u3 u4 u5 CT time Simonas Šaltenis,
81 R-tree Based Indexing of Trajectories We can use a 3D R-tree! But Line segments (especially long ones) are difficult to index because they introduce large MBRs with a lot of overlap We can cut long line segments into smaller the index becomes larger STR-tree and TB-tree [Pfoser et al., 000]: Modifications of 3D R-tree, that strive for trajectory preservation (segments of the same trajectory should be clustered together): Heuristics of the tree changed In the TB-tree, a linked list conects line segments in the leaf level into a trajectory Trajectory queries are supported: Find all the taxis that on Monday at 10:00am were 1km around Fredrik Bajers Vej 7 and then drove to the city center Problem historical time-slice queries perform worse and worse, the more history you accumulate. Simonas Šaltenis,
82 BB x -tree It is easy to see how the B x -tree can be modified to record history: Have a separate B + -tree for each partition objects updated during [t ref Δt mu /n; t ref ] are stored in this B + -tree according to their positions at t ref Entry structure: (xrep, t start, t end, ptr), xrep is used as a key, for internal nodes t start and t end is the minimum and the maximum of timestamps of children. Simonas Šaltenis, 006 8
83 BB x -tree A query accesses those B + -trees whose lifespans overlap with the query BB x -tree can support both historical and future queries Technicalities: Together with each tree the histogram of VBRs is stored If an object does not update itself in Δt mu, it is migrated to the last partition: But it is not easy to do this, some sort of trigers are needed ( priority queue) Simonas Šaltenis,
84 Outline Motivation Background Multidimensional index structures R-tree Indexing the current and future positions: TPR-tree and TPR*-tree STRIPES B x -tree Trajectory indexing: Straight-forward R-tree-based approaches BB x -tree Partial persistence Conclusion of the course Simonas Šaltenis,
85 Partial Persistence Data structures that do not record history of changes are called ephemeral data structures (e.g., B-tree, R-tree) The technique of partial persistence transforms an ephemeral data structure into a partially persistent data structure which records the history and: Answers historical time-slice queries with the same asymptotic worst-case performance as an ephemeral sturcture Has the size proportional to the number of updates How is this achieved? Let's look at R-trees Entry structure: (point/mbr,t start, t end, ptr) For data (leaf) entries t start is the insertion time, t end is the deletion time. For non-leaf entries t start is the time when the ptr node was created, t end is the time when the ptr node was time split Simonas Šaltenis,
86 Time Split An entry is alive if t end =, else it is dead. The key operation is time spliting of a node A: A new node is created and all live entries from A are copied to the new node. No cahnges are done to A, and it will never change again, t end of the live parent entry of A is set to the current time, a new live entry is created pointing to the new node. A node is called dead if it was time split. When the root is time split, a pointer to it is put in a root array Queries consider only entries live at the time of query Simonas Šaltenis,
87 Time Split: Example Time spliting a node when it is overfull (B = 5), CT = 11: A(5,10) B(7, ) A(5,10) B(7, ) M(11, ) C(5,9) D(6, ) E(5, ) F(5,10) G(7, ) H(7,10) I(8,9) J(8, ) K(8,10) C(5,9) D(6, ) E(5, ) F(5,10) G(7, ) H(7,10) I(8,9) J(8, ) K(8,10) G(7, ) J(8, ) L(11, ) Insert L(11, ) What if an overfull node is full of live entries? Are there other situations when we do a time split? What does it give us? Simonas Šaltenis,
88 Invariants Weak version condition For each moment in time and for each non-root node, the number of alive entries is either zero or at least d (d = B/ k for some k). This guarantees that a time-slice for any time point sees an ephemeral structure that has O(B) fan-out! Queries have the same(good) worst-case performance The node is time-split when it is overfull or if weak version condition is violated. Strong version condition The number of alive entries in a node right after time-split is more than d+e and less than B-e (e = B/ j for some j) This ensures that the size of the index remains linear (why?) Simonas Šaltenis,
89 Algorithm Delete Insert Weak version underflow 1 Invariants OK AD Overflow Time split D Strong version underflow Time split sibling Merge two live nodes D Invariants OK A A a node of alive entries is bounded D a node of dead entries is bounded AD a mixed node is bounded Strong version overflow Key split live node A A Simonas Šaltenis,
90 Partially Persistent TPR-tree: TPBRs Straightforward TPBRs Double TPBRs with static "heads" x j Time of last update x j Double entry TPBR in TPR-tree x j x j x j x j Time of last update t CT t t CT t x j x j After insertion and time split x j x j Dead entry Time split Alive entry x j x j Time split Dead entry Double entry with an empty "head" t CT t t CT t Simonas Šaltenis,
91 Partially Persistent TPR-tree To make the TPR-tree partially persistent new double TPBRs are introduced for nodes with a mix of dead and alive entries Black box of ephemeral algorithms is opened Partial persistence algorithms have to be modified further to allow updating of the last-recorded trajectory segment (from infinite prediction to a finite line segment) Recall that time splits may have introduced a lot of copies of this live segment Simonas Šaltenis,
92 Exercise: Partially Persistent R-tree Do one operation per time unit: Insert p1, p, p1, p13, p5, p14; Delete p5, p; Insert p8; Delete p1; Insert p7, p11, p9. Assume: B = 6, m = 50% (3), d =, e = 1 p6 p1 p p4 p7 p3 p8 p5 p1 p9 p11 p10 p13 Simonas Šaltenis, 006 9
93 Not Covered Topics Uncertainty Either queries or objects are expanded Which of the covered indices support circle-objects? Update efficiency Using object-id-based look-ups in hash table to eliminate R-tree deletion multiple path searches [M. Lee et al., 003] Using buffer-tree [Arge, 1995, 1999] techniques Main-memory indexing Cache-conscious and cache-oblivious data structures Different types of queries (knn, closest pairs, etc.) Papadias et al. Infrastructure-constrained indexing Papadias et al., 003 Simonas Šaltenis,
94 Thank you How did you like the course? Acknowledgments Yufei Tao for his TPR*-tree slides Simonas Šaltenis,
Indexing the Positions of Continuously Moving Objects
Indexing the Positions of Continuously Moving Objects Simonas Šaltenis Christian S. Jensen Aalborg University, Denmark Scott T. Leutenegger Mario A. Lopez Denver University, USA SIGMOD 2000 presented by
More informationIntroduction to Indexing R-trees. Hong Kong University of Science and Technology
Introduction to Indexing R-trees Dimitris Papadias Hong Kong University of Science and Technology 1 Introduction to Indexing 1. Assume that you work in a government office, and you maintain the records
More informationSpatiotemporal Access to Moving Objects. Hao LIU, Xu GENG 17/04/2018
Spatiotemporal Access to Moving Objects Hao LIU, Xu GENG 17/04/2018 Contents Overview & applications Spatiotemporal queries Movingobjects modeling Sampled locations Linear function of time Indexing structure
More informationMultidimensional Indexing The R Tree
Multidimensional Indexing The R Tree Module 7, Lecture 1 Database Management Systems, R. Ramakrishnan 1 Single-Dimensional Indexes B+ trees are fundamentally single-dimensional indexes. When we create
More informationSpatial Data Management
Spatial Data Management [R&G] Chapter 28 CS432 1 Types of Spatial Data Point Data Points in a multidimensional space E.g., Raster data such as satellite imagery, where each pixel stores a measured value
More informationSpatial Data Management
Spatial Data Management Chapter 28 Database management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1 Types of Spatial Data Point Data Points in a multidimensional space E.g., Raster data such as satellite
More informationPrinciples of Data Management. Lecture #14 (Spatial Data Management)
Principles of Data Management Lecture #14 (Spatial Data Management) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v Project
More informationChap4: Spatial Storage and Indexing. 4.1 Storage:Disk and Files 4.2 Spatial Indexing 4.3 Trends 4.4 Summary
Chap4: Spatial Storage and Indexing 4.1 Storage:Disk and Files 4.2 Spatial Indexing 4.3 Trends 4.4 Summary Learning Objectives Learning Objectives (LO) LO1: Understand concept of a physical data model
More informationAbstract. 1. Introduction
Predicted Range Aggregate Processing in Spatio-temporal Databases Wei Liao, Guifen Tang, Ning Jing, Zhinong Zhong School of Electronic Science and Engineering, National University of Defense Technology
More informationR-Trees. Accessing Spatial Data
R-Trees Accessing Spatial Data In the beginning The B-Tree provided a foundation for R- Trees. But what s a B-Tree? A data structure for storing sorted data with amortized run times for insertion and deletion
More informationAdvances in Data Management Principles of Database Systems - 2 A.Poulovassilis
1 Advances in Data Management Principles of Database Systems - 2 A.Poulovassilis 1 Storing data on disk The traditional storage hierarchy for DBMSs is: 1. main memory (primary storage) for data currently
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 6 - Storage and Indexing
CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2009 Lecture 6 - Storage and Indexing References Generalized Search Trees for Database Systems. J. M. Hellerstein, J. F. Naughton
More informationIndexing. Week 14, Spring Edited by M. Naci Akkøk, , Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel
Indexing Week 14, Spring 2005 Edited by M. Naci Akkøk, 5.3.2004, 3.3.2005 Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel Overview Conventional indexes B-trees Hashing schemes
More informationR-Tree Based Indexing of Now-Relative Bitemporal Data
36 R-Tree Based Indexing of Now-Relative Bitemporal Data Rasa Bliujūtė, Christian S. Jensen, Simonas Šaltenis, and Giedrius Slivinskas The databases of a wide range of applications, e.g., in data warehousing,
More informationWhat we have covered?
What we have covered? Indexing and Hashing Data warehouse and OLAP Data Mining Information Retrieval and Web Mining XML and XQuery Spatial Databases Transaction Management 1 Lecture 6: Spatial Data Management
More informationData Organization and Processing
Data Organization and Processing Spatial Join (NDBI007) David Hoksza http://siret.ms.mff.cuni.cz/hoksza Outline Spatial join basics Relational join Spatial join Spatial join definition (1) Given two sets
More informationMultidimensional Indexes [14]
CMSC 661, Principles of Database Systems Multidimensional Indexes [14] Dr. Kalpakis http://www.csee.umbc.edu/~kalpakis/courses/661 Motivation Examined indexes when search keys are in 1-D space Many interesting
More informationExternal-Memory Algorithms with Applications in GIS - (L. Arge) Enylton Machado Roberto Beauclair
External-Memory Algorithms with Applications in GIS - (L. Arge) Enylton Machado Roberto Beauclair {machado,tron}@visgraf.impa.br Theoretical Models Random Access Machine Memory: Infinite Array. Access
More informationI/O-Algorithms Lars Arge
I/O-Algorithms Fall 203 September 9, 203 I/O-Model lock I/O D Parameters = # elements in problem instance = # elements that fits in disk block M = # elements that fits in main memory M T = # output size
More informationUpdate-efficient indexing of moving objects in road networks
DOI 1.17/s177-8-52-5 Update-efficient indexing of moving objects in road networks Jidong Chen Xiaofeng Meng Received: 22 December 26 / Revised: 1 April 28 / Accepted: 3 April 28 Springer Science + Business
More informationChapter 12: Indexing and Hashing. Basic Concepts
Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition
More informationDatenbanksysteme II: Multidimensional Index Structures 2. Ulf Leser
Datenbanksysteme II: Multidimensional Index Structures 2 Ulf Leser Content of this Lecture Introduction Partitioned Hashing Grid Files kdb Trees kd Tree kdb Tree R Trees Example: Nearest neighbor image
More informationChapter 12: Indexing and Hashing
Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL
More informationModule 4: Tree-Structured Indexing
Module 4: Tree-Structured Indexing Module Outline 4.1 B + trees 4.2 Structure of B + trees 4.3 Operations on B + trees 4.4 Extensions 4.5 Generalized Access Path 4.6 ORACLE Clusters Web Forms Transaction
More informationBackground: disk access vs. main memory access (1/2)
4.4 B-trees Disk access vs. main memory access: background B-tree concept Node structure Structural properties Insertion operation Deletion operation Running time 66 Background: disk access vs. main memory
More informationChapter 11: Indexing and Hashing
Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL
More informationProblem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees: Example. B-trees. primary key indexing
15-82 Advanced Topics in Database Systems Performance Problem Given a large collection of records, Indexing with B-trees find similar/interesting things, i.e., allow fast, approximate queries 2 Indexing
More informationStorage hierarchy. Textbook: chapters 11, 12, and 13
Storage hierarchy Cache Main memory Disk Tape Very fast Fast Slower Slow Very small Small Bigger Very big (KB) (MB) (GB) (TB) Built-in Expensive Cheap Dirt cheap Disks: data is stored on concentric circular
More informationCSE 530A. B+ Trees. Washington University Fall 2013
CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key
More informationThe B-Tree. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland
Yufei Tao ITEE University of Queensland Before ascending into d-dimensional space R d with d > 1, this lecture will focus on one-dimensional space, i.e., d = 1. We will review the B-tree, which is a fundamental
More informationLecture Notes: External Interval Tree. 1 External Interval Tree The Static Version
Lecture Notes: External Interval Tree Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong taoyf@cse.cuhk.edu.hk This lecture discusses the stabbing problem. Let I be
More informationCMPUT 391 Database Management Systems. Spatial Data Management. University of Alberta 1. Dr. Jörg Sander, 2006 CMPUT 391 Database Management Systems
CMPUT 391 Database Management Systems Spatial Data Management University of Alberta 1 Spatial Data Management Shortcomings of Relational Databases and ORDBMS Modeling Spatial Data Spatial Queries Space-Filling
More informationAccess Methods. Basic Concepts. Index Evaluation Metrics. search key pointer. record. value. Value
Access Methods This is a modified version of Prof. Hector Garcia Molina s slides. All copy rights belong to the original author. Basic Concepts search key pointer Value record? value Search Key - set of
More informationUNIT III BALANCED SEARCH TREES AND INDEXING
UNIT III BALANCED SEARCH TREES AND INDEXING OBJECTIVE The implementation of hash tables is frequently called hashing. Hashing is a technique used for performing insertions, deletions and finds in constant
More informationChap4: Spatial Storage and Indexing
Chap4: Spatial Storage and Indexing 4.1 Storage:Disk and Files 4.2 Spatial Indexing 4.3 Trends 4.4 Summary Learning Objectives Learning Objectives (LO) LO1: Understand concept of a physical data model
More informationFile Structures and Indexing
File Structures and Indexing CPS352: Database Systems Simon Miner Gordon College Last Revised: 10/11/12 Agenda Check-in Database File Structures Indexing Database Design Tips Check-in Database File Structures
More informationInformation Retrieval. Wesley Mathew
Information Retrieval Wesley Mathew 30-11-2012 Introduction and motivation Indexing methods B-Tree and the B+ Tree R-Tree IR- Tree Location-aware top-k text query 2 An increasing amount of trajectory data
More informationDATABASE PERFORMANCE AND INDEXES. CS121: Relational Databases Fall 2017 Lecture 11
DATABASE PERFORMANCE AND INDEXES CS121: Relational Databases Fall 2017 Lecture 11 Database Performance 2 Many situations where query performance needs to be improved e.g. as data size grows, query performance
More informationSystems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15
Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2014/15 Lecture II: Indexing Part I of this course Indexing 3 Database File Organization and Indexing Remember: Database tables
More informationExtended R-Tree Indexing Structure for Ensemble Stream Data Classification
Extended R-Tree Indexing Structure for Ensemble Stream Data Classification P. Sravanthi M.Tech Student, Department of CSE KMM Institute of Technology and Sciences Tirupati, India J. S. Ananda Kumar Assistant
More informationIntersection Acceleration
Advanced Computer Graphics Intersection Acceleration Matthias Teschner Computer Science Department University of Freiburg Outline introduction bounding volume hierarchies uniform grids kd-trees octrees
More informationIndexing. Jan Chomicki University at Buffalo. Jan Chomicki () Indexing 1 / 25
Indexing Jan Chomicki University at Buffalo Jan Chomicki () Indexing 1 / 25 Storage hierarchy Cache Main memory Disk Tape Very fast Fast Slower Slow (nanosec) (10 nanosec) (millisec) (sec) Very small Small
More informationI/O-Algorithms Lars Arge Aarhus University
I/O-Algorithms Aarhus University April 10, 2008 I/O-Model Block I/O D Parameters N = # elements in problem instance B = # elements that fits in disk block M = # elements that fits in main memory M T =
More informationDatabase index structures
Database index structures From: Database System Concepts, 6th edijon Avi Silberschatz, Henry Korth, S. Sudarshan McGraw- Hill Architectures for Massive DM D&K / UPSay 2015-2016 Ioana Manolescu 1 Chapter
More informationKathleen Durant PhD Northeastern University CS Indexes
Kathleen Durant PhD Northeastern University CS 3200 Indexes Outline for the day Index definition Types of indexes B+ trees ISAM Hash index Choosing indexed fields Indexes in InnoDB 2 Indexes A typical
More informationRelational Database Support for Spatio-Temporal Data
Relational Database Support for Spatio-Temporal Data by Daniel James Mallett Technical Report TR 04-21 September 2004 DEPARTMENT OF COMPUTING SCIENCE University of Alberta Edmonton, Alberta, Canada Relational
More informationChapter 17 Indexing Structures for Files and Physical Database Design
Chapter 17 Indexing Structures for Files and Physical Database Design We assume that a file already exists with some primary organization unordered, ordered or hash. The index provides alternate ways to
More informationCMSC724: Access Methods; Indexes 1 ; GiST
CMSC724: Access Methods; Indexes 1 ; GiST Amol Deshpande University of Maryland, College Park March 14, 2011 1 Partially based on notes from Joe Hellerstein Outline 1 Access Methods 2 B+-Tree 3 Beyond
More informationTree-Structured Indexes
Introduction Tree-Structured Indexes Chapter 10 As for any index, 3 alternatives for data entries k*: Data record with key value k
More informationLecture 6: External Interval Tree (Part II) 3 Making the external interval tree dynamic. 3.1 Dynamizing an underflow structure
Lecture 6: External Interval Tree (Part II) Yufei Tao Division of Web Science and Technology Korea Advanced Institute of Science and Technology taoyf@cse.cuhk.edu.hk 3 Making the external interval tree
More informationkd-trees Idea: Each level of the tree compares against 1 dimension. Let s us have only two children at each node (instead of 2 d )
kd-trees Invented in 1970s by Jon Bentley Name originally meant 3d-trees, 4d-trees, etc where k was the # of dimensions Now, people say kd-tree of dimension d Idea: Each level of the tree compares against
More informationChapter 25: Spatial and Temporal Data and Mobility
Chapter 25: Spatial and Temporal Data and Mobility Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 25: Spatial and Temporal Data and Mobility Temporal Data Spatial
More informationAn AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time
B + -TREES MOTIVATION An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations finish within O(log N) time The theoretical conclusion
More informationPoint Cloud Filtering using Ray Casting by Eric Jensen 2012 The Basic Methodology
Point Cloud Filtering using Ray Casting by Eric Jensen 01 The Basic Methodology Ray tracing in standard graphics study is a method of following the path of a photon from the light source to the camera,
More informationCourse Content. Objectives of Lecture? CMPUT 391: Spatial Data Management Dr. Jörg Sander & Dr. Osmar R. Zaïane. University of Alberta
Database Management Systems Winter 2002 CMPUT 39: Spatial Data Management Dr. Jörg Sander & Dr. Osmar. Zaïane University of Alberta Chapter 26 of Textbook Course Content Introduction Database Design Theory
More informationClustering Billions of Images with Large Scale Nearest Neighbor Search
Clustering Billions of Images with Large Scale Nearest Neighbor Search Ting Liu, Charles Rosenberg, Henry A. Rowley IEEE Workshop on Applications of Computer Vision February 2007 Presented by Dafna Bitton
More informationTree-Structured Indexes. Chapter 10
Tree-Structured Indexes Chapter 10 1 Introduction As for any index, 3 alternatives for data entries k*: Data record with key value k 25, [n1,v1,k1,25] 25,
More informationPhysical Level of Databases: B+-Trees
Physical Level of Databases: B+-Trees Adnan YAZICI Computer Engineering Department METU (Fall 2005) 1 B + -Tree Index Files l Disadvantage of indexed-sequential files: performance degrades as file grows,
More informationHigh Dimensional Indexing by Clustering
Yufei Tao ITEE University of Queensland Recall that, our discussion so far has assumed that the dimensionality d is moderately high, such that it can be regarded as a constant. This means that d should
More informationMultidimensional Divide and Conquer 2 Spatial Joins
Multidimensional Divide and Conque Spatial Joins Yufei Tao ITEE University of Queensland Today we will continue our discussion of the divide and conquer method in computational geometry. This lecture will
More informationChapter 5 Hashing. Introduction. Hashing. Hashing Functions. hashing performs basic operations, such as insertion,
Introduction Chapter 5 Hashing hashing performs basic operations, such as insertion, deletion, and finds in average time 2 Hashing a hash table is merely an of some fixed size hashing converts into locations
More informationSpatial Data Structures
15-462 Computer Graphics I Lecture 17 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) April 1, 2003 [Angel 9.10] Frank Pfenning Carnegie
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part V Lecture 13, March 10, 2014 Mohammad Hammoud Today Welcome Back from Spring Break! Today Last Session: DBMS Internals- Part IV Tree-based (i.e., B+
More informationSchool of Computing & Singapore-MIT Alliance, National University of Singapore {cuibin, lindan,
Bin Cui, Dan Lin and Kian-Lee Tan School of Computing & Singapore-MIT Alliance, National University of Singapore {cuibin, lindan, tankl}@comp.nus.edu.sg Presented by Donatas Saulys 2007-11-12 Motivation
More informationIntroduction. hashing performs basic operations, such as insertion, better than other ADTs we ve seen so far
Chapter 5 Hashing 2 Introduction hashing performs basic operations, such as insertion, deletion, and finds in average time better than other ADTs we ve seen so far 3 Hashing a hash table is merely an hashing
More informationR-Tree Based Indexing of Now-Relative Bitemporal Data
R-Tree Based Indexing of Now-Relative Bitemporal Data Rasa Bliujūtė Christian S. Jensen Simonas Šaltenis Giedrius Slivinskas Department of Computer Science Aalborg University, Denmark rasa, csj, simas,
More informationChapter 12: Indexing and Hashing
Chapter 12: Indexing and Hashing Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree
More informationData Warehousing & Data Mining
Data Warehousing & Data Mining Wolf-Tilo Balke Kinda El Maarry Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de Summary Last week: Logical Model: Cubes,
More informationMassive Data Algorithmics
Database queries Range queries 1D range queries 2D range queries salary G. Ometer born: Aug 16, 1954 salary: $3,500 A database query may ask for all employees with age between a 1 and a 2, and salary between
More informationSpatial Data Structures
CSCI 420 Computer Graphics Lecture 17 Spatial Data Structures Jernej Barbic University of Southern California Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees [Angel Ch. 8] 1 Ray Tracing Acceleration
More informationChapter 12: Indexing and Hashing (Cnt(
Chapter 12: Indexing and Hashing (Cnt( Cnt.) Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition
More informationSpatial Data Structures
15-462 Computer Graphics I Lecture 17 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) March 28, 2002 [Angel 8.9] Frank Pfenning Carnegie
More informationSpatial Data Structures
CSCI 480 Computer Graphics Lecture 7 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids BSP Trees [Ch. 0.] March 8, 0 Jernej Barbic University of Southern California http://www-bcf.usc.edu/~jbarbic/cs480-s/
More informationCMSC 754 Computational Geometry 1
CMSC 754 Computational Geometry 1 David M. Mount Department of Computer Science University of Maryland Fall 2005 1 Copyright, David M. Mount, 2005, Dept. of Computer Science, University of Maryland, College
More informationChapter 11: Indexing and Hashing
Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree
More informationChapter 11: Indexing and Hashing
Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree
More informationRemember. 376a. Database Design. Also. B + tree reminders. Algorithms for B + trees. Remember
376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 14 B + trees, multi-key indices, partitioned hashing and grid files B and B + -trees are used one implementation
More informationAdvanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2005 Simonas Šaltenis E1-215b
Advanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2005 Simonas Šaltenis E1-215b simas@cs.aau.dk Range Searching in 2D Main goals of the lecture: to understand and to be able to analyze the kd-trees
More informationAdvanced Data Types and New Applications
Advanced Data Types and New Applications These slides are a modified version of the slides of the book Database System Concepts (Chapter 24), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan.
More informationTopics to Learn. Important concepts. Tree-based index. Hash-based index
CS143: Index 1 Topics to Learn Important concepts Dense index vs. sparse index Primary index vs. secondary index (= clustering index vs. non-clustering index) Tree-based vs. hash-based index Tree-based
More informationDongmin Seo, Sungho Shin, Youngmin Kim, Hanmin Jung, Sa-kwang Song
Dynamic Hilbert Curve-based B + -Tree to Manage Frequently Updated Data in Big Data Applications Dongmin Seo, Sungho Shin, Youngmin Kim, Hanmin Jung, Sa-kwang Song Dept. of Computer Intelligence Research,
More informationBalanced Trees Part One
Balanced Trees Part One Balanced Trees Balanced search trees are among the most useful and versatile data structures. Many programming languages ship with a balanced tree library. C++: std::map / std::set
More informationTree-Structured Indexes
Tree-Structured Indexes Yanlei Diao UMass Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke Access Methods v File of records: Abstraction of disk storage for query processing (1) Sequential scan;
More informationSpatial Queries. Nearest Neighbor Queries
Spatial Queries Nearest Neighbor Queries Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer efficiently point queries range queries k-nn
More informationDatabase System Concepts, 6 th Ed. Silberschatz, Korth and Sudarshan See for conditions on re-use
Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files Static
More informationIntroduction to Indexing 2. Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana
Introduction to Indexing 2 Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana Indexed Sequential Access Method We have seen that too small or too large an index (in other words too few or too
More informationRay Tracing III. Wen-Chieh (Steve) Lin National Chiao-Tung University
Ray Tracing III Wen-Chieh (Steve) Lin National Chiao-Tung University Shirley, Fundamentals of Computer Graphics, Chap 10 Doug James CG slides, I-Chen Lin s CG slides Ray-tracing Review For each pixel,
More informationTPR tree Running in Main Memory
AALBORG UNIVERSITY Department of Computer Science TPR tree Running in Main Memory Beata Nitkiewicz Michał Pańczyk August 2006 1 2 Abstract As the branch of business, where locations based services matter,
More informationPhysical Database Design: Outline
Physical Database Design: Outline File Organization Fixed size records Variable size records Mapping Records to Files Heap Sequentially Hashing Clustered Buffer Management Indexes (Trees and Hashing) Single-level
More informationThe B dual -Tree: Indexing Moving Objects by Space Filling Curves in the Dual Space
The VLDB Journal manuscript No. (will be inserted by the editor) Man Lung Yiu Yufei Tao Nikos Mamoulis The B dual -Tree: Indeing Moving Objects by Space Filling Curves in the Dual Space Received: date
More informationSomething to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:
Query Evaluation Techniques for large DB Part 1 Fact: While data base management systems are standard tools in business data processing they are slowly being introduced to all the other emerging data base
More informationPriority Queues and Binary Heaps
Yufei Tao ITEE University of Queensland In this lecture, we will learn our first tree data structure called the binary heap which serves as an implementation of the priority queue. Priority Queue A priority
More informationData Structures for Moving Objects
Data Structures for Moving Objects Pankaj K. Agarwal Department of Computer Science Duke University Geometric Data Structures S: Set of geometric objects Points, segments, polygons Ask several queries
More informationImplementation Techniques
V Implementation Techniques 34 Efficient Evaluation of the Valid-Time Natural Join 35 Efficient Differential Timeslice Computation 36 R-Tree Based Indexing of Now-Relative Bitemporal Data 37 Light-Weight
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 7 - Query execution
CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 7 - Query execution References Generalized Search Trees for Database Systems. J. M. Hellerstein, J. F. Naughton
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part IV Lecture 14, March 10, 015 Mohammad Hammoud Today Last Two Sessions: DBMS Internals- Part III Tree-based indexes: ISAM and B+ trees Data Warehousing/
More informationInformation Systems (Informationssysteme)
Information Systems (Informationssysteme) Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Summer 2018 c Jens Teubner Information Systems Summer 2018 1 Part IX B-Trees c Jens Teubner Information
More informationQuery Processing and Advanced Queries. Advanced Queries (2): R-TreeR
Query Processing and Advanced Queries Advanced Queries (2): R-TreeR Review: PAM Given a point set and a rectangular query, find the points enclosed in the query We allow insertions/deletions online Query
More information9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology
Introduction Chapter 4 Trees for large input, even linear access time may be prohibitive we need data structures that exhibit average running times closer to O(log N) binary search tree 2 Terminology recursive
More informationSpatial Data Structures
Spatial Data Structures Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees Constructive Solid Geometry (CSG) [Angel 9.10] Outline Ray tracing review what rays matter? Ray tracing speedup faster
More information