Overlap Interval Partition Join


Anton Dignös, Department of Computer Science, University of Zürich, Switzerland
Michael H. Böhlen, Department of Computer Science, University of Zürich, Switzerland
Johann Gamper, Faculty of Computer Science, Free University of Bozen-Bolzano, Italy

ABSTRACT

Each tuple in a valid-time relation includes an interval attribute T that represents the tuple's valid time. The overlap join between two valid-time relations determines all pairs of tuples with overlapping intervals. Although overlap joins are common, existing partitioning and indexing schemes are inefficient if the data includes long-lived tuples or if intervals intersect partition boundaries. We propose Overlap Interval Partitioning (OIP), a new partitioning approach for data with an interval. OIP divides the time range of a relation into k base granules and defines overlapping partitions for sequences of contiguous granules. OIP is the first partitioning method for interval data that gives a constant clustering guarantee: the difference in duration between the interval of a tuple and the interval of its partition is independent of the duration of the tuple's interval. We offer a detailed analysis of the average false hit ratio and the average number of partition accesses for queries with overlap predicates, and we prove that the average false hit ratio is independent of the number of short- and long-lived tuples. To compute the overlap join, we propose the Overlap Interval Partition Join (OIPJOIN), which uses OIP to partition the input relations on-the-fly. Only the tuples from overlapping partitions have to be joined to compute the result. We analytically derive the optimal number of granules, k, for partitioning the two input relations, from the size of the data, the cost of CPU operations, and the cost of main memory or disk IOs. Our experiments confirm the analytical results and show that the OIPJOIN outperforms state-of-the-art techniques for the overlap join.

Categories and Subject Descriptors
H.2.4 [Database Management]: Systems - Query processing

Keywords
Temporal databases; Overlap join; Interval partitioning

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. SIGMOD'14, June 22-27, 2014, Snowbird, UT, USA. Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM /4/6...$..

1. INTRODUCTION

A key operation in valid-time databases is the overlap join []: given two valid-time relations $\mathbf{r}$ and $\mathbf{s}$, find all pairs of tuples $r \in \mathbf{r}$ and $s \in \mathbf{s}$ with overlapping intervals, i.e., $r.T \cap s.T \neq \emptyset$. The overlap join gives the query optimizer an efficient option if other predicates are absent, exhibit a poor selectivity, or must be evaluated after the overlapping interval has been computed. For instance, to find employees who are employed during at least months when a project is ongoing, we first must determine the overlapping interval between an employee and a project, and then check that the duration of the overlapping interval is at least months. Our goal is an efficient join for interval data that offers the query optimizer a viable option when other joins do not perform well.

Partitioning techniques for interval data associate each partition with a partition interval. Each tuple is stored in the best fitting partition, i.e., the partition interval must cover the interval of the tuple, and there may not exist a smaller partition interval that covers the interval of the tuple. As an example, consider a partition p with partition interval [-, -4] and a tuple s with interval [-, -].
Tuple s can be stored in partition p since - - and - -4, and it is indeed stored in p if and only if there is no other partition with a smaller partition interval that covers the interval of s. Since a partition interval is usually larger than the intervals of the tuples in this partition, we get false hits when searching in a partition for tuples that overlap a query interval (a false hit is a tuple that is fetched with a partition but does not contribute to the result). False hits increase the number of IOs, since more data must be fetched, and the number of CPU operations, since false hits must be detected and discarded. In order to reduce the number of false hits, it is possible to create more partitions. Many partitions, however, increase the number of IOs since we get more partially filled blocks. They also increase the number of CPU operations for searching and navigating in the access structure.

This paper proposes the OIPJOIN, together with Overlap Interval Partitioning (OIP), to efficiently compute the overlap join. OIP partitions the time range of a relation uniformly at a granularity that is given by k temporally disjoint granules of duration d. We create partitions for all sequences of adjacent granules. This approach gives a constant clustering guarantee, i.e., the difference in duration between a tuple and its partition is less than 2d, independent of the duration of the tuple. The access structure of OIP, termed lazy partition list, omits empty partitions without sacrificing performance or functionality. The OIPJOIN is self-adjusting, i.e., it automatically determines the optimal number of granules, k, that minimizes the overhead costs of the OIPJOIN.

Example 1. Figure 1 illustrates OIP with k = 4 granules for two relations r and s. The time range of r is [-, -], and the granules have a duration of $\lceil 7/4 \rceil = 2$ months. This is the granularity at which relation r is partitioned. The partitions span 2, 4, 6, or 8 months. Similarly, relation s is partitioned over its time range [-, -]. The granules have a duration of 3 months, and the partitions span 3, 6, 9, or 12 months. For the OIPJOIN, $\mathbf{r} \bowtie_{r.T \cap s.T \neq \emptyset} \mathbf{s}$, we process for each partition in r the overlapping (relevant) partitions in s. For instance, for the partition that contains r and r, three partitions in s are processed, yielding three false hits, namely {s 6} for r and {s, s } for r. For the partition that contains r, two partitions in s are processed, and there are no false hits. Overall, five partitions of s are accessed with three false hits and eight result tuples.

[Figure 1: Sample Relations and OIP.]

False hits and partition accesses incur overhead costs for the OIPJOIN. The number of false hits and the number of partition accesses are inversely related. Increasing the number of granules k (i.e., shorter granules and more partitions) increases the number of partition accesses, but decreases the number of false hits, and vice versa. We analytically derive the k that minimizes the overhead costs by adapting to the size of the two relations, the cost of CPU operations, and the cost of IOs. Instead of assuming a dominating cost factor, we propose a cost model that accounts for CPU and IO costs. Note that IO costs can be memory IOs or disk IOs. A main memory IO is faster than a disk IO, but slower than a CPU operation []. Since data are transferred in chunks from the memory to the processor, it is favorable to store tuples in contiguous main memory blocks.

Summarizing, our technical contributions are as follows:

• We introduce OIP as partitioning strategy for the OIPJOIN. OIP offers a constant clustering guarantee, which ensures that the join does not deteriorate. The difference in duration between a tuple and its partition is less than two granules.

• We provide a detailed analysis of the average false hit ratio (AFR) and the average number of partition accesses (APA) for OIP. We prove that AFR for uniformly distributed query intervals is smaller than and independent of the number of short- and long-lived tuples.

• The OIPJOIN is self-adjusting, i.e., it automatically determines the optimal number of granules. We develop a cost function for the OIPJOIN and minimize this cost function to get the optimal number of granules k to partition the relations. k minimizes the overhead costs due to false hits and partition accesses for IO costs c_io and CPU costs c_cpu.

• We describe an implementation of the OIPJOIN based on OIP and compare it with self-adjusting overlap joins based on quadtree, loose quadtree, segment tree, and relational interval tree. The experiments confirm that the OIPJOIN outperforms these approaches.

The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 provides preliminaries. Section 4 describes overlap interval partitioning (OIP) and its implementation. Section 5 analytically investigates the average false hit ratio (AFR) and the average number of partition accesses (APA) of OIP. Section 6 describes the OIPJOIN and derives the optimal value for k. Section 7 reports the results of our empirical evaluation.

2. RELATED WORK

We describe, in turn, related work on (a) self-adjusting approaches that, as the OIPJOIN, adapt to the data and do not require user-specified parameters; (b) parameter-guided approaches that can/must be tuned with application-specific parameters; and (c) disk-based approaches that introduce some of the key concepts used in later works.

Self-Adjusting Approaches. The quadtree [, 8] recursively subdivides the space into cells and places objects in the smallest enclosing cell. Since split boundaries are propagated down the tree and objects are not allowed to overlap boundaries, small objects that overlap split boundaries end up high in the tree. Therefore the quadtree does not have a clustering guarantee. For instance, time range [, ] is recursively split into [, 6] and [7, ] and so on, and a tuple with interval [6, 7] is placed in the root.
This yields many false hits for overlap predicates since all tree nodes that overlap the query interval need to be scanned. The quadtree relies on a hierarchical tree structure, and in order to navigate to nodes at lower levels, all parent nodes must be stored, even if they are empty. To avoid many partially filled blocks, density based splitting is used, which propagates tuples down the tree only when blocks are full. This, however, increases the number of false hits.

(Footnote: In order to manage intervals with 2D access structures we omit the second dimension, which reduces 2D points and 2D rectangles to, respectively, 1D points and intervals.)

The loose quadtree [8, 9] addresses the limitations of the quadtree for small objects. It permits at each level partially overlapping cells. The amount of overlapping is determined by a user-specified cell expansion factor, p >, where p = is widely accepted as the best value [, 8]. An expanded cell has width ( + p) w, where w is the width of a quadtree cell. For instance, time range [, ] is recursively split into [, 4] and [9, ] and so on. A tuple with interval [6, 7] is placed in either [4, 7] or [6, 9], which are the expanded quadtree cells for [, 6] and [7, 8]. The join performance deteriorates for long-lived tuples since the time ranges grow with a factor of two, i.e., the number of partitions for long-lived tuples is much lower than for small tuples. The loose quadtree provides a clustering guarantee that is not constant. The guarantee depends on the duration of a tuple and is weaker for longer tuples. For instance, for p = and a relation that spans days, a tuple of duration 8 days can be in a partition that spans days, yielding a difference of 7 days between the tuple and the associated partition. A tuple that spans 8 days can even be in a partition of days, which is a difference of 78 days.

The relational interval tree [4] implements Edelsbrunner's interval tree on top of a relational DBMS. The approach uses two B+-tree indices to index intervals according to a key and start point and end point. A query interval is first transformed into a key point list and a key range list, which in a second step are joined with the B+-tree indices. For instance, given an indexed time range of [, 64] and a query interval [, 7], the key point list is {, 6, 8} and the key range list {[4, 4], [, 7]}. These lists are joined with the help of the two B+-tree indices to get the final result. Various join techniques based on the relational interval tree have been proposed [9], such as an Index-Based Loop Join and several partition based joins (Up-Down, Down-Down, Up-Up depending on tree traversal). In all these join techniques, long-lived tuples lead to many CPU operations since a large number of nodes must be joined. To get a better IO performance for block storage, the data can be clustered according to an index on either the start or the end points. Nevertheless, long-lived tuples deteriorate the performance since for long-lived tuples the clustering of the two indices varies more than for short-lived tuples.

The segment tree [4, ] is an indexing technique for intervals. It builds disjoint segments (intervals) at the leaf level, using all start and end points in a relation. Internal nodes merge all segments of their children nodes. A tuple is associated with all sub-tree roots whose segment is completely covered by the tuple's interval. The segment tree efficiently retrieves all tuples that include a given time point. In order to compute an overlap join, possibly empty parent nodes must be scanned, and duplicate tuples that are assigned to multiple nodes must be fetched (IO cost) and identified (CPU cost). This is particularly expensive for long-lived tuples. For instance, for a relation with three tuples r, r, and r with intervals [, ], [, 9], and [8, 9], respectively, we have at the leaf level (level ) the four segments [, ], [, ], [6, 7], and [8, 9]. At level, we have the segments [, ] and [6, 9], and at level (root) the segment [, 9]. Tuple r is stored twice, namely in [, ] and [6, 9], and it must be read twice for the query interval [, 6].

Parameter-Guided Approaches. In [6], a spatially partitioned temporal join is proposed, where interval data is mapped to points in a two dimensional grid. Partitions are regions in the plane. Two relations are joined by determining for each partition of the outer relation the relevant partitions of the inner relation.
Two implementations are proposed, namely to store partitions physically on disk blocks or to use spatial indices to index the regions of partitions. While existing spatial indices can be reused, long-lived tuples substantially increase the number of index nodes to scan. The number of partitions must be specified by the application.

The snapshot index [] is an access method for disk resident transaction-time databases. Intervals in transaction-time databases are clustered since database modifications occur in increasing time order. Blocks are distinguished by usefulness according to an application parameter a that indicates the number of false hits a block is allowed to generate. Long-lived tuples are artificially deleted and re-inserted using controlled splits. Splitting is not a general solution for valid-time databases since it changes the meaning of tuples [8, 7]. Parameter a must be chosen as a tradeoff between artificial splits and false hits.

In [7], an approach is proposed for data that is stored in main memory. It is similar to the spatial hash join [] and uses an R-Tree to group tuples into minimum bounding rectangles (MBRs). The tuples of one relation are stored in the leaf nodes, and the tuples of the other relation in the lowest internal node, where more than one child's MBR overlaps the tuple. The join is performed by joining leaf with internal nodes. The approach aims to reduce the number of CPU comparisons and requires three parameters: tree fanout, number of partitions, and cells per dimension.

Disk-Based Approaches. The size separation spatial join [] is a partitioning strategy for the overlap join of disk resident spatial data and is similar to the quadtree. Instead of using a tree structure, the levels of the tree are mapped to sorted files. The join is then performed by a synchronized scan of two sorted files that represent the partitioned relations. The approach reduces IO and provides a good filling of blocks, but, due to the recursive space division, small objects are not guaranteed to be stored at a low level.
Thus, the size separation spatial join has no clustering guarantee and may produce many false hits.

The grace partition join [] is used for valid-time natural joins of disk resident data, i.e., overlap joins with additional equality predicates. It samples the relations to determine the partitions for the tuple intervals. A tuple is stored in the last partition it overlaps. Partitions are joined from the last to the first partition. Long-lived tuples that overlap several partitions are migrated to the next partition during join processing. The approach is only efficient for few long-lived tuples, where the overhead of migration is low.

The R*-tree [, ] uses MBRs to group nearby objects and stores object IDs in the leaf nodes. The internal nodes build an index on the leaf nodes using MBRs. MBRs of both leaf and internal nodes might overlap. The tree is expensive to construct due to the propagation of MBRs. Long-lived tuples increase the MBRs and produce more false hits (page faults). For overlap joins, it is necessary to follow multiple paths in the R*-tree.

3. PRELIMINARIES

We assume a discrete time domain, $\Omega^T$, consisting of a linearly ordered set of time points. An interval T is a contiguous set of time points and is represented as a pair $[T_S, T_E]$, where $T_S$ is the inclusive start point and $T_E$ the inclusive end point. We use the following operations on time points and intervals:

• $x \in T$ if time point x is contained in interval T, i.e., $T_S \leq x \leq T_E$;
• $Q \cap T \neq \emptyset$ if Q and T intersect, i.e., there exists a time point x such that $x \in Q \wedge x \in T$;
• $T \subseteq U$ if interval T is contained in interval U, i.e., $x \in T \Rightarrow x \in U$;
• $T_E - T_S$ determines the difference in number of time points between $T_S$ and $T_E$;
• $T_S + x$ shifts time point $T_S$ by x time points to the right, i.e., $T_E - T_S = x \Leftrightarrow T_S + x = T_E$; and
• $|T| = (T_E - T_S) + 1$ is the duration (length) of interval T.

We use tuple timestamping and associate each tuple with a single interval that represents the tuple's valid time. A temporal relation schema is represented as $R = (A_1, \ldots, A_m, T)$, where $A_1, \ldots, A_m$ are attributes with domain $\Omega_i$ and T is the interval attribute over $\Omega^T \times \Omega^T$. A tuple r over schema R is a finite set that contains for every $A_i$ a value $v_i \in \Omega_i$ and for T an interval $[T_S, T_E] \in \Omega^T \times \Omega^T$. A temporal relation r over schema R is a finite set of tuples over R. A valid-time relation r spans time range $U = [U_S, U_E]$ if $U_S$ is the smallest start time point of any tuple in r and $U_E$ the largest end time point of any tuple in r. We write l for the duration of the longest tuple in a relation r, and λ for the duration of the longest tuple as a fraction of the time range, i.e., $\lambda = l/|U|$. We use indices r and s to distinguish between the outer and inner relation in joins, e.g., $n_r$ and $n_s$ are, respectively, the cardinality of the outer and inner relation in $\mathbf{r} \bowtie \mathbf{s}$.

4. OVERLAP INTERVAL PARTITIONING

In this section, we first define Overlap Interval Partitioning (OIP) and show how the relevant partitions, i.e., partitions that overlap a query interval, are calculated. Second, we establish the constant clustering guarantee. Third, we show how to manage physical partitions of OIP with a lazy partition list that omits unused partitions.

4.1 Definition

OIP divides a time range U into k equally sized granules of duration d, which define the base granularity of the partitioning (we discuss in Section 6 how to derive k).

Definition 1. (OIP Configuration) Let r be a temporal relation with time range $U = [U_S, U_E]$. An OIP configuration for a given k is a triple (k, d, o), where $d = \lceil |U|/k \rceil$ is the duration of each granule and $o = U_S$ is the start point of the partitioned time range.

A partition interval spans a sequence of one or more adjacent granules. A tuple is assigned to the smallest partition whose partition interval completely covers the tuple's interval.

Definition 2. (OIP Partition) Let r be a temporal relation with OIP configuration (k, d, o). An OIP partition, $p_{i,j}$, with $0 \leq i \leq j < k$, spans all granules from i to j and has the partition interval $p_{i,j}.T = [o + i \cdot d,\; o + (j + 1) \cdot d - 1]$. A tuple $r \in \mathbf{r}$ is placed in partition $p_{i,j}$ iff $\lfloor (r.T_S - o)/d \rfloor = i$ and $\lfloor (r.T_E - o)/d \rfloor = j$.

Example 2. Relation s in Figure 1 includes seven tuples and spans the time range U = [-, -]. The OIP configuration with k = 4 granules is (4, 3, -) with granule duration $d = \lceil |U|/k \rceil = 3$ months and start time point $o = U_S$. The partitions that span one granule are $p_{0,0}$, $p_{1,1}$, $p_{2,2}$, and $p_{3,3}$, each ranging over three months. The partitions p,, p,, p,, p,, p,, and p, span more than one granule each, e.g., partition p, spans the range [-, -6]. Tuple s is placed in partition p, according to $\lfloor (s.T_S - o)/d \rfloor$ and $\lfloor (s.T_E - o)/d \rfloor$. Tuple $s_6$ is placed in partition $p_{1,3}$ since $\lfloor (s_6.T_S - o)/d \rfloor = \lfloor 5/3 \rfloor = 1$ and $\lfloor (s_6.T_E - o)/d \rfloor = \lfloor 9/3 \rfloor = 3$. Five out of ten partitions are empty, namely p,, p,, p,, p,, and p,.

[Figure 2: OIP with Configuration (4, 3, -) for s.]

LEMMA 1 (OIP OVERLAP QUERY). Let (k, d, o) be an OIP configuration and $Q = [Q_S, Q_E]$ be a query interval. The candidate tuples that possibly overlap Q are in partitions $p_{i,j}$ for which $i \leq e = \lfloor (Q_E - o)/d \rfloor$ and $j \geq s = \lfloor (Q_S - o)/d \rfloor$. We term these partitions the relevant partitions; s is the start and e the end index of Q.

PROOF. (By contradiction) Assume a partition $p_{i,j}$ with $j < s = \lfloor (Q_S - o)/d \rfloor$, which contains a tuple r that overlaps Q. According to Definition 2, tuple r is placed in a partition $p_{i,j}$ with $i = \lfloor (r.T_S - o)/d \rfloor$ and $j = \lfloor (r.T_E - o)/d \rfloor$. Since j < s, we get $r.T_E < Q_S$, i.e., r.T and Q do not overlap, which contradicts our assumption. Similarly, a tuple in $p_{i,j}$ with $i > e = \lfloor (Q_E - o)/d \rfloor$ cannot overlap Q since $r.T_S > Q_E$.

Example 3. Consider Figure 2 with the OIP configuration (4, 3, -) and the query interval Q = [-, -]. For the relevant partitions, $p_{i,j}$, the following constraints hold: $i \leq e = \lfloor (Q_E - o)/d \rfloor = \lfloor 4/3 \rfloor = 1$ and $j \geq s = \lfloor (Q_S - o)/d \rfloor = 1$. This is satisfied for partitions $p_{0,1}$, $p_{0,2}$, $p_{0,3}$, $p_{1,1}$, $p_{1,2}$, and $p_{1,3}$ (gray boxes), which contain all candidate tuples for the query interval Q.

Next, we establish the constant clustering guarantee of OIP: the difference in duration between a tuple and its partition is less than two granules, i.e., constant and independent of the duration of a tuple. Note that the number of time points per granule or the duration of a granule have no impact on our solution. The constant clustering guarantee ensures (a) an excellent partitioning since the difference in duration between a tuple and its partitions is less than two granules, (b) an average false hit ratio that is independent of the intervals of tuples (cf. Section 5.1), and (c) it allows to take advantage of empty partitions by increasing k (cf. Section 6).

LEMMA 2 (CONSTANT CLUSTERING GUARANTEE). Let (k, d, o) be an OIP configuration for relation r. The difference in duration between a tuple $r \in \mathbf{r}$ and its partition p is less than 2d: $\forall r \in \mathbf{r}\,(r \in p \Rightarrow |p.T| - |r.T| < 2d)$.

PROOF. We show that the difference in duration of the smallest tuple in a partition $p_{i,j}$ and $p_{i,j}$ is less than 2d. A tuple r is placed in partition $p_{i,j}$ iff $i = \lfloor (r.T_S - o)/d \rfloor$ and $j = \lfloor (r.T_E - o)/d \rfloor$ (cf. Definition 2). Thus, we have $i \cdot d \leq r.T_S - o < (i + 1) \cdot d$ and $j \cdot d \leq r.T_E - o < (j + 1) \cdot d$. The smallest tuple in $p_{i,j}$ has duration $|[(i + 1) \cdot d - 1,\; j \cdot d]| = j \cdot d - (i + 1) \cdot d + 2$, and $p_{i,j}$ has duration $|[i \cdot d,\; (j + 1) \cdot d - 1]| = (j + 1) \cdot d - i \cdot d$. Hence, the difference in duration between the smallest tuple r in $p_{i,j}$ and $p_{i,j}$ is $2d - 2 < 2d$.

For instance, for a relation that spans days and with k =, we have d = days. A tuple that spans 8 days can be in a partition that spans 9 days, which is a difference of days. A tuple that spans 8 days can be in a partition that spans days, which is a difference of 8 days. Note that the difference is always less than 2d = days.

4.2 Lazy Partitioning

We represent the OIP access structure as a triangle in a distance regular square grid graph [6], as illustrated in Figure 3(a) for the OIP in Figure 2. We call this a triangular grid graph with grid points (i, j) for $0 \leq i \leq j < k$. To find all relevant partitions for a query interval with start index s and end index e, we determine all partitions $p_{i,j}$ for which $j \geq s$ and $i \leq e$ (cf. Lemma 1). We start at the top-left corner of the grid (i.e., i = 0, j = k - 1 = 3) and move along the path with decreasing j as long as $j \geq s$. At each node $p_{0,j}$, we follow the path with increasing i as long as $i \leq \min(j, e)$. In Figure 3(a), the relevant partitions (gray) for query interval Q are on the paths $p_{0,3} \rightarrow p_{1,3}$, $p_{0,2} \rightarrow p_{1,2}$, and $p_{0,1} \rightarrow p_{1,1}$.

[Figure 3: Management of OIP Partitions. (a) Triangular Grid Graph. (b) Lazy Partition List.]

The number of possible OIP partitions corresponds to the number of nodes in a triangular grid graph and grows quadratically with the number of granules.
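The placement rule, the partition interval, and the grid traversal described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: time points are assumed to be plain integers (for example, months counted from 0), the function names are hypothetical, and the values k = 4, d = 3, o = 0 in the usage note are chosen only to mirror the granule layout of the running example.

```python
from typing import Iterator, Tuple


def partition_index(t_start: int, t_end: int, d: int, o: int) -> Tuple[int, int]:
    """Indices (i, j) of the smallest partition that covers [t_start, t_end]."""
    return (t_start - o) // d, (t_end - o) // d


def partition_interval(i: int, j: int, d: int, o: int) -> Tuple[int, int]:
    """Inclusive partition interval [o + i*d, o + (j+1)*d - 1] of p_{i,j}."""
    return o + i * d, o + (j + 1) * d - 1


def relevant_partitions(q_start: int, q_end: int, k: int, d: int, o: int
                        ) -> Iterator[Tuple[int, int]]:
    """Enumerate all p_{i,j} with j >= s and i <= e by walking the triangular grid.

    The walk starts at the top-left corner (i = 0, j = k - 1), moves down
    while j >= s, and on each row moves right while i <= min(j, e).
    """
    s = (q_start - o) // d  # start index of the query interval
    e = (q_end - o) // d    # end index of the query interval
    j = k - 1
    while j >= 0 and j >= s:
        for i in range(0, min(j, e) + 1):
            yield (i, j)
        j -= 1
```

With k = 4, d = 3, o = 0, a tuple with interval [5, 9] falls into p_{1,3}, whose partition interval is [3, 11]; a query interval that starts and ends in granule 1 visits six of the ten grid nodes, and a query that covers the whole time range visits all k(k + 1)/2 of them.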

PROPOSITION 1 (NUMBER OF PARTITIONS). For k granules, the number of possible partitions is $p = \sum_{i=0}^{k-1} (k - i) = \frac{k^2 + k}{2}$.

The lazy partition list is a compressed triangular grid graph that includes only non-empty partitions. Figure 3(b) shows the lazy partition list of our example. It includes only the non-empty partitions and the directed edges that are needed for navigation. The main list starts at the upper-left corner and connects nodes with decreasing j from top to bottom. Each node of the main list starts a branch list that connects (from left to right) nodes with the same j-value and increasing i-value. The lazy partition list has the following advantages: (a) the number of CPU operations is reduced since empty partitions do not appear in the access structure; (b) k can be increased if not all partitions are used (cf. Section 5.2 and Section 6); and (c) the number of partitions is upper bounded by the cardinality of the partitioned relation, independent of the value of k.

LEMMA 3 (NUMBER OF PARTITIONS WITH LAZY PARTITIONING). Assume an OIP configuration (k, d, o) for a relation r with n tuples whose valid time duration is at most λ. The number of partitions, p, of OIP for r is upper bounded by $\min\left(\lambda k^2 + k - \frac{\lambda^2 k^2 + \lambda k}{2},\; n\right)$.

PROOF. Tuples in r span at most $\lambda k$ granules. From Lemma 2 we have that the difference in duration of a partition and its tuples is less than two granules. Thus, the longest used partition spans at most $\lambda k + 1$ granules, and we have $p \leq \sum_{x=0}^{\lambda k} (k - x) = \lambda k^2 + k - \frac{\lambda^2 k^2 + \lambda k}{2}$. Since empty partitions are not created, p cannot exceed n.

Example 4. Assume a relation with tuple durations up to % of the relation's time range, i.e., λ =.. With k =, at most . + .. = 7, 8 partitions out of + =, possible partitions are used, i.e., 7%, while 6% are empty.

4.3 Implementation of OIP

Our implementation of OIP uses a lazy partition list, L, to keep track of indices and storage blocks of partitions. Figure 4 shows the lazy partition list for our running example.

[Figure 4: OIP Lazy Partition List L with Block Pointers.]
Algorithm 1 creates the lazy partition list L for an input relation r with n tuples and an OIP configuration (k, d, o). After initializing an empty partition list, the relation is sorted according to the tuples' partition $p_{i,j}$ with j in ascending and i in descending order. The indices i and j of the partition to which a tuple r is assigned are computed according to Definition 2. The sorting ensures that tuples fall either into the first node of the list ($c = nil \vee c.j < j$) or into a new node that is prepended to L.head ($c.i > i$). The sorting reduces the complexity of insertions from O() to O() and gives a total runtime complexity for constructing L of O(n log n), which is independent of k and ensures that storage blocks are allocated sequentially.

Algorithm 1: OIPCREATE(r, (k, d, o))
Input: Relation r and OIP configuration (k, d, o)
Output: Partition list L
  L := empty partition list;
  Sort r by ⌊(r.T_E − o)/d⌋ ASC and ⌊(r.T_S − o)/d⌋ DESC;
  foreach r ∈ r do
    i := ⌊(r.T_S − o)/d⌋;
    j := ⌊(r.T_E − o)/d⌋;
    c := L.head;
    if c = nil ∨ c.j < j then
      L.head := new node with partition p_{i,j};
      L.head.down := c;
    else if c.i > i then
      L.head := new node with partition p_{i,j};
      L.head.down := c.down;
      L.head.right := c;
    Add r to L.head;
  return L;

Example 5. Consider relation s in Figure 1. The call of OIPCREATE(s, (4, 3, -)) constructs L as follows:
1. r = sort(s) = s, s, s, s, s 7, s 4, s 6, L =
2. r = s, i = - - =, j = - - =, L = (,, {s })
3. r = s, i = - - =, j = - - =, L = (,, {s, s })
4. r = s, i = - - =, j = - - =, L = (,, {s }), (,, {s, s })
5. r = s, i = - - =, j = - - =, L = (,, {s }), (,, {s }), (,, {s, s })
The algorithm terminates and returns the lazy partition list L = (,, {s 4, s 6}), (,, {s 7}), (,, {s }), (,, {s }), (,, {s, s }), which is illustrated in Figure 4.

5. ANALYTICAL RESULTS OF OIP

In this section, we analyze the quality of OIP using two different measures. The average false hit ratio, AFR, measures the precision of a partitioning in terms of the average number of tuples that are retrieved for a query interval but do not contribute to the result. The average number of partition accesses, APA, quantifies the number of partitions that are fetched for a query.

5.1 Average False Hit Ratio

Definition 3. (False Hits) Let P be a partitioning of a valid-time relation r and Q be a query interval. The false hits, F(P, Q), are the tuples that are retrieved when fetching the relevant partitions, but are not part of the query result, i.e., $F(P, Q) = \{ r \mid \exists p \in P\,( r \in p \wedge p.T \cap Q \neq \emptyset \wedge \neg(r.T \cap Q \neq \emptyset) ) \}$.

Consider Figure 2. For the query interval Q = [-, -], only the relevant partitions are fetched (i.e., p,, p,, p,). The false hits are F(OIP, Q) = {s 6} since partition p, is fetched, but s 6 does not overlap Q. The result tuples are s, s 4, and s.

(Footnote: We use nested lists as an abstract notation. For instance, L = ⟨a, ⟨b, c⟩⟩ has nodes a, b, c; L.head = a; a.down = b; b.right = c; a.right, b.down, c.down, and c.right are nil.)
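As an illustration of how the lazy partition list of Algorithm 1 can be built, the following Python sketch follows the same sort-then-prepend logic. It is a sketch under stated assumptions rather than the authors' implementation: tuples are reduced to their inclusive integer intervals, the `Node` class and the helper `nodes` are hypothetical names, and storage blocks are modelled as plain Python lists.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple

Interval = Tuple[int, int]  # inclusive (start, end); integer time points assumed


@dataclass
class Node:
    i: int
    j: int
    tuples: List[Interval] = field(default_factory=list)
    down: Optional["Node"] = None   # next main-list node (smaller j)
    right: Optional["Node"] = None  # next branch-list node (same j, larger i)


def oip_create(relation: Iterable[Interval], d: int, o: int) -> Optional[Node]:
    """Build the lazy partition list and return its head (None if empty)."""
    def idx(t: Interval) -> Tuple[int, int]:
        return (t[0] - o) // d, (t[1] - o) // d

    head: Optional[Node] = None
    # Sort by j ascending and i descending, as in Algorithm 1.
    for t in sorted(relation, key=lambda t: (idx(t)[1], -idx(t)[0])):
        i, j = idx(t)
        c = head
        if c is None or c.j < j:
            head = Node(i, j, down=c)                   # new main-list node
        elif c.i > i:
            head = Node(i, j, down=c.down, right=c)     # prepend to branch list
        head.tuples.append(t)
    return head


def nodes(head: Optional[Node]) -> List[Tuple[int, int, List[Interval]]]:
    """Flatten the list: main list top to bottom, each branch left to right."""
    out = []
    c = head
    while c is not None:
        x = c
        while x is not None:
            out.append((x.i, x.j, x.tuples))
            x = x.right
        c = c.down
    return out
```

Because the input is sorted first, every tuple lands either in the current head or in a node that is prepended to it, so no search through the list is needed during construction. The relation used to exercise this sketch is made up for illustration and is not the relation s of the running example.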

6 We procee by efining the sum false hit ratio as the percentage of false hits over all possible point queries, i.e., the false hits prouce by all queries over query intervals of uration ivie by the total number of tuples. Definition 4. (Sum False Hit Ratio) Let P be a partitioning of a vali-time relation r with time range U. The sum false hit ratio, SFR(P ), for all query intervals [x, x] that overlap U is efine as SFR(P ) = x U F(P, [x, x]). r For the OIP shown in Figure, we have SFR(OIP) = F(OIP,[-,-]) + + F(OIP,[-,-]) = 4 =. Thus, 7 7 for all query intervals of uration, two times the number of tuples in s are retrieve as false hits. LEMMA 4. The sum false hit ratio of a partitioning P over a time range U is inepenent of the query interval uration q, i.e., it is the same for all query intervals [x, x + q ] that overlap U for any value of q: SFR(P ) = x U F(P, [x, x]) r = Q:Q U Q =q r F(P, Q). PROOF. Consier a time point x U an a partition p P. Query interval [x, x] of uration can prouce false hits in p if there is a non-overlapping part before x (i.e., p.t S < x) an/or a non-overlapping part after x (i.e., x < p.t E). All tuples that start an en in one of the two non-overlapping parts are false hits. We consier now query intervals of uration q >. The query interval [x, x+q ] prouces the same non-overlapping part before x, an the query interval [x q+, x] the same non-overlapping part after x, yieling together exactly the same false hits for partition p as the point query with interval [x, x]. Since for each x there exists exactly one query interval of uration q that starts at x an one that ens at x, it is straightforwar to generalize this result to the sum over all partitions an time points in U. This proves the lemma. Next, we efine the average false hit ratio for an arbitrary query interval of uration q. Definition. (Average False Hit Ratio) Let P be a partitioning for a relation r with time range U. 
The average false hit ratio, AFR(P ), for a query interval uration q is efine as AFR(P ) = SFR(P ) U + q. PROPOSITION. The AFR(P ) ecreases monotonically with increasing query interval uration q. Example 6. Consier Figure with the time range U = [-, -] an the sum false hit ratio SFR(OIP) =. The number of query intervals of uration q = is U + q =, yieling an average false hit ratio AFR(OIP) = (= 6.7%), i.e., on average 6.7% of the fetche tuples are false hits. The number of query intervals of uration q = is U +q = 6, yieling an average false hit ratio AFR(OIP) = (=.%). 6 For the analysis of the average false hit ratio of OIP in the following Theorem, we use uration complete relations. A uration complete relation, r l U, contains exactly one tuple for each interval up to a uration l in the time range U, i.e., T U( T l r r l U (r.t = T )), r r l U ( r.t l), r, r r l U (r r r.t r.t ). For instance, the uration complete relation r [,] contains a total of seven tuples with intervals [, ], [, ], [, ], [, ], [, ], [, ], [, ]. Duration complete relations ensure that the AFR is calculate over tuples of all possible positions an urations in U. THEOREM. Assume an OIP with configuration (,, o). The average false hit ratio for uration complete relations is inepenent of the uration of the tuples an upper boune by AFR(OIP) <. The proof of Theorem is provie in the Appenix.. Average Number of Partition Accesses The average number of partition accesses, APA, quantifies how many partitions are accesse on average to retrieve all tuples that overlap a query interval, i.e., how many relevant partitions exist. LEMMA (APA UPPER BOUND) Assume an OIP with configuration (,, o), where all partitions are use. The average number of partition accesses for query intervals with uniformly istribute start an en time points is: APA(OIP) + +. PROOF. 
For query intervals with uniformly distributed start and end time points, every query interval starting in granule s and ending in granule e has the same probability. Thus, we need to compute the number of partitions that a query interval accesses when starting in s and ending in e, which is the total number of partitions minus all partitions ending before s and all partitions starting after e, as follows:

\#acc(s, e) = \frac{k^2 + k}{2} - \sum_{i=0}^{s-1} (s - i) - \sum_{i=1}^{k-e-1} (k - e - i) = k + k e - \frac{s^2 + s}{2} - \frac{e^2 + e}{2}.

We sum the number of partition accesses, #acc(s, e), for all 0 ≤ s ≤ e < k and divide the sum by the cardinality of 0 ≤ s ≤ e < k to get:

APA(OIP) = \frac{\sum_{e=0}^{k-1} \sum_{s=0}^{e} \#acc(s, e)}{\sum_{e=0}^{k-1} \sum_{s=0}^{e} 1} = \frac{k^2 + k + 1}{3}.

Since empty partitions are not present in the OIP access structure, APA is reduced if not all partitions are used. We use a tightening factor to quantify the reduction of partitions with lazy partitioning. The tightening factor, τ, with 0 < τ ≤ 1, is calculated as the ratio between the number of used partitions with lazy partitioning (Lemma ) and the total number of partitions (Proposition ). For instance, the tightening factor in Example 4 is τ = 89/ =.7.

THEOREM (APA). Assume an OIP configuration (k, d, o) for a relation with n tuples and a tightening factor τ with 0 < τ ≤ 1. The average number of partition accesses is:

APA(OIP) \le \min\left(\tau \cdot \frac{k^2 + k + 1}{3},\; n\right).

PROOF. The proof follows from Lemma and Lemma . The tightening factor τ is the ratio of the number of used and the number of possible partitions. The inequality holds since the multiplication with τ conservatively assumes that the longest partitions, which produce more partition accesses than shorter partitions, are omitted.
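The counting argument behind the average number of partition accesses can be checked by brute force for small inputs. The following is a minimal sketch, not the authors' implementation. It assumes that granules are indexed 0 to k − 1, that a partition (i, j) covers granules i through j, and that all partitions are used (no lazy partitioning). The function names are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def partitions(k):
    """All partitions (i, j) over k granules, assuming 0 <= i <= j < k."""
    return [(i, j) for i in range(k) for j in range(i, k)]


def accessed(k, s, e):
    """Brute force: partitions that overlap a query starting in granule s
    and ending in granule e (i.e., i <= e and j >= s)."""
    return sum(1 for (i, j) in partitions(k) if i <= e and j >= s)


def accessed_closed_form(k, s, e):
    """Total partitions minus those ending before s and those starting after e."""
    total = k * (k + 1) // 2
    ending_before_s = s * (s + 1) // 2
    starting_after_e = (k - e - 1) * (k - e) // 2
    return total - ending_before_s - starting_after_e


def average_partition_accesses(k):
    """Average of accessed(k, s, e) over all granule pairs 0 <= s <= e < k."""
    pairs = [(s, e) for e in range(k) for s in range(e + 1)]
    return Fraction(sum(accessed(k, s, e) for s, e in pairs), len(pairs))


if __name__ == "__main__":
    print(average_partition_accesses(4))  # prints 7
```

For k = 4 the enumeration gives an average of 7 accessed partitions, which equals (k² + k + 1)/3. Exact rational arithmetic is used so that the comparison with a closed form does not suffer from floating-point rounding.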

6. THE OVERLAP JOIN OIPJOIN

This section presents the OIPJOIN algorithm, derives the optimal number of granules k, illustrates its calculation by an example, and analyzes the runtime complexity of the OIPJOIN.

6.1 The OIPJOIN Algorithm

The following algorithm computes the OIPJOIN for relations r and s. First, k (cf. Section 6.2) and the OIP configurations of the two relations are determined. The partitions are created by calling OIPCREATE, and the result relation z is initialized. Then, the algorithm iterates over each outer partition in L_r and performs an overlap query (cf. Lemma ) with the query interval [Q_S, Q_E] of the outer partition (cf. Definition ). If [Q_S, Q_E] does not overlap the time range of the inner relation s, the outer partition is skipped. Otherwise, the indices s and e of the query interval [Q_S, Q_E] are determined. The relevant partitions of the inner relation that overlap with the query interval are fetched, and the tuples are joined with the tuples in the outer partition. The result tuples are collected in z.

Algorithm: OIPJOIN(r, s)
Input: Relation r and relation s
Output: z = {r ∘ s | r ∈ r ∧ s ∈ s ∧ r.T ∩ s.T ≠ ∅}
    Determine k for r and s for given IO and CPU costs;
    Determine configurations (k, d_r, o_r) for r and (k, d_s, o_s) for s;
    L_r ← OIPCREATE(r, (k, d_r, o_r));
    L_s ← OIPCREATE(s, (k, d_s, o_s));
    z := ∅;
    foreach node c_r in L_r do
        Q_S := o_r + c_r.i · d_r;
        Q_E := o_r + (c_r.j + 1) · d_r − 1;
        if Q_E ≥ o_s ∧ Q_S < o_s + k · d_s then
            s := ⌊(Q_S − o_s) / d_s⌋;
            e := ⌊(Q_E − o_s) / d_s⌋;
            c_s := L_s.head;
            while c_s ≠ nil ∧ c_s.j ≥ s do
                x := c_s;
                while x ≠ nil ∧ x.i ≤ e do
                    z := z ∪ {joined tuples from c_r and x};
                    x := x.right;
                c_s := c_s.down;
    return z;

Example 7. Consider Figure with k = 4. We get the OIP configurations (4,, -) for r and (4,, -) for s. OIPCREATE creates the lazy partition lists L_r = (,, {r }), (,, {r, r }) and L_s = (,, {s_4, s_6}), (,, {s_7}), (,, {s }), (,, {s }), (,, {s, s }).
The first outer partition is processed as follows:

    c_r = (,, {r })
    Q_S = - + = -7
    Q_E = - + ( + ) = -
    s = -7 - = 6 =
    e = - - = =
    c_s = L_s.head = (,, {s_4, s_6})
    x = c_s = (,, {s_4, s_6})
    z = {r s_4, r s_6}
    x = x.right = (,, {s_7})
    z = {r s_4, r s_6, r s_7}
    c_s = c_s.down = (,, {s })

The second (and last) outer partition, c_r = (,, {r, r }), is processed in a similar way, yielding the final result z = {r s_4, r s_6, r s_7, r s, r s_4, r s, r s_4, r s_6}.

6.2 Number of Granules

Choosing k (i.e., the number of granules) is the most important decision for the OIPJOIN. In order to derive k for the outer and the inner relation, we proceed in two steps. First, we provide a cost function for the OIPJOIN, and second, we minimize the cost function with respect to k.

Cost Function. The cost function considers the CPU cost of a comparison operation (c_cpu) and the cost of a block IO (c_io). A block IO can refer to either main memory or disk. The cost function models the overhead due to partition accesses and false hits. Recall that the cost for creating the partitioning is, due to sorting, independent of k and thus not included in the cost function.

Let k_r and k_s be the number of granules for the outer and inner relation, respectively. For the join we fetch, for each of the O(k_r^2) outer partitions, O(k_s^2) inner partitions, i.e., O(k_r^2 k_s^2). Furthermore, for each outer and inner tuple we have, respectively, O(n_s / k_s) and O(n_r / k_r) false hits, i.e., O(n_s n_r / k_r + n_r n_s / k_s). Both O(k_r^2 k_s^2) and O(n_s n_r / k_r + n_r n_s / k_s) reach their minimum when k_r = k_s, i.e., outer and inner relation are partitioned using the same number of granules. Thus, we have a cost function with k = k_r = k_s:

cost(k) = p_r \cdot APA \cdot (c\_io + 2\, c\_cpu)
        + p_r \cdot n_s \cdot AFR \cdot \left(\frac{c\_io}{b} + 2\, \frac{n_r}{p_r}\, c\_cpu\right)
        = x \cdot APA + y \cdot AFR

The first line is the cost for partition accesses. For each of the p_r outer partitions, the algorithm accesses APA inner partitions.
Each partition access costs one extra c_io, since an inner partition can have at most one partially filled block (remember that we only measure the overhead), and two c_cpu (comparison of i and j) for checking if this partition in the lazy partition list is relevant. The second line is the cost for false hits. For each of the p_r outer partitions, n_s · AFR false hits in the relevant inner partitions produce extra block transfers, where b is the average number of tuples per block of the inner relation. The cost for identifying false hits is two c_cpu (comparison of T_S and T_E) for each false hit in the outer partition and each false hit in the relevant inner partitions.

Determining k. We derive k by minimizing the cost function using the partial derivative of x · APA + y · AFR. The terms quantify, respectively, the increase of the costs due to partition accesses and the decrease of the costs due to false hits. k can be increased as long as the costs for AFR decrease more than the costs for APA increase. The optimal k is the point where the cost for accessing partitions starts growing faster than the cost for false hits decreases, which is the minimum of the cost function. Since the complexity of the cost function prevents an analytical solution of the minimization problem, we proceed in two steps to derive k. First, we keep p_r and τ constant and derive k as follows:

1. Compute the partial derivative of x · APA + y · AFR. We use APA and AFR from Theorems and to get x τ ++ + y. The partial derivative with respect to k is (x τ ++ + y ) = x τ ( + ) y.
2. Solve x τ ( + ) y = to get the k that minimizes the cost function: ()

(6 y x τ + 8 y (8 y x τ)) (x τ) = 6 x τ + x τ + ( (6 y x τ + 8 y (8 y x τ)) (x τ) y 6 x τ.

[Figure: Convergence of k. Panels: (a) n_r = M and n_s = M; (b) n_r = M and n_s = G; x-axis: steps/n; y-axis: k.]

In the second step, we use an iterative process that refines p_r^n and τ_n in each step in order to determine k. More specifically, we calculate the number p_r^n of outer partitions and the tightening factor τ_n from the previously calculated k_n, starting with k = . After substituting x and y (cf. Equation ()) in the above equation for k, we obtain the recurrence:

n ( n+ = s c_io + (c_io + c_cpu) τ n b 4 nr c_cpu ) p r n

We start with k = and calculate the number of outer partitions, p_r, according to Lemma , i.e., p r n = min( n λ r n + n λr n () λr n, n r), and the tightening factor τ as the number of inner partitions (cf. Lemma ) divided by the number of possible partitions (cf. Proposition ), i.e., λs n min(n λs n + n λs n, n s) τ n =. (n + n)/

We repeatedly calculate k_{n+1} from p_r^n and τ_n until k converges to the minimum cost.

Example 8. Consider two relations r and s, each with a time range U = M. Relation r has n_r = M tuples, and the maximum duration of tuples is l_r = ,, i.e., λ_r = .. Relation s has n_s = M tuples, and the maximum duration of tuples is l_s = ,, i.e., λ_s = .. Both relations are stored in main memory in blocks of bytes. With a tuple size of bytes, b = 4 tuples fit into a block. The time of a CPU operation is . nsec (GHz), and the time to fetch a block from main memory is nsec, i.e., c_cpu = . and c_io = . Starting with n = and k = , we get the following values:

n k_n p_r^n τ_n
64, 6 7, 6. 7, 967, 9.6, 89 9, 7.9 4, 76 4, 8.6 7, 79, , 49, , 49, , 49, 6.

Thus, k converges to k = 6,, which is the number of granules for the OIPJOIN. Figure illustrates the convergence of k for relations of different size.

The iterative process to find k converges since at each step we reduce the power by . Note that due to the ceiling function and integer calculus in p_r^n and τ_n, k may not converge to a single number, but oscillate between two.
In this case the final k is the average between these two numbers.

Figure : Convergence of k.

6.3 Complexity Analysis

The complexity of the OIPJOIN is composed of three parts: O(p_r · APA) partition fetches; O(n_s · n_r · AFR) false hits; and O(n_z) for retrieving n_z result tuples. After substituting AFR and APA according to Theorem and , we get the sum O(p_r τ k^2) + O(n_s n_r / k) + O(n_z). The asymptotic k according to Equation () is k = O((n_s n_r / (p_r τ))^{1/3}).

The upper bound complexity occurs with tightening factor τ = 1 (no tightening). In this case we get a low k to keep the cost for partition accesses low. From τ = 1 we get p_r = O(k^2) (cf. Section .), i.e.,

k = O((n_s n_r / k^2)^{1/3}) = O((n_s n_r)^{1/5}).

Inserting this into the above sum gives O(n_r^{4/5} n_s^{4/5} + n_z).

The lower bound complexity occurs with tightening factor τ = O(1/k) (maximal tightening). In this case we get a high k, since the cost for partition accesses is low. From τ = O(1/k) we get p_r = O(k). Then k is:

k = O((n_s n_r)^{1/3}).

Inserting this into the sum gives O(n_r^{2/3} n_s^{2/3} + n_z).

To illustrate the complexity, we performed an overlap join between two relations with M tuples each and between two relations with M tuples each. As a reference point, we also compare it with the lower and upper bound of a sort-merge join (SMJ) of the same relations. Table shows the runtimes in seconds (as usual, the time to write result tuples is excluded). We can see that the runtime increases approximately by a factor of / / =. for the lower bound and by a factor of 4/ 4/ =. for the upper bound. The increase in runtime for the sort-merge join is .6 (linear) for the lower bound and 4. (quadratic) for the upper bound.

Table : Runtime and Factor of Runtime Increase.
M M increase OIPJOIN: LB (τ /) 46.6 UB (τ = ), 8 6, 6.8 SMJ: LB UB 8, 4 4, 7 4.

7. EMPIRICAL EVALUATION

This section evaluates the performance of the OIPJOIN and compares it empirically with the other self-adjusting approaches. The first set of experiments evaluates how k adapts to the cost of CPU

operations and the cost of block IOs. We also verify our cost function by relating it to the actual runtime. The second set of experiments evaluates the performance of different approaches for a varying percentage and distribution of long-lived tuples. The ability to efficiently process data with long-lived tuples, i.e., tuples with a non-negligible temporal duration, is the most crucial aspect of algorithms and access methods for temporal data. The OIPJOIN outperforms related approaches by a large margin. The third set of experiments shows that the OIPJOIN scales better than the other approaches for real world datasets, coming from the animal feed industry, a personnel office, and open source software. Between .% % of the tuples are larger than 8% of the data's time range, which already leads to significant differences. Finally, we show that the OIPJOIN scales better than the other approaches for disk resident data since it considers both CPU and IO costs.

Setup. For the experiments we use a x Intel(R) Xeon(R) CPU machine with 64GB main memory running CentOS 6.4. All algorithms have been implemented in C. We use a tuple size of bytes. The block size is bytes for relations stored in main memory (which gives the best performance on our machine) and 4K bytes (physical disk block size) when stored on disk. We implemented all algorithms for both disk and main memory storage. The cost to perform a CPU operation (. nsec) on our machine is about times faster than fetching a main memory block ( nsec). We use synthetic datasets with a time range of [, 4 ] as well as real world datasets (described below). The OIPJOIN is compared against the following state-of-the-art approaches.

Loose quadtree: We implemented a partition-based algorithm that joins every node of the outer tree with all relevant nodes in the inner tree. We use a cell expansion factor p = , which is widely accepted as the best value [8, ] and gave the best results in our experiments. We use density-based splitting, i.e., tuples are propagated down the tree only if a block is full.
Together with block storage, this gave a runtime improvement of up to 4% compared to random access to single tuples. Since in all experiments the loose quadtree outperformed the quadtree, the latter is not shown in the plots.

Relational interval tree: We implemented the RI-Tree Up-Down partition-based algorithm [9]. When data is stored in main memory, we do not use blocks to store tuples contiguously. The reason is that even for a clustering index, the time to fetch bytes that contain only a few matching tuples outweighs the advantages of contiguous memory access.

Segment tree: We implemented the segment tree, where the index is built on the inner relation and the overlap join is computed by joining each tuple of the outer relation with the segment tree. Duplicates are identified during join processing by testing whether the intersecting interval starts before the currently joined segment; if so, it has already been joined in a previous segment.

Sort-merge join (SMJ): We implemented a sort-merge join that sorts the tuples of the outer relation by the end point and the tuples of the inner relation by the start point. The sort order of the inner relation is used to stop scanning when an inner tuple has a larger start point than the end point of the outer tuple. The sort order of the outer relation allows us to limit the backtracking to the maximum duration of tuples in the relations. We implemented the join using blocks. In spite of more false hits, this increases the performance due to less backtracking.

All runtime experiments include the time to create the indices. For all approaches, the index creation time is % of the total runtime for data kept in memory and % for disk resident data.

Number of Granules. The first experiment shows how the OIPJOIN adapts to c_cpu and c_io. We use synthetic data: an outer relation with M tuples and an inner relation with M tuples, both with tuple durations up to .% of the time range. Figure 6(a) shows k when varying the ratio c_cpu/c_io from . to .
When c_cpu gets more expensive relative to c_io, k increases so that more partitions are generated. Figures 6(b) and 6(c) show, respectively, the corresponding AFR (decreasing) and the number of block IOs (increasing). Figure 6(d) shows the runtime for main memory resident data. It illustrates that even for data stored in main memory the performance can be increased if the costs of memory IOs and the costs of CPU operations are considered for determining the optimal k.

[Figure 6: Derived k with Varying c_cpu and c_io. Panels: (a) Derived k; (b) AFR; (c) Block IOs; (d) Runtime (Main Memory); x-axis: CPU cost / IO cost.]

The next experiment compares the cost function of the OIPJOIN to the actual runtime. We use the same relations as in the previous experiment and vary k. Figure 7(a) shows the cost function for c_cpu = . nsec and c_io = nsec. Figure 7(b) shows the actual runtime for the same setting. Both functions have the same shape, with the minimum at k = ,.

[Figure 7: Cost Function and Runtime. Panels: (a) Cost Function (Overhead); (b) Runtime; x-axis: k.]

Long-Lived Tuples. The next experiment compares the performance of the OIPJOIN with the loose quadtree, the relational interval tree, the segment tree, and the sort-merge join, by varying the number of long-lived tuples and the maximum duration of tuples. The two input relations have M tuples each, with long-lived tuples that have a duration up to 8% and an average duration of 4% of the relation's time range. Short-lived tuples have a maximum duration of .% of the time range. Figure 8 shows the runtime and the AFR of the four algorithms. The AFR of the relational interval tree and the segment tree are omitted since they produce no false hits. The OIPJOIN significantly outperforms the

other approaches since it does not suffer from long-lived tuples and has a very small AFR (the curve is close to the x-axis). In contrast, the loose quadtree is very sensitive to long-lived tuples, and the AFR increases drastically. This yields much higher runtimes due to excessive comparison operations and the filtering of false hits. Although the relational interval tree does not produce false hits, its performance decreases with the increase of long-lived tuples since a higher number of index nodes need to be joined, which requires a high number of operations on the indices. The segment tree scales worse than the relational interval tree, since with longer tuple durations many duplicates need to be fetched and tested. In our experiments, the segment tree outperforms the relational interval tree only for tuple durations smaller than .%. The performance of the sort-merge join is highly affected by the longest tuple in the dataset, but it scales better than the loose quadtree.

[Figure 8: Long-Lived Tuples. Panels: (a) Varying Number of Long-Lived Tuples (x-axis: # of long-lived tuples [%]); (b) Varying Maximum Duration of Tuples (x-axis: max. tuple duration [%]); y-axis: runtime.]

Real World Datasets. We use three real world datasets that differ in size and data distribution. The main properties of these datasets are summarized in Table . The Incumbent dataset [] records the history of employees assigned to projects over a 6 year period at a granularity of days. The Feed dataset records the history of measured nutritive values of feeds over a 4 year period at a granularity of days; a measurement of a nutrient remains valid until a new measurement for the same nutritive value and feed becomes available. The Webkit dataset [] records the history of files of the svn repository of the Webkit project over a year period at a granularity of milliseconds. The valid times indicate the periods when a file did not change.
Figure 9 shows the temporal distributions of the data (i.e., the number of overlapping tuple intervals at each time point) and the histograms of tuple durations.

Table : Properties of Real World Datasets.
Columns: Incumbent, Feed, Webkit
Cardinality 8, 8, 697, 97,, 476
Time Range, 89 8, 6 9
Min. Duration
Max. Duration 74 8, 89 9
Avg. Duration
Distinct Points, 689, 84, 6

[Figure 9: Tuple Intervals per Time Point and Duration Histogram of Real World Datasets. Panels: (a) Incumbent Dataset; (b) Feed Dataset; (c) Webkit Dataset; axes: # of tuples [%] over time and over duration [%].]

For all three datasets we perform an overlap join, using a subset of the dataset as the outer relation and the entire dataset as the inner relation. We use the smaller relation as the outer relation, since it typically has fewer partitions, and thus some partitions of the larger relation are not accessed at all during the join. Figure shows the runtime and the AFR for the three datasets depending on the size of the outer relation. The OIPJOIN has the best performance in all three settings. The other approaches suffer from long-lived tuples, e.g., the AFR of the loose quadtree is much larger than that of the OIPJOIN, and it does not adapt to the size of the dataset. The AFR of the sort-merge join is omitted since it reaches %.

Scalability on Disk. The last experiment shows the scalability of the algorithms for disk resident data. We vary the number of inner tuples from M to M. The number of outer tuples is % of the inner relation. Both relations have tuple durations up to .% of the time range. c_io is times higher than c_cpu. Figures (a) and (b) show the number of block IOs and the AFR. Figure (c) shows the runtime behavior on a server with 64GB of main memory, where a large number of disk blocks is cached by the operating system.
Although the loose quadtree, due to its density-based splitting strategy, is the best approach in terms of block IOs, it produces a large number of false hits. The OIPJOIN adapts to both the cost of block IOs and the cost of false hits, and thus outperforms all other approaches in terms of runtime. The segment tree performs worst, in particular in terms of IO (close to the y-axis), since for each outer tuple, duplicate inner tuples and thus disk blocks are fetched several times. We ran the same experiment for the three best approaches on a different machine with a similar CPU but only 4GB main memory, that is, fewer disk blocks are cached by the operating system. The runtime behavior is shown in Figure () and is slower, as expected. The loose quadtree per-
