Efficient Data Structures for the Factor Periodicity Problem

Size: px
Start display at page:

Download "Efficient Data Structures for the Factor Periodicity Problem"

Transcription

1 Efficient Data Structures for the Factor Periodicity Problem Tomasz Kociumaka Wojciech Rytter Jakub Radoszewski Tomasz Waleń University of Warsaw, Poland SPIRE 2012 Cartagena, October 23, 2012 Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 1/15

2 Factor Periodicity Problem w: a b a a b a b a a b a a b a b a a b a b a a b a a b a b a a b a a b Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 2/15

3 Factor Periodicity Problem w: a b a a b a b a a b a a b a b a a b a b a a b a a b a b a a b a a b Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 2/15

4 Factor Periodicity Problem w: a b a a b a b a a b a a b a b a a b a b a a b a a b a b a a b a a b a a b a b a a b a b a a Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 2/15

5 Factor Periodicity Problem w: a b a a b a b a a b a a b a b a a b a b a a b a a b a b a a b a a b a a b a b a a b a b a a Periods of w[11..22] are 5 Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 2/15

6 Factor Periodicity Problem w: a b a a b a b a a b a a b a b a a b a b a a b a a b a b a a b a a b a a b a b a a b a b a a Periods of w[11..22] are 5, 10 Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 2/15

7 Factor Periodicity Problem w: a b a a b a b a a b a a b a b a a b a b a a b a a b a b a a b a a b a a b a b a a b a b a a Periods of w[11..22] are 5, 10, 11 Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 2/15

8 Factor Periodicity Problem w: a b a a b a b a a b a a b a b a a b a b a a b a a b a b a a b a a b a a b a b a a b a b a a Periods of w[11..22] are 5, 10, 11 and 12. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 2/15

9 Factor Periodicity Problem w: a b a a b a b a a b a a b a b a a b a b a a b a a b a b a a b a a b a a b a b a a b a b a a Periods of w[11..22] are 5, 10, 11 and 12. Notation P er(w[11..22]) ={5, 10, 11, 12}, per(w[11..22]) = 5. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 2/15

10 Arithmetic sets A word of length m might have Θ(m) periods, e.g. a m. Definition A set A = {a, a + d, a + 2d,..., a + kd} Z is called arithmetic. An integer d is called the difference of A. Observe that an arithmetic set can be represented by three integers: a, d and k. Fact Let v be a word of length m. Then P er(v) is a union of at most log m disjoint arithmetic sets. For example P er(w[11..22]) = {5} {10, 11, 12} = {5, 10} {11, 12}. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 3/15

11 Formal problem statement Problem (Periods Queries) Design a data structure that for a fixed word w of length n answers the following queries. Given integers i, j (1 i j n) compute P er(w[i..j]) respresented as a union of O(log w ) arithmetic sets. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 4/15

12 Formal problem statement Problem (Periods Queries) Design a data structure that for a fixed word w of length n answers the following queries. Given integers i, j (1 i j n) compute P er(w[i..j]) respresented as a union of O(log w ) arithmetic sets. Definition We say that p is an (1 + δ)-period of v if v (1 + δ)p. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 4/15

13 Formal problem statement Problem (Periods Queries) Design a data structure that for a fixed word w of length n answers the following queries. Given integers i, j (1 i j n) compute P er(w[i..j]) respresented as a union of O(log w ) arithmetic sets. Definition We say that p is an (1 + δ)-period of v if v (1 + δ)p. Problem ((1 + δ)-period Queries) Let us fix a real number δ > 0. Design a data structure that for a fixed word w of length n answers the following queries. Given integers i, j (1 i j n) compute all (1 + δ)-periods of w[i..j] respresented as a union of O(1) arithmetic sets. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 4/15

14 Related work To the best of our knowledge no previous research on the general case of Period Queries. Even for computing the maximal period, only straightforward solutions: memorize all answers O(n 2 ) space, O(1) query time compute the answer from scratch for each query no extra space, O(n) query time Efficient data structures for primitivity testing (generalized by (1 + δ)-period Queries with δ = 1) Karhumäki, Lifshits & Rytter; CPM 2007 O(n log n) space, O(1) query time, Crochemore et. al; SPIRE 2010 O(n log ε n) space, O(log n) query time. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 5/15

15 Our results Several results based on the common idea but different tools. Space All periods (1 + δ)-periods O(n) O(log 1+ε n) O(log ε n) O(n log log n) O(log n(log log n) 2 ) O((log log n) 2 ) O(n log ε n) O(log n log log n) O(log log n) O(n log n) O(log n) O(1) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 6/15

16 Our results Several results based on the common idea but different tools. Space All periods (1 + δ)-periods O(n) O(log 1+ε n) O(log ε n) O(n log log n) O(log n(log log n) 2 ) O((log log n) 2 ) O(n log ε n) O(log n log log n) O(log log n) O(n log n) O(log n) O(1) Standard assumption on the model of computation integer alphabet, i.e. Σ { 0, 1,..., n O(1)}, word RAM model with w = Ω(log n), randomization. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 6/15

17 Our approach Let Borders(v) = { u : u is a border of v}. Fact P er(v) = v Borders(v) = { v b : b Borders(v)}. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 7/15

18 Our approach Let Borders(v) = { u : u is a border of v}. Fact P er(v) = v Borders(v) = { v b : b Borders(v)}. We compute Borders(v) {2 k,..., 2 k+1 1} separately for each k {0,..., log v }. 2 k 2 k 1 2 k 1 2 k v Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 7/15

19 Our approach Let Borders(v) = { u : u is a border of v}. Fact P er(v) = v Borders(v) = { v b : b Borders(v)}. We compute Borders(v) {2 k,..., 2 k+1 1} separately for each k {0,..., log v }. 2 k 2 k 1 2 k 1 2 k v prefix suffix Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 7/15

20 Our approach Let Borders(v) = { u : u is a border of v}. Fact P er(v) = v Borders(v) = { v b : b Borders(v)}. We compute Borders(v) {2 k,..., 2 k+1 1} separately for each k {0,..., log v }. 2 k 2 k 1 2 k 1 2 k v prefix border suffix Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 7/15

21 Our approach Let Borders(v) = { u : u is a border of v}. Fact P er(v) = v Borders(v) = { v b : b Borders(v)}. We compute Borders(v) {2 k,..., 2 k+1 1} separately for each k {0,..., log v }. 2 k 2 k 1 2 k 1 2 k v prefix border suffix Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 7/15

22 Close occurrences Let Occ(u, v) be the set of positions of v where an occurrence of u starts. Arithmetic sets naturally appear as the Occ sets. Fact Let v 2 u. Then Occ(u, v) is arithmetic. Moreover, if Occ(u, v) 3 then its difference is equal to per(u). Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 8/15

23 Close occurrences Let Occ(u, v) be the set of positions of v where an occurrence of u starts. Arithmetic sets naturally appear as the Occ sets. Fact Let v 2 u. Then Occ(u, v) is arithmetic. Moreover, if Occ(u, v) 3 then its difference is equal to per(u). Case with Occ(u, v) 2 is trivial. Assume Occ(u, v) 2 Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 8/15

24 Close occurrences Let Occ(u, v) be the set of positions of v where an occurrence of u starts. Arithmetic sets naturally appear as the Occ sets. Fact Let v 2 u. Then Occ(u, v) is arithmetic. Moreover, if Occ(u, v) 3 then its difference is equal to per(u). Case with Occ(u, v) 2 is trivial. Assume Occ(u, v) 2 v u periods of u Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 8/15

25 Close occurrences Let Occ(u, v) be the set of positions of v where an occurrence of u starts. Arithmetic sets naturally appear as the Occ sets. Fact Let v 2 u. Then Occ(u, v) is arithmetic. Moreover, if Occ(u, v) 3 then its difference is equal to per(u). Case with Occ(u, v) 2 is trivial. Assume Occ(u, v) 2 v u period of u Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 8/15

26 A formula for border lengths 2 k 2 k 1 2 k 1 2 k v p p s s Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 9/15

27 A formula for border lengths 2 k 2 k 1 2 k 1 2 k v p p s s P = Occ(p, s s) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 9/15

28 A formula for border lengths 2 k 2 k 1 2 k 1 2 k v p p s s S = Occ(s, pp ) P = Occ(p, s s) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 9/15

29 A formula for border lengths 2 k 2 k 1 2 k 1 2 k l l v p p s s l S = Occ(s, pp ) P = Occ(p, s s) l Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 9/15

30 A formula for border lengths 2 k 2 k 1 2 k 1 2 k l l v p p s s l S = Occ(s, pp ) P = Occ(p, s s) l Fact Let 0 l < 2 k. Then the word v has a border of length 2 k + l if and only if l + 1 S and s s l P. Consequently Borders(v) {2 k,..., 2 k+1 1} is arithmetic. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 9/15

31 Intersecting arithmetic sets Intersecting to arithmetics set in general is performed using the extended Euclid s algorithm, so takes O(log n) time. But if this sets share a common difference, constant time is sufficient. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 10/15

32 Intersecting arithmetic sets Intersecting to arithmetics set in general is performed using the extended Euclid s algorithm, so takes O(log n) time. But if this sets share a common difference, constant time is sufficient. Lemma If P 3 and S 3, then per(p) = per(s). Consequently P and S are arithmetic of common difference. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 10/15

33 Intersecting arithmetic sets Intersecting to arithmetics set in general is performed using the extended Euclid s algorithm, so takes O(log n) time. But if this sets share a common difference, constant time is sufficient. Lemma If P 3 and S 3, then per(p) = per(s). Consequently P and S are arithmetic of common difference. v p p s s S P Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 10/15

34 Intersecting arithmetic sets Intersecting to arithmetics set in general is performed using the extended Euclid s algorithm, so takes O(log n) time. But if this sets share a common difference, constant time is sufficient. Lemma If P 3 and S 3, then per(p) = per(s). Consequently P and S are arithmetic of common difference. v p p s s S P 2per(s) 2per(p) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 10/15

35 Intersecting arithmetic sets Intersecting to arithmetics set in general is performed using the extended Euclid s algorithm, so takes O(log n) time. But if this sets share a common difference, constant time is sufficient. Lemma If P 3 and S 3, then per(p) = per(s). Consequently P and S are arithmetic of common difference. v p p s s S P 2per(s) of period both per(p) and per(s) 2per(p) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 10/15

36 Summary of the combinatorial part Problem (Occurrence Queries) Design a data structure that for a word w can answer the following queries. Given a basic factor u and a factor v of w such that v 2 u (both represented by one of their occurrences) compute the arithmetic set Occ(u, v). Theorem Assume there is a data structure answering the Occurrence Queries in O(f(n)) time. Then this data structure can answer Period Queries in O(f(n) log n) time and (1 + δ)-period Queries is O(f(n)) time. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 11/15

37 Range Predecessor/Successor Queries Problem (Range Predecessor/Successor Queries) Design a data structure that for a word w can answer the following queries. Given a factor u of w (represented by an occurrence in w) and i {1... n} find P RED(u, i) the last occurrence of u ending at a position i, SUCC(u, i) the first occurrence of u starting at a position i. The Occurrence Queries can be reduced to three Range Predecessor/Successor Queries, where u is a basic factor of w. w i i v j Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 12/15

38 Range Predecessor/Successor Queries Problem (Range Predecessor/Successor Queries) Design a data structure that for a word w can answer the following queries. Given a factor u of w (represented by an occurrence in w) and i {1... n} find P RED(u, i) the last occurrence of u ending at a position i, SUCC(u, i) the first occurrence of u starting at a position i. The Occurrence Queries can be reduced to three Range Predecessor/Successor Queries, where u is a basic factor of w. w i i v j Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 12/15

39 Range Predecessor/Successor Queries Problem (Range Predecessor/Successor Queries) Design a data structure that for a word w can answer the following queries. Given a factor u of w (represented by an occurrence in w) and i {1... n} find P RED(u, i) the last occurrence of u ending at a position i, SUCC(u, i) the first occurrence of u starting at a position i. The Occurrence Queries can be reduced to three Range Predecessor/Successor Queries, where u is a basic factor of w. w SUCC(u, i) i i v j Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 12/15

40 Range Predecessor/Successor Queries Problem (Range Predecessor/Successor Queries) Design a data structure that for a word w can answer the following queries. Given a factor u of w (represented by an occurrence in w) and i {1... n} find P RED(u, i) the last occurrence of u ending at a position i, SUCC(u, i) the first occurrence of u starting at a position i. The Occurrence Queries can be reduced to three Range Predecessor/Successor Queries, where u is a basic factor of w. w SUCC(u, i) i i SUCC(u, i + 1) v j Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 12/15

41 Range Predecessor/Successor Queries Problem (Range Predecessor/Successor Queries) Design a data structure that for a word w can answer the following queries. Given a factor u of w (represented by an occurrence in w) and i {1... n} find P RED(u, i) the last occurrence of u ending at a position i, SUCC(u, i) the first occurrence of u starting at a position i. The Occurrence Queries can be reduced to three Range Predecessor/Successor Queries, where u is a basic factor of w. w SUCC(u, i) i i SUCC(u, i + 1) v j P RED(u, j) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 12/15

42 Range Predecessor/Successor Queries Problem (Range Predecessor/Successor Queries) Design a data structure that for a word w can answer the following queries. Given a factor u of w (represented by an occurrence in w) and i {1... n} find P RED(u, i) the last occurrence of u ending at a position i, SUCC(u, i) the first occurrence of u starting at a position i. The Occurrence Queries can be reduced to three Range Predecessor/Successor Queries, where u is a basic factor of w. w SUCC(u, i) i i SUCC(u, i + 1) v j P RED(u, j) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 12/15

43 Simple solution Recall that the Dictionary of Basic Factors is a data structure, which assigns integer identifiers to all basic factors of w. DBF k [i] = DBF k [j] w[i..i + 2 k 1] = w[j..j + 2 k 1]. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 13/15

44 Simple solution Recall that the Dictionary of Basic Factors is a data structure, which assigns integer identifiers to all basic factors of w. DBF k [i] = DBF k [j] w[i..i + 2 k 1] = w[j..j + 2 k 1]. For each k and identifier id we store a set A k,id = {i : DBF k [i] = id}. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 13/15

45 Simple solution Recall that the Dictionary of Basic Factors is a data structure, which assigns integer identifiers to all basic factors of w. DBF k [i] = DBF k [j] w[i..i + 2 k 1] = w[j..j + 2 k 1]. For each k and identifier id we store a set A k,id = {i : DBF k [i] = id}. If u = w[i..i + 2 k 1] and DBF k [i] = id, then A k,id = Occ(u, w). Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 13/15

46 Simple solution Recall that the Dictionary of Basic Factors is a data structure, which assigns integer identifiers to all basic factors of w. DBF k [i] = DBF k [j] w[i..i + 2 k 1] = w[j..j + 2 k 1]. For each k and identifier id we store a set A k,id = {i : DBF k [i] = id}. If u = w[i..i + 2 k 1] and DBF k [i] = id, then A k,id = Occ(u, w). A single binary search in A k,id allows to answer P RED(u, i) and SUCC(u, i). Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 13/15

47 Simple solution Recall that the Dictionary of Basic Factors is a data structure, which assigns integer identifiers to all basic factors of w. DBF k [i] = DBF k [j] w[i..i + 2 k 1] = w[j..j + 2 k 1]. For each k and identifier id we store a set A k,id = {i : DBF k [i] = id}. If u = w[i..i + 2 k 1] and DBF k [i] = id, then A k,id = Occ(u, w). A single binary search in A k,id allows to answer P RED(u, i) and SUCC(u, i). Corollary There exists a (simple) data structure of O(n log n) size that answer the Occurrence Queries in O(log n) time, and consequently the Period Queries in O(log 2 n) time. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 13/15

48 O(1) query time solution Fix 2 k n and consider a basic factor u, u = 2 k. Split w into fragments of length 2 k+1 with overlaps of size 2 k. 2 k+1 w 2 k+1 Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 14/15

49 O(1) query time solution Fix 2 k n and consider a basic factor u, u = 2 k. Split w into fragments of length 2 k+1 with overlaps of size 2 k. 2 k+1 w 2 k+1 Each occurrence of u occurs within a fragment. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 14/15

50 O(1) query time solution Fix 2 k n and consider a basic factor u, u = 2 k. Split w into fragments of length 2 k+1 with overlaps of size 2 k. 2 k+1 w 2 k+1 arithmetic sets Each occurrence of u occurs within a fragment. Occurrences in single fragments form arithmetic sets. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 14/15

51 O(1) query time solution Imagine a (large) array indexed by identifiers and fragments. u. u u [ ] [ 0, 2 k+1 2 k, 3 2 k] [ 2 k+1, 2 k+2]... [ m 2 k, n ]. This array has Θ.. ( n 2 2 k ) cells Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 15/15

52 O(1) query time solution Imagine a (large) array indexed by identifiers and fragments. u. u u [ ] [ 0, 2 k+1 2 k, 3 2 k] [ 2 k+1, 2 k+2]... [ m 2 k, n ]. This array has Θ. ( ) n 2 2 k cells.. All factors of a fixed length have n occurrences in total, so n non-empty fields perfect hashing can be used Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 15/15

53 O(1) query time solution Imagine a (large) array indexed by identifiers and fragments. u. u u [ ] [ 0, 2 k+1 2 k, 3 2 k] [ 2 k+1, 2 k+2]... [ m 2 k, n ]. This array has Θ. ( ) n 2 2 k cells.. All factors of a fixed length have n occurrences in total, so n non-empty fields perfect hashing can be used. This gives O(n log n) in total for all values of k Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 15/15

54 O(1) query time solution Answering queries: v w 2 k+1 2 k+1 v lies within at most two consecutive fragments, Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 16/15

55 O(1) query time solution Answering queries: v w 2 k+1 u 2 k+1 v lies within at most two consecutive fragments, get the occurrences of u from the hash table, Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 16/15

56 O(1) query time solution Answering queries: v w 2 k+1 u 2 k+1 v lies within at most two consecutive fragments, get the occurrences of u from the hash table, Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 16/15

57 O(1) query time solution Answering queries: v w 2 k+1 u 2 k+1 v lies within at most two consecutive fragments, get the occurrences of u from the hash table, crop and merge these arithmetic sets to obtain the result. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 16/15

58 O(1) query time solution Answering queries: v w 2 k+1 u 2 k+1 v lies within at most two consecutive fragments, get the occurrences of u from the hash table, crop and merge these arithmetic sets to obtain the result. Corollary There exists a data structure of O(n log n) size that answers the Occurrence Queries in O(1) time. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 16/15

59 Space-efficient solution Theorem (Nekrich, Navarro; 2012) There exists data structures that given the locus of u in the suffix tree of w answer the Range Predecessor/Successor queries in and satisfy the following space and time bounds: Space Query time O(n) O(log ε n) O(n log log n) O((log log n) 2 ) O(n log ε n) O(log log n) Theorem (Weighted LA Kopelovitz, Lewenstein; 2007) There exists a data structure of size O(n), which given an interval [i..j] finds the locus of w[i..j] in the suffix tree of w in O(log log n) time. Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 17/15

60 Summary Theorem (this paper) There exists data structures that satisfy following time and space bounds for size, Period Queries query time and (1 + δ)-period Queries query time: Space Period Queries (1 + δ)-period Q. O(n) O(log 1+ε n) O(log ε n) O(n log log n) O(log n(log log n) 2 ) O((log log n) 2 ) O(n log ε n) O(log n log log n) O(log log n) O(n log n) O(log n) O(1) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 18/15

61 Further research Currently in progress: Space Period Queries (1 + δ)-period Q. O(n) O(log 1+ε n) O(log ε n) O(n log log n) O(log n(log log n) 2 ) O((log log n) 2 ) O(n log ε n) O(log n log log n) O(log log n) O(n log n) O(log n) O(1) Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 19/15

62 Further research Currently in progress: Space Period Queries 2-Period Queries O(n) O(log 1+ε n) O(log ε n) O(n log log n) O(log n(log log n) 2 ) O((log log n) 2 ) O(n log ε n) O(log n log log n) O(log log n) O(n log n) O(log n) O(1) O(n) O(1) Open problems: Can the O(n log n) time preprocessing improved with o(n) queries? Can the maximum period be found faster than O(log n) with o(n 2 ) space? Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 19/15

63 Further research Currently in progress: Space Period Queries 2-Period Queries O(n) O(log 1+ε n) O(log ε n) O(n log log n) O(log n log log n) O(log log n) O(n log n) O(log n) O(1) O(n) O(1) Open problems: Can the O(n log n) time preprocessing improved with o(n) queries? Can the maximum period be found faster than O(log n) with o(n 2 ) space? Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 19/15

64 Thank you Thank you for your attention! Tomasz Kociumaka Efficient Data Structures for the Factor Periodicity Problem 20/15

Lecture 9 March 4, 2010

Lecture 9 March 4, 2010 6.851: Advanced Data Structures Spring 010 Dr. André Schulz Lecture 9 March 4, 010 1 Overview Last lecture we defined the Least Common Ancestor (LCA) and Range Min Query (RMQ) problems. Recall that an

More information

Fast Evaluation of Union-Intersection Expressions

Fast Evaluation of Union-Intersection Expressions Fast Evaluation of Union-Intersection Expressions Philip Bille Anna Pagh Rasmus Pagh IT University of Copenhagen Data Structures for Intersection Queries Preprocess a collection of sets S 1,..., S m independently

More information

Computational Geometry

Computational Geometry Windowing queries Windowing Windowing queries Zoom in; re-center and zoom in; select by outlining Windowing Windowing queries Windowing Windowing queries Given a set of n axis-parallel line segments, preprocess

More information

Range Reporting. Range Reporting. Range Reporting Problem. Applications

Range Reporting. Range Reporting. Range Reporting Problem. Applications Philip Bille Problem problem. Preprocess at set of points P R 2 to support report(x1, y1, x2, y2): Return the set of points in R P, where R is rectangle given by (x1, y1) and (x2, y2). Applications Relational

More information

Optimal Parallel Randomized Renaming

Optimal Parallel Randomized Renaming Optimal Parallel Randomized Renaming Martin Farach S. Muthukrishnan September 11, 1995 Abstract We consider the Renaming Problem, a basic processing step in string algorithms, for which we give a simultaneously

More information

Lempel-Ziv compression: how and why?

Lempel-Ziv compression: how and why? Lempel-Ziv compression: how and why? Algorithms on Strings Paweł Gawrychowski July 9, 2013 s July 9, 2013 2/18 Outline Lempel-Ziv compression Computing the factorization Using the factorization July 9,

More information

Hashing. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong

Hashing. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong Department of Computer Science and Engineering Chinese University of Hong Kong In this lecture, we will revisit the dictionary search problem, where we want to locate an integer v in a set of size n or

More information

Range Reporting. Range reporting problem 1D range reporting Range trees 2D range reporting Range trees Predecessor in nested sets kd trees

Range Reporting. Range reporting problem 1D range reporting Range trees 2D range reporting Range trees Predecessor in nested sets kd trees Range Reporting Range reporting problem 1D range reporting Range trees 2D range reporting Range trees Predecessor in nested sets kd trees Philip Bille Range Reporting Range reporting problem 1D range reporting

More information

Computational Geometry

Computational Geometry Motivation Motivation Polygons and visibility Visibility in polygons Triangulation Proof of the Art gallery theorem Two points in a simple polygon can see each other if their connecting line segment is

More information

Succinct dictionary matching with no slowdown

Succinct dictionary matching with no slowdown LIAFA, Univ. Paris Diderot - Paris 7 Dictionary matching problem Set of d patterns (strings): S = {s 1, s 2,..., s d }. d i=1 s i = n characters from an alphabet of size σ. Queries: text T occurrences

More information

Reporting Consecutive Substring Occurrences Under Bounded Gap Constraints

Reporting Consecutive Substring Occurrences Under Bounded Gap Constraints Reporting Consecutive Substring Occurrences Under Bounded Gap Constraints Gonzalo Navarro University of Chile, Chile gnavarro@dcc.uchile.cl Sharma V. Thankachan Georgia Institute of Technology, USA sthankachan@gatech.edu

More information

A Linear Size Index for Approximate String Matching

A Linear Size Index for Approximate String Matching A Linear Size Index for Approximate String Matching CPM 2006 Siu-Lung Tam Joint work with: Ho-Leung Chan Tak-Wah Lam Wing-Kin Sung Swee-Seong Wong 2006-07-05 11:10 +0200 Indexing for Approximate String

More information

An Improved Query Time for Succinct Dynamic Dictionary Matching

An Improved Query Time for Succinct Dynamic Dictionary Matching An Improved Query Time for Succinct Dynamic Dictionary Matching Guy Feigenblat 12 Ely Porat 1 Ariel Shiftan 1 1 Bar-Ilan University 2 IBM Research Haifa labs Agenda An Improved Query Time for Succinct

More information

Applications of Succinct Dynamic Compact Tries to Some String Problems

Applications of Succinct Dynamic Compact Tries to Some String Problems Applications of Succinct Dynamic Compact Tries to Some String Problems Takuya Takagi 1, Takashi Uemura 2, Shunsuke Inenaga 3, Kunihiko Sadakane 4, and Hiroki Arimura 1 1 IST & School of Engineering, Hokkaido

More information

Indexed Geometric Jumbled Pattern Matching

Indexed Geometric Jumbled Pattern Matching Indexed Geometric Jumbled Pattern Matching S. Durocher 1 R. Fraser 1,2 T. Gagie 3 D. Mondal 1 M. Skala 1 S. Thankachan 4 1 University of Manitoba 2 Google 3 University of Helsinki, Helsinki Institute for

More information

Computational Geometry

Computational Geometry Orthogonal Range Searching omputational Geometry hapter 5 Range Searching Problem: Given a set of n points in R d, preprocess them such that reporting or counting the k points inside a d-dimensional axis-parallel

More information

Improved Algorithms for the Range Next Value Problem and Applications

Improved Algorithms for the Range Next Value Problem and Applications Improved Algorithms for the Range Next Value Problem and Applications Maxime Crochemore, Costas Iliopoulos, Marcin Kubica, M. Sohel Rahman, Tomasz Walen To cite this version: Maxime Crochemore, Costas

More information

arxiv: v2 [cs.ds] 30 Sep 2016

arxiv: v2 [cs.ds] 30 Sep 2016 Synergistic Sorting, MultiSelection and Deferred Data Structures on MultiSets Jérémy Barbay 1, Carlos Ochoa 1, and Srinivasa Rao Satti 2 1 Departamento de Ciencias de la Computación, Universidad de Chile,

More information

1. Meshes. D7013E Lecture 14

1. Meshes. D7013E Lecture 14 D7013E Lecture 14 Quadtrees Mesh Generation 1. Meshes Input: Components in the form of disjoint polygonal objects Integer coordinates, 0, 45, 90, or 135 angles Output: A triangular mesh Conforming: A triangle

More information

CSCI 104 Log Structured Merge Trees. Mark Redekopp

CSCI 104 Log Structured Merge Trees. Mark Redekopp 1 CSCI 10 Log Structured Merge Trees Mark Redekopp Series Summation Review Let n = 1 + + + + k = σk i=0 n = k+1-1 i. What is n? What is log (1) + log () + log () + log (8)++ log ( k ) = 0 + 1 + + 3+ +

More information

Searching a Sorted Set of Strings

Searching a Sorted Set of Strings Department of Mathematics and Computer Science January 24, 2017 University of Southern Denmark RF Searching a Sorted Set of Strings Assume we have a set of n strings in RAM, and know their sorted order

More information

Algorithms for Nearest Neighbors

Algorithms for Nearest Neighbors Algorithms for Nearest Neighbors Classic Ideas, New Ideas Yury Lifshits Steklov Institute of Mathematics at St.Petersburg http://logic.pdmi.ras.ru/~yura University of Toronto, July 2007 1 / 39 Outline

More information

Approximation Algorithms for Geometric Intersection Graphs

Approximation Algorithms for Geometric Intersection Graphs Approximation Algorithms for Geometric Intersection Graphs Subhas C. Nandy (nandysc@isical.ac.in) Advanced Computing and Microelectronics Unit Indian Statistical Institute Kolkata 700108, India. Outline

More information

Suffix Trees and Arrays

Suffix Trees and Arrays Suffix Trees and Arrays Yufei Tao KAIST May 1, 2013 We will discuss the following substring matching problem: Problem (Substring Matching) Let σ be a single string of n characters. Given a query string

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

Module 5: Hashing. CS Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo

Module 5: Hashing. CS Data Structures and Data Management. Reza Dorrigiv, Daniel Roche. School of Computer Science, University of Waterloo Module 5: Hashing CS 240 - Data Structures and Data Management Reza Dorrigiv, Daniel Roche School of Computer Science, University of Waterloo Winter 2010 Reza Dorrigiv, Daniel Roche (CS, UW) CS240 - Module

More information

An introduction to suffix trees and indexing

An introduction to suffix trees and indexing An introduction to suffix trees and indexing Tomáš Flouri Solon P. Pissis Heidelberg Institute for Theoretical Studies December 3, 2012 1 Introduction Introduction 2 Basic Definitions Graph theory Alphabet

More information

3 Competitive Dynamic BSTs (January 31 and February 2)

3 Competitive Dynamic BSTs (January 31 and February 2) 3 Competitive Dynamic BSTs (January 31 and February ) In their original paper on splay trees [3], Danny Sleator and Bob Tarjan conjectured that the cost of sequence of searches in a splay tree is within

More information

CPSC 536N: Randomized Algorithms Term 2. Lecture 10

CPSC 536N: Randomized Algorithms Term 2. Lecture 10 CPSC 536N: Randomized Algorithms 011-1 Term Prof. Nick Harvey Lecture 10 University of British Columbia In the first lecture we discussed the Max Cut problem, which is NP-complete, and we presented a very

More information

4. Suffix Trees and Arrays

4. Suffix Trees and Arrays 4. Suffix Trees and Arrays Let T = T [0..n) be the text. For i [0..n], let T i denote the suffix T [i..n). Furthermore, for any subset C [0..n], we write T C = {T i i C}. In particular, T [0..n] is the

More information

58093 String Processing Algorithms. Lectures, Autumn 2013, period II

58093 String Processing Algorithms. Lectures, Autumn 2013, period II 58093 String Processing Algorithms Lectures, Autumn 2013, period II Juha Kärkkäinen 1 Contents 0. Introduction 1. Sets of strings Search trees, string sorting, binary search 2. Exact string matching Finding

More information

Predecessor. Predecessor Problem van Emde Boas Tries. Philip Bille

Predecessor. Predecessor Problem van Emde Boas Tries. Philip Bille Predecessor Predecessor Problem van Emde Boas Tries Philip Bille Predecessor Predecessor Problem van Emde Boas Tries Predecessors Predecessor problem. Maintain a set S U = {,..., u-} supporting predecessor(x):

More information

Stabbers of line segments in the plane

Stabbers of line segments in the plane Stabbers of line segments in the plane M. Claverol D. Garijo C. I. Grima A. Márquez C. Seara August 3, 2010 Abstract The problem of computing a representation of the stabbing lines of a set S of segments

More information

Computation with No Memory, and Rearrangeable Multicast Networks

Computation with No Memory, and Rearrangeable Multicast Networks Discrete Mathematics and Theoretical Computer Science DMTCS vol. 16:1, 2014, 121 142 Computation with No Memory, and Rearrangeable Multicast Networks Serge Burckel 1 Emeric Gioan 2 Emmanuel Thomé 3 1 ERMIT,

More information

Spatio-temporal Range Searching Over Compressed Kinetic Sensor Data. Sorelle A. Friedler Google Joint work with David M. Mount

Spatio-temporal Range Searching Over Compressed Kinetic Sensor Data. Sorelle A. Friedler Google Joint work with David M. Mount Spatio-temporal Range Searching Over Compressed Kinetic Sensor Data Sorelle A. Friedler Google Joint work with David M. Mount Motivation Kinetic data: data generated by moving objects Sensors collect data

More information

Lecture Notes: Range Searching with Linear Space

Lecture Notes: Range Searching with Linear Space Lecture Notes: Range Searching with Linear Space Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong taoyf@cse.cuhk.edu.hk In this lecture, we will continue our discussion

More information

Section 6.3: Further Rules for Counting Sets

Section 6.3: Further Rules for Counting Sets Section 6.3: Further Rules for Counting Sets Often when we are considering the probability of an event, that event is itself a union of other events. For example, suppose there is a horse race with three

More information

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1}

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1} CSE 5 Homework 2 Due: Monday October 6, 27 Instructions Upload a single file to Gradescope for each group. should be on each page of the submission. All group members names and PIDs Your assignments in

More information

9 Bounds for the Knapsack Problem (March 6)

9 Bounds for the Knapsack Problem (March 6) 9 Bounds for the Knapsack Problem (March 6) In this lecture, I ll develop both upper and lower bounds in the linear decision tree model for the following version of the (NP-complete) Knapsack 1 problem:

More information

CPSC 311: Analysis of Algorithms (Honors) Exam 1 October 11, 2002

CPSC 311: Analysis of Algorithms (Honors) Exam 1 October 11, 2002 CPSC 311: Analysis of Algorithms (Honors) Exam 1 October 11, 2002 Name: Instructions: 1. This is a closed book exam. Do not use any notes or books, other than your 8.5-by-11 inch review sheet. Do not confer

More information

Simple Real-Time Constant-Space String Matching. Dany Breslauer, Roberto Grossi and Filippo Mignosi

Simple Real-Time Constant-Space String Matching. Dany Breslauer, Roberto Grossi and Filippo Mignosi Simple Real-Time Constant-Space String Matching Dany Breslauer, Roberto Grossi and Filippo Mignosi Real-time string matching Pattern X X[1..m] Text T T[1..n] Real-time string matching Pattern X X[1..m]

More information

S ELECTE Columbia University

S ELECTE Columbia University AD-A267 814 Testing String Superprimitivity in Parallel DTIC Dany Breslauer S ELECTE Columbia University AUG 111993 CUCS-053-92 A Abstract A string w covers another string z if every symbol of z is within

More information

CMSC 754 Computational Geometry 1

CMSC 754 Computational Geometry 1 CMSC 754 Computational Geometry 1 David M. Mount Department of Computer Science University of Maryland Fall 2005 1 Copyright, David M. Mount, 2005, Dept. of Computer Science, University of Maryland, College

More information

Reordering Buffer Management with Advice

Reordering Buffer Management with Advice Reordering Buffer Management with Advice Anna Adamaszek Marc P. Renault Adi Rosén Rob van Stee Abstract In the reordering buffer management problem, a sequence of coloured items arrives at a service station

More information

Predecessor. Predecessor. Predecessors. Predecessors. Predecessor Problem van Emde Boas Tries. Predecessor Problem van Emde Boas Tries.

Predecessor. Predecessor. Predecessors. Predecessors. Predecessor Problem van Emde Boas Tries. Predecessor Problem van Emde Boas Tries. Philip Bille s problem. Maintain a set S U = {,..., u-} supporting predecessor(x): return the largest element in S that is x. sucessor(x): return the smallest element in S that is x. insert(x): set S =

More information

FINAL EXAM SOLUTIONS

FINAL EXAM SOLUTIONS COMP/MATH 3804 Design and Analysis of Algorithms I Fall 2015 FINAL EXAM SOLUTIONS Question 1 (12%). Modify Euclid s algorithm as follows. function Newclid(a,b) if a

More information

Pebble Sets in Convex Polygons

Pebble Sets in Convex Polygons 2 1 Pebble Sets in Convex Polygons Kevin Iga, Randall Maddox June 15, 2005 Abstract Lukács and András posed the problem of showing the existence of a set of n 2 points in the interior of a convex n-gon

More information

Lecture 6: Faces, Facets

Lecture 6: Faces, Facets IE 511: Integer Programming, Spring 2019 31 Jan, 2019 Lecturer: Karthik Chandrasekaran Lecture 6: Faces, Facets Scribe: Setareh Taki Disclaimer: These notes have not been subjected to the usual scrutiny

More information

Table 1 below illustrates the construction for the case of 11 integers selected from 20.

Table 1 below illustrates the construction for the case of 11 integers selected from 20. Q: a) From the first 200 natural numbers 101 of them are arbitrarily chosen. Prove that among the numbers chosen there exists a pair such that one divides the other. b) Prove that if 100 numbers are chosen

More information

9.5 Equivalence Relations

9.5 Equivalence Relations 9.5 Equivalence Relations You know from your early study of fractions that each fraction has many equivalent forms. For example, 2, 2 4, 3 6, 2, 3 6, 5 30,... are all different ways to represent the same

More information

Lecture 4: Primal Dual Matching Algorithm and Non-Bipartite Matching. 1 Primal/Dual Algorithm for weighted matchings in Bipartite Graphs

Lecture 4: Primal Dual Matching Algorithm and Non-Bipartite Matching. 1 Primal/Dual Algorithm for weighted matchings in Bipartite Graphs CMPUT 675: Topics in Algorithms and Combinatorial Optimization (Fall 009) Lecture 4: Primal Dual Matching Algorithm and Non-Bipartite Matching Lecturer: Mohammad R. Salavatipour Date: Sept 15 and 17, 009

More information

Computational Geometry

Computational Geometry Windowing queries Windowing Windowing queries Zoom in; re-center and zoom in; select by outlining Windowing Windowing queries Windowing Windowing queries Given a set of n axis-parallel line segments, preprocess

More information

Lecture 7 February 26, 2010

Lecture 7 February 26, 2010 6.85: Advanced Data Structures Spring Prof. Andre Schulz Lecture 7 February 6, Scribe: Mark Chen Overview In this lecture, we consider the string matching problem - finding all places in a text where some

More information

Caging Polygons with Two and Three Fingers

Caging Polygons with Two and Three Fingers Caging Polygons with Two and Three Fingers Mostafa Vahedi and A. Frank van der Stappen Department of Information and Computing Sciences, Utrecht University, {vahedi, frankst}@cs.uu.nl Abstract: We present

More information

4.1.2 Merge Sort Sorting Lower Bound Counting Sort Sorting in Practice Solving Problems by Sorting...

4.1.2 Merge Sort Sorting Lower Bound Counting Sort Sorting in Practice Solving Problems by Sorting... Contents 1 Introduction... 1 1.1 What is Competitive Programming?... 1 1.1.1 Programming Contests.... 2 1.1.2 Tips for Practicing.... 3 1.2 About This Book... 3 1.3 CSES Problem Set... 5 1.4 Other Resources...

More information

THREE LECTURES ON BASIC TOPOLOGY. 1. Basic notions.

THREE LECTURES ON BASIC TOPOLOGY. 1. Basic notions. THREE LECTURES ON BASIC TOPOLOGY PHILIP FOTH 1. Basic notions. Let X be a set. To make a topological space out of X, one must specify a collection T of subsets of X, which are said to be open subsets of

More information

COUNTING AND PROBABILITY

COUNTING AND PROBABILITY CHAPTER 9 COUNTING AND PROBABILITY Copyright Cengage Learning. All rights reserved. SECTION 9.3 Counting Elements of Disjoint Sets: The Addition Rule Copyright Cengage Learning. All rights reserved. Counting

More information

Efficient algorithms for two extensions of LPF table: the power of suffix arrays

Efficient algorithms for two extensions of LPF table: the power of suffix arrays Efficient algorithms for two extensions of LPF table: the power of suffix arrays Maxime Crochemore 1,3, Costas S. Iliopoulos 1,4, Marcin Kubica 2, Wojciech Rytter 2,5, and Tomasz Waleń 2 1 Dept. of Computer

More information

1 The range query problem

1 The range query problem CS268: Geometric Algorithms Handout #12 Design and Analysis Original Handout #12 Stanford University Thursday, 19 May 1994 Original Lecture #12: Thursday, May 19, 1994 Topics: Range Searching with Partition

More information

Orthogonal Range Queries

Orthogonal Range Queries Orthogonal Range Piotr Indyk Range Searching in 2D Given a set of n points, build a data structure that for any query rectangle R, reports all points in R Kd-trees [Bentley] Not the most efficient solution

More information

Chordal Graphs: Theory and Algorithms

Chordal Graphs: Theory and Algorithms Chordal Graphs: Theory and Algorithms 1 Chordal graphs Chordal graph : Every cycle of four or more vertices has a chord in it, i.e. there is an edge between two non consecutive vertices of the cycle. Also

More information

Bijective counting of tree-rooted maps and shuffles of parenthesis systems

Bijective counting of tree-rooted maps and shuffles of parenthesis systems Bijective counting of tree-rooted maps and shuffles of parenthesis systems Olivier Bernardi Submitted: Jan 24, 2006; Accepted: Nov 8, 2006; Published: Jan 3, 2006 Mathematics Subject Classifications: 05A15,

More information

Frequency Covers for Strings

Frequency Covers for Strings Fundamenta Informaticae XXI (2018) 1 16 1 DOI 10.3233/FI-2016-0000 IOS Press Frequency Covers for Strings Neerja Mhaskar C Dept. of Computing and Software, McMaster University, Canada pophlin@mcmaster.ca

More information

Inverted Indexes. Indexing and Searching, Modern Information Retrieval, Addison Wesley, 2010 p. 5

Inverted Indexes. Indexing and Searching, Modern Information Retrieval, Addison Wesley, 2010 p. 5 Inverted Indexes Indexing and Searching, Modern Information Retrieval, Addison Wesley, 2010 p. 5 Basic Concepts Inverted index: a word-oriented mechanism for indexing a text collection to speed up the

More information

Linear-Space Data Structures for Range Minority Query in Arrays

Linear-Space Data Structures for Range Minority Query in Arrays Linear-Space Data Structures for Range Minority Query in Arrays Timothy M. Chan 1, Stephane Durocher 2, Matthew Skala 2, and Bryan T. Wilkinson 1 1 Cheriton School of Computer Science, University of Waterloo,

More information

The Visibility Problem and Binary Space Partition. (slides by Nati Srebro)

The Visibility Problem and Binary Space Partition. (slides by Nati Srebro) The Visibility Problem and Binary Space Partition (slides by Nati Srebro) The Visibility Problem b a c d e Algorithms Z-buffer: Draw objects in arbitrary order For each pixel, maintain distance to the

More information

Combinatorial Geometry & Approximation Algorithms

Combinatorial Geometry & Approximation Algorithms Combinatorial Geometry & Approximation Algorithms Timothy Chan U. of Waterloo PROLOGUE Analysis of Approx Factor in Analysis of Runtime in Computational Geometry Combinatorial Geometry Problem 1: Geometric

More information

Farthest-polygon Voronoi diagrams

Farthest-polygon Voronoi diagrams Farthest-polygon Voronoi diagrams Otfried Cheong, Hazel Everett, Marc Glisse, Joachim Gudmundsson, Samuel Hornus, Sylvain Lazard, Mira Lee and Hyeon-Suk Na ESA October 2007 KAIST, INRIA, NICTA, Soongsil

More information

Question Score Points Out Of 25

Question Score Points Out Of 25 University of Texas at Austin 6 May 2005 Department of Computer Science Theory in Programming Practice, Spring 2005 Test #3 Instructions. This is a 50-minute test. No electronic devices (including calculators)

More information

Union Find 11/2/2009. Union Find. Union Find via Linked Lists. Union Find via Linked Lists (cntd) Weighted-Union Heuristic. Weighted-Union Heuristic

Union Find 11/2/2009. Union Find. Union Find via Linked Lists. Union Find via Linked Lists (cntd) Weighted-Union Heuristic. Weighted-Union Heuristic Algorithms Union Find 16- Union Find Union Find Data Structures and Algorithms Andrei Bulatov In a nutshell, Kruskal s algorithm starts with a completely disjoint graph, and adds edges creating a graph

More information

A Fast Algorithm for Optimal Alignment between Similar Ordered Trees

A Fast Algorithm for Optimal Alignment between Similar Ordered Trees Fundamenta Informaticae 56 (2003) 105 120 105 IOS Press A Fast Algorithm for Optimal Alignment between Similar Ordered Trees Jesper Jansson Department of Computer Science Lund University, Box 118 SE-221

More information

Space-efficient Algorithms for Document Retrieval

Space-efficient Algorithms for Document Retrieval Space-efficient Algorithms for Document Retrieval Niko Välimäki and Veli Mäkinen Department of Computer Science, University of Helsinki, Finland. {nvalimak,vmakinen}@cs.helsinki.fi Abstract. We study the

More information

Longest Common Extensions in Trees

Longest Common Extensions in Trees Longest Common Extensions in Trees Philip Bille 1, Pawe l Gawrychowski 2, Inge Li Gørtz 1, Gad M. Landau 3,4, and Oren Weimann 3 1 DTU Informatics, Denmark. phbi@dtu.dk, inge@dtu.dk 2 University of Warsaw,

More information

ACO Comprehensive Exam October 12 and 13, Computability, Complexity and Algorithms

ACO Comprehensive Exam October 12 and 13, Computability, Complexity and Algorithms 1. Computability, Complexity and Algorithms Given a simple directed graph G = (V, E), a cycle cover is a set of vertex-disjoint directed cycles that cover all vertices of the graph. 1. Show that there

More information

Chapter 8. Voronoi Diagrams. 8.1 Post Oce Problem

Chapter 8. Voronoi Diagrams. 8.1 Post Oce Problem Chapter 8 Voronoi Diagrams 8.1 Post Oce Problem Suppose there are n post oces p 1,... p n in a city. Someone who is located at a position q within the city would like to know which post oce is closest

More information

CSci 231 Final Review

CSci 231 Final Review CSci 231 Final Review Here is a list of topics for the final. Generally you are responsible for anything discussed in class (except topics that appear italicized), and anything appearing on the homeworks.

More information

Linear-Space Data Structures for Range Frequency Queries on Arrays and Trees

Linear-Space Data Structures for Range Frequency Queries on Arrays and Trees Noname manuscript No. (will be inserted by the editor) Linear-Space Data Structures for Range Frequency Queries on Arrays and Trees Stephane Durocher Rahul Shah Matthew Skala Sharma V. Thankachan the date

More information

Efficient validation and construction of border arrays

Efficient validation and construction of border arrays Efficient validation and construction of border arrays Jean-Pierre Duval Thierry Lecroq Arnaud Lefebvre LITIS, University of Rouen, France, {Jean-Pierre.Duval,Thierry.Lecroq,Arnaud.Lefebvre}@univ-rouen.fr

More information

CS:3330 (22c:31) Algorithms

CS:3330 (22c:31) Algorithms What s an Algorithm? CS:3330 (22c:31) Algorithms Introduction Computer Science is about problem solving using computers. Software is a solution to some problems. Algorithm is a design inside a software.

More information

Rectangle-Efficient Aggregation in Spatial Data Streams

Rectangle-Efficient Aggregation in Spatial Data Streams Rectangle-Efficient Aggregation in Spatial Data Streams Srikanta Tirthapura Iowa State David Woodruff IBM Almaden The Data Stream Model Stream S of additive updates (i, Δ) to an underlying vector v: v

More information

Computing intersections in a set of line segments: the Bentley-Ottmann algorithm

Computing intersections in a set of line segments: the Bentley-Ottmann algorithm Computing intersections in a set of line segments: the Bentley-Ottmann algorithm Michiel Smid October 14, 2003 1 Introduction In these notes, we introduce a powerful technique for solving geometric problems.

More information

Union Find. Data Structures and Algorithms Andrei Bulatov

Union Find. Data Structures and Algorithms Andrei Bulatov Union Find Data Structures and Algorithms Andrei Bulatov Algorithms Union Find 6-2 Union Find In a nutshell, Kruskal s algorithm starts with a completely disjoint graph, and adds edges creating a graph

More information

Regular Languages and Regular Expressions

Regular Languages and Regular Expressions Regular Languages and Regular Expressions According to our definition, a language is regular if there exists a finite state automaton that accepts it. Therefore every regular language can be described

More information

by conservation of flow, hence the cancelation. Similarly, we have

by conservation of flow, hence the cancelation. Similarly, we have Chapter 13: Network Flows and Applications Network: directed graph with source S and target T. Non-negative edge weights represent capacities. Assume no edges into S or out of T. (If necessary, we can

More information

Lecture 6,

Lecture 6, Lecture 6, 4.16.2009 Today: Review: Basic Set Operation: Recall the basic set operator,!. From this operator come other set quantifiers and operations:!,!,!,! \ Set difference (sometimes denoted, a minus

More information

Discrete mathematics , Fall Instructor: prof. János Pach

Discrete mathematics , Fall Instructor: prof. János Pach Discrete mathematics 2016-2017, Fall Instructor: prof. János Pach - covered material - Lecture 1. Counting problems To read: [Lov]: 1.2. Sets, 1.3. Number of subsets, 1.5. Sequences, 1.6. Permutations,

More information

Dynamic Range Majority Data Structures

Dynamic Range Majority Data Structures Dynamic Range Majority Data Structures Amr Elmasry 1, Meng He 2, J. Ian Munro 3, and Patrick K. Nicholson 3 1 Department of Computer Science, University of Copenhagen, Denmark 2 Faculty of Computer Science,

More information

The Farthest Point Delaunay Triangulation Minimizes Angles

The Farthest Point Delaunay Triangulation Minimizes Angles The Farthest Point Delaunay Triangulation Minimizes Angles David Eppstein Department of Information and Computer Science UC Irvine, CA 92717 November 20, 1990 Abstract We show that the planar dual to the

More information

Lecture 6: External Interval Tree (Part II) 3 Making the external interval tree dynamic. 3.1 Dynamizing an underflow structure

Lecture 6: External Interval Tree (Part II) 3 Making the external interval tree dynamic. 3.1 Dynamizing an underflow structure Lecture 6: External Interval Tree (Part II) Yufei Tao Division of Web Science and Technology Korea Advanced Institute of Science and Technology taoyf@cse.cuhk.edu.hk 3 Making the external interval tree

More information

Data structures for totally monotone matrices

Data structures for totally monotone matrices Data structures for totally monotone matrices Submatrix maximum queries in (inverse) Monge matrices Input: n x n (inverse) Monge matrix M represented implicitly (constant time oracle access) Output: data

More information

University of Waterloo CS240R Fall 2017 Solutions to Review Problems

University of Waterloo CS240R Fall 2017 Solutions to Review Problems University of Waterloo CS240R Fall 2017 Solutions to Review Problems Reminder: Final on Tuesday, December 12 2017 Note: This is a sample of problems designed to help prepare for the final exam. These problems

More information

Lecture 1. 1 Notation

Lecture 1. 1 Notation Lecture 1 (The material on mathematical logic is covered in the textbook starting with Chapter 5; however, for the first few lectures, I will be providing some required background topics and will not be

More information

Lecture 15: The subspace topology, Closed sets

Lecture 15: The subspace topology, Closed sets Lecture 15: The subspace topology, Closed sets 1 The Subspace Topology Definition 1.1. Let (X, T) be a topological space with topology T. subset of X, the collection If Y is a T Y = {Y U U T} is a topology

More information

Lecture 12 March 4th

Lecture 12 March 4th Math 239: Discrete Mathematics for the Life Sciences Spring 2008 Lecture 12 March 4th Lecturer: Lior Pachter Scribe/ Editor: Wenjing Zheng/ Shaowei Lin 12.1 Alignment Polytopes Recall that the alignment

More information

Combinatorics I (Lecture 36)

Combinatorics I (Lecture 36) Combinatorics I (Lecture 36) February 18, 2015 Our goal in this lecture (and the next) is to carry out the combinatorial steps in the proof of the main theorem of Part III. We begin by recalling some notation

More information

Interleaving Schemes on Circulant Graphs with Two Offsets

Interleaving Schemes on Circulant Graphs with Two Offsets Interleaving Schemes on Circulant raphs with Two Offsets Aleksandrs Slivkins Department of Computer Science Cornell University Ithaca, NY 14853 slivkins@cs.cornell.edu Jehoshua Bruck Department of Electrical

More information

Introduction to Randomized Algorithms

Introduction to Randomized Algorithms Introduction to Randomized Algorithms Gopinath Mishra Advanced Computing and Microelectronics Unit Indian Statistical Institute Kolkata 700108, India. Organization 1 Introduction 2 Some basic ideas from

More information

4. Suffix Trees and Arrays

4. Suffix Trees and Arrays 4. Suffix Trees and Arrays Let T = T [0..n) be the text. For i [0..n], let T i denote the suffix T [i..n). Furthermore, for any subset C [0..n], we write T C = {T i i C}. In particular, T [0..n] is the

More information

Optimal detection of intersections between convex polyhedra

Optimal detection of intersections between convex polyhedra 1 2 Optimal detection of intersections between convex polyhedra Luis Barba Stefan Langerman 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Abstract For a polyhedron P in R d, denote by P its combinatorial complexity,

More information

String Matching. Pedro Ribeiro 2016/2017 DCC/FCUP. Pedro Ribeiro (DCC/FCUP) String Matching 2016/ / 42

String Matching. Pedro Ribeiro 2016/2017 DCC/FCUP. Pedro Ribeiro (DCC/FCUP) String Matching 2016/ / 42 String Matching Pedro Ribeiro DCC/FCUP 2016/2017 Pedro Ribeiro (DCC/FCUP) String Matching 2016/2017 1 / 42 On this lecture The String Matching Problem Naive Algorithm Deterministic Finite Automata Knuth-Morris-Pratt

More information