Automated Generation of Object Summaries from Relational Databases: A Novel Keyword Searching Paradigm GEORGIOS FAKAS


1 Automated Generation of Object Summaries from Relational Databases: A Novel Keyword Searching Paradigm GEORGIOS FAKAS Department of Computing and Mathematics, Manchester Metropolitan University Manchester, UK. g.fakas@mmu.ac.uk

2 Related Work: Web Search Engines: Keyword Search Kw Search: Peacock Result: A ranked set of web pages

4 Related Work: Keyword Search in Relational DBs. Full-text search (e.g. Oracle 9i Text); keyword searching in relational DBs (DISCOVER, BANKS). Kw Search: Leverling, Peacock. Result: e3-o2-c2, e4-o6-c2

5 A Novel Keyword Searching Paradigm: Object Summaries (OSs) Kw Search: Peacock Result: A Ranked set of OSs

6 A Novel Keyword Searching Paradigm: Object Summaries (OSs). Kw Search: Peacock. Result: a ranked set of OSs. Problems and challenges: how can we automatically (1) generate OSs, (2) produce size-l OSs and (3) rank OSs, liberating users from knowledge of (1) the schema and (2) a query language?

7 A Novel Keyword Searching Paradigm: Object Summaries (OSs)
1. Automated generation of OSs: Affinity
2. Generation of size-l OSs: efficient greedy algorithms
3. Ranking: ValueRank, a PageRank-inspired ranking system

8 OS Generation - Methodology. $t_{DS}$: a central tuple containing the Kw; tuples around $t_{DS}$ contain additional information about the Data Subject. $R_{DS}$: the corresponding central Relation; similarly, Relations around $R_{DS}$ contain additional information.

9 OS Generation - Methodology. KW-ID = Janet Leverling. [Figure: data graph of Northwind tuples around the central tuple, drawn from Territories (t1-t4), Region (r1, r2), Employees (e1-e4), EmployeeTerritories (et1-et4), Orders (o1-o7), Customers (c1-c3), Shippers (s1-s3), Order Details (od1-od6), Products (p1, p2), Suppliers (su1) and Categories (ca1).] $t_{DS}$: a central tuple containing the Kw; tuples around $t_{DS}$ contain additional information about the Data Subject. $R_{DS}$: the corresponding central Relation; similarly, Relations around $R_{DS}$ contain additional information.

12 OS Generation - Methodology. KW-ID = Janet Leverling. [Figure: the Northwind tuples grouped by Relation (Territories, Region, Employees, EmployeeTerritories, Orders, Customers, Shippers, Order Details, Products, Suppliers, Categories), forming $G_{DS}$.]

13 OS Generation - Methodology: $G_{DS}$. Problem: not all Relations in $G_{DS}$ are relevant. How do we decide (1) which Relations to select and (2) when to stop traversing? Solution: investigate relational semantics (schema connectivity, cardinality, relative cardinality, etc.) and quantify the Affinity of Relations.

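16 $Af_{R_i \to R_{DS}}$: Affinity of Relations to $R_{DS}$ in $G_{DS}$. Distance: physical (fd) and logical (ld), where ld = fd minus the number of M:N relationships traversed. E.g. Orders is closer to Employees than Customer and CustomerDemo are. Hubs create spurious shortcuts that lead to rather irrelevant or lateral information, e.g. $RC(R_1, R_2)$ on a path $R_{DS} \ldots$ N:1 $R_{hub}$ 1:M $R_2$.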

17 $Af_{R_i \to R_{DS}}$: Affinity of Relations to $R_{DS}$ in $G_{DS}$. Connectivity: schema connectivity ($Co_i$) and data-graph connectivity. Relative Cardinality ($RC_{i \to j}$) is the average number of tuples of $R_i$ that are connected with each tuple of $R_j$: for 1:M, $RC_{i \to j} = |R_i| / |R_j|$; for M:1, $RC_{i \to j} = 1$. Reverse Relative Cardinality ($RRC_{i \to j}$) is the reverse of $RC_{i \to j}$, i.e. $RRC_{i \to j} = RC_{j \to i}$.

18 $Af_{R_i \to R_{DS}}$: Affinity of Relations to $R_{DS}$ in $G_{DS}$. $DAf(R_i) = \{(m_1, w_1), (m_2, w_2), \ldots, (m_n, w_n)\}$, with $m_1 = f_1(ld_i)$, $m_2 = f_1(\log(10 \cdot rc_i))$, $m_3 = f_1(\log(10 \cdot rrc_i))$, $m_4 = f_1(\log(10 \cdot co_i))$ and $f_1(\alpha) = (11 - \alpha)/10$. For a hub-child, $m_1 = f_1(ld_i \cdot h_i)$ and $m_2 = f_1(rc_i)$.
Formula 1 (Semantic Affinity): the affinity of $R_i$ to $R_{DS}$, denoted $Af_{R_i \to R_{DS}}$, with respect to a schema and a database conforming to the schema, can be calculated with the following formula:
$$Af_{R_i \to R_{DS}} = \Big(\sum_j m_j w_j\Big) \cdot Af_{R_{Parent} \to R_{DS}}$$
where $Af_{R_{Parent} \to R_{DS}}$ is the affinity of $R_i$'s parent to $R_{DS}$, or 1 if $R_{Parent}$ is $R_{DS}$.
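Formula 1 can be illustrated with a minimal Python sketch. It assumes that the logarithm is base 10 (which reproduces the rounded metric 0.8 for RC = 92.2 listed later in the deck), that all weights equal 0.25, and that Orders is the parent of OrderDetails when Employees is $R_{DS}$. The function names `f1`, `metrics` and `affinity` are illustrative, and the hub-child variant is not modelled.

```python
import math

def f1(a):
    """Scaling function from the slide: f1(a) = (11 - a) / 10."""
    return (11 - a) / 10

def metrics(ld, rc, rrc, co):
    """Metrics m1..m4 for a non-hub relation (log base 10 is an assumption)."""
    return [
        f1(ld),
        f1(math.log10(10 * rc)),
        f1(math.log10(10 * rrc)),
        f1(math.log10(10 * co)),
    ]

def affinity(ms, weights, parent_affinity=1.0):
    """Formula 1: weighted sum of the metrics times the parent's affinity."""
    return sum(m * w for m, w in zip(ms, weights)) * parent_affinity

weights = [0.25] * 4  # common weight, as in the experimental setup

# Orders relative to Employees: ld=1, RC=92.2, RRC=1, Co=4.
m_order = metrics(ld=1, rc=92.2, rrc=1, co=4)
af_order = affinity(m_order, weights)  # parent is R_DS, so parent affinity is 1

# OrderDetails: ld=2, RC=2.5, RRC=1, Co=2, with Orders assumed as parent.
m_details = metrics(ld=2, rc=2.5, rrc=1, co=2)
af_details = affinity(m_details, weights, parent_affinity=af_order)
```

Because every factor is at most 1, the affinity can only shrink along a path away from $R_{DS}$. The values here are raw, not the normalized Affinity used in the experiments.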

19 $Af_{R_i \to R_{DS}}$: Affinity of Relations to $R_{DS}$ in $G_{DS}$. [Figure: $G_{DS}(\theta)$.]

20 Experimental Evaluation. Databases: MS Northwind and TPC-H. Measures: Precision, Recall and F-score. We compare the $G_{DS}$s and OSs produced by $G_{DS}(\theta)$ against $G_{DS}(h)$, where $G_{DS}(h)$ was proposed by 10 participants. Results: $G_{DS}$ average F-score 86.77; OS average F-score 83. [Figures: $G_{DS}$ and OS Precision, Recall and F-score (averages), both labelled <0.5, 0.4, 0.05, 0.05>, per $R_{DS}$: Customers, Employees, Suppliers, Shippers, Orders, Products (Northwind); Customer, Supplier, Parts, Orders, Nation, Region (TPC-H).]
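The three measures can be sketched as follows, assuming the standard set-based definitions and the F-score as the harmonic mean of Precision and Recall (the slides do not spell out the exact definitions). The relation sets below are hypothetical and only illustrate the comparison of a generated $G_{DS}(\theta)$ with a participant-proposed $G_{DS}(h)$.

```python
def precision_recall_f(produced, reference):
    """Set-based Precision, Recall and F-score of `produced` against `reference`."""
    produced, reference = set(produced), set(reference)
    hits = len(produced & reference)
    precision = hits / len(produced) if produced else 0.0
    recall = hits / len(reference) if reference else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score

# Hypothetical relation sets, not taken from the evaluation.
g_theta = {"Employees", "Orders", "Customers", "Territories"}
g_human = {"Employees", "Orders", "Customers", "Shippers", "Region"}
p, r, f = precision_recall_f(g_theta, g_human)
```

Here three of the four generated relations are also in the reference set, so Precision is 0.75 and Recall is 0.6.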

21 Affinity Ranking Correctness (averages). [Figure: Affinity ranking correctness per $R_{DS}$ for Northwind (Customers, Employees, Suppliers, Shippers, Orders, Products) and TPC-H (Customer, Supplier, Parts, Orders, Nation, Region); the measure is a percentage based on a distance $d$ between the Affinity-based and the $h$ ranks of each $R_i$.]

23 Generation of Size-l Object Summaries. Definition: a size-l OS keyword query consists of (1) a set of keywords and (2) a value for l. Example: Faloutsos with l=15. Query result: a partial OS comprising only l tuples that meets the following two criteria: (1) all l tuples are connected with the root of the OS tree and (2) the Importance of the size-l OS is the maximum. Importance of a size-l OS: $Im(\text{OS-size-}l) = \sum_i Im(OS, t_i)$. Local Importance of a tuple: $Im(OS, t_i) = Im(t_i) \cdot Af(t_i)$.

24 Generation of Size-l Object Summaries. 1. Brute-Force Algorithm: first generates the complete OS (i.e. |OS| tuple extractions, I/O), then considers all candidate size-l OSs in order to find the optimal one (exponential in-memory operations). A very expensive solution.
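A minimal sketch of the brute-force idea, assuming the complete OS is already in memory as a tree: `children` maps each tuple to its child tuples and `score` holds the local Importance $Im(OS, t_i)$ of every tuple. The data layout, function name and example values are illustrative only.

```python
from itertools import combinations

def brute_force_size_l(children, score, root, l):
    """Try every set of l tuples that contains the root and stays connected."""
    parent = {c: p for p, cs in children.items() for c in cs}
    others = [n for n in score if n != root]
    best, best_value = None, float("-inf")
    for combo in combinations(others, l - 1):
        chosen = set(combo) | {root}
        # Connected to the root iff every chosen tuple's parent is also chosen.
        if all(parent[n] in chosen for n in combo):
            value = sum(score[n] for n in chosen)
            if value > best_value:
                best, best_value = chosen, value
    return best, best_value

# Hypothetical OS tree: a is the root, b and c its children, d under b, e under c.
children = {"a": ["b", "c"], "b": ["d"], "c": ["e"]}
score = {"a": 1.0, "b": 0.2, "c": 0.5, "d": 0.9, "e": 0.1}
best, best_value = brute_force_size_l(children, score, "a", 3)
```

For l = 3 the optimal summary is {a, b, d} with Importance 2.1, even though b scores lower than c. The number of candidate sets grows combinatorially with |OS|, which is why this approach is so expensive.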

25 Generation of Size-l Object Summaries: Greedy Size-l OS Generation Algorithms. OS Property 1: $Im(OS, t_i)$ usually decreases with depth from $t_{DS}$.
2.1 Bottom-Up Pruning Size-l Algorithm: first generates the complete OS of k tuples (as in the brute-force algorithm), then prunes from the bottom of the tree the k-l leaf nodes with the currently smallest $Im(OS, t_i)$.
Lemma 1: when the nodes of an OS have local Importance scores that decrease monotonically with their distance from the root (i.e. the score of an ancestor is always greater than its children's), the algorithm returns the optimal size-l OS.
Efficiency: |OS| I/O operations but only log-linear in-memory operations; very efficient when k is not significantly bigger than l, since fewer pruning operations are required (k-l is smaller); considerably cheaper than the brute-force algorithm.
Correctness: very good approximations of the optimal size-l OS.
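The pruning step can be sketched in Python with a min-heap of the current leaves. This is a sketch under assumptions rather than the authors' implementation: the complete OS is given as a `children` mapping plus a `score` dictionary of local Importance values, and all names and values are illustrative.

```python
import heapq

def bottom_up_prune(children, score, root, l):
    """Repeatedly remove the leaf with the smallest local Importance until l tuples remain."""
    parent = {c: p for p, cs in children.items() for c in cs}
    child_count = {n: len(children.get(n, [])) for n in score}
    kept = set(score)
    # Min-heap holding the current leaves (the root is never pruned).
    leaves = [(score[n], n) for n in score if child_count[n] == 0 and n != root]
    heapq.heapify(leaves)
    while len(kept) > l and leaves:
        _, leaf = heapq.heappop(leaves)
        kept.remove(leaf)
        p = parent[leaf]
        child_count[p] -= 1
        if child_count[p] == 0 and p != root:
            heapq.heappush(leaves, (score[p], p))  # the parent has become a leaf
    return kept

# Hypothetical OS tree: a is the root, b and c its children, d under b, e under c.
children = {"a": ["b", "c"], "b": ["d"], "c": ["e"]}
score = {"a": 1.0, "b": 0.2, "c": 0.5, "d": 0.9, "e": 0.1}
summary = bottom_up_prune(children, score, "a", 3)
```

On this tree the algorithm prunes e and then c, leaving {a, b, d}. Each of the k-l removals costs one heap operation, which matches the log-linear in-memory cost noted above.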

26 Size-l OS Generation Bottom-Up Pruning Size-l Algorithm

37 Generation of Size-l Object Summaries: Greedy Size-l OS Generation Algorithms.
2.2 Top-Down Size-l Algorithm: uses a Priority Queue to build the OS by expanding the current tuple with the biggest local Importance score.
Lemma 2: when the nodes of an OS have local Importance scores that decrease monotonically with their distance from the root, the Top-Down Algorithm returns the optimal size-l OS.
Efficiency: more efficient than both aforementioned algorithms when l is significantly smaller than k, with fewer I/O operations (no need for the complete OS) and fewer in-memory operations. On the other hand, when k is not very big in comparison to l, this algorithm becomes more expensive than Bottom-Up Pruning.
Correctness: less effective, because expanding the tuple with the best current local Importance value will not always lead to a (near-)optimal solution.
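A minimal sketch of the priority-queue expansion, under stated assumptions: the `children` mapping stands in for the on-demand extraction of a tuple's children from the database, `score` holds local Importance values, and the function name and example values are illustrative.

```python
import heapq

def top_down_size_l(children, score, root, l):
    """Grow the summary from the root, always taking the best tuple in the queue."""
    selected = []
    queue = [(-score[root], root)]  # scores are negated because heapq is a min-heap
    while queue and len(selected) < l:
        _, node = heapq.heappop(queue)
        selected.append(node)
        # Only now are the children of the chosen tuple needed.
        for child in children.get(node, []):
            heapq.heappush(queue, (-score[child], child))
    return selected

# Hypothetical OS tree: a is the root, b and c its children, d under b, e under c.
children = {"a": ["b", "c"], "b": ["d"], "c": ["e"]}
score = {"a": 1.0, "b": 0.2, "c": 0.5, "d": 0.9, "e": 0.1}
summary = top_down_size_l(children, score, "a", 3)
importance = sum(score[t] for t in summary)
```

On this tree the result is a, c, b with Importance 1.7, whereas {a, b, d} would reach 2.1: the high-scoring tuple d is hidden behind the low-scoring b. Scores here are not monotonically decreasing with depth, so Lemma 2 does not apply.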

38 Size-l OS Generation Top-Down Size-l Algorithm

49 Top-Down v Bottom-Up Pruning Size-l Algorithm. 1. For small OSs, Bottom-Up Pruning is the best choice, since it always achieves better correctness and requires equal or even less time than the Top-Down Algorithm. 2. For larger OSs (e.g. |OS| > 300), there is a trade-off between (1) speed (Top-Down is at least twice as fast for any l < 50) and (2) correctness (Bottom-Up Pruning achieves at least 10% better correctness).

50 Experimental Evaluation
Database    Cardinalities  Size (MB)
DBLP        2,959,
TPC-H       8,661,245      1,100
Northwind   3,
Parameter   Range
$G_A$       $G_{A1}$, $G_{A2}$, $G_{A3}$
d ($d_1$, $d_2$, $d_3$)   0.85, 0.10, 0.99
All $G_{DS}(\theta)$s were generated with a common weight, i.e. $w_i = 0.25$, with $\theta = 0.70$ and normalized Affinity.

51 Experimental Evaluation: Efficiency of the two Size-l Algorithms. [Figure: time (s) against l for Top-Down and Bottom-Up Pruning; panels (a) DBLP Author (Aver |OS| = 486), (b) DBLP Paper (Aver |OS| = 377), (c) TPC-H Customer (Aver |OS| = 179), (d) TPC-H Supplier (Aver |OS| = 1426).]

52 Experimental Evaluation: Correctness of the two greedy algorithms. [Figure: correctness against l for Top-Down and Bottom-Up Pruning; panels (a) DBLP Author (Aver |OS| = 364), (b) DBLP Paper (Aver |OS| = 279), (c) TPC-H Customer (Aver |OS| = 179), (d) TPC-H Supplier (Aver |OS| = 1425), (e) DBLP Author (|OS| = 67), (f) DBLP Author (Aver |OS| = 364) across the settings that produced global Importance: GA1-d1, GA2-d1, GA3-d1, GA1-d2, GA1-d3.]

53 Experimental Evaluation: Effectiveness of Size-l OS for Northwind. [Figure: effectiveness against l for settings GA1-d1, GA2-d1, GA3-d1, GA1-d2 and GA1-d3; panels (a) DBLP Author, (b) DBLP Paper, (c) Northwind Employee, (d) Northwind Order, (e) Size-15 OS and (f) Size-30 OS for Author, Paper, Employee and Order.]

54 Conclusions - Novel Contributions
The formal definition of the novel search paradigm, which automatically produces OSs for a Data Subject: minimum contribution from the user (only a Kw); no prior knowledge of the DB schema or query language needed; excellent Precision, Recall and F-score results.
The formal definition and quantification of Relations' Affinity in the context of $G_{DS}$: considers both schema design and data distributions.
Generation of size-l OSs: efficient algorithms are proposed.

55 Preliminaries: ObjectRank. The ObjectRank of a node $v_i$ can be calculated with
$$r = dAr + \frac{(1-d)}{|S|} s$$
where $A_{ij} = \alpha(e)$ if there is an edge $e = (v_i \to v_j)$ in $E^A_D$ and 0 otherwise, $d$ controls the Base Set importance and $s = [s_1, \ldots, s_n]^T$ is the Base Set vector for $S$. [Figure: authority transfer schema graph over Conference, Year, Paper and Author, with edges such as cites (0.7) and cited (0).]

56 Global Ranking of Tuples ($Im(t_i)$): ValueRank. The ValueRank of a node $v_i$ can be calculated using the same formula:
$$r = dAr + \frac{(1-d)}{|S|} s$$
The $s_i$ of a node $v_i$ in $S$ can be calculated with $s_i = \alpha + \beta \cdot f(v_i)$. The Authority Transfer Edge rates, either forward or backward, denoted $\alpha(e)$, can be calculated with $\alpha(e) = \gamma + \delta \cdot f(v_i \to v_j)$, where $\alpha$, $\beta$, $\gamma$ and $\delta$ are tuning constants such that $\alpha + \beta \le 1$ and $\gamma + \delta \le 1$, and $f(\cdot)$ is a normalisation function of the values of $v_i$ and $v_j$ (in the range [0, 1] rather than just 1 as in the case of ObjectRank).
[Figure: Northwind authority transfer schema graph over Territories (3), Region (4), Employees (9), Categories (8), Orders (830), OrderDetails (2155), Customers (91), Products (77), Shippers (3) and Suppliers (29), with value-based annotations such as f(UnitPrice*Quantity), f(Price*Quantity), f(Price) and f(Freight).]
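The iteration behind this formula can be sketched in plain Python. Several choices here are assumptions rather than details from the slides: $f(\cdot)$ divides each value by the maximum value, every node belongs to the Base Set (so $|S|$ is the number of nodes), edge rates $\alpha(e)$ are supplied directly instead of being derived from $\gamma + \delta \cdot f$, and a fixed number of iterations replaces a convergence test. The name `value_rank` and all example data are hypothetical.

```python
def value_rank(values, edges, d=0.85, alpha=0.1, beta=0.9, iterations=100):
    """Iterate r = d*A*r + (1-d)/|S| * s, with s_i = alpha + beta * f(v_i)."""
    nodes = list(values)
    n = len(nodes)
    vmax = max(values.values())
    # f(.) normalises values into [0, 1]; dividing by the maximum is an assumption.
    s = {v: alpha + beta * (values[v] / vmax if vmax else 0.0) for v in nodes}
    r = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: (1 - d) * s[v] / n for v in nodes}
        for src, dst, rate in edges:
            nxt[dst] += d * rate * r[src]  # authority flows along each edge
        r = nxt
    return r

# Hypothetical tuples: two orders with different values and one customer.
values = {"o1": 10.0, "o2": 10.0, "c1": 0.0}
edges = [("o1", "c1", 0.5)]  # (source, target, transfer rate alpha(e))
ranks = value_rank(values, edges)
```

With no edges, each score reduces to $(1-d) \cdot s_i / |S|$, so tuples are ranked purely by their values; edges then add the connectivity component on top.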

57 Preliminary Evaluation: ValueRank v ObjectRank. [Table: ObjectRank, ValueRank and Total Orders {UnitPrice*Quantity, Freight, Price} per Tuple ID, for two Employee, one Shipper, two Product, two Customer (SAVEA, QUICK) and two Supplier tuples.] ObjectRank reflects connectivity; ValueRank reflects values and connectivity. Maximum values per relation are indicated in bold.

58 Local Ranking of Tuples ($Im(OS, t_i)$). The local Importance of each tuple $t_i$ of an OS can be calculated with
$$Im(OS, t_i) = Im(t_i)^{\alpha} \cdot Af(t_i)^{\beta}$$
where $Im(t_i)$ is the global Importance of $t_i$ (e.g. its ValueRank or ObjectRank), $Af(t_i)$ is the Affinity of $t_i$ to $t_{DS}$, and $\alpha$ and $\beta$ are tuning constants. Multiplying $Im(t_i)$ by $Af(t_i)$ reduces the Importance contribution of each tuple towards the overall $Im(OS)$.

59 Inter-Relation Tuple Ranking. [Table: summary of ValueRank for Northwind, giving the minimum, median and maximum per $R_i$: Employees, Territories, Region, Orders, Customer, Shipper, OrderDetails, Product, Supplier, Categories.] The results are based on GA_northwind and d=0.85. The earlier work on ObjectRank did not investigate inter-relation ranking of tuples in depth.

60 Inter-Relation Tuple Ranking: ValueRank v ObjectRank. Columns: Tuple ID, ValueRank, ObjectRank, Total Orders {UnitPrice*Quantity, Freight, Price}. Customer SAVEA ,673.4 Customer QUICK , Shipper ,185.3 Shipper Product ,984.2 Product ,296.0 Employee ,187.4 Employee , Supplier Supplier. ObjectRank reflects connectivity; ValueRank reflects values and connectivity. Maximum values per relation are indicated in bold.

61 $Af_{R_i \to R_{DS}}$: Affinity of Relations to $R_{DS}$ in $G_{DS}$
Columns: $R_i$; $ld_i$, $RC_i$, $RRC_i$, $Co_i$; $m_1 \ldots m_4$; $Af_{R_i}$ with rank ($r_{R_i}$) for $R_{DS}$ = Employees, Customer, Order and Shipper.
Employees: R_DS; R_DS; (3) 0.97 (4) 0.82 (4)
Employees (ReportsTo): 1, 1, 0.9, 4; 1, 1, 1, ; (5) 0.91 (5) 0.73 (5)
Employees (ReportedBy): 1, 0.9, 1, 4; 1, 1, 1, ; (7) 0.85 (7) 0.66 (7)
Territories: 1, 5.4, 1, 2; 1, 0.9, 1, ; (10) 0.66 (10) 0.51 (10)
Region: 2, 1, 13.2, 1; 0.9, 1, 0.88, ; (11) 0.59 (11) 0.43 (11)
Order: 1, 92.2, 1, 4; 1, 0.8, 1, ; (1) 1 (R_DS) 0.89 (1)
Customer: 2, 1, 9.1, 2; 0.9, 1, 0.9, ; (R_DS) 0.99 (1) 0.83 (2)
Shipper: 2, 1, 276.6, 1; 0.9, 1, 0.75, ; (2) 0.98 (2) 1 (R_DS)
OrderDetails: 2, 2.5, 1, 2; 0.9, 0.96, 1, ; (4) 0.97 (3) 0.82 (3)
Product: 3, 1, 43.9, 4; 0.8, 1, 0.83, ; (6) 0.91 (6) 0.73 (6)
Supplier: 4, 1, 1.6, 1; 0.7, 1, 0.9, ; (8) 0.82 (8) 0.62 (8)
Categories: 4, 1, 6.1, 1; 0.7, 1, 0.92, ; (9) 0.81 (9) 0.61 (9)
CustDemographics: 3, null, null, 1; 0.8, null, null, 1; Null Null Null Null


More information

Automated Generation of Personal Data Reports from Relational Databases

Automated Generation of Personal Data Reports from Relational Databases Journal of Information & Knowledge Management, Vol. 10, No. 2 (2011 193 208 #.c World Scienti c Publishing Co. DOI: 10.1142/S0219649211002936 Automated Generation of Personal Data Reports from Relational

More information

Heaps Goodrich, Tamassia. Heaps 1

Heaps Goodrich, Tamassia. Heaps 1 Heaps Heaps 1 Recall Priority Queue ADT A priority queue stores a collection of entries Each entry is a pair (key, value) Main methods of the Priority Queue ADT insert(k, x) inserts an entry with key k

More information

LECTURE NOTES OF ALGORITHMS: DESIGN TECHNIQUES AND ANALYSIS

LECTURE NOTES OF ALGORITHMS: DESIGN TECHNIQUES AND ANALYSIS Department of Computer Science University of Babylon LECTURE NOTES OF ALGORITHMS: DESIGN TECHNIQUES AND ANALYSIS By Faculty of Science for Women( SCIW), University of Babylon, Iraq Samaher@uobabylon.edu.iq

More information

Keyword search in databases: the power of RDBMS

Keyword search in databases: the power of RDBMS Keyword search in databases: the power of RDBMS 1 Introduc

More information

Geometric data structures:

Geometric data structures: Geometric data structures: Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade Sham Kakade 2017 1 Announcements: HW3 posted Today: Review: LSH for Euclidean distance Other

More information

st-orientations September 29, 2005

st-orientations September 29, 2005 st-orientations September 29, 2005 Introduction Let G = (V, E) be an undirected biconnected graph of n nodes and m edges. The main problem this chapter deals with is different algorithms for orienting

More information

FastQRE: Fast Query Reverse Engineering

FastQRE: Fast Query Reverse Engineering FastQRE: Fast Query Reverse Engineering Dmitri V. Kalashnikov AT&T Labs Research dvk@research.att.com Laks V.S. Lakshmanan University of British Columbia laks@cs.ubc.ca Divesh Srivastava AT&T Labs Research

More information

The Threshold Algorithm: from Middleware Systems to the Relational Engine

The Threshold Algorithm: from Middleware Systems to the Relational Engine IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL.?, NO.?,?? 1 The Threshold Algorithm: from Middleware Systems to the Relational Engine Nicolas Bruno Microsoft Research nicolasb@microsoft.com Hui(Wendy)

More information

Finding k-paths in Cycle Free Graphs

Finding k-paths in Cycle Free Graphs Finding k-paths in Cycle Free Graphs Aviv Reznik Under the Supervision of Professor Oded Goldreich Department of Computer Science and Applied Mathematics Weizmann Institute of Science Submitted for the

More information

Answering Aggregate Queries Over Large RDF Graphs

Answering Aggregate Queries Over Large RDF Graphs 1 Answering Aggregate Queries Over Large RDF Graphs Lei Zou, Peking University Ruizhe Huang, Peking University Lei Chen, Hong Kong University of Science and Technology M. Tamer Özsu, University of Waterloo

More information

FDB: A Query Engine for Factorised Relational Databases

FDB: A Query Engine for Factorised Relational Databases FDB: A Query Engine for Factorised Relational Databases Nurzhan Bakibayev, Dan Olteanu, and Jakub Zavodny Oxford CS Christan Grant cgrant@cise.ufl.edu University of Florida November 1, 2013 cgrant (UF)

More information

HOT asax: A Novel Adaptive Symbolic Representation for Time Series Discords Discovery

HOT asax: A Novel Adaptive Symbolic Representation for Time Series Discords Discovery HOT asax: A Novel Adaptive Symbolic Representation for Time Series Discords Discovery Ninh D. Pham, Quang Loc Le, Tran Khanh Dang Faculty of Computer Science and Engineering, HCM University of Technology,

More information

Knowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey

Knowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey Knowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey G. Shivaprasad, N. V. Subbareddy and U. Dinesh Acharya

More information

Information Retrieval. Wesley Mathew

Information Retrieval. Wesley Mathew Information Retrieval Wesley Mathew 30-11-2012 Introduction and motivation Indexing methods B-Tree and the B+ Tree R-Tree IR- Tree Location-aware top-k text query 2 An increasing amount of trajectory data

More information

Link Structure Analysis

Link Structure Analysis Link Structure Analysis Kira Radinsky All of the following slides are courtesy of Ronny Lempel (Yahoo!) Link Analysis In the Lecture HITS: topic-specific algorithm Assigns each page two scores a hub score

More information

Bidirectional Expansion For Keyword Search on Graph Databases

Bidirectional Expansion For Keyword Search on Graph Databases Bidirectional Expansion For Keyword Search on Graph Databases Varun Kacholia Shashank Pandit Soumen Chakrabarti S. Sudarshan Rushi Desai Hrishikesh Karambelkar Indian Institute of Technology, Bombay varunk@acm.org

More information

Reverse Engineering Aggregation Queries

Reverse Engineering Aggregation Queries Reverse Engineering Aggregation Queries Wei Chit Tan Meihui Zhang Hazem Elmeleegy 2 Divesh Srivastava 3 Singapore University of Technology and Design, 2 Turn Inc., 3 AT&T Labs-Research weichit tan@mymail.sutd.edu.sg,

More information

Classification and Regression Trees

Classification and Regression Trees Classification and Regression Trees David S. Rosenberg New York University April 3, 2018 David S. Rosenberg (New York University) DS-GA 1003 / CSCI-GA 2567 April 3, 2018 1 / 51 Contents 1 Trees 2 Regression

More information

Mining Rare Periodic-Frequent Patterns Using Multiple Minimum Supports

Mining Rare Periodic-Frequent Patterns Using Multiple Minimum Supports Mining Rare Periodic-Frequent Patterns Using Multiple Minimum Supports R. Uday Kiran P. Krishna Reddy Center for Data Engineering International Institute of Information Technology-Hyderabad Hyderabad,

More information

Basant Group of Institution

Basant Group of Institution Basant Group of Institution Visual Basic 6.0 Objective Question Q.1 In the relational modes, cardinality is termed as: (A) Number of tuples. (B) Number of attributes. (C) Number of tables. (D) Number of

More information

Heaps 2. Recall Priority Queue ADT. Heaps 3/19/14

Heaps 2. Recall Priority Queue ADT. Heaps 3/19/14 Heaps 3// Presentation for use with the textbook Data Structures and Algorithms in Java, th edition, by M. T. Goodrich, R. Tamassia, and M. H. Goldwasser, Wiley, 0 Heaps Heaps Recall Priority Queue ADT

More information

Exploring Econometric Model Selection Using Sensitivity Analysis

Exploring Econometric Model Selection Using Sensitivity Analysis Exploring Econometric Model Selection Using Sensitivity Analysis William Becker Paolo Paruolo Andrea Saltelli Nice, 2 nd July 2013 Outline What is the problem we are addressing? Past approaches Hoover

More information

ΕΠΛ660. Ανάκτηση µε το µοντέλο διανυσµατικού χώρου

ΕΠΛ660. Ανάκτηση µε το µοντέλο διανυσµατικού χώρου Ανάκτηση µε το µοντέλο διανυσµατικού χώρου Σηµερινό ερώτηµα Typically we want to retrieve the top K docs (in the cosine ranking for the query) not totally order all docs in the corpus can we pick off docs

More information

LECTURE 11 TREE TRAVERSALS

LECTURE 11 TREE TRAVERSALS DATA STRUCTURES AND ALGORITHMS LECTURE 11 TREE TRAVERSALS IMRAN IHSAN ASSISTANT PROFESSOR AIR UNIVERSITY, ISLAMABAD BACKGROUND All the objects stored in an array or linked list can be accessed sequentially

More information

Mahathma Gandhi University

Mahathma Gandhi University Mahathma Gandhi University BSc Computer science III Semester BCS 303 OBJECTIVE TYPE QUESTIONS Choose the correct or best alternative in the following: Q.1 In the relational modes, cardinality is termed

More information

Locality- Sensitive Hashing Random Projections for NN Search

Locality- Sensitive Hashing Random Projections for NN Search Case Study 2: Document Retrieval Locality- Sensitive Hashing Random Projections for NN Search Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade April 18, 2017 Sham Kakade

More information

looking ahead to see the optimum

looking ahead to see the optimum ! Make choice based on immediate rewards rather than looking ahead to see the optimum! In many cases this is effective as the look ahead variation can require exponential time as the number of possible

More information

Lecture 8 13 March, 2012

Lecture 8 13 March, 2012 6.851: Advanced Data Structures Spring 2012 Prof. Erik Demaine Lecture 8 13 March, 2012 1 From Last Lectures... In the previous lecture, we discussed the External Memory and Cache Oblivious memory models.

More information

Binary Trees, Binary Search Trees

Binary Trees, Binary Search Trees Binary Trees, Binary Search Trees Trees Linear access time of linked lists is prohibitive Does there exist any simple data structure for which the running time of most operations (search, insert, delete)

More information

LOCAL STRUCTURE AND DETERMINISM IN PROBABILISTIC DATABASES. Theodoros Rekatsinas, Amol Deshpande, Lise Getoor

LOCAL STRUCTURE AND DETERMINISM IN PROBABILISTIC DATABASES. Theodoros Rekatsinas, Amol Deshpande, Lise Getoor LOCAL STRUCTURE AND DETERMINISM IN PROBABILISTIC DATABASES Theodoros Rekatsinas, Amol Deshpande, Lise Getoor Motivation Probabilistic databases store, manage and query uncertain data Numerous applications

More information

Keyword query interpretation over structured data

Keyword query interpretation over structured data Keyword query interpretation over structured data Advanced Methods of IR Elena Demidova Materials used in the slides: Jeffrey Xu Yu, Lu Qin, Lijun Chang. Keyword Search in Databases. Synthesis Lectures

More information

Lecture 6: Suffix Trees and Their Construction

Lecture 6: Suffix Trees and Their Construction Biosequence Algorithms, Spring 2007 Lecture 6: Suffix Trees and Their Construction Pekka Kilpeläinen University of Kuopio Department of Computer Science BSA Lecture 6: Intro to suffix trees p.1/46 II:

More information

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics

CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics CS2223: Algorithms Sorting Algorithms, Heap Sort, Linear-time sort, Median and Order Statistics 1 Sorting 1.1 Problem Statement You are given a sequence of n numbers < a 1, a 2,..., a n >. You need to

More information

NED: An Inter-Graph Node Metric Based On Edit Distance

NED: An Inter-Graph Node Metric Based On Edit Distance NED: An Inter-Graph Node Metric Based On Edit Distance Haohan Zhu Facebook Inc. zhuhaohan@fb.com Xianrui Meng Apple Inc. xmeng@apple.com George Kollios Boston University gkollios@cs.bu.edu ABSTRACT Node

More information

N N Sudoku Solver. Sequential and Parallel Computing

N N Sudoku Solver. Sequential and Parallel Computing N N Sudoku Solver Sequential and Parallel Computing Abdulaziz Aljohani Computer Science. Rochester Institute of Technology, RIT Rochester, United States aaa4020@rit.edu Abstract 'Sudoku' is a logic-based

More information

Priority Queues. Priority Queues Trees and Heaps Representations of Heaps Algorithms on Heaps Building a Heap Heapsort.

Priority Queues. Priority Queues Trees and Heaps Representations of Heaps Algorithms on Heaps Building a Heap Heapsort. Priority Queues Trees and Heaps Representations of Heaps Algorithms on Heaps Building a Heap Heapsort Philip Bille Priority Queues Trees and Heaps Representations of Heaps Algorithms on Heaps Building

More information

01/01/2017. Chapter 5: The Relational Data Model and Relational Database Constraints: Outline. Chapter 5: Relational Database Constraints

01/01/2017. Chapter 5: The Relational Data Model and Relational Database Constraints: Outline. Chapter 5: Relational Database Constraints Chapter 5: The Relational Data Model and Relational Database Constraints: Outline Ramez Elmasri, Shamkant B. Navathe(2017) Fundamentals of Database Systems (7th Edition),pearson, isbn 10: 0-13-397077-9;isbn-13:978-0-13-397077-7.

More information

Search Techniques for Fourier-Based Learning

Search Techniques for Fourier-Based Learning Search Techniques for Fourier-Based Learning Adam Drake and Dan Ventura Computer Science Department Brigham Young University {acd2,ventura}@cs.byu.edu Abstract Fourier-based learning algorithms rely on

More information

8. Relational Calculus (Part II)

8. Relational Calculus (Part II) 8. Relational Calculus (Part II) Relational Calculus, as defined in the previous chapter, provides the theoretical foundations for the design of practical data sub-languages (DSL). In this chapter, we

More information

State Space Search. Many problems can be represented as a set of states and a set of rules of how one state is transformed to another.

State Space Search. Many problems can be represented as a set of states and a set of rules of how one state is transformed to another. State Space Search Many problems can be represented as a set of states and a set of rules of how one state is transformed to another. The problem is how to reach a particular goal state, starting from

More information

Constraint Satisfaction Problems

Constraint Satisfaction Problems Constraint Satisfaction Problems Search and Lookahead Bernhard Nebel, Julien Hué, and Stefan Wölfl Albert-Ludwigs-Universität Freiburg June 4/6, 2012 Nebel, Hué and Wölfl (Universität Freiburg) Constraint

More information

High Dimensional Indexing by Clustering

High Dimensional Indexing by Clustering Yufei Tao ITEE University of Queensland Recall that, our discussion so far has assumed that the dimensionality d is moderately high, such that it can be regarded as a constant. This means that d should

More information

Balanced BST. Balanced BSTs guarantee O(logN) performance at all times

Balanced BST. Balanced BSTs guarantee O(logN) performance at all times Balanced BST Balanced BSTs guarantee O(logN) performance at all times the height or left and right sub-trees are about the same simple BST are O(N) in the worst case Categories of BSTs AVL, SPLAY trees:

More information

A model of navigation history

A model of navigation history A model of navigation history Connor G. Brewster Alan Jeffrey August, 6 arxiv:68.5v [cs.se] 8 Aug 6 Abstract: Navigation has been a core component of the web since its inception: users and scripts can

More information

Learning Goals. CS221: Algorithms and Data Structures Lecture #3 Mind Your Priority Queues. Today s Outline. Back to Queues. Priority Queue ADT

Learning Goals. CS221: Algorithms and Data Structures Lecture #3 Mind Your Priority Queues. Today s Outline. Back to Queues. Priority Queue ADT CS: Algorithms and Data Structures Lecture # Mind Your Priority Queues Steve Wolfman 0W Learning Goals Provide examples of appropriate applications for priority queues. Describe efficient implementations

More information

Chapter 6. Dynamic Programming

Chapter 6. Dynamic Programming Chapter 6 Dynamic Programming CS 573: Algorithms, Fall 203 September 2, 203 6. Maximum Weighted Independent Set in Trees 6..0. Maximum Weight Independent Set Problem Input Graph G = (V, E) and weights

More information

Efficient Non-Sequential Access and More Ordering Choices in a Search Tree

Efficient Non-Sequential Access and More Ordering Choices in a Search Tree Efficient Non-Sequential Access and More Ordering Choices in a Search Tree Lubomir Stanchev Computer Science Department Indiana University - Purdue University Fort Wayne Fort Wayne, IN, USA stanchel@ipfw.edu

More information

SPARK: Top-k Keyword Query in Relational Database

SPARK: Top-k Keyword Query in Relational Database SPARK: Top-k Keyword Query in Relational Database Wei Wang University of New South Wales Australia 20/03/2007 1 Outline Demo & Introduction Ranking Query Evaluation Conclusions 20/03/2007 2 Demo 20/03/2007

More information

Composition Systems. Composition Systems. Contents. Contents. What s s in this paper. Introduction. On-Line Character Recognition.

Composition Systems. Composition Systems. Contents. Contents. What s s in this paper. Introduction. On-Line Character Recognition. Contents S. Geman, D.F. Potter, Z. Chi Presented by Haibin Ling 12. 2. 2003 Definition: Compositionality refers to the evident ability of humans to represent entities as hierarchies of parts,, with these

More information

Trees 2: Linked Representation, Tree Traversal, and Binary Search Trees

Trees 2: Linked Representation, Tree Traversal, and Binary Search Trees Trees 2: Linked Representation, Tree Traversal, and Binary Search Trees Linked representation of binary tree Again, as with linked list, entire tree can be represented with a single pointer -- in this

More information

Scalable Evaluation of k-nn Queries on Large Uncertain Graphs

Scalable Evaluation of k-nn Queries on Large Uncertain Graphs Scalable Evaluation of k-nn Queries on Large Uncertain Graphs Xiaodong Li 1, Reynold Cheng 1, Yixiang Fang 1, Jiafeng Hu 1, Silviu Maniu 2 1 The University of Hong Kong, China, 2 Université Paris-Sud,

More information

TREES cs2420 Introduction to Algorithms and Data Structures Spring 2015

TREES cs2420 Introduction to Algorithms and Data Structures Spring 2015 TREES cs2420 Introduction to Algorithms and Data Structures Spring 2015 1 administrivia 2 -assignment 7 due Thursday at midnight -asking for regrades through assignment 5 and midterm must be complete by

More information

Chapter 5. Binary Trees

Chapter 5. Binary Trees Chapter 5 Binary Trees Definitions and Properties A binary tree is made up of a finite set of elements called nodes It consists of a root and two subtrees There is an edge from the root to its children

More information

Implementation of Skyline Sweeping Algorithm

Implementation of Skyline Sweeping Algorithm Implementation of Skyline Sweeping Algorithm BETHINEEDI VEERENDRA M.TECH (CSE) K.I.T.S. DIVILI Mail id:veeru506@gmail.com B.VENKATESWARA REDDY Assistant Professor K.I.T.S. DIVILI Mail id: bvr001@gmail.com

More information

On The Complexity of Virtual Topology Design for Multicasting in WDM Trees with Tap-and-Continue and Multicast-Capable Switches

On The Complexity of Virtual Topology Design for Multicasting in WDM Trees with Tap-and-Continue and Multicast-Capable Switches On The Complexity of Virtual Topology Design for Multicasting in WDM Trees with Tap-and-Continue and Multicast-Capable Switches E. Miller R. Libeskind-Hadas D. Barnard W. Chang K. Dresner W. M. Turner

More information