SPARK: Top-k Keyword Query in Relational Database
1 SPARK: Top-k Keyword Query in Relational Database
  Wei Wang, University of New South Wales, Australia
  20/03/2007
2 Outline
  - Demo & Introduction
  - Ranking
  - Query Evaluation
  - Conclusions
3 Demo
4 Demo
5 SPARK I
  - Searching, Probing & Ranking Top-k Results
  - Thesis project ( )
  - Taste of Research Summer Scholarship (2005)
  - Finally, CISRA prize winner
  - ering.php
6 SPARK II
  - Continued as a research project with PhD student Yi Luo
  - SIGMOD paper
  - Trying VLDB 2007 Demo now!
7 A Motivating Example
8 A Motivating Example
  Top-3 results in our system:
  1) Movies: Primetime Glick (2001) Tom Hanks/Ben Stiller (#2.1)
  2) Movies: Primetime Glick (2001) Tom Hanks/Ben Stiller (#2.1) - ActorPlay: Character = Himself - Actors: Hanks, Tom
  3) Actors: John Hanks - ActorPlay: Character = Alexander Kerst - Movies: Rosamunde Pilcher - Wind über dem Fluss (2001)
9 Improving the Effectiveness
  Three factors contribute to the final score of a search result (a joined tuple tree):
  - the (modified) IR ranking score
  - the completeness factor
  - the size normalization factor
10 Preliminaries
  - Data model: relation-based
  - Query model: joined tuple trees (JTTs)
  - Sophisticated ranking:
    - addresses one flaw in previous approaches
    - unifies AND and OR semantics
    - alternative size normalization
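11 Problems with DISCOVER2
  Scoring formula:
  $\sum_{t \in Q \cap D} \frac{1 + \ln(1 + \ln(tf))}{(1 - s) + s \cdot dl / avdl} \cdot qtf \cdot \ln\frac{N + 1}{df}$
  Simplified form:
  $\sum_{t \in Q \cap D} (1 + \ln(1 + \ln(tf))) \cdot \ln\frac{N + 1}{df}$
  Example (columns: score(c_i), score(p_j), score, signature, SPARK; only some values survive in the transcript):
  - c1 ⋈ p1: signature (1, 1), SPARK 0.98
  - c2 ⋈ p2: signature (0, 2)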
12 Virtual Document
  Combine tf contributions before tf normalization / attenuation.
  $\sum_{t \in Q \cap D} (1 + \ln(1 + \ln(tf))) \cdot \ln\frac{N + 1}{df}$
  Example (columns: score(maxtor), score(netvista), score_a; rows c1 ⋈ p1 and c2 ⋈ p2; values lost in transcription)
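The effect of combining tf contributions before attenuation can be sketched in Python. This is a minimal illustration, assuming the simplified weight (1 + ln(1 + ln(tf))) * idf shown on the slide; the function names, idf values and term frequencies are hypothetical and are not the numbers from the slide's table.

```python
import math

def attenuated_tf(tf):
    """1 + ln(1 + ln(tf)) for tf >= 1; a keyword that does not occur contributes 0."""
    return 1.0 + math.log(1.0 + math.log(tf)) if tf >= 1 else 0.0

def per_tuple_score(tuples, idf):
    """Attenuate each tuple's tf separately, then add up the tuple scores."""
    return sum(attenuated_tf(tf) * idf[w] for t in tuples for w, tf in t.items())

def virtual_document_score(tuples, idf):
    """Add the tf values of all joined tuples first, then attenuate once per keyword."""
    combined = {}
    for t in tuples:
        for w, tf in t.items():
            combined[w] = combined.get(w, 0) + tf
    return sum(attenuated_tf(tf) * idf[w] for w, tf in combined.items())

# Hypothetical idf values and term frequencies (assumptions for illustration only).
idf = {"netvista": 1.0, "maxtor": 1.0}
both_keywords = [{"netvista": 1}, {"maxtor": 1}]     # signature (1, 1)
one_keyword_twice = [{"maxtor": 1}, {"maxtor": 1}]   # signature (0, 2)

# Scoring tuples separately cannot tell the two results apart ...
print(per_tuple_score(both_keywords, idf), per_tuple_score(one_keyword_twice, idf))
# ... while the virtual document attenuates the repeated keyword.
print(virtual_document_score(both_keywords, idf),
      virtual_document_score(one_keyword_twice, idf))
```

With equal idf values, the per-tuple sum gives both results the same score, whereas the virtual-document score ranks the result covering both keywords higher.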
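13 Virtual Document Collection
  - Collection: 3 results
  - idf_netvista = ln(4/3), idf_maxtor = ln(4/2)
  - Estimate idf: idf_netvista = ε; the expression for idf_maxtor is not recoverable from the transcript (surviving fragment: ln(9/5))
  - $\sum_{t \in Q \cap D} \frac{1 + \ln(1 + \ln(tf))}{(1 - s) + s \cdot dl / avdl} \cdot qtf \cdot \ln\frac{N + 1}{df}$
  - Estimate avdl = avdl_C + avdl_P
  Example (column: score_a; rows c1 ⋈ p1 and c2 ⋈ p2; values lost in transcription)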
14 Completeness Factor
  - For short queries, users prefer results matching more keywords.
  - Derive the completeness factor from the extended Boolean model.
  - Measure the L_p distance to the ideal position.
  Figure (axes: netvista, maxtor; ideal position (1, 1); L_2 distances d = 0.5, d = 1, d = 1.41):
  - c1 ⋈ p1: d = 0.5, score_b = (1.41 - 0.5) / 1.41 = 0.65
  - c2 ⋈ p2: d = 1, score_b = (1.41 - 1) / 1.41 = 0.29
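The distance-based completeness factor can be sketched as follows. This is a minimal illustration, assuming each result is a point with one coordinate in [0, 1] per keyword and that the factor is the maximum possible L_p distance minus the actual distance to the ideal position, divided by the maximum. The function name and the two sample points are hypothetical; they were chosen only so that their L_2 distances match the d = 0.5 and d = 1 shown on the slide.

```python
def completeness(coords, p=2.0):
    """(max_dist - dist) / max_dist, where dist is the L_p distance to (1, ..., 1)."""
    m = len(coords)
    dist = sum((1.0 - x) ** p for x in coords) ** (1.0 / p)
    max_dist = m ** (1.0 / p)  # distance from the origin to the ideal position
    return (max_dist - dist) / max_dist

# Hypothetical coordinates at L_2 distance 0.5 and 1 from the ideal position (1, 1).
print(round(completeness([0.5, 1.0]), 2))  # 0.65
print(round(completeness([0.0, 1.0]), 2))  # 0.29
```

A larger p penalizes a completely missing keyword more strongly, which is how the extended Boolean model moves between OR-like and AND-like behaviour.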
15 Size Normalization
  - Results in large CNs tend to have more matches to the keywords.
  - $score_c = (1 + s_1 - s_1 \cdot |CN|) \cdot (1 + s_2 - s_2 \cdot |CN^{nf}|)$
  - Empirically, $s_1 = 0.15$ and $s_2 = 1 / (|Q| + 1)$ work well.
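The size normalization formula can be evaluated with a few lines of Python. This is a sketch under assumptions: it reads CN as the number of tuple sets in the candidate network, CN nf as the number of non-free tuple sets among them, and Q as the number of query keywords. The function and parameter names are hypothetical.

```python
def size_normalization(cn_size, nonfree_size, query_len, s1=0.15):
    """score_c = (1 + s1 - s1 * cn_size) * (1 + s2 - s2 * nonfree_size)."""
    s2 = 1.0 / (query_len + 1)  # empirical setting quoted on the slide
    return (1 + s1 - s1 * cn_size) * (1 + s2 - s2 * nonfree_size)

# A single-tuple result is not penalized; larger CNs are scaled down.
print(size_normalization(cn_size=1, nonfree_size=1, query_len=2))
print(size_normalization(cn_size=3, nonfree_size=2, query_len=2))
```

Both factors equal 1 when the corresponding size is 1 and shrink linearly as the CN grows, which offsets the tendency of large CNs to match more keywords.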
16 Putting 'em Together
  score(jtt) = score_a * score_b * score_c
  - a: IR score of the virtual document
  - b: completeness factor
  - c: size normalization factor
  Example (score_a * score_b; the products did not survive transcription):
  - c1 ⋈ p1: 0.98 * 0.65
  - c2 ⋈ p2: score_b = 0.29
17 Comparing Top-1 Results
  DBLP; Query = nikos clique
18 #Rel and R-Rank Results
  - DBLP: 18 queries; union of top-20 results
  - Mondial: 35 queries; union of top-20 results
  Tables (columns: #Rel, R-Rank; rows: DISCOVER, [Liu et al, SIGMOD06], and several settings of p; values lost in transcription)
21 Query Processing
  3 steps:
  1) Generate candidate tuples in every relation in the schema (using full-text indexes).
  2) Enumerate all possible Candidate Networks (CNs).
  3) Execute the CNs. Most algorithms differ here; the key is how to optimize for top-k retrieval.
22 Monotonic Scoring Function
  Execute a CN. Assume idf_netvista > idf_maxtor and k = 1.
  CN: P^Q ⋈ C^Q
  - c1 ⋈ p1: score(c_i) = 1.06, score(p_j) = 0.97, score = 2.03
  - c2 ⋈ p2: values lost in transcription
  Figure (DISCOVER2; axes P and C): c1 ⋈ p1 < c2 ⋈ p2
23 Non-Monotonic Scoring Function
  Execute a CN. Assume idf_netvista > idf_maxtor and k = 1.
  CN: P^Q ⋈ C^Q
  - c1 ⋈ p1: score(c_i) = 1.06, score(p_j) = 0.97, score_a = 0.98
  - c2 ⋈ p2: ??
  Figure (SPARK; axes P and C): c1 ⋈ p1 and c2 ⋈ p2
  1) Re-establish the early stopping criterion
  2) Check candidates in an optimal order
24 Upper Bounding Function
  Idea: use a monotonic and tight upper bounding function for SPARK's non-monotonic scoring function.
  Details:
  - $sumidf = \sum_w idf_w$
  - $watf(t) = \frac{1}{sumidf} \sum_w tf_w(t) \cdot idf_w$
  - $A = sumidf \cdot (1 + \ln(1 + \ln(\sum_t watf(t))))$
  - $B = sumidf \cdot \sum_t watf(t)$
  - then $score_a \le uscore_a = \frac{1}{1 - s} \cdot \min(A, B)$
  - score_b is monotonic w.r.t. watf(t)
  - score_c is constant given the CN
  - hence $score \le uscore$
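The bound on score a can be checked numerically with a short Python sketch. Several things here are assumptions rather than content of the slide: score a is taken to be the length-normalized virtual-document score from the earlier slides, s = 0.2 is an arbitrary sample value, the sketch falls back to B alone when the summed watf is below 1 (to keep the logarithms defined), and all function names and sample numbers are hypothetical.

```python
import math

def attenuate(x):
    """1 + ln(1 + ln(x)), defined for x >= 1."""
    return 1.0 + math.log(1.0 + math.log(x))

def watf(tf, idf):
    """idf-weighted average term frequency of one tuple."""
    sumidf = sum(idf.values())
    return sum(tf.get(w, 0) * idf[w] for w in idf) / sumidf

def uscore_a(tuples, idf, s=0.2):
    """Upper bound (1 / (1 - s)) * min(A, B) computed from per-tuple watf values."""
    sumidf = sum(idf.values())
    total = sum(watf(t, idf) for t in tuples)
    b = sumidf * total
    # Assumption of this sketch: A is only used when the argument of ln is >= 1.
    a = sumidf * attenuate(total) if total >= 1.0 else math.inf
    return min(a, b) / (1.0 - s)

def score_a(tuples, idf, s=0.2, dl=1.0, avdl=1.0):
    """Virtual-document score: combine tf per keyword, attenuate, normalize by length."""
    combined = {}
    for t in tuples:
        for w, tf in t.items():
            combined[w] = combined.get(w, 0) + tf
    norm = (1.0 - s) + s * dl / avdl
    return sum(attenuate(tf) * idf[w] for w, tf in combined.items() if tf >= 1) / norm

# Hypothetical tuples and idf values.
idf = {"netvista": 0.9, "maxtor": 0.4}
jtt = [{"netvista": 1}, {"maxtor": 2}]
print(score_a(jtt, idf), uscore_a(jtt, idf))
```

Because the bound depends on each tuple only through watf(t) and grows with it, tuples can be sorted once by watf and the bound then behaves monotonically along that order.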
25 Early Stopping Criterion
  Execute a CN. Assume idf_netvista > idf_maxtor and k = 1.
  CN: P^Q ⋈ C^Q
  - c1 ⋈ p1: uscore = 1.13, score_a = 0.98
  - c2 ⋈ p2: values lost in transcription
  Figure (SPARK; axes P and C): score(·) is compared against uscore(·) until the stop point.
  1) Re-establish the early stopping criterion
  2) Check candidates in an optimal order
26 Query Processing
  Execute the CNs [VLDB 03]
  - {P1, P2, ...} and {C1, C2, ...} have been sorted based on their IR relevance scores.
  - Score(Pi ⋈ Cj) = Score(Pi) + Score(Cj)
  CN: P^Q ⋈ C^Q
  Operations:
  - [P1, P1] ⋈ [C1, C1]
  - C.get_next() // a parametric SQL query is sent to the DBMS
  - [P1, P1] ⋈ C2
  - P.get_next()
  - P2 ⋈ [C1, C2]
  - P.get_next()
  - P3 ⋈ [C1, C2]
27 Skyline Sweeping Algorithm
  Execute the CNs
  Dominance: uscore(<P_i, C_j>) > uscore(<P_{i+1}, C_j>) and uscore(<P_i, C_j>) > uscore(<P_i, C_{j+1}>)
  CN: P^Q ⋈ C^Q
  Operations and priority queue contents:
  - initially: <P1, C1>
  - P1 ⋈ C1: <P2, C1>, <P1, C2>
  - P2 ⋈ C1: <P3, C1>, <P1, C2>, <P2, C2>
  - P3 ⋈ C1: <P1, C2>, <P2, C2>, <P4, C1>, <P3, C2>
  1) Re-establish the early stopping criterion
  2) Check candidates in an optimal order (sort of)
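The sweep over the priority queue can be sketched in Python for a two-relation CN. This is a simplified illustration, not the system's implementation: it assumes both lists are already sorted so that uscore decreases along each of them, that every pair joins successfully, and that the real score never exceeds uscore. It also uses a set to avoid pushing a pair twice, which is a shortcut chosen for this sketch. The function names and the toy scoring functions are hypothetical.

```python
import heapq

def skyline_sweep(P, C, uscore, score, k=1):
    """Return the k best (score, i, j) entries, checking pairs in decreasing uscore."""
    if not P or not C:
        return []
    queue = [(-uscore(P[0], C[0]), 0, 0)]  # max-heap via negated uscore
    pushed = {(0, 0)}
    top = []  # min-heap holding at most k real scores
    while queue:
        neg_u, i, j = heapq.heappop(queue)
        # Early stop: no unchecked candidate can beat the current k-th score.
        if len(top) == k and top[0][0] >= -neg_u:
            break
        s = score(P[i], C[j])  # the expensive check (a join query in the real system)
        if len(top) < k:
            heapq.heappush(top, (s, i, j))
        elif s > top[0][0]:
            heapq.heapreplace(top, (s, i, j))
        # Only the two pairs directly dominated by <P_i, C_j> become candidates.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(P) and nj < len(C) and (ni, nj) not in pushed:
                pushed.add((ni, nj))
                heapq.heappush(queue, (-uscore(P[ni], C[nj]), ni, nj))
    return sorted(top, reverse=True)

# Toy data: a monotonic upper bound and a non-monotonic real score below it.
P = [3, 2, 1]
C = [3, 2, 1]
upper = lambda p, c: p + c
real = lambda p, c: 2 * min(p, c)
print(skyline_sweep(P, C, upper, real, k=2))
```

On this toy input the sweep checks three of the nine pairs before the stopping test fires.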
28 Block Pipeline Algorithm
  Inherent deficiency of bounding a non-monotonic function with (a few) monotonic upper bounding functions (draw an example):
  - Lots of candidates with high uscores return much lower (real) scores.
  - Unnecessary (expensive) checking.
  - Cannot stop earlier.
  Idea:
  - Partition the space (into blocks) and derive tighter upper bounds for each partition.
  - Be unwilling to check a candidate until we are quite sure about its prospect (bscore).
29 Block Pipeline Algorithm
  Execute a CN. Assume idf_n > idf_m and k = 1.
  CN: P^Q ⋈ C^Q
  Figure (Block Pipeline; P and C are each split into blocks (n:1, m:0) and (n:0, m:1); table columns: Block, uscore, bscore, score_a; values lost in transcription; ends with stop!)
  1) Re-establish the early stopping criterion
  2) Check candidates in an optimal order
30 Efficiency
  DBLP, ~0.9M tuples in total; k = 10; PC: 1.8G, 512M
  Chart: time (ms) of Sparse, GP, SS and BP on queries DQ1 to DQ18
31 Efficiency
  DBLP, DQ; chart comparing Sparse, GP, SS and BP
32 Conclusions
  - A system that can perform effective & efficient keyword search on relational databases
  - Meaningful query results with appropriate rankings
  - Second-level response time for a ~10M-tuple DB (IMDB data) on a commodity PC
33 Q&A
  Thank you.
34 Backup Slides
  BANKS demo: -shashank//servlet/searchform
More informationTowards Efficient and Effective Semantic Table Interpretation Ziqi Zhang
Towards Efficient and Effective Semantic Table Interpretation Ziqi Zhang Department of Computer Science, University of Sheffield Outline Define semantic table interpretation State-of-the-art and motivation
More informationCMSC424: Database Design. Instructor: Amol Deshpande
CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons
More informationData about data is database Select correct option: True False Partially True None of the Above
Within a table, each primary key value. is a minimal super key is always the first field in each table must be numeric must be unique Foreign Key is A field in a table that matches a key field in another
More informationQuery Relaxation Using Malleable Schemas. Dipl.-Inf.(FH) Michael Knoppik
Query Relaxation Using Malleable Schemas Dipl.-Inf.(FH) Michael Knoppik Table Of Contents 1.Introduction 2.Limitations 3.Query Relaxation 4.Implementation Issues 5.Experiments 6.Conclusion Slide 2 1.Introduction
More informationEffective Semantic Search over Huge RDF Data
Effective Semantic Search over Huge RDF Data 1 Dinesh A. Zende, 2 Chavan Ganesh Baban 1 Assistant Professor, 2 Post Graduate Student Vidya Pratisthan s Kamanayan Bajaj Institute of Engineering & Technology,
More informationInformation Retrieval. CS630 Representing and Accessing Digital Information. What is a Retrieval Model? Basic IR Processes
CS630 Representing and Accessing Digital Information Information Retrieval: Retrieval Models Information Retrieval Basics Data Structures and Access Indexing and Preprocessing Retrieval Models Thorsten
More information1. Data Model, Categories, Schemas and Instances. Outline
Chapter 2: Database System Concepts and Architecture Outline Ramez Elmasri, Shamkant B. Navathe(2016) Fundamentals of Database Systems (7th Edition),pearson, isbn 10: 0-13-397077-9;isbn-13:978-0-13-397077-7.
More information