Generalizing the Optimality of Multi-Step k-nearest Neighbor Query Processing
1 Generalizing the Optimality of Multi-Step k-Nearest Neighbor Query Processing
SSTD 2007, Boston, U.S.A.
Hans-Peter Kriegel, Peer Kröger, Peter Kunath, Matthias Renz
Institute for Computer Science, University of Munich, Germany
2 Outline
1. Introduction
2. Multi-Step kNN Query Processing
3. $R_{I_{lu}}$-optimal Multi-Step kNN Search
4. Experimental Evaluation
5. Conclusions and Future Work
3 Introduction
Application domains: CAD/GIS, graphs, audio/video, time series
- Similarity search on complex objects involves costly distance measures
- Lower-bound (LB) and upper-bound (UB) filter distances
4 Lower-Bound & Upper-Bound Filter Distances
- LB/UB based on reference objects (e.g., M-tree): for a reference object $R$ with $d_A = d(A,R)$ and $d_B = d(B,R)$,
  $d_{min}(A,B) = |d_A - d_B|$ and $d_{max}(A,B) = d_A + d_B$
- LB/UB based on regions (e.g., R-tree): for objects $a$, $b$ inside the regions $A$, $B$,
  $d_{min}(A,B) \le d(a,b) \le d_{max}(A,B)$, i.e. LB $\le$ exact $\le$ UB; the bounds separate true drops from true hits
- LB/UB based on some application: $d_{min}(A,B)$ and $d_{max}(A,B)$ for time series A and time series B, derived from a conservative approximation of the amplitude values
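The reference-object bounds above follow from the triangle inequality. A minimal sketch, assuming a metric distance and precomputed distances to a single reference object; the function name `pivot_bounds` and the Euclidean example points are illustrative and not taken from the slides:

```python
import math

def pivot_bounds(d_a, d_b):
    """Bounds on d(A, B) from distances to one reference object R.

    d_a = d(A, R), d_b = d(B, R); valid for any metric d
    because of the triangle inequality.
    """
    lower = abs(d_a - d_b)   # d_min(A, B)
    upper = d_a + d_b        # d_max(A, B)
    return lower, upper

# Illustrative points in the plane (hypothetical data).
A, B, R = (0.0, 0.0), (3.0, 4.0), (0.0, 4.0)
lb, ub = pivot_bounds(math.dist(A, R), math.dist(B, R))
exact = math.dist(A, B)
print(lb, exact, ub)
```

The exact distance is only needed here to check the bounds; in a filter step only `d_a` and `d_b` would be looked up.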
6 Problem Definition
Definition (kNN query, kNN distance): Let $DB$ be a database, $q$ a query object and $k \in \mathbb{N}$ a query parameter. A kNN query in $DB$ returns the smallest set $NN_{DB}(q,k) \subseteq DB$ that contains at least $k$ objects from $DB$ and for which the following condition holds:
$\forall o \in NN_{DB}(q,k),\ \forall o' \in DB \setminus NN_{DB}(q,k):\ dist(q,o) < dist(q,o')$
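The definition can be read as "all objects no farther from q than the k-th smallest distance", so ties at the kNN distance are included. A brute-force sketch under that reading; the names `knn_query` and `nn_k_dist` and the one-dimensional example are assumptions made for illustration:

```python
def knn_query(db, q, k, dist):
    """Brute-force kNN query following the set-based definition.

    Returns (result, nn_k_dist): every object whose distance to q is
    at most the k-th smallest distance, so ties are kept.
    """
    if not 1 <= k <= len(db):
        raise ValueError("k must be between 1 and len(db)")
    distances = sorted(dist(q, o) for o in db)
    nn_k_dist = distances[k - 1]          # kNN distance of q
    result = [o for o in db if dist(q, o) <= nn_k_dist]
    return result, nn_k_dist

# Hypothetical 1-D example: two objects tie at the 2-NN distance.
db = [1, 2, 2, 5]
result, nn_k_dist = knn_query(db, 0, 2, lambda a, b: abs(a - b))
print(result, nn_k_dist)
```

Every exact distance is computed here, which is exactly the cost a multi-step approach tries to avoid.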
7 Multi-Step Query Processing
Indexing methods (i.e., single-step solutions) suffer from two drawbacks:
- the distance function has to be a metric
- the query predicate is evaluated many times
Apply a multi-step approach to reduce the number of candidates in a filter step.
[Figure: the multi-step query processor takes the query and the database; the filter step yields drops, candidates and hits; the refinement step turns candidates into the result]
8 Multi-Step kNN Algorithm Based on LB
The multi-step kNN algorithm proposed by [Seidl, Kriegel 98] uses a lower-bounding (LB) distance estimation in the filter step: for any query object $q$ and any $o \in DB$: $LB(q,o) \le dist(q,o)$

knn-optimal(DB, q, k)
  Ranking = initialize ranking for q;  // using the LB filter distance
  result = ∅;
  d_max = +∞;  // stop distance
  WHILE c = Ranking.getNext() AND LB(q, c) ≤ d_max DO
    IF dist(q, c) ≤ d_max THEN result.insert(dist(q, c), c);
    IF result.length ≥ k THEN d_max = result[k].key;
    remove all entries from result where key > d_max;  // true drops
  END WHILE;
  RETURN result;

The optimality w.r.t. the number of refined candidates is evaluated and formalized by the concept of R-optimality.
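The pseudocode above can be sketched in Python as follows. This is an illustration, not the authors' implementation: the ranking is simulated with a binary heap over precomputed lower bounds rather than an incremental index-based ranking, and the names `knn_multistep_lb`, `lb` and `dist` are assumptions.

```python
import bisect
import heapq
import math

def knn_multistep_lb(objects, k, lb, dist):
    """LB-based multi-step kNN search (sketch).

    lb(o) must never exceed dist(o). Returns the result objects in
    ascending distance order and the number of refinements.
    """
    # Ranking by ascending lower bound; the index i breaks ties.
    ranking = [(lb(o), i, o) for i, o in enumerate(objects)]
    heapq.heapify(ranking)
    result = []              # sorted list of (exact distance, i, object)
    d_max = math.inf         # stop distance
    refinements = 0
    while ranking and ranking[0][0] <= d_max:
        _, i, o = heapq.heappop(ranking)
        d = dist(o)          # refinement: exact distance
        refinements += 1
        if d <= d_max:
            bisect.insort(result, (d, i, o))
            if len(result) >= k:
                d_max = result[k - 1][0]
                # Remove true drops that fell behind the k-th distance.
                result = [e for e in result if e[0] <= d_max]
    return [o for _, _, o in result], refinements

# Hypothetical 1-D data, query at 0, lower bound = half the distance.
points = [5, 1, 3, 8, 2]
hits, refinements = knn_multistep_lb(
    points, 2, lb=lambda x: abs(x) / 2, dist=lambda x: abs(x))
print(hits, refinements)
```

In this example three of the five objects are refined: the search stops as soon as the next lower bound exceeds the current stop distance.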
10 Advantages of an Upper-Bound Filter
[Figure: LB-based kNN search vs. (LB+UB)-based kNN search for objects $o_1, \dots, o_{12}$; each object is shown with $LB(q,o)$, $dist(q,o)$ and, in the second panel, $UB(q,o)$ along the distance axis, relative to $nn_8\text{-dist}(q)$]
Using an additional upper-bound filter distance has several advantages:
- true hits can be identified without refinement
- the storage requirements of the priority list of the ranking are reduced
- true hits can be reported to the user immediately
11 Generalizing the Optimality
Definition (Generalized Optimality): Given an information class $I$ defining a set of distance approximations available in the filter step, a multi-step kNN algorithm is called $R_I$-optimal if it does not produce more candidates in the filter step than necessary.
[Figure: objects $o_1, \dots, o_{12}$ with $LB(q,o)$, $dist(q,o)$ and $UB(q,o)$ before and after refinement, together with $LB_k$, $UB_k$ and $nn_8\text{-dist}(q)$ on the distance axis; objects whose interval covers the kNN distance have to be refined, for the others refinement is not necessary]
Idea: refine only those objects which cover the kNN distance $nn_k\text{-dist}(q)$.
Theorem: there always exists at least one candidate $c$ with $LB(c) \le LB_k$ and $UB_k \le UB(c)$, i.e. one which covers $nn_k\text{-dist}(q)$.
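A small sketch of the covering idea, assuming that $LB_k$ and $UB_k$ denote the k-th smallest lower and upper bound over the current candidates (the slides do not spell this out). The names `kth_bounds` and `covering_candidates` and the example intervals are hypothetical:

```python
def kth_bounds(bounds, k):
    """k-th smallest lower bound and k-th smallest upper bound.

    bounds maps an object id to its (LB, UB) pair for a fixed query.
    The kNN distance always lies in [lb_k, ub_k].
    """
    lb_k = sorted(lb for lb, _ in bounds.values())[k - 1]
    ub_k = sorted(ub for _, ub in bounds.values())[k - 1]
    return lb_k, ub_k

def covering_candidates(bounds, k):
    """Objects whose interval [LB, UB] covers [lb_k, ub_k]."""
    lb_k, ub_k = kth_bounds(bounds, k)
    return [o for o, (lb, ub) in bounds.items()
            if lb <= lb_k and ub_k <= ub]

# Hypothetical filter intervals for four objects, k = 2.
bounds = {"a": (0.5, 1.5), "b": (1.5, 2.5), "c": (2.0, 5.0), "d": (3.5, 4.5)}
print(kth_bounds(bounds, 2), covering_candidates(bounds, 2))
```

With these intervals only object b covers the range that must contain the kNN distance, so it is the only one that would have to be refined in this round.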
12 $R_{I_{lu}}$-Optimality
Theorem ($R_{I_{lu}}$-Optimality): A multi-step kNN algorithm is $R_{I_{lu}}$-optimal iff it refines the candidate set
(Case 1) $\{o \in DB \mid LB(q,o) \le nn_k\text{-dist}(q) \le UB(q,o)\}$ if there are more than $k$ candidates $c \in DB$ with $LB(q,c) \le nn_k\text{-dist}(q)$, otherwise
(Case 2) $\{o \in DB \mid LB(q,o) \le nn_k\text{-dist}(q) < UB(q,o)\}$
[Figure: six cases of candidate intervals [LB, UB] relative to a threshold $\varepsilon$, annotated with correctness and R-optimality, shown for $\varepsilon < nn_k\text{-dist}(q)$ and for $nn_k\text{-dist}(q) < \varepsilon$]
If $\varepsilon \ne nn_k\text{-dist}(q)$, we lose correctness and $R_{I_{lu}}$-optimality.
13 Multi-Step kNN Algorithm Based on LB & UB

knn-optimal(DB, q, k)
  Ranking = initialize ranking for q;  // using the LB filter distance [Hjaltason, Samet 95]
  result = ∅; UB_k = +∞; LB_k = 0;
  Fetch first k candidates from the ranking;
  REPEAT
    // Step 1: fetch next candidate
    if LB_next ≤ LB_k then
      c = Ranking.getNext();
      insert c into candidates;  // only if LB(q,c) ≤ UB_k
    endif;
    update LB_k, UB_k, LB_next;
    // Step 2: identify true hits and true drops (Filter Step)
    for all c ∈ candidates do
      if UB(q,c) ≤ LB_k then insert c into result;  // true hit
      if LB(q,c) > UB_k then remove c from candidates;  // true drop
    end for;
    // Step 3: refine candidates (Refinement Step)
    if |result| + |candidates| ≤ k and LB_next > UB_k then
      insert all c ∈ candidates into result;  // stop condition
    else
      refine all c ∈ candidates where LB(q,c) ≤ LB_k ≤ UB_k ≤ UB(q,c),
      i.e. compute d_exact(q,c) and update LB(q,c) = UB(q,c) = d_exact(q,c);
    end if;
  UNTIL (|candidates| = 0 and LB_next > UB_k);
  RETURN result;
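The following is a simplified batch variant of the LB+UB idea, not a transcription of the slide's algorithm: it assumes all filter intervals are available up front instead of being fetched incrementally from a ranking, and it includes ties at the kNN distance. The names `knn_lb_ub`, `bounds` and `exact_dist` are hypothetical. In each round it recomputes $LB_k$ and $UB_k$, classifies true hits and true drops, and refines only undecided objects whose interval covers $[LB_k, UB_k]$.

```python
def knn_lb_ub(bounds, k, exact_dist):
    """Batch filter/refine kNN search using lower and upper bounds.

    bounds: dict id -> (LB, UB) with LB <= exact distance <= UB.
    exact_dist: callable id -> exact distance (the costly step).
    Returns (hits, refined): the kNN result ids and the refined ids.
    """
    if not 1 <= k <= len(bounds):
        raise ValueError("k must be between 1 and len(bounds)")
    cur = dict(bounds)       # working copy; refined entries become (d, d)
    refined = set()
    while True:
        lb_k = sorted(lb for lb, _ in cur.values())[k - 1]
        ub_k = sorted(ub for _, ub in cur.values())[k - 1]
        # True hits: upper bound cannot exceed the kNN distance.
        hits = {o for o, (_, ub) in cur.items() if ub <= lb_k}
        # Undecided: neither a true hit nor a true drop (LB > ub_k).
        undecided = [o for o, (lb, ub) in cur.items()
                     if ub > lb_k and lb <= ub_k]
        if not undecided:
            return hits, refined
        # Refine only unrefined objects covering [lb_k, ub_k].
        for o in undecided:
            lb, ub = cur[o]
            if o not in refined and lb <= lb_k and ub_k <= ub:
                d = exact_dist(o)
                cur[o] = (d, d)
                refined.add(o)

# Hypothetical example: exact distances and filter intervals, k = 2.
exact = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 10.0}
bounds = {"a": (0.5, 1.5), "b": (1.5, 2.5), "c": (2.0, 5.0),
          "d": (3.5, 4.5), "e": (9.0, 11.0)}
hits, refined = knn_lb_ub(bounds, 2, exact.__getitem__)
print(sorted(hits), sorted(refined))
```

In this example object a is reported as a hit without ever being refined, and d and e are dropped on their bounds alone; only b and c need an exact distance computation.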
15 Experimental Evaluation
Data sets used in the evaluation:

Data set    Road Network    Protein Graph     CAD data         Audio Timeseries
Size        18,263 nodes    1,128 proteins    35,950 voxels    2,400 clips
Distance    Dijkstra        Graph Kernel      Euclidean        DTW

Filter vs. Refinement: 1/300 1/ /150
16 Relative Number of Unrefined Candidates
[Plots: Road Network, Protein, Plane, Timeseries]
A significant share of the hits does not need to be refined.
17 Absolute Number of Refinements
[Plots: Road Network, Protein, Plane, Timeseries]
For high values of k, the number of refinements is reduced by 50%.
18 Pruning of the Priority Queue of the Ranking
[Plots: Road Network, Plane]
The space requirements of the priority queue can be reduced by more than 50%.
19 Number of Reported Results vs. Query Iterations
[Plots: Plane (k = 1000), Protein (k = 1000), Road Network (k = 250), Road Network (k = 1000)]
A significant portion of the result can be reported early.
21 Conclusions and Future Work
Our approach:
- Generalization of the traditional notion of R-optimality
- Multi-step approach that uses both a lower-bound and an upper-bound filter distance function
- The new approach drastically reduces the number of refinements, considerably reduces storage requirements, and allows the first hits to be reported very early
Future work:
- Integration into data mining techniques
Learning Models of Similarity: Metric and Kernel Learning Eric Heim, University of Pittsburgh Standard Machine Learning Pipeline Manually-Tuned Features Machine Learning Model Desired Output for Task Features
More informationBatch Nearest Neighbor Search for Video Retrieval
1 Batch Nearest Neighbor Search for Video Retrieval Jie Shao, Zi Huang, Heng Tao Shen, Xiaofang Zhou, Ee-Peng Lim, and Yijun Li EDICS: -KEEP Abstract To retrieve similar videos to a query clip from a large
More informationInstance-Based Learning: Nearest neighbor and kernel regression and classificiation
Instance-Based Learning: Nearest neighbor and kernel regression and classificiation Emily Fox University of Washington February 3, 2017 Simplest approach: Nearest neighbor regression 1 Fit locally to each
More informationBeyond Sliding Windows: Object Localization by Efficient Subwindow Search
Beyond Sliding Windows: Object Localization by Efficient Subwindow Search Christoph H. Lampert, Matthew B. Blaschko, & Thomas Hofmann Max Planck Institute for Biological Cybernetics Tübingen, Germany Google,
More informationSPATIAL RANGE QUERY. Rooma Rathore Graduate Student University of Minnesota
SPATIAL RANGE QUERY Rooma Rathore Graduate Student University of Minnesota SYNONYMS Range Query, Window Query DEFINITION Spatial range queries are queries that inquire about certain spatial objects related
More informationNearest and reverse nearest neighbor queries for moving objects
The VLDB Journal (2006) 15(3): 229 250 DOI 10.1007/s00778-005-0166-4 REGULAR PAPER Rimantas Benetis Christian S. Jensen Gytis Karčiauskas Simonas Šaltenis Nearest and reverse nearest neighbor queries for
More informationEffective and Efficient Indexing for Large Video Databases
Proc. 12. GI-Fachtagung für Datenbanksysteme in Business, Technologie und Web (BTW'07), Aachen, Germany, 2007 Effective and Efficient Indexing for Large Video Databases Christian Böhm Peter Kunath Alexey
More informationTree-Weighted Neighbors and Geometric k Smallest Spanning Trees
Tree-Weighted Neighbors and Geometric k Smallest Spanning Trees David Eppstein Department of Information and Computer Science University of California, Irvine, CA 92717 Tech. Report 92-77 July 7, 1992
More informationUsing Natural Clusters Information to Build Fuzzy Indexing Structure
Using Natural Clusters Information to Build Fuzzy Indexing Structure H.Y. Yue, I. King and K.S. Leung Department of Computer Science and Engineering The Chinese University of Hong Kong Shatin, New Territories,
More information