Outline. Branch and Bound Algorithms for Nearest Neighbor Search: Lecture 1. Informal Statement. Chapter I

Size: px
Start display at page:

Download "Outline. Branch and Bound Algorithms for Nearest Neighbor Search: Lecture 1. Informal Statement. Chapter I"

Transcription

1 Branch and Bound Algorithms for Nearest Neighbor Search: Lecture Yury Lifshits htt://yury.name Steklov Institute of Mathematics at St.Petersburg California Institute of Technology Outline Welcome to Nearest Neighbors! Branch and Bound Methodology Around Vantage-Point Trees 4 Generalized Hyerlane Trees and Relatives 5 M-Trees / 6 / 6 Informal Statement Chater I Welcome to Nearest Neighbors! To rerocess a database of n objects so that given a uery object, one can effectively determine its nearest neighbors in database / 6 4 / 6

2 More Formally Search sace: object domain U, similarity function σ Inut: database S = {,..., n } U Query: U Task: find argmax i σ( i, ) Alications (/5) Information Retrieval Content-based retrieval (magnetic resonance images, tomograhy, CAD shaes, time series, texts) Selling correction Geograhic databases (ost-office roblem) Searching for similar DNA seuences Related ages web search Semantic search, concet matching 5 / 6 6 / 6 Alications (/5) Machine Learning Alications (/5) Data Mining knn classification rule: classify by majority of k nearest training examles. E.g. recognition of faces, fingerrints, seaker identity, otical characters Nearest-neighbor interolation Near-dulicate detection Plagiarism detection Comuting co-occurrence similarity (for detecting synonyms, uery extension, machine translation...) Key difference: Mostly, off-line roblems 7 / 6 8 / 6

3 Alications (4/5) Biartite Problems Alications (5/5) As a Subroutine Recommendation systems (most relevant movie to a set of already watched ones) Personalized news aggregation (most relevant news articles to a given user s rofile of interests) Behavioral targeting (most relevant ad for dislaying to a given user) Coding theory (maximum likelihood decoding) MPEG comression (searching for similar fragments in already comressed art) Clustering Key differences: Query and database objects have different nature Objects are described by features and connections 9 / 6 0 / 6 Variations of the Comutation Task Solution asects: Aroximate nearest neighbors Dynamic nearest neighbors: moving objects, deletes/inserts, changing similarity function Related roblems: Nearest neighbor: nearest museum to my hotel Reverse nearest neighbor: all museums for which my hotel is the nearest one Range ueries: all museums u to km from my hotel Closest air: closest air of museum and hotel Satial join: airs of hotels and museums which are at most km aart Multile nearest neighbors: nearest museums for each of these hotels Metric facility location: how to build hotels to minimize the sum of museum nearest hotel distances / 6 Brief History 908 Voronoi diagram 967 knn classification rule by Cover and Hart 97 Post-office roblem osed by Knuth 997 The aer by Kleinberg, beginning of rovable uer/lower bounds 006 Similarity Search book by Zezula, Amato, Dohnal and Batko 008 First International Worksho on Similarity Search. Consider submitting! / 6

4 Tutorial Outline Four lectures: Branch-and-bound: various tree-based data structures for general metric sace Other use of triangle ineuality: Walks, matrix methods, secific tricks for Euclidean sace Maing-based techniues: Locality-sensitive hashing, random rojections 4 Restrictions on inut: Intrinsic dimension, robabilistic analysis and oen roblems Chater II Branch and Bound Methodology Not covered: low-dimensional solutions, exerimental results, arallelization, I/O comlexity, lower bounds, alications / 6 4 / 6 General Metric Sace Tell me definition of metric sace M = (U, d), distance function d satisfies: Non negativity: s, t U : d(s, t) 0 Symmetry: s, t U : d(s, t) = d(t, s) Identity: d(s, t) = 0 s = t Triangle ineuality: r, s, t U : d(r, t) d(r, s) + d(s, t) Basic Examles: Arbitrary metric sace, oracle access to distance function k-dimensional Euclidean sace with Euclidean, weighted Euclidean, Manhattan or L metric Strings with Hamming or Levenshtein distance Metric Saces: More Examles Finite sets with Jaccard metric d(a, B) = A B A B Correlated dimensions: x M ȳ distance Hausdorff distance for sets Similarity saces (no triangle ineuality): Multidimensional vectors with scalar roduct similarity Biartite grah, co-citations similarity for vertices in one art Social networks with number of joint friends similarity 5 / 6 6 / 6

5 Branch and Bound: Search Hierarchy Branch and Bound: Range Search Database S = {,..., n } is reresented by a tree: (,,, 4, 5 ) Task: find all i d( i, ) r: (,,, 4, 5 ) Every node corresonds to a subset of S Root corresonds to S itself Children s sets cover arent s set Every node contains a descrition of its subtree roviding easy-comutable lower bound for d(, ) in the corresonding subset (,, ) ( 4, 5 ) (, ) 4 5 Make a deth-first traversal of search hierarchy At every node comute the lower bound for its subtree Prune branches with lower bounds above r (,, ) ( 4, 5 ) (, ) / 6 8 / 6 B&B: Nearest Neighbor Search Task: find argmin i d( i, ): Pick a random i, set NN := i, r NN := d( i, ) Start range search with r NN range Whenever meet such that d(, ) < r NN, udate NN :=, r NN := d(, ) B&B: Best Bin First Task: find argmin i d( i, ): Pick a random i, set NN := i, r NN := d( i, ) Put the root node into insection ueue Every time: take the node with a smallest lower bound from insection ueue, comute lower bounds for children subtrees 4 Insert children with lower bound below r NN into insection ueue; rune other children branches 5 Whenever meet such that d(, ) < r NN, udate NN :=, r NN := d(, ) 9 / 6 0 / 6

6 Some Tree-Based Data Structures Shere Rectangle Tree k-d-b tree Geometric near-neighbor access tree Excluded middle vantage oint forest mv-tree Fixed-height fixed-ueries tree Vantage-oint tree R -tree Burkhard-Keller tree BBD tree Voronoi tree Balanced asect ratio tree Metric tree v s -tree M-tree SS-tree R-tree Satial aroximation tree Multi-vantage oint tree Bisector tree mb-tree Generalized hyerlane tree Hybrid tree Slim tree Sill Tree Fixed ueries tree X-tree k-d tree Balltree Quadtree Octree SR-tree Post-office tree / 6 Chater III Vantage-Point Trees and Relatives / 6 Vantage-Point Partitioning Uhlmann 9, Yianilos 9: Choose some object in database (called ivot) Choose artitioning radius r Put all i such that d( i, ) r into inner art, others to the outer art 4 Recursively reeat Pruning Conditions For r-range search: If d(, ) > r + r rune the inner branch If d(, ) < r r rune the outer branch For r r d(, ) r + r we have to insect both branches r r r r r r r / 6 4 / 6

7 Variations of Vantage-Point Trees Burkhard-Keller tree: ivot used to divide the sace into m rings Burkhard&Keller 7 MVP-tree: use the same ivot for different nodes in one level Bozkaya&Ozsoyoglu 97 Post-office tree: use r + δ for inner branch, r δ for outer branch McNutt 7 Chater IV Generalized Hyerlane Trees and Relatives 5 / 6 6 / 6 Generalized Hyerlane Tree Partitioning techniue (Uhlmann 9): Pick two objects (called ivots) and Put all objects that are closer to than to to the left branch, others to the right branch Recursively reeat GH-Tree: Pruning Conditions For r-range search: If d(, ) > d(, ) + r rune the left branch If d(, ) < d(, ) r rune the right branch For d(, ) d(, ) r we have to insect both branches d(,) d(,) 7 / 6 8 / 6

8 Bisector trees Let s kee the covering radius for and left branch, for and right branch: useful information for stronger runing conditions Geometric Near-Neighbor Access Tree Brin 95: Variation: monotonous bisector tree (Noltemeier, Verbarg, Zirkelbach 9) always uses arent ivot as one of two children ivots Exercise: rove that covering radii are monotonically decrease in mb-trees 9 / 6 Use m ivots Branch i consists of objects for which i is the closest ivot Stores minimal and maximal distances from ivots to all brother -branches 0 / 6 M-tree: Data structure Chater V Ciaccia, Patella, Zezula 97: All database objects are stored in leaf nodes (buckets of fixed size) M-trees Every internal nodes has associated ivot, covering radius and legal range for number of children (e.g. -) Usual deth-first or best-first search Secial algorithms for insertions and deletions a-la B-tree / 6 / 6

9 M-tree: Insertions All insertions haen at the leaf nodes: Choose the leaf node using minimal exansion of covering radius rincile If the leaf node contains fewer than the maximum legal number of elements, there is room for one more. Insert; udate all covering radii Otherwise the leaf node is slit into two nodes Use two ivots generalized hyerlane artitioning Both ivots are added to the node s arent, which may cause it to be slit, and so on Exercises Prove that Jaccard distance d(a, B) = A B A B satisfies triangle ineuality Prove that covering radii are monotonically decrease in mb-trees Construct a database and a set of otential ueries in some multidimensional Euclidean sace for which all described data structures reuire Ω(n) nearest neighbor search time / 6 4 / 6 Highlights References Nearest neighbor search is fundamental for information retrieval, data mining, machine learning and recommendation systems Balls, generalized hyerlanes and Voronoi cells are used for sace artitioning Deth-first and Best-first strategies are used for search Thanks for your attention! Questions? Course homeage htt://simsearch.yury.name/tutorial.html Y. Lifshits The Homeage of Nearest Neighbors and Similarity Search htt://simsearch.yury.name P. Zezula, G. Amato, V. Dohnal, M. Batko Similarity Search: The Metric Sace Aroach. Sringer, 006. htt:// E. Chávez, G. Navarro, R. Baeza-Yates, J. L. Marrouín Searching in Metric Saces. ACM Comuting Surveys, 00. htt:// G.R. Hjaltason, H. Samet Index-driven similarity search in metric saces. ACM Transactions on Database Systems, 00 htt:// gateway.cfm.df 5 / 6 6 / 6

Branch and Bound. Algorithms for Nearest Neighbor Search: Lecture 1. Yury Lifshits

Branch and Bound. Algorithms for Nearest Neighbor Search: Lecture 1. Yury Lifshits Branch and Bound Algorithms for Nearest Neighbor Search: Lecture 1 Yury Lifshits http://yury.name Steklov Institute of Mathematics at St.Petersburg California Institute of Technology 1 / 36 Outline 1 Welcome

More information

Nearest Neighbor Search by Branch and Bound

Nearest Neighbor Search by Branch and Bound Nearest Neighbor Search by Branch and Bound Algorithmic Problems Around the Web #2 Yury Lifshits http://yury.name CalTech, Fall 07, CS101.2, http://yury.name/algoweb.html 1 / 30 Outline 1 Short Intro to

More information

Outline. Other Use of Triangle Inequality Algorithms for Nearest Neighbor Search: Lecture 2. Orchard s Algorithm. Chapter VI

Outline. Other Use of Triangle Inequality Algorithms for Nearest Neighbor Search: Lecture 2. Orchard s Algorithm. Chapter VI Other Use of Triangle Ineuality Algorithms for Nearest Neighbor Search: Lecture 2 Yury Lifshits http://yury.name Steklov Institute of Mathematics at St.Petersburg California Institute of Technology Outline

More information

Algorithms for Nearest Neighbors

Algorithms for Nearest Neighbors Algorithms for Nearest Neighbors Classic Ideas, New Ideas Yury Lifshits Steklov Institute of Mathematics at St.Petersburg http://logic.pdmi.ras.ru/~yura University of Toronto, July 2007 1 / 39 Outline

More information

Algorithms for Nearest Neighbors

Algorithms for Nearest Neighbors Algorithms for Nearest Neighbors State-of-the-Art Yury Lifshits Steklov Institute of Mathematics at St.Petersburg Yandex Tech Seminar, April 2007 1 / 28 Outline 1 Problem Statement Applications Data Models

More information

Shuigeng Zhou. May 18, 2016 School of Computer Science Fudan University

Shuigeng Zhou. May 18, 2016 School of Computer Science Fudan University Query Processing Shuigeng Zhou May 18, 2016 School of Comuter Science Fudan University Overview Outline Measures of Query Cost Selection Oeration Sorting Join Oeration Other Oerations Evaluation of Exressions

More information

Lecture 8: Orthogonal Range Searching

Lecture 8: Orthogonal Range Searching CPS234 Comutational Geometry Setember 22nd, 2005 Lecture 8: Orthogonal Range Searching Lecturer: Pankaj K. Agarwal Scribe: Mason F. Matthews 8.1 Range Searching The general roblem of range searching is

More information

Geometric data structures:

Geometric data structures: Geometric data structures: Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade Sham Kakade 2017 1 Announcements: HW3 posted Today: Review: LSH for Euclidean distance Other

More information

Range Searching. Data structure for a set of objects (points, rectangles, polygons) for efficient range queries.

Range Searching. Data structure for a set of objects (points, rectangles, polygons) for efficient range queries. Range Searching Data structure for a set of objects (oints, rectangles, olygons) for efficient range queries. Y Q Deends on tye of objects and queries. Consider basic data structures with broad alicability.

More information

We will then introduce the DT, discuss some of its fundamental properties and show how to compute a DT directly from a given set of points.

We will then introduce the DT, discuss some of its fundamental properties and show how to compute a DT directly from a given set of points. Voronoi Diagram and Delaunay Triangulation 1 Introduction The Voronoi Diagram (VD, for short) is a ubiquitious structure that aears in a variety of discilines - biology, geograhy, ecology, crystallograhy,

More information

Learning Motion Patterns in Crowded Scenes Using Motion Flow Field

Learning Motion Patterns in Crowded Scenes Using Motion Flow Field Learning Motion Patterns in Crowded Scenes Using Motion Flow Field Min Hu, Saad Ali and Mubarak Shah Comuter Vision Lab, University of Central Florida {mhu,sali,shah}@eecs.ucf.edu Abstract Learning tyical

More information

Clustering-Based Similarity Search in Metric Spaces with Sparse Spatial Centers

Clustering-Based Similarity Search in Metric Spaces with Sparse Spatial Centers Clustering-Based Similarity Search in Metric Spaces with Sparse Spatial Centers Nieves Brisaboa 1, Oscar Pedreira 1, Diego Seco 1, Roberto Solar 2,3, Roberto Uribe 2,3 1 Database Laboratory, University

More information

Lecture 3: Geometric Algorithms(Convex sets, Divide & Conquer Algo.)

Lecture 3: Geometric Algorithms(Convex sets, Divide & Conquer Algo.) Advanced Algorithms Fall 2015 Lecture 3: Geometric Algorithms(Convex sets, Divide & Conuer Algo.) Faculty: K.R. Chowdhary : Professor of CS Disclaimer: These notes have not been subjected to the usual

More information

Foundations of Multidimensional and Metric Data Structures

Foundations of Multidimensional and Metric Data Structures Foundations of Multidimensional and Metric Data Structures Hanan Samet University of Maryland, College Park ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE

More information

The Spatial Skyline Queries

The Spatial Skyline Queries The Satial Skyline Queries Mehdi Sharifzadeh Comuter Science Deartment University of Southern California Los Angeles, CA 90089-078 sharifza@usc.edu Cyrus Shahabi Comuter Science Deartment University of

More information

The Spatial Skyline Queries

The Spatial Skyline Queries Coffee sho The Satial Skyline Queries Mehdi Sharifzadeh and Cyrus Shahabi VLDB 006 Presented by Ali Khodaei Coffee sho Three friends Coffee sho Three friends Don t choose this lace is closer to each three

More information

COT5405: GEOMETRIC ALGORITHMS

COT5405: GEOMETRIC ALGORITHMS COT5405: GEOMETRIC ALGORITHMS Objects: Points in, Segments, Lines, Circles, Triangles Polygons, Polyhedra R n Alications Vision, Grahics, Visualizations, Databases, Data mining, Networks, GIS Scientific

More information

Pivot Selection for Dimension Reduction Using Annealing by Increasing Resampling *

Pivot Selection for Dimension Reduction Using Annealing by Increasing Resampling * ivot Selection for Dimension Reduction Using Annealing by Increasing Resamling * Yasunobu Imamura 1, Naoya Higuchi 1, Tetsuji Kuboyama 2, Kouichi Hirata 1 and Takeshi Shinohara 1 1 Kyushu Institute of

More information

CMSC 425: Lecture 16 Motion Planning: Basic Concepts

CMSC 425: Lecture 16 Motion Planning: Basic Concepts : Lecture 16 Motion lanning: Basic Concets eading: Today s material comes from various sources, including AI Game rogramming Wisdom 2 by S. abin and lanning Algorithms by S. M. LaValle (Chats. 4 and 5).

More information

Continuous Visible k Nearest Neighbor Query on Moving Objects

Continuous Visible k Nearest Neighbor Query on Moving Objects Continuous Visible k Nearest Neighbor Query on Moving Objects Yaniu Wang a, Rui Zhang b, Chuanfei Xu a, Jianzhong Qi b, Yu Gu a, Ge Yu a, a Deartment of Comuter Software and Theory, Northeastern University,

More information

doc. RNDr. Tomáš Skopal, Ph.D. Department of Software Engineering, Faculty of Information Technology, Czech Technical University in Prague

doc. RNDr. Tomáš Skopal, Ph.D. Department of Software Engineering, Faculty of Information Technology, Czech Technical University in Prague Praha & EU: Investujeme do vaší budoucnosti Evropský sociální fond course: Searching the Web and Multimedia Databases (BI-VWM) Tomáš Skopal, 2011 SS2010/11 doc. RNDr. Tomáš Skopal, Ph.D. Department of

More information

Computational Geometry: Proximity and Location

Computational Geometry: Proximity and Location 63 Comutational Geometry: Proximity and Location 63.1 Introduction Proximity and location are fundamental concets in geometric comutation. The term roximity refers informally to the uality of being close

More information

Efficient Parallel Hierarchical Clustering

Efficient Parallel Hierarchical Clustering Efficient Parallel Hierarchical Clustering Manoranjan Dash 1,SimonaPetrutiu, and Peter Scheuermann 1 Deartment of Information Systems, School of Comuter Engineering, Nanyang Technological University, Singaore

More information

Introduction to Visualization and Computer Graphics

Introduction to Visualization and Computer Graphics Introduction to Visualization and Comuter Grahics DH2320, Fall 2015 Prof. Dr. Tino Weinkauf Introduction to Visualization and Comuter Grahics Grids and Interolation Next Tuesday No lecture next Tuesday!

More information

CS2 Algorithms and Data Structures Note 8

CS2 Algorithms and Data Structures Note 8 CS2 Algorithms and Data Structures Note 8 Heasort and Quicksort We will see two more sorting algorithms in this lecture. The first, heasort, is very nice theoretically. It sorts an array with n items in

More information

Cross products. p 2 p. p p1 p2. p 1. Line segments The convex combination of two distinct points p1 ( x1, such that for some real number with 0 1,

Cross products. p 2 p. p p1 p2. p 1. Line segments The convex combination of two distinct points p1 ( x1, such that for some real number with 0 1, CHAPTER 33 Comutational Geometry Is the branch of comuter science that studies algorithms for solving geometric roblems. Has alications in many fields, including comuter grahics robotics, VLSI design comuter

More information

Lecture 2: Fixed-Radius Near Neighbors and Geometric Basics

Lecture 2: Fixed-Radius Near Neighbors and Geometric Basics structure arises in many alications of geometry. The dual structure, called a Delaunay triangulation also has many interesting roerties. Figure 3: Voronoi diagram and Delaunay triangulation. Search: Geometric

More information

Nearest neighbors. Focus on tree-based methods. Clément Jamin, GUDHI project, Inria March 2017

Nearest neighbors. Focus on tree-based methods. Clément Jamin, GUDHI project, Inria March 2017 Nearest neighbors Focus on tree-based methods Clément Jamin, GUDHI project, Inria March 2017 Introduction Exact and approximate nearest neighbor search Essential tool for many applications Huge bibliography

More information

Randomized algorithms: Two examples and Yao s Minimax Principle

Randomized algorithms: Two examples and Yao s Minimax Principle Randomized algorithms: Two examles and Yao s Minimax Princile Maximum Satisfiability Consider the roblem Maximum Satisfiability (MAX-SAT). Bring your knowledge u-to-date on the Satisfiability roblem. Maximum

More information

Experiments on Patent Retrieval at NTCIR-4 Workshop

Experiments on Patent Retrieval at NTCIR-4 Workshop Working Notes of NTCIR-4, Tokyo, 2-4 June 2004 Exeriments on Patent Retrieval at NTCIR-4 Worksho Hironori Takeuchi Λ Naohiko Uramoto Λy Koichi Takeda Λ Λ Tokyo Research Laboratory, IBM Research y National

More information

Directed File Transfer Scheduling

Directed File Transfer Scheduling Directed File Transfer Scheduling Weizhen Mao Deartment of Comuter Science The College of William and Mary Williamsburg, Virginia 387-8795 wm@cs.wm.edu Abstract The file transfer scheduling roblem was

More information

split split (a) (b) split split (c) (d)

split split (a) (b) split split (c) (d) International Journal of Foundations of Comuter Science c World Scientic Publishing Comany ON COST-OPTIMAL MERGE OF TWO INTRANSITIVE SORTED SEQUENCES JIE WU Deartment of Comuter Science and Engineering

More information

Index-Driven Similarity Search in Metric Spaces

Index-Driven Similarity Search in Metric Spaces Index-Driven Similarity Search in Metric Spaces GISLI R. HJALTASON and HANAN SAMET University of Maryland, College Park, Maryland Similarity search is a very important operation in multimedia databases

More information

Lecture 24: Image Retrieval: Part II. Visual Computing Systems CMU , Fall 2013

Lecture 24: Image Retrieval: Part II. Visual Computing Systems CMU , Fall 2013 Lecture 24: Image Retrieval: Part II Visual Computing Systems Review: K-D tree Spatial partitioning hierarchy K = dimensionality of space (below: K = 2) 3 2 1 3 3 4 2 Counts of points in leaf nodes Nearest

More information

Lecture 18. Today, we will discuss developing algorithms for a basic model for parallel computing the Parallel Random Access Machine (PRAM) model.

Lecture 18. Today, we will discuss developing algorithms for a basic model for parallel computing the Parallel Random Access Machine (PRAM) model. U.C. Berkeley CS273: Parallel and Distributed Theory Lecture 18 Professor Satish Rao Lecturer: Satish Rao Last revised Scribe so far: Satish Rao (following revious lecture notes quite closely. Lecture

More information

CENTRAL AND PARALLEL PROJECTIONS OF REGULAR SURFACES: GEOMETRIC CONSTRUCTIONS USING 3D MODELING SOFTWARE

CENTRAL AND PARALLEL PROJECTIONS OF REGULAR SURFACES: GEOMETRIC CONSTRUCTIONS USING 3D MODELING SOFTWARE CENTRAL AND PARALLEL PROJECTIONS OF REGULAR SURFACES: GEOMETRIC CONSTRUCTIONS USING 3D MODELING SOFTWARE Petra Surynková Charles University in Prague, Faculty of Mathematics and Physics, Sokolovská 83,

More information

CS2 Algorithms and Data Structures Note 8

CS2 Algorithms and Data Structures Note 8 CS2 Algorithms and Data Structures Note 8 Heasort and Quicksort We will see two more sorting algorithms in this lecture. The first, heasort, is very nice theoretically. It sorts an array with n items in

More information

Relations with Relation Names as Arguments: Algebra and Calculus. Kenneth A. Ross. Columbia University.

Relations with Relation Names as Arguments: Algebra and Calculus. Kenneth A. Ross. Columbia University. Relations with Relation Names as Arguments: Algebra and Calculus Kenneth A. Ross Columbia University kar@cs.columbia.edu Abstract We consider a version of the relational model in which relation names may

More information

DB-GNG: A constructive Self-Organizing Map based on density

DB-GNG: A constructive Self-Organizing Map based on density DB-GNG: A constructive Self-Organizing Map based on density Alexander Ocsa, Carlos Bedregal, Ernesto Cuadros-Vargas Abstract Nowadays applications require efficient and fast techniques due to the growing

More information

INFLUENCE POWER-BASED CLUSTERING ALGORITHM FOR MEASURE PROPERTIES IN DATA WAREHOUSE

INFLUENCE POWER-BASED CLUSTERING ALGORITHM FOR MEASURE PROPERTIES IN DATA WAREHOUSE The International Archives of the Photogrammetry, Remote Sensing and Satial Information Sciences, Vol. 38, Part II INFLUENCE POWER-BASED CLUSTERING ALGORITHM FOR MEASURE PROPERTIES IN DATA WAREHOUSE Min

More information

Algorithms for Euclidean TSP

Algorithms for Euclidean TSP This week, paper [2] by Arora. See the slides for figures. See also http://www.cs.princeton.edu/~arora/pubs/arorageo.ps Algorithms for Introduction This lecture is about the polynomial time approximation

More information

Analyzing Borders Between Partially Contradicting Fuzzy Classification Rules

Analyzing Borders Between Partially Contradicting Fuzzy Classification Rules Analyzing Borders Between Partially Contradicting Fuzzy Classification Rules Andreas Nürnberger, Aljoscha Klose, Rudolf Kruse University of Magdeburg, Faculty of Comuter Science 39106 Magdeburg, Germany

More information

Clustering Billions of Images with Large Scale Nearest Neighbor Search

Clustering Billions of Images with Large Scale Nearest Neighbor Search Clustering Billions of Images with Large Scale Nearest Neighbor Search Ting Liu, Charles Rosenberg, Henry A. Rowley IEEE Workshop on Applications of Computer Vision February 2007 Presented by Dafna Bitton

More information

Locality- Sensitive Hashing Random Projections for NN Search

Locality- Sensitive Hashing Random Projections for NN Search Case Study 2: Document Retrieval Locality- Sensitive Hashing Random Projections for NN Search Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade April 18, 2017 Sham Kakade

More information

Image Segmentation Using Topological Persistence

Image Segmentation Using Topological Persistence Image Segmentation Using Toological Persistence David Letscher and Jason Fritts Saint Louis University Deartment of Mathematics and Comuter Science {letscher, jfritts}@slu.edu Abstract. This aer resents

More information

Privacy Preserving Moving KNN Queries

Privacy Preserving Moving KNN Queries Privacy Preserving Moving KNN Queries arxiv:4.76v [cs.db] 4 Ar Tanzima Hashem Lars Kulik Rui Zhang National ICT Australia, Deartment of Comuter Science and Software Engineering University of Melbourne,

More information

Proximity Searching in High Dimensional Spaces with a Proximity Preserving Order

Proximity Searching in High Dimensional Spaces with a Proximity Preserving Order Proximity Searching in High Dimensional Spaces with a Proximity Preserving Order E. Chávez 1, K. Figueroa 1,2, and G. Navarro 2 1 Facultad de Ciencias Físico-Matemáticas, Universidad Michoacana, México.

More information

Chapter 1 - Basic Equations

Chapter 1 - Basic Equations 2.20 - Marine Hydrodynamics, Sring 2005 Lecture 2 2.20 Marine Hydrodynamics Lecture 2 Chater 1 - Basic Equations 1.1 Descrition of a Flow To define a flow we use either the Lagrangian descrition or the

More information

Near Neighbor Search in High Dimensional Data (1) Dr. Anwar Alhenshiri

Near Neighbor Search in High Dimensional Data (1) Dr. Anwar Alhenshiri Near Neighbor Search in High Dimensional Data (1) Dr. Anwar Alhenshiri Scene Completion Problem The Bare Data Approach High Dimensional Data Many real-world problems Web Search and Text Mining Billions

More information

Problem 1: Complexity of Update Rules for Logistic Regression

Problem 1: Complexity of Update Rules for Logistic Regression Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox January 16 th, 2014 1

More information

K-Nearest Neighbor Finding Using the MaxNearestDist Estimator

K-Nearest Neighbor Finding Using the MaxNearestDist Estimator K-Nearest Neighbor Finding Using the MaxNearestDist Estimator Hanan Samet hjs@cs.umd.edu www.cs.umd.edu/ hjs Department of Computer Science Center for Automation Research Institute for Advanced Computer

More information

Fully Dynamic Metric Access Methods based on Hyperplane Partitioning

Fully Dynamic Metric Access Methods based on Hyperplane Partitioning Fully Dynamic Metric Access Methods based on Hyperplane Partitioning Gonzalo Navarro a, Roberto Uribe-Paredes b,c a Department of Computer Science, University of Chile, Santiago, Chile. gnavarro@dcc.uchile.cl.

More information

Constrained Empty-Rectangle Delaunay Graphs

Constrained Empty-Rectangle Delaunay Graphs CCCG 2015, Kingston, Ontario, August 10 12, 2015 Constrained Emty-Rectangle Delaunay Grahs Prosenjit Bose Jean-Lou De Carufel André van Renssen Abstract Given an arbitrary convex shae C, a set P of oints

More information

521493S Computer Graphics Exercise 3 (Chapters 6-8)

521493S Computer Graphics Exercise 3 (Chapters 6-8) 521493S Comuter Grahics Exercise 3 (Chaters 6-8) 1 Most grahics systems and APIs use the simle lighting and reflection models that we introduced for olygon rendering Describe the ways in which each of

More information

Scalability Comparison of Peer-to-Peer Similarity-Search Structures

Scalability Comparison of Peer-to-Peer Similarity-Search Structures Scalability Comparison of Peer-to-Peer Similarity-Search Structures Michal Batko a David Novak a Fabrizio Falchi b Pavel Zezula a a Masaryk University, Brno, Czech Republic b ISTI-CNR, Pisa, Italy Abstract

More information

Physically Based Rendering ( ) Intersection Acceleration

Physically Based Rendering ( ) Intersection Acceleration Physically ased Rendering (600.657) Intersection cceleration Intersection Testing ccelerated techniques try to leverage: Grouing: To efficiently discard grous of rimitives that are guaranteed to be missed

More information

Cross products Line segments The convex combination of two distinct points p

Cross products Line segments The convex combination of two distinct points p CHAPTER Comutational Geometry Is the branch of comuter science that studies algorithms for solving geometric roblems. Has alications in many fields, including comuter grahics robotics, VLSI design comuter

More information

Efficient Processing of Top-k Dominating Queries on Multi-Dimensional Data

Efficient Processing of Top-k Dominating Queries on Multi-Dimensional Data Efficient Processing of To-k Dominating Queries on Multi-Dimensional Data Man Lung Yiu Deartment of Comuter Science Aalborg University DK-922 Aalborg, Denmark mly@cs.aau.dk Nikos Mamoulis Deartment of

More information

Stereo Disparity Estimation in Moment Space

Stereo Disparity Estimation in Moment Space Stereo Disarity Estimation in oment Sace Angeline Pang Faculty of Information Technology, ultimedia University, 63 Cyberjaya, alaysia. angeline.ang@mmu.edu.my R. ukundan Deartment of Comuter Science, University

More information

Figure 8.1: Home age taken from the examle health education site (htt:// Setember 14, 2001). 201

Figure 8.1: Home age taken from the examle health education site (htt://  Setember 14, 2001). 201 200 Chater 8 Alying the Web Interface Profiles: Examle Web Site Assessment 8.1 Introduction This chater describes the use of the rofiles develoed in Chater 6 to assess and imrove the quality of an examle

More information

Bayesian Oil Spill Segmentation of SAR Images via Graph Cuts 1

Bayesian Oil Spill Segmentation of SAR Images via Graph Cuts 1 Bayesian Oil Sill Segmentation of SAR Images via Grah Cuts 1 Sónia Pelizzari and José M. Bioucas-Dias Instituto de Telecomunicações, I.S.T., TULisbon,Lisboa, Portugal sonia@lx.it.t, bioucas@lx.it.t Abstract.

More information

Multi-robot SLAM with Unknown Initial Correspondence: The Robot Rendezvous Case

Multi-robot SLAM with Unknown Initial Correspondence: The Robot Rendezvous Case Multi-robot SLAM with Unknown Initial Corresondence: The Robot Rendezvous Case Xun S. Zhou and Stergios I. Roumeliotis Deartment of Comuter Science & Engineering, University of Minnesota, Minneaolis, MN

More information

An accurate and fast point-to-plane registration technique

An accurate and fast point-to-plane registration technique Pattern Recognition Letters 24 (23) 2967 2976 www.elsevier.com/locate/atrec An accurate and fast oint-to-lane registration technique Soon-Yong Park *, Murali Subbarao Deartment of Electrical and Comuter

More information

GEOMETRIC CONSTRAINT SOLVING IN < 2 AND < 3. Department of Computer Sciences, Purdue University. and PAMELA J. VERMEER

GEOMETRIC CONSTRAINT SOLVING IN < 2 AND < 3. Department of Computer Sciences, Purdue University. and PAMELA J. VERMEER GEOMETRIC CONSTRAINT SOLVING IN < AND < 3 CHRISTOPH M. HOFFMANN Deartment of Comuter Sciences, Purdue University West Lafayette, Indiana 47907-1398, USA and PAMELA J. VERMEER Deartment of Comuter Sciences,

More information

A CLASS OF STRUCTURED LDPC CODES WITH LARGE GIRTH

A CLASS OF STRUCTURED LDPC CODES WITH LARGE GIRTH A CLASS OF STRUCTURED LDPC CODES WITH LARGE GIRTH Jin Lu, José M. F. Moura, and Urs Niesen Deartment of Electrical and Comuter Engineering Carnegie Mellon University, Pittsburgh, PA 15213 jinlu, moura@ece.cmu.edu

More information

Pivoting M-tree: A Metric Access Method for Efficient Similarity Search

Pivoting M-tree: A Metric Access Method for Efficient Similarity Search Pivoting M-tree: A Metric Access Method for Efficient Similarity Search Tomáš Skopal Department of Computer Science, VŠB Technical University of Ostrava, tř. 17. listopadu 15, Ostrava, Czech Republic tomas.skopal@vsb.cz

More information

Multidimensional Indexing The R Tree

Multidimensional Indexing The R Tree Multidimensional Indexing The R Tree Module 7, Lecture 1 Database Management Systems, R. Ramakrishnan 1 Single-Dimensional Indexes B+ trees are fundamentally single-dimensional indexes. When we create

More information

Non-Bayesian Classifiers Part I: k-nearest Neighbor Classifier and Distance Functions

Non-Bayesian Classifiers Part I: k-nearest Neighbor Classifier and Distance Functions Non-Bayesian Classifiers Part I: k-nearest Neighbor Classifier and Distance Functions Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551,

More information

Improving the Performance of M-tree Family by Nearest-Neighbor Graphs

Improving the Performance of M-tree Family by Nearest-Neighbor Graphs Improving the Performance of M-tree Family by Nearest-Neighbor Graphs Tomáš Skopal, David Hoksza Charles University in Prague, FMP, Department of Software Engineering Malostranské nám. 25, 118 00 Prague,

More information

Nearest Neighbor Methods

Nearest Neighbor Methods Nearest Neighbor Methods Nicholas Ruozzi University of Texas at Dallas Based on the slides of Vibhav Gogate and David Sontag Nearest Neighbor Methods Learning Store all training examples Classifying a

More information

Task Description: Finding Similar Documents. Document Retrieval. Case Study 2: Document Retrieval

Task Description: Finding Similar Documents. Document Retrieval. Case Study 2: Document Retrieval Case Study 2: Document Retrieval Task Description: Finding Similar Documents Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade April 11, 2017 Sham Kakade 2017 1 Document

More information

High Dimensional Indexing by Clustering

High Dimensional Indexing by Clustering Yufei Tao ITEE University of Queensland Recall that, our discussion so far has assumed that the dimensionality d is moderately high, such that it can be regarded as a constant. This means that d should

More information

Notes on Binary Dumbbell Trees

Notes on Binary Dumbbell Trees Notes on Binary Dumbbell Trees Michiel Smid March 23, 2012 Abstract Dumbbell trees were introduced in [1]. A detailed description of non-binary dumbbell trees appears in Chapter 11 of [3]. These notes

More information

Data Mining and Machine Learning: Techniques and Algorithms

Data Mining and Machine Learning: Techniques and Algorithms Instance based classification Data Mining and Machine Learning: Techniques and Algorithms Eneldo Loza Mencía eneldo@ke.tu-darmstadt.de Knowledge Engineering Group, TU Darmstadt International Week 2019,

More information

Spatial Data Structures

Spatial Data Structures CSCI 420 Computer Graphics Lecture 17 Spatial Data Structures Jernej Barbic University of Southern California Hierarchical Bounding Volumes Regular Grids Octrees BSP Trees [Angel Ch. 8] 1 Ray Tracing Acceleration

More information

Efficient Sequence Generator Mining and its Application in Classification

Efficient Sequence Generator Mining and its Application in Classification Efficient Sequence Generator Mining and its Alication in Classification Chuancong Gao, Jianyong Wang 2, Yukai He 3 and Lizhu Zhou 4 Tsinghua University, Beijing 0084, China {gaocc07, heyk05 3 }@mails.tsinghua.edu.cn,

More information

Data Mining in Bioinformatics Day 1: Classification

Data Mining in Bioinformatics Day 1: Classification Data Mining in Bioinformatics Day 1: Classification Karsten Borgwardt February 18 to March 1, 2013 Machine Learning & Computational Biology Research Group Max Planck Institute Tübingen and Eberhard Karls

More information

Introduction to Image Compresion

Introduction to Image Compresion ENEE63 Fall 2 Lecture-7 Introduction to Image Comresion htt://www.ece.umd.edu/class/enee63/ minwu@eng.umd.edu M. Wu: ENEE63 Digital Image Processing (Fall') Min Wu Electrical & Comuter Engineering Univ.

More information

CMSC 754 Computational Geometry 1

CMSC 754 Computational Geometry 1 CMSC 754 Comutational Geometry 1 David M. Mount Deartment of Comuter Science University of Maryland Fall 2002 1 Coyright, David M. Mount, 2002, Det. of Comuter Science, University of Maryland, College

More information

Finding the k-closest pairs in metric spaces

Finding the k-closest pairs in metric spaces Finding the k-closest pairs in metric spaces Hisashi Kurasawa The University of Tokyo 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, JAPAN kurasawa@nii.ac.jp Atsuhiro Takasu National Institute of Informatics 2-1-2

More information

Handling Multiple K-Nearest Neighbor Query Verifications on Road Networks under Multiple Data Owners

Handling Multiple K-Nearest Neighbor Query Verifications on Road Networks under Multiple Data Owners Handling Multiple K-Nearest Neighbor Query Verifications on Road Networks under Multiple Data Owners S.Susanna 1, Dr.S.Vasundra 2 1 Student, Department of CSE, JNTU Anantapur, Andhra Pradesh, India 2 Professor,

More information

DBM-Tree: A Dynamic Metric Access Method Sensitive to Local Density Data

DBM-Tree: A Dynamic Metric Access Method Sensitive to Local Density Data DB: A Dynamic Metric Access Method Sensitive to Local Density Data Marcos R. Vieira, Caetano Traina Jr., Fabio J. T. Chino, Agma J. M. Traina Department of Computer Science, University of São Paulo at

More information

Data Mining Classification: Alternative Techniques. Lecture Notes for Chapter 4. Instance-Based Learning. Introduction to Data Mining, 2 nd Edition

Data Mining Classification: Alternative Techniques. Lecture Notes for Chapter 4. Instance-Based Learning. Introduction to Data Mining, 2 nd Edition Data Mining Classification: Alternative Techniques Lecture Notes for Chapter 4 Instance-Based Learning Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar Instance Based Classifiers

More information

Similarity Searching:

Similarity Searching: Similarity Searching: Indexing, Nearest Neighbor Finding, Dimensionality Reduction, and Embedding Methods, for Applications in Multimedia Databases Hanan Samet hjs@cs.umd.edu Department of Computer Science

More information

Wavelet Based Statistical Adapted Local Binary Patterns for Recognizing Avatar Faces

Wavelet Based Statistical Adapted Local Binary Patterns for Recognizing Avatar Faces Wavelet Based Statistical Adated Local Binary atterns for Recognizing Avatar Faces Abdallah A. Mohamed 1, 2 and Roman V. Yamolskiy 1 1 Comuter Engineering and Comuter Science, University of Louisville,

More information

Machine Learning: Symbol-based

Machine Learning: Symbol-based 10c Machine Learning: Symbol-based 10.0 Introduction 10.1 A Framework for Symbol-based Learning 10.2 Version Space Search 10.3 The ID3 Decision Tree Induction Algorithm 10.4 Inductive Bias and Learnability

More information

Spatial Data Structures

Spatial Data Structures CSCI 480 Computer Graphics Lecture 7 Spatial Data Structures Hierarchical Bounding Volumes Regular Grids BSP Trees [Ch. 0.] March 8, 0 Jernej Barbic University of Southern California http://www-bcf.usc.edu/~jbarbic/cs480-s/

More information

Face Recognition Using Legendre Moments

Face Recognition Using Legendre Moments Face Recognition Using Legendre Moments Dr.S.Annadurai 1 A.Saradha Professor & Head of CSE & IT Research scholar in CSE Government College of Technology, Government College of Technology, Coimbatore, Tamilnadu,

More information

Chapter 4: Non-Parametric Techniques

Chapter 4: Non-Parametric Techniques Chapter 4: Non-Parametric Techniques Introduction Density Estimation Parzen Windows Kn-Nearest Neighbor Density Estimation K-Nearest Neighbor (KNN) Decision Rule Supervised Learning How to fit a density

More information

Data Structures for Approximate Proximity and Range Searching

Data Structures for Approximate Proximity and Range Searching Data Structures for Approximate Proximity and Range Searching David M. Mount University of Maryland Joint work with: Sunil Arya (Hong Kong U. of Sci. and Tech) Charis Malamatos (Max Plank Inst.) 1 Introduction

More information

Balanced Box-Decomposition trees for Approximate nearest-neighbor. Manos Thanos (MPLA) Ioannis Emiris (Dept Informatics) Computational Geometry

Balanced Box-Decomposition trees for Approximate nearest-neighbor. Manos Thanos (MPLA) Ioannis Emiris (Dept Informatics) Computational Geometry Balanced Box-Decomposition trees for Approximate nearest-neighbor 11 Manos Thanos (MPLA) Ioannis Emiris (Dept Informatics) Computational Geometry Nearest Neighbor A set S of n points is given in some metric

More information

A Morphological LiDAR Points Cloud Filtering Method based on GPGPU

A Morphological LiDAR Points Cloud Filtering Method based on GPGPU A Morhological LiDAR Points Cloud Filtering Method based on GPGPU Shuo Li 1, Hui Wang 1, Qiuhe Ma 1 and Xuan Zha 2 1 Zhengzhou Institute of Surveying & Maing, No.66, Longhai Middle Road, Zhengzhou, China

More information

SIMILARITY SEARCH The Metric Space Approach. Pavel Zezula, Giuseppe Amato, Vlastislav Dohnal, Michal Batko

SIMILARITY SEARCH The Metric Space Approach. Pavel Zezula, Giuseppe Amato, Vlastislav Dohnal, Michal Batko SIMILARITY SEARCH The Metric Space Approach Pavel Zezula, Giuseppe Amato, Vlastislav Dohnal, Michal Batko Table of Contents Part I: Metric searching in a nutshell Foundations of metric space searching

More information

Large Scale Graph Algorithms

Large Scale Graph Algorithms Large Scale Graph Algorithms A Guide to Web Research: Lecture 2 Yury Lifshits Steklov Institute of Mathematics at St.Petersburg Stuttgart, Spring 2007 1 / 34 Talk Objective To pose an abstract computational

More information

The Priority R-Tree: A Practically Efficient and Worst-Case Optimal R-Tree

The Priority R-Tree: A Practically Efficient and Worst-Case Optimal R-Tree The Priority R-Tree: A Practically Efficient and Worst-Case Otimal R-Tree Lars Arge Deartment of Comuter Science Duke University, ox 90129 Durham, NC 27708-0129 USA large@cs.duke.edu Mark de erg Deartment

More information

Signature File Hierarchies and Signature Graphs: a New Index Method for Object-Oriented Databases

Signature File Hierarchies and Signature Graphs: a New Index Method for Object-Oriented Databases Signature File Hierarchies and Signature Grahs: a New Index Method for Object-Oriented Databases Yangjun Chen* and Yibin Chen Det. of Business Comuting University of Winnieg, Manitoba, Canada R3B 2E9 ABSTRACT

More information

Olmo S. Zavala Romero. Clustering Hierarchical Distance Group Dist. K-means. Center of Atmospheric Sciences, UNAM.

Olmo S. Zavala Romero. Clustering Hierarchical Distance Group Dist. K-means. Center of Atmospheric Sciences, UNAM. Center of Atmospheric Sciences, UNAM November 16, 2016 Cluster Analisis Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster)

More information

Nearest Neighbor Classification

Nearest Neighbor Classification Nearest Neighbor Classification Charles Elkan elkan@cs.ucsd.edu October 9, 2007 The nearest-neighbor method is perhaps the simplest of all algorithms for predicting the class of a test example. The training

More information

A Novel Iris Segmentation Method for Hand-Held Capture Device

A Novel Iris Segmentation Method for Hand-Held Capture Device A Novel Iris Segmentation Method for Hand-Held Cature Device XiaoFu He and PengFei Shi Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200030, China {xfhe,

More information

An improved algorithm for Hausdorff Voronoi diagram for non-crossing sets

An improved algorithm for Hausdorff Voronoi diagram for non-crossing sets An imroved algorithm for Hausdorff Voronoi diagram for non-crossing sets Frank Dehne, Anil Maheshwari and Ryan Taylor May 26, 2006 Abstract We resent an imroved algorithm for building a Hausdorff Voronoi

More information