A comparative study of k-nearest neighbour techniques in crowd simulation


A comparative study of k-nearest neighbour techniques in crowd simulation Jordi Vermeulen Arne Hillebrand Roland Geraerts Department of Information and Computing Sciences Utrecht University, The Netherlands 30th Conference on Computer Animation and Social Agents, May 23, 2017 1 / 25

Context We want efficient crowd simulations. 2 / 25

Context A large amount of computation is spent on collision avoidance, which needs several nearest neighbours. Which method for finding nearest neighbours is the most efficient? Efficiency criteria: construction time, query time, and variance. 3 / 25

Context The k-nearest neighbour (knn) problem is well known in robotics, machine learning, databases, computer vision, and more. Usually: high dimensionality, separation between offline construction and online querying, disk storage. Our case: two or three dimensions, changing data, main memory. 4 / 25

Data structures Data structures selected on prevalence and availability of good implementations. We tested:

Data structure    Construction time   knn query time
k-d tree          O(n log n)          O(k log n)
BD-tree           O(n log n)          O(k log n)
R-tree            O(n log n)          O(k log n)
Voronoi diagram   O(n log n)          O(k log n)
k-means           O(n^2)              O(n)
Linear search     O(1)                O(n)
Grid              O(n)                O(n)

5 / 25

k-d tree Split alternately along the axes, trying to split the remaining data in half. https://www.cs.umd.edu/~mount/ann/files/1.1.2/ANNmanual_1.1.pdf 6 / 25
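
The split rule on this slide can be sketched with a minimal, stdlib-only 2D k-d tree (an illustration only, not the ANN, FLANN or nanoflann code): partition in-place around the median, alternating the splitting axis per level, and prune subtrees during the nearest-neighbour search whenever the splitting plane is farther away than the best candidate found so far.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

// Minimal 2D k-d tree sketch: subtrees are index ranges of the point
// array, with the median of each range acting as the node.
struct KdTree {
    std::vector<Point> pts;

    explicit KdTree(std::vector<Point> p) : pts(std::move(p)) {
        build(0, pts.size(), 0);
    }

    void build(std::size_t lo, std::size_t hi, int axis) {
        if (hi - lo <= 1) return;
        std::size_t mid = (lo + hi) / 2;
        // Place the median on the current axis at position mid,
        // splitting the remaining data roughly in half.
        std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                         [axis](const Point& a, const Point& b) {
                             return axis == 0 ? a.x < b.x : a.y < b.y;
                         });
        build(lo, mid, 1 - axis);      // left subtree
        build(mid + 1, hi, 1 - axis);  // right subtree
    }

    static double dist2(const Point& a, const Point& b) {
        double dx = a.x - b.x, dy = a.y - b.y;
        return dx * dx + dy * dy;
    }

    Point nearest(const Point& q) const {
        Point best = pts[0];
        double bestD2 = dist2(q, best);
        search(q, 0, pts.size(), 0, best, bestD2);
        return best;
    }

    void search(const Point& q, std::size_t lo, std::size_t hi, int axis,
                Point& best, double& bestD2) const {
        if (lo >= hi) return;
        std::size_t mid = (lo + hi) / 2;
        double d2 = dist2(q, pts[mid]);
        if (d2 < bestD2) { best = pts[mid]; bestD2 = d2; }
        double diff = axis == 0 ? q.x - pts[mid].x : q.y - pts[mid].y;
        // Visit the side containing q first; visit the other side only
        // if the splitting plane is closer than the current best.
        if (diff < 0) {
            search(q, lo, mid, 1 - axis, best, bestD2);
            if (diff * diff < bestD2) search(q, mid + 1, hi, 1 - axis, best, bestD2);
        } else {
            search(q, mid + 1, hi, 1 - axis, best, bestD2);
            if (diff * diff < bestD2) search(q, lo, mid, 1 - axis, best, bestD2);
        }
    }
};
```

Extending `nearest` to k neighbours replaces the single best candidate with a bounded max-heap of the k closest points, which is what the libraries compared in the talk do.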

Box-decomposition tree A k-d tree with an extra split rule: split into an inner and an outer box. https://www.cs.umd.edu/~mount/ann/files/1.1.2/ANNmanual_1.1.pdf 7 / 25

R-tree Stores point or volumetric data. Partitions may overlap. Insertion and deletion of data are possible. https://en.wikipedia.org/wiki/R-tree 8 / 25

Hierarchical k-means clustering Assign points to the nearest centroid, calculate new centroids, and iterate. Apply hierarchically. http://rossfarrelly.blogspot.com/2012/12/k-means-clustering.html 9 / 25
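
The assign-and-recompute step described on this slide is one Lloyd iteration of flat k-means; a minimal sketch follows (the hierarchical variant applies this recursively within each resulting cluster; this is an illustration, not the FLANN code):

```cpp
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// One Lloyd iteration: assign every point to its nearest centroid,
// then move each centroid to the mean of its assigned points.
std::vector<Pt> lloydStep(const std::vector<Pt>& points,
                          std::vector<Pt> centroids) {
    std::vector<Pt> sum(centroids.size(), {0.0, 0.0});
    std::vector<std::size_t> count(centroids.size(), 0);
    for (const Pt& p : points) {
        std::size_t best = 0;
        double bestD2 = 1e300;
        for (std::size_t c = 0; c < centroids.size(); ++c) {
            double dx = p.x - centroids[c].x, dy = p.y - centroids[c].y;
            double d2 = dx * dx + dy * dy;
            if (d2 < bestD2) { bestD2 = d2; best = c; }
        }
        sum[best].x += p.x;
        sum[best].y += p.y;
        ++count[best];
    }
    for (std::size_t c = 0; c < centroids.size(); ++c)
        if (count[c] > 0)  // leave empty clusters where they are
            centroids[c] = {sum[c].x / count[c], sum[c].y / count[c]};
    return centroids;
}
```

Iterating `lloydStep` until the centroids stop moving yields the clustering; the O(n^2) construction cost in the table reflects the repeated full assignment passes.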

Voronoi diagrams Each cell contains the points closest to its site. Find nearest neighbours by examining neighbouring cells. http://merganser.math.gvsu.edu/david/voronoi.08.06/ 10 / 25

Implementations k-d tree implementations provided by FLANN [1] and nanoflann [2]. FLANN: general-purpose implementation. nanoflann: highly optimised for 2D and 3D data. FLANN also provides a k-means implementation. The BD-tree is provided by ANN [3]. [1] Muja and Lowe, FLANN - Fast Library for Approximate Nearest Neighbors (http://www.cs.ubc.ca/research/flann/) [2] Blanco-Claraco, nanoflann (https://github.com/jlblancoc/nanoflann) [3] Mount and Arya, ANN: A Library for Approximate Nearest Neighbor Searching (http://www.cs.umd.edu/~mount/ann/) 11 / 25

Implementations The R-tree and Voronoi diagrams are provided by Boost [1]. The R-tree has good update performance, so we test two versions: (1) rebuild the entire tree each time step, (2) update the tree incrementally. Linear search and the grid are our own implementations. [1] Gehrels et al., Boost Geometry Library (http://www.boost.org/libs/geometry) 12 / 25
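
The talk does not show the grid code, but a uniform grid for this task can be sketched as follows (cell size, hashing scheme and the 3x3 cell scan are assumptions of this sketch, not details from the talk): hash each agent into a cell on rebuild, then answer a neighbour query by scanning only the cells around the query point.

```cpp
#include <cmath>
#include <unordered_map>
#include <vector>

struct Agent { double x, y; };

class UniformGrid {
    double cell_;
    std::unordered_map<long long, std::vector<Agent>> cells_;

    // Map a position to a cell id. The multiplier is an arbitrary
    // large prime; collisions are harmless because the radius test
    // below filters out any far-away agents.
    long long key(double x, double y) const {
        long long cx = (long long)std::floor(x / cell_);
        long long cy = (long long)std::floor(y / cell_);
        return cx * 1000003LL + cy;
    }

public:
    explicit UniformGrid(double cellSize) : cell_(cellSize) {}

    // Rebuilding is O(n): one hash insert per agent per time step,
    // which is why the grid is cheap to update.
    void rebuild(const std::vector<Agent>& agents) {
        cells_.clear();
        for (const Agent& a : agents) cells_[key(a.x, a.y)].push_back(a);
    }

    // Agents within radius r (r <= cell size) of (x, y):
    // scan the 3x3 block of cells around the query point.
    std::vector<Agent> neighbours(double x, double y, double r) const {
        std::vector<Agent> out;
        for (int dx = -1; dx <= 1; ++dx)
            for (int dy = -1; dy <= 1; ++dy) {
                auto it = cells_.find(key(x + dx * cell_, y + dy * cell_));
                if (it == cells_.end()) continue;
                for (const Agent& a : it->second) {
                    double ddx = a.x - x, ddy = a.y - y;
                    if (ddx * ddx + ddy * ddy <= r * r) out.push_back(a);
                }
            }
        return out;
    }
};
```

With a cell size matched to the typical agent spacing, each query touches a small constant number of agents, but worst-case cost degrades to O(n) when agents cluster into few cells, matching the table.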

Scenarios Test on artificial and real-world scenarios. Artificial scenarios test specific properties. Density: uniform vs clustered. Stationary agents: 25, 50 or 75% of agents not moving. Scaling: add more agents each time step. Real-world scenarios: simulations of the evacuation of a building, simulations for the Tour de France [1], and Jülich trajectory data of real crowds [2]. [1] van der Zwan, The Impact of Density Measurement on the Fundamental Diagram [2] Keip and Ries, Dokumentation von Versuchen zur Personenstromdynamik 13 / 25

Scenarios - density 14 / 25

Scenarios - evacuation 15 / 25

Scenarios - Tour de France 16 / 25

Scenarios - Jülich bottleneck 17 / 25

Experimental setup The Jülich data is only available as trajectories (tuples of id, time, x- and y-coordinate). For a fair comparison, we converted all data to trajectories. A C++ testing program reads the data per time step, and: (1) builds the structure for the agent positions at the current time step, (2) performs a knn query for each agent. For realism, queries are performed in parallel. We fix k at 10; collision avoidance does not need more. 18 / 25
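
The shape of that measurement loop can be sketched as follows (an illustration, not the authors' harness; linear search, the O(n) baseline from the comparison, stands in for the structure under test, and the per-agent queries run sequentially here for clarity where the real program parallelises them):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct Pos { double x, y; };

// k nearest neighbours of agent `self` by linear search: compute all
// squared distances, then keep the k smallest.
std::vector<std::size_t> knnLinear(const std::vector<Pos>& agents,
                                   std::size_t self, std::size_t k) {
    std::vector<std::pair<double, std::size_t>> d;
    for (std::size_t j = 0; j < agents.size(); ++j) {
        if (j == self) continue;
        double dx = agents[j].x - agents[self].x;
        double dy = agents[j].y - agents[self].y;
        d.push_back({dx * dx + dy * dy, j});
    }
    k = std::min(k, d.size());
    std::partial_sort(d.begin(), d.begin() + k, d.end());
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < k; ++i) out.push_back(d[i].second);
    return out;
}

// One time step of the benchmark: (re)build the structure for the
// current positions (trivial for linear search), then one knn query
// per agent with k = 10 as in the experiments.
std::vector<std::vector<std::size_t>> timeStep(const std::vector<Pos>& agents,
                                               std::size_t k = 10) {
    std::vector<std::vector<std::size_t>> result;
    for (std::size_t i = 0; i < agents.size(); ++i)
        result.push_back(knnLinear(agents, i, k));
    return result;
}
```

Each time step costs O(n^2) with this baseline, which is why linear search becomes infeasible in the scaling scenario while the tree- and grid-based structures keep per-step cost near-linear.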

Results A total of 62 different scenarios: multiple instances of similar settings. Tested on a machine running Ubuntu 15.10, with two 12-core Xeon processors and 32 GB of DDR4 RAM. 19 / 25

Results Overall results per agent per time step: [bar charts of update time (μs) and query time (μs) for grid, BD-tree, Voronoi, R-tree (update), R-tree (rebuild), linear search, k-means, k-d tree (nanoflann) and k-d tree (FLANN)] 20 / 25

Results - scaling [line charts of query time (ms) and update time (ms) vs time step for BD-tree, grid, k-d tree (FLANN), k-d tree (nanoflann), k-means, linear search, R-tree (rebuild), R-tree (update) and Voronoi]

Linear search quickly becomes infeasible: 16 seconds per time step for 100,000 agents.

The R-tree and the FLANN k-d tree have similar query performance, but the R-tree is over 3x more expensive to update.

Updating the R-tree is 20% faster than rebuilding it.

nanoflann is 2x faster than FLANN: 100,000 agents in 35 ms. 21 / 25

Conclusion The nanoflann implementation of the k-d tree is clearly the best option: fastest except when the number of agents is very small, lowest variance, and 100,000 agents in 35 ms per time step. The grid is competitive for small numbers of agents (< 1000) due to its low update cost. Linear search is efficient up to a few hundred agents. Updating the R-tree is more efficient than rebuilding it. 22 / 25

Future work We are currently extending the knn algorithm to multi-layered environments, e.g. buildings with multiple floors. Euclidean nearest neighbours are not enough: close x- and y-coordinates may be on a different floor. We need to consider visibility. 23 / 25

Future work The local neighbourhood does not change much between time steps, so we could update only once every few steps. How often should we update? Compare the performance of GPU methods; we are looking for people with expertise. 24 / 25

Thanks! J.L.Vermeulen@uu.nl 25 / 25