Mobility Data Management & Exploration


Ch. 07. Mobility Data Mining and Knowledge Discovery
Nikos Pelekis & Yannis Theodoridis
InfoLab, University of Piraeus, Greece
infolab.cs.unipi.gr
v.2014.05

Chapter outline
7.1. Clustering in Mobility Data
7.2. Moving Clusters for Capturing Collective Mobility Behavior
7.3. Sequence Pattern Mining in Mobility Data
7.4. Prediction and Classification in Mobility Data
7.5. Summary

7.1. Clustering in mobility data

Trajectory clustering
Technical objectives:
- Cluster trajectories w.r.t. similarity
- For each cluster, find its centroid or representative
- Discover moving clusters (flocks), outliers, etc.
Issues:
- Which distance between trajectories? (recall Chap. 6)
- Which kind of clustering? Partitioning (K-means-like) vs. density-based (DBSCAN- or OPTICS-like) solutions
- What does a cluster centroid or representative look like?

Trajectory clustering (cont.)
General requirements:
- Tolerance to noise
- Low computational cost
- Applicability to complex, possibly non-vectorial data
- Non-spherical clusters, etc. E.g., a traffic jam along a road = snake-shaped cluster
State of the art:
- Clustering on entire trajectories: T-OPTICS (2006), CenTR-I-FCM (2009)
- Clustering on sub-trajectories: TRACLUS (2007), NEAT (2012)

T-OPTICS (Trajectory OPTICS)
- Builds upon OPTICS and the DISSIM function between trajectories
- Produces a reachability plot (valleys and hills), read against an ε threshold
  - Valleys → clusters
  - Hills → noise
[Figure: reachability plot with ε threshold]
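The valleys-and-hills reading of a reachability plot can be sketched in code. The snippet below follows the threshold-based extraction step described for the original OPTICS algorithm, not anything specific to T-OPTICS. It assumes the ordering has already been computed (for T-OPTICS, with DISSIM as the distance). The function name `extract_clusters`, the `NOISE` label, and the two input lists are illustrative assumptions.

```python
import math
from typing import List

NOISE = -1  # assumed label for points that fall on a "hill"


def extract_clusters(reachability: List[float],
                     core_distance: List[float],
                     eps: float) -> List[int]:
    """Label points of an OPTICS ordering by cutting the plot at eps.

    Both lists are given in cluster order; math.inf stands for an
    undefined reachability or core distance.
    """
    labels: List[int] = []
    current = NOISE
    next_id = 0
    for reach, core in zip(reachability, core_distance):
        if reach > eps:
            # The point sits above the threshold: it either opens a new
            # valley (if it is a core point at eps) or it is noise.
            if core <= eps:
                current = next_id
                next_id += 1
                labels.append(current)
            else:
                labels.append(NOISE)
        else:
            # Still inside the current valley.
            labels.append(current)
    return labels


reach = [math.inf, 0.5, 0.4, 3.0, 0.6, 0.5, 4.0]
core = [0.5, 0.4, 0.5, 0.6, 0.5, 0.6, math.inf]
print(extract_clusters(reach, core, eps=1.0))
```

Lowering `eps` splits wide valleys into smaller clusters and turns more points into noise, which is why a single ordering supports many threshold choices.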

CenTR-I-FCM (Clustering under uncertainty)
- Builds upon Fuzzy-C-Means (a variation of K-means for uncertain data)
- Motivation: uncertainty of trajectory data should be taken into account
Three phases:
- Step 1: mapping of trajectories in an intuitionistic fuzzy vector space
- Step 2: discovering the centroid of a bundle of trajectories (algorithm CenTra)
- Step 3: clustering trajectories under uncertainty (algorithm CenTR-I-FCM)

TRACLUS (Trajectory Clustering)
- Discovers similar portions of trajectories (sub-trajectories)
- Works in two phases:
  - partitioning
  - grouping

TRACLUS (cont.)
Algorithm TRACLUS
Input: A dataset of trajectories D = {tr_1, ..., tr_N}
Output: (1) A set of clusters C = {C_1, ..., C_O}, (2) A set of representative trajectories
/* Partitioning phase */
1. for each tr_i in D do
2.    Execute trajectory partitioning; get a set S of line segments as the result;
3.    Accumulate S into a set D;
/* Grouping phase */
4. Execute line segment clustering for D; get a set C of clusters as the result;
5. for each C_j in C do
6.    Execute RTG (representative trajectory generation); get a representative trajectory as the result;
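The two-phase control flow can be sketched as a small skeleton with pluggable steps. This is not the TRACLUS implementation: the three helper functions (`split_every_point`, `group_by_direction`, `mean_segment`) are deliberately trivial stand-ins, chosen only so the skeleton runs. The real algorithm uses its own partitioning, line segment clustering and representative generation. All names here are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]
Trajectory = Sequence[Point]


def traclus_skeleton(
    trajectories: Sequence[Trajectory],
    partition: Callable[[Trajectory], List[Segment]],
    cluster_segments: Callable[[List[Segment]], List[List[Segment]]],
    representative: Callable[[List[Segment]], Segment],
) -> Tuple[List[List[Segment]], List[Segment]]:
    # Partitioning phase: accumulate line segments of all trajectories.
    segments: List[Segment] = []
    for tr in trajectories:
        segments.extend(partition(tr))
    # Grouping phase: cluster the segments, then summarize each cluster.
    clusters = cluster_segments(segments)
    representatives = [representative(c) for c in clusters]
    return clusters, representatives


# Stand-in steps (assumptions, NOT the TRACLUS procedures).
def split_every_point(tr: Trajectory) -> List[Segment]:
    return list(zip(tr, tr[1:]))


def group_by_direction(segments: List[Segment]) -> List[List[Segment]]:
    east = [s for s in segments if s[1][0] >= s[0][0]]
    west = [s for s in segments if s[1][0] < s[0][0]]
    return [c for c in (east, west) if c]


def mean_segment(cluster: List[Segment]) -> Segment:
    n = len(cluster)
    start = (sum(s[0][0] for s in cluster) / n, sum(s[0][1] for s in cluster) / n)
    end = (sum(s[1][0] for s in cluster) / n, sum(s[1][1] for s in cluster) / n)
    return (start, end)


trajs = [[(0, 0), (1, 0), (2, 0)], [(2, 1), (1, 1), (0, 1)]]
clusters, reps = traclus_skeleton(
    trajs, split_every_point, group_by_direction, mean_segment
)
print(len(clusters), reps)
```

Because clustering operates on segments rather than whole trajectories, one trajectory can contribute segments to several clusters.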

NEAT
- Clusters trajectories moving on a road network
- Works in three phases:
  - base cluster formation: partitions trajectories into t-fragments, where each one lies on a single road segment, and forms base clusters, each containing the t-fragments that lie on the same road segment
  - flow cluster formation: combines base clusters w.r.t. a merging selectivity function that takes into account flow, density and road speed factors
  - flow cluster refinement: compresses flow clusters (DBSCAN variant)

[Figure: NEAT example]
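The first NEAT phase amounts to grouping t-fragments by the road segment they lie on. A minimal sketch of that grouping follows, assuming the t-fragments have already been produced by map-matching and splitting. The `TFragment` type and its field names are hypothetical, and the later phases (flow cluster formation and refinement) are not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class TFragment(NamedTuple):
    trajectory_id: str    # trajectory the fragment was cut from
    road_segment_id: str  # the single road segment it lies on


def form_base_clusters(fragments: Iterable[TFragment]) -> Dict[str, List[TFragment]]:
    """Group t-fragments so that each base cluster covers one road segment."""
    base: Dict[str, List[TFragment]] = defaultdict(list)
    for fragment in fragments:
        base[fragment.road_segment_id].append(fragment)
    return dict(base)


fragments = [
    TFragment("tr1", "s1"), TFragment("tr1", "s2"),
    TFragment("tr2", "s1"), TFragment("tr3", "s2"),
    TFragment("tr3", "s3"),
]
base_clusters = form_base_clusters(fragments)
# Number of t-fragments per base cluster, a simple density indicator.
density = {seg: len(frags) for seg, frags in base_clusters.items()}
print(density)
```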

Finding representatives in trajectory datasets
- Recall one of the trajectory clustering issues: what does a cluster centroid or representative look like?
- Some of the trajectory clustering algorithms inherently provide such centroids or representatives (actually, artificial trajectory-like shapes):
  - TRACLUS representatives
  - CenTR-I-FCM centroids

Finding representatives (cont.)
TRACLUS representatives:
- three thick lines (black, red, blue)
- compositions of segments (sub-trajectories)
- a trajectory may be split into different clusters
CenTR-I-FCM centroids:
- two sets of cells (green, red)
- capture the overall complex mobility patterns

Finding representatives (cont.)
Another approach: T-sampling (2010, 2012)
- Samples the top-k most representative trajectories following a deterministic voting methodology
- Trajectory segmentation is neighborhood- rather than geometry-aware!
[Figure: T-sampling example]

7.2. Moving clusters for capturing collective mobility behavior

Flocks and variants
- Flock: a large enough subset of objects moving along paths close to each other for a certain time
- Side-effect: the lossy-flock problem
[Figure: three objects (1, 2, 3) at timepoints t1, t2, t3, with pairwise distance below threshold e]

Flocks and variants (cont.)
Interesting problems arise over the flock concept:
- Identify long flock patterns (top-k longest flock pattern discovery)
- Discover meetings (fixed- vs. varying- versions)
- Discover convergences
- Discover leaders and followers
[Figure: objects p1, p2, p3 at timepoints t2, t3, t7, illustrating a convergence and a meeting]

Moving clusters
- Moving cluster: a sequence of spatial clusters in consecutive timepoints that keep a large percentage of common objects
  - a kind of varying flock
- Issue: the consecutive time constraint may result in the loss of interesting patterns
[Figure: example with clusters C1, C2, C3 of objects 1 to 6 at timepoints t1, t2, t3, with distance threshold e]
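The "large percentage of common objects" condition can be made concrete with a set-overlap test. The sketch below assumes the overlap is measured as the Jaccard ratio between clusters at consecutive timepoints and compared with a threshold `theta`. Both the measure and the names `is_moving_cluster` and `theta` are assumptions for illustration, and the object ids are invented.

```python
from typing import Sequence, Set


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Share of common objects between two clusters (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_moving_cluster(snapshots: Sequence[Set[int]], theta: float) -> bool:
    """True if every pair of consecutive clusters overlaps by at least theta."""
    return all(jaccard(a, b) >= theta for a, b in zip(snapshots, snapshots[1:]))


# Objects join and leave over time, yet consecutive clusters stay similar.
c1, c2, c3 = {1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}
print(is_moving_cluster([c1, c2, c3], theta=0.5))
print(is_moving_cluster([c1, c2, c3], theta=0.7))
```

Note that `c1` and `c3` share only two objects: membership may drift as long as each consecutive step keeps enough objects in common, which is what distinguishes a moving cluster from a fixed-membership flock.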

Convoys
- Convoy: a group of objects that has at least m objects, which are density-connected with respect to a distance threshold e, during k consecutive timepoints

Group patterns and Swarms
- Group pattern: a set of moving objects that travel within a radius for certain timestamps that may be non-consecutive
  - actually, a time-relaxed flock pattern
- Swarm: a collection of moving objects with cardinality at least m, that are part of the same cluster for at least k timepoints
  - timestamps are not required to be consecutive
  - the geometric trajectory of each object in-between timepoints is not subject to any constraint

7.3. Sequence pattern mining in mobility data
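The difference between the convoy and swarm definitions comes down to whether the k timepoints must be consecutive. The sketch below assumes that clusters per timepoint have already been computed by some density-based method, so that sharing a cluster stands in for being density-connected. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Set

Snapshot = Sequence[Set[int]]  # clusters found at one timepoint


def together_timepoints(objects: Set[int], snapshots: Sequence[Snapshot]) -> List[int]:
    """Timepoints at which all given objects share one cluster."""
    return [t for t, clusters in enumerate(snapshots)
            if any(objects <= c for c in clusters)]


def longest_run(timepoints: List[int]) -> int:
    """Length of the longest stretch of consecutive timepoints."""
    best = run = 0
    prev = None
    for t in timepoints:
        run = run + 1 if prev is not None and t == prev + 1 else 1
        best = max(best, run)
        prev = t
    return best


def is_swarm(objects: Set[int], snapshots: Sequence[Snapshot], m: int, k: int) -> bool:
    # At least m objects, together for at least k (any) timepoints.
    return len(objects) >= m and len(together_timepoints(objects, snapshots)) >= k


def is_convoy_like(objects: Set[int], snapshots: Sequence[Snapshot], m: int, k: int) -> bool:
    # At least m objects, together for at least k consecutive timepoints.
    return len(objects) >= m and longest_run(together_timepoints(objects, snapshots)) >= k


snapshots = [
    [{1, 2, 3}, {4}],      # t0
    [{1, 2}, {3, 4}],      # t1: object 3 drifts away
    [{1, 2, 3, 4}],        # t2
    [{1, 2, 3}],           # t3
]
group = {1, 2, 3}
print(is_swarm(group, snapshots, m=3, k=3), is_convoy_like(group, snapshots, m=3, k=3))
```

In the toy data, the gap at t1 breaks the convoy-style condition for k = 3 but not the swarm condition, which only counts timepoints.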

Frequent pattern mining
Technical objectives:
- Identify frequent or popular patterns
- Discover hot spots, hot paths, etc.
Two groups of approaches:
- techniques that identify regularities in the behavior of a single user: Periodic patterns (2007)
- techniques that reveal collective sequential behavior of a set of users: T-Patterns (2007)

T-patterns
A T-pattern is a sequence of regions, frequently visited in a specific order with similar transition times
Example:
  TP1: <(), A> <(9,15), B> (supp:31)
  TP2: <(), A> <(4,20), C> (supp:26)
  TP3: <(), A> <(9,12), B> <(10,56), D> (supp:21)
Algorithm T-Pattern (with static regions of interest - RoI)
Input: (1) a set of input trajectories T, (2) a grid G_0, (3) a minimum support/density threshold δ, (4) a radius of spatial neighborhoods ε, (5) a temporal threshold τ.
Output: A set of pairs (S, A) of sequences of regions with temporal annotations.
1. G = ComputeDensity(T, G_0, ε);
2. RoI = PopularRegions(G, δ);
3. D = Translate(T, RoI);
4. TAS_mining(D, δ, τ);

7.4. Prediction and classification in mobility data
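Steps 1 to 3 of the algorithm can be illustrated with a heavily simplified sketch. It assumes a uniform square grid given by a `cell_size`, counts how many distinct trajectories visit each cell, keeps cells whose count reaches δ as single-cell regions of interest, and rewrites each trajectory as a timestamped sequence of those regions. It ignores the ε neighborhood and any merging of dense cells into larger regions, and it does not cover step 4 (TAS mining). All function names and parameters are illustrative assumptions.

```python
import math
from collections import Counter
from typing import List, Sequence, Set, Tuple

Sample = Tuple[float, float, float]  # (x, y, t)
Cell = Tuple[int, int]


def cell_of(x: float, y: float, cell_size: float) -> Cell:
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def compute_density(trajectories: Sequence[Sequence[Sample]], cell_size: float) -> Counter:
    """Count, per grid cell, how many distinct trajectories visit it."""
    density: Counter = Counter()
    for tr in trajectories:
        density.update({cell_of(x, y, cell_size) for x, y, _t in tr})
    return density


def popular_regions(density: Counter, delta: int) -> Set[Cell]:
    """Keep the cells whose density reaches the threshold delta."""
    return {cell for cell, count in density.items() if count >= delta}


def translate(trajectories: Sequence[Sequence[Sample]], regions: Set[Cell],
              cell_size: float) -> List[List[Tuple[Cell, float]]]:
    """Rewrite each trajectory as (region, entry time) pairs."""
    sequences = []
    for tr in trajectories:
        seq: List[Tuple[Cell, float]] = []
        for x, y, t in tr:
            cell = cell_of(x, y, cell_size)
            # Record a region only when the trajectory enters it.
            if cell in regions and (not seq or seq[-1][0] != cell):
                seq.append((cell, t))
        sequences.append(seq)
    return sequences


T = [
    [(0.5, 0.5, 0), (1.5, 0.5, 10), (2.5, 0.5, 20)],
    [(0.2, 0.8, 0), (1.1, 0.9, 12), (1.9, 0.1, 15)],
    [(0.9, 0.1, 5), (5.5, 5.5, 30)],
]
G = compute_density(T, cell_size=1.0)
roi = popular_regions(G, delta=2)
D = translate(T, roi, cell_size=1.0)
print(sorted(roi), D)
```

The transition times between consecutive entries of each translated sequence (10 and 12 time units from the first to the second region in this toy data) are the raw material from which temporal annotations such as (9,15) would be mined.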

Prediction and Classification
Goal: to predict the future location of a moving object
Possible solutions:
- Naïve: extrapolate w.r.t. current location and velocity vector
- Alternative: build a prediction model
  - State-of-the-art technique: WhereNext (2009)
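The naïve solution is plain linear extrapolation. A minimal sketch, assuming the velocity vector is estimated from the last two timestamped samples of the track; the function name and the sample layout `(x, y, t)` are illustrative choices.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, t)


def extrapolate(track: Sequence[Sample], t_future: float) -> Tuple[float, float]:
    """Predict the position at t_future from the last two samples."""
    (x0, y0, t0), (x1, y1, t1) = track[-2], track[-1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("samples must have strictly increasing timestamps")
    # Velocity vector estimated from the most recent displacement.
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    horizon = t_future - t1
    return (x1 + vx * horizon, y1 + vy * horizon)


track = [(0.0, 0.0, 0.0), (2.0, 1.0, 2.0)]
print(extrapolate(track, t_future=4.0))
```

This ignores turns, stops and the road network, which is the motivation for building a prediction model from historical movement instead.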

WhereNext
Builds upon the T-pattern concept:
- extracts a set of T-patterns and builds a T-pattern tree
- the best path is found for a given trajectory
- the predicted future location of the trajectory is the region that corresponds to the final node of the best path
[Figure: WhereNext example]

Classification and Outlier detection
- Classification aims to predict the class label of a moving object based on its trajectories (and possibly other features)
  - State of the art: TRACLASS (2008)
- Outlier detection aims to detect, among a set of trajectories, those that behave differently from their neighbors
  - State of the art: TRAOD (2008)

TRACLASS
TRACLASS (Trajectory Classification) works in three phases:
1. Partitions trajectories based on their shapes (using a TRACLUS variant)
2. Discovers regions that contain sub-trajectories mostly from one class (hierarchical region-based clustering)
   - Tradeoff between homogeneity and conciseness
3. Discovers common movement patterns for each class of sub-trajectories (trajectory-based clustering)
[Figure: TRACLASS example]

TRAOD
TRAOD (Trajectory Outlier Detection) works in two phases:
- Partitioning phase: trajectories are segmented into t-partitions (sub-trajectories); recall TRACLUS
- Detection phase: a trajectory is considered an outlier if it contains a sufficient number of outlying t-partitions
[Figure: TRAOD example]

7.5. Summary
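The detection rule can be sketched as a threshold on how much of a trajectory is outlying. The snippet assumes each t-partition has already been flagged as outlying or not by some earlier step. It reads "a sufficient number" as a length-weighted fraction compared with a threshold `min_fraction`. That reading, the threshold name and the function names are assumptions, not details taken from the slide.

```python
from typing import Sequence, Tuple

Partition = Tuple[float, bool]  # (length of the t-partition, is it outlying?)


def outlying_fraction(partitions: Sequence[Partition]) -> float:
    """Share of the trajectory's length covered by outlying t-partitions."""
    total = sum(length for length, _ in partitions)
    if total == 0:
        return 0.0
    return sum(length for length, outlying in partitions if outlying) / total


def is_outlier(partitions: Sequence[Partition], min_fraction: float) -> bool:
    return outlying_fraction(partitions) >= min_fraction


parts = [(2.0, False), (1.0, True), (1.0, False)]
print(outlying_fraction(parts), is_outlier(parts, min_fraction=0.2))
```

Working at the level of t-partitions lets a trajectory be flagged even when only a portion of it deviates from its neighbors.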

Summarizing
Knowledge discovery in trajectory databases discovers behavioral patterns of moving objects
In this chapter, we presented state-of-the-art techniques for:
- Clustering a set of trajectories
- Discovering collective behaviors (flocks, moving clusters, etc.)
- Predicting the future location of moving objects
- Classification and outlier detection

End of chapter