Preface

INTRODUCTION

With the rapid progress of mobile device technology, a huge amount of moving object data can be gathered easily. This data can be collected from cell phones, GPS devices embedded in cars, or telemetry devices attached to animals. The location of a mobile phone user can be obtained by locating the cell to which the phone is connected. In addition, GPS-equipped vehicles can be tracked.

Useful knowledge can be discovered automatically from moving object data. For example, mobile phone logs can be mined to form a pattern of mobile user movement, and this pattern can be used to improve the services of network providers. Similarly, the position data of GPS-equipped vehicles can be mined to discover frequent routes of vehicle movement, and this pattern can be used for traffic management.

Moving object data contains spatial and temporal elements. Spatial data relates to geographic aspects, whereas temporal data relates to time properties. Unlike in traditional relational databases, relations among spatial objects are more complex. Ester, Kriegel and Sander (2001) state three basic types of spatial relations, namely topological, distance, and direction relations (as illustrated in Figure 1).

The first type, the topological relation, is based on the positions of two objects. The position of one object relative to the other can be classified as inside, disjoint, or overlapping, for example: x disjoint y, x overlaps y. The distance relation is based on the distance between two objects, which is compared with a given number. As a result, the distance between two objects can be classified as equal to, lower than, or greater than the specified number. Finally, the direction relation is based on the relative direction of two objects. One of the two objects is defined as the source object, and the direction is expressed with respect to it. For example, z north x means that object z, as the source object, is located north of object x.

Figure 1. Spatial relations
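The three relation types can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not part of the source: objects are approximated by axis-aligned rectangles, distance and direction are measured between rectangle centers, and the names Rect, topological, distance_relation and direction are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle standing in for a spatial object (assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def center(self):
        return ((self.xmin + self.xmax) / 2, (self.ymin + self.ymax) / 2)


def topological(a, b):
    """Classify a relative to b as 'disjoint', 'inside' or 'overlaps'."""
    if a.xmax < b.xmin or b.xmax < a.xmin or a.ymax < b.ymin or b.ymax < a.ymin:
        return "disjoint"
    if a.xmin >= b.xmin and a.xmax <= b.xmax and a.ymin >= b.ymin and a.ymax <= b.ymax:
        return "inside"
    # Anything else shares some area or boundary without containment.
    return "overlaps"


def distance_relation(a, b, c):
    """Compare the center distance of a and b with the given number c."""
    d = math.dist(a.center, b.center)
    if math.isclose(d, c):
        return "equal"
    return "lower than" if d < c else "greater than"


def direction(source, other):
    """Dominant compass direction of the source object as seen from other."""
    dx = source.center[0] - other.center[0]
    dy = source.center[1] - other.center[1]
    if dx == 0 and dy == 0:
        return "same position"
    if abs(dy) >= abs(dx):
        return "north" if dy > 0 else "south"
    return "east" if dx > 0 else "west"


x = Rect(0, 0, 2, 2)
y = Rect(5, 5, 6, 6)
z = Rect(0, 10, 2, 12)
print(topological(y, x))           # disjoint
print(distance_relation(x, z, 10)) # equal
print(direction(z, x))             # north, i.e. "z north x"
```

Real spatial databases use exact geometries and richer topological predicates; the rectangle approximation is chosen here only to keep the example short.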

The characteristics of spatial data are unique compared with those of other databases. In relation to the data mining field, Buttenfield et al. (2000) identified three characteristics of spatial data, namely huge volume, cyclical collection, and data foundation. Most spatial data is obtained automatically by the system, which retrieves it cyclically, rather than being input manually. The problem with cyclically collected data is that the time information may not be recorded. Thus, the first challenge before the mining process begins is to mine the cycle pattern of the data. The last characteristic is that spatial data mostly consists of raw data.

Mining patterns from spatio-temporal databases is more challenging than mining numerical data. Because of the complexity of spatio-temporal data, primitive data mining techniques cannot be applied as straightforwardly as they are to numerical data. In this section, we highlight data mining techniques for discovering useful knowledge from spatio-temporal data, including movement pattern, trajectory pattern, relative motion pattern, location prediction and trajectory clustering.

MOVEMENT PATTERN MINING

Mobile user movement data is obtained from users or cars that carry or embed geographic tracking devices, such as mobile phones and GPS receivers. Based on this data, patterns can be discovered, for example that most people are likely to move from A to B, or that a group of people follow the same route in their daily movement. Basically, there are two types of pattern, namely location-based and user-based movement patterns.

Location-Based Movement Pattern

In location-based pattern mining, we are interested in the object or location rather than in the mobile users who visited the location. An example of a pattern that we can discover is: Location A will be visited after Location B. It means that most mobile users visit Location B and then Location A. Three related works on location-based pattern mining are the 2-step walking pattern, the Location Link Pattern and the Periodic Pattern.
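As a minimal sketch of what a location-based pattern such as "Location A will be visited after Location B" means operationally, the following counts, for each ordered pair of consecutive locations, how many distinct users made that move. The data, the function name frequent_moves and the min_users threshold are hypothetical and are not taken from any of the cited algorithms.

```python
from collections import defaultdict


def frequent_moves(visits, min_users):
    """Return {(src, dst): user count} for moves made by at least min_users users.

    visits maps each user to the ordered list of locations the user visited.
    """
    users_per_move = defaultdict(set)
    for user, sequence in visits.items():
        # Pair every location with the one visited immediately after it.
        for src, dst in zip(sequence, sequence[1:]):
            if src != dst:  # staying in place is not a movement
                users_per_move[(src, dst)].add(user)
    return {
        move: len(users)
        for move, users in users_per_move.items()
        if len(users) >= min_users
    }


visits = {
    "u1": ["B", "A", "C"],
    "u2": ["B", "A"],
    "u3": ["C", "B", "A"],
    "u4": ["A", "C"],
}
print(frequent_moves(visits, min_users=3))  # {('B', 'A'): 3}
```

With a threshold of three users, only the move from B to A survives, which is the "A will be visited after B" pattern.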
First, Goh & Taniar (2006) proposed a 2-step walking pattern mining method. This pattern is taken from a mobile user database, which contains the x- and y-coordinates indicating the location of each user at a particular time. The knowledge derived from this pattern is that, within two given steps, mobile users walk from one point to another. The points represent locations of interest.

The process of 2-step walking pattern mining works as follows. Initially, the user movement data set and the locations of interest become the data source for the process. Next, the coordinates in the user movement data set are converted to the relevant locations of interest, with regard to the user-defined minimum duration value. The result of this conversion is shown in Table 1. Then, using the user-defined minimum weight value, the user location database is analyzed to generate a user location summary database. Finally, the user location summary database is passed to the Walking-Matrix or Walking-Graph algorithm to generate the patterns.

The following walk-through illustrates 2-step walking pattern mining using the Walking-Matrix algorithm. The example is taken from the experiments of Goh and Taniar (2006). Firstly, the user location database in Table 1 is generated from a user movement database. For example, the movement data of u1 is converted to l1 and [...] at time 1. It implies that user u1 stayed in both location l1 and [...] at time 1. Let min_dur = 3 and min_weight = 0.5. Next, using the min_weight value, the user location database is analyzed to form a user location summary database (see Table 2). The user movement l_x → l_y implies that the user moves from l_x to l_y. However, if x = y, the user movement is removed because there is no significant movement of that user. Lastly, the Walking-Matrix algorithm is used for mining 2-step walking patterns from the user location summary database. The Walking-Matrix is shown in Table 3. The weight of each 2-step walking pattern is counted. If the weight is greater than min_weight, the pattern is valid; otherwise it is pruned.

Table 1. User location database (columns: T, u1 to u6; rows: T = 1 to 21)

Table 2. The result from analysis of a user location summary database (columns: u1 to u6; entries: movements l_x → l_y)

Table 3. A walking-matrix (rows and columns: locations of interest; cells: weights)

The next location-based pattern is the Location Link Pattern (Iwan & Safar, 2009). The pattern consists of location sequences that have a link. For example, as we can see from Figure 2, all mobile users (u1, u2, u3, u4, u5) visit childcare before going to school. From that fact, it is possible to conclude that childcare and school have a strong relationship. In other words, there is a link between school and childcare.

Figure 2. An example of a mobile user movement

Iwan & Safar (2009) proposed an algorithm for mining location link patterns from mobile user movement data. This algorithm discovers location link patterns from daily user movement patterns. The well-known Apriori algorithm (Agrawal et al., 1993) is used to generate the patterns. The algorithm consists of two steps: generating 1-location links and mining location link patterns. The 1-location link pattern mining algorithm uses the daily mobile user movement patterns of all users (instead of a single user) in the mobile user movement database. The location link pattern algorithm is the main process for mining location link patterns. This algorithm requires the location set 1-LLP, the set of daily mobile movement patterns DMP, and the min_user threshold as inputs. Initially, 1-LLP is stored in the valid user movement variable VUM_1. After that, the k-valid user movement candidate set VUC_k is generated by self-joining the members of VUM_(k-1). Because daily mobile user movement patterns are classified as sequential data, the joining step is similar to that of the sequential-pattern mining algorithm (Srikant and Agrawal, 1996). Next, if a member of VUC_k is a subset of one of the DMP patterns, the number of users is counted. After that, if the number of users is greater than or equal to the min_user threshold, the candidate is considered to be a valid user movement. This process repeats until no more valid user movements can be generated. The output of the algorithm is a set of location link patterns.

Another movement pattern based on location is the Periodic Pattern. Mamoulis et al. (2004) proposed a method for mining periodic patterns from spatio-temporal data. The spatio-temporal data in this case consists of daily routes over a regular time interval. For example, people use similar routes from home to office every day. The routes may not be exactly the same. A route consists of a set of location points, and location points that are close to each other belong to the same cluster. The periodic pattern AB*C means that people move from A to B, may then be at any location (the * position is unconstrained), and finally move to C.

The process of mining periodic patterns consists of two stages: discovering frequent 1-patterns and discovering longer patterns. In the first stage, the user movement data is clustered for each timestamp. A traditional clustering technique such as DBSCAN (Ester et al., 1996) can be used. Frequent 1-patterns are generated by checking the number of points in each cluster. A frequent 1-pattern is valid if the number of points in its cluster is greater than the minimum support threshold, min_sup. In the second stage, the frequent longer patterns are discovered. Mamoulis et al. (2004) proposed two methods for generating frequent longer patterns: a bottom-up and a top-down approach. The bottom-up approach is called STPMine1 and is based on the Apriori-TID algorithm (Agrawal and Srikant, 1994), whereas the top-down approach is called STPMine2. The frequent 1-patterns are the input for both algorithms. In the bottom-up approach, a candidate k-pattern is generated by joining the segment IDs of pairs of (k-1)-patterns. After that, each candidate pattern is checked to determine whether or not its number of segments is greater than min_sup. Finally, the pattern is validated in order to make sure that the regions of the candidate are still clustered.

User-Based Movement Pattern

Unlike location-based pattern mining, user-based movement pattern mining is interested in the movement itself. One existing pattern mining approach based on user movement is group pattern mining. The group pattern is relatively new in the data mining research area. Initially, Wang et al. (2003) and Wang et al. (2006) proposed two group pattern mining algorithms: AGP and VG-Growth. Subsequently, using a spherical location summarization method, Wang et al. (2004) improved the efficiency of group pattern mining for a large number of users and lengthy logging durations, reducing the processing time required to mine 2-valid groups. Subsequently, Wang et al. (2008) revised the AGP and VG-Growth algorithms in order to mine max-valid groups. The revised algorithms are called AMG and VGMax.

A group pattern can be extracted from a mobile user movement dataset. A user movement dataset is a set of time series of locations for each user. The locations are defined as x- and y-coordinates, while the time series records the time at which the user moved to each location. Each record of this dataset contains three fields, t, x, and y, denoting that (x, y) is the location of the user at time t.
An example of a user movement data set is shown in Figure 3.

Figure 3. User movement dataset

A group pattern is defined as a group of users who stay within a maximum distance of one another for a duration that is greater than or equal to a minimum duration. Both the maximum distance and the minimum duration are user-defined thresholds. The distance between users is calculated using the Euclidean distance:

$$d\big((x_1, y_1), (x_2, y_2)\big) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

For example, the distance between u1 and u2 at time 1 is denoted by d(u1, u2) = 52. In addition, the validity of a group pattern is determined by its weight value. The weight of group pattern P is defined as:

$$\mathrm{weight}(P) = \frac{\sum_{i=1}^{n} |S_i|}{N}$$

where:
P = group pattern
S_i = the i-th valid segment
n = number of valid segments
N = number of time points in the database

A group pattern is called a valid group if its weight is greater than the user-defined minimum weight threshold.

Consider the user movement dataset in Table 4. The problem to be solved in this example is to find a group of users who remained together for a period of time. Let the maximum distance max_dis = 3, the minimum duration min_dur = 3 and the minimum weight min_wei = 6. As we can see from Table 4, the distance between users u1 and u2 from time 1 to time 3 and from time 7 to time 10 is less than max_dis. Therefore, u1 and u2 fulfill the distance requirement of a group pattern. The other requirement is the duration. The durations for which u1 and u2 stay together, from time 1 to time 3 and from time 7 to time 10, are 3 and 4 respectively. Therefore, u1 and u2 have durations that are greater than or equal to min_dur. The last requirement is weight. The total number of time points at which u1 and u2 are close to each other is 7, which is greater than min_wei. Therefore, u1 and u2 are classified as one group because both users satisfy all of the requirements of a group pattern.

The generation of 2-valid group patterns using the AGP algorithm is inefficient. The number of 2-valid groups is huge, especially when there are a large number of users. In order to solve this problem, Wang et al. (2004) proposed an efficient method for generating 2-valid group patterns that uses a spherical location summarization (SLS) approach. In the experiments of Wang et al. (2004), the execution time of the AGP algorithm with the SLS method is less than that of the AGP algorithm without it.

A maximal valid group is a set of valid groups without redundancy. Although the AGP algorithm mines a complete set of valid groups from a user movement dataset, it produces some redundant valid groups. In order to resolve this redundancy problem, Wang et al. (2008) proposed a maximal valid group mining algorithm for discovering group patterns without redundant valid groups.

Another work on user-based pattern mining is the User Link Pattern (Iwan & Safar, 2009). As we can see from Table 4, mobile users u3, u4 and u5 have the same daily movement. They go to school before going to campus, and go to childcare before going to school. Based on this fact, it is possible to conclude that u3, u4 and u5 may have a strong relationship. The user link pattern mining algorithm consists of two procedures. The first procedure generates user link candidates. With this procedure, (n-1)-valid user links are joined to obtain an n-user link candidate. The join process is based on the prefix join method. Two or more users are joinable if they have the same prefix user link set. Otherwise, these users are not joinable.
The second procedure in the mining step is the user link candidate validation algorithm. This procedure checks whether or not the size of the location movement intersection among the users in a candidate is greater than or equal to min_loc. It consists of two steps: location intersection base generation and intersection. In the first step, the first user in the user link candidate set provides the location intersection base. Specifically, each subset of the first user's location segment whose size is greater than or equal to min_loc is used as a member of the location intersection base. In the second step, each member of the location intersection base set is intersected with the location segments of the other users in the user link candidate. If the size of the resulting location movement intersection is greater than or equal to min_loc, then the user link candidate is valid.

Table 4. User movement dataset example

Time   u1         u2         d(u1, u2)
1      (4, 7)     (4, 6)     1
2      (4, 8)     (4, 8)     0
3      (4, 7)     (4, 6)     1
4      (70, 50)   (90, 50)   20
5      (70, 51)   (90, 53)   20
6      (70, 52)   (90, 52)   20
7      (74, 57)   (74, 56)   1
8      (74, 58)   (74, 58)   0
9      (74, 57)   (74, 56)   1
10     (74, 58)   (74, 58)   0
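The group pattern check for u1 and u2 can be traced on the rows of Table 4 with a short sketch. This is an illustration rather than the AGP algorithm itself: the function names are hypothetical, time points are assumed to be consecutive integers, and "close" is taken as a distance strictly less than max_dis, as in the text. The text compares the count of 7 covered time points with min_wei = 6; the sketch also prints the same count divided by the N = 10 time points of the table, which gives 0.7.

```python
import math

# Rows of Table 4: time -> (position of u1, position of u2)
TABLE_4 = {
    1: ((4, 7), (4, 6)),
    2: ((4, 8), (4, 8)),
    3: ((4, 7), (4, 6)),
    4: ((70, 50), (90, 50)),
    5: ((70, 51), (90, 53)),
    6: ((70, 52), (90, 52)),
    7: ((74, 57), (74, 56)),
    8: ((74, 58), (74, 58)),
    9: ((74, 57), (74, 56)),
    10: ((74, 58), (74, 58)),
}


def valid_segments(rows, max_dis, min_dur):
    """Return (start, end) runs of consecutive time points at which the two
    users are closer than max_dis, keeping runs of at least min_dur points."""
    segments, run = [], []
    for t in sorted(rows):
        p, q = rows[t]
        if math.dist(p, q) < max_dis:
            run.append(t)
        else:
            if len(run) >= min_dur:
                segments.append((run[0], run[-1]))
            run = []
    if len(run) >= min_dur:  # close the run that reaches the last time point
        segments.append((run[0], run[-1]))
    return segments


def covered_time_points(segments):
    """Total number of time points covered by the valid segments."""
    return sum(end - start + 1 for start, end in segments)


segments = valid_segments(TABLE_4, max_dis=3, min_dur=3)
covered = covered_time_points(segments)
weight = covered / len(TABLE_4)
print(segments)  # [(1, 3), (7, 10)]
print(covered)   # 7
print(weight)    # 0.7
```

The two segments have durations 3 and 4, matching the walk-through above.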

TRAJECTORY PATTERN MINING

Trajectories are sequences that consist of spatial and temporal data about movements. Giannotti et al. (2007) defined a trajectory as a spatio-temporal sequence (ST-sequence) <(x_0, y_0, t_0), ..., (x_n, y_n, t_n)>, where x_i and y_i are the position coordinates relative to the origin and t_i is the time stamp of the position information. An example of trajectory data, <(x_0, y_0, t_0), (x_1, y_1, t_1), (x_2, y_2, t_2), (x_3, y_3, t_3), (x_4, y_4, t_4), (x_5, y_5, t_5)>, is illustrated in Figure 4.

In order to obtain useful knowledge from such movement data, Giannotti et al. (2007) proposed an algorithm for mining a spatio-temporal pattern called a Trajectory Pattern (T-pattern). A T-pattern is formed by examining a set of trajectories that have visited almost the same points of interest at similar times. A T-pattern is defined as a pair (s, a), where:

$$s = \langle (x_0, y_0), \ldots, (x_n, y_n) \rangle$$ is a sequence of n+1 locations, and

$$a = \langle \alpha_1, \ldots, \alpha_n \rangle$$ are the transition times, such that $$\alpha_i = \Delta t_i = t_i - t_{i-1}$$.

The T-pattern is written as:

$$T = S_0 \xrightarrow{\alpha_1} S_1 \xrightarrow{\alpha_2} \cdots \xrightarrow{\alpha_n} S_n$$

A T-pattern T is discovered in a subsequence S if the following two conditions hold: (1) each (x_i, y_i) in T matches a point (x'_i, y'_i) in S; (2) the transition times in T are similar to those in S.

Figure 4. Trajectory data illustration
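The two matching conditions can be sketched as a small search over a trajectory. This is not the mining algorithm of Giannotti et al. (2007); it only checks whether one given T-pattern occurs in one ST-sequence. The parameters eps (how close a point must be to "match" a pattern location) and tau (how close two transition times must be to count as "similar") are assumptions introduced for this illustration, as are the function names.

```python
import math


def transition_times(trajectory):
    """alpha_i = t_i - t_(i-1) for an ST-sequence of (x, y, t) triples."""
    return [t1 - t0 for (_, _, t0), (_, _, t1) in zip(trajectory, trajectory[1:])]


def occurs(pattern_locs, pattern_alphas, trajectory, eps, tau):
    """Check whether the T-pattern (pattern_locs, pattern_alphas) is matched
    by some subsequence of the trajectory.

    A trajectory point matches a pattern location if it lies within eps of
    it, and a transition matches if it differs from alpha by at most tau.
    """
    def search(k, prev_idx):
        if k == len(pattern_locs):
            return True  # every pattern location has been matched
        for i in range(prev_idx + 1, len(trajectory)):
            x, y, t = trajectory[i]
            if math.dist((x, y), pattern_locs[k]) > eps:
                continue
            if k > 0:
                elapsed = t - trajectory[prev_idx][2]
                if abs(elapsed - pattern_alphas[k - 1]) > tau:
                    continue
            if search(k + 1, i):
                return True
        return False

    return search(0, -1)


trajectory = [(0, 0, 0), (1, 0, 5), (5, 5, 12), (9, 9, 20)]
locations = [(0, 0), (5, 5), (9, 9)]
print(transition_times(trajectory))                              # [5, 7, 8]
print(occurs(locations, [12, 8], trajectory, eps=0.5, tau=1.0))  # True
print(occurs(locations, [5, 8], trajectory, eps=0.5, tau=1.0))   # False
```

The first pattern matches because the point (1, 0, 5) is skipped and the elapsed times between the matched points are 12 and 8. The second fails because no subsequence reaches (5, 5) about 5 time units after leaving (0, 0).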

In fact, it is rare for exactly the same spatial location (x, y) and exactly the same transition times to occur repeatedly in trajectory data. However, the same area is often represented by close location coordinates, and the same behavior is often indicated by close transition times. This problem can be tackled with the notions of spatial neighborhood and temporal tolerance (Giannotti et al., 2007). The spatial neighborhood is used for defining the location area, whereas the temporal tolerance is used for defining transition times. Two points are treated as one area if one point falls within the spatial neighborhood of the other. Two transition times are treated as the same time if the difference between them is less than or equal to the temporal tolerance τ.

Furthermore, as generating all T-patterns is too computationally intensive, the concept of regions of interest is used (Giannotti et al., 2007). A region of interest groups similar points of interest. A region is built from a set of points that are neighbors of one another; a point that has no neighbors does not form a region. Giannotti et al. (2007) used a density-based algorithm for finding regions of interest. With the region of interest approach, Giannotti et al. (2007) proposed algorithms for mining T-patterns with pre-defined regions of interest and with dynamic regions of interest (RoI). The pre-defined RoI technique is used if the regions of interest are given, whereas the dynamic RoI technique is used if the regions of interest are unknown.

RELATIVE MOTION PATTERN

Motion patterns can be formed based on the relative motion (REMO) concept proposed by Laube et al. (2004). There are three basic REMO concepts, namely constance, concurrence, and trend-setter. First, a constance pattern is formed when a moving point object (MPO) keeps the same motion attribute for a given period of time. Next, a concurrence pattern is formed when a number of MPOs have the same motion attributes at a given time. The last concept is the trend-setter. This pattern is formed when one MPO leads other MPOs to move in the same direction.

The basic REMO concept disregards the information space. In fact, the information space is important for detecting patterns. For example, suppose a trend-setter MPO in location A appears to influence other MPOs in location B. If location A has a different information space from location B, we cannot say that the MPO in location A is the trend-setter of the MPOs in location B. Therefore, the information space is essential for detecting a pattern.

In order to solve the above problem, Laube et al. (2004) extended the basic REMO concept to mine spatially constrained REMO patterns, namely track, flock, and leadership. First, a constance pattern with a spatial constraint is defined as a track. Next, a concurrence pattern with a spatial constraint forms a flock. Lastly, leadership is built by combining a trend-setter with spatial constraints. The extended basic REMO concept is illustrated in Figure 5.

Furthermore, Laube et al. (2004) proposed the spatial REMO pattern convergence. This concept is used to detect groups of MPOs aggregating in space and time. A convergence is formed when groups of MPOs from any location move to the same circular region of a given radius. The convergence pattern is illustrated in Figure 6, where a convergence pattern is discovered from four MPOs, p2, p3, p5, and p6. In order to detect aggregation patterns, Laube proposed a spatial data mining approach. Aggregation patterns include flocking behaviour and convergence in spatio-temporal data. The approach is based on object motion properties and spatial constraints.
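Two of the basic REMO concepts, constance and concurrence, can be sketched on a small table of motion attributes. The representation is an assumption made for this illustration: each MPO has one motion attribute (here a compass heading) per time step, stored as a row. The function names and thresholds are hypothetical, and the trend-setter and the spatially constrained patterns are not covered.

```python
def constance(matrix, min_len):
    """Find runs where one object keeps the same motion attribute for at
    least min_len consecutive time steps.

    Returns a list of (object, first_step, last_step, attribute).
    """
    results = []
    for obj, row in matrix.items():
        start = 0
        for t in range(1, len(row) + 1):
            # A run ends at the end of the row or when the attribute changes.
            if t == len(row) or row[t] != row[start]:
                if t - start >= min_len:
                    results.append((obj, start, t - 1, row[start]))
                start = t
    return results


def concurrence(matrix, min_objs):
    """Find time steps at which at least min_objs objects share an attribute.

    Returns a list of (time_step, attribute, objects).
    """
    if not matrix:
        return []
    results = []
    n_steps = len(next(iter(matrix.values())))
    for t in range(n_steps):
        groups = {}
        for obj, row in matrix.items():
            groups.setdefault(row[t], []).append(obj)
        for value, objs in groups.items():
            if len(objs) >= min_objs:
                results.append((t, value, objs))
    return results


# Hypothetical headings of three MPOs over four time steps.
headings = {
    "o1": ["N", "N", "N", "E"],
    "o2": ["S", "E", "N", "E"],
    "o3": ["W", "W", "N", "E"],
}
print(constance(headings, min_len=3))
# [('o1', 0, 2, 'N')]
print(concurrence(headings, min_objs=3))
# [(2, 'N', ['o1', 'o2', 'o3']), (3, 'E', ['o1', 'o2', 'o3'])]
```

In this toy data, o1 heads north from step 0 and the other objects join at step 2, which is the kind of situation the trend-setter concept is meant to capture.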

Figure 5. An extended basic REMO concept: track, flock, leadership

Figure 6. Convergence pattern

LOCATION PREDICTION

Location prediction for moving objects can be classified into prediction based on one moving object and prediction based on the trajectories of all moving objects. Based on one moving object, location prediction is defined as the prediction of the next location at a given timestamp. This prediction is based on the object's past movement history. Consider Figure 7, taken from Tao et al. (2004), as an example. Assume we want to predict the location four timestamps ahead based on the past movement history at time 1 and time 2. The white dots in Figure 7 illustrate the locations predicted four timestamps ahead at time 1 and at time 2, respectively. In reality, however, most movements are not linear. As illustrated by the black dots in Figure 7, object o does not move linearly.

In order to predict locations from non-linear movement, Tao et al. (2004) proposed an algorithm for predicting locations from non-linear movement patterns. Specifically, Tao et al. made three contributions to location prediction, namely a system architecture, the recursive motion function and the STP tree (spatio-temporal prediction tree). First, Tao et al. (2004) assumed a client-server system architecture (refer to Figure 8). The client maintains its location path history and continuously calculates its own motion function, whereas the server receives information from all clients, stores the path histories and delivers location predictions in response to user queries.
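The linear assumption that this discussion starts from can be written down in a few lines. The sketch below is only the constant-velocity baseline, not the recursive motion function of Tao et al. (2004); the function name and the sample positions are hypothetical.

```python
def predict_linear(p_prev, p_curr, steps_ahead):
    """Extrapolate a position assuming the displacement observed between two
    consecutive timestamps stays constant."""
    vx = p_curr[0] - p_prev[0]
    vy = p_curr[1] - p_prev[1]
    return (p_curr[0] + vx * steps_ahead, p_curr[1] + vy * steps_ahead)


# An object that really moves along the curve (t, t * t).
actual = {t: (t, t * t) for t in range(1, 7)}

predicted = predict_linear(actual[1], actual[2], steps_ahead=4)
print(predicted)  # (6, 16)
print(actual[6])  # (6, 36)
```

Four timestamps after time 2, the linear prediction is already 20 units away from the actual position on the curved path, which illustrates why non-linear motion functions are needed.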

Figure 7. Location prediction

Location prediction can be based on both patterns and path history. It can then predict a location in the future even when the requested time is far from the current time; the focus is on the time for which the prediction is requested. Like Tao et al. (2004), Jeung et al. (2008) addressed non-linear moving objects and aimed to support predictions over long time horizons.

One of the techniques for predicting locations based on the trajectories of all moving objects is WhereNext (Monreale et al., 2009). WhereNext predicts locations using the frequent trajectory patterns of other moving objects. A moving object is likely to follow a pattern that most moving objects have formed. In order to predict locations, WhereNext has four main steps, namely data selection, trajectory pattern (T-pattern) mining, T-pattern Tree building, and prediction. In the data selection step, a location area and a time period are selected. The purpose of this step is to take only the moving objects that pass through the location area during the time period. After that, the selected data is mined for frequent movement patterns using the trajectory pattern algorithm called T-patterns. In the third step, the T-pattern Tree is constructed by combining the T-patterns in a prefix tree. Finally, the future location of a moving object can be predicted using the T-pattern Tree. An example of a T-pattern Tree built from the T-patterns of Table 4 is shown in Figure 9, which is taken from Monreale et al. (2009).

Figure 8. System architecture on location prediction
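The tree-building and prediction steps can be pictured with a simplified prefix tree. This is a sketch under stated assumptions, not the WhereNext implementation: the class and function names are hypothetical, each node keeps the largest support of any pattern passing through it, time intervals are stored but ignored when matching the visited regions, and the prediction is simply the child with the highest support. The sample patterns are a few of those listed in the T-pattern table, read as sequences of (time interval, region) steps.

```python
class Node:
    """One step of a T-pattern: a region reached within a time interval."""

    def __init__(self, region=None, interval=None):
        self.region = region
        self.interval = interval
        self.support = 0
        self.children = {}


def build_tree(patterns):
    """Combine T-patterns that share a prefix into one tree."""
    root = Node()
    for steps, support in patterns:
        node = root
        for interval, region in steps:
            key = (interval, region)
            node = node.children.setdefault(key, Node(region, interval))
            # Simplification: keep the largest support seen on this prefix.
            node.support = max(node.support, support)
    return root


def predict_next(root, visited):
    """Follow the visited regions from the root and return the best-supported
    next step as (region, interval, support), or None if nothing matches."""
    frontier = [root]
    for region in visited:
        frontier = [
            child
            for node in frontier
            for child in node.children.values()
            if child.region == region
        ]
    candidates = [child for node in frontier for child in node.children.values()]
    if not candidates:
        return None
    best = max(candidates, key=lambda child: child.support)
    return best.region, best.interval, best.support


patterns = [
    ([(None, "A"), ((9, 15), "B")], 31),
    ([(None, "A"), ((9, 12), "B"), ((10, 56), "E")], 21),
    ([(None, "A"), ((9, 12), "C"), ((10, 12), "D")], 21),
    ([(None, "A"), ((9, 12), "C"), ((15, 20), "B")], 10),
    ([(None, "C"), ((10, 12), "D")], 35),
]
tree = build_tree(patterns)
print(predict_next(tree, ["A"]))       # ('B', (9, 15), 31)
print(predict_next(tree, ["A", "C"]))  # ('D', (10, 12), 21)
print(predict_next(tree, ["F"]))       # None
```

A fuller version would also compare the observed travel times with the stored intervals before accepting a match.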

Table 4. T-pattern

<(), C> <(15,20), B> supp:20
<(), C> <(10,12), D> supp:35
<(), A> <(4,20), A> supp:26
<(), C> <(70,90), C> supp:21
<(), A> <(9,12), C> <(10,12), D> supp:21
<(), F> <(2,51), D> supp:37
<(), A> <(9,12), B> <(10,56), E> supp:21
<(), A> <(9,15), B> supp:31
<(), A> <(9,12), C> <(15,20), B> supp:10
<(), B> <(8,70), E> supp:28

Figure 9. T-pattern tree construction

The T-Pattern: <(), loc_1> <(t_min, t_max), loc_2> supp:n

Where:
loc_1 = the source location
loc_2 = the destination location
t_min = the minimum time interval from source to destination location
t_max = the maximum time interval from source to destination location
n = the support value of this pattern

TRAJECTORY CLUSTERING

There are two types of clustering for spatio-temporal data: distance-based and shape-based clustering. Distance-based clustering aims to discover groups of objects moving together, whereas shape-based clustering aims to discover trajectories of similar shape.

Unlike earlier algorithms, which cluster trajectories as a whole, Lee et al. (2007) proposed a partition-and-group framework for clustering trajectories. Clustering trajectories as a whole may miss essential information. In particular, common behavior may not be discovered because the trajectories as a whole can move in different directions. An example of this problem can be seen in Figure 10. Taken as whole trajectories, TR1, TR2, TR3, and TR4 move in completely different directions. As a result, no cluster is discovered. However, the cluster is discovered when the trajectories are partitioned into sub-trajectories. As can be seen, the dotted rectangle marks a common sub-trajectory, which forms a cluster.

Figure 10. An example of a common trajectory

To tackle this problem, Lee et al. (2007) proposed a trajectory clustering algorithm, TRACLUS, based on the partition-and-group framework. Following this framework, TRACLUS discovers common sub-trajectories in two steps: partitioning and grouping. In the partitioning step, characteristic points are identified using the minimum description length (MDL) technique. A characteristic point is a point where the behavior of a trajectory changes rapidly. The trajectory is then partitioned at every characteristic point. In the grouping step, clusters are discovered by applying density-based clustering, DBSCAN, to the trajectory partitions. After that, the overall movement of the trajectory partitions in each cluster is summarized as a common sub-trajectory that represents the cluster.

CONCLUSION

The purpose of this preface was to highlight data mining techniques for spatio-temporal data. We have shown that useful knowledge can be discovered by mining spatio-temporal data. This knowledge includes movement patterns, trajectory patterns, relative motion patterns, location prediction, and trajectory clustering.

David Taniar
Monash University, Australia

Lukman Hakim Iwan
RMIT University, Australia

REFERENCES

Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data. pp. 207-216. Washington, D.C., United States. ACM.

Agrawal, R., & Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. Proceedings of the 20th International Conference on Very Large Data Bases. pp. 487-499. Santiago, Chile. Morgan Kaufmann Publishers Inc.

Buttenfield, B., Gahegan, M., Miller, H., & Yuan, M. (2000). Geospatial Data Mining and Knowledge Discovery. Technical report, University Consortium for Geographic Information Science Research White Paper. Washington, D.C. Available online at http://www.ucgis.org/priorities/research/research_white/2000%20papers/emerging/gkd.pdf.

Ester, M., Kriegel, H., & Sander, J. (2001). Algorithms and Applications for Spatial Data Mining. Geographic Data Mining and Knowledge Discovery, Research Monograph in GIS. Taylor and Francis.

Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Second International Conference on Knowledge Discovery and Data Mining. pp. 226-231. Portland, Oregon. AAAI Press.

Giannotti, F., Nanni, M., Pinelli, F., & Pedreschi, D. (2007). Trajectory pattern mining. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 330-339. San Jose, California, USA. ACM.

Goh, J., & Taniar, D. (2006). On Mining 2 Step Walking Pattern from Mobile Users. Computational Science and Its Applications - ICCSA 2006. pp. 1090-1099. Glasgow, UK. Springer.

Iwan, L. H., & Safar, M. (2009). Pattern mining from movement of mobile users. Journal of Ambient Intelligence and Humanized Computing, 1-4(2010), pp. 295-308.

Jeung, H., Liu, Q., Shen, H. T., & Zhou, X. (2008). A Hybrid Prediction Model for Moving Objects. In Proceedings of the 24th International Conference on Data Engineering (ICDE '08). pp. 70-79. Cancun, Mexico. IEEE.

Laube, P., Kreveld, M., & Imfeld, S. (2004). Finding REMO - Detecting Relative Motion Patterns in Geospatial Lifelines. Developments in Spatial Data Handling: Proceedings of the 11th International Symposium on Spatial Data Handling. pp. 201-214.

Lee, J., Han, J., & Whang, J. (2007). Trajectory clustering: a partition-and-group framework. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '07). pp. 593-604. Beijing, China. ACM.

Mamoulis, N., Cao, H., Kollios, G., Hadjieleftheriou, M., Tao, Y., & Cheung, D. W. (2004). Mining, indexing, and querying historical spatiotemporal data. Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 236-245. Seattle, WA, USA. ACM.

Monreale, A., Pinelli, F., Trasarti, R., & Giannotti, F. (2009). WhereNext: a location predictor on trajectory pattern mining. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '09). pp. 637-646. Paris, France. ACM.

Srikant, R., & Agrawal, R. (1996). Mining sequential patterns: Generalizations and performance improvements. Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology. pp. 3-17. Avignon, France. Springer.

Tao, Y., Faloutsos, C., Papadias, D., & Liu, B. (2004). Prediction and indexing of moving objects with unknown motion patterns. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD '04). pp. 611-622. Paris, France. ACM.

Wang, Y., Lim, E.-P., & Hwang, S.-Y. (2003). On mining group patterns of mobile users. The 14th International Conference on Database and Expert Systems Applications - DEXA 2003. pp. 287-296. Prague, Czech Republic. Springer.

Wang, Y., Lim, E.-P., & Hwang, S.-Y. (2004). Efficient group pattern mining using data summarization. The 9th International Conference on Database Systems for Advanced Applications - DASFAA 2004. pp. 895-907. Jeju Island, Korea. Springer.

Wang, Y., Lim, E.-P., & Hwang, S.-Y. (2006). Efficient mining of group patterns from user movement data. Data & Knowledge Engineering, 57, 240-282. doi:10.1016/j.datak.2005.04.006

Wang, Y., Lim, E.-P., & Hwang, S.-Y. (2008). Efficient algorithms for mining maximal valid groups. The VLDB Journal, 17, 515-535. doi:10.1007/s00778-006-0019-9