Automatic Learning of Predictive CEP Rules: Bridging the Gap between Data Mining and Complex Event Processing


Automatic Learning of Predictive CEP Rules: Bridging the Gap between Data Mining and Complex Event Processing
Raef Mousheimish, Yehia Taher and Karine Zeitouni
DAVID Laboratory, University of Versailles, France
The 11th ACM International Conference on Distributed and Event-Based Systems (DEBS 2017)
33ème Conférence sur la Gestion de Données, Principes, Technologies et Applications (BDA), Nancy, November 14-17, 2017

Context
- CEP helps to react instantaneously to occurring situations
- Employed in different domains: environmental monitoring, fraud detection, financial applications, anomaly detection
- CEP engines are entirely guided by CEP rules, the only inference mechanism in the CEP world
- CEP rules + events -> CEP engine -> composite events, results, alerts, ...

Examples of Rules
A simple CEP rule:

    SELECT * FROM WE.win:time(2 minutes) HAVING avg(temperature) > 10

A complex CEP rule:

    CREATE WINDOW tempwin.win:keepall() (place String, avg int)
    INSERT INTO tempwin SELECT * FROM WE.win:time(2 minutes) HAVING avg(temperature) > 10
    SELECT * FROM PATTERN [EVERY a = tempwin(avg > 15) -> b = tempwin(avg > a.avg)]

Motivation
- Event-based systems are an active area of research: scalability, latency, distribution
- Almost all scenarios and problems tackled by researchers have reactive traits: anomaly, traffic-jam and fraud detection
- What if an anomaly needs to be predicted, e.g. in a manufacturing process?
- How to use CEP in such cases is not at all evident

Predictive CEP
- It has been mentioned several times as a future direction, with proposals at the conceptual level [Fulop, 2010] [Engel, 2012]
- However, it remains a vision: there has been no real attempt to produce an easy-to-use predictive CEP system
- The main cause: CEP rules need to be specified manually, and no support is provided for users to define these rules
- It is hard for experts to manually write rules that predict situations; this is why domains such as data mining and predictive analytics exist in the first place
- There is a gap between predictive analytics and CEP that needs to be bridged

Our Objectives
- Move beyond the de facto approach to the definition of CEP rules: from manual to automatic
- Go from merely reactive rules to predictive rules
- Allow the CEP technology to be easily employed in predictive applications
- Create a generic solution that can be used in different domains
- Bridge the gap between data mining and complex event processing

Outline
- Context & Motivation
- High-Level Goals & Objectives
- Contributions: Univariate Shapelet Extraction, Sequence Extraction, Automatic Learning of Predictive CEP Rules
- Evaluations
- Conclusion

Early Classification on Time Series
- A predictive-analytics field that fits our goals of a generic and predictive approach exactly: time series means timestamped events, and early classification means predictive rules
- Shapelet-based classification: devised in 2009 [Ye, SIGKDD 2009]; temporal patterns that are associated with classes
- By definition it works on univariate time series
- Suitable for anomaly prediction

Definitions: Shapelets
- A shapelet is a tuple sh = (s, δ, c_s, score), where:
  - s is a subsequence of a time series
  - δ is the distance threshold
  - c_s is the class of the shapelet
  - score is a utility score
- The best shapelets are the shortest, most frequent and most discriminative ones
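
As a minimal sketch of the definition above (the function names and list-based representation are illustrative, not the paper's implementation), a shapelet matches a series when its best-match distance falls within the threshold δ:

```python
import math

def subsequence_distance(series, shape):
    """Minimum Euclidean distance between `shape` and any
    equal-length window of `series` (best-match distance)."""
    n, m = len(series), len(shape)
    best = math.inf
    for start in range(n - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, shape)))
        best = min(best, dist)
    return best

def shapelet_matches(series, shapelet):
    """A shapelet sh = (s, delta, c_s, score) matches a series when
    the best-match distance is within its threshold delta.
    (c_s and score are carried along but unused for matching.)"""
    s, delta, c_s, score = shapelet
    return subsequence_distance(series, s) <= delta
```

For example, the shapelet ([5, 6, 5], 1.0, "anomaly", 0.9) matches [0, 0, 5, 6, 5, 0, 0] exactly (distance 0), but not a flat series.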

Definitions: Multivariate Setting
- A d-dimensional multivariate time series: dt = {T_1, T_2, ..., T_d}
- A data set of d-dimensional time series: D_d = {(dt, c)}
- A TAS (Time-Annotated Sequence) is a sequence of shapelets with time annotations between them, e.g. sh_1 -(Δt_1)-> sh_2 -(Δt_2)-> sh_3
- A TAS is associated with a class c: if the TAS appears, then c will probably occur
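
A hedged sketch of how a TAS could be checked against a time-ordered stream of shapelet matches (the greedy scan and the data layout are my own simplification, not the paper's algorithm): the shapelet ids must appear in order, with each consecutive pair within its allowed time gap.

```python
def tas_occurs(matches, tas, max_gaps):
    """Check whether the shapelet ids in `tas` occur in order within
    `matches` (a time-ordered list of (timestamp, shapelet_id) pairs),
    with consecutive occurrences at most `max_gaps[i]` time units apart.
    Greedy sketch: a match arriving too late for its step is skipped."""
    last_time = None
    idx = 0
    for t, sid in matches:
        if sid != tas[idx]:
            continue
        if last_time is not None and t - last_time > max_gaps[idx - 1]:
            continue  # violates the time annotation for this step
        last_time = t
        idx += 1
        if idx == len(tas):
            return True  # whole annotated sequence observed
    return False
```

With matches [(1, "sh1"), (3, "sh2"), (6, "sh3")], the TAS ["sh1", "sh2", "sh3"] occurs under gaps [2, 3] but not under [1, 3].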

Problem Statement
In general: how to go from classified multivariate time series to ready-to-deploy predictive CEP rules?
1. How to go from classified multivariate time series to time-annotated sequences?
   Input: classified multivariate time series -> Contribution 1 -> Output: time-annotated sequences
2. How to transform time-annotated sequences into ready predictive CEP rules?
   Input: time-annotated sequences -> Contribution 2 -> Output: predictive CEP rules

Our Proposal
Two contributions:
1. The USE and SEE algorithms, to extract predictive temporal patterns with time and sequence constraints (TAS)
2. A compiler (autocep) that transforms these patterns on the fly into predictive CEP rules
At learning time: classified multivariate time series D_d -> Univariate Shapelet Extractor (USE) -> shapelets -> SEquence Extractor (SEE) -> TAS -> autocep -> predictive CEP rules. At real time, the generated rules run in the CEP engine.

Step 1: Univariate Shapelet Extractor (USE)
- USE decomposes each classified MTS (multivariate time series) in D_d into its univariate time series
- Parameter-free shapelet learning is run on each univariate time series
- Output: shapelets, passed on to SEE

Shapelet Learning
- Get all subsequences between the minimum and maximum lengths
- For each subsequence: calculate the similarities, the distance threshold and the utility score
Three types of distance measures are supported:
- Euclidean distance (default)
- Dynamic time warping
- MASS (frequency domain) [Yeh, ICDM 2016]
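
Of the three distance measures, dynamic time warping is the least obvious; it can be sketched with the standard dynamic program (a plain illustration of the textbook algorithm, not the authors' implementation):

```python
import math

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-time-warping distance:
    squared point costs accumulated over the cheapest warping path,
    with a final square root."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # advance both
    return math.sqrt(cost[n][m])
```

Unlike the Euclidean distance, DTW tolerates local time shifts: [0, 0, 1] and [0, 1] are at DTW distance 0 because the repeated 0 can be warped away.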

Pruning
- Top-K pruning: normal top-k pruning depending on the utility score
- Cover pruning: sort the shapelets by score; while not all of the data set is covered and there are still shapelets to test, accept a shapelet if it marks at least one uncovered instance, and mark the instances that it covers

Step 2: SEquence Extractor (SEE)
- D_d is encoded with the extracted shapelets
- Sequence learning: the shapelets that constitute a sequence are associated with the same class
- Parameter-free time-constraint learning: the time gaps Δt between consecutive shapelets are learned
- Output: time-annotated sequences (TAS)

Step 3: Automatic Generation of CEP Rules (AutoCEP)
- Each TAS is transformed into a set of simple rules and one complex rule
- A simple rule is a transformation of each individual shapelet into the CEP language
- A complex rule is a transformation of the whole pattern into the CEP language; it captures all the constraints: the sequencing and the time windows between shapelets (i.e., between simple rules)

Simple CEP Rule Generation
- Each shapelet sh = (s, δ, c_s, score) in a TAS is transformed into a CEP rule
- A simple rule matches whenever the received events are similar to the shapelet that it represents, conveyed in CEP terms
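
A sketch of what such a per-shapelet transformation might look like. The EPL-like template, the `NW` named window and the `distToShapelet` helper are assumptions for illustration; the transcript does not show autocep's actual templates.

```python
def simple_rule(stream, shapelet_id, dim, length, delta):
    """Emit an Esper-like simple rule (illustrative template, not
    autocep's actual output): match when the last `length` events on
    dimension `dim` are within distance `delta` of the stored shapelet
    (via the hypothetical distToShapelet helper), and insert the match
    into the named window NW for chaining with the complex rule."""
    return (
        f"INSERT INTO NW "
        f"SELECT '{shapelet_id}' AS sid, current_timestamp AS start "
        f"FROM {stream}(dim={dim}).win:length({length}) "
        f"HAVING distToShapelet('{shapelet_id}', window(value)) <= {delta}"
    )
```

For instance, simple_rule("Sensors", "sh1", 3, 10, 0.5) yields one rule watching dimension 3 with threshold 0.5.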

Complex CEP Rule Generation
- The chaining of simple rules and complex rules is done through CEP named windows
- Input stream (sensors) -> CEP rule 1 -> named window -> CEP rule 2 -> results
- A named window NW is created, along with 3 simple rules that emit their matches to NW
- The complex rule then sequences the matches over the dimensions, e.g.:
  EVERY a = NW(dim=3) -> b = NW(dim=1, start - a.start <= 2) -> c = NW(dim=2, start - b.start <= 3)
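
The chaining step can be sketched as a generator that builds one Esper-like pattern from a TAS and its learned gaps (the `sid` attribute, the `Predictions` output stream and the exact syntax are illustrative assumptions, not the paper's templates):

```python
def complex_rule(tas, gaps, cls):
    """Chain the simple-rule matches sitting in named window NW into
    one Esper-like pattern: the shapelets must arrive in TAS order,
    each consecutive pair within its learned time gap `gaps[i]`."""
    names = [chr(ord('a') + i) for i in range(len(tas))]
    parts = [f"EVERY {names[0]}=NW(sid='{tas[0]}')"]
    for i in range(1, len(tas)):
        parts.append(
            f"{names[i]}=NW(sid='{tas[i]}', "
            f"start - {names[i - 1]}.start <= {gaps[i - 1]})"
        )
    pattern = " -> ".join(parts)
    return f"INSERT INTO Predictions SELECT '{cls}' FROM PATTERN [{pattern}]"
```

For a three-shapelet TAS with gaps [2, 3], this emits the two-arrow pattern mirroring the example above.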

Complete Picture of CEP Rules Generation
(The slide shows the generated named windows, complex CEP rules and simple CEP rules side by side.)

Experiments: Multivariate Time Series
Objectives: quality, performance, interpretability
Data sets: https://archive.ics.uci.edu/
Different variants of classification:
1. Closest classification: classify according to the closest pattern so far
2. First classification: classify according to the first matched pattern
3. Abnormality detection: ignore normal instances
4. Majority voting: check every instance with every rule
Metrics:
- Average f-score (the higher the better)
- Earliness (the lower the better)
- Accuracy (the higher the better)
- Applicability (the higher the better)
- Learning time (the lower the better)
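
Of these metrics, earliness is the least standard; it can be computed as the average fraction of each series that had been observed when the prediction fired (a common definition in early classification; the paper's exact formula is not shown in this transcript):

```python
def earliness(prediction_points, series_lengths):
    """Average percentage of each time series that had been observed
    when the classifier committed to a prediction (lower is earlier).
    `prediction_points[i]` is the number of events seen for series i,
    `series_lengths[i]` its total length."""
    fractions = [p / n for p, n in zip(prediction_points, series_lengths)]
    return 100.0 * sum(fractions) / len(fractions)
```

Predicting after 25 and 50 events on two series of length 100 gives an earliness of 37.5%.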

Closest Classification: Comparison with [REACT, 2015] and [MSD, 2013]

Wafer:
  Approach   Avg. f-score  Earliness  App.   Acc.
  autocep    90.5%         28.7%      100%   92.6%
  REACT      91.9%         32.8%      100%   -
  Full 1NN   87.2%         100%       100%   89.9%

ECG:
  Approach   Avg. f-score  Earliness  App.   Acc.
  autocep    81%           21.2%      100%   82.7%
  REACT      76.7%         10.5%      100%   -
  Full 1NN   87.7%         100%       100%   88.7%
  MSD        58.8%         12.8%      100%   -

Robots:
  Approach   Avg. f-score  Earliness  App.   Acc.
  autocep    76.6%         50%        100%   80.8%
  REACT      72.7%         40.7%      94.7%  -
  Full 1NN   71.9%         100%       100%   79.3%
  MSD        39.6%         27.4%      96.3%  -

All Classification Methods
(The slide shows a bar chart comparing accuracy and earliness on the Wafer and ECG data sets across the four methods: first classification, closest classification, abnormality detection and majority voting.)

Sensitivity of Parameters
Studied on the ECG data set: the lengths (min, max), the distance measure, the pruning, and minacc (SEE).

Learning Time
- Brute force and MASS are used to compute distances; "O" denotes an optimized version with multithreading
- Time complexity: USE(brute) = O(d.n.m^2.log(n)); USE(mass) = O(d.n.log(n)); SEE = O(n.m.d!)
- autocep adds no significant cost: the transformation into rules is straightforward
- Empirical experiments were run on synthetic data

Interpretability of Rules
(The slide shows an example of learned rules on the ECG data and another on the Wafer data.)

Conclusion
- Learning of advanced temporal patterns from multivariate time series with USE & SEE: shapelets adopted in the multivariate setting, a step beyond current state-of-the-art approaches, including sequencing and time constraints
- Automatic learning of predictive CEP rules with autocep: data-driven rules, the first approach to learn predictive rules
- The CEP technology can be employed in predictive contexts without extra complexity
- Future work: more optimization techniques will be integrated

References
[Margara, 2014] A. Margara et al. Learning from the past: automated rule generation for complex event processing. DEBS, 2014.
[Margara, 2013] A. Margara et al. Towards automated rule learning for complex event processing. Tech report, 2013.
[Mutschler, 2012] C. Mutschler et al. Learning event detection rules with noise hidden Markov models. AHS, 2012.
[Sen, 2010] S. Sen et al. An approach for iterative event pattern recommendation. DEBS, 2010.
[Turchin, 2009] Y. Turchin et al. Tuning complex event processing rules using the prediction-correction paradigm. DEBS, 2009.
[Ye, 2009] L. Ye and E. Keogh. Time series shapelets: a new primitive for data mining. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2009.
[Yeh, ICDM 2016] C.-C. M. Yeh et al. Matrix Profile I: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets. ICDM, 2016.

References
[Fulop, 2010] L. J. Fülöp et al. Survey on complex event processing and predictive analytics. Proceedings of the Fifth Balkan Conference in Informatics, 2010.
[Engel, 2012] Y. Engel, O. Etzion and Z. Feldman. A basic model for proactive event-driven computing. Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems. ACM, 2012.
[REACT, 2015] Lin, H.-H. Chen, V. S. Tseng and J. Pei. Reliable early classification on multivariate time series with numerical and categorical attributes. In Advances in Knowledge Discovery and Data Mining, pages 199-211. Springer, 2015.
[MSD, 2013] M. Ghalwash, V. Radosavljevic and Z. Obradovic. Extraction of interpretable multivariate patterns for early diagnostics. In Data Mining (ICDM), 2013 IEEE 13th International Conference on, pages 201-210. IEEE, 2013.
