Automatic Learning of Predictive CEP Rules: Bridging the Gap between Data Mining and Complex Event Processing
Raef Mousheimish, Yehia Taher and Karine Zeitouni
DAVID Laboratory, University of Versailles, France
The 11th ACM International Conference on Distributed and Event-Based Systems
33rd Conference on Data Management: Principles, Technologies and Applications (Gestion de Données: Principes, Technologies et Applications), Nancy, 14-17 November 2017

Context
- CEP helps to react instantly to situations as they occur
- Employed in different domains:
  - Environmental monitoring
  - Fraud detection
  - Financial applications
  - Anomaly detection
- CEP engines are entirely driven by CEP rules
  - Rules are the only inference mechanism in the CEP world
[Figure: events and CEP rules feed the CEP engine, which emits composite events, results, and alerts]

Examples of Rules
Simple CEP rule:
    SELECT * FROM WE.win:time(2 minutes) HAVING avg(temperature) > 10
Complex CEP rule:
    CREATE WINDOW tempwin.win:keepall() (place String, avg int)
    INSERT INTO tempwin SELECT * FROM WE.win:time(2 minutes) HAVING avg(temperature) > 10
    SELECT * FROM PATTERN [EVERY a = tempwin(avg > 15) -> b = tempwin(avg > a.avg)]

Motivation
- Event-based systems are an active area of research
  - Scalability
  - Latency
  - Distribution
- Almost all scenarios and problems tackled by researchers are reactive in nature
  - Anomaly, traffic jam, and fraud detection
- What if an anomaly needs to be predicted?
  - E.g., in a manufacturing process
- How to use CEP in such cases is far from obvious

Predictive CEP
- Mentioned several times as a future direction, and proposed at the conceptual level [Fulop, 2010] [Engel, 2012]
- However, it remains a vision
  - No real attempt to produce an easy-to-use predictive CEP system
- The main cause: CEP rules need to be specified manually
  - No support is provided for users to define these rules
  - It is hard for experts to manually write rules that predict situations
  - This is why fields such as data mining and predictive analytics exist in the first place
- There is a gap between predictive analytics and CEP that needs to be bridged

Our Objectives
- Move beyond the de facto approach to defining CEP rules
  - From manual to automatic
- Go from merely reactive rules to predictive rules
- Allow CEP technology to be easily employed in predictive applications
- Create a generic solution that can be used in different domains
- Bridge the gap between data mining and complex event processing

Outline
- Context & Motivation
- High Level Goals
- Objectives
- Contributions
- Univariate Shapelet Extraction
- Sequence Extraction
- Automatic Learning of Predictive CEP Rules
- Evaluations
- Conclusion

Early Classification on Time Series
- A predictive analytics field that fits our goal of a generic, predictive approach
  - Time series correspond to timestamped events
  - Early classification corresponds to predictive rules
- Shapelet-based classification:
  - Introduced in 2009 [Ye, SIGKDD 2009]
  - Shapelets are temporal patterns that are associated with classes
  - By definition, it works on univariate time series
  - Suitable for anomaly prediction

Definitions
[Figure: real artwork transport data with the extracted shapelets]
- A shapelet $sh = (s, \delta, c_s, \mathit{score})$
  - $s$ is a subsequence of a time series
  - $\delta$ is the distance threshold
  - $c_s$ is the class of the shapelet
  - $\mathit{score}$ is a utility score
- The best shapelets are:
  1. Short (smallest length)
  2. Frequent
  3. Discriminative

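A minimal Python sketch of the shapelet tuple and of what "a series contains the shapelet" can mean. It assumes plain Euclidean distance over a sliding window and no normalization; the names `Shapelet`, `min_distance` and `matches` are illustrative and do not come from the slides.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Shapelet:
    s: tuple        # the subsequence
    delta: float    # distance threshold
    cls: str        # class c_s associated with the shapelet
    score: float    # utility score


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_distance(series, shapelet):
    """Smallest distance between the shapelet and any window of the series."""
    m = len(shapelet.s)
    return min(
        euclidean(series[i:i + m], shapelet.s)
        for i in range(len(series) - m + 1)
    )


def matches(series, shapelet):
    """ASSUMPTION: a match means some window lies within delta of s."""
    if len(series) < len(shapelet.s):
        return False
    return min_distance(series, shapelet) <= shapelet.delta
```

For example, `matches([0, 0, 1, 2, 3.2, 0], Shapelet((1.0, 2.0, 3.0), 0.5, "abnormal", 0.9))` holds because the window `[1, 2, 3.2]` lies within the threshold.
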
Definitions: Multivariate Setting
- A multi-dimensional time series (MTS): $dt = \{T_1, T_2, \ldots, T_d\}$
- Data set of d-dimensional time series: $D_d = \{(dt, c)\}$
- A TAS (Time-Annotated Sequence) is a sequence of shapelets with time annotations between them, e.g. $sh_1 \xrightarrow{2} sh_2 \xrightarrow{3} sh_3$
- A TAS is associated with a class: $sh_1 \rightarrow sh_2 \rightarrow sh_3 \Rightarrow c$
  - If the TAS appears, then $c$ will probably occur

Problem Statement
In general: how to go from classified multivariate time series to ready-to-deploy predictive CEP rules?
1. How to go from classified multivariate time series to time-annotated sequences? (Contribution 1)
   - Input: classified multivariate time series
   - Output: time-annotated sequences
2. How to transform time-annotated sequences into ready-to-deploy predictive CEP rules? (Contribution 2)
   - Input: time-annotated sequences
   - Output: predictive CEP rules

Our Proposal
Two contributions:
1. The USE and SEE algorithms, which extract predictive temporal patterns with time and sequence constraints (TAS)
2. A compiler (autocep) that transforms these patterns on the fly into predictive CEP rules
[Figure: pipeline. Learning time: classified MTS -> Univariate Shapelet Extractor (USE) -> shapelets -> SEquence Extractor (SEE) -> TAS records. Real time: autocep transforms TAS into predictive CEP rules for the CEP engine.]

Step 1: Univariate Shapelet Extractor (USE)
[Figure: pipeline overview with the USE stage highlighted]
[Figure: the MTS (multivariate time series) is split into univariate time series; parameter-free shapelet learning is applied to each of them and yields shapelets]

Shapelet Learning
- Get all subsequences between the minimum and maximum lengths
- For each subsequence:
  - Calculate similarities
  - Calculate the distance threshold
  - Calculate the utility score
- Three types of distance measures:
  - Euclidean distance (default)
  - Dynamic time warping
  - MASS (frequency domain) [Yeh, ICDM 2016]

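The loop above can be sketched in Python as follows. Candidate enumeration and the Euclidean distance follow the slide; the threshold and score rules in `evaluate` are placeholder assumptions chosen only to make the sketch complete (the slides do not state how USE computes them), and all function names are illustrative.

```python
from math import sqrt


def subsequences(series, min_len, max_len):
    """Yield every contiguous subsequence with min_len <= length <= max_len."""
    for length in range(min_len, max_len + 1):
        for start in range(len(series) - length + 1):
            yield tuple(series[start:start + length])


def min_distance(series, candidate):
    """Smallest Euclidean distance between the candidate and any window."""
    m = len(candidate)
    if len(series) < m:
        return float("inf")
    return min(
        sqrt(sum((x - y) ** 2 for x, y in zip(series[i:i + m], candidate)))
        for i in range(len(series) - m + 1)
    )


def evaluate(candidate, cls, dataset):
    """Return (delta, score), or None if the candidate cannot separate cls.

    ASSUMPTION (placeholder rules, not taken from the slides):
    delta is the largest same-class distance that stays below the nearest
    other-class distance; score is the fraction of same-class series
    that fall within delta.
    """
    same = [min_distance(s, candidate) for s, c in dataset if c == cls]
    other = [min_distance(s, candidate) for s, c in dataset if c != cls]
    limit = min(other, default=float("inf"))
    inside = [d for d in same if d < limit]
    if not inside:
        return None
    return max(inside), len(inside) / len(same)
```

A learner would call `evaluate` on every candidate produced by `subsequences` for each univariate series and keep the best-scoring ones.
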
Pruning
- Top-k pruning
  - Standard top-k pruning based on the utility score
- Cover pruning
  - Sort the shapelets
  - While the data set is not fully covered and there are still shapelets to test:
    - Mark the instances that the shapelet covers
    - If marked > 0, accept the shapelet; otherwise discard it

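Cover pruning can be sketched as a greedy loop. This is an illustrative reading of the flowchart, assuming that shapelets are sorted by decreasing utility score and that "marked" counts the instances not already covered by a previously accepted shapelet; the tuple layout of `candidates` is hypothetical.

```python
def cover_prune(candidates, n_instances):
    """Greedy cover pruning.

    candidates: list of (name, score, covered), where covered is the set of
    instance indices that the shapelet matches (hypothetical layout).
    Returns the names of the accepted shapelets, best score first.
    """
    uncovered = set(range(n_instances))
    accepted = []
    # ASSUMPTION: candidates are tested in decreasing order of utility score.
    for name, _score, covered in sorted(
        candidates, key=lambda c: c[1], reverse=True
    ):
        if not uncovered:
            break  # the whole data set is covered
        newly_marked = covered & uncovered
        if newly_marked:  # marked > 0: accept
            accepted.append(name)
            uncovered -= newly_marked
    return accepted
```

With candidates `a` (covers 0 and 1), `b` (covers 1), `c` (covers 2) and `d` (covers 0 and 2) in decreasing score order, only `a` and `c` are kept: `b` adds nothing new, and `d` is never tested because the data set is already covered.
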
Step 2: SEquence Extractor (SEE)
[Figure: pipeline overview with the SEE stage highlighted]
- Input: the encoded data set
- Sequence learning produces sequences
  - The shapelets that constitute a sequence are associated with the same class
- Parameter-free time constraint learning produces time-annotated sequences (with time annotations such as $\Delta t_1$)

Step 3: Automatic Generation of CEP Rules (autocep)
[Figure: pipeline overview with the autocep stage highlighted]
- Each TAS is transformed into a set of simple rules and one complex rule
- A simple rule is the translation of an individual shapelet into the CEP language
- A complex rule is the translation of the whole pattern into the CEP language
  - It captures all constraints: the sequence and the time windows between shapelets (i.e., between simple rules)
[Figure: a TAS; each shapelet is transformed into a simple rule, and the whole sequence with its time annotations (e.g., $\Delta t_2$) into a complex rule]

Simple CEP Rule Generation
- Each shapelet in a TAS is transformed into a CEP rule
- A simple rule matches whenever the received events are similar to the shapelet that it represents
- This is expressed in CEP terms for a shapelet $sh = (s, \delta, c_s, \mathit{score})$
[Figure: the CEP rule generated for $sh$, next to a TAS whose shapelets are transformed into simple rules and whose whole sequence is transformed into a complex rule]

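The behaviour of a simple rule can be emulated in plain Python. This is a sketch of the matching semantics only, not the CEP rule that autocep generates: it assumes a sliding window of the shapelet's length over one dimension, Euclidean distance, and a match record with `dim`, `start` and `end` fields, all of which are illustrative choices.

```python
from collections import deque
from math import sqrt


class SimpleRule:
    """Emulates a simple rule for one shapelet on one dimension."""

    def __init__(self, dim, subsequence, delta):
        self.dim = dim
        self.s = tuple(subsequence)
        self.delta = delta
        # Keeps only the last len(s) events as (timestamp, value) pairs.
        self.window = deque(maxlen=len(self.s))

    def on_event(self, timestamp, value):
        """Feed one event; return a match record or None."""
        self.window.append((timestamp, value))
        if len(self.window) < len(self.s):
            return None  # not enough events yet
        dist = sqrt(
            sum((v - x) ** 2 for (_, v), x in zip(self.window, self.s))
        )
        if dist <= self.delta:
            # ASSUMPTION: a match reports where the similar window starts/ends.
            return {"dim": self.dim, "start": self.window[0][0], "end": timestamp}
        return None
```

Feeding the values 0, 1, 2, 3, 9 at timestamps 0 to 4 into `SimpleRule(1, (1, 2, 3), 0.5)` yields a single match, for the window that starts at timestamp 1 and ends at timestamp 3.
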
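Complex CEP Rule Generation
- Simple rules and complex rules are chained through CEP named windows
[Figure: input stream (sensors) -> CEP engine: CEP rule 1 listens to the stream and outputs to a named window; CEP rule 2 listens to the named window and outputs the results]
- Example for a TAS over three dimensions (Dim 1, Dim 2, Dim 3) with time annotations 2 and 3:
  - A named window NW is created
  - 3 simple rules are created; they emit their matches to NW
  - Complex rule: for every a = nw(dim=3) followed by b = nw(dim=1) followed by c = nw(dim=2), where b.start is constrained relative to a.start by the annotation 2, and c.start relative to b.start by the annotation 3
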
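The effect of a complex rule on the named window can be sketched in Python. The sketch assumes that each time annotation is an upper bound on the difference between the start times of consecutive matches; the exact comparison used in the generated rule is not recoverable from the slide, so this reading is an assumption, as are the function name and the record layout.

```python
def complex_rule_matches(named_window, steps):
    """Check whether the named window contains the annotated sequence.

    named_window: list of match records {"dim": int, "start": number},
                  in arrival order (hypothetical layout).
    steps: list of (dim, max_gap); max_gap is None for the first step.
    ASSUMPTION: 0 <= next.start - previous.start <= max_gap.
    """

    def search(step_idx, prev_start, pos):
        if step_idx == len(steps):
            return True  # every step of the sequence was matched
        dim, max_gap = steps[step_idx]
        for j in range(pos, len(named_window)):
            record = named_window[j]
            if record["dim"] != dim:
                continue
            if step_idx > 0:
                gap = record["start"] - prev_start
                if gap < 0 or gap > max_gap:
                    continue
            if search(step_idx + 1, record["start"], j + 1):
                return True
        return False

    return search(0, None, 0)
```

For the example on the slide, the steps would be `[(3, None), (1, 2), (2, 3)]`: a match on dimension 3, then one on dimension 1 within 2 time units, then one on dimension 2 within 3 time units.
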
Complete Picture of CEP Rules Generation
[Figure: the generated named windows, complex CEP rules, and simple CEP rules]

Experiments: Multivariate Time Series
- Objectives: quality, performance, interpretability
- Variants of classification:
  1. Closest classification: classify according to the closest pattern so far
  2. First classification: classify according to the first matched pattern
  3. Abnormality detection: ignore normal instances
  4. Majority voting: check every instance against every rule
- Metrics:
  - Average F-score (higher is better)
  - Earliness (lower is better)
  - Accuracy (higher is better)
  - Applicability (higher is better)
  - Learning time (lower is better)
- Data source: https://archive.ics.uci.edu/

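The slide names the metrics without defining them. The Python sketch below uses definitions that are common in early classification and are assumptions here: earliness as the average fraction of a series observed before a decision, average F-score as the unweighted mean of the per-class F1 scores, and applicability as the fraction of instances that receive any prediction (`None` meaning that no rule matched).

```python
def earliness(decision_lengths, full_lengths):
    """Average fraction of each series that was read before classifying it."""
    ratios = [d / n for d, n in zip(decision_lengths, full_lengths)]
    return sum(ratios) / len(ratios)


def average_f_score(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def applicability(y_pred):
    """Fraction of instances for which some rule produced a prediction."""
    return sum(p is not None for p in y_pred) / len(y_pred)
```

Under these definitions, a classifier that reads 20 and 50 points of two 100-point series has an earliness of 0.35, which is why lower values are better.
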
Closest Classification: Comparison
Compared with [REACT, 2015] and [MSD, 2013].

Wafer
Approach   Avg. F-score   Earliness   App.     Acc.
autocep    90.5%          28.7%       100%     92.6%
REACT      91.9%          32.8%       100%     -
Full 1NN   87.2%          100%        100%     89.9%

ECG
Approach   Avg. F-score   Earliness   App.     Acc.
autocep    81%            21.2%       100%     82.7%
REACT      76.7%          10.5%       100%     -
Full 1NN   87.7%          100%        100%     88.7%
MSD        58.8%          12.8%       100%     -

Robots
Approach   Avg. F-score   Earliness   App.     Acc.
autocep    76.6%          50%         100%     80.8%
REACT      72.7%          40.7%       94.7%    -
Full 1NN   71.9%          100%        100%     79.3%
MSD        39.6%          27.4%       96.3%    -

All Classifications
[Figure: bar chart of accuracy and earliness on the Wafer and ECG data sets for the four classification methods: first classification, closest classification, abnormality detection, and majority voting]

Sensitivity of Parameters
- Lengths (min, max)
- Distance measure
- Pruning
- minacc (SEE)
[Figure: parameter sensitivity on the ECG data set]

Learning Time
- Brute force and MASS are used to compute distances
- "O" denotes an optimized version with multithreading
- Time complexity:
  - USE (brute) = $O(d \cdot n \cdot (m^2 \cdot \log n))$
  - USE (MASS) = $O(d \cdot n \cdot \log n)$
  - SEE = $O(n \cdot m \cdot d!)$
  - autocep: no time complexity is given; TAS are transformed into rules on the fly
- Empirical experiments with synthetic data

Interpretability of Rules
[Figures: ECG data example; Wafer data example]

Conclusion
- Learning advanced temporal patterns from multivariate time series with USE and SEE
  - Adopts shapelets in the multivariate setting
  - A step beyond current state-of-the-art approaches
  - Includes sequencing and time constraints
- Automatic learning of predictive CEP rules with autocep
  - Learns data-driven rules
  - First approach to learn predictive rules
  - Employs CEP technology in predictive contexts without added complexity
- Further optimization techniques will be integrated

References
[Margara, 2014] Margara et al. Learning from the past: automated rule generation for complex event processing. DEBS, 2014.
[Margara, 2013] Margara et al. Towards automated rule learning for complex event processing. Tech Report, 2013.
[Mutschler, 2012] Mutschler et al. Learning event detection rules with noise hidden Markov models. AHS, 2012.
[Sen, 2010] Sen et al. An approach for iterative event pattern recommendation. DEBS, 2010.
[Turchin, 2009] Turchin et al. Tuning complex event processing rules using the prediction-correction paradigm. DEBS, 2009.
[Ye, 2009] Ye, Lexiang, and Eamonn Keogh. Time series shapelets: a new primitive for data mining. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2009.
[Fulop, 2010] Fülöp, Lajos Jenő, et al. Survey on complex event processing and predictive analytics. Proceedings of the Fifth Balkan Conference in Informatics. 2010.
[Engel, 2012] Engel, Yagil, Opher Etzion, and Zohar Feldman. A basic model for proactive event-driven computing. Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems. ACM, 2012.
[REACT, 2015] Lin, H.-H. Chen, V. S. Tseng, and J. Pei. Reliable early classification on multivariate time series with numerical and categorical attributes. In Advances in Knowledge Discovery and Data Mining, pages 199-211. Springer, 2015.
[MSD, 2013] Ghalwash, V. Radosavljevic, and Z. Obradovic. Extraction of interpretable multivariate patterns for early diagnostics. In Data Mining (ICDM), 2013 IEEE 13th International Conference on, pages 201-210. IEEE, 2013.