ediating Between odeled and Observed Behavior: The Quest for the "Right" Process prof.dr.ir. Wil van der Aalst
Outline Introduction to Process ining (short) ediating between a reference model and observed behavior (model repair) The 4+ dimensions of conformance Importance of alignments to relate observed and modeled behavior Representational bias Discovering configurable process models Decomposing process mining problems to deal with Big Data PAGE 1
Positioning Process ining performance-oriented questions, problems and solutions process model analysis (simulation, verification, etc.) process mining data-oriented analysis (data mining, machine learning, business intelligence) compliance-oriented questions, problems and solutions 2
oore's Law 2013 2060 3
Three Types of Process ining world business processes people machines components organizations models analyzes supports/ controls specifies configures implements analyzes software system records events, e.g., messages, transactions, etc. (process) model discovery conformance enhancement event logs
Play-Out process model event log PAGE 5
Play-Out (Classical use of models) B A p1 E p3 D start end p2 C p4 A B C D A E D A C B D A B C D A E D A C B D A E D A C B D PAGE 6
Play-In event log process model PAGE 7
Play-In A B C D A C B D A E D A E D A B C D A C B D A E D A C B D B A p1 E p3 D start end p2 C p4 PAGE 8
Example Process Discovery (Vestia, Dutch housing agency, 208 cases, 5987 events) PAGE 9
Example Process Discovery (ASL, test process lithography systems, 154966 events) PAGE 10
Example Process Discovery (AC, 627 gynecological oncology patients, 24331 events) PAGE 11
Replay event log process model extended model showing times, frequencies, etc. diagnostics predictions recommendations PAGE 12
Replay A B C D B A p1 E p3 D start end p2 C p4 PAGE 13
Replay A E D B A p1 E p3 D start end p2 C p4 PAGE 14
Replay can detect problems A C D Problem! token left behind B Problem! missing token A p1 E p3 D start end p2 C p4 PAGE 15
Conformance Checking (WOZ objections Dutch municipality, 745 objections, 9583 event, f= 0.988) PAGE 16
Replay can extract timing information A 5 B 8 C 9 D 13 4 5 3 6 8 B 2 5 7 8 A p1 E p3 D start 5 4 p2 3 3 7 C 9 p4 4 6 4 7 13 end PAGE 17
Performance Analysis Using Replay (WOZ objections Dutch municipality, 745 objections, 9583 event, f= 0.988) PAGE 18
Hundreds of plug-ins available covering the whole process mining spectrum open-source (L-GPL) Download from: www.processmining.org PAGE 19
Commercial Alternatives Disco (Fluxicon) Perceptive Process ining (before Futura Reflect and BP one) ARIS Process Performance anager QPR ProcessAnalyzer Interstage Process Discovery (Fujitsu) Discovery Analyst (StereoLOGIC) XAnalyzer (XPro) 20
Three Key Observations PAGE 21
#1 Alignments are essential! conformance checking to diagnose deviations squeezing reality into the model to do model-based analysis move on model move on log move on model (harmless) move on model PAGE 22
#2 odels are like the glasses required to see and understand event data! PAGE 23
#3 Process models as maps: Breathing life into models PAGE 24
Alignments Joint work with Arya Adriansyah, Boudewijn van Dongen, Elham Ramezani, Dirk Fahland, assimiliano de Leoni, et al. PAGE 25
Replaying trace abeg a b e g b examine thoroughly g start a register request r=1 c examine casually d check ticket e decide m=1 f reinitiate request pay compensation h reject request end 1 6 1 6 = 0.83333 26
From playing the token game to optimal alignments observed trace: abeg a b» e g a b d e g b move in model only start a register request examine thoroughly c examine casually d check ticket decide f e reinitiate request g pay compensation h reject request end 27
Another alignment observed trace: abcdeg a b c d e g a b» d e g b examine thoroughly g move in log only start a register request c examine casually d check ticket decide f e reinitiate request pay compensation h reject request end 28
oves have costs a» a a» a a b Standard cost function: c(x,») = 1 c(»,y) = 1 c(x,y) = 0, if x=y c(x,y) =, if x y 29
Any cost structure is possible send-letter(john,2 weeks, $400) send-email(sue,3 weeks,$500) Similar activities (more similarity implies lower costs). Resource conformance (done by someone that does not have the specified role). Data conformance (path is not possible for this customer). Time conformance (missed the legal deadline) 30
Using Alignments An optimal alignment has the lowest possible costs. If multiple alignments are optimal, pick one or use all. Like an "oracle" revealing paths in the model. These paths can be used for further analysis! Can be used for quantifying various types conformance (possibly using a different cost function). PAGE 31
Some pointers Wil. P. van der Aalst, Arya Adriansyah, Boudewijn F. van Dongen: Replaying history on process models for conformance checking and performance analysis. Wiley Interdisc. Rew.: Data ining and Knowledge Discovery 2(2): 182-192 (2012) Arya Adriansyah, Jorge unoz-gama, Josep Carmona, Boudewijn F. van Dongen, Wil. P. van der Aalst: Alignment Based Precision Checking. Business Process anagement Workshops 2012: 137-149 Arya Adriansyah, Boudewijn F. van Dongen, Wil. P. van der Aalst: Conformance Checking Using Cost-Based Fitness Analysis. EDOC 2011: 55-64 assimiliano de Leoni, Wil. P. van der Aalst, Boudewijn F. van Dongen: Dataand Resource-Aware Conformance Checking of Business Processes. BIS 2012: 48-59 Joos C. A.. Buijs, Boudewijn F. van Dongen, Wil. P. van der Aalst: On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery. OT Conferences (1) 2012: 305-322 Elham Ramezani, Dirk Fahland, Wil. P. van der Aalst: Where Did I isbehave? Diagnostic Information in Compliance Checking. BP 2012: 262-278 A. Adriansyah, B.F. van Dongen, W..P. van der Aalst. emory-efficient Alignment of Observed and odeled Behavior. BP Center Report BP-13-03, BPcenter.org, 2013 PAGE 32
Conformance: Taking a Step Back PAGE 33
Conventional Conformance Notions fitness lift ability to explain observed behavior avoiding overfitting precision thrust generalization Process ining Occam s Razor simplicity avoiding underfitting drag gravity Leaving out one of these dimensions during discovery will lead to degenerate cases! PAGE 34
Problem real process is unknown process discovery event logs covers only a fraction of all possible behavior process discovery real process unknown record conformance checking event data only examples conformance checking process model model needs to provide an abstraction: urphy's Law of Process ining PAGE 35
Traditional Notions such as Precision and Recall do NOT Apply ideal or desired model based on perfect knowledge of real process event log descriptive or normative model (man-made or discovered) 0 regular behavior in log not covered by model FN 1 TP L regular behavior in log covered by model Problem I: event log does not provide information about the whole universe of traces only a selected part 0 TN FP exceptional behavior in log covered by model Problem II: in practice it is unclear where this line is exceptional behavior in log not covered by the model PAGE 36
Operationalizing the Four Conformance dimensions avoiding overfitting thrust generalization fitness Process ining Occam s Razor gravity lift ability to explain observed behavior simplicity precision drag avoiding underfitting Fitness (fraction of observed behavior possible according to the model). easure at the case or event level. How to continue after a deviation (where to put the "blame"), cf. duplicate and silent activities. Precision (avoiding underfitting; fraction of allowed behavior never observed). Log only contains examples. etrics are e.g. based on escaping edges. Generalization (avoiding overfitting; probability that the next unseen case will not fit). Reasoning about unseen behavior, strongly related to log completeness. Simplicity (Occam's Razor: the simplest of two or more competing theories is preferable) Easy to operationalize (e.g., number of nodes or arcs). Often subjective (valuation of AND/XOR/OR-split/joins). PAGE 37
Some pointers Wil. P. van der Aalst: ediating Between odeled and Observed Behavior: The Quest for the Right Process. Seventh IEEE International Conference on Research Challenges in Information Science (RCIS 2013), (2013) Wil. P. van der Aalst, Arya Adriansyah, Boudewijn F. van Dongen: Replaying history on process models for conformance checking and performance analysis. Wiley Interdisc. Rew.: Data ining and Knowledge Discovery 2(2): 182-192 (2012) Joos C. A.. Buijs, Boudewijn F. van Dongen, Wil. P. van der Aalst: On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery. OT Conferences (1) 2012: 305-322 Wil. P. van der Aalst: Process ining - Discovery, Conformance and Enhancement of Business Processes. Springer 2011, isbn 978-3-642-19344-6, pp. I-XVI, 1-352 PAGE 38
Representational Bias in Process ining (not about visualization) Joint work with Joos Buijs, Sander Leemans, and Boudewijn van Dongen. PAGE 39
Typical Representational Bias (Labeled) Petri Nets, WFnets, etc. Subsets of BPN diagrams, UL Activity Diagrams, Event-Driven Process Chains (EPCs), YAWL, etc. Transition Systems (Hidden) arkov odels [start] register request start start start register request [c1,c2] register request examine thoroughly examine casually check ticket a register request XOR [c2,c3] [c1,c4] OR AND check ticket c1 reinitiate request examine casually examine thoroughly [c5] c2 e1 e2 e3 decide [c3,c4] b examine thoroughly c examine casually d check ticket pay compensation examine thoroughly reject request examine casually check ticket examine thoroughly examine casually check ticket [end] c3 c4 reinitiate request OR AND start register request e decide f decide reinitiate request e4 decide XOR c1 c2 c5 OR-split e5 e6 examine thoroughly examine casually pay compensation pay compensation reject request pay compensation check ticket OR-join reject request decide new information c3 g h reject request end pay compensation reject request end end end end PAGE 40
Huge Search Space When Discovering a Petri Net, BPN model, and the like PAGE 41
with just a few interesting candidates PAGE 42
Alternative Representational Bias 1. C-nets (XOR/AND/ORsplit/join graphs; more likely to be sound due to declarative semantics). 2. Declare models (constraint based, grounded in LTL; anything is possible unless forbidden) 3. Process Trees (similar to subsets of various process algebras; sound by structure) a register request today's focus b examine thoroughly c examine casually d check ticket f reinitiate request a e decide register request g pay compensation h reject request z end pay compensation g h reject request e decide PAGE 43
Another Representational Bias: Process Trees A Always sound because of the block structure Also Loop and OR operator A X D E B C D B C A B A (BA)* E PAGE 58
Petri Net Semantics (used for comparison and conformance checking only) A B A Sequence A Parallellism B Exclusive Choice B A A C Loop B Or Choice B PAGE 59
and BPN. Sequence Or Choice Exclusive Choice Loop Parallellism PAGE 60
A Discovery Algorithm Using Process Trees: Evolutionary Tree iner (ET) Process trees as representation (= limit search space to "good" models). Genetic approach (= very flexible) Fitness function uses all four criteria (= seamlessly balance the different "forces") Change Population No Create Initial Population easure Quality Stop? Yes Return Best Individual PAGE 61
Population Change Elite Population i Population i+1 Selection Replace Crossover utation PAGE 62
Example B E A C G D F A = send e-mail, B = check credit, C = calculate capacity, D = check system, E = accept, F = reject, G = send e-mail PAGE 64
Conventional Algorithms (1/3) ("best effort" mapping to process trees to allow for comparison) alpha miner ILP miner low fitness language-based region miner low precision low fitness PAGE 65
Conventional Algorithms (2/3) heuristic miner multi-phase miner PAGE 66
Conventional Algorithms (3/3) genetic miner state-based region miner PAGE 67
Often unsound result and no mechanism to seamlessly balance the four criteria PAGE 68
Genetic ining (ET) While Considering Only One Criterion best value possible for this log ET with weight zero to three out of four perspectives. PAGE 69
Considering Replay Fitness and One Other Criterion ET with weight zero to two out of four perspectives. PAGE 70
Considering 3 of 4 Criteria replay fitness needs to have a larger weight PAGE 71
Considering All Four Criteria with Emphasis on Fitness fitness has weight 10 PAGE 72
Initial odel Versus Discovered odel simulated discovered by ET B E A C G D F PAGE 73
1) Carefully choose your representational bias during discovery: Unrelated to presentation/visualization! 2) Consider all conformance dimensions (replay fitness, precision, generalization, and simplicity)! PAGE 74
Some pointers Wil. P. van der Aalst, Arya Adriansyah, Boudewijn F. van Dongen: Causal Nets: A odeling Language Tailored towards Process Discovery. CONCUR 2011: 28-42 Joos C. A.. Buijs, Boudewijn F. van Dongen, Wil. P. van der Aalst: On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery. OT Conferences (1) 2012: 305-322 Joos C. A.. Buijs, Boudewijn F. van Dongen, Wil. P. van der Aalst: A genetic algorithm for discovering process trees. IEEE Congress on Evolutionary Computation 2012: 1-8 S.J.J. Leemans, D. Fahland, W..P. van der Aalst. Discovering Block-Structured Process odels From Event Logs A Constructive Approach. BP Center Report BP-13-06, BPcenter.org, 2013 PAGE 75
ediating Between a Reference odel and Real Observed Behavior Joint work with Joos Buijs, Boudewijn van Dongen, and Dirk Fahland. PAGE 76
Compromise Based on Two ain Forces b examine thoroughly g start a register request c examine casually d check ticket decide f e reinitiate request pay compensation h reject request end fitness lift ability to explain observed behavior Various techniques to compare graphs, e.g., edit distance notions (add, remove, replace). thrust avoiding overfitting generalization Process ining Occam s Razor simplicity avoiding underfitting precision drag gravity PAGE 77
Force Between Reference odel and Candidate odel Can be Viewed as a 5 th Conformance Notion Replay Fitness Simplicity Optimal Process odel Similarity Boundary Candidate Process odel Reference Process odel Precision Generalization PAGE 78
Example From CoSeLoG Project PAGE 79
Some pointers J.C.A.. Buijs,. La Rosa, H.A. Reijers, B.F. van Dongen, and W..P. van der Aalst: Improving Business Process odels using Observed Behavior, SIPDA 2012 post-proceedings, Lecture Notes in Business Information Processing, 2013. Joos C. A.. Buijs, Boudewijn F. van Dongen, Wil. P. van der Aalst: On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery. OT Conferences (1) 2012: 305-322 Dirk Fahland, Wil. P. van der Aalst: Repairing Process odels to Reflect Reality. BP 2012: 229-245. Dirk Fahland, Wil. P. van der Aalst: Simplifying discovered process models in a controlled manner. Inf. Syst. 38(4): 585-605 (2013). PAGE 80
ining Configurable Process odels Joint work with Joos Buijs, Boudewijn van Dongen, and Florian Gottschalk. PAGE 81
Two variants of the same process PAGE 82
Configurable Process odel PAGE 83
Variants of the same process d a b g h f c d a b g h e c f d a g h e f PAGE 84
Approach 1: erge Separately ined odels event log 1 Process model 1 pure modelbased merging C1 event log 2 Process model 2 C2... pure modelbased configuration event log n Process model n Cn "normal" discovery Step 1: Process ining Step 2a: Process odel erging Configurable Process model Step 2b: Process Configuration PAGE 85
Results Approach 1 for running example PAGE 86
Results Approach 1 for running example poor generalization poor similarity PAGE 87
Approach 2 pure modelbased merging event log 1 Process model 1 C1 event log 2 Process model 2 C2... event log n Process model n Cn Step 1a: erge Event Logs erged event log Step 1b: Process ining Common Process model Step 1c: Process Individualization Step 1d: Process odel erging Configurable Process model Step 2: Process Configuration discovery based on whole event log discovery based on local event log and common model (edit distance as 5 th dimension) pure modelbased configuration PAGE 88
Results Approach 2 for running example PAGE 89
Results Approach 2 for running example poor generalization poor similarity PAGE 90
Approach 3 event log 1 discovery based on whole event log configuration based on common model and local event log (always limiting behavior, not extending it) C1 event log 2 C2... event log n Cn Step 1a: erge Event Logs erged event log Step 1b: Process ining Configurable Process model Step 2: Process Configuration PAGE 91
Results Approach 3 for running example better generalization better similarity PAGE 92
Approach 4 discovery of the process model and the configuration is combined into one algorithm (configuration costs are added to overall fitness) event log 1 C1 event log 2 C2... event log n Cn Step 1&2: Process ining & Process Configuration Configurable Process model PAGE 93
Results Approach 4 for running example small drop in replay fitness better simplicity even better generalization even better similarity PAGE 94
Genetic algorithms are versatile and can consider different forces at the same time PAGE 95
but often not fast enough PAGE 96
Some pointers J.C.A.. Buijs, B.F. van Dongen, and W..P. van der Aalst: ining Configurable Process odels from Collections of Event Logs, under review, 2013. Florian Gottschalk, Teun A. C. Wagemakers, onique H. Jansen-Vullers, Wil. P. van der Aalst, arcello La Rosa: Configurable Process odels: Experiences from a unicipality Case Study. CAiSE 2009: 486-500 Florian Gottschalk, Wil. P. van der Aalst, onique H. Jansen- Vullers: erging Event-Driven Process Chains. OT Conferences (1) 2008: 418-426 Wil. P. van der Aalst: Business Process Configuration in the Cloud: How to Support and Analyze ulti-tenant Processes? ECOWS 2011: 3-10 PAGE 97
Decomposing Process ining Problems Joint work with Eric Verbeek, Jorge unoz, and Josep Carmona. PAGE 98
Big Data: Opportunities and Challenges PAGE 99
System Net f t7 c7 t8 c8 t11 start a t1 Labeled Petri net (P,T,F,l) c1 c2 t2 b t3 c t4 a = register request b = examine file c = check ticket d = decide e = reinitiate request f = send acceptance letter g = pay compensation h = send rejection letter Silent transitions and visible transitions (unique or not), One initial marking init, one final marking final c3 c4 e t6 d t5 c5 c6 g t9 h t10 c9 end PAGE 100
Traces and Logs A system net SN has a set of possible visible traces ϕ(sn) starting in init and ending final only showing the visible steps. An event log L is a multiset of traces. Two main process mining problems: 1. Conformance checking: Given L and SN, evaluate the "conformance" (e.g., fitness, precision, generalization, etc.) of L and ϕ(sn) 2. Process discovery: Given L, create SN such that the conformance of L and ϕ(sn) is "as good as possible" PAGE 101
Valid Decomposition f t2 start a t1 c1 c2 t2 b t3 c t4 c3 c4 t7 d t5 e t6 c5 c7 c6 t8 g t9 h t10 c8 c9 t11 end start a t1 t7 c1 c2 c7 b t3 c t4 f t8 c3 c4 c8 d t5 e t6 t11 g c6 t9 c9 Union of subnets is original net No shared places d t5 c5 h t10 e t6 Shared transitions are visible and have unique label end PAGE 102
Another Decomposition f t7 c7 t8 c8 t11 start a t1 c1 c2 t2 b t3 c t4 c3 c4 d t5 e t6 c5 c6 g t9 h t10 c9 end start a t1 t2 a t1 c2 c t4 c4 e d t5 a t1 c1 b t3 c3 d t5 t6 f e t7 c7 t8 c8 t11 t6 g c6 t9 c9 d Requirement t5 c5 h end Union of subnets is original net e t10 No shared places t6 Shared transitions are visible and unique PAGE 103
aximal Valid Decomposition f t7 c7 t8 c8 t11 start a t1 c1 c2 t2 b t3 c t4 c3 c4 d t5 e c5 c6 g t9 h t10 c9 end t6 SN 1 start a t1 SN 2 a t1 c1 t2 b t3 c3 d t5 e t6 SN 5 t7 d t5 e c5 c7 c6 f t8 g t9 h t10 f t8 g t9 h t10 c8 c9 t11 end SN 6 a t6 t1 c2 c SN 4 d t5 SN 3 t4 e t6 c t4 c4 PAGE 104
aximal Decomposition start a t1 c1 c2 t2 b t3 c t4 c3 c4 t7 d t5 e t6 c5 c7 c6 f t8 g t9 h t10 c8 c9 t11 end SN 1 start a t1 SN 3 a t1 c2 SN 2 a t1 c t4 c1 e t6 t2 b t3 c3 SN 4 c t4 d t5 e t6 c4 SN 5 t7 d t5 e t6 d t5 c5 c7 c6 f t8 g t9 h t10 f t8 g t9 h t10 c8 c9 t11 end SN 6 x Construction: group arcs iteratively aximal decomposition is unique There is always a valid decomposition y y PAGE 105
Non-unique visible labels f t7 c7 t8 c8 t11 start a t1 c1 c2 t2 b t3 b t4 c3 c4 d t5 e c5 c6 g t9 h t10 c9 end t2 t6 a t1 c1 b t3 c3 d t5 SN 7 t2 a e t6 a t1 c1 c2 b t3 b t4 c3 c4 d t5 e t1 c2 b d t5 t6 t4 e t6 b t4 c4 Union of subnets is original net No shared places Shared transitions are visible and unique PAGE 106
Conformance checking can be decomposed!!! start Let L be an event log, SN a system net, and D={SN 1,SN 2, SN n } a valid decomposition L i is the sublog of SN i (L projected onto visible transitions of SN i ) a t1 c1 c2 t2 b t3 c t4 c3 c4 e t6 t7 d t5 c5 c7 c6 f t8 g t9 h t10 c8 c9 t11 end L is perfectly fitting SN if and only if each projected log is L i is perfectly fitting SN i SN 1 start a t1 SN 3 a t1 c2 SN 2 c t4 a t1 c1 e t6 t2 b t3 SN 4 c t4 c3 e t6 c4 d t5 d t5 SN 5 e t6 t7 d t5 c5 c7 c6 f t8 g t9 h t10 f t8 g t9 h t10 c8 c9 t11 end SN 6 PAGE 107
Example of alignment for observed trace a,b,c,d,e,c,d,g,f a,b,c,d,e,c,d,g,f t2 c1 b a t3 start t1 c2 c t4 c3 c4 t7 d t5 e t6 c5 c7 c6 f t8 g t9 h t10 c8 c9 t11 end SN 1 start a t1 SN 3 a t1 c2 SN 2 a t1 c t4 c1 e t6 t2 b t3 c3 SN 4 c t4 d t5 e t6 c4 SN 5 t7 d t5 e t6 d t5 c5 c7 c6 f t8 g t9 h t10 f t8 g t9 h t10 c8 c9 t11 end SN 6 Etc. PAGE 108
So What? Quantifying Conformance Exact result: Fraction of cases perfectly fitting SN equals the fraction of cases for which each projected log is L i is perfectly fitting SN i. Bound: For fitness at the event level (costs based alignment) it is possible to compute an optimistic value. PAGE 109
Discovery can also be distributed! 1. Assume the set of activities is split in overlapping sets. 2. Split log L in sublogs L i based on these sets. 3. Discover a model SN i per sublog. 4. erge the models SN i into SN and return result. All the earlier guarantees hold, e.g., L is perfectly fitting SN if and only if each projected log is L i is perfectly fitting SN i PAGE 111
Example Let's use the Alpha algorithm Alpha algorithm is not very suitable for discovering transition bordered subnets. By adding start and end activities we get (in this case) perfectly fitting subnets. PAGE 112
Discover model per sublog (Alpha algorithm) SN 1 i a d start b end e SN 2 a c d SN 3 d j start e end start e h end SN 4 f j k start g end h PAGE 113
erge models SN 1 i a d start b end SN 2 start a c e e d end L is perfectly fitting SN because each projected log is L i is perfectly fitting SN i SN 3 j d start e h end SN 4 f j k start g end p1 p2 p3 {1,2,3,4} t1 h p5 p6 {1,2} a t2 p7 p8 p9 t4 i t3 b c t5 {1} {1} {2} p10 p11 j t6 {3,4} {1,2,3} d t7 p13 p14 p15 f t9 {4} g k t10 p19 t12 {3,4} p18 {4} {4} h t11 p20 p21 {1,2,3,4} t13 p22 p23 p24 p4 p12 {1,2,3} e t8 p16 p17 p25 PAGE 114
Simplify by removing redundant places and initialization p1 p2 p3 {1,2,3,4} t1 p5 p6 {1,2} a t2 p7 p8 p9 t4 i t3 b c t5 {1} {1} {2} p10 p11 j t6 {3,4} {1,2,3} d t7 p13 p14 p15 {4} f t9 p18 {4} {4} g t10 p19 k t12 {3,4} h t11 p20 p21 {1,2,3,4} t13 p22 p23 p24 p4 p12 {1,2,3} e t8 p16 p17 p25 f i j g k a b d start end c h e Log is perfectly fitting this model! PAGE 115
PAGE 116
SESEs, Passages, and general result all define trees on process graph/event log Few large process mining tasks or many smaller process mining tasks. Choice of level is driven by computation time and desired diagnostics! All implemented in Pro. PAGE 117
Diagnosis PAGE 118
Example: Conformance Checking based on SESEs k=inf (no decompostion) some overhead for easy problems intractable problems become tractable stopped after 10 hours problems are often local suitable k-value depends on model/log k = maximal number of arcs in one SESE, P = # places, T = # transitions, f = fitness, t = time in seconds, S = # transition bounded SESEs, B = # bridges, >5 = # more than 5 arcs, nf = # non-fitting parts. PAGE 119
Some pointers Wil.P. van der Aalst. Decomposing Petri Nets for Process ining: A Generic Approach. BP Center Report BP-12-20, BPcenter.org, 2012 Wil. P. van der Aalst: Distributed Process Discovery and Conformance Checking. FASE 2012: 1-25 Wil. P. van der Aalst: Decomposing Process ining Problems Using Passages. Petri Nets 2012: 72-91 H.. W. (Eric) Verbeek, Wil. P. van der Aalst: An Experimental Evaluation of Passage-Based Process Discovery. Business Process anagement Workshops 2012: 205-210 Wil.P. van der Aalst and Eric Verbeek. Process Discovery and Conformance Checking Using Passages. BP Center Report BP-12-21, BPcenter.org, 2012 J. unoz-gama, J. Carmona, W. van der Aalst: Hierarchical Conformance Checking of Process odels Based on Event Logs. In: Applications and Theory of Petri Nets, 2013 PAGE 120
Conclusion PAGE 121
Conclusion (1/2) Alignments are essential for relating observed and modeled behavior! Conformance has (at least) four dimensions! Representational bias is important (and should not be confused with visualization)! New questions are emerging: mediating between a reference model and observed behavior discovering configurable process models Decomposing process mining problems to deal with Big Data. PAGE 122
Conclusion (2/2) Still many challenging and highly relevant open problems in process mining! performance-oriented questions, problems and solutions process model analysis (simulation, verification, etc.) process mining data-oriented analysis (data mining, machine learning, business intelligence) compliance-oriented questions, problems and solutions PAGE 123