Interactive PMCube Explorer

Similar documents
Multidimensional Process Mining with PMCube Explorer

Table of Contents. Table of Contents

Enterprise Data Catalog for Microsoft Azure Tutorial

Teiid Designer User Guide 7.5.0

BW C SILWOOD TECHNOLOGY LTD. Safyr Metadata Discovery Software. Safyr User Guide

COGNOS (R) 8 GUIDELINES FOR MODELING METADATA FRAMEWORK MANAGER. Cognos(R) 8 Business Intelligence Readme Guidelines for Modeling Metadata

Jet Data Manager 2014 SR2 Product Enhancements

MSBI Online Training (SSIS & SSRS & SSAS)

Data Warehouse and Data Mining

DATA MINING AND WAREHOUSING

SAS Data Integration Studio 3.3. User s Guide

MICROSOFT BUSINESS INTELLIGENCE

QDA Miner. Addendum v2.0

Doc. Version 1.0 Updated:

SQL Server Analysis Services

ForeScout Open Integration Module: Data Exchange Plugin

Pentaho Aggregation Designer User Guide

OLAP Introduction and Overview

Analytics: Server Architect (Siebel 7.7)

Getting Started Guide. ProClarity Analytics Platform 6. ProClarity Professional

Tutorial 1. Creating a Database

Using SAP NetWeaver Business Intelligence in the universe design tool SAP BusinessObjects Business Intelligence platform 4.1

Hyperion Interactive Reporting Reports & Dashboards Essentials

OneStop Reporting 4.5 OSR Administration User Guide

Information Design Tool User Guide SAP BusinessObjects Business Intelligence platform 4.0 Support Package 4

This download file shows detailed view for all updates from BW 7.5 SP00 to SP05 released from SAP help portal.

PowerPlanner manual. Contents. Powe r Planner All rights reserved

Creating a target user and module

Sample Data. Sample Data APPENDIX A. Downloading the Sample Data. Images. Sample Databases

DATA WAREHOUING UNIT I

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis

Copyright 2010, Oracle. All rights reserved.

Mastering phpmyadmiri 3.4 for

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

REPORTING AND QUERY TOOLS AND APPLICATIONS

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS

SAS 9.4 Management Console: Guide to Users and Permissions

ForeScout CounterACT. Configuration Guide. Version 3.4

OSR Composer 3.7 User Guide. Updated:

BPMN Miner 2.0: Discovering Hierarchical and Block-Structured BPMN Process Models

API Gateway Version September Key Property Store User Guide

What s New In Sawmill 8 Why Should I Upgrade To Sawmill 8?

Advanced Data Management Technologies Written Exam

IBM InfoSphere Information Server Version 8 Release 7. Reporting Guide SC

Cube Designer User Guide SAP BusinessObjects Financial Consolidation, Cube Designer 10.0

Imagine. Create. Discover. User Manual. TopLine Results Corporation

Business Insight Authoring

Server Edition USER MANUAL. For Microsoft Windows

SQL Server 2005 Analysis Services

Introduction to Relational Databases. Introduction to Relational Databases cont: Introduction to Relational Databases cont: Relational Data structure

SAS. Information Map Studio 3.1: Creating Your First Information Map

Chapter 13 Business Intelligence and Data Warehouses The Need for Data Analysis Business Intelligence. Objectives

Data Mining Concepts & Techniques

Integration Services. Creating an ETL Solution with SSIS. Module Overview. Introduction to ETL with SSIS Implementing Data Flow

Data Warehouse and Data Mining

Cisco StadiumVision Director Obtaining Proof of Play. Release 2.3. February 2011

SAS BI Dashboard 3.1. User s Guide Second Edition

Discovery Hub User Guide Version Find the latest version at support.timextender.com. Copyright 2018 TimeXtender A/S. All Rights Reserved.

Logi Ad Hoc Reporting Management Console Usage Guide

Enterprise Architect. User Guide Series. Database Models. Author: Sparx Systems. Date: 19/03/2018. Version: 1.0 CREATED WITH

Oracle Big Data Cloud Service, Oracle Storage Cloud Service, Oracle Database Cloud Service

Using the List management features can be a useful way of collecting a set of results ready for export.

Oracle Warehouse Builder 10g Runtime Environment, an Update. An Oracle White Paper February 2004

Ocean Wizards and Developers Tools in Visual Studio

SQL Server. Management Studio. Chapter 3. In This Chapter. Management Studio. c Introduction to SQL Server

IBM. Database Database overview. IBM i 7.1

SAS 9.4 Management Console: Guide to Users and Permissions

Interstage Business Process Manager Analytics V12.1 Studio Guide

Seamless Dynamic Web (and Smart Device!) Reporting with SAS D.J. Penix, Pinnacle Solutions, Indianapolis, IN

Index A, B, C. Rank() function, steps, 199 Cloud services, 2 Comma-separated value (CSV), 27

Using SQL Developer. Oracle University and Egabi Solutions use only

DEC Computer Technology LESSON 6: DATABASES AND WEB SEARCH ENGINES

Decision Support Systems aka Analytical Systems

Viewer. Quick Reference Guide

ER/Studio Enterprise Portal User Guide

Improving the Performance of OLAP Queries Using Families of Statistics Trees

How to Mail Merge PDF Documents

ARIS Admintool Commands

Portfolios Creating and Editing Portfolios... 38

6 SSIS Expressions SSIS Parameters Usage Control Flow Breakpoints Data Flow Data Viewers

Release notes for version 3.0

Acronis Backup plugin for WHM and cpanel 1.0

Server Edition USER MANUAL. For Mac OS X

Budget Process Tools: Smart View Ad Hoc Basics 1

Using AWS Data Migration Service with RDS

The Alarms Professional plug-in PRINTED MANUAL

Server Edition. V8 Peregrine User Manual. for Microsoft Windows

normalization are being violated o Apply the rule of Third Normal Form to resolve a violation in the model

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

Chapter 18: Data Analysis and Mining

SAS 9.2 OLAP Server. User s Guide

Using SQL with SQL Developer 18.2

SAS Web Report Studio 3.1

OFSAA Extension Guidelines Model. January 2018

Budget Process Tools: Smart View Ad Hoc Basics 1

A Case Study Building Financial Report and Dashboard Using OBIEE Part I

TX DWA Contents RELEASE DOCUMENTATION

Reading Sample. Creating New Documents and Queries Creating a Report in Web Intelligence Contents. Index. The Authors

EMC Documentum Composer

Doc. Version 1.0 Updated:

Transcription:

Interactive PMCube Explorer Documentation and User Manual Thomas Vogelgesang Carl von Ossietzky Universität Oldenburg June 15, 2017

Contents 1. Introduction 4 2. Application Overview 5 3. Data Preparation 7 3.1. Data Warehouse Schema (PMCube)..................... 7 3.2. Creating a Meta-data File.......................... 10 3.2.1. Database Schema Tab........................ 11 3.2.2. Event Tables Tab........................... 11 3.2.3. Dimensions Tab............................ 12 3.2.4. Simple Attributes Tab........................ 13 3.2.5. Classifiers Tab............................. 14 3.2.6. Model-Specific Tabs......................... 15 3.2.7. Semantics............................... 16 3.2.8. Scopes and Alternative Scopes.................... 18 3.2.9. Value Providers............................ 18 3.2.10. Data Sources............................. 19 3.3. Data Integration Wizard........................... 20 3.3.1. Data Source.............................. 20 3.3.2. Database............................... 20 3.3.3. Meta-data............................... 20 3.3.4. Mapping................................ 22 4. Application settings 24 4.1. General Settings................................ 24 4.2. Paths...................................... 24 4.3. Visualization................................. 24 4.4. Process Model Diff.............................. 25 4.5. Process Discovery............................... 25 4.6. Plugins..................................... 25 5. Connect to Database 26 A. Plug-ins 28 A.1. Event Log Exporter Plug-ins......................... 28 A.2. Conformance Checking Plug-ins....................... 28 A.3. Consolidation Plug-ins............................ 28 2

Contents A.4. Causal Behavioural Profile Plug-ins..................... 29 A.5. Database Plug-ins............................... 29 A.6. Event Log Statistics Plug-ins......................... 30 A.7. Diff Plug-ins.................................. 30 A.8. Adapter Plug-ins............................... 31 A.9. Adapter UI Plug-ins............................. 31 A.10.Integration Plug-ins.............................. 31 A.11.Event Log Action Plug-ins.......................... 31 A.12.Meta-data Plug-ins.............................. 32 A.13.Semantics Plug-ins.............................. 32 A.14.Meta-data UI Plug-ins............................ 33 A.15.OLAP Server Plug-ins............................ 33 A.16.OLAP Server UI Plug-ins.......................... 33 A.17.Process Discovery Plug-ins.......................... 34 A.18.Binary Significance Metric Plug-Ins..................... 34 A.19.Binary Correlation Metric Plug-ins..................... 34 A.20.Unary Significance Plug-ins......................... 35 A.21.Process Model Plug-ins............................ 35 A.22.Process Model Converter Plug-ins...................... 36 A.23.Time Enhancement Plug-ins......................... 36 A.24.View Model Converter Plug-ins....................... 36 A.25.Styling Operation Plug-ins.......................... 37 A.26.View Model Plug-ins............................. 38 3

1. Introduction The Interactive PMCube Explorer is a tool for multidimensional process mining (MPM). This manual gives a brief introduction of its usage and the necessary preparation of event data. For a general introduction of MPM and the concept of the PMCube approach, we refer to [VA16]. For this document, we assume that the reader is familiar with process mining. 4

2. Application Overview Figure 2.1 shows the application s main window after start-up. In general it consists of four parts: the ribbon menu at the top, the content area at the center, and two sidebars left-hand and right-hand of the content area. Figure 2.1.: Main window after start-up The ribbon menu provides access to most of the features of the application which are grouped in different tabs by their semantic. The six main tabs are: Start This tab offers general features like connecting the application to the database (see Chapter 5) and customizing the application settings (see Chapter 4). Tools This tab provides tools that support the integration of data into the data cube like the editor for meta-data files (see Section 3.2) and the Integration Wizard (see Section??). View On this tab, you can change the view of the application, e.g., turn on/off the process model matrix view or hide/show particular tabs. OLAP This tab provides all OLAP-related features like loading the cube and its data from the data warehouse. 5

2. Application Overview Event Log On this tab, you find all features to manipulate (e.g. filter) the event logs that were loaded from the data warehouse. Process Mining This tab offers advanced process mining techniques like process enhancement and conformance checking. Consolidation This tab provides access to the consolidation of process models. Tip: You can hide/collapse the ribbon bar by clicking the icon on the right side (below the window close button) or double-clicking one of the tabs. The sidebar on the left side of the main window comprises all configuration options for the OLAP query. Grouped by different tabs, you can roll-up/drill-down or slice the data cube. Furthermore, you can also omit unnecessary attributes from the query results (Projection). In the sidebar on the right side of the main window, you find the selection and configuration of the process discovery algorithm, the selection of the classifier, and the operation history where you can undo/redo particular analysis steps. The content area in the center of the window uses a tab system to present the analysis results. The Olap Cell Preview tab shows a preview of the currently set OLAP query while the Olap Results Tab presents an overview of the results of the last executed OLAP query. Tip: The main window is based on a docking framework. Therefore, you can customize it according to your needs, e.g., by changing the size of or rearranging the different parts of the window. 6

3. Data Preparation The Interactive PMCube Explorer manages the event data required for process mining in a multidimensional data cube that is stored in a relational data warehouse. Currently, the following database management systems (DBMS) are supported: Microsoft SQL Server (recommended) Oracle DB PostgreSQL MySQL However, it is recommended to use Microsoft SQL Server which was primarily used for testing and application so far. Before analyzing processes with Interactive PMCube Explorer, the event data has to be integrated into the data warehouse. Section 3.1 explains its expected database schema. 3.1. Data Warehouse Schema (PMCube) In Interactive PMCube Explorer, there are two different ways to model the case and event attributes. Dimension A dimension represents an axis of the data cube that can be used to partition the event data into sublogs. Each dimension consist of a set of pair-wise distinct values. Furthermore, Interactive PMCube Explorer allows the user to define dimension hierarchies in a tree-like structure. Dimensions should be the preferred way of modeling the attributes. However, they may not be appropriate in case of many rare attribute values as this can result in sparse data cubes and missing representativeness of of the process models. To avoid such problems, sets of different attribute values may be grouped into artificial categories, e.g. representing intervals of values. If this is not meaningful, the attributes can also be stored as simple attributes. Simple attribute This way of modeling directly attaches an attribute to its case or event. In contrast to dimensions, simple attributes are very limited in their usage (e.g., they cannot be used for roll-up and drill-down). Therefore, it is only recommended to use simple attributes for avoiding the loss of data if the attribute cannot be meaningfully mapped to a dimension. 7

3. Data Preparation Column Name(s) Type Cardinality Description id arbitrary 1 id of cell, must be unique dim 1... dim s arbitrary 1... foreign key to table of most finegrained dimension level Table 3.1.: Columns of Fact table The database schema manages case and event attributes on different levels. This is not only reflected in the definition of OLAP operations, but also by the database schema that stores cases and events in different tables. Figure 3.1 shows the generic database schema of the data warehouse which is an extension of the traditional snowflake schema commonly used for relational data warehouses. D 1.K m 1... D s.k n 1 D s+1.k x 1... D w.k y 1 n... 1... n... 1 n... 1... n... 1 n D 1.K 0... n D s.k 0 n D s+1.k 0... n D w.k 0 1 1 id 1 1 n Fact n 1 n Case n 1 n Event n id id A 1... A p A 1... A r Figure 3.1.: Entity-relationship model of generic database schema Similar to the snowflake schema, there is one table for each level of a dimension hierarchy and the tables of a dimension are connected by a 1:n foreign key relationship according to the hierarchy (e.g., tables D 1.K 0,..., D 1.K m ). The Fact table references case dimension (i.e., each dimension representing a case attribute) by a foreign key reference to the table of its most fine-grained dimension level. Additionally, it also stores a unique id for the easy identification of cell of the data cube. Table 3.1 gives an overview of the columns of the Fact table. Using a foreign key, the cell id of the Fact table is referenced by the Case table which stores each single case as a row. Each tuple requires a mandatory unique id to identify 8

3. Data Preparation Column Name(s) Type Cardinality Description id arbitrary 1 id of case, must be unique cell id arbitrary 1 foreign key reference to id column of Fact table A 1... A p arbitrary 0... columns for storing simple case attributes Table 3.2.: Columns of Case table Column Name(s) Type Cardinality Description id arbitrary 1 id of event, must be unique case id arbitrary 1 foreign key reference to id column of Case table dim 1... dim w arbitrary 1... foreign key to table of most finegrained dimension level A 1... A p arbitrary 0... columns for storing simple event attributes Table 3.3.: Columns of Event table the cases. Additionally, the Case table can also have an arbitrary number of additional attributes (A 1,..., A p ) to store the simple case attributes (i.e., all case attributes that are modeled as simple attributes). The simple attribute columns of the Case table may have arbitrary types. Table 3.2 gives an overview of the columns of the Case table. The Event table stores one event per row. Similar to the Case table, it references the tables of the most fine-grained dimension levels of all event dimensions (i.e., event attributes that are modeled as dimensions). The simple event attributes (i.e., event attributes modeled as simple attributes) are represented by an additional table column for each simple attribute (similar to Case table). The Event table also has a unique id and a foreign key reference to the id of the case that own the event. Table 3.3 summarizes the columns of the Event table. The structure of a dimension level table depends on whether the dimension level is the highest (i.e., most coarse-grained) level of the dimension. If it is not the highest dimension level, the table has a unique id column (used for referencing), a value column (holding a representation of the attribute s value), a description column (providing a more verbose textual description of the value), and a parent id column holding a foreign key reference to the id of the parent dimension level value. The latter can be omitted for the highest level of a dimension, because this does not have a parent. Table 3.4 gives an overview of the columns of a dimension level table. Interactive PMCube Explorer provides a set of tools that support the creation of the data cube and the integration of data. With the Meta-Data Editor (Section 3.2), you can create a meta-data file from scratch. The Data Integration Wizard (Section 3.3) helps you to create a data cube from an event log file and integrate its data into the 9

3. Data Preparation Column Name(s) Type Cardinality Description id arbitrary 1 id of dimension value, must be unique value arbitrary 1 representation of dimension level value description varchar/varchar2 1 textual description of the dimension level vale parent id arbitrary 0... 1 foreign key reference to the table of parent dimension level, can be omitted if no parent exists Table 3.4.: Columns of Event table cube. You can find the data integration tools on the Tools tab of the application s ribbon menu (see Figure 3.2). Figure 3.2.: Tools tab Tip: The Interactive PMCube Explorer assumes that there are at least two dimensions in the data cube. If your data does not provide enough dimensions, you can create a dummy dimension which only consists of one dimension level with a single value that is the same for all cases/events. 3.2. Creating a Meta-data File To be able to load the event data from the database, Interactive PMCube Explorer requires some meta-data on the database schema which is provided by an XML-based meta-data file. To create this file, Interactive PMCube Explorer provides a meta-data editor that can be opened from the Tools tab (see Figure 3.2). To create a new meta-data file, click on. In the pop-up window, select the type of your data cube. You can choose one of the following data cube models: Default The default PMCube data cube model Direct Access Data cube model for directly accessing an operational database using OLAP operations PMC The Process Cubes / PMC data cube model according to van der Aalst and Bolt [vda13, BvdA15]. 10

3. Data Preparation You can click on to save the file to disk or on to load an existing file from disk. Furthermore, you can click on to check if your meta-data is valid. The editor window provides a number of different tabs to edit the meta-data file which are explained in the following subsections in more detail. 3.2.1. Database Schema Tab On the Database Schema Tab (see Figure 3.3), you can describe the schema of the underlying database with its tables, columns, foreign key references etc. Tip: You can import an existing schema from the database by clicking. Figure 3.3.: Database Schema tab of the meta-data editor 3.2.2. Event Tables Tab On the Event Tables Tab (see Figure 3.4) you must select the database tables which represent the event tables of the data cube. To select an event table, you need to click on and select the database table from the dialog. Note: The default PMCube data cube model requires exactly one event table. However, other data cube models may allow for multiple event tables. 11

3. Data Preparation 3.2.3. Dimensions Tab Figure 3.4.: Event Tables tab of the meta-data editor On the Dimensions Tab (see Figure 3.5), you must define all dimensions of the data cube. On the left-side of the tab, all dimensions and their dimension levels are presented in a tree structure. You can add a dimension by clicking in the toolbar above the tree view. The most coarse-grained dimension level (ALL) which comprises all values of the dimension is automatically added to each dimension and cannot be edited. To add a dimension level, you must select its designated parent level and right-click on it. Then select Add Dimension Level from the context menu. The properties of the selected dimension or dimension level are displayed on the right-side of the tab where you can edit them. For a dimension, you can edit the following properties: Name The name of the dimension which is displayed in the user interface. To avoid confusion, it should be unique. Type The type of the dimension. It should be equivalent to the type of the database columns which will store the values for the dimension. Semantics Optional annotation to define some semantic for the dimension. Read Section 3.2.7 for more details. Default Scope The default scope for the dimension. Read Section 3.2.8 for more details. Alternative Scopes The alternative scopes defined for the dimension. 3.2.8 for more details. Read Section For a dimension level, you can edit the following properties: 12

3. Data Preparation Figure 3.5.: Dimensions tab of the meta-data editor Name The name of the dimension level which is displayed in the user interface. To avoid confusion, it should be unique and should describe the level of granularity the dimension level it represents. Value Provider The way how the dimension level values are defined. You can select one of the following options: Database The values of the dimension level are loaded from the database. User Defined The values are explicitly defined by the user. For more information about this, read Section 3.2.9. Data Sources The source of the dimension level s data for each event table. Section 3.2.10 for more information Read 3.2.4. Simple Attributes Tab On the Simple Attributes tab (see Figure 3.6), you must define all simple attributes of the data cube. On the left side of the tab, all simple attributes are presented in a list. You can add a simple attribute by clicking on in the toolbar above. For a simple attribute, you can edit the following properties: Name The name of the attribute which is displayed in the user interface. confusion, it should be unique and descriptive. To avoid Type The data type of the simple attribute. It should be equivalent to the type of the database columns which will store the values for the simple attribute. 13

3. Data Preparation Figure 3.6.: Simple Attributes tab of the meta-data editor Data Sources The source of the simple attribute s data for each event table. Section 3.2.10 for more information. Read Semantics Optional annotation to define some semantic for the simple attribute. Read Section 3.2.7 for more details. Default Scope The default scope for the simple attribute. Read Section 3.2.8 for more details. Alternative Scopes The alternative scopes defined for the simple attribute. Read Section 3.2.8 for more details. 3.2.5. Classifiers Tab On the Classifiers Tab (see Figure 3.7), you can define the classifiers, i.e., which attributes are mapped to the node labels of the resulting process models during process discovery. You can add a new classifier by clicking in the tool bar. After selecting a classifier, you can click on to edit or on to remove the selected classifier. Figure 3.8 shows the dialog for creating or editing a classifier. On top of the window, you can define the name of the classifier which will be shown in the user interface. Below the name field, there are two lists. The upper list contains a collection of all attributes (dimensions and simple attributes) of the data cube, while the lower list shows the collection of attributes defining the classifier. To add an attribute to the classifier, you need to select it in the upper list and click on. To remove an attribute from the classifier, you must select it in the lower list and click on. As the order of classifier s 14

3. Data Preparation Figure 3.7.: Classifiers tab of the meta-data editor attributes defines the label of the node (the values of the attributes are concatenated in the order of the attributes in the lower list), you can move the selected attribute in the lower list one position up or down by clicking or, respectively. Figure 3.8.: Dialog for editing a classifier After you created one or more classifiers, you can select the default classifier from the drop down menu on the bottom of the tab. The default classifier will be automatically selected, when you start a new analysis. 3.2.6. Model-Specific Tabs The default PMCube data model also requires the definition of aggregation functions, which are applied to the simple attributes during the aggregation of events. These 15

3. Data Preparation aggregation functions can be defined on the Aggregation Functions tab (see Figure 3.9). Figure 3.9.: Aggreation Functions tab of the meta-data editor From the drop-down menu on top of the tab, you can select the default aggregation function, that will be applied to all simple attributes in the case of event aggregation. If you want to choose another aggregation function for a particular simple attribute, you can click on. A pop-up window will show up, where you can map the simple attribute to an aggregation function. The mapping between the simple attribute and the aggregation function will be added to the list. You can click or to remove or edit the selected mapping. 3.2.7. Semantics Because attributes of an event log may have a specific meaning, Interactive PMCube Explorer provides a framework to annotate the attributes of the data cube (dimensions as well as simple attributes) with a semantic. For example, it is possible to mark an attribute as the start time-stamp of an event. These semantic annotations are used by Interactive PMCube Explorer to process the data, e.g. the event start time-stamp can be used to order the events in a trace. Table 3.5 gives on overview of all semantic annotations that are defined by default. However, it is possible to extend the semantics framework by providing further annotations integrated as plug-ins (see Section A.13 for a list of provided plug-ins). 16

3. Data Preparation Semantic Name Identifier Semantics Case ID Event ID Event Log ID Time Semantics Event start timestamp Event end timestamp Event duration Case start timestamp Case end timestamp Case duration Description Marks an attribute as the (unique) identifier for cases, or a candidate for case identifiers if the data cube model allows for more than one case attribute. Marks an attribute as the (unique) identifier for events Marks an attribute as the (unique) identifier for the event log Represents the point in time, when the event starts Represents the point in time, when the event ends Represents the time-span from the start to the end of the event (i.e., the difference between start and end timestamp of the event) Represents the point in time, when the case starts (i.e., its first activity starts) Represents the point in time, when the case ends (i.e., its last activity ends) Represents the time-span from the start to the end of the case (i.e., the difference between start and end timestamp of the case) Table 3.5.: Default Semantics 17

3. Data Preparation Note: In some data cube models (e.g. Direct Access and PMC), it is possible to use different attributes as case id. In a hospital log for example, the stay of a patient in the hospital as well as the patient himself (having multiple stays in hospital) may be used as case id. In this case, both candidate attributes need to be annotated as case id in the meta-data. The case id which should be used is then selected in the OLAP query. However, some data cube models like the default PMCube model are restricted to a single case id due to the database structure. In that case, exactly one attribute must be annotated as case id which is automatically selected in the OLAP query. 3.2.8. Scopes and Alternative Scopes Internally, Interactive PMCube Explorer stores the loaded event logs in the usual hierarchical structure: An event log consists of a set of cases while each case owns an ordered set of events (the trace). Both, cases and events may have arbitrary attributes. When reading the data from the database, the scope tells the parser whether to attach an attribute value to an event or a case. Therefore, for each dimension and simple attribute defines default scope in the meta-data. However, if there are more than one case id candidates, the scope may vary depending on the selected case id. If, for example, each stay is selected as the case id of a hospital event log, the admittance date needs to be mapped to the case because it is the same for all events of the same case. If the patient himself is selected as the case id, there may be multiple admittance dates which needs to be mapped to the events. Consequently, for each dimension and simple attribute in the meta-data, you can define alternative scopes for particular case ids. You can add an alternative scope by clicking on in the tool bar of the Alternative Scopes section. In the pop-up window, you can select the case id (only attributes annotated as case id are shown) and the respective scope. The new mapping will then be displayed in the list. You can edit / remove the selected mapping by clicking / in the tool bar. 3.2.9. Value Providers After the connection to the database has been established, Interactive PMCube Explorer initializes the values of all dimension levels which are required to define the OLAP queries. In general, there are two different ways to retrieve the values of the dimension levels: Database The values are loaded from the database. This assumes, that each possible value for a dimension level is implicitly defined inside the database, e.g. as a tuple in a specific dimension table (see 3.1). User defined The values are explicitly defined by the user in the meta-data editor and saved inside the meta-data file. 18

3. Data Preparation Interactive PMCube Explorer supports three different kinds of dimension level values: Element Value This kind of dimension level value represents an elementary value, e.g. a single number or string like 1.4 or Place order. Collection Value This kind of value comprises a non-empty collection of elementary values, e.g. { Monday, Tuesday, Wednesday }. All values of the collection are expected to be of the same type. Different collection values may be overlapping, i.e. an elementary value can be part of multiple collection values (e.g. { Monday, Tuesday, Wednesday } and { Wednesday, Thursday, Friday }). Range Value This kind of value defines a range of (typically numerical) values, e.g. 10 to 20. The range must be defined as a SQL code condition restricting the relevant database column to the desired range. Each dimension level value may have a given name which must be unique for its dimension level. Especially for collection values, this name should be descriptive because it is displayed in the user interface instead of the real value. 3.2.10. Data Sources The data source property defines how to retrieve the data for the dimension level or simple attribute in the OLAP query. In Interactive PMCube Explorer, there are three different kinds of data sources: Database Column This kind of data source defines the database column that stores the value of the attribute (dimension level or simple attribute). If the database column belongs to a different table than the event table, this table must (at least transitively) referenced by the event table. Implicit Value If the data cube contains multiple event tables, it is possible that some event tables only contain events representing the same activity. For example, there may be a dedicated event table storing order activities. These tables may not have a column holding the activity name because this is implicitly defined by the fact that an event is stored in that table. However, the events of such an event table require an explicit activity for process discovery. Therefore, Interactive PMCube Explorer provides the implicit value data source that allows the user to make these implicitly defined values explicit. Derived This kind of data source allows for the definition of virtual attributes. This are attributes that are not stored in the database but that can be derived from one or more other attributes.for example, if the database contains a time-stamp for the beginning and the end of the events, it is possible to derive the event duration by calculating the difference of both time-stamps. Derived data sources are defined by a SQL code fragment which derives the attribute value for an event. 19

3. Data Preparation As some data cube models allow for more than one event table, the data sources are separately defined for each event table. Every simple attribute and dimension level must define a data source for each event table defined in the meta-data (see 3.2.2). 3.3. Data Integration Wizard To minimize the effort for building up a suitable data cube, Interactive PMCube Explorer provides a wizard for data integration. Despite some configuration, it automatically performs the following integration steps: Generate the meta-data file from the selected source file Generate a database schema for the data cube from the selected meta-data file Import the data from data source into a temporary staging area table of the database Transform the data of the staging area table into the structure of the data cube The following sections give a brief overview of the configuration of the different steps. Note: The data integration wizard assumes that there is an empty database instance that can be used for data integration. 3.3.1. Data Source On the Data Source page (see Figure 3.10), you first need to select select an appropriate adapter for your data source (e.g. XES file) before you can select the data source file itself. Below the file selection field, the adapter may offer individual configuration options for reading the data source file and its structure. The adapters for accessing different data source are provided as plug-ins. For a list of provided adapter plug-ins, see Section A.8. 3.3.2. Database On the Database page (see Figure 3.11), you can configure the connection to the target database. More details on defining a database connection are given in Chapter 5. 3.3.3. Meta-data On the Meta Data page (see Figure 3.12), you first need to select the target data cube model. Then, you can select the file path for saving the meta-data file. Depending on the selected data cube model, you may find some additional configuration options below 20

3. Data Preparation Figure 3.10.: Data Integration Wizard: Configuration page for the data source Figure 3.11.: Data Integration Wizard: Configuration page for the data base connection 21

3. Data Preparation Figure 3.12.: Data Integration Wizard: Configuration page for the meta-data the file path field. A list of available data cube models for the data integration can be found in the appendix (Section A.10). Note: After you have clicked the Next button on the Meta Data page, the wizard extracts some information on the structure of the data source. Depending on the type of data source, this may take some time. 3.3.4. Mapping After analyzing the data source structure, the Mapping page (see Figure 3.13) will show up where you must define how the attributes of the data source are mapped to the attributes (i.e. dimensions and simple attributes) of the data cube. On the left side of the page, all attributes of the data source are presented in a list. On the right side of this list, the mapping of the selected attribute to the target data cube attribute is shown. According to the configuration in the previous steps, the wizard automatically creates an initial mapping which should be fine for most of the attributes. However, you may need to correct the mapping or adjust it to your needs. For each attribute of the data source, you will find the following configuration options for the target attribute of the data cube: Name The descriptive name of the attribute that will be displayed in the user interface. To avoid name clashes during the data integration, the target attribute names must be unique. Type The type of the attribute. It must be compatible with the type of the data source attribute. 22

3. Data Preparation Default Scope The default scope of the target attribute (read more on scopes in Section 3.2.8). Map To Dimension Flag that indicates whether the attribute should be added to the database as a dimension or a simple attribute. For more information about dimensions and simple attributes, see Section 3.1. Semantics Optionally attaches some semantics to the target attribute in the data cube. Figure 3.13.: Data Integration Wizard: Configuration page for the mapping between source and target data elements Note: After the wizard has created the meta-data file, you can still edit it with the meta-data editor (e.g., to rename attributes). However, you have to make sure that the meta-data file matches the structure of the data cube. After clicking the Next button, the wizard starts to integrate the data into the data cube. 23

4. Application settings Before starting the first analysis, it is recommended to configure the tool. To customize the application settings, click on the Application settings button in the start tab. The application settings dialog showing up provides multiple settings on different pages. The following sections will give a brief overview of the available configuration options. Note: Changes of some particular settings will only take effect after a restart of the application. 4.1. General Settings On this page, you can select the configuration file for the logger. Interactive PMCube Exploreruses the Apache log4net library 1 (a.net port of the log4j library). To enable the logging, create a configuration file (see log4net documentation for details) and select it in the application settings. Additionally, you can also set maximum tree depth of XML documents handled by the serializer. However, the default value should be fine in most cases. 4.2. Paths On the Paths page, you can configure the default paths for the import and export files (e.g., process models, images), a default folder for storing meta-data files, and the default folder for the setting files of process mining algorithms. Furthermore, you have to specify the file that will store your database connection information. 4.3. Visualization On this page, you can customize the appearance of the process model visualization (e.g., node size, distances between nodes etc.). Most of the settings are specific for a particular process model type. To edit the appearance of a specific view model, you can select the corresponding process model notation from the drop-down list. 1 https://logging.apache.org/log4net/ 24

4.4. Process Model Diff 4. Application settings On this page, you can select the algorithm that should be used to create difference models. 4.5. Process Discovery Here, you can select the default process discovery algorithm. This algorithm will be automatically selected on application start-up. Optionally, you can also select a configuration file to provide customized default settings to the selected default algorithm. 4.6. Plugins The Plugins page gives you an overview of all plug-ins that have been loaded at the start-up of the application. To search for a particular plug-in, the text field on the top of the page allows you to filter the plug-in list. Tip: By default, the tool provides multiple plug-ins. Therefore, an empty plugin list indicates that the path to the plug-in folder is not correctly set. To fix this issue, you can select the plug-in directory and confirm it by clicking the Okay button. Now, the plug-ins should be loaded when restarting the application. 25

5. Connect to Database At the beginning of your analysis, you need to select the connection to your database. To open the connection dialog (Figure 5.1), click on the Connect button of the Start tab. Figure 5.1.: Connection Dialog To create a new database connection, select the used database management system from the drop-down list and click on the plus icon. After that, enter the required parameters in the dialog. Table?? explains the parameters in more detail. For connections to Microsoft SQL Server databases, there is also the option to use the automatic authentication of the Windows operating system. You can check the checkbox to enable this option. User and password field can be left empty when Windows authentication is used. You can save the settings using the Save icon in the toolbar on top of the dialog. 26

5. Connect to Database Property Connection name Database Server Port User Password Description The user-defined name of the connection which is used in the connection list on the left. Should be unique. The name of the target database The URL or IP address of the target database server The port of the target database service The name of the database user The password of the database user Table 5.1.: Parameters of the connection dialog Note: The database configurations are saved in a file (for selecting this file, see Section 4.2) which is encrypted using the Windows user account. Consequently, the connection file can only be read from the user account you created it. Sharing connection information between different user accounts or machines is not possible. Finally, you need to select the meta-data file that describes the structure of the selected database. To connect to the database, select the desired connection from the list and click on OK. Now the application initializes the dimension values from the database. Depending on the size of the data and your connection speed, this may take some time. Tip: The application automatically saves the last used database connection and meta-data file. To reconnect quickly, you can click on the Quick connect button in the Start tab. 27

A. Plug-ins A.1. Event Log Exporter Plug-ins These plug-ins provide export functionalities to save an event log to disk. CSV Event Log Exporter PMCubeExlorer.EventLogExporter.Csv Provides an export of event logs as CSV file. A.2. Conformance Checking Plug-ins These Plugins provide conformance checking functionalities, e.g., they compare a process model to an event log in order to calculate the fitness as a measure of process model quality. Token replay PMCubeExplorer.ConformanceChecking.PetriNet Implements Token Replay Conformance Checking on Petri Net process models. Process map conformance checking PMCubeExplorer.ConformanceChecking.ProcessMap Implements Token Replay Conformance Checking on Process Map process models. Process tree conformance checking PMCubeExplorer.ConformanceChecking.ProcessTree Implements Token Replay Conformance Checking on Process Tree process models. A.3. Consolidation Plug-ins These plug-ins provide approaches for the consolidation of process models. Consolidation means that a subset of the possibly most interesting process models is selecting from the set of all resulting process models of an multidimensional process mining analysis. Consolidation covers a wide range of approaches from simply filtering the process models 28

A. Plug-ins by their properties to a clustering of process models. Clustering Consolidation PMCubeExplorer.Consolidation.ClusteringConsolidation Provides a clustering-based consolidation of process models using the DB-Scan algorithm. Requires further plug-ins to calculate the Causal Behavioural Profile Metric and defines the respective interface (A.4). Filter consolidation PMCubeExplorer.Consolidation.FilterConsolidation Provides a process model consolidation based on simple filtering. A.4. Causal Behavioural Profile Plug-ins These plug-ins provide the algorithms to calculate the Causal Behavioural Profile metric which specifies the semantic behaviour of a process model. Process Map Causal Behavioural Profile Metric Plugin PMCubeExplorer.Consolidation.ClusteringConsolidation.Metric.ProcessMap Implementation of the Causal Behavioural Profile Metric for Process Map process models. Process Tree Causal Behavioural Profile Metric Plugin PMCubeExplorer.Consolidation.ClusteringConsolidation.Metric.ProcessTree Implementation of the Causal Behavioural Profile Metric for Process Tree process models. A.5. Database Plug-ins These plug-ins provide access to database management systems (DBMS). They encapsulate the DBMS-specific aspects of the communication with the database and integrate the relevant driver libraries for the DBMS. Microsoft SQL PMCubeExplorer.DatabaseNew.MicrosoftSql Provides access to Microsoft SQL Server databases. MySQL PMCubeExplorer.DatabaseNew.MySQL Provides access to MySQL databases. Oracle PMCubeExplorer.DatabaseNew.Oracle Provides access to Oracle databases. 29

A. Plug-ins PostgreSQL PMCubeExplorer.DatabaseNew.PostgreSql Provides access to PostgreSQL databases. A.6. Event Log Statistics Plug-ins These plug-ins provide the calculation of statistics on the event logs. Trace duration PMCubeExplorer.DataSelection.LogStatistics.Duration Provides statistics on the duration of traces in an event log. Event class distribution PMCubeExplorer.DataSelection.LogStatistics.EventClassDistribution Provides statistics on the the distribution of event classes in an event log. Trace length by frequency PMCubeExplorer.DataSelection.LogStatistics.TraceLengthByFrequency Provides statistics on the length of traces in an event log by its frequency. Traces by frequency PMCubeExplorer.DataSelection.LogStatistics.TracesByFrequency Provides statistics on the frequency of traces in an event log. A.7. Diff Plug-ins These plug-ins provide algorithms for the calculation of difference models. Difference models can be either created by comparing two process models or two event logs to each other. Better Snapshot PMCubeExplorer.Diff.BetterSnapshot Provides an improved version of the snapshot diff algorithm. Log diff comparison PMCubeExplorer.Diff.LogDiff Plug-in provides the calculation of a difference model based on event log comparison. Snapshot PMCubeExplorer.Diff.Snapshot Provides the implementation of the snapshot algorithm to calculate the difference of two process models. 30

A. Plug-ins A.8. Adapter Plug-ins These plug-ins provide adapters for the integration of event logs. Each adapter can read the data and its structure from file, create an initial mapping of the data source structure to the meta-data, and load the data into the staging area of the target database. XES ETL Adpater Plugin PMCubeExplorer.ETL.Adapter.XES Adapter plug-in providing an import for event logs in XES file format. A.9. Adapter UI Plug-ins These plug-ins provide optional user interface components for the configuration of a particular adapter plug-in. XES ETL Adapter UI Plugin PMCubeExplorer.ETL.Adapter.XES.UI Provides user interface components for the XES adapter plug-in A.10. Integration Plug-ins These plug-ins provide the functionality to integrate data into a specific data cube model. Each plug-in can create the database tables for the data cube, and transform the data from the staging area to integrate it into the data cube. Default Integration Plugin PMCubeExplorer.ETL.Integration.Default Provides the integration functionalities for the default cube model of PMCube. A.11. Event Log Action Plug-ins These plug-ins provide operations for manipulating (e.g., filtering) the event logs that were loaded from the database. Remove activity from simple event log filter plugin PMCubeExplorer.EventLogAction.Filter.RemoveActivity Provides a filter operation that removes events with a particular activity from the event log. Filter traces by frequency plugin PMCubeExplorer.EventLogAction.Filter.TraceFrequency Provides a filter operation that removes traces from the event log which do not match a user-defined frequency. 31

A. Plug-ins Trace Has Activity Simple Event Log Filter PMCubeExplorer.EventLogAction.Filter.TraceHasActivity Provides a filter operation which removes all traces from the event log that do not have at least one event with a particular activity. Event Log Trace Length Filter PMCubeExplorer.EventLogAction.Filter.TraceLength Provides a filter operation to remove traces that do not have a particual user-defined length. A.12. Meta-data Plug-ins These plug-ins provide the data-structures to describe a data cube and its underlying database schema. Default Metadata Plugin PMCubeExplorer.Metadata.Default Provides the meta-data model for the default cube model of PMCube. Direct Access Metadata Plugin PMCubeExplorer.Metadata.DirectAccess Provides the meta-data model for the direct (OLAP) access of (operational) databases. PMC Metadata Plugin PMCubeExplorer.Metadata.PMC Provides the meta-data model for the Process Cubes/PMC data model according to van der Aalst and Bolt. A.13. Semantics Plug-ins These plug-ins provide extensions to attach some semantics to the attributes of the data cube. Transactional Life-Cycle Semantic Plugin PMCubeExplorer.Metadata.Semantic.LifeCycle Provides a semantic extension of the meta-data model to annotate attributes as information on the transactional life-cycle. 32

A. Plug-ins A.14. Meta-data UI Plug-ins These plug-ins provide additional user interface components for the meta-data editor of a specific data cube model. Default Metadata UI Plugin PMCubeExplorer.Metadata.UI.Default Provides user interface components for the PMCube default data cube model. Direct Access Metadata UI Plugin PMCubeExplorer.Metadata.UI.DirectAccess Provides user interface components for the direct access data cube model. PMC Metadata UI Plugin PMCubeExplorer.Metadata.UI.PMC Provides user interface components for the Process Cubes / PMC data cube model. A.15. OLAP Server Plug-ins Each of these plug-ins provides the OLAP functionality for a specific data cube model. Default OLAP Server PMCubeExplorer.OlapServer.Default Provides OLAP functionalities for the PMCube default data cube model. Direct Access Olap Server Plugin PMCubeExplorer.OlapServer.DirectAccess Provides OLAP functionalities for the direct access data cube model. PMC Olap Server Plugin PMCubeExplorer.OlapServer.PMC Provides OLAP functionalities for the Process Cubes / PMC data cube model. A.16. OLAP Server UI Plug-ins These plug-ins provide user interface components for defining the OLAP queries taylored to a specific data cube model. Default Olap Server UI Plugin PMCubeExplorer.OlapServer.UI.Default Provides user interface components for the OLAP functionalities of the PMCube default data cube model. Direct Access OLAP Server UI Plugin PMCubeExplorer.OlapServer.UI.DirectAccess Provides user interface components for the OLAP functionalities of the direct access data cube model. 33

A. Plug-ins PMC OLAP Server UI Plugin PMCubeExplorer.OlapServer.UI.Pmc Provides user interface components for the OLAP functionalities of the Process Cubes / PMC data cube model. A.17. Process Discovery Plug-ins These plug-ins provide algorithms for process discovery. Change Tree Miner PMCubeExplorer.ProcessDiscovery.ChangeTreeMiner Provides an implementation of the Change Tree Miner algorithm Flexible Heuristics Miner (FHM) PMCubeExplorer.ProcessDiscovery.FlexibleHeuristicsMiner Provides an implementation of the Flexible Heuristics Miner process discovery algorithm. FuzzyMiner PMCubeExplorer.ProcessDiscovery.FuzzyMiner Provides an implementation of the Fuzzy Miner process discovery algorithm. Inductive Miner V3 PMCubeExplorer.ProcessDiscovery.InductiveMinerV3 Provides an implementation of the Inductive Miner (- Infrequent) process discovery algorithm). A.18. Binary Significance Metric Plug-Ins These plug-ins provide a binary significance metric for the Fuzzy Miner algorithm. Frequency Significance PMCubeExplorer.ProcessDiscovery.FuzzyMiner.Metrics.BinaryFrequencySignificance Provides the calculation of the binary frequency significance metric for the Fuzzy Miner process discovery algortihm. A.19. Binary Correlation Metric Plug-ins These plug-ins provide a binary correlation metric for the Fuzzy Miner algorithm. Proximity Correlation PMCubeExplorer.ProcessDiscovery.FuzzyMiner.Metrics.ProximityCorrelation Provides the calculation of the proximity correlation metric for the Fuzzy Miner process discovery algortihm. 34

A. Plug-ins A.20. Unary Significance Plug-ins These plug-ins provide a unary significance metric for the Fuzzy Miner algorithm. Frequency Significance PMCubeExplorer.ProcessDiscovery.FuzzyMiner.Metrics.UnaryFrequencySignificance Provides the calculation of the unary frequency significance metric for the Fuzzy Miner process discovery algortihm. A.21. Process Model Plug-ins These plug-ins provide the data structures to represent process models. BPMN PMCubeExplorer.ProcessModel.BPMN Provides the data structures for BPMN process models. Causal net PMCubeExplorer.ProcessModel.CausalNet Provides the data structures for Causal Net process models. Change Tree PMCubeExplorer.ProcessModel.ChangeTree Provides the data structures for Change Tree process models. Merged Annotated Transition System PMCubeExplorer.ProcessModel.MergedATS Provides the data structures for Merged Annotated Transition Systems (ATS) process models. Petri net PMCubeExplorer.ProcessModel.PetriNet Provides the data structures for Petri Net process models. Process map PMCubeExplorer.ProcessModel.ProcessMap Provides the data structures for Process Map process models. Process tree PMCubeExplorer.ProcessModel.ProcessTree Provides the data structures for Process Tree process models. 35

A. Plug-ins A.22. Process Model Converter Plug-ins These plug-ins provide a converter to transform a process model from one modelling language to another (e.g., from BPMN to Petri Nets). Process Tree To BPMN Converter PMCubeExplorer.ProcessModel.Converter.ProcessTreeToBPMNConverter Provides a process model converter for transforming Process Trees into BPMN process models. Process tree to petri net converter plugin PMCubeExplorer.ProcessModel.Converter.ProcessTreeToPetriNetConverter Provides a process model converter for transforming Process Trees into Petri Nets. A.23. Time Enhancement Plug-ins These plug-ins provide the enhancement of the process model with time information. Process Map Time Enhancement Plugin PMCubeExplorer.TimeEnhancement.ProcessMap Provides an enhancement of Process Map process models with the waiting times between two activities A.24. View Model Converter Plug-ins These plug-ins provide converters that convert a process model (i.e. its data structure) into an equivalent view model (i.e. its visual representation) BPMN View Model Converter PMCubeExplorer.ViewModelConverter.BPMN Provides the view model converter to translate a BPMN process model into the respective view model. Change Tree View Model Converter PMCubeExplorer.ViewModelConverter.ChangeTree Provides the view model converter to translate a Change Tree process model into the respective view model. Merged Annotated Transition System (ATS) View Model Converter PMCubeExplorer.ViewModelConverter.MergedATS Provides the view model converter to translate a Merged Annotated Transition System (ATS) process model into the respective view model. Petri Net View Model Converter PMCubeExplorer.ViewModelConverter.PetriNet Provides the view model converter to translate a Petri Net process model into the respective view model. 36