The Pennsylvania State University. The Graduate School. College of Information Sciences and Technology

The Pennsylvania State University
The Graduate School
College of Information Sciences and Technology

CONTEXT-AWARE DESIGN FOR PROCESS FLEXIBILITY AND ADAPTATION

A Dissertation in
Information Sciences and Technology

by

Wen Yao

© 2012 Wen Yao

Submitted in Partial Fulfillment
of the Requirements
for the Degree of

Doctor of Philosophy

August 2012

The dissertation of Wen Yao was reviewed and approved* by the following:

Chao-Hsien Chu
Professor of Information Sciences and Technology
Dissertation Advisor
Chair of Committee

Lee Giles
David Reese Professor of Information Sciences and Technology
Professor of Computer Science and Engineering

Heng Xu
Associate Professor of Information Sciences and Technology

Akhil Kumar
Professor of Supply Chain and Information Systems

Mary Beth Rosson
Professor of Information Sciences and Technology
Director of Graduate Programs

* Signatures are on file in the Graduate School

ABSTRACT

Today's organizations face continuous and unprecedented changes in their business environment. Traditional process design tools tend to be inflexible and can only support rigidly defined processes (e.g., order processing in the supply chain). This considerably restricts their value in real-world applications, especially in a dynamic and changing environment. Recently, studies have contributed to the development of Adaptive Process Management Systems (APMS) that can facilitate fast implementation and deployment of business processes and allow for flexible adaptation. Most of them, however, only support ad hoc changes that require human intervention, and only ensure the structural correctness of these changes. This type of adaptation is time- and effort-consuming if such changes happen frequently. It is also difficult to trace and manage the sources of such changes so as to support automatic adaptation and improve reusability on the basis of past experience. Furthermore, these approaches are not feasible in a knowledge-intensive environment, because few of them examine the semantic correctness of process changes. Hence, there is a need for new approaches that design processes for flexibility and adaptability.

This dissertation introduces a new approach that integrates context-awareness into process flexibility and adaptation to overcome these limitations. It centers on three main projects, using principles of the design science methodology. First, we discuss the need for context-awareness in flexible process design and develop a formal approach to enable this design process. We propose an ontology-based method to model process contexts and Complex Event Processing (CEP) to detect critical situations. We also discuss the architectural support for context management. Second, we propose various adaptation strategies and integrate them at both the process model and instance levels in a context-aware manner. Specifically, we develop a process template and rule-based approach, considering business contexts such as organizational policies, to configure process models at design time. This can handle a large number of process models with small variance and facilitate their management when business objectives change. At the instance level, we propose a placeholder-based approach to

customize the subsequent workflow on the fly, based on the dynamic contexts and case data at runtime. Since the context that impacts process instance adaptation is highly domain dependent, we describe this work in clinical settings. We propose a framework called CONFlexFlow and show how flexible and adaptable clinical pathways can be designed, taking into account medical knowledge in the form of rules and detailed contextual information, to achieve high-quality outcomes. These pathways are selected during workflow execution based on rules that encapsulate medical knowledge and the dynamic context at runtime. Thus, each process instance is customized to an individual patient case based on the patient data, resource availability, etc. Third, we propose a Mixed Integer Programming (MIP) model to check the compliance of process models and to validate the semantic correctness of process adaptations. We propose a formal specification language to model the semantic constraints of activities, including presence and dependency relationships, ordering sequences, role assignments, and obligations. The compliance issue is then formulated as an MIP problem. We introduce three novel ideas: the notion of a degree of compliance of a process, the concepts of full and partial validity of change operations, and the idea of compliance by compensation. Based on these ideas, we use mixed integer programming to check: (a) the semantic compliance of a process model and its evolution; and (b) whether ad hoc changes made to running process instances are semantically valid. If not, we calculate the minimal set of compensation operations needed, based on the degree of non-compliance, and transform a non-compliant process into a compliant one. We show that this novel approach is more elegant than, and superior to, a pure logic-based approach.

Throughout this dissertation, most of the examples are drawn from the healthcare domain for illustration. Clinical workflow has been considered a killer application area for process management technologies, due to its dynamic, complex, and knowledge-intensive nature. We show that our approach can improve care quality by increasing the flexibility and adaptability of clinical pathways while ensuring their semantic correctness.

TABLE OF CONTENTS

List of Figures
List of Tables
Acknowledgements

Chapter 1. Introduction and Problem Statement
    Problem Statement
    Background and Motivations
        A scenario in clinical settings
        Process flexibility
        Context handling in contemporary approaches
        Motivations
    Research Framework and Contributions
        Research framework
        Contributions
    Organization of the dissertation

Chapter 2. Literature Review
    Process Model Flexibility
    Process Instance Adaptation
    Flexibility by Underspecification
    Semantic Correctness of Business Processes
    Workflow Technologies in Healthcare

Chapter 3. A Context-aware Design Approach
    Introduction
    The Design Approach
    A Conceptual Model
    An Overall Framework

Chapter 4. Context Specification and Management
    Process Contextualization
    Ontology-based Approach for Context Modeling
        Knowledge representation
        Ontology-based context model
        Context reasoning
        Preliminary implementation and evaluation
    Context Management
    Complex Event Processing for Context Detection
        An application scenario
        CEP preliminaries
        Modeling business events using CEP
        Implementation and assessment
    Discussion

Chapter 5. Context-aware Process Model Configuration and Management
    Introduction
    Preliminaries
        A motivating example
        Formal representation of a process model
        Language to design flexible process variants (FlexVar)
    Rule Representation and Processing
        Rule representation
        Rule categories
        Rule processing and semantics for conflict resolution
        Checking validity of a sequence of change operations
    Process Variant Configuration and Representation
        Overview
        Algorithm details for tree operations
        Tree traversal, representation, and string operations
        Examples
    Managing a Large Repository of Process Variants
        Analysis and Discovery
        Similarity
        Process variant discovery
        Querying a rulebase
        Evaluation of process complexity reduction
    Prototype Implementation
        System architecture
        A use case for handling insurance claims
        Discussion and limitations
    Conclusions

Chapter 6. Context-aware Process Instance Adaptation
    Introduction
    An Integrated Framework: CONFlexFlow
        A running example
        A meta-model
        System architecture
    Clinical Knowledge Representation and Semantic Rules
        Ontological knowledge model
        Semantic rules for clinical reasoning
    Integration of Clinical Pathway and Rules for Flexibility
        A framework for integration
        Designing a flexible clinical workflow using BPMN
        Workflow engine: Drools-Flow
        Customizing an ad-hoc subprocess at runtime
    Discussion
        Results of ontology analysis
        Contributions, success factors and KPIs
    Conclusions

Chapter 7. Ensuring Semantic Correctness of Processes using Mixed Integer Programming
    Introduction
    Preliminaries
        A running example
        Semantic constraints
        Our contributions
    Formal Specification of Semantic Constraints
        Specification of semantic constraints
        Properties of semantic constraints
        Model-level and instance-level semantic constraints
        Constraint validation
    Modeling Adaptive Processes
        Formal representation of process models
        Change operations
        Modeling relaxation of process representation constraints
    Process Compliance Checking using MIP
        Preliminaries
        Formal algorithms for compliance checking
        Examples
        Discussion
    Discussion and Implications
        Comparison of MIP- vs. logic-based approach
        Implementation architecture
        System realization
        Limitations and extensions
    Conclusions

Chapter 8. Conclusions and Future Work
    Conclusions
    Future Work

References
Appendix A: A Single Model that Integrates Rules R1-R
Appendix B: The Medical Plan for Heart Attack in Flow Chart
Appendix C: A Full CPLEX Model for Example

LIST OF FIGURES

Figure 1.1. Process lifecycle
Figure 1.2. A simplified clinical process for proximal femoral fracture
Figure 1.3. Research framework
Figure 1.4. Research issues and solutions organized by chapters
Figure 2.1. Taxonomy of approaches toward process flexibility
Figure 3.1. Context-aware design approach
Figure 3.2. A conceptual model
Figure 3.3. An overall framework for context-aware process design
Figure 4.1. A hierarchy of process context
Figure 4.2. OWL-DL representation of class, individual and properties
Figure 4.3. An ontology-based context model for healthcare
Figure 4.4. Framework for context management
Figure 4.5. A typical surgical workflow
Figure 4.6. Event ontology in CEP
Figure 4.7. Physical and semantic data flow in RFID-enabled hospital
Figure 4.8. Walkthrough of event processing implementation
Figure 4.9. Temporal constructors in Drools Fusion
Figure 4.10. Performance evaluation
Figure 4.11. Simplified surgical workflow modeled by Drools-flow
Figure 4.12. A screenshot of our prototype
Figure 5.1. Description of an insurance process in BPMN notation
Figure 5.2. Rules to be applied to the insurance process template in Figure
Figure 5.3. Two process variants derived from process template and rules
Figure 5.4. BPEL representation of the insurance process model in Figure
Figure 5.5. Different types of rules related to the insurance process template in Figure
Figure 5.6. A process tree for the insurance process template
Figure 5.7. Variant configuration algorithm
Figure 5.8. Revised process tree after applying the variant configuration algorithm
Figure 5.9. Post-order traversal of the process in Figure
Figure 5.10. Illustration of operations on the POS string of the process in Figure
Figure 5.11. Variant V2 is derived from V1 by performing a series of operations
Figure 5.12. A graph of variants showing how one variant is derived from another by operations
Figure 5.13. Algorithm to determine the relationship between two nodes by scanning POS string
Figure 5.14. Architecture of process variant configuration
Figure 5.15. Architecture for variant search, discovery, and instantiation
Figure 5.16. User interface of process variant manager
Figure 5.17. Interface for Variant Configurator
Figure 5.18. Process variant configuration
Figure 6.1. A simplified BPMN clinical workflow modeled in BPMN
Figure 6.2. A meta-model for CONFlexFlow
Figure 6.3. A hierarchy of BPMN 2.0 activity notations
Figure 6.4. System architecture of CONFlexFlow
Figure 6.5. Partial representation of the clinical context ontology
Figure 6.6. Protégé user interface for editing OWL-based ontology
Figure 6.7. Knowledge representation in OWL-DL
Figure 6.8. Partial representation of the heart failure ontology
Figure 6.9. A semantic hierarchy of rules based on clinical knowledge
Figure 6.10. SWRL reasoning using Jess rule engine
Figure 6.11. Encoding and reasoning of chronic heart failure using SWRL rules
Figure 6.12. Methodology for integration of context, activity, and rules
Figure 6.13. A loosely coupled BPMN workflow implemented in Drools-flow
Figure 6.14. CONFlexFlow implementation using the Drools framework
Figure 6.15. XML representation of the nested ad hoc subprocess Treatment-ASP2 in Figure
Figure 6.16. Rules associated with the Treatment and Medication ad-hoc sub-process
Figure 6.17. Results of the ad-hoc subprocess instantiation
Figure 7.1. A simplified clinical process for proximal femoral fracture
Figure 7.2. Process modeling structures and their representation by constraints
Figure 7.3. Formal representation of process structure constraints
Figure 7.4. Representation of relaxed process representation constraints
Figure 7.5. Algorithm for process model compliance checking at design time
Figure 7.6. Algorithm for process compliance checking at initialization time
Figure 7.7. Algorithm for process compliance checking at execution time
Figure 7.8. Overview of the illustrated examples
Figure 7.9. Compliance checking at design time
Figure 7.10. Compliance checking at initialization time
Figure 7.11. Solution and modification to I
Figure 7.12. Compliance checking during the lifecycle - After CP4 is applied
Figure 7.13. An overall architecture for implementation

LIST OF TABLES

Table 1.1. Example contexts for changes in the clinical process
Table 1.2. Example semantic constraints for the clinical process
Table 1.3. Properties and types of process flexibility and adaptation
Table 2.1. Comparison of existing approaches for modeling clinical processes
Table 4.1. Expression and semantics of event constructors
Table 4.2. Common RFID location change events
Table 4.3. Common RFID semantic filtering events
Table 5.1. Basic patterns to design processes
Table 5.2. Base operations of FlexVar language for modifying a process template
Table 5.3. A matrix for verifying the correctness of multiple operations
Table 5.4. Details of performing change operations on a process tree P_tree
Table 5.5. Steps of operations to be performed on post-order traversal string (POS)
Table 5.6. Illustration of steps performed in transforming V1 to V2 in the example of Figure
Table 5.7. An example Variant_index table containing task and structure bit vectors for each variant
Table 5.8. Process complexity metrics and the evaluation results
Table 6.1. Example user-defined context reasoning rules
Table 7.1. Formal specification of semantic constraints
Table 7.2. Model-level semantic constraints for the clinical process in Figure
Table 7.3. Instance-level semantic constraints for the clinical process in Figure
Table 7.4. Algorithm for semantic constraint validation
Table 7.5. Formal specification of primitive change operations
Table 7.6. Change sets occurring during the execution lifecycle of instance I
Table 7.7. Comparison of MIP-based and logic-based compliance checking

ACKNOWLEDGEMENTS

Many people contributed to this dissertation in innumerable ways, and I am grateful to all of them. First, I want to thank my academic advisor, Prof. Chao-Hsien Chu, for his invaluable support on both an academic and a personal level over the past five years. His office door was always open whenever I had a question about my research or writing. I would like to express my special appreciation and thanks to Prof. Akhil Kumar, who has been a tremendous mentor for me. This dissertation would not have been possible without his encouragement, expert guidance, and help. I would also like to acknowledge two other committee members, Prof. Lee Giles and Prof. Heng Xu, for their time, interest, and helpful comments. Thank you for serving on my committee even under hardship, and for making my defense an enjoyable moment. A special thanks to Prof. Wil van der Aalst, a big name in the research area of workflow management. Although we have never met in person, I benefited a great deal from his publications and was greatly inspired by his work throughout my research. I also would like to acknowledge Prof. Henry (Haidong) Bi for encouraging my interest in workflow management and hospital information systems. Many thanks go to my future colleagues at HP Labs, Jerry Rolia, Sujoy Basu, and Sharad Singhal, for giving me enough time to complete this dissertation while working on another project. I am also grateful to them for providing me the great opportunity to work on healthcare workflow. I am appreciative of my fellow graduate students, Zang Li and Rachida Parks, for valuable discussions that improved this dissertation. I would like to thank all the dear friends I met at Penn State and last summer at HP Labs for their kindness, friendship, and support. Finally and most importantly, I would like to thank my beloved boyfriend, Yi Zhang. His support, patience, and unwavering love carried me through all the difficulties. Thank you for always being there for me. I would like to extend my deepest gratitude to my parents in China (although they cannot read English) for their faith in me and for allowing me to be as ambitious as I wanted.

Chapter 1

Introduction and Problem Statement

A business process (or workflow) comprises a series of value-added activities, performed by relevant roles (humans or computer applications) to achieve a common business goal. A business can be viewed as a collection of processes, and the robustness of these processes is, to a large extent, a crucial determinant of the success of the business. For example, order fulfillment, car insurance claim processing, and clinical pathways are critical processes in supply chain, insurance, and healthcare organizations, respectively. A process management system (or workflow management system) is a software tool used to support the design and execution of business processes represented in formal modeling languages. It supports the coordination of activities among various people or computer applications and aims to improve business operational efficiency.

Today's organizations often face continuous and unprecedented changes in their business environment. Conventional process management systems tend to be inflexible (Müller et al. 2004) and can only support rigidly defined processes (e.g., order processing in the supply chain). This restricts their value in real-world applications considerably, especially in a dynamic environment such as mobile computing. Recently, a number of studies have contributed to the development of Adaptive Process Management Systems (APMS) that can facilitate fast implementation and deployment of business processes and allow for their flexible adaptation. However, most of them only consider ad hoc changes that are initiated manually for exception handling (Chiu et al. 1999; Hagen and Alonso 2000; Müller et al. 2004). In most work, structural correctness, such as the absence of deadlocks (or infinite loops) and data inconsistencies, is validated before such a change is applied (Rinderle et al. 2004). These approaches present two major disadvantages. First, the changes are made in an ad hoc way and need human intervention; thus they are time- and effort-consuming. It is difficult to trace the sources of such changes so as to support automatic process adaptation and improve reusability based on past

experiences. Second, the semantic correctness of such changes is overlooked, since these approaches assume that users are aware of domain-knowledge-related constraints. They are not applicable in knowledge-intensive environments like the healthcare domain, where semantic constraints (e.g., drug interactions) play a key role. Hence, there is a need for a comprehensive approach that supports formal process context modeling, allows context-aware process adaptation, and preserves semantic correctness.

1.1. Problem Statement

A business process is a series or network of value-added activities, performed by their relevant roles or participants, to purposefully achieve a common business goal (Ko 2009). Figure 1.1 presents a typical process lifecycle as it is promoted in both research and practice. Process design is driven by business goals, taking into consideration organizational and legal issues. It translates high-level business objectives into concrete models (possibly informal) that can be easily understood. Based on this, the next phase uses formal modeling languages to create executable process models. The three most popular workflow modeling languages are Petri Nets (Peterson 1981), BPMN (OMG 2006), and BPEL (Kloppmann et al. 2005), each with a different focus. The expressiveness and complexity of process models (e.g., conditional branches and loops) are balanced, taking into consideration case data and possible scenarios. After that, a process instance can be initiated, and it may encounter anticipated or unanticipated exceptions during its execution. The execution of process instances is usually monitored so that they can be analyzed or processed by data mining techniques. The last phase involves process evaluation according to specific metrics, and the results can be used to improve or reengineer the original process model.

A variety of factors can trigger the need for process flexibility and adaptation. First, adjustments in high-level business objectives, such as regulations and organizational policies, can lead to changes in business processes (i.e., phase I). Second, small differences in case data present potential changes to the predefined process model (i.e., phase II), which leads to a large number of process models with minor variance (Hallerbach et al. 2010; Kumar and Yao 2012). Third, during execution of each process instance

(i.e., phase III), exceptions can happen that require deviation from its reference model (Hagen and Alonso 2000; Luo et al. 2000).

Figure 1.1. Process lifecycle

Defining the characteristics of variations that give rise to the need for flexible processes is essential to understanding how this need affects the requirements for flexibility (Kumar and Narasipuram 2006). It is important that flexible business processes be designed in such a way as to meet the demands of variations. We define process context as the situational circumstances that can impact process design (e.g., legislation, business policy, culture, etc.) and the execution environment in which a process is embedded (e.g., case data, performance requirements, time, location, etc.). Contextualization of business processes has been emphasized and discussed at the conceptual level (Rosemann et al. 2008). However, a formal approach toward context-aware design for process flexibility is lacking. Furthermore, two conflicting goals need to be balanced: the need for control and the need for flexibility (van der Aalst et al. 2009a). Process flexibility and adaptability are controlled by structural and semantic constraints. Structural correctness (e.g., no deadlocks or data inconsistencies) ensures error-free execution of

processes. On the other hand, semantic constraints, derived from business policies and domain knowledge, enforce their semantic compliance. Business processes must comply with externally imposed regulations (e.g., business protocols, legislation, long-term contracts, and quality norms) and internal policies within an organization (Goedertier and Vanthienen 2006). Semantic constraints are derived from domain knowledge. For example, in healthcare, semantic constraints can include medical knowledge such as drug-drug interactions and interdependencies of medical tasks (Ly et al. 2008).

1.2. Background and Motivations

1.2.1. A scenario in clinical settings

Healthcare is considered the killer application area for process management systems (Dadam et al. 2000), since clinical processes are dynamic, complex, and knowledge-intensive. Thus, we use a clinical example to illustrate the aforementioned problems and challenges from the perspective of a real-world application. Figure 1.2 presents a simplified clinical pathway, adapted from Blaser et al. (2007), for proximal femoral fracture in BPMN notation (OMG 2006). To some extent this process model is well and rigidly defined. For example, after the patient is admitted (T1) and then examined (T2), depending upon the result of the examination, she will either take imaging diagnosis (T5) or have another diagnosis (T3) followed by therapy (T4). T5 refers to a subprocess that consists of an X-ray test (T5_1) and a CT scan (T5_2) executed in sequence, as shown in the dashed box. Later in the process, other well-defined tasks follow, and eventually lead to the end of this process. Although the above clinical process is strictly defined, there are a number of stimuli or triggers that can cause changes to this model and its instances. Such triggers include regulations, guidelines, business policies, culture, and particularities of case data (i.e., individual patients).
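The pathway in Figure 1.2 is block structured: sequences, exclusive (XOR) choices, and a nested subprocess. As a hedged illustration only (this tuple encoding is ours, not the dissertation's formal process representation), the structure and its possible execution traces can be sketched as:

```python
# A sketch of the Figure 1.2 pathway as a nested process tree.
# "SEQ" = sequential block, "XOR" = exclusive choice; task labels follow the figure.

PROCESS = (
    "SEQ",
    "T1: Patient admission",
    "T2: Anamnesis & Examination",
    ("XOR",  # clinical suspicion of proximal femoral fracture?
     ("SEQ", "T3: Symptom & Diagnosis", "T4: Therapy A"),
     ("SEQ",
      ("SEQ", "T5_1: X-Ray", "T5_2: CT"),  # subprocess T5: imaging diagnosis
      ("XOR",  # indication of fracture and operation?
       "T6: Therapy B",
       ("SEQ", "T7: Initial treatment & operative planning",
        "T8: Operative treatment")))),
    "T9: Discharge & Documentation",
)

def paths(node):
    """Enumerate every possible execution trace of a process tree."""
    if isinstance(node, str):          # an atomic task
        return [[node]]
    kind, *children = node
    if kind == "XOR":                  # exactly one branch is taken
        return [p for child in children for p in paths(child)]
    traces = [[]]                      # "SEQ": concatenate child traces in order
    for child in children:
        traces = [pre + post for pre in traces for post in paths(child)]
    return traces

for trace in paths(PROCESS):
    print(" -> ".join(t.split(":")[0] for t in trace))
```

Running the sketch yields the three traces one would expect from the figure: the T3/T4 branch, the imaging branch ending in T6, and the operative branch through T7 and T8.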
Table 1.1 provides several example scenarios for process changes along with their sources. S1 and S2 come from regulations and hospital policies, when a hospital wants to improve its clinical performance. S3, S4, and S5 are related to the case data (e.g., a specific patient) of a process instance, since some regions should be customized to meet the needs of individual cases. For example, an elderly patient may need an additional tolerance test before

any imaging test. S6 and S7 reflect dynamics in the execution environment of process instances, since the attributes of resources (e.g., availability, service quality) can change over time. The dynamics of these case data and environmental data can be obtained from the workflow engine and require adaptation of process instances for specific cases.

Figure 1.2. A simplified clinical process for proximal femoral fracture

Table 1.1. Example contexts for changes in the clinical process

| S# | Description of the scenario | Source | Impact | Acquisition |
|----|-----------------------------|--------|--------|-------------|
| S1 | A patient must be instructed before imaging diagnosis (T5). (new regulation) | Regulation and policy | Process models | Business rule base |
| S2 | A patient must sign a consent form before operative treatment (T8). (new hospital policy) | Regulation and policy | Process models | Business rule base |
| S3 | A patient with bacterial infections should be administered amoxicillin or clindamycin. | Differing case data | Process models | Business rule base |
| S4 | For patients older than 70, an additional tolerance test prior to operative treatment is required due to possible risks. | Differing case data | Process models | Business rule base |
| S5 | A pregnant patient has to take an MRI test or a sonogram test instead of a CT test. | Differing case data | Process models | Business rule base |
| S6 | MRI test and CT test are interchangeable in case any device is unavailable. | Dynamic environment | Individual process instance | Process/workflow engine |
| S7 | In case an administrative person is not available, patient admission can be deferred before her discharge. | Dynamic environment | Individual process instance | Process/workflow engine |
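Scenarios such as S4 and S5 in Table 1.1 read naturally as context rules applied to a planned task sequence. The sketch below is a hedged illustration: the `adapt()` helper and the task names `T_tolerance` and `T_MRI` are our own inventions, not the dissertation's rule language.

```python
# Illustrative context rules in the spirit of Table 1.1 (S4 and S5).
# Task IDs T1..T9 follow Figure 1.2; T_tolerance and T_MRI are hypothetical.

def adapt(trace, context):
    """Apply simple context rules to a planned task sequence."""
    trace = list(trace)  # do not mutate the caller's plan
    # S4: patients older than 70 need a tolerance test before operative treatment (T8)
    if context.get("age", 0) > 70 and "T8" in trace:
        trace.insert(trace.index("T8"), "T_tolerance")
    # S5: pregnant patients take an MRI instead of the CT scan (T5_2)
    if context.get("pregnant") and "T5_2" in trace:
        trace[trace.index("T5_2")] = "T_MRI"
    return trace

plan = ["T1", "T2", "T5_1", "T5_2", "T7", "T8", "T9"]
print(adapt(plan, {"age": 75}))
# ['T1', 'T2', 'T5_1', 'T5_2', 'T7', 'T_tolerance', 'T8', 'T9']
print(adapt(plan, {"age": 30, "pregnant": True}))
# ['T1', 'T2', 'T5_1', 'T_MRI', 'T7', 'T8', 'T9']
```

Chapter 6 develops this idea properly, with ontologies supplying the context and a rule engine selecting the activities for placeholder regions at runtime.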

On the other hand, process adaptation is restricted and controlled by structural and semantic constraints. Structural constraints refer to the control of process execution at the structural level. For instance, by verifying the absence of deadlocks and inconsistent data in a process model at design time, an APMS can determine whether a process is structurally correct or not. This is necessary to guarantee error-free execution of a workflow both before and after making changes. Semantic constraints stem from domain-specific requirements and express dependencies, incompatibilities, existence conditions between activities, etc. (Ly et al. 2008). As an example, such a constraint may state that a possible drug interaction between amoxicillin and oral contraceptives prohibits a patient from taking both medications within 5 days. Similar constraints are also required to ensure that process models are compliant with policies and regulations. Table 1.2 provides several examples of semantic constraints. Hence, an APMS must guarantee that process adaptation violates neither structural nor semantic constraints.

Table 1.2. Example semantic constraints for the clinical process

| S# | Description of the scenario | Source | Impact |
|----|-----------------------------|--------|--------|
| S1 | A patient hypersensitive to penicillin should not be administered the drug amoxicillin or given a shot of penicillin. | Medical guideline | Process models and instances |
| S2 | A patient must not be administered Aspirin and Marcumar within 5 days to avoid possible interactions. | Medical guideline | Process models and instances |
| S3 | A patient with a cardiac pacemaker should be prohibited from having an MRI test. | Medical guideline | Process models and instances |

1.2.2. Process flexibility

Table 1.3 presents an analysis of the properties of process flexibility according to their triggers (i.e., process contexts) and the nature of their impacts. The properties of process flexibility and adaptation are based on the hierarchy developed by Regev et al. (2006).
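The instance-level constraints S1-S3 of Table 1.2 can be sketched as a check over an execution trace of timestamped actions. The encoding below (event tuples, the `violations()` helper, lower-case action names) is our own illustration only; Chapter 7 formalizes such compliance checks using mixed integer programming.

```python
# A hedged sketch of checking the Table 1.2 semantic constraints against a
# patient's execution trace. All encodings here are illustrative.

def violations(events, patient):
    """events: list of (day, action) pairs; patient: dict of case data."""
    found = []
    actions = {action for _, action in events}
    day_of = {action: day for day, action in events}
    # S1: penicillin hypersensitivity forbids amoxicillin and penicillin
    if patient.get("penicillin_allergy") and actions & {"amoxicillin", "penicillin"}:
        found.append("S1")
    # S2: Aspirin and Marcumar must be at least 5 days apart
    if {"aspirin", "marcumar"} <= actions and abs(day_of["aspirin"] - day_of["marcumar"]) < 5:
        found.append("S2")
    # S3: a cardiac pacemaker prohibits an MRI test
    if patient.get("pacemaker") and "mri" in actions:
        found.append("S3")
    return found

print(violations([(1, "aspirin"), (3, "marcumar")], {}))   # a 2-day gap violates S2
print(violations([(1, "mri")], {"pacemaker": True}))       # violates S3
```

Note that such checks must run not only at design time but after every ad hoc change, which is precisely why the dissertation argues for automated semantic compliance checking.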
As illustrated in the above section, three types of process contexts are discussed: Type I (business objectives), Type II (complicated scenarios), and Type III (exceptions). They can impact business processes at different levels and thus exhibit different properties in realizing the specified flexibility. We categorize the impact into three levels of decreasing granularity: process goal, process model, and process instance.

19 Table 1.3. Properties and types of process flexibility and adaptation Process flexibility and adaptation Nature of impact Properties (Regev et al. 2006) Process goal Process model Process instance Extent of Incremental X X change Revolutionary X Duration of Temporary X change Permanent X X Swiftness of Immediate X X change Deferred X Anticipation Ad hoc X X of change Planned X Process context Type I Type II Type III Flexibility at the instance level usually deals with the deviation from a standard process model for handling exceptions (i.e., Type III) encountered at runtime. Flexibility by deviation is temporary and only affects the current process instance, while other instances derived from the same model remain unchanged. It usually requires immediate change in an ad hoc way since the exception is usually unanticipated. For example, in an emergency situation of handling a patient with fracture, it would be appropriate to delay patient admission until treatment or surgery is finished. The overall process model and its constituent tasks remain the same. When the same deviation takes place in many process instances frequently, the process designer should consider incorporating it into the process model (i.e., Type II). Another reason is that events may occur during process execution that was not foreseen during process design. Flexibility at the process model level is permanent and incremental. The change can be applied immediately or deferred depending upon the scenario. Change at the process model level is more complicated since it will affect the related running process instances and different migration strategies are required. For example, a process instance might have already completed the task where the change should be carried out. This instance may continue according to the old process model or restart from the beginning, depending on different migration strategies. 
When so many changes must be made to the process model that the goal of the process itself is affected (i.e., Type I), the modeler needs to redesign the process model and use it to replace the old

one. Change at the process goal level requires redesigning the process model and discarding the old model along with all of its process instances. This belongs to process reengineering, where the underlying process model should be redesigned; exploring this issue is out of the scope of this dissertation. Our focus is on the process context that comprises exceptional events at the instance level and complicated scenarios at the model level.

Context handling in contemporary approaches

Contemporary approaches and techniques for improving process flexibility describe strategies for how to handle flexibility requirements rather than capturing the triggers for process adaptation. Thus, the cause-effect relationship is also neglected in their work. A number of studies survey flexible workflow design approaches (Regev et al. 2006; Schonenberg et al. 2007), but they also focus on the adaptation strategies. Context is mentioned in some studies but is never treated as a first-class citizen. Recent years have witnessed some innovative research in context-aware workflow management. For example, Ardissono et al. (2007) developed a framework for adapting activities based on context, and Adams et al. (2006) described a service-oriented framework for implementing dynamic workflows. Both consider context as a trigger for workflow adaptation. Although the importance of context for process adaptation and flexibility is mentioned in (Kumar and Narasipuram 2006; Ploesser et al. 2010; Rosemann and Recker 2006; Rosemann et al. 2008), there is a lack of research from this perspective. Most studies examine only one type of context. For instance, Modafferi et al. (2005) examined the capability of modeling business logic that is sensitive to the user's context. They extended existing process modeling languages to allow modeling context-sensitive regions. In contrast, Rosemann et al. (2008) emphasized the importance of external contexts such as weather and location.
A variety of contexts can interact with each other and have different impacts on process flexibility. Without a formal and systematic approach to model, collect, and manage process context, it is difficult to automate context-aware workflow adaptation. Context-aware process design is an orthogonal research dimension to the contemporary approaches. However, it is highly relevant and critical for advancing the research area of adaptive process design.

Motivations

Traditionally, and from the technical perspective, process management and context management are two independent research areas (Sell and Springer 2009). Research in context management focuses on the design of context models, context services, sensor-based context acquisition, etc., while work on adaptive process management concentrates on adaptation strategies and addresses only the reactive part of process adaptation, not its actual trigger. Although researchers have become increasingly interested in the impact of context on process flexibility and adaptation, few use a formal methodology to support process context modeling and management. In summary, an adaptive process management system without context-aware capability is unable (1) to support automatic process adaptation when critical contexts are detected, and (2) to track the cause-effect relationship between the trigger and the adaptation, which reduces the reusability of process fragments constructed at runtime. Despite the wide use of the context-aware approach in various domains, its use in process adaptation is limited. Researchers have only introduced this approach recently, recognizing that in a dynamic and changing environment, business processes should be adaptable to contextual information (e.g., unavailability of resources, change of policy). The major difference between our approach and contemporary works is the emphasis on context. Another issue in managing context-aware workflows is the semantic correctness of process change or adaptation. Although this is an important issue, especially in knowledge-intensive environments, it is not fully addressed in contemporary approaches. It should be considered because automatic adaptation without considering process compliance is not likely to be adopted.
Thus, our study also aims to examine the compliance issue in context-aware workflow management.

Research Framework and Contributions

Research framework

Motivated by the above observations, this dissertation proposes a framework that allows context-aware process adaptation while ensuring its compliance, as shown in Figure 1.3. This framework

also shows a number of research issues and groups them into the following three categories. Next, we discuss the research questions in each category and our technology-based approach to address them.

Figure 1.3. Research framework: context (changes in business objective, planned changes, ad hoc changes, and exceptions) triggers process reconfiguration (flexibility) and adaptation (adaptability) of business processes at the model and instance levels, controlled by compliance constraints, both structural (e.g., infinite loops, inconsistent data) and semantic (e.g., regulations, policies, domain knowledge).

Category I: Supporting context-aware capability in business processes
Problem 1: How to apply a context-aware approach to improve process flexibility and adaptation?
Problem 2: How to model and manage (i.e., acquire, reason over, and disseminate) context?

Category II: Context-triggered process reconfiguration and adaptation
Problem 3: How to support context-aware flexibility in process models (i.e., reconfiguration of a process model based on context change)?
Problem 4: How to support context-aware adaptation in process instances (i.e., adaptation of process instances based on context change)?

Category III: Ensuring the correctness of process reconfiguration and adaptation
Problem 5: How to ensure structural correctness?
Problem 6: How to ensure semantic correctness?

A number of studies have contributed to research in adaptive process management, and most of them focus on adaptation strategies for handling exceptions or ad hoc changes (problem 4), e.g., (Hagen and Alonso 2000; Luo et al. 2000), and their structural correctness (problem 5), e.g., (Müller et al. 2004; Rinderle et al. 2004). However, they only deal with adaptation strategies, i.e., how the adaptations

are performed, rather than by what and when the adaptations are triggered (Sell and Springer 2009) (problem 2). Thus, no formal method is provided to support context-aware design for process flexibility and adaptation (problem 1). In a dynamic and pervasive computing environment, where sensors and embedded systems are distributed, process adaptation can occur frequently and become unmanageable (De Leoni et al. 2007). By monitoring workflow activities, the behavior of workflow participants, and environmental factors, a business process can gain the capability of sensing potential changes and making adaptations accordingly in a predictive manner. For example, the physiological data of a heart-attack patient is monitored and analyzed continuously during her treatment. Once any critical situation is detected, her subsequent treatment plan needs to be changed according to her current situation. Thus, we need a formal methodology to contextualize business processes and formally model these contexts. Meanwhile, context-triggered process adaptation involves less human intervention, which makes it more likely to violate semantic constraints. Therefore, providing adaptive processes in an automatic and predictive way should take into account compliance checking of the possible adaptation patterns against semantic constraints (problem 6), especially in knowledge-intensive applications.

Contributions

To address the above issues and problems, this dissertation focuses on how to achieve context-awareness in process flexibility and adaptation while ensuring semantic correctness, following the design science methodology. Figure 1.4 summarizes the abovementioned research issues and proposes our solutions organized into chapters. It includes the following five components:

Context-aware design approach: this part uses a formal methodology, extending the traditional context-aware design approach, for process flexibility and adaptation.
The standard approach includes three parts: context specification, management, and usage (or action). Our focus is context usage, which associates context change with context-triggered adaptation behavior. Further, we add validation and reuse to the standard process to guarantee that the action (i.e., adaptation patterns) is

semantically correct and can be reused for future analysis. Such a cause-effect relationship between context and adaptation patterns is helpful for improving the current process model.

Context specification and management: process context can come from heterogeneous sources that use different data formats and semantics. Contexts can be obtained by referencing related documents, communicating with domain experts, or receiving low-level events from sensors. To facilitate knowledge sharing and communication, we use an ontology-based approach to model process context with reasoning capability. Further, in a pervasive computing environment where data streams are high-volume and high-speed, distributed data should be correlated in a timely fashion to deliver actionable information. We use Complex Event Processing (CEP) to detect critical situations.

Context-triggered process reconfiguration: a process model should adapt to context change in an intelligent way. We use a template and rule-based approach to configure process models on the fly based on the available context. This approach can reduce the large number of process models (a.k.a. process variants) and the effort involved in maintaining these models when a single policy changes, because it separates the basic process flow from business policy elements in the design of a process and also integrates the resource and data needs of a process tightly. We also developed a novel scheme for storing process variants as strings based on a post-order traversal of a process tree, and showed that such a representation lends itself well both to manipulation and to searching a repository of process variants.

Context-aware process adaptation: a process instance should also adapt to context change in an intelligent way. A process execution engine should be notified of context change and support a variety of adaptation strategies according to different scenarios.
We use the placeholder activity (i.e., an ad hoc subprocess) to handle planned changes with runtime dynamics, and illustrate the application of this approach in clinical settings. We showed how adaptable clinical pathways can be designed, taking into account medical knowledge in the form of rules and detailed contextual information, to achieve a high-quality outcome. These pathways are selected during workflow execution based on rules that encapsulate medical knowledge and various clinical contexts.

Semantic correctness: any adaptation to be applied to a business process should be validated against its structural and semantic constraints. Since structural correctness (problem 5) has been extensively studied in (Rinderle et al. 2004), we focus only on the verification of semantic constraints, which stem from domain-specific requirements and express dependencies, incompatibilities, and existence conditions between activities. We propose a formal constraint specification language to model semantic constraints. Further, we apply Mixed Integer Programming (MIP) to check the compliance of processes and the validity of process changes during their lifecycle.

Figure 1.4. Research issues and solutions organized by chapters: a context-aware design approach (Problem 1) in Chapter 3; context specification and reasoning (Problem 2), covering context modeling, acquisition, and reasoning via an ontology- and rule-based approach with CEP for context detection, in Chapter 4; process model flexibility (Problem 3), i.e., context-triggered reconfiguration via a template and rule-based approach, in Chapter 5; process instance adaptation (Problem 4), i.e., context-triggered adaptation via a placeholder activity approach, in Chapter 6; and semantic constraint verification (Problem 6), covering constraint specification and compliance checking via a constraint specification language and an MIP approach, in Chapter 7. (Note: Problem 5 has been well handled by other studies, so it is excluded from our research issues.)

The major objective of our study is to improve the flexibility and adaptability of business processes while ensuring their semantic compliance, so as to make them applicable in dynamic and knowledge-intensive areas. The contributions of this study are many-fold:

We propose a formal approach that integrates context-awareness in adaptive process design. Our approach complements related work in the BPM community from the perspective of process contextualization and context modeling. Instead of focusing on adaptation strategies to support ad hoc changes, our study shifts the emphasis to the triggers that have a potential impact on process design and execution. Thus, it does not require human intervention, and it facilitates reuse of adaptation patterns.

We developed an ontology-based context model for health care to capture important concepts (or contexts) and their relationships in clinical settings. With semantic web technologies, this model improves interoperability among distributed software systems and embedded devices. Thus, we address the issue of heterogeneity in healthcare information systems and capture contexts for clinical workflows.

We use CEP technology to detect contexts that are composed of basic events in an event-driven architecture. CEP demonstrates an efficient capability for modeling and correlating hospital events at large volume and high speed. We show that our approach is efficient in processing real-time events and detects critical situations in a timely fashion.

We propose two types of context-aware adaptation behavior, at both the model and instance levels. First, we use a template and rule-based approach to configure process models at design time to improve process flexibility. It can balance the complexity and the number of process variants in a large repository. Second, we propose to materialize a placeholder activity at runtime for adaptation in the context of clinical settings. We show that clinical pathways are selected during workflow execution based on rules that encapsulate medical knowledge and various clinical contexts that capture different patient cases. Both approaches are triggered by context change and do not require human intervention.
Finally, we propose a formal specification language to model semantic constraints and use an MIP-based approach for compliance checking of process models and for the semantic correctness of

change operations to be applied in process instances. Our approach is novel and presents advantages over pure logic-based approaches.

Organization of the dissertation

This dissertation is organized as follows. Chapter 2 surveys related work in process flexibility. It provides a comprehensive review of approaches to achieving process flexibility and of the context handling in these studies. It exposes a major research gap in this area, which motivates this study. We also explore the use of workflow technologies in handling healthcare processes. Chapter 3 then extends a context-aware design approach to model and manage process context in a formal way. Following that approach, we discuss context specification and detection in Chapter 4. Further, context-triggered process model configuration is handled in Chapter 5, which introduces our template and rule-based approach. Chapter 6 introduces context-aware process instance adaptation and illustrates this approach in clinical settings. Then, Chapter 7 describes an MIP-based approach for checking the compliance of process models and the semantic correctness of change operations during the process lifecycle. Finally, Chapter 8 concludes this dissertation and presents our future work.

Chapter 2

Literature Review

In this chapter, we conduct a literature review of related work on achieving process flexibility and adaptation. Figure 2.1 presents a taxonomy of process flexibility, adapted from Schonenberg et al. (2007). Overall, there are two types of specification languages for describing a process model structure: imperative and declarative. An imperative approach focuses on the strict definition of how a given set of tasks has to be performed. Such process models are constrained by the links between tasks, and task orders are explicitly defined. A declarative approach focuses on what should be done instead of how. A user defines constraints to restrict task execution and scheduling, and all execution paths that do not violate these constraints are allowed. This approach increases flexibility, but it is more difficult to use, manage, and validate. Techniques to achieve process flexibility are varied and can be supported by either of the two specification languages described above. In this chapter, we analyze the literature based on the level of flexibility supported (i.e., the model level or the instance level). This chapter also explores related work on process compliance. Together, these expose a potential research gap that motivates our study.

Figure 2.1. Taxonomy of approaches toward process flexibility: techniques include flexibility by deviation (ad hoc or rule-based) and flexibility by underspecification (late binding, late modeling, or case handling), supported by either an imperative or a declarative specification language.
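The imperative/declarative distinction can be made concrete with a toy example (illustrative only; no real workflow language is used). An imperative model fixes exactly one execution order, while a declarative model admits every ordering of the same tasks that satisfies its constraints:

```python
from itertools import permutations

# Toy illustration of imperative vs. declarative process specification.
# The imperative model names the single admissible execution order; the
# declarative model lists constraints and admits any order satisfying them.

imperative_model = ["register", "triage", "treat", "discharge"]  # the only path

# Declarative constraints: each is a predicate over a candidate trace.
constraints = [
    lambda t: t.index("register") < t.index("discharge"),  # register before discharge
    lambda t: t.index("triage") < t.index("treat"),        # triage before treatment
]

tasks = ["register", "triage", "treat", "discharge"]
allowed = [list(p) for p in permutations(tasks)
           if all(c(list(p)) for c in constraints)]
print(len(allowed))   # prints 6: six orderings satisfy the constraints
```

Here two ordering constraints still leave six of the 24 possible orderings executable, illustrating how declarative specification trades strictness for flexibility.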

2.1. Process Model Flexibility

Recent years have witnessed a growing interest in enabling flexibility at the process model level. Most process design techniques lead to rigid processes where policy is "hard-coded" into the process schema, thus reducing flexibility. As a result, a process-oriented software system may contain a family of business processes with small variances. A number of approaches have been proposed to tackle this issue. For example, Schnieders and Puhlmann (2006) proposed an approach to manage variability implementation using Java variability mechanisms and code generators. A multi-layered approach for configuring process variants from a base layer is presented in (Nakamura et al. 2009). Hallerbach et al. (2010) proposed the Provop approach, which allows a user to design a base process (or process template) with various options. For example, a user can adopt the most frequently used process model as a template or extract the minimum common structure as a template. Another research stream models process variation as a workflow version control problem, and many studies have been conducted along this line (Zhao and Liu 2007). Reference process models have been proposed to handle the large number of processes with variations (Rosemann and van der Aalst 2007). At design time, variation points are assigned to a reference process model. Then domain experts have to configure the variation points manually according to their own needs. In reality, this approach is error-prone because people who are familiar with business policies may have no knowledge of process modeling. Thus, a questionnaire-based guidance component (La Rosa and Dumas 2008) is used to lead an end user through configuring process variants based on a formal conceptualization of domain knowledge. A correctness-preserving approach to handling syntactic and semantic correctness during configuration is proposed and discussed by van der Aalst et al. (2009b).
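A template with variation points resolved by rules can be sketched as follows. This is a minimal illustration in the spirit of configurable reference models and Provop, not the actual mechanism of any cited system; the template, rules, and context keys are all hypothetical.

```python
# Minimal sketch of rule-based configuration of a process template.
# A variation point in the base template is resolved by configuration
# rules evaluated against the deployment context; unresolved variation
# points are simply dropped in this toy version.

base_template = ["receive_order", "<payment>", "ship", "close"]

config_rules = {
    # variation point -> list of (guard over context, fragment)
    "<payment>": [
        (lambda ctx: ctx["customer"] == "new",     ["check_credit", "prepay"]),
        (lambda ctx: ctx["customer"] == "regular", ["invoice"]),
    ],
}

def configure(template, ctx):
    """Expand each variation point with the first fragment whose guard holds."""
    process = []
    for step in template:
        if step in config_rules:
            for guard, fragment in config_rules[step]:
                if guard(ctx):
                    process.extend(fragment)
                    break
        else:
            process.append(step)
    return process

print(configure(base_template, {"customer": "new"}))
# ['receive_order', 'check_credit', 'prepay', 'ship', 'close']
```

One template plus a rule set thus stands in for a whole family of concrete variants, which is exactly the maintenance benefit these approaches target.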
Inspired by Aspect-Oriented Programming (AOP) in software engineering, Charfi and Mezini (2006) proposed an aspect-oriented workflow language to enable a concern-based decomposition of process specifications and process models. According to the principle of separation of concerns, the process logic is encapsulated in a process module whereas crosscutting concerns are encapsulated in aspect modules. Crosscutting concerns are called aspects, and these are shared among different process models, such as

compliance, accounting, billing, authorization, etc. They also proposed AO4BPEL and AO4BPMN, which apply the idea of aspect-oriented process modeling to the BPEL and BPMN languages respectively. AO4BPEL (Charfi and Mezini 2007) uses AOP to support a modular and dynamic strategy for web service composition. It defines an aspect-oriented extension to BPEL where aspects and processes are expressed in XML. Further, it defines a pointcut language to capture the join points that span several processes. AO4BPMN (Charfi et al. 2010), on the other hand, adds aspect-oriented extensions to BPMN in a similar way.

2.2. Process Instance Adaptation

There is a considerable amount of related work on runtime process instance adaptation, especially in the context of exception handling. The focus there is on modifying a running process when exceptions occur due to failed tasks, erroneous information, etc. Techniques for supporting dynamic changes at the process instance level are discussed in (Müller et al. 2004; Reichert and Dadam 1998) and elsewhere. The focus of this work is to allow operations like task insertion, deletion, etc. to be performed on running workflows in response to exceptions. Deviation-based approaches are popular for handling exceptions in workflow instances (Chiu et al. 1999; Hagen and Alonso 2000; Luo et al. 2000; Müller et al. 2004). ADEPTflex (Reichert and Dadam 1998) is a research prototype that provides a complete and minimal set of change operations supporting users in modifying the structure of a running process instance while maintaining its (structural) correctness and consistency. It has been successfully applied in different areas, such as healthcare. Its authors defined correctness properties to determine whether a specific change operation can be applied to a given workflow instance. If these properties are violated, the change is either rejected or correctness must be restored by handling the exception resulting from the change.
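A minimal flavor of such change operations with a structural correctness check can be sketched as below. This is a deliberate simplification, not ADEPTflex's formalism: deleting a task is rejected when a later task still consumes a data element that only the deleted task produces. All task and data names are hypothetical.

```python
# Sketch of an instance-level change operation guarded by a structural
# correctness check (simplified; not the formalism of any cited system).
# Deleting a task is rejected if another task still consumes data that
# only the deleted task produces.

process = ["assess", "order_lab", "review_lab", "treat"]
produces = {"order_lab": {"lab_result"}}   # task -> data elements written
consumes = {"review_lab": {"lab_result"}}  # task -> data elements read

def delete_task(proc, task):
    """Return the process without `task`, or raise if a data dependency breaks."""
    lost = produces.get(task, set())
    still_needed = any(lost & consumes.get(t, set())
                       for t in proc if t != task)
    if still_needed:
        raise ValueError(f"deleting {task!r} would orphan a data dependency")
    return [t for t in proc if t != task]

print(delete_task(process, "treat"))   # fine: nothing depends on its output
# delete_task(process, "order_lab")    # would raise: review_lab needs lab_result
```

Real systems check richer properties (control-flow soundness, instance state, loop structures), but the reject-or-apply shape of each operation is the same.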
Most of the above approaches only allow ad hoc changes that require human intervention. Later, rule-based approaches became common for handling anticipated changes. The initial efforts contributed to AgentWork (Müller et al. 2004), which provides the ability to modify process instances by dropping and adding individual tasks

based on events and Event-Condition-Action (ECA) rules. It also gives a table for checking the compatibility of various operations. However, their change operations are primitive (i.e., at the activity or task level), so users cannot operate at a higher level (e.g., the sub-process level). Weber et al. (2008) summarized a total of 18 change patterns and 7 change support features to foster the systematic comparison of existing process management technology with respect to process change support. Among the 18 change patterns, 14 are used to support unanticipated exceptions and unforeseen situations, while the remaining four address uncertainty by deferring decisions to runtime. A survey of existing APMS was also conducted based on their proposed criteria. In their recent work, they presented the ProCycle approach (Weber et al. 2009), which captures the whole process life cycle and all kinds of changes (i.e., at both the process instance and model levels) in an integrated way. Case-based reasoning (CBR) technology is used to support reuse of change patterns.

2.3. Flexibility by Underspecification

Flexibility by underspecification does not differentiate flexibility at the model level from flexibility at the instance level. It allows an incomplete definition of a process model at design time and provides a concrete realization for the undefined parts at runtime. Generally, it includes the late binding, late modeling, and case handling approaches. The former two approaches are suitable for processes whose overall structure is fixed but whose specific points are unknown. The exact content of these specific points is only known at runtime and may be customized for each process instance. Case handling (van der Aalst et al. 2005) is better suited for processes with a business goal but a vague overall structure. The late binding approach uses the idea of open configuration, i.e., a model where parts of the process are not explicitly modeled but are represented by a placeholder.
During runtime the configuration is completed by binding the placeholder to a specific process fragment (e.g., a subprocess) from a process repository. The late modeling approach goes further and supports binding to a new process created on the fly. Both allow the definition of an open configuration.
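Late binding can be sketched as follows (an illustrative toy, not the implementation of any system discussed here): a placeholder in the model is resolved at runtime by a selection policy that picks a fragment from a repository. All names are hypothetical.

```python
# Sketch of late binding: a placeholder in the model is bound at runtime
# to a concrete fragment chosen from a repository.

repository = {
    "standard_treatment": ["medicate", "observe"],
    "surgical_treatment": ["prep_or", "operate", "recover"],
}

model = ["admit", "diagnose", "PLACEHOLDER:treatment", "discharge"]

def bind_and_run(model, select):
    """Execute the model, binding each placeholder via the selection policy."""
    trace = []
    for step in model:
        if step.startswith("PLACEHOLDER:"):
            trace.extend(repository[select(step)])   # bound only at runtime
        else:
            trace.append(step)
    return trace

# A fixed selection policy stands in for rules evaluated over instance data.
print(bind_and_run(model, lambda _: "surgical_treatment"))
# ['admit', 'diagnose', 'prep_or', 'operate', 'recover', 'discharge']
```

In late modeling, the selected fragment would not come from a fixed repository but could be composed on the fly before binding.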

Worklets (Adams et al. 2006) support late binding through Ripple Down Rules (a special type of rule originally used in knowledge acquisition research) that use instance data to select a process from a repository, and support late modeling by including new models in the Ripple Down Rules. The rules enable adaptability since the selection mechanism can be updated for future use. In Adaptive Nets (van Hee et al. 2006; van Hee et al. 2007), a placeholder is a set of processes that can be attached to a transition. The configuration at runtime is completed by selecting one process from the set. This approach is adaptive in that the nets change themselves during execution. Adaptive Nets are similar to Worklets in the sense of late binding, but do not require the subprocess to finish before the upper net progresses unless synchronization is explicitly specified; they do not support late modeling. The Pockets of Flexibility approach (Sadiq et al. 2005) uses placeholders to contain a set of process fragments, from which an instance template can be created using building activities such as sequence and iteration. A placeholder can thus represent many different instance templates composed of these process fragments, but it is also restricted to them. To ensure structural correctness, the authors provide a discussion of static verification (i.e., conflict validation) and dynamic verification (i.e., template validation). To support flexible process design in knowledge-intensive environments, van der Aalst et al. (2005) introduced a new paradigm named case handling, which describes only the preferred way of doing things; a variety of mechanisms are offered to allow users to deviate in a controlled manner. It is suited to processes that have a goal but no clear workflow control structure. They developed a prototype system, FLOWer, that realizes this research idea. Declare (van der Aalst et al.
2009a) aims to handle the full spectrum of flexibility while supporting the user with recommendations and other process-mining-based diagnostics. It is the most recent of the offerings examined, and its declarative basis provides a number of flexibility features. However, the case handling approach is better at assisting users than at guiding what to do. Thus, verification of structural correctness is difficult.
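The flavor of Ripple Down Rules selection can be sketched with a toy single-classification evaluator (illustrative only; Worklets' actual rule trees are richer): each rule carries an "except" branch of refinements tried when it fires and an "else" branch tried when it does not, so new exception rules refine earlier conclusions without rewriting them.

```python
# Toy single-classification Ripple Down Rules evaluator (illustrative only).
# Each node is (condition, conclusion, except_child, else_sibling). If a
# node's condition holds, its conclusion applies unless a refinement on
# the 'except' branch also fires; otherwise evaluation moves to 'else'.

def rdr_eval(node, case, default=None):
    if node is None:
        return default
    cond, conclusion, exc, els = node
    if cond(case):
        # a firing exception rule overrides this node's conclusion
        return rdr_eval(exc, case, default=conclusion)
    return rdr_eval(els, case, default=default)

# Worklet-style selection: pick a subprocess name from instance data.
rules = (lambda c: c["fever"], "standard_treatment",
         (lambda c: c["allergy"], "alternative_treatment", None, None),
         None)

print(rdr_eval(rules, {"fever": True, "allergy": True}))   # alternative_treatment
print(rdr_eval(rules, {"fever": True, "allergy": False}))  # standard_treatment
```

Adding a new corner case means attaching one more exception node under the rule that last fired, which is what makes the selection mechanism incrementally adaptable.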

2.4. Semantic Correctness of Business Processes

The issue of checking the syntactic correctness of a process after arbitrary changes are made has been extensively studied. There are two streams of research effort devoted to flexible process modeling: the imperative (or change-based) approach and the declarative (or constraint-based) approach. The imperative approach adds exception handling capability to existing workflow management systems (Chiu et al. 1999). For example, AgentWork (Müller et al. 2004) provides the ability to modify process instances by dropping and adding individual tasks based on events and ECA rules. It also gives a table for checking the compatibility of various change operations to ensure their syntactic correctness. Rinderle et al. (2004) conducted an extensive survey of contemporary approaches in terms of their structural correctness in handling dynamic changes. But studies examining the semantic correctness of process changes are lacking. The declarative approach limits process flexibility by enforcing the required constraints (at the process structure level) among tasks. For example, the "pockets of flexibility" approach (Sadiq et al. 2005) defines a basic set of overarching constraints for specifying flexible workflows within which ad hoc changes are allowed. Thus, a process instance can execute on the basis of a partially specified model. Similarly, FLOWer (van der Aalst et al. 2005) describes only the preferred way of doing things, and a variety of mechanisms are offered to allow users to deviate in a controlled manner. Recently, Lu et al. (2009) proposed a constraint modeling approach including selection and scheduling constraints, which are used to conceptually express task selection requirements and to model their temporal properties in a process template, respectively.
Although these studies consider the validation of constraints, they do not distinguish syntactic constraints from semantic ones, nor do they provide a formal mechanism for validation. Other studies have recognized the importance of process compliance in terms of semantic correctness. A logic-based formalism for describing both the semantics of normative specifications and compliance checking procedures is described by Governatori and Sadiq (2009). Their approach can model business obligations and regulate the execution of processes. Sadiq et al. (2007) model control objectives through the Formal Contract Language (FCL), which is based on a specialized modal logic

from normative systems theory. They also proposed to visually annotate and analyze control objectives on graph-based process models. Namiri and Stojanovic (2007) introduced a semantic layer in which process instances are interpreted according to an independently designed set of internal controls. There are also other approaches that use semantic annotations (Governatori et al. 2009). All these approaches can ensure that process execution is compliant with regulatory constraints, but they do not consider the validity of change operations made to running process instances. Another promising approach to process compliance is based on linear temporal logic (LTL) (Awad et al. 2009; Awad et al. 2011). LTL allows modeling of time using operators like next, eventually, always, and until. It is based on a state space graph, where it is possible to use these operators to describe the behavior of a model or a system and to reason about properties that pertain to the next state, to all states, to an eventual state, or to the period until a certain state is reached. The authors propose to extend LTL by adding corresponding operators for writing statements over past states as well, such as previous, once, always been, and since. This allows them to write powerful queries to check whether certain constraints expressed as patterns are valid. They also introduce the notion of anti-patterns, which correspond to scenarios where constraints are violated. Previously, Liu et al. (2007) also used LTL to develop a compliance checking framework for process models. Their approach can only handle compliance at the process model level.
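A drastically simplified flavor of such pattern checking can be sketched over completed, finite execution traces (this is not LTL model checking over a state space; property names and tasks are illustrative):

```python
# Finite-trace simplification of LTL-style compliance patterns (a sketch,
# not a model checker): properties are evaluated over completed execution
# traces rather than over the full state space of the model.

def eventually(trace, task):
    """Finite-trace analogue of 'eventually task'."""
    return task in trace

def always_precedes(trace, first, then):
    """Every occurrence of `then` must be preceded by some `first`."""
    seen_first = False
    for t in trace:
        if t == first:
            seen_first = True
        elif t == then and not seen_first:
            return False
    return True

trace = ["receive_claim", "assess_risk", "approve", "pay"]
print(eventually(trace, "approve"))                      # True
print(always_precedes(trace, "assess_risk", "approve"))  # True
print(always_precedes(trace, "approve", "assess_risk"))  # False
```

A model checker would verify such properties over all possible executions of the model rather than over one recorded trace, which is what makes the LTL-based approaches above considerably more powerful.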
In summary, although these approaches have rich semantics for process compliance checking, none of them considers the semantic correctness of process changes.

2.5. Workflow Technologies in Healthcare

Clinical workflow has been considered a killer application area for process management technologies, due to its dynamic, complex, and knowledge-intensive nature (Dadam et al. 2000). Over the last decade, researchers from different areas have devoted effort to formalizing and automating clinical processes. Table 2.1 summarizes a variety of approaches from three different research communities and compares them in terms of their knowledge base organization and workflow integration capability, since these are two critical components in clinical decision support. Process modeling approaches focus on

improving the flexibility of workflow techniques so they can handle dynamic cases. Software engineering approaches focus on system design and deployment of implemented systems. Finally, guideline modeling approaches emphasize the modeling, reasoning, and integration of medical knowledge for decision support at a single point of care. Further details of these studies are discussed next.

Table 2.1. Comparison of existing approaches for modeling clinical processes (criteria: knowledge/data source, covering form of clinical data, clinical guidelines, and formal vocabulary; workflow integration, covering workflow flexibility and workflow patterns; rows: workflow modeling, ontology modeling, Little-JIL, model-integrated, and CIG/ontology-based approaches; notation: N/A: not applied; ---: not supported/considered; +: weakly supported; ++: strongly supported)

The business process management community has devoted substantial effort to designing customized and flexible clinical processes using formal workflow languages (such as BPMN and BPEL) that are executable by existing workflow engines. Lenz and Reichert (2007) conducted a survey of IT support for healthcare processes and emphasized the importance of flexible workflow design in supporting clinical decisions. Systems have been developed to enable dynamic changes in predefined process models, such as ADEPTflex (Reichert and Dadam 1998) and AgentWork (Müller et al. 2004). Research on context-aware workflow design (Adams et al. 2007; Ardissono et al. 2007; Heravizadeh and Edmond 2008) also belongs to this area, but focuses more on the integration of context in binding specific services or constructing subprocesses, whereas medical knowledge is deemphasized. These techniques are useful and highlight the criticality of context in designing a flexible workflow.
There has also been work from the software engineering community on modeling medical processes. In particular, limitations of current workflow languages have been observed (Zhu et al. 2007), and an alternative approach based on the Little-JIL language (Wise 1998) has been proposed. Little-JIL is a

language that centers on a coordination diagram of the process, described by a hierarchical task decomposition. It helps to coordinate agents and their activities, and allows steps to be performed in sequence or in parallel. A Little-JIL description can be verified using finite state machine verification techniques. Approaches for formally defining and analyzing medical processes using Little-JIL are discussed in (Chen et al. 2008; Christov et al. 2008; Clarke et al. 2008; Osterweil et al. 2007). This stream of research is valuable for its focus on improving patient safety and detecting medical errors, but it does not emphasize flexibility or decision support. Mathe et al. (2008) developed the Model-Integrated Clinical Information System (MICIS) based on model-based design techniques to represent complex clinical workflows in a service-oriented architecture. MICIS translates models into executable constructs such as Web services or BPEL processes. The security and privacy specifications are enriched, and treatment protocols are transformed into a clinical process modeling language to promote software reuse and maintainability (Mathe et al. 2009). The medical informatics community has taken a distinctive approach that models clinical processes as guidelines or care plans, recognizing that some important activities in clinical practice are treated only as artifacts in workflow modeling. Its focus is on medical decision making by interpreting situations and events in light of best practice. It also formalizes process definitions in a way that reflects clinical tasks and constraints as clinicians perceive them. Medical guidelines were originally free-format text documents used to assist medical decisions during diagnosis, management, and treatment within different areas of healthcare.
Recently, various approaches have been proposed to represent Computer-Interpretable Guidelines (CIGs), such as Arden Syntax, Asbru, EON, GLIF, GUIDE, PRESTIGE, PRODIGY, PROforma, and SAGE; see (Fieschi et al. 2003; Luo et al. 2000; Ohno-Machado 1998; OpenClinical 1999; Peleg et al. 2003). CIGs can produce personalized recommendations during patient encounters and reduce variance in patient treatment. Peleg et al. (2003) reviewed six CIG modeling systems and established a consensus on a set of common components. They represent clinical guidelines as plans, whose components represent decisions, actions, and their relationships. Decision steps are used for conditional and unconditional

routing of the flow, while action steps are used to specify a set of tasks or a sub-plan to be carried out. A review of systems using CIGs was conducted in (Isern and Moreno 2008). There are other approaches that transform manually constructed diagrams into executable workflows. For example, Ongenae et al. (2010) present a semi-automatic verification and translation framework based on ontologies and semantic rules to convert clinical protocols from XML flow charts into executable workflows. They also integrate this framework into a service-oriented architecture. Recently, ontology-based approaches have been used to enhance the semantics of clinical workflows and make customizations accordingly (Alexandrou et al. 2009; Ceccarellia et al. 2009; Dang et al. 2008; Ye et al. 2009). For example, Dang et al. (2008) built an ontological knowledge framework to represent important entities (e.g., roles and resources) in healthcare by capturing the context in clinical pathways. Based on this framework, they developed a system (Dang et al. 2009) that enables the creation and execution of personalized medical workflows using the BPEL language. Similar studies include SEMPATH (Alexandrou et al. 2009) and KON3 (Ceccarellia et al. 2009), but they focus more on rule-based reasoning. Compared to these approaches, our clinical processes are well structured in general, yet some complex groups of medical activities are loosely defined at design time using the ad hoc subprocess construct. In this way, we only need to configure the dynamic parts at runtime without starting from scratch for the entire process. Further, we argue that a BPMN model is more understandable to care professionals since it provides a graphical representation. It also has rich semantics for modeling medical tasks and is thus suitable for modeling clinical processes. In contrast, Ye et al. (
2009) proposed a clinical pathway ontology to represent and exchange pathway-related knowledge. They presented clinical pathways as interconnected hierarchical models, including a top-level outcome flow and an intervention workflow level along a care timeline, with the assistance of semantic rules for modeling temporal knowledge. Although they used the example of a cesarean section surgery to show the applicability of their methodology, a system architecture and implementation are lacking.

Chapter 3

A Context-aware Design Approach

This chapter presents the context-aware design approach for process flexibility and adaptation. We first introduce the formal methodology for designing a context-aware system and how it can be applied in process management. Based on this approach, we present a framework that serves as a concept map for the following chapters.

3.1. Introduction

Context-awareness was first discussed by Schilit et al. (1994) and has since been widely studied in the pervasive computing environment, where user context changes rapidly and adaptation actions are needed. There are more than 150 definitions of the term context, which vary strongly across different domains (Bazire and Brézillon 2005). A widely referenced definition, according to Dey (2001), states that context is any information that can be used to characterize the situation of an entity, where an entity is a person, place, or object that is considered relevant to the interaction between a user and an application. The situation of an entity can be categorized into activity, location and time, and relations (Zimmermann et al. 2007). This serves as a commonly accepted operational definition of context. These definitions of context are quite broad and most suitable for pervasive computing. When applied to other domains, context should integrate the characteristics of that domain, depending on the entities of interest and their situations. There is a variety of context-aware applications, including smart spaces, information systems, mobile commerce, Web services, communication systems, etc. (Hong et al. 2009). The most popular and widely used context-aware applications are in pervasive computing environments or smart spaces, including the smart home, smart hospital, campus, museum, and conference room, where sensors are installed to collect contextual data and send it to a server via a wireless network.
Once the context changes (e.g., a doctor walks into the surgery room), these context-aware applications can adapt their behavior through different actions (e.g., showing the patient's information on the computer screen in the operating room). It is also widely

used in mobile computing, where handheld devices play a key role in providing services and performing content adaptation. One such content adaptation service for mobile devices (Lum and Lau 2002) is built on a quality-of-service-aware decision engine that automatically negotiates the appropriate adaptation decision for synthesizing an optimal content version. Later, the concept of context-awareness was increasingly studied and applied in other disciplines where adaptation is needed. Comprehensive surveys of context-aware applications can be found in the work of Baldauf et al. (2007) and Hong et al. (2009). Context-aware applications differ in their context specification, acquisition, adaptation behavior, etc. For example, in a smart space, the contexts are location, user information, object identity, and so on, collected by different kinds of sensors and systems and used to indicate the situation of users. The context can vary widely even across different smart space applications: a smart home cares about residents and their activities, while a smart museum deals with visitors and the artworks they are interested in. The adaptation behaviors of context-aware systems also differ, representing their ways of reacting to these changes. A context-aware system should be designed according to its specific application, following a standard methodology (Vieira et al. 2010). This dissertation extends this design approach for context-aware process model configuration and instance adaptation.

3.2. The Design Approach

Typically, designing a context-aware system needs to consider the following issues (Vieira et al. 2010): (1) which kind of information to consider as context, (2) how to represent context information, (3) how to acquire and manage context (considering that it may come from heterogeneous sources), (4) how to integrate context-awareness in the system, and (5) how to present useful context in an optimal way.
Figure 3.1 presents a context-aware design approach that takes these requirements into consideration for process flexibility and adaptation in the process management domain. Generally, it includes several steps: context recognizing, representation, management, and usage. Further, our work extends the standard approach by adding one more component, validation, to context usage. As we have mentioned, semantic correctness should be ensured when a process is changed.

Figure 3.1. Context-aware design approach (the domain-independent part covers context management: acquisition, storage, reasoning, and dissemination; the domain-dependent part covers context recognizing, context specification, and context usage with validation, mapped to process contextualization, a context model for processes, context-dependent process adaptation, and compliance checking)

Context recognizing refers to the identification of context variables that cause possible variations in business processes (at both the model and instance levels). Determining the information necessary to build up context is not a trivial task. Several types of information can contribute to the context, and the relevance of each piece of information depends on the situation at hand (Nunes et al. 2009). Context specification (or representation) involves representing the recognized context in a format that is understandable and acceptable. It requires a formal language that can create a shared context model specifying an explicit representation of the recognized context variables. It should also be amenable to logical reasoning mechanisms to check the consistency of context information, to compare contexts with one another, and to infer new context from existing ones (Nunes et al. 2009). Context management refers to the mechanisms for manipulating context variables and handling context dynamics (Vieira et al. 2010). It is related to how context will be implemented in the system and is defined in terms of four main tasks: acquisition, storage, reasoning, and dissemination. Context usage refers to the use of the specified and managed context to guide variations in context-sensitive behavior or context presentation. It specifies how a process should adapt or react to context change in terms of context-aware behaviors.

Validation refers to the compliance checking of process adaptations before they are applied. It ensures the semantic correctness of the adaptation behaviors. Here, context recognizing, specification, usage, and validation are highly domain-dependent; we thus take into consideration the potential variations in adaptive processes. Context management shares similar properties across applications, so the architectural support for managing context in pervasive computing (a well-established research area) can be reused in our application. For example, sensor technologies and event processing techniques can serve as the mechanism for monitoring and acquiring process context in different application domains (e.g., smart hospitals, supply chain management). In particular, the advent of RFID technology has bridged the physical and virtual worlds.

3.3. A Conceptual Model

Figure 3.2 presents a conceptual model that illustrates the key ideas in context-aware process flexibility and adaptation. We differentiate context variables and actual contexts by their definitions. A context variable is any piece of information that can be used to characterize the design of a process model or the execution environment of a process instance. The context is the actual data that instantiates context variables and triggers adaptation behaviors. For example, the location for performing surgery is a context variable, while surgery room 101 is the context (data). Context variables can come from a variety of sources and trigger context-aware behavior through rules. Change operations can configure or customize a process model at design time, producing process variants based on business, social, or environmental contexts. For example, changes in business goals will trigger reconfiguration of the corresponding process models. Adaptation commands can customize a process instance at runtime, based on case-related data or computing context.
For example, each patient shows different symptoms and requires customization in each instance. A process model should be compliant with semantic constraints, including presence and dependency constraints, role capability constraints, and obligations. Adaptation behavior should be checked for validity against these semantic constraints.
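The core distinction in this conceptual model, between a context variable (the declared slot), the context (the actual data instantiating it), and the rules that trigger adaptation behaviors, can be illustrated in a minimal sketch; all names here are invented for illustration and are not part of the formal model:

```python
# Illustrative sketch (invented names) of the conceptual model's core
# distinction: a ContextVariable is the declared slot, the context is
# the actual data that instantiates it, and rules trigger adaptations.
from dataclasses import dataclass

@dataclass
class ContextVariable:
    name: str             # e.g., "surgery_location"
    var_type: str         # e.g., "Location"
    value: object = None  # actual context data instantiating the variable

@dataclass
class Rule:
    rule_type: str        # e.g., model-level vs. instance-level
    condition: object     # predicate over the current context
    adaptation: str       # change operation or adaptation command

def triggered(rules, context):
    """Return the adaptation behaviors triggered by the current context."""
    return [r.adaptation for r in rules if r.condition(context)]

# "location for performing surgery" is the variable; "surgery room 101"
# is the context (data) that instantiates it.
loc = ContextVariable("surgery_location", "Location", "surgery room 101")
rules = [Rule("instance", lambda c: c["surgery_location"].value is not None,
              "bind_room_resources")]
print(triggered(rules, {"surgery_location": loc}))  # ['bind_room_resources']
```

In the full model, the returned adaptations would still be validated against semantic constraints before being applied.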

Figure 3.2. A conceptual model (a UML class diagram relating ContextSource, ContextVariable, Rule, Adaptation Behavior, and Semantic Constraint, with constraint types Presence, Dependency, Resource capability, and Obligations; adaptation behaviors specialize into Change Operations that configure a (sub)process into a ProcessVariant and Adaptation Commands that adapt a ProcessInstance, whose Activities are performed by Roles and associated with Data)

3.4. An Overall Framework

Figure 3.3 presents the overall framework for process flexibility and adaptation that integrates the capability of context-awareness in process model configuration, process instance adaptation, and the verification of their semantic correctness. Further, the case base is used to log the execution of process instances by associating contextual data with adaptation patterns. Thus, process modelers can get useful feedback by learning from past experiences and improve the current workflow models. They can also analyze the performance of the processes when they adapt to different situations.
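The case base pairs logged contextual data with the adaptation pattern that was applied; a minimal retrieval sketch follows (the data structure and the attribute-overlap similarity measure are assumptions for illustration, not the dissertation's design):

```python
# Minimal case-base sketch: log <context, adaptation pattern> pairs and
# retrieve the pattern whose logged context best matches the current one.
# The attribute-overlap similarity below is an assumption for illustration.

case_base = []  # list of (context: dict, adaptation_pattern: str)

def log_case(context, pattern):
    case_base.append((dict(context), pattern))

def best_pattern(current):
    """Return the adaptation pattern of the most similar logged case."""
    def similarity(logged):
        return sum(1 for k, v in logged.items() if current.get(k) == v)
    if not case_base:
        return None
    ctx, pattern = max(case_base, key=lambda c: similarity(c[0]))
    return pattern

log_case({"urgency": "high", "age_group": "elderly"}, "insert_cardiac_monitoring")
log_case({"urgency": "low", "age_group": "child"}, "skip_stress_test")
print(best_pattern({"urgency": "high", "age_group": "adult"}))
# insert_cardiac_monitoring
```

A real case base would also record outcomes, so that modelers can judge how well each adaptation performed in its situation.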

Figure 3.3. An overall framework for context-aware process design (an event cloud fed by RFID systems, temperature sensors, and information systems is processed into complex events; the context-awareness module maintains the context model and context database; the adaptation manager checks compliance with semantic constraints via the compliance checker and triggers configuration and adaptation in the process execution engine, logging <context, adaptation pattern> pairs in the case base for analysis and feedback to the process designer)

Context-awareness module: This module includes several important components: the process context model, context management, and the context database. We use semantic web technologies (i.e., ontology and rules) to model process context because they can capture the interdependencies of context variables and enable automatic reasoning. We use the Web Ontology Language-Description Logic (OWL-DL) (Bechhofer et al. 2004) to build an ontology-based process context model that captures the contextual information of a process model. In addition, the Semantic Web Rule Language (SWRL) (Horrocks et al. 2004) is used to enable user-defined reasoning. For example, patient urgency level (C1) is a context variable derived from patient age (C2), blood pressure (C3), and history of heart attack (C4). A SWRL rule can be used to reason about the urgency level of Steve Mullin based on his actual data for C2, C3, and C4. This actual context information is stored in the context database and updated in real time to keep it accurate.
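The urgency-level derivation described above (C1 from C2, C3, and C4) can be illustrated in plain Python; the thresholds and scoring are invented purely for illustration and are not clinical guidance or the dissertation's actual SWRL rules:

```python
def urgency_level(age, systolic_bp, history_of_heart_attack):
    """Derive context variable C1 (urgency level) from C2 (age),
    C3 (blood pressure), and C4 (heart-attack history).
    Thresholds and scoring are illustrative only."""
    score = 0
    if age >= 65:
        score += 1
    if systolic_bp >= 160 or systolic_bp <= 90:
        score += 1
    if history_of_heart_attack:
        score += 1
    return ["low", "medium", "high", "critical"][score]

# e.g., a 70-year-old patient with systolic BP 170 and a prior heart attack
print(urgency_level(70, 170, True))  # critical
```

In the actual framework this derivation is expressed declaratively as a SWRL rule over the ontology, so the reasoner updates C1 automatically whenever C2, C3, or C4 changes in the context database.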

Context management is responsible for capturing events and data from context sources, using rule-based reasoning to update the context in the database, and disseminating triggered actions to context consumers (i.e., the process execution engine). This component is composed of context acquisition, reasoning, and dissemination. The details will be discussed in Chapter 4. Specifically, we focus on context detection, which is a major mechanism in context acquisition.

Adaptation management module: The adaptation manager receives actions from the context-awareness module and checks the validity of these actions by interacting with the compliance verification module. Only adaptation patterns that are compliant will be processed further. Generally, there are two types of adaptation strategies. First, a process model can be reconfigured into a process variant. This is an appropriate way to manage a large number of process models with minor variance. Second, running process instances can be modified by sending commands to the workflow engine. We describe these two types of strategies in Chapter 5 and Chapter 6, respectively. Process change operations can include the insertion, deletion, and move of an activity or a subprocess fragment (Weber et al. 2008), supported by an adaptive process management system. Compared to existing work, our study focuses on the automatic triggering of these change operations while ensuring their compliance with a set of semantic constraints. The cause-effect relationship between context and adaptation patterns is used for case-based reasoning and for providing summaries and feedback to process modelers.

Compliance verification module: The compliance verification module examines triggered adaptation patterns and ensures their compliance with semantic constraints. Existing work on process adaptation uses a manual approach to modify running process instances and does not consider potential violations of semantic constraints.
For example, inserting a "medication A" activity can violate a drug-drug interaction constraint if medication B is already present and medications A and B conflict. Normally, semantic constraints are obtained from domain knowledge and captured only in domain experts' minds. However, experts can easily forget them in the dynamic and complex

environment when process adaptations occur frequently. In our framework, since process adaptations are triggered by context change (i.e., situation) in an automatic way, semantic constraints must be formally represented and their compliance should be checked. We use mixed integer programming (MIP) to model the compliance checking problem, implemented with CPLEX. Details are described in Chapter 7.
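The drug-interaction example above can be stated as a simple feasibility check: an insert operation is rejected if it would introduce a pair of activities listed in the interaction constraints. The sketch below is a hedged illustration of this one constraint class (names and the conflict list are invented); the MIP formulation in Chapter 7 generalizes it to all semantic constraints:

```python
# Illustrative pre-check for one class of semantic constraint:
# drug-drug interaction. The MIP model in Chapter 7 generalizes this;
# names and the conflict list below are invented for illustration.

CONFLICTS = {frozenset({"medication_A", "medication_B"})}

def compliant_insert(instance_activities, new_activity):
    """Return True if inserting `new_activity` into the running
    instance violates no drug-drug interaction constraint."""
    for existing in instance_activities:
        if frozenset({existing, new_activity}) in CONFLICTS:
            return False
    return True

running = ["admission", "medication_B", "testing"]
print(compliant_insert(running, "medication_A"))  # False: A conflicts with B
print(compliant_insert(running, "medication_C"))  # True
```

Only insert commands that pass such checks would be forwarded to the workflow engine; non-compliant adaptations are rejected before execution.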

Chapter 4

Context Specification and Management

4.1. Process Contextualization

Defining the characteristics of the variations that give rise to the need for process flexibility is essential to understanding how this need affects the requirements for flexibility (Kumar and Narasipuram 2006). Process flexibility should be designed to meet the demands of these variations, while the strategies and tactics for achieving flexibility should be appropriate to the flexibility design requirements. We define process context as the situational circumstances that impact process design (e.g., legislation, business policy, culture) and the execution environment in which a process is embedded (e.g., case data, performance requirements, time, location). Contextualization of business processes has been emphasized and discussed at the conceptual level (Rosemann et al. 2008). Currently, however, process models are designed in complete isolation from their environment and in a static, prescriptive manner; at most, contextual variables are captured through textual annotations or decision points, making process models overly verbose (Ploesser et al. 2010). Failing to adapt processes to changes in the environment can lead to poor performance in terms of service, time, and cost (Ploesser et al. 2010). For example, the season can affect the check-in process for an airline (holiday season vs. other seasons); if fuel prices go up, a shipment process may need to change its transportation method to reduce cost. Figure 4.1 presents a general hierarchy of process context organized by attributes. Social context refers to the social environment that has an impact on process models. For example, a patient should sign an agreement form before surgery.
The design of process models that include the surgery task should take these social factors into consideration; changes in them will also impact or require redesign of the related models. Environmental context refers to the environment in which a process is executed, such as time, location, and weather. The clinical workflow in Figure 1.1 might be handled differently in different states of the USA (i.e., location) because of the availability of medical staff, drugs, working

hours, etc. Further, case data comprises the actual data values of case variables in each process instance. For example, three patients aged 5, 30, and 80 should be treated differently if they suffer a femoral fracture. Their medical background is also a kind of case data, so it should be considered as well. Case data must be modeled for each process model, since different processes have different variables (e.g., a clinical workflow versus insurance handling). Finally, computing context refers to the resources and exceptional events that might arise during process execution. For instance, when a resource becomes unavailable, a process should adapt quickly. The process context is highly dependent on the application domain, so this hierarchy of process context needs to be customized to where it is applied. For example, supply chain and healthcare applications will result in two entirely different process context models.

Figure 4.1. A hierarchy of process context (social context: business objectives, regulations, policies, standards; environmental context: time, location, weather, frequency; case data; computing context: resources, exceptional events)

4.2. Ontology-based Approach for Context Modeling

Specifying all contexts in a single model is not practical because the model would be huge and difficult to manage. Thus, we use an ontology to facilitate knowledge sharing and merging. In this chapter, we only demonstrate context modeling for case data, which is knowledge-dependent and difficult to

capture. Our model handles case data for a clinical workflow in a smart hospital where devices and sensors are embedded. The heterogeneous nature of software systems and embedded devices leads to interoperability problems (Kataria et al. 2008). The data and information stored in various systems and embedded devices are in different formats, so knowledge sharing and communication become difficult. This hinders collaboration among healthcare professionals and experts. Furthermore, semantic interoperability is needed to support context-aware services that enable an end user to utilize services at any time and any location; otherwise, contextual reasoning would not be possible. Recognizing that a well-designed ontology can enable information sharing, reusability, extensibility, reasoning, and inference, we present a Context-embedded Intelligent Hospital Ontology (CIHO) using OWL-DL (Bechhofer et al. 2004) to define a common vocabulary for the hospital domain and thus provide a shared understanding of the structure of information, with embedded contextual information, among people and software agents. To test the feasibility of our model, we implement CIHO in the ontology editing tool Protégé 3.4 (Stanford University 2009) and evaluate it. In the following, we introduce knowledge representation in OWL-DL and present the CIHO model. We then illustrate the feasibility of context reasoning for possible applications, followed by a preliminary implementation and evaluation. This healthcare ontology model will be extended and integrated in the execution of flexible clinical pathways.

Knowledge representation

We use OWL-DL, a semantic markup language with expressive power and inference capability, to support semantic interoperability of context-embedded hospital knowledge among various entities. OWL uses an object-oriented approach to describe the structure of a domain in terms of classes and properties.
Classes represent important objects or entities in the hospital, and individuals are instances of classes. The built-in OWL property owl:subClassOf allows us to structure the hierarchical relationship between super-class and sub-class entities. Further, each class is associated with its attributes through datatype properties and with other entities through object properties. Datatype properties link an individual to an XML Schema

datatype value, while object properties link an individual to another individual. Figure 4.2 illustrates the representation of hospital knowledge. In Figure 4.2 (a), OutPatient is a sub-class of the Patient class and is disjoint with the InPatient class. Figure 4.2 (b) shows that the individual Mary Liu with patientID P003A is an instance of the OutPatient class. Figures 4.2 (c) and (d) show how datatype properties and object properties are represented. Restrictions can also be defined to constrain the classes.

(a) Class:

    <owl:Class rdf:ID="OutPatient">
      <rdfs:subClassOf>
        <owl:Class rdf:ID="Patient"/>
      </rdfs:subClassOf>
      <owl:disjointWith>
        <owl:Class rdf:ID="InPatient"/>
      </owl:disjointWith>
    </owl:Class>

(b) Individual:

    <OutPatient rdf:about="#">
      <hasPatientID rdf:datatype="http://www.w3.org/2001/XMLSchema#string">P003A</hasPatientID>
      <hasPersonName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Mary Liu</hasPersonName>
    </OutPatient>

(c) Datatype property:

    <owl:DatatypeProperty rdf:about="#salary">
      <rdfs:domain rdf:resource="#HospitalPersonnel"/>
      <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#float"/>
    </owl:DatatypeProperty>

(d) Object property:

    <owl:ObjectProperty rdf:about="#hasParticipant">
      <rdfs:domain rdf:resource="#Activity"/>
      <rdfs:range rdf:resource="#Person"/>
      <owl:inverseOf>
        <owl:ObjectProperty rdf:about="#participateIn"/>
      </owl:inverseOf>
    </owl:ObjectProperty>

Figure 4.2. OWL-DL representation of class, individual, and properties

Ontology-based context model

Figure 4.3 shows a partial representation of our ontology-based context model that describes the main entities of interest, their properties, and their semantic relationships. The class Person defines the general features of a person in healthcare and has Patient and Medical Staff as its sub-classes. The Patient class is further divided into Inpatient and Outpatient sub-classes and is associated with a Document class, which includes the Electronic Medical Records (EMR) and other medical documents.
Medical Staff has datatype properties such as expertise area, level of expertise, and workload, and has subclasses including Doctor, Nurse, Lab staff, and Pharmacist. Interactions among persons are important relationships in a hospital; for example, a Doctor treats a Patient. In addition, the class Person is associated with other

classes such as Facility and Activity, since persons are mobile and participate in a variety of activities. The Facility class generalizes all types of facilities in healthcare and defines a set of properties consisting of name, humidity, and temperature. Its sub-classes include Ward, Office, Test facility, Waiting room, Surgery room, and Building. Support for teams of medical staff can also be added to the ontology.

Figure 4.3. An ontology-based context model for healthcare (classes such as Person, Patient, Medical Staff, Document, Facility, Device, Activity, and Medical Term, connected by object properties such as treat, hasDoc, generateDoc, participateIn, locatedIn, and use, together with datatype properties such as ID, age, gender, blood pressure, heart rate, and temperature)

The Device class defines the general features of a device in a hospital and has Bed and Mobile device as its subclasses. Bed is considered a stable device since it is moved infrequently. Mobile device has sub-classes such as infusion pump, wheel chair, and OXRB. The interaction of patient, staff, and device is identified as a medical task, described by the Activity class, at some place during a specific time period (represented by startTime and endTime). Subclasses of Activity include Admission, Diagnosis, Testing, Treatment, Transfer, and Discharge. These classes are further divided into refined medical activities. Persons, devices, and activities are associated with facilities through the property locatedIn; this can be achieved by location tracking technology such as RFID. The class Medical Term defines the

basic concepts of medical knowledge to be used in documentation. It has Disease and Medicine as its sub-classes. This context model has been implemented in Protégé 3.4 (Stanford University 2009), and its preliminary evaluation is presented at the end of this section.

Context reasoning

Our ontology is populated by contextual data from heterogeneous sensors that represent attributes of entities and their activities in the hospital. A context-aware application monitors input (i.e., context) from external sensors and applications, and automatically adapts its behavior to the changing context. We use first-order logic (with connectors AND, NEGATION, IMPLY, etc.) to express contextual information in the form of (subject, predicate, value), which can easily be converted into OWL-DL or other rule languages for implementation. We consider ontology-based and rule-based reasoning as follows.

Ontology-based reasoning infers implicit contexts from explicit contexts based on class relationships and property characteristics. Standard reasoning rules that support OWL-DL entailed semantics can be defined for relationships like subClassOf, subPropertyOf, disjointWith, inverseOf, TransitiveProperty, and FunctionalProperty.

    ObjectProperty: (?hasPatient owl:inverseOf ?treatedBy)

    Context 1 (explicit):
    <Doctor rdf:ID="John_Smith">
      <hasPatient rdf:resource="#Mary_Liu"/>
    </Doctor>

    Context 2 (implicit):
    <Patient rdf:ID="Mary_Liu">
      <treatedBy rdf:resource="#John_Smith"/>
    </Patient>

In our ontology, hasPatient is an inverse property of treatedBy. Context 1 is explicitly defined by the user and shows that Doctor John Smith has a patient named Mary Liu. Through ontology reasoning, a new context (Context 2), stating that patient Mary Liu is treatedBy Doctor John Smith, can be implicitly deduced based on the semantics of owl:inverseOf. This can be very useful for a personalized patient care service.

Rule-based reasoning.
Rule-based reasoning is a more flexible reasoning mechanism and allows inference of new contexts based on user defined rules. 39
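Both reasoning modes can be illustrated with a small, hand-rolled triple store. The following is a minimal sketch in Python; all names (infer_inverses, apply_rules, hasPresence, Screen_1, etc.) are our own illustrations, not the actual Protégé/OWL-DL implementation.

```python
# A minimal sketch of both reasoning modes over (subject, predicate, value) triples.

def infer_inverses(triples, inverses):
    """Ontology-based reasoning: materialize owl:inverseOf entailments."""
    out = set(triples)
    for s, p, o in triples:
        if p in inverses:
            out.add((o, inverses[p], s))
    return out

def apply_rules(triples, rules):
    """Rule-based reasoning: each rule maps the current triple set to newly
    derived triples; iterate to a fixpoint."""
    out = set(triples)
    while True:
        new = set().union(*[rule(out) for rule in rules])
        if new <= out:
            return out
        out |= new

def surgery_screen_rule(triples):
    """User-defined rule: if a surgery room has a patient and a doctor present,
    the screen located in that room shows the patient's information."""
    present, new = {}, set()
    types = {s: o for s, p, o in triples if p == "type"}
    screen_of = {o: s for s, p, o in triples if p == "locatedIn"}
    for s, p, o in triples:
        if p == "hasPresence":
            present.setdefault(s, set()).add(o)
    for room, objs in present.items():
        patients = [x for x in objs if types.get(x) == "Patient"]
        doctors = [x for x in objs if types.get(x) == "Doctor"]
        if patients and doctors and room in screen_of:
            for pat in patients:
                new.add((screen_of[room], "showPatientInfo", pat))
    return new

ctx = {
    ("John_Smith", "type", "Doctor"),
    ("Mary_Liu", "type", "Patient"),
    ("John_Smith", "hasPatient", "Mary_Liu"),
    ("OR_1", "hasPresence", "John_Smith"),
    ("OR_1", "hasPresence", "Mary_Liu"),
    ("Screen_1", "locatedIn", "OR_1"),
}
ctx = infer_inverses(ctx, {"hasPatient": "treatedBy"})   # implicit context derived
ctx = apply_rules(ctx, [surgery_screen_rule])            # rule-derived context
```

Running this derives the implicit triple (Mary_Liu, treatedBy, John_Smith) and the rule-derived triple (Screen_1, showPatientInfo, Mary_Liu).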

Rule R1: Automated patient identification before surgery
(?SurgeryRoom, hasPresence, ?Patient) ∧ (?SurgeryRoom, hasPresence, ?Doctor) ∧ (?ComputerScreen, locatedIn, ?SurgeryRoom) → (?ComputerScreen, showPatientInfo, ?Patient)

In rule R1 above, when a surgery room "detects" the presence of a patient and a doctor, the computer screen located in this surgery room displays the patient's medical information. Such an application can help prevent wrong-patient and wrong-OR errors in surgery. Other examples include providing security alerts when a device or patient is in the wrong location, warning the nurse when a patient's body temperature rises suddenly, etc.

Preliminary implementation and evaluation

The CIHO model has been implemented using Protégé 3.4, an ontology editor and knowledge-based framework. First, we defined the six top-level classes and 40 sub-classes in the hierarchy. The second step involved creating all the properties for the existing classes, including 47 object properties and 60 datatype properties; their domains and ranges were also defined accordingly. After that, 126 restrictions were created for all the classes to enrich the ontology semantics. Since our implementation is still preliminary, it is available on the web (Yao 2009) for extension and improvement in the future.

We evaluated CIHO with a semiotic metrics suite (Burton-Jones et al. 2005), a multidimensional framework that defines overall ontology quality as the weighted sum of its syntactic, semantic, pragmatic, and social qualities, and compared our ontology with the results from their study. At the outset it should be noted that CIHO is free from syntax errors since Protégé has error-checking capability. The model contains 15 types of syntax (29% of available syntax vs. 17% on average in the aforementioned study), such as class and subclass, property and subproperty, inverse, disjoint, and cardinality. Thus it is rich in syntax.
Semantic quality relates to the interpretability, consistency, and clarity of an ontology. We checked all the words of CIHO in the lexical database WordNet; the results show that our model is interpretable (94% of words exist in WordNet vs. 63% on average in the aforementioned study) and clear (62% average precision of words vs. 78% on average). The reason for the low word precision might be some uncommon uses of words, which need to be improved in the future. Besides, CIHO is logically consistent: we used the Pellet reasoner (Sirin et al. 2007) to check the ontology for inconsistencies in each round of development, and the results show that the ontology is logically valid in terms of logical consistency (i.e., no conflicts), concept satisfiability, and classification. Third, in terms of comprehensiveness, we have a total of 209 classes and properties. However, evaluating pragmatic quality, which consists of the comprehensiveness, accuracy, and relevance of an ontology, is not easy since it requires the assessment of domain experts. Last, since our ontology is quite new, we are unable to evaluate its social impact at this moment. In the future, we may invite assessments from experts in the healthcare domain. We will also improve CIHO by referencing other authoritative and dependable ontologies, for example time, space, medicine, and disease ontologies, to define our terms. In Chapter 6, we will describe how the CIHO model can be integrated with other medical ontologies and rules to provide high-level clinical decision support.

Context Management

Context management is domain independent; it encompasses the mechanisms for handling context according to the predefined context model and reasoning rules. Figure 4.4 presents the architecture for context management, adapted from Vieira et al. (2010). Context acquisition collects specified context from a variety of sources, such as RFID systems, environmental sensors, and the workflow engine. In other words, it obtains the values of context variables through different mechanisms. Popular methods include form filling, context detection, and context inference. Form filling is the simplest approach: the user enters a context value directly (e.g., the age of a patient).
Context detection is usually used in pervasive computing environments or event-driven systems where data arrives in high volume and at high speed; critical situations should be detected and useless data filtered out. We will introduce a useful mechanism for context detection in the next section. Context inference provides high-level context through ontology- or rule-based reasoning, as described before. After context is acquired, it is forwarded to the controller, which coordinates the activities among the other three modules. Low-level context is matched to derive high-level context, which then updates the context knowledge base (KB). Further, the context dissemination module disseminates the context to the context consumers that actually implement the adaptation behaviors. In our example, the adaptation manager can deliver commands to modify a running process instance, while a behavior trigger can notify a user of certain information, such as "patient Smith is entering a restricted area".

Figure 4.4. Framework for context management

4.4. Complex Event Processing for Context Detection

Among the context acquisition approaches, context detection is used to extract context in pervasive environments and presents the most challenges for researchers. Complex Event Processing (CEP) (Luckham 2002) provides an effective solution for processing event streams in real time in today's dynamic business environment. Compared to the delayed-analysis methods traditionally used with relational databases, CEP involves continuous processing and analysis of high-volume and high-speed data streams such as RFID data. It also correlates distributed data to detect and respond to business-critical situations in real time. Thus, CEP helps to turn a variety of data streams into actionable information. For example, in the case of patient identification before a surgery, if a wrong patient is taken to a surgery room mounted with an RFID reader, an alert is triggered and sent to the care provider immediately. Therefore, leveraging CEP to manage hospital events, which are captured by RFID systems and embedded devices, for situation detection can help address the challenges faced by healthcare. In this section, we describe how to use CEP to detect critical situations in RFID-enabled hospitals (Yao et al. 2011).

An application scenario

The hospital is a large, extremely busy, and chaotic environment where hundreds of medical cases are treated every day. Hundreds of doctors and staff members move about constantly; inpatients and outpatients move around under surveillance; and medical devices are frequently recalled for emergency use. Although information systems support the work of medical professionals, they cannot accurately track patient flow and asset flow in real time, so sensing and responding quickly to critical situations becomes impossible in this dynamic and unpredictable environment. In particular, performing a surgery requires extensive information sharing, coordination, detection of emergent situations, and immediate reaction. Achieving these requirements is even more difficult if no proper technology is used to help monitor patient flow, track medical devices, and alert staff to unexpected situations. Failure to do so threatens patient safety, decreases operational efficiency, and increases medical costs.

We use the typical surgical workflow in Figure 4.5 to illustrate the challenges in performing a surgery. The workflow generally involves three phases: preoperative, intraoperative, and postoperative. Five groups of participants are involved in this procedure: patients (P1), transporters (P2), nurses (P3), anesthetists (P4), and surgeons (P5). Surgery-related locations include the ward (L1), the holding area of the operating suite (L2), the operating suite (L3), the holding area of the operating room (L4), the operating room (OR) (L5), the recovery room (L6), and the intensive care unit (ICU) (L7). In the preoperative stage, the transporter brings the scheduled patient along with related documents from the ward to the OR suite.
Before the patient is admitted into the OR suite, the nurse verbally confirms the patient's identity, patient ID, and surgical information. The nurse also reviews the patient's medical record, such as medication, vital signs, and tests, to determine her readiness. The patient then stays in the holding area until the scheduled OR is ready; before the patient is admitted into the OR, the nurse checks her identity again. Before the surgery begins, the anesthetist verbally confirms the patient's identity, medical information (e.g., allergies, medical history), and the scheduled surgery. The anesthetist then reviews the patient's vital signs and test results and administers the most suitable anesthesia. After anesthesia, the patient is ready for the operation, but the surgeon also checks that this is the right patient and the correct surgical site. During the surgery, the surgeon and nurses focus on the operation; if any emergency happens, they can hardly access the patient's personal information, since all of it is documented in paperwork. After the operation, the nurse takes the patient to the recovery room within the OR suite to awaken from anesthesia. Finally, the patient is discharged from the OR suite and transferred to the original ward or the ICU, depending on her medical condition. During the whole procedure, the status of the patient and the OR are updated manually.

Figure 4.5. A typical surgical workflow

From the above surgical procedure, we can identify the following challenges that may threaten patient safety. First, identifying patients manually at different stages by different hospital personnel is error-prone and time-consuming. Since nurses are usually very busy dealing with a number of cases per day, they might occasionally forget to confirm the patient's identity. Besides, the surgeon has to verify the patient by facial recognition if she is anesthetized. As a result, human mistakes can cause wrong-patient, wrong-OR, and wrong-procedure errors in a surgery. Second, the inability to access the patient's electronic health record (EHR) can cause medical errors or delay the detection of emergencies, and incomplete documentation of the patient's medical history can cause wrong anesthesia. During the surgery, although patient monitors show the patient's vital signs in real time, these signals could be interpreted better alongside the EHR (e.g., history, lab tests, medication). Third, limited hospital resources lead to a high recall rate for medical devices, yet tracking a device manually is almost impossible. For example, if a patient suddenly has a heart attack and needs an infusion pump immediately, nurses usually have to look in several places before they can find the device, which delays treatment. Fourth, hospital staff members usually carry heavy workloads, so they might make mistakes: if sponges are counted manually before and after the surgery, surgeons may leave sponges, or even scissors or other small instruments, inside the patient's body. Last, improper disposal of used instruments wastes hospital resources; for instance, reusable instruments may be thrown away by inattentive housekeepers. Clearly, a technology like RFID is urgently needed to automatically identify and track objects in hospitals.
More importantly, hospitals should be able to sense and respond to critical situations in real time. These challenges motivate us to propose an RFID-enabled CEP framework for modeling surgical events and responding to medically significant events in real time.

CEP preliminaries

An RFID-enabled smart hospital generates a variety of data streams that are in different formats and need to be processed in a timely manner. For example, an RFID tracking system continuously generates data about the location and time of tagged items, which has low-level semantics and is not directly useful. Besides, embedded sensors and devices continuously generate environmental or medical data. CEP has been introduced to process and correlate these data; this technique aims at processing multiple streams of data continuously and identifying meaningful events in real time. CEP has several features. First, it can extract basic events from a large amount of data and correlate them to create business events according to user-defined rules. Second, the patterns used to correlate events can include logical, temporal, causal, and other constructors. Third, CEP can react to critical situations in real time. In this section, we formalize the definitions of events, event constructors, and CEP rules. The relationships of the concepts in CEP are illustrated by the ontology in Figure 4.6.

Figure 4.6. Event ontology in CEP

Event

An event can be defined as a record of an activity in a system for the purpose of computer processing (Luckham 2002), or an occurrence of interest in time (Wang et al. 2009). In general, events can be categorized into basic events and complex events (or composite events). We use upper case and lower case, such as E and e, to represent an event type and an event instance, respectively.

Definition 1 (Basic Event). A basic event can be denoted as E = (id, a, t), where id is the event ID, a = {a1, a2, ..., am}, m > 0, is a set of event attributes, and t is the event occurrence time. A basic event is atomic and indivisible, and occurs at a point in time.

Definition 2 (RFID Event). An RFID event can be denoted as e = (o, r, t), where o is the tag EPC, r is the reader ID, and t is the observation timestamp. Although the time of the RFID reading might be earlier than the time when the event is captured, we assume this difference is too small to be noticeable. An RFID event is a basic event.

Definition 3 (Complex Event). A complex event can be defined as E = (id, a, c, tb, te), te ≥ tb, where c = {e1, e2, ..., en}, n > 0, is the vector containing the basic and complex events that cause this event to happen, and tb and te are the starting and ending times of this complex event. It can happen over a period of time (i.e., from tb to te). A complex event is aggregated from basic events or complex events using a specific set of event constructors, such as disjunction, conjunction, and sequence, which are explained in the next section. It signifies or references a set of other events to indicate a situation described in the application scenario. Complex events carry more semantic meaning and are much more directly useful for decision making in business applications.

Event constructors

Event constructors, or event operators, are used to express the relationships among events and to correlate events into complex events. Wang et al. (2006) give a comprehensive set of event constructors and classify them into temporal and non-temporal constructors. In Table 4.1, we adapt and extend these event constructors and list the ones most frequently used for complex event detection. These event constructors can be used to define event patterns that capture meaningful information from real-time data streams.
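Definitions 1 to 3 can be transcribed as plain data structures. The following Python sketch mirrors the notation only; the prototype's actual event classes are Java, and all names here are our own.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class BasicEvent:
    """Definition 1: E = (id, a, t), atomic, occurs at a point in time."""
    id: str
    attrs: Dict[str, Any]
    t: float

class RFIDEvent(BasicEvent):
    """Definition 2: e = (o, r, t), tag EPC o observed by reader r at time t."""
    def __init__(self, o: str, r: str, t: float):
        super().__init__(id=f"rfid:{o}@{r}", attrs={"o": o, "r": r}, t=t)

@dataclass
class ComplexEvent:
    """Definition 3: E = (id, a, c, tb, te), aggregated from child events c,
    spanning the interval [tb, te] with te >= tb."""
    id: str
    attrs: Dict[str, Any]
    children: List[Any]
    tb: float
    te: float

    def __post_init__(self):
        if self.te < self.tb:
            raise ValueError("te must be >= tb")

# Hypothetical usage: two RFID reads aggregated into one complex event.
e1 = RFIDEvent("EPC:patient-001", "reader-ward", 100.0)
e2 = RFIDEvent("EPC:patient-001", "reader-or", 105.0)
move = ComplexEvent("move-001", {"o": "EPC:patient-001"}, [e1, e2], e1.t, e2.t)
```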
For example, the pattern within(E1 ∧ E2, 5s) matches when events E1 and E2 both occur within a time interval of less than 5 seconds; the pattern (E1 ∧ ¬E2) matches when event E1 occurs but event E2 does not; and the pattern (E1*; E2) matches when event E1 is followed by event E2.
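The three patterns just illustrated can be checked over lists of instance timestamps. A small sketch follows; the function names are our own, not operators of any CEP engine.

```python
def within_and(ts1, ts2, dt):
    """within(E1 AND E2, dt): some E1 and E2 instances occur less than dt apart."""
    return any(abs(t1 - t2) < dt for t1 in ts1 for t2 in ts2)

def and_not(ts1, ts2):
    """E1 AND NOT E2: E1 occurs while E2 does not occur at all."""
    return bool(ts1) and not ts2

def every_followed_by(ts1, ts2):
    """(E1*; E2): every occurrence of E1 is later followed by some E2."""
    return bool(ts1) and all(any(t2 > t1 for t2 in ts2) for t1 in ts1)

# Hypothetical timestamps (seconds):
assert within_and([1.0], [4.5], 5.0)         # 3.5 s apart: matches
assert not within_and([1.0], [7.0], 5.0)     # 6 s apart: no match
assert and_not([2.0], [])                    # E1 occurred, E2 absent
assert every_followed_by([1.0, 2.0], [3.0])  # both E1s are followed by E2
```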

Table 4.1. Expression and semantics of event constructors

Logical (non-temporal):
- AND (∧): E1 ∧ E2. Conjunction of two events E1 and E2, without occurrence order.
- OR (∨): E1 ∨ E2. Disjunction of two events E1 and E2, without occurrence order.
- NOT (¬): ¬E1. Negation of E1.

Temporal:
- sequence (;): (E1; E2). E1 occurs, followed by E2.
- window(): window(E1, t). Event E1 occurs for time period t [s: second, m: minute, h: hour]; window(E1, n): event E1 occurs n times (n > 0).
- within(): within(E1, t). Event E1 occurs within less than t; within(E1, t1, t2): event E1 occurs within the interval t1 to t2.
- at(): at(E1, t). Event E1 occurs at time t [system time].
- every (*): E1*. Every occurrence of E1.
- during(): during(E1, E2). Event E2 occurs during event E1.

CEP rules

Based on the formalization of events and event constructors described above, CEP rules are defined to specify domain syntax and semantics. A rule is the predefined inference logic or pattern for detecting complex events. Several studies (Wang et al. 2009; Zang et al. 2008) have described different syntaxes for CEP rules. We use the ECA rule expression language to describe event patterns since it is easy to use and understandable. The generic syntax of an ECA rule can be expressed as:

Rule rule_id, rule_name, rule_group, priority
ON event
IF condition
THEN action1, action2, ..., actionn
END

where rule_id and rule_name are unique for each rule, giving its ID and name; rule_group identifies a group of semantically related rules; priority defines the priority of the rule; event specifies the event of interest; condition is a boolean combination of user-defined functions; and action defines a user-defined procedure (e.g., trigger an alarm) or an update in the database (e.g., an update of patient status). With CEP rules, we can provide sufficient support for processing RFID and other sensor data.
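The generic ON/IF/THEN structure can be mimicked with a small rule record and dispatcher. This is a sketch with our own naming, not any engine's API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class ECARule:
    rule_id: str
    rule_name: str
    rule_group: str
    priority: int
    event: str                                       # ON: event type of interest
    condition: Callable[[Dict[str, Any]], bool]      # IF: boolean check on payload
    actions: List[Callable[[Dict[str, Any]], None]]  # THEN: user-defined procedures

def dispatch(rules, event_type, payload):
    """Fire matching rules, highest priority first; return fired rule IDs."""
    fired = []
    for rule in sorted(rules, key=lambda r: -r.priority):
        if rule.event == event_type and rule.condition(payload):
            for action in rule.actions:
                action(payload)
            fired.append(rule.rule_id)
    return fired

# Hypothetical usage: alarm when an unscheduled patient arrives at the OR door.
alarms = []
r1 = ECARule("R1", "patient_identification", "safety", priority=10,
             event="patient_at_or_door",
             condition=lambda p: not p["scheduled"],
             actions=[lambda p: alarms.append(p["patient"])])
fired = dispatch([r1], "patient_at_or_door",
                 {"patient": "Mary_Liu", "scheduled": False})
```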

Modeling business events using CEP

A smart hospital enabled by RFID can track the movements of doctors, patients, and objects carrying RFID tags. With RFID technology, it is possible to create a physically linked world in which every object is identified, cataloged, and tracked (Wang et al. 2006). To achieve these advantages, the first task for an RFID application is to map objects and their behaviors in the physical world onto their virtual counterparts by semantically interpreting and transforming data from RFID systems and other sensors. We use the CEP techniques described above to model complex events in hospitals, such as medical activities and emergencies. Since most studies focus on the correlation of events in the supply chain (Wang et al. 2009), such as the aggregation of events based on containment relationships, we argue that event patterns in hospitals can be quite different from those in supply chains. For example, hospitals are interested in the resources used during a surgery and all the personnel performing the surgery. The CEP rule is an effective mechanism to filter basic events and extract meaningful information in order to identify medically significant events.

Location transformation

RFID readings can imply object movements and location changes, which are the basis for identifying activities in a clinical workflow and detecting medically significant events. For example, when an RFID observation indicates a wrong patient is taken into an OR, a mismatch between the patient and the OR can be detected; in response to this wrong-patient event, the medical staff is automatically and instantly warned of the mismatch. In addition, location transformations should be recorded for historical data analysis, since patient and asset flows can be traced as a sequence of location changes.

Table 4.2. Common RFID location change events

- E1 (Object o1 enters place l*): E1 = within(¬e(o1, r, t1); e(o1, r, t2), 10 s)+
- E2 (Object o1 leaves place l): E2 = within(e(o1, r, t1); ¬e(o1, r, t2), 10 s)
- E3 (Object o1 moves from l1 to l2, l1 ≠ l2): E3 = within(e(o1, r1, t1); e(o1, r2, t2), 10 s)

* Assume location l is mounted with RFID reader r, l1 with reader r1, and l2 with reader r2 (same for Table 4.3).
+ Assume that the readers are scheduled to bulk-read all objects every 10 seconds (same for Table 4.3).
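Under the 10-second bulk-read assumption noted in Table 4.2, the enter/leave/move events reduce to a comparison of two consecutive read cycles. The sketch below uses our own helper names and illustrative EPC/reader identifiers.

```python
def location_changes(prev_reads, curr_reads):
    """Derive Table 4.2-style events from two consecutive bulk-read cycles.
    Each cycle maps object EPC -> reader ID (readers stand in for places)."""
    events = []
    for o, r in curr_reads.items():
        if o not in prev_reads:
            events.append(("enters", o, r))                # E1: not seen, then seen
        elif prev_reads[o] != r:
            events.append(("moves", o, prev_reads[o], r))  # E3: seen at r1, then r2
    for o, r in prev_reads.items():
        if o not in curr_reads:
            events.append(("leaves", o, r))                # E2: seen, then not seen
    return events

# Hypothetical cycles 10 s apart:
prev = {"EPC:pump-7": "reader-ward", "EPC:bed-2": "reader-ward"}
curr = {"EPC:pump-7": "reader-or", "EPC:chair-9": "reader-hall"}
events = location_changes(prev, curr)
```

Feeding the engine a stream of such cycle pairs yields the object's current location and the period it stays there, as described in the text that follows.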

Table 4.2 lists complex events related to the location change of objects. An object can be any RFID-tagged entity, such as a person, a device, or a bottle of medicine. Based on these events, we can infer the current location of an object and the period during which the object stays in a location.

RFID semantic data filtering

Two types of filtering should be applied to RFID data before further processing: low-level data filtering and semantic data filtering (Wang et al. 2009). We assume that incoming RFID data has already been filtered to sufficient quality by the middleware, since most middleware (e.g., Alien and Symbol) provides low-level data filtering and aggregation. Thus we only consider rules that perform semantic data filtering in a smart hospital. For example, although a medicine bottle in a smart cabinet mounted with an RFID reader is continuously detected as present, we are only interested in when it is put into the cabinet and when it is taken out, in order to automatically update the status of the medicine bottle and the person performing the action. If an unauthorized person moves a medicine bottle, medical staff are alerted automatically and immediately. Table 4.3 captures complex events involving semantic data filtering; other complex events can be derived from these as well.

Table 4.3. Common RFID semantic filtering events

- E4 (Object o1 enters proximity of object o2): E4 = within(E1 ∧ e(o2, r, t3), 10 s)
- E5 (Object o1 is touching/next to object o2): E5 = within(e(o1, r, t1) ∧ e(o2, r, t2), 10 s)
- E6 (Object o1 leaves proximity of object o2): E6 = within(E2 ∧ e(o2, r, t3), 10 s)
- E7 (Objects o1 and o2 move to distance d (m) apart): E7 = within(e(o1, r1, t1) ∧ e(o2, r2, t2), 10 s) ∧ dist(r1, r2) ≥ d
- E5′ (E5 observed at location l1): E5′ = within(e(o1, r1, t3) ∧ e(o2, r1, t4), 10 s)
- E8 (Person o1 puts object o2 at location l1): E8 = within(¬E5′; E5′, 10 s) ∧ type(o1) = person
- E9 (Person o1 takes object o2 away from location l1): E9 = within(E5′; ¬E5′, 10 s) ∧ type(o1) = person
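The put/take patterns (E8, E9) in Table 4.3 can likewise be read off consecutive cycles. The sketch below assumes that "put" means the person and object newly appear together at the location and "take" means they were together there and then leave; this reading of the pattern, and all helper names, are our own.

```python
def together_at(reads, person, obj, reader):
    """E5 restricted to one location: both tags seen by that reader in a cycle."""
    return reads.get(person) == reader and reads.get(obj) == reader

def put_or_take(prev_reads, curr_reads, person, obj, reader):
    """E8 (put): not together at the location, then together.
    E9 (take): together at the location, then not together."""
    was = together_at(prev_reads, person, obj, reader)
    now = together_at(curr_reads, person, obj, reader)
    if not was and now:
        return "put"
    if was and not now:
        return "take"
    return None

# Hypothetical cycles at the smart cabinet's reader:
empty = {}
both = {"EPC:nurse-1": "reader-cabinet", "EPC:med-7": "reader-cabinet"}
```

A guard such as type(o1) = person would be enforced by checking the EPC's registered type before firing the event, as the table's expressions require.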

RFID real-time monitoring

CEP rules can provide effective support for real-time monitoring of RFID-tagged objects, especially medical devices and patients. It is well known that hospitals own a great deal of expensive medical equipment, some of which is stolen on a regular basis (Fuhrer and Guinard 2006). RFID can improve theft prevention by tracking equipment, reducing the severe consequences caused by the lack of vital equipment. If a reader at the building exit detects a piece of tagged equipment without detecting an authorized user, the equipment is being taken out illegally, and an alert is sent to hospital security personnel. Another common application is patient tracking before a surgery. If a reader r mounted at the OR door detects a tagged patient who is not authorized to have a surgery within 45 minutes of the current time (CT), an alarm is triggered to report the mismatch. This rule can be represented as:

Rule R1, patient_identification
ON within(e(p_epc, r, t) ∧ type(p_epc) = patient, 10 s)
IF NOT (SELECT * FROM SURGERY WHERE patient_epc = p_epc AND location_epc = r AND CT ≤ scheduled_time ≤ CT + 45 min)
THEN trigger_alarm
END

Patient monitoring

In addition to the RFID event streams in hospitals, patient monitoring systems continuously track patient physiological data. For example, vital signs monitors track heart rate and blood pressure, and a pulse oximeter monitors a patient's blood oxygen saturation. The value of an individual physiological parameter is low level and generally does not carry much semantic meaning in terms of patient status. To detect medical situations, CEP rules are used to correlate various physiological events with temporal reasoning. Besides, patient medical records may be consulted when triggering alarms, since patients have different medical backgrounds. In the operating room, delayed detection of critical situations threatens the patient's life.

We use hypovolemia danger detection as an example to illustrate the modeling of patient monitoring. Suppose a patient with a history of hypovolemia is being operated on and his vital signs are being tracked by monitoring devices. If his heart rate increases by over 5% and his blood pressure decreases by over 6% within a 5-minute period, an alarm is sent to the medical staff for action. The rule can be represented as:

Define E1 = HeartRate(epc, value, t1)
Define E2 = BloodPressure(epc, value, t2)
Define E3 = observation(epc, r, t3)
Rule R2, hypovolemia_danger
ON (E1.epc = E2.epc = E3.epc) ∧ type(epc) = patient ∧ type(r) = OR ∧ window(increase(E1.value) > 5% ∧ decrease(E2.value) > 6%, 5 min)
IF SELECT * FROM MedicalRecord WHERE patient_epc = p_epc AND hypovolemia = true
THEN send_alarm
END

Data aggregation

Hospitals are flooded with a massive flux of data from RFID systems and other medical monitors. To avoid data overload or missing important events, CEP rules are used to aggregate data automatically. For example, if we detect the presence of the correct patient and medical staff in a surgery room and the light of this room is turned on, an aggregated event "surgery begins" can be inferred. As a result, we can update the status of the surgery room and the related persons automatically.

Implementation and assessment

The CEP framework is designed to collect basic events from heterogeneous sources and correlate them for situation detection. We have implemented a prototype system that aims to offer sense-and-respond capability to a smart hospital so that it can react quickly to emergencies, especially in time-critical scenarios.

Architecture

Figure 4.7 presents the physical and semantic data flow in an RFID-enabled CEP framework. At the lowest level, raw readings from location tracking systems are captured by RFID readers and then filtered (i.e., smoothed and aggregated) by the middleware to remove noisy and redundant data. The produced RFID events, along with data from other embedded sensors or devices, are then passed on to the CEP engine for further processing. These events are basic events since they are captured directly from their sources and have not been aggregated. In addition, data from other information systems or databases is needed for complex event pattern matching; it is inserted into the working memory as facts. Facts and basic events from sensors can be correlated by event constructors. CEP rules are stored in the rule base so that the CEP engine can detect the complex events that signify critical situations. As a result, the complex events carry semantic meaning and can be used by applications. For example, if a certain threat to patient safety is identified, an alert is sent to the care provider.

Figure 4.7. Physical and semantic data flow in an RFID-enabled hospital

CEP engine: Drools Fusion 5.0

We selected the open source software Drools Fusion 5.0 (JBoss 2010c) as our CEP engine, since it provides an integrated platform for modeling rules, events, and processes. Drools Expert is a forward-chaining inference engine using an enhanced implementation of the RETE algorithm (Forgy 1982). To support complex event processing, Drools Fusion was developed to process multiple events from an event cloud for event detection, correlation, and abstraction. It has several advantages. First, it supports asynchronous multi-threaded streams in which events may arrive at any time and from multiple sources. Second, since temporal reasoning is an essential part of CEP, we examined the capability of Drools Fusion to support temporal relationships: it has a complete set of temporal operators that allow modeling and reasoning over temporal relationships between events. Third, it allows complete and flexible reasoning over the absence of events (i.e., negation). Lastly, aggregation of events over temporal or length-based windows is supported through sliding windows.

Figure 4.8. Walkthrough of event processing implementation

Figure 4.8 presents an overall walkthrough of how we implemented complex event processing of RFID and medical data with Drools. The event receivers are defined as Java classes and act as input adapters, fed with data streams from physical data sources; they can also receive simulated streams from data files and computer programs. Event receivers respond to complex event detectors (e.g., patient misidentification) with events of interest. An entry point is a channel through which the events of interest are asserted into the working memory. In this way, all event streams are independent of each other and multiple threads can run in parallel. Complex event patterns are defined using CEP rules. Upon detection of a predefined pattern, alerts are sent out as messages and actions are triggered. Complex events are also fed back into the engine and treated as incoming events.

Temporal reasoning for complex event detection

Traditionally, the RETE algorithm is used in expert systems for production-based logical reasoning, matching a set of facts against a set of inference rules. The basic idea of the RETE algorithm is to construct an acyclic network of rule premises for forward chaining; by default it does not support temporal operators. We do not describe the details of the RETE algorithm here, since we focus on its extension with temporal reasoning. To support temporal constraints in rules, events can be modeled as facts with timestamps that are inserted into the working memory at runtime. In addition, events need to be discarded when they are no longer of interest or cannot contribute to complex events. Temporal relationships can be realized by explicitly stating the conditions in the rules (Walzer et al. 2008). For instance, if a complex event E3 is created when an event E1 occurs followed by an event E2, we can use a rule to express this: IF t_begin(E1) < t_begin(E2), THEN create E3. This rule can then be processed by the traditional rule engine. Other temporal constructors can be realized in the same way.
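The timestamp encoding just described (IF t_begin(E1) < t_begin(E2), THEN create E3) can be sketched as an ordinary function over event begin-times. This is our own illustration, not the Drools internals; the pairing strategy (first later E2 instance) is an assumption.

```python
def derive_sequence(e1_times, e2_times, make_e3):
    """For each E1 instance, pair it with the first E2 instance that starts
    later, creating a complex event E3 spanning the two begin-times."""
    derived = []
    for t1 in sorted(e1_times):
        later = [t2 for t2 in e2_times if t1 < t2]
        if later:
            derived.append(make_e3(t1, min(later)))
    return derived

# Hypothetical begin-times (seconds): E1 = anesthesia given, E2 = incision.
e3s = derive_sequence([100.0, 300.0], [150.0, 400.0],
                      lambda tb, te: ("E3", tb, te))
```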
Figure 4.9 presents 13 temporal relationships between events as well as their semantic meanings, including before (after), meets (metby), overlaps (overlappedby), during (includes), starts (startedby), finishes (finished by), and coincides. All of them are supported by Drools Fusion. Besides, sliding windows are supported for correlating events on temporal or length-based time windows. Our proposed event constructors can be easily supported by these operators. The logical event constructors AND ( ), OR ( ), and NOT ( ) are supported respectively by the operators and (&&), or ( ), and not. Temporal 55

constructors are also transformable. For example, sequence (;) is often combined with within to constrain the time distance between two events. The pattern within(E1; E2, 10 sec) can be expressed as E2(this after[0s, 10s] E1). Similarly, the rest of the temporal constructors can be transformed as well.

Figure 4.9. Temporal constructors in Drools Fusion

Performance evaluation

To test the performance of the proposed CEP-enabled RFID applications, we implemented a prototype in Java for testing and evaluation. Our testbed is a PC with 2.0 GB of RAM and a 1.86 GHz Genuine Intel CPU running the Windows XP Professional operating system. In the prototype, we simulated three event streams to generate basic events: the RFID event stream has four event types; the patient monitoring event stream has eight event types; and the environmental sensor stream has two event types. Each event type has its own attributes and methods to get these attributes, and each event type is used by at least one complex event expression. We used prepared data files to simulate the continuous generation of event instances. Currently, there are approximately 120 complex event expressions involving different kinds of semantic meanings and complexities. All the complex events defined (i.e., E1, E2, E9) and illustrated (i.e., events related to RFID monitoring and patient monitoring) in Chapter 4 are included. Each complex
event expression has at least one logical constructor (i.e., AND, OR, or NOT) and one temporal constructor. The most frequently used temporal constructors are window and sequence (or after), both of which are used by more than 60 complex events. On average, each complex event has approximately two logical and two temporal constructors. Based on these complex event expressions, we have defined 41 rules in total for detecting situations. Each rule uses 1~3 complex event expressions, and two-thirds of the rules involve the correlation of facts, e.g., the medical history of a patient.

Figure 4.10. Performance evaluation: (a) event processing time; (b) detection of complex events

We evaluate the scalability and the situation detection ability of our system with the Drools Fusion rule engine. Figure 4.10 shows the results in terms of processing time and number of detected complex events. We define event processing time as the total time needed to process a given number of incoming events. In Figure 4.10(a), when the number of basic events increases from 2,500 to 50,000, the event processing time increases from 4,800 to 352,000 milliseconds if we use 24 rules. However, if we increase the number of rules to 41, the processing time increases from 5,500 to 560,000 milliseconds. Thus, the number of event rules can have a significant impact on event processing time, especially if they involve complicated temporal and logical reasoning. Figure 4.10(b) shows the number of complex events that are detected for an increasing number of basic events. When the rule number is small (i.e.,
24), we detect fewer complex events (i.e., from 38 to 1,582). When the rule number increases to 41, we detect 44 complex events from 2,500 basic events and 2,577 complex events from 50,000 basic events. Since delay and false positive alarms are two important indicators for hospital practice, we also evaluate the performance of this prototype on the basis of latency and detection accuracy. Of the 2,577 complex events that we detected (when we applied all 41 rules), around one third are related to patient identification at the series of locations a patient needs to go through for a surgery. Another one third signify a variety of medical threats to the patient, e.g., high fever and heart attack. The remaining one third concern sending reminders to surgical personnel beforehand so they can get prepared, access control of medical equipment, and improper disposal of reusable instruments. Since we give a clear definition of rules, the true positive alerts are identified with one hundred percent accuracy. However, in a real setting, the definition of event patterns is fuzzy, so it is not so easy to obtain such positive results. The detection latency in these scenarios is always below one second, which is quite acceptable. However, we need to cut down the number of unnecessary or unhelpful reminders and alerts; otherwise, healthcare professionals can easily get disturbed.

A use case of surgical workflow

In general, the performance of the CEP approach to processing RFID and other sensor data using Drools Fusion is acceptable in the hospital environment. In this section, we illustrate how this approach can improve the surgical workflow described earlier. Figure 4.11 presents the simplified surgical workflow modeled with Drools Flow (JBoss 2010b), which is a part of the Drools integration framework. Two types of basic events are considered: RFID events and patient physiological events.
The hospital database is accessed to retrieve patient medical records and the surgery schedule. These data are inserted into the CEP engine as facts and associated with event streams for complex event detection. When various resources are involved in a surgical workflow, events generated from their activities can impact each other and result in different surgical outcomes. In such a time-critical environment, the temporal relationships of these events need to be captured to detect critical situations.
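Drools Fusion expresses such temporal relationships with bounded operators such as after[0s, 10s]. As a rough illustration only (not the prototype's Java/Drools code; the event fields, names, and window value below are hypothetical), a point-event version of this correlation can be sketched in Python:

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str       # e.g., "enter_or" or "high_temp" (illustrative kinds)
    ts: float       # timestamp in seconds
    subject: str    # e.g., a patient id

def after_within(e2: Event, e1: Event, lo: float, hi: float) -> bool:
    """Point-event analogue of `e2 after[lo, hi] e1`:
    e2 occurs between lo and hi seconds after e1."""
    return lo <= (e2.ts - e1.ts) <= hi

def detect_sequence(stream, first_kind, second_kind, window):
    """Detect complex events: a second_kind event following a
    first_kind event for the same subject within `window` seconds
    (the within(E1; E2, window) pattern from the text)."""
    matches = []
    for i, e1 in enumerate(stream):
        if e1.kind != first_kind:
            continue
        for e2 in stream[i + 1:]:
            if (e2.kind == second_kind and e2.subject == e1.subject
                    and after_within(e2, e1, 0, window)):
                matches.append((e1, e2))
    return matches

stream = [
    Event("enter_or", 0.0, "patient-1"),
    Event("high_temp", 4.0, "patient-1"),
    Event("high_temp", 30.0, "patient-1"),  # outside the 10 s window
]
hits = detect_sequence(stream, "enter_or", "high_temp", window=10)
```

A rule engine evaluates such patterns incrementally over the live stream; the nested loop here is only to make the semantics explicit.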

In this workflow, we define CEP rules to model critical situations in surgical management and embed these rules in workflow activities. We focus on two scenarios: patient identification and patient monitoring. As described, patient identification should be conducted at several stages: admission to L3, admission to L5, anesthesia, and leaving from L6. Accurate and timely identification of patients can avoid adverse situations such as wrong patient, wrong OR, and wrong procedure. When a patient enters an OR, a complex event is generated. The patient identification rule also checks the surgery schedule to see if the patient is scheduled in this OR. If not, an alert is triggered; otherwise, the subsequent activity is activated.

Figure 4.11. Simplified surgical workflow modeled with Drools Flow

Patient monitoring captures the changes in a patient's physiological parameters and sends alerts if any emergency happens. When surgeons are operating on a patient, they are often too focused on the operation to be aware of the patient's condition. Thus, the detection of critical situations may be delayed and the patient's life may be threatened. The physiological events of a patient evolve with time and need to be associated with the patient's medical background. Thus, temporal reasoning is critical to capture this feature. We present a screenshot of patient monitoring from our prototype system in Figure 4.12. The left panel shows that the patient in the current OR is Jack Miller and that his physiological parameters are

changing with time. These physiological data are generated by computer following a normal distribution. For control purposes, we altered some data points to exercise complex event detection. For example, in the CTCO2 chart a peak is created, and in the SvO2 chart a smaller peak and a valley are created. With these variations, we are able to detect different kinds of situations. In addition, the banner bar at the bottom keeps rolling and shows the real-time variations of these parameters, such as increasing and decreasing percentages. The right panel shows raw RFID readings that have not been processed. We can see that their semantics are very low-level and are not meaningful to people. The central panel presents detected complex events such as reminders and alerts. These events carry more semantic meaning and help healthcare personnel respond to critical situations quickly.

Figure 4.12. A screenshot of our prototype
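The data-generation scheme described above (normally distributed samples with a manually injected peak, checked against an alert threshold) can be sketched as follows. The parameter values, threshold, and function names are illustrative assumptions, not the prototype's actual settings:

```python
import random

def simulate_vitals(n, mean=70.0, sd=2.0, seed=42):
    """Simulate one physiological parameter as normally
    distributed samples, as done for the prototype charts."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]

def inject_peak(samples, at, height=40.0):
    """Alter one data point to create a peak, mimicking the manual
    variations added to exercise complex event detection."""
    out = list(samples)
    out[at] += height
    return out

def detect_alerts(samples, upper):
    """Emit the index of every sample that crosses `upper`."""
    return [i for i, v in enumerate(samples) if v > upper]

vitals = inject_peak(simulate_vitals(100), at=40)
alerts = detect_alerts(vitals, upper=90.0)
```

Only the injected peak crosses the threshold here; the unaltered samples stay near the mean, so the single alert corresponds to the manufactured anomaly.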

Discussion

According to our study, a number of benefits can be obtained by using the RFID-enabled CEP framework for designing hospital information systems to improve patient safety and reduce operational costs. The experimental results show that our proposed approach is feasible in practice. Our proposal of integrating CEP logic with RFID technology has several advantages over conventional methods. First, we identify the current challenges in the hospital and model an RFID-enabled smart hospital. With the tracking capability offered by RFID technology, the smart hospital promises to track people, equipment, and even medicines. Thus, these objects gain the power to express themselves: by associating their EPC with the hospital database, we can get more detailed information about these objects. Second, CEP enables semantic interoperability for a variety of sensors, embedded devices, and information systems. CEP can deal with data streams from different sources and correlate them to identify critical situations. Although the data from sensors, the RFID system, and messages from other systems arrive in different formats and at different rates, CEP can easily integrate these basic events with event patterns. For example, the RFID system by itself only provides the location and time information of tagged objects, while humidity sensors only report room conditions. In a real application, however, the room humidity needs to be adjusted according to what that room is used for, and monitoring everything manually is time-consuming and error-prone. Third, with the complex events detected by CEP, service providers are able to react and respond to unexpected changes immediately. That is, they can identify situations that require immediate attention and increase real-time responsiveness. Given the degree of access and visibility CEP provides, hospitals, with their unpredictable and chaotic environments, can benefit greatly.
Last, our proposed framework can improve the quality of care services and reduce operational costs. A number of possible benefits can be brought to hospitals; for example, the system can reduce human errors and resource waste and improve the quality of medical treatment. Most of these benefits have been captured and realized by our proposed rule base.

However, our approach also has some limitations. Our current rule base uses fixed combinations of parameters and values to detect complex events; that is, it is not able to detect uncertain situations. For example, in patient monitoring, a single value cannot determine whether a patient's heart rate is definitely high or low. We need a range of values to model these fuzzy characteristics. Using fuzzy logic to partition the range of values of these physiological parameters can improve the detection accuracy in practice.
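A minimal sketch of the fuzzy partitioning suggested here, using trapezoidal membership functions. The breakpoints are invented for illustration and are not clinical values:

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function: 0 outside (a, d),
    1 on [b, c], linear on the two shoulders."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def classify_heart_rate(bpm):
    """Fuzzy partition of heart rate into overlapping labels, so a
    value near a boundary belongs partly to two classes instead of
    flipping abruptly at a fixed threshold."""
    return {
        "low":    trapezoid(bpm, -1, 0, 50, 60),
        "normal": trapezoid(bpm, 50, 60, 90, 100),
        "high":   trapezoid(bpm, 90, 100, 220, 221),
    }

degrees = classify_heart_rate(95)   # partly "normal", partly "high"
```

A rule could then fire on a membership degree (e.g., "high" above 0.7) rather than on a single crisp cut-off, which is the improvement the paragraph above proposes.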

Chapter 5

Context-aware Process Model Configuration and Management

A variety of contexts can affect a business process in different ways during its lifecycle. A change in high-level business objectives may require a modification of process models, while exceptions may lead to variations in process instances during execution. Thus, a process should adapt in an intelligent way according to the actual context. In this chapter, we focus on process flexibility at the model level and propose a template- and rule-based configuration approach (Kumar and Yao 2012). First, we show how flexible process variants can be configured by applying rules to a generic process template using a configuration algorithm triggered by business context. This leads to a separation of control flow and business policy. Second, we develop a new succinct representation of process trees as strings obtained by a post-order traversal. This structure can facilitate process variant configuration and retrieval. Third, we develop techniques for querying a repository of process variants by means of bit vectors. In this way, the management of variants even in a large repository is greatly enhanced. Finally, we describe a preliminary implementation of our approach with a case study for demonstration. Our focus is on capturing deeper process knowledge and achieving ease of accommodating changes to business policy (i.e., process context as discussed in Chapter 4).

Introduction

Among the many approaches and frameworks for designing business workflows, most are based on mapping a control flow that specifies the coordination of various activities, for instance (Dumas et al. 2005; van der Aalst 1998; van der Aalst et al. 2003). The control flow description of a process is also called a process schema. In general, there are a large number of process schemas in an organization. This occurs partly because many schemas are variants of one another with minor differences among them.
Take, for instance, an insurance company that writes policies for automobile, home, and other kinds of insurance. When claim applications are made, the company has to initiate a different process schema for an automobile accident claim than for a home damage claim. Moreover, in the home damage claim

scenario, a different process must be enacted for a home whose value is less than $100,000 versus a home whose value is more than $250,000. In the former case, only one adjuster might be required to visit the home and appraise the damage, while in the second case two adjusters are required to submit independent reports of damage assessment. In general, if there are thousands of process variants, finding the correct process becomes difficult and error-prone. Most process management tools do not support the configuration and management of process variants (Hallerbach et al. 2010). Usually, the above scenarios are handled by one of two approaches with traditional workflow technologies. The single-model approach captures multiple variants in a single process model using conditional branches; this results in a large and complex spaghetti-like model, which is difficult to understand and expensive to maintain. In contrast, the multi-model approach creates a variant by duplicating a process model and adjusting it to specific needs. This leads to high redundancy because the variants are identical or similar for the most part. More complications arise if the company changes its policy to require two independent assessments only when the value of the home is more than $500,000. Just this simple business policy change will necessitate a change in many process variants, and it is both time- and effort-consuming if every variant pertaining to that policy has to be modified manually. Therefore, business policy should be separated from the process schema. Business rules have been used to increase the expressiveness of existing process modeling languages and improve process flexibility (Goedertier and Vanthienen 2007). Rules have been used to support run-time adaptation of process instances in response to exceptions (Müller et al. 2004; Reichert and Dadam 1998). Other studies are concerned with improving the design-time flexibility of process models.
For example, some studies (van Eijndhoven et al. 2008) have proposed using rules to model business logic that is extracted from process logic, leading to a better decoupling of the system. Aspect-oriented process modeling (Charfi and Mezini 2006; Charfi and Mezini 2007; Charfi et al. 2010) also belongs to this stream of research, since it uses aspects to modularize cross-cutting concerns shared among several processes. As a result, a process is adaptable at certain points where cross-cutting activities are inserted. Although managing the semantics of rules may add complexity and effort, rule-based approaches
are promising as a useful way to model business logic and improve the manageability of large collections of process models. In this chapter, we propose a novel solution for process design and management based on the idea of context-aware process variant configuration. Process variant configuration means generating a process variant on the fly, for example a BPMN (OMG 2006) or BPEL (Kloppmann et al. 2005) model, that integrates the control flow, resource needs, and data by applying business rules to a generic process template, which describes a very basic and general process schema. In this approach, rules are not only applicable at designated points; they allow adaptation of any process pattern in a more general way. This approach offers several advantages. First, it incorporates business policy into process design dynamically. If a policy changes, only the corresponding rules have to be modified while the process template can remain the same. Process variants derived from this process template will be automatically reconfigured to adapt to policy changes without human intervention. In addition, an end user does not have to create a large number of process schemas beforehand and manually determine which schema to execute when a case (e.g., a new insurance claim) arrives. Second, we develop new techniques to facilitate variant search and discovery in a large repository based on representing process schemas as strings. An end user can search for a variant based on its process structure using post-order traversal, or retrieve a number of relevant process variants by querying the rule base. We show that our method is very helpful for managing large collections of process models. Third, this approach leads to holistic process design. Workflow research has focused on modeling the control flow of a process, while other key aspects like data flow and resource needs are neglected.
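The string representation of process trees mentioned above can be sketched roughly as follows. The exact encoding is developed later in the chapter, so this is only an assumed minimal form using nested (operator, children) tuples of our own:

```python
def postorder(node):
    """Serialize a process tree to a string by post-order traversal:
    children first, then the structure operator (S, P, C, L). Leaves
    are task names, e.g. S(t1, P(t2, t3)) -> "t1 t2 t3 P S"."""
    if isinstance(node, str):          # a task
        return node
    op, children = node                # (operator, [children])
    return " ".join(postorder(c) for c in children) + " " + op

# Illustrative template: t1, then t2 || t3 in parallel, then t4.
template = ("S", ["t1", ("P", ["t2", "t3"]), "t4"])
key = postorder(template)
```

Because two structurally identical variants serialize to the same string, such keys can serve as compact indexes when searching a variant repository.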
In general, a holistic process model requires additional information such as the resource needs (e.g., equipment and facilities) for task completion and the data values of parameters associated with a task. Our approach integrates the modeling of the resource and data needs of various tasks into the process description. Thus, the essence of our approach is: process template + rules = process variants. The rules create specific change operations (e.g., insert or delete a task) based on the actual case data and then apply them to the process template to generate a variant. Naturally, this leads to considerably more flexibility than
conventional approaches and is suitable in a constantly changing environment, as well as for resource-intensive or ad hoc workflows.

Preliminaries

A motivating example

Figure 5.1 shows an example process template for an insurance process in BPMN notation (OMG 2006). In this template, after a claim is received by a customer representative, it is validated by a clerk to ensure that the customer has a valid policy that relates to this claim. The clerk also makes an assignment to two adjusters who will review and appraise the damage to the auto or the house, and then submit a report. The two adjusters may perform their jobs in parallel, which is indicated by a parallel gateway shown as a diamond with a cross. The first parallel gateway splits into multiple branches and has a corresponding parallel gateway where these paths merge. After the reports are received by the customer representative, they are checked for completeness and sent to an officer who will determine the settlement amount. Subsequently, two approvals are required, by a manager and a senior manager. Finally, the accounts manager will make the payment to the customer.

Figure 5.1. Description of an insurance process in BPMN notation

Figure 5.1 shows the normal tasks, the roles that perform them, and the control flow relationships among them. However, the actual process in practice may vary depending upon a particular incident or case. Thus, the template should be customized for a specific case by applying rules to it. An example rule

set is shown in Figure 5.2. The rules are written in a plain English-like syntax. So, if the loss claimed is less than $500K, then only one adjuster is required (R1); however, if the loss is more than $250K, then the adjuster should have more than 10 years of experience (R7) and must fill in the long form (R9). Further, if the loss is more than $500K, the second adjuster should be classified as an expert (R8). After a settlement is assessed, either one or two approvals are required before payment is made, depending upon the amount of loss (R4, R5, R6). The other two rules deal with the urgency status of the case. If it is marked expedite, then the approvals may be performed in parallel to save time (R2). On the other hand, if it is marked urgent and the loss is small, then the second approval is deferred until payment is made (R3).

R1: If loss < $500K, then skip review by adjuster 2
R2: If application = expedite, then perform approvals in parallel
R3: If application = urgent and loss < $500K, then defer second approval until after payment
R4: If loss < $100K, then need manager approval
R5: If $100K < loss < $500K, then need manager & senior manager approval
R6: If loss > $500K, then need manager + VP approval
R7: If loss > $250K, then need adjuster with minimum 10 years of experience
R8: If loss > $500K, then need detailed assessment from an expert
R9: If loss > $250K, then adjuster must fill the "long" form

Figure 5.2. Rules to be applied to the insurance process template in Figure 5.1

Clearly, by applying rules to the template based on different case data, we obtain different process variants. For example, Figure 5.3 shows two variants, V1 and V2, derived from the process template in Figure 5.1. In part (a), the loss is $200K and the case is marked expedite, while in part (b) the loss is $300K and the case is marked urgent. We can see that these two variants are identical for the most part and differ only in minor, policy-driven ways.
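The condition parts of rules R1-R9 can be evaluated against case data independently of the template. A small sketch, with thresholds taken from Figure 5.2 and a function name of our own:

```python
def triggered_rules(loss, status):
    """Evaluate the conditions of rules R1-R9 from Figure 5.2 against
    case data and return the names of the rules that fire. Only the
    condition parts are modeled; the resulting change operations are
    applied to the template in a separate step."""
    rules = {
        "R1": loss < 500_000,
        "R2": status == "expedite",
        "R3": status == "urgent" and loss < 500_000,
        "R4": loss < 100_000,
        "R5": 100_000 < loss < 500_000,
        "R6": loss > 500_000,
        "R7": loss > 250_000,
        "R8": loss > 500_000,
        "R9": loss > 250_000,
    }
    return [name for name, fired in rules.items() if fired]

# Case data similar to variant V2's loss but marked expedite.
fired = triggered_rules(loss=300_000, status="expedite")
```

For loss = $300K with status expedite, the firing set is {R1, R2, R5, R7, R9}, matching the walkthrough given later in the rule-processing section.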
The full process that captures all the variants can be very large and complex, as shown in Appendix A. We will evaluate its complexity later in Section 5.5. Informally, the basic criterion for differentiating business logic (or business rules) from process logic is that rules are more dynamic (e.g., the approval procedure differs depending upon the damage and the urgency status), while process logic is relatively static and hardly changes over time. The purpose of these examples is to motivate the need for configuring process variants by separating business policy from
process schemas. Our motivation in a nutshell is to reduce the redundancy between process models that are slightly different from each other, to make process models more adaptive to business policy changes, and to efficiently retrieve the desired model for a specific case. Thus, our approach uses process templates to abstract the process logic of similar cases, and rules to better handle dynamic requirements.

(a) Process variant V1: (loss = $200K, status = expedite)
(b) Process variant V2: (loss = $300K, status = urgent)
Figure 5.3. Two process variants derived from the process template and rules

Formal representation of a process model

In general, a business process can be composed by combining four basic constructs as building blocks, i.e., sequence, parallel, decision structure (or choice), and loop, as shown in Table 5.1. They can be applied to two or more atomic tasks, e.g., S(A, B, C) or P(A, B, C), to indicate that tasks (or subprocesses) A, B, and C are in sequence or in parallel, respectively. Parallelism is introduced by using a parallel split gateway to create two or more parallel branches which are synchronized by another parallel
merge gateway. We use 'P' to denote this structure. Similarly, a choice structure, denoted as 'C', is created with a pair of exclusive OR (XOR) gateways. The first XOR node represents a choice or a decision point, where there is one incoming branch and two or more outgoing branches, exactly one of which can be activated. Finally, a loop, denoted by 'L', is also drawn using a pair of XOR gateways, but differently from a choice structure: the first XOR gateway takes only one of all the incoming branches, and the second XOR gateway represents a decision point that can activate any one of the outgoing branches. The patterns are applied recursively to create complex processes. In this section, we use BPEL to represent process schemas since it is one of the most popular process modeling languages. Table 5.1 also describes how BPEL maps to BPMN notation. Figure 5.4 shows the BPEL representation of the insurance claim process in Figure 5.1 (the data and resources pertaining to this process are omitted). In general, any structured process can be represented in this way. Other process features, such as events and structured activities, can also be expressed in BPEL.

Table 5.1. Basic patterns to design processes

Sequence (BPMN: t1 followed by t2); notation S(t1, t2):
<sequence name="S1"> <invoke name="t1"/> <invoke name="t2"/> </sequence>

Parallel (BPMN: t1 and t2 between parallel gateways); notation P(t1, t2):
<flow name="P1"> <invoke name="t1"/> <invoke name="t2"/> </flow>

Choice (BPMN: t1 and t2 between XOR gateways, with conditions A1 and A2); notation C(t1, t2):
<switch name="C1"> <case condition="A1"> <invoke name="t1"/> </case> <otherwise name="A2"> <invoke name="t2"/> </otherwise> </switch>

Loop (BPMN: t1 in an XOR-gateway cycle with condition A1); notation L(t1):
<while name="L"> <condition name="A1"> <invoke name="t1"/> </condition> </while>
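Following the mapping in Table 5.1, a nested process structure can be rendered mechanically into a BPEL skeleton. A sketch under simplifying assumptions (conditions on choices and loops are omitted, and the gateway names are generated, not taken from any real model):

```python
import xml.etree.ElementTree as ET

# Mapping of the constructs in Table 5.1 to BPEL elements.
BPEL_TAG = {"S": "sequence", "P": "flow", "C": "switch", "L": "while"}

def to_bpel(node, name="N"):
    """Render a nested (operator, children) structure into the BPEL
    skeleton of Table 5.1; tasks become <invoke> elements."""
    if isinstance(node, str):
        return ET.Element("invoke", name=node)
    op, children = node
    elem = ET.Element(BPEL_TAG[op], name=name)
    for i, child in enumerate(children):
        elem.append(to_bpel(child, name=f"{name}{i}"))
    return elem

xml = ET.tostring(to_bpel(("S", ["t1", ("P", ["t2", "t3"])]), "S1"),
                  encoding="unicode")
```

This one-way rendering is enough to emit a variant as executable markup once the configured tree is known; the reverse mapping (BPEL to tree) is equally mechanical for structured processes.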
<?xml version="1.0" encoding="utf-8"?>
<process name="insurance process model">
  <variables>
    <variable name="loss" type="xsd:double"/>
    <variable name="status" type="xsd:string"/>
  </variables>
  <sequence name="S1">
    <invoke name="Receive Claim"/>
    <invoke name="Validate Claim"/>
    <flow name="P">
      <invoke name="Review Damage 1"/>
      <invoke name="Review Damage 2"/>
    </flow>
    <invoke name="Receive Reports"/>
    <invoke name="Determine settlement"/>
    <invoke name="Approval 1"/>
    <invoke name="Approval 2"/>
    <invoke name="Make payment"/>
  </sequence>
</process>

Figure 5.4. BPEL representation of the insurance process model in Figure 5.1

Language to design flexible process variants (FlexVar)

A process template captures the common process logic among a family of process models and serves as the basis for deriving variants; thus, it can also be called a base process. A process variant is configured from a process template by applying a small number of change operations to it through rules. Strictly speaking, the process template is itself another variant; however, it is chosen in such a way that it has the closest degree of similarity to the other variants in a collection of related processes. The process of configuring a variant from a process template by applying a series of change operations is called variant configuration. Table 5.2 presents a simple language called FlexVar consisting of basic change operations that can be used to modify a process template. In general, these operations can be categorized into three types that correspond to the control flow, resource, and data perspectives of process modeling. Control flow related operations change the execution sequence of activities in a process template, e.g., insert, delete, or move a task; resource related operations change the role that performs a particular task or activity, e.g., assign a different performer to the task; data related operations modify the properties of activities, resources, and other process metadata.
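Two of the control flow operations can be sketched on a tuple-based process tree. This is an illustrative data structure of ours, not the FlexVar implementation, and only delete and a parallel insert are shown:

```python
def delete_task(node, target):
    """FlexVar-style delete(t): remove task `target` from a tree of
    nested (operator, [children]) tuples; a structure left with one
    child collapses to that child."""
    if isinstance(node, str):
        return None if node == target else node
    op, children = node
    kept = [c for c in (delete_task(c, target) for c in children)
            if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]          # dissolve a degenerate structure
    return (op, kept)

def insert_parallel(node, new_task, anchor):
    """FlexVar-style insert(t, P, t1): put `new_task` in parallel
    with `anchor` wherever the anchor occurs."""
    if isinstance(node, str):
        return ("P", [node, new_task]) if node == anchor else node
    op, children = node
    return (op, [insert_parallel(c, new_task, anchor) for c in children])

template = ("S", ["t1", ("P", ["t3", "t3-2"]), "t4"])
v1 = delete_task(template, "t3-2")        # an R1-style change
v2 = insert_parallel(template, "t7", "t4")
```

Deleting t3-2 collapses the now-trivial parallel block, which mirrors how removing the second adjuster's review leaves a plain sequence in variant V1.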
With these operations, we can design a process variant

from different perspectives. For example, the insert operation allows us to insert a new task into a process template. However, we must also specify the control-flow relationship of the new task with an existing task, i.e., in a sequence, parallel, choice, or loop structure. For a sequence, the user should specify whether the task is inserted before (Sb) or after (Sa) another task in the process. This also applies to a loop structure, to indicate the flow sequence within the loop; however, it is not required for the parallel and choice constructs. Note that in Table 5.2 the various operations apply just as well to process fragments (or subprocesses) as to tasks. An example of a process fragment in Figure 5.3(a) is the parallel structure formed by tasks T3 and T3-2. In general, a process fragment f can be deleted, inserted, etc., just like a regular task.

Table 5.2. Base operations of FlexVar language for modifying a process template (f denotes a process fragment or subprocess; Sb/Lb = before, Sa/La = after)

Control flow related:
delete(t|f): delete task t or fragment f from the process
insert(t|f, Sb|Sa|P|C|Lb|La, t1, [N]): insert t|f in sequence, parallel, choice, or loop with task t1 to create a node N (optional)
replace(t1|f1, t2|f2): replace t1|f1 with t2|f2
move(t|f, Sb|Sa|P|C|Lb|La, t1|f1): move t|f to a different position; the new place is defined in relation to t1|f1
change(t1|f1, t2|f2, Sb|Sa|P|C|Lb|La): change the relationship between t1|f1 and t2|f2 to Sb, Sa, P, C, Lb, or La

Resource related:
role(t, r): task t is performed by role r

Data related:
data(attribute, value): assign a value to a data attribute
prop(role, property_name, value): each role can have several properties and corresponding values
data_in(t1, din): din is an input data parameter for task t1
data_out(t2, dout): dout is an output data parameter for task t2
task(t, name): assign a new name to the task with id t
status(proc_id, value): change the status of a process (value = normal, expedite, urgent, OFF)

5.3. Rule Representation and Processing

This section describes how business rules can be represented, categorized, and processed to enable context-triggered process model reconfiguration.

Rule representation

We use a formal language to represent rules that can be easily written and recognized by a process designer. The format of the rule language is as follows:

Rule rule_name, rule_group, rule_category, priority
IF condition 1, condition 2, ..., condition n
THEN action 1, action 2, ..., action n
END

Here, the parameter rule_name assigns a unique name to each rule. A rule_group clusters rules with similar functionalities so they can be triggered or disabled at the same time, rule_category categorizes rules according to process perspectives (see more details in Section 5.1.2), and priority specifies the order in which the rule is triggered. If two rules are both triggered and they conflict with each other, the one with the higher priority is executed. Once its conditions are met, a rule is triggered and produces actions. Conditions can be connected by logical operators, including AND, OR, and NOT, to define complex conditions. An action can be any of the change operations presented in Table 5.2 that modify a process template. For example, we can use this rule language to rewrite R3 (see Figure 5.2) so it can be processed by a rule engine, as follows:

Rule R3, RG1, Control_flow, 10
IF process.status = "urgent" AND process.loss < $500K
THEN move(t7, Sa, t8)
END

Rule categories

In general, the actions carried out by rules can affect the control flow, resource, and data perspectives of a process. Thus, we categorize rules in terms of their effects as follows. Control flow related rules are used to alter the control flow of a process based on case data. In addition to deleting or replacing a task, they can also alter the control flow by moving a task to a different position, or by changing the relationship between two tasks. Resource related rules are concerned with resource assignment based on case data.
Data related rules are associated with the properties or attributes of a resource related to a case. Hybrid rules concern the modification of several aspects of process design. For example, they might alter the flow of control of a process as well as change the properties of a resource; that is, a hybrid rule may affect a process template in more than one perspective. These categories help to organize rules and define them systematically. For simplicity, we use the format Rule_name: conditions -> actions, where the conditions appear on the left-hand side and the actions on the right. Figure 5.5 shows how the rules in Figure 5.2 are expressed in this format. They refer to the process template in Figure 5.1, and the actions are based on the operations presented in Table 5.2. Rules R1, R2, and R3 concern the control flow perspective. For example, R1 deletes task t3-2 from the process template if the amount of loss is less than $500K, since the review by the second adjuster is not required. R2 changes the relationship between tasks t6 and t7 from sequence to parallel if the case is marked expedite. Similarly, R3 moves the second approval task t7 to the end of the process if the case is urgent. Rules R5 and R6 are resource related rules; they assign resources to tasks based on case data, i.e., the amount of loss. Rules R7, R8, and R9 are related to the properties of the resources in the process. Finally, rule R4 is a hybrid rule since it changes not only the control flow but also the resources of the insurance process.

Control flow related
R1: process.loss < $500K -> delete(t3-2)
R2: process.status = "expedite" -> change(t6, t7, P)
R3: process.status = "urgent" AND process.loss < $500K -> move(t7, Sa, t8)
Resource related
R5: process.loss > $100K AND process.loss < $500K -> role(t7, senior_manager)
R6: process.loss > $500K -> role(t7, vice_president)
Data related
R7: process.loss > $250K -> prop(adjuster2, min_exp, 10)
R8: process.loss > $500K -> prop(adjuster2, qualification, expert)
R9: process.loss > $250K -> prop(form, type, long)
Hybrid rules
R4: process.loss > 0 AND process.loss < $100K -> role(t6, manager) AND delete(t7)

Figure 5.5. Different types of rules related to the insurance process template in Figure 5.1

Rule processing and semantics for conflict resolution

When the above rule set is fed with specific case data, the corresponding rules are triggered if their condition clauses are satisfied. The resulting change operations are then applied to the process template to configure a variant. Assume the case data is as follows: loss = $300K; status = expedite. The rules triggered are R1, R2, R5, R7 and R9, and the corresponding operations/actions that result from applying these rules are:

Operation 1: delete(t3-2)
Operation 2: change(t6, t7, P)
Operation 3: role(t7, senior_manager)
Operation 4: prop(adjuster2, min_exp, 10)
Operation 5: prop(form, type, long)

These operations can be applied to the process template to produce a variant for this specific case or scenario. Although all these rules have been validated by domain experts, they may produce conflicting results depending upon the order in which they are fired. For example, insert(t2, Sa, t1) and insert(t3, Sa, t1) applied to a process consisting of a single task 't1' can result in two process fragments, S(t1, t2, t3) or S(t1, t3, t2), depending upon the order in which they are applied. Moreover, rules may sometimes fail. For instance, insert(t1, Sa, t2) would fail if task 't2' has already been deleted by an operation in a previous step. Therefore, it is very important to specify the correct semantics for handling such situations during variant configuration. Some possible semantics are:

Arbitrary semantics: does not impose any order on the rules. This implies that all execution orders of the rules are acceptable.
Priority semantics: assigns a priority to rules if the execution order is important. Higher priority rules execute first, followed by lower priority ones in descending order.
Complexity semantics: assigns a complexity factor to each rule according to its number of conditions. A rule with more conditions is more specific than a rule with fewer conditions and is executed first.
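The rule-firing step can be sketched in a few lines of Python. This is an illustration only: the loss bounds used for R5 below are hypothetical placeholders (the actual bounds come from the rulebase in Figure 5.5), and only priority semantics is shown.

```python
# Sketch of rule firing with priority semantics. The R5 loss bounds are
# hypothetical placeholders; higher-priority rules are applied first.

def make_rules():
    # (name, priority, condition, resulting change operation)
    return [
        ("R1", 2, lambda c: c["loss"] < 500_000, "delete(t3-2)"),
        ("R2", 2, lambda c: c["status"] == "expedite", "change(t6,t7,P)"),
        ("R5", 1, lambda c: 100_000 < c["loss"] < 500_000,  # assumed bounds
         "role(t7, senior_manager)"),
    ]

def fire(rules, case):
    triggered = [r for r in rules if r[2](case)]
    triggered.sort(key=lambda r: -r[1])       # priority semantics
    return [(name, action) for name, _, _, action in triggered]

ops = fire(make_rules(), {"loss": 300_000, "status": "expedite"})
```

With the case data loss = $300K and status = expedite, all three sample rules fire and their actions are returned in descending priority order.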

Recency semantics: assigns a time factor to each rule when it is inserted into the rulebase. A more recent rule is activated with a higher priority.
Fail semantics: returns failure or throws an exception when a rule cannot be executed. In this case a variant cannot be generated, and the user has to intervene to modify the rules or assign new priorities to them.

In general, a user should specify the semantics depending on the application; otherwise, arbitrary semantics is used by default. If the configuration fails, an exception is generated and the user is notified about the conflicting rules that caused it.

Checking validity of a sequence of change operations

Before a series of change operations can be applied to create a variant, one must ensure that they are compatible and that the resulting variant is structurally correct (e.g., it does not contain an infinite loop). The matrix in Table 5.3 shows how conflicts between a pair of change operations op1 (columns: insert, delete, replace, move, change) and op2 (rows) can be identified and resolved.

Table 5.3. A matrix for verifying the correctness of multiple operations. Each cell states whether the pair (op1 applied first, op2 second) is compatible, can be merged into a single operation, or is incompatible. For example, delete(old') cancels a preceding insert(new, X, old) when old' = new; replace(old', new') rewrites that insert into insert(new', X, old) when old' = new; and insert(new', X', old') after delete(old) is incompatible when old' = old. Note: X = Sb | Sa | P | C | Lb | La; old denotes existing tasks and new denotes new tasks; '=' checks equality. Unless the stated condition of a cell applies, the pair is compatible.

Assume that operation op1 is applied first, followed by operation op2. There are three resulting cases: (1) op1 and op2 are compatible and should be performed independently, e.g., insert(new, X, old) and

insert(new', X', old'); (2) op1 and op2 are compatible and can sometimes be merged to produce no effect, e.g., the operations insert(new, X, old) and delete(old') have no net effect if old' equals new; (3) op1 and op2 are incompatible and users should be notified, e.g., delete(old) and insert(new', X', old') if old' equals old. Further, if a task t is deleted and reinserted in the same position, the net result is move(t, X, t), which implies no change. Next, we discuss the variant configuration algorithm.

Process Variant Configuration and Representation

The rule processing stage generates a list of change operations related to the control flow, resources, and data of a process. The operations that relate to resources (such as role operations) and those that relate to data (such as data or prop operations) are added to the case database directly. For example, prop(form, type, long) assigns the value long to the type attribute of the form object; such data is used as input data for task execution. The operations that relate to the control flow, however, are used as input for a variant configuration algorithm that generates a new control flow schema for the variant.

Overview

Our variant configuration algorithm is based on creating a tree for a process template and then applying each change operation to the tree. After all the changes are applied, the resulting tree reflects the control flow of the variant. Although this tree can be converted into any process description language to describe the process schema, here we use BPEL. Figure 5.6 shows the process tree of the template in Figure 5.1, and Figure 5.7 presents our variant configuration algorithm. There are several equivalent representations of such a tree; however, this is not the focus of our study, since it does not make a difference in the configuration process.
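A pairwise compatibility check covering the three cases above can be sketched as follows. Only three representative cells are encoded, so this is an illustration rather than the full matrix of Table 5.3.

```python
# Pairwise compatibility check for change operations, modeled as
# (kind, args) tuples. Only three representative cases are encoded:
# independent, mergeable (no net effect), and incompatible.

def check(op1, op2):
    k1, a1 = op1
    k2, a2 = op2
    if k1 == "insert" and k2 == "delete" and a2[0] == a1[0]:
        return "merge"           # deleting a just-inserted task: no net effect
    if k1 == "delete" and k2 == "insert" and a2[2] == a1[0]:
        return "incompatible"    # op2 anchors on a task that op1 deleted
    return "compatible"          # otherwise performed independently
```

For instance, insert(t9, Sa, t1) followed by delete(t9) merges away, while delete(t2) followed by insert(t9, Sa, t2) is flagged as incompatible.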
In general, there are two types of nodes in a process tree: the leaf nodes are task nodes, while the internal nodes are control nodes that capture the relationships (i.e., S, P, C, or L) among tasks and/or process fragments. The child nodes of a sequence node are numbered in order from left to right; thus, the leftmost child appears first in the sequence and the rightmost one last. For parallel and choice nodes, the order of appearance of the child nodes does not matter because of their

execution semantics. A loop node could have two child nodes, the first for the forward path and the second for the reverse; hence, the order does matter. This tree can be stored in a tabular data structure of the form (node_id, type, child_node, in_order), where
- node_id is the ID of the tree node;
- type refers to the four control flow relations (i.e., S, P, C, or L);
- child_node is the set of all the child nodes of the current node;
- in_order is a Boolean value indicating whether the order of the child nodes matters, as in S and L.

For example, node S1 connects two control nodes and can be represented as (S1, S, {S2, S3}, Y). In contrast, node P1 is a parallel control node connecting two tasks; it can be represented as (P1, P, {T3, T3-2}, N).

Figure 5.6. A process tree for the insurance process template

Algorithm Variant_Configuration (PT, case_data)
PT: the process template in BPEL
case_data: the initial case data
1. Generate a series of change operations op_set based on case_data and the rule set // run rule engine
2. Transform PT in BPEL format into a process tree P_TREE
3. FOR each operation op in op_set // modify the process tree by executing all change operations
4.   Apply op to P_TREE using the configuration algorithm description in Table 5.4
5.   IF operation failed
6.     Throw Exception
7.   ELSE
8.     Save P_TREE // the change is applied successfully
9. END FOR
10. Transform P_TREE into P_VARIANT in BPEL format
11. Return process variant P_VARIANT
12. END

Figure 5.7. Variant configuration algorithm
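The tabular node structure described above maps naturally onto an ordinary dictionary; a minimal sketch with only the two example rows from the text:

```python
# Tabular storage of process tree nodes as (node_id, type, child_node,
# in_order) records; only the two example rows from the text are shown.

nodes = {
    "S1": {"type": "S", "child_node": ["S2", "S3"], "in_order": True},
    "P1": {"type": "P", "child_node": ["T3", "T3-2"], "in_order": False},
}

def in_order(node_id):
    # the order of children matters exactly for sequence (S) and loop (L) nodes
    return nodes[node_id]["type"] in ("S", "L")
```

Note that the stored in_order flag is derivable from the node type; keeping it materialized simply avoids recomputation during traversal.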

Algorithm details for tree operations

Table 5.4 shows the detailed pseudo-code for implementing the control flow change operations introduced in Table 5.2. We illustrate these operations next in the context of Figure 5.6. Although this tree presents only the control flow perspective of the process, resource- and data-related information can also be included in the nodes. A delete operation simply removes the node corresponding to a task in the tree; if the parent of the deleted node has only one child left, that child is moved up to take the place of the parent, since each non-leaf node should have at least two child nodes and a task is always a leaf node. For simplicity, we discuss operations on task nodes (i.e., at the leaf level) only; later we extend this approach to perform operations on internal nodes. When a task is to be inserted, its position must be specified in the tree with respect to an already existing task node. Moreover, the relationship between the existing node and the new node must also be specified as sequence (S), parallel (P), choice (C) or loop (L). If it is a sequence (or a loop), it is also necessary to state whether the new task is inserted before (Sb) or after (Sa) the current node. The insert procedure creates a parent node P1 for the existing node (say, t1) and inserts the new node t as a child of P1 in the tree; the parent node can optionally be given a new label N. The replace operation simply changes the label of a task node to its new name. The move operation is like a delete followed by an insert: it removes a task from its current location in the tree and inserts it into a new position, defined with respect to an existing task node that serves as an anchor. The change operation may be used to modify the relationship between two existing nodes t1 and t2 in the tree.
However, this is possible only if a direct relationship (i.e., a common parent node and no other siblings) exists between them; otherwise, the operation fails. To implement this operation, we first check whether a direct relationship either already exists or can be obtained by rewriting the tree into an equivalent tree by means of rewriting rules. If so, the parent node of t1 and t2 is changed to the new relationship; otherwise, an exception is generated.

These operations are summarized in Table 5.4. As an example, Figure 5.8 is derived from Figure 5.6 by performing the delete(T3-2) and change(T6, T7, P) operations. We next show how a process tree is represented as a text string for ease of storing and searching, and also how the various operations discussed above are translated into search-and-replace operations on the string to transform one variant into another.

Table 5.4. Details of performing change operations on a process tree P_tree

Delete (Node t):
  IF parent(t) has 2 child nodes, Replace(parent(t), t.sibling); ELSE Delete(t);
  [Exception: Node t is not in P_tree]
Insert (Node t, Rel X, Node t1, [N]):   // X = Sb | Sa | P | C | Lb | La
  Create a new parent node N for t1 with N.node_type = X; add t as a new child of N;
  [Exception: Node t1 is not in P_tree]
Replace (Node t1, Node t2):
  Rename node t1 as t2;
  [Exception: Node t1 is not in P_tree]
Move (Node t, Rel X, Node t1):
  Delete(t) and Insert(t, X, t1);
  [Exception: Node t1 is not in P_tree]
Change (Node t1, Node t2, Rel X):
  Change parent(t1, t2).node_type to the new relationship X;
  [Exception: Nodes t1, t2 are not in P_tree]

Figure 5.8. Revised process tree after applying the variant configuration algorithm
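The Delete and Change entries of Table 5.4 can be sketched on a small in-memory tree; the collapse of a single-child parent mirrors the rule that every non-leaf node keeps at least two children. This is a minimal illustration, not the prototype's implementation.

```python
# Sketch of the Delete and Change operations of Table 5.4 on a small
# process tree; a control node left with a single child is collapsed.

class Node:
    def __init__(self, kind, children=()):
        self.kind = kind              # task name, or a control type S/P/C/L
        self.parent = None
        self.children = list(children)
        for c in self.children:
            c.parent = self

def find(n, kind):
    if n.kind == kind:
        return n
    for c in n.children:
        hit = find(c, kind)
        if hit:
            return hit
    return None

def delete(root, t):
    node = find(root, t)
    p = node.parent
    p.children.remove(node)
    if len(p.children) == 1 and p.parent is not None:
        only = p.children[0]          # collapse single-child control node
        p.parent.children[p.parent.children.index(p)] = only
        only.parent = p.parent

def change(root, t1, t2, X):
    p1, p2 = find(root, t1).parent, find(root, t2).parent
    if p1 is p2 and len(p1.children) == 2:
        p1.kind = X                   # direct relationship: relabel the parent
    else:
        raise ValueError("no direct relationship")

def tasks(n):
    return [n.kind] if not n.children else [t for c in n.children for t in tasks(c)]

tree = Node("S", [Node("S", [Node("T6"), Node("T7")]), Node("T8")])
change(tree, "T6", "T7", "P")         # T6 and T7 now run in parallel
delete(tree, "T6")                    # the P node collapses onto T7
```

After the two operations, only T7 and T8 remain, and T7 has been promoted to a direct child of the root.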

Tree traversal, representation, and string operations

We developed a novel representation structure for storing a process variant by traversing all the nodes of its tree in post-order: starting at the root node, recursively explore the left child subtree first, then the right child subtree, and finally visit the root itself. A post-order traversal of the tree in Figure 5.8 is shown in Figure 5.9(a), and the same traversal with level numbers attached is shown in Figure 5.9(b). An alternative representation as two equal-length vectors, a task vector (TV) and a level vector (LV), is shown in Figure 5.9(c); these representations are equivalent. The figure also shows a derived delta vector, i.e., the difference between successive level numbers in the string: the delta for position i is level[i] − level[i−1], and for position 0 it is 0.

T1-T2-S4-T3-S2-T4-T5-S5-T6-T7-P2-T8-S3-S1
(a) A standard post-order traversal of the tree in Figure 5.8

T1(3)-T2(3)-S4(2)-T3(2)-S2(1)-T4(3)-T5(3)-S5(2)-T6(3)-T7(3)-P2(2)-T8(2)-S3(1)-S1(0)
(b) A post-order traversal of the tree in Figure 5.8 with level numbers in parentheses (root is level 0)

TV:     T1  T2  S4  T3  S2  T4  T5  S5  T6  T7  P2  T8  S3  S1
LV:      3   3   2   2   1   3   3   2   3   3   2   2   1   0
Delta:   0   0  -1   0  -1   2   0  -1   1   0  -1   0  -1  -1
(c) Alternative representation of the post-order traversal as TV, LV and a derived delta vector

Figure 5.9. Post-order traversal of the process in Figure 5.8

Lemma 1. A post-order traversal with levels is equivalent to the tree representation, and vice versa.

Proof sketch. Given a post-order traversal with levels, we can scan the entries one at a time and add them to a tree. Conversely, given a tree, we can first add levels to the nodes, with the root node at level 0, and then create a post-order traversal using the standard algorithm described above.

This representation serves two purposes: (1) it simplifies the various operations such as insert, delete and replace by casting them as string operations; and (2) it facilitates the storing of a process

succinctly and querying it. Various operations on the tree can be performed directly as search-and-replace string manipulations on the post-order traversal string (POS). The details for each operation are shown in Table 5.5. For example, a delete operation simply deletes a single node entry from POS if the node is a task node. If it is an internal node, then the subtree rooted at that node must be deleted. For example, if we wish to delete node P2 in Figure 5.8, the subtree with P2 as its root is represented in post-order as the string "T6(3)-T7(3)-P2(2)". To delete this subtree, we remove the entry P2 and all contiguous prior entries with level numbers higher than the level number of P2; therefore, T6 and T7 are also deleted.

Table 5.5. Steps of operations to be performed on the post-order traversal string (POS)

Delete(n):
  Remove node n together with the contiguous prior entries with higher level numbers, until a lower level is reached; Reorganize(n)
Insert(n, X, n1):   // X = Sb | Sa | P | C | Lb | La
  Replace("n1", "n1 n X", POS) for insert-after, or Replace("n1", "n n1 X", POS) for insert-before;
  level#(X) = level#(n1); level#(n) = level#(n1) + 1; level#(n1) = level#(n1) + 1;
  renumber the child nodes of n1 with respect to the number of n1
Move(n, X, n1):
  SP1 = remove node n and the contiguous prior entries with higher level numbers, until a lower level is reached; Reorganize(n); Insert(SP1, X, n1)
Change(n1, n2, X):
  Scan the substring of POS between entries n1 and n2; find the highest-level control node in the substring; replace that node with X
Reorganize(n):
  If a right or left neighbor NN of n has a non-zero delta with its predecessor and successor nodes, then (1) delete the neighbor of NN with the lower level number (there will be only one such neighbor); (2) decrement the level number of node NN; (3) decrement the level number of all nodes at positions (position(NN) − i) such that level#(position(NN) − i) > level#(NN)
  [Note: a neighbor of n is a node adjacent to n in POS at the same level as n]

A reorganize step follows the deletion of a node.
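The POS construction and the subtree-delete rule can be sketched together. The tree literal below encodes the tree of Figure 5.8 as nested (name, children) pairs; the follow-up Reorganize step is omitted for brevity.

```python
# Build the POS string with levels from the tree of Figure 5.8, then
# delete a subtree directly on the string: remove the node's entry
# together with all contiguous prior entries at a strictly higher level.

tree = ("S1", [("S2", [("S4", [("T1", []), ("T2", [])]), ("T3", [])]),
               ("S3", [("S5", [("T4", []), ("T5", [])]),
                       ("P2", [("T6", []), ("T7", [])]),
                       ("T8", [])])])

def postorder(node, level=0):
    name, kids = node
    for k in kids:
        yield from postorder(k, level + 1)
    yield (name, level)

def delete_subtree(pos, name):
    i = next(k for k, (n, _) in enumerate(pos) if n == name)
    j = i
    while j > 0 and pos[j - 1][1] > pos[i][1]:   # children precede their root
        j -= 1
    return pos[:j] + pos[i + 1:]

pos = list(postorder(tree))
delta = [0] + [pos[i][1] - pos[i - 1][1] for i in range(1, len(pos))]
```

Deleting P2 removes the contiguous prior entries T6 and T7 as well, exactly as in the example above.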
In this step, we check the delta value of the deleted node's neighbor (an adjacent node at the same level as the deleted node). If the neighbor has a non-zero delta, it means that the deleted node's parent now has only one remaining child, the deleted node's sibling. Therefore, the parent of the deleted node is also deleted, while the level of the sibling (and any children of the sibling) is reduced by 1. An insert operation may insert a single task or even a subprocess: in the former case the task is a single element, while in the latter case the subprocess is represented as a post-order string. In both cases a string replacement operation is performed as shown in the table. The move operation consists of two parts. The

first part is like a delete, where a contiguous substring is removed from POS. The second part is like an insert, where the deleted string is inserted in a new position in POS. The change operation modifies the relationship between two task nodes or, in general, two subtrees; to do this we find the first common ancestor of the two nodes and change its value to the new relationship.

Examples

Example 1: Figure 5.10 shows various operations on the running example of Figure 5.9. It gives the initial POS string (or vector), along with the corresponding level numbers and delta values, and shows the effect on each of these values after (i) deleting task T3 and reorganizing (S2 and S4 both have non-zero delta, so S2 is deleted and the levels of S4, T2 and T1 are decremented); (ii) deleting T5 and reorganizing; and (iii) inserting T9 via Insert(T9, S4, P).

Initial POS with levels: T1(3)-T2(3)-S4(2)-T3(2)-S2(1)-T4(3)-T5(3)-S5(2)-T6(3)-T7(3)-P2(2)-T8(2)-S3(1)-S1(0)

Figure 5.10. Illustration of operations on the POS string of the process in Figure 5.8

Example 2: In general, a variant Vi is transformed into another variant Vj by a series of operations op1, op2, … (e.g., delete, insert, change) on Vi. Consider the two variants V1 and V2 shown in Figure 5.11; V2 is derived from V1 by performing the six operations shown in the figure. Table 5.6 shows the POS and level# vectors for V1 in the first row; successive rows show the effect of each operation on these vectors.
The last row shows that the vectors for variant V2 are the same as the resulting vectors in the row above, except for a minor structural difference related to the T1-T3-T2-T4 task sequence. Note that

in a process diagram a sequence is shown by a flow of directed arrows without a special symbol, whereas in a tree, because there is no notion of flow, a node is introduced for each such flow (S1, S2, etc. in the example of Figure 5.11). The representation developed here will be employed in the next section in the context of managing and searching a large repository of process models.

Figure 5.11. Variant V2 is derived from V1 by performing a series of operations

Table 5.6. Illustration of the steps performed in transforming V1 to V2 in the example of Figure 5.11. Each column gives the POS string, with levels, after one of the operations delete(T2), insert(T2, T3, Sa), delete(T11), change(T7, S3, C), change(T8, T9, Sb) and replace(T12, T10) is applied in turn to variant V1.

5.5. Managing a Large Repository of Process Variants: Analysis and Discovery

In this section we first introduce notions of similarity among process models as a way to cluster related models. Then we discuss how a large number of process variants can be indexed using bit vector structures and accessed using SQL queries. Finally, we discuss metrics for evaluating the complexity of a process model and use them to compare various models.

Similarity

One notion of similarity is the number of edit operations between variants. Figure 5.12 shows how variants can be placed in a graph. V1 through V6 are variants derived from a base variant V by performing the one or more operations labeled on the arcs connecting them. As the figure shows, it is also possible to derive one variant from another (e.g., V3 from V2) by performing an operation, and to go in reverse by performing an inverse operation. For example, the two operations op1 and op2 performed in sequence transform V to V3; the inverse operation to go from V3 to V is (op1, op2)^-1. In general, (op1, op2, …, opn)^-1 = opn^-1, …, op2^-1, op1^-1.

Figure 5.12. A graph of variants showing how one variant is derived from another by operations

However, the notion of edit distance is limited in that it considers neither the number of tasks common to two processes nor the structural similarity among them. Hence, we introduce two other measures, called task distance and structural distance. Task distance is defined as follows:

Task_distance(i, j) = Hamming distance between the task vectors of variants i and j = (# tasks present in model i but not in model j) + (# tasks present in model j but not in model i).

The structural distance between variants i and j is based on comparing the pre-order traversal strings of the variants. We create the pre-order strings for the two variants first and then remove all task entries from the strings to perform a structural comparison. For the example of Figure 5.9, this results in a string like S1(0)-S2(1)-S4(2)-S3(1)-S5(2)-P2(2); thus, only the structural information is retained. In this structure there are five sequence nodes at levels 0, 1 and 2, and one parallel node at level 2. The structural distance is then

Struct_distance(i, j) = Σ_k (d_k / 2^k),

where d_k is the number of structures that differ between models i and j at level k. The idea of this formula is to compare structural similarity at the top (root) level first and then at successive sublevels; the 1/2^k term reduces the weight of dissimilarity at successively lower levels. These notions of similarity and distance can be used to cluster a group of variants together if their inter-process distance on various metrics is less than a cutoff value, and also to find one base variant with the minimum distance from all other variants. We do not discuss clustering algorithms here, but refer to other work.

Process variant discovery

In this section, we discuss techniques for querying a large repository of variants using indexing data structures. First, we maintain two bit vectors, a task bit vector and a structure bit vector, that represent the presence of tasks and control flow structures in a variant, respectively. For tasks T1 through T7, a task vector (0,1,1,0,1,1,0) means that the variant contains tasks T2, T3, T5 and T6. A user can query the bit vectors to find the variants that meet a requirement, as well as the cluster in which they lie.
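The two distance measures can be sketched compactly. The 1/2^k level weight used in struct_distance below is an assumption standing in for the weighting term of the structural distance formula.

```python
# Task distance as the symmetric difference of task sets, and a
# structural distance comparing control nodes level by level with an
# assumed 1/2^k weight per level.
from collections import Counter

def task_distance(tasks_i, tasks_j):
    return len(set(tasks_i) ^ set(tasks_j))

def struct_distance(si, sj):
    """si, sj: lists of (control_type, level) pairs for two models."""
    levels = {l for _, l in si} | {l for _, l in sj}
    total = 0.0
    for k in levels:
        ci = Counter(t for t, l in si if l == k)
        cj = Counter(t for t, l in sj if l == k)
        d_k = sum(((ci - cj) + (cj - ci)).values())   # structures differing at level k
        total += d_k / 2 ** k
    return total
```

For example, two models sharing a root sequence but differing in one level-1 control node are at structural distance 1/2 + 1/2 = 1.0.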
Similarly, a structure vector of (1,0,1,0) indicates that the S and C control flow structures are present in the variant but P and L are not. These vectors are included as part of the Variant_index table shown in Table 5.7.

Table 5.7. An example Variant_index table containing, for each of the variants V1 through V6, its cluster assignment, its task bit vector (T1 through T7) and its structure bit vector (S, P, C, L).

The Variant_index table can be queried using SQL. Some simple example queries are as follows:

Q1: SELECT variant FROM Variant_index WHERE T1 = 1 AND T2 = 1 AND T7 = 0
Q2: SELECT variant FROM Variant_index WHERE T1 = T2 AND T3 = 1 AND NOT(T6 = T7)
Q3: SELECT variant FROM Variant_index WHERE T1 + T2 + T3 >= 2 OR T6 + T7 = 1
Q4: SELECT variant FROM Variant_index WHERE Task_distance < 5 AND Struct_distance < 3
Q5: SELECT COUNT(*) FROM Variant_index WHERE T1 = 1
Q6: SELECT COUNT(*) FROM Variant_index WHERE NOT(T1 = T2)

In this way a large variety of queries can be designed using the standard Boolean operations, and these queries can be processed within a relational database management system (RDBMS). More advanced queries can relate to the control flow structures present in a variant; a user may wish to find all variants where tasks Ti and Tj appear in parallel, in sequence, etc. Such a query can be expressed as:

Q7: SELECT cluster, variant FROM Variant_index WHERE T1 = 1 AND T2 = 1 AND T7 = 0 AND Parallel(T1, T2)

The algorithm to process such a query is shown in Figure 5.13.

Algorithm Relationship_Query (Variant_index, query_clause, POS[])
1. Rewrite the query clause so that a relationship X(Ti, Tj) is replaced by X = 1
2. Find all variants for which the rewritten query clause is satisfied
3. Transform the set of resulting variants into in-order traversal strings
4. Extract the substring that lies between Ti and Tj in each string
5. Find the highest-level control structure in this substring
6. If it is of the requested type (here, P), the variant is included in the result of the query; otherwise it is not

Figure 5.13. Algorithm to determine the relationship between two nodes by scanning a traversal string
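Queries such as Q1 run unchanged on any RDBMS. The following self-contained demonstration uses Python's sqlite3 with made-up bit vectors; the values are illustrative and are not those of Table 5.7.

```python
# Demonstration of the Variant_index query Q1 on sqlite3 with
# illustrative (made-up) bit vector values.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE Variant_index
               (variant TEXT, cluster TEXT,
                T1 INT, T2 INT, T3 INT, T4 INT, T5 INT, T6 INT, T7 INT,
                S INT, P INT, C INT, L INT)""")
rows = [("V1", "C1", 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0),
        ("V2", "C1", 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0),
        ("V3", "C2", 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1)]
con.executemany("INSERT INTO Variant_index VALUES (" + ",".join("?" * 13) + ")",
                rows)

# Q1: variants containing T1 and T2 but not T7
q1 = con.execute("SELECT variant FROM Variant_index "
                 "WHERE T1 = 1 AND T2 = 1 AND T7 = 0").fetchall()
```

With these sample rows, only V1 satisfies Q1 (V2 contains T7, and V3 lacks T1).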

The algorithm rewrites the query clause to check the structure vector for even a single occurrence of the desired structure. In query Q7, Parallel(T1, T2) is replaced by P = 1, and the query condition is rewritten as "T1 = 1 AND T2 = 1 AND T7 = 0 AND P = 1". The algorithm then searches the bit vectors for matching variants. Next, the traversal string of each matching variant is scanned to determine the relationship between the pair of nodes, and the variants that do not satisfy the query are filtered out. If the query contains multiple relationship conditions, the scan is repeated for the remaining set of variants; the order of screening can be arranged so that the variants are first screened for the relationship that is less likely. As an example of screening, consider the in-order traversal (a variant of the post-order traversal) of the process of Figure 5.9:

In-order: T1(3)-S4(2)-T2(3)-S2(1)-T3(2)-S1(0)-T4(2)-S5(1)-T5(2)-S3(1)-T6(3)-P2(2)-T7(3)-T8(2)

Now, to find the relationship between, say, T6 and T7, we extract the substring that includes both T6 and T7, i.e., T6(3)-P2(2)-T7(3). The highest-level control structure in this string is P2, so the relationship between T6 and T7 is of type parallel. Similarly, for tasks T4 and T8, we extract the substring T4(2)-S5(1)-T5(2)-S3(1)-T6(3)-P2(2)-T7(3)-T8(2). The highest control structure in this substring is S3 at level 1, which means that the relationship between T4 and T8 is of type sequence.

Querying a rulebase

A second approach for finding variants is to query the rulebase on case data. For instance, a query in the insurance claim repository may state:

Q8: SELECT * FROM Rules WHERE loss > 500K AND status = "urgent"

This finds the process variants in the collection that handle cases with a loss larger than $500K and in urgent status. Again, Boolean operators can be used to connect atomic predicates and create more powerful search conditions.
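An equivalent relationship test can be run directly on the post-order string of Figure 5.9, rather than the in-order string used in the text: the nearest common ancestor of two tasks is the first entry after both of them whose level is below every level seen between them. This is a sketch, not the prototype's implementation.

```python
# Determine the relationship (S, P, C or L) between two tasks by
# scanning the post-order string with levels of Figure 5.9.

pos = [("T1", 3), ("T2", 3), ("S4", 2), ("T3", 2), ("S2", 1), ("T4", 3),
       ("T5", 3), ("S5", 2), ("T6", 3), ("T7", 3), ("P2", 2), ("T8", 2),
       ("S3", 1), ("S1", 0)]

def relationship(pos, a, b):
    idx = {n: i for i, (n, _) in enumerate(pos)}
    lo, hi = sorted((idx[a], idx[b]))
    floor = min(l for _, l in pos[lo:hi + 1])
    # first later entry below every level in between = nearest common ancestor
    anc_name, _ = next((n, l) for n, l in pos[hi + 1:] if l < floor)
    return anc_name[0]       # control type is the first character, e.g. "P"
```

This reproduces the two worked examples: T6 and T7 are in parallel (ancestor P2), while T4 and T8 are in sequence (ancestor S3).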
A more complex query can be specified as:

Q9: SELECT * FROM Rules WHERE (loss > 300K AND loss < 500K AND NOT(status = "urgent")) OR (loss > 500K AND status = "normal")

It is also possible to write queries pertaining to the consequent of a rule, i.e., the action tasks required to be performed as a result of a rule; such a query might return, for example, all action tasks that require the role manager.

These types of queries can be implemented using standard database techniques and by building indexes on important attributes that appear frequently in rules, such as loss, status and role in our running example. Querying and searching is an important aspect of managing a large number of variants, and both the representation of the data objects and the nature of the indexing structures play an important role in determining the effectiveness of search. Our representation scheme facilitates string searches for patterns and variants: any given pattern or variant can itself be described as a post-order traversal string, so it is possible to search for this string in a file of variants for an exact match using any text editor, and to search for partial matches using a partial-match similarity score. Moreover, the bit vector index allows us to narrow down the number of variants on which the string search needs to be carried out. The operations on such index tables can be performed very fast, and exact string searches on very large text files (or, in our case, on a large number of rows of strings) can also be carried out in sub-second times. Therefore, the scheme proposed here is promising.

Evaluation of process complexity reduction

Our study can facilitate the design and management of large collections of process models. In this section, we evaluate our approach by measuring the complexity of process variants versus a single process model using a number of standard complexity metrics. Drawing insights from software engineering, cognitive science, and graph theory, Cardoso et al. (2006) proposed a variety of process complexity metrics. These metrics include the number of activities, the number of activity and control nodes, McCabe's cyclomatic complexity, and control-flow complexity (CFC). CFC concerns the process structure or control flow patterns, and empirical evidence shows that it has a significant impact on process design (Cardoso et al. 2006).
The CFC of a process P is defined as the sum of the contributions of all its split nodes:

CFC(P) = Σ_{c ∈ XOR-splits of P} CFC_XOR(c) + Σ_{c ∈ OR-splits of P} CFC_OR(c) + Σ_{c ∈ AND-splits of P} CFC_AND(c),

where c is a control node of type XOR-split, OR-split or AND-split, and

CFC_XOR(c) = fan-out(c)
CFC_OR(c) = 2^fan-out(c) − 1
CFC_AND(c) = 1
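The CFC definition can be computed directly from a list of split nodes; a minimal sketch:

```python
# Control-flow complexity (CFC): sum the per-split contribution over
# all split nodes of a process.

def cfc(splits):
    """splits: list of (kind, fan_out) pairs, kind in {"XOR", "OR", "AND"}."""
    total = 0
    for kind, fan_out in splits:
        if kind == "XOR":
            total += fan_out          # one state per outgoing branch
        elif kind == "OR":
            total += 2 ** fan_out - 1 # any non-empty subset of branches
        else:                         # AND: all branches taken, one state
            total += 1
    return total
```

For a process with one XOR-split with fan-out 3, one OR-split with fan-out 2 and one AND-split, this yields 3 + 3 + 1 = 7.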

Another set of metrics, proposed by Mendling and Neumann (2007), aims to identify errors in process design. These metrics capture the degree of sequentiality, cyclicality, and parallelism in a process model and hence reflect its complexity, which in turn relates to the likelihood of errors in the model (Mendling and Neumann 2007). For example, an increase in sequentiality implies a decrease in the probability of errors in a model, while an increase in cyclicality implies an increase. The complexity of process models is also related to their quality, as elaborated in (Vanderfeesten et al. 2007). To comprehensively evaluate the complexity of the two variants V1 and V2 in Figure 5.3 against the integrated process model in Appendix A, we use a combination of empirically validated metrics from the above-mentioned studies.

Table 5.8. Process complexity metrics and the evaluation results for the single integrated model P and the process variants V1 and V2. The process-size metrics are the number of activity nodes, the number of activity and control nodes, and McCabe's cyclomatic complexity (Cardoso, 2006); the control-flow metrics are the numbers of XOR-split, OR-split and AND-split nodes, the control-flow complexity (CFC), sequentiality (2/9 for P, 1/2 for V1, 1 for V2), cyclicity and parallelism (Cardoso, 2006; Mendling and Neumann, 2006).

Table 5.8 summarizes the complexity metrics used in this study and our evaluation results. From this table, we can see that even for a process with ten tasks, the complexity of a single model that integrates the six rules R1-R6 (R7, R8 and R9 are excluded since they concern the data flow) is much greater than that of the variants generated by our approach. Although our variants are also maintained as process models in the repository, they are tightly associated with the template through rules; thus, they are automatically adapted once any rules are modified in response to policy changes.

5.6. Prototype Implementation

System architecture

A high-level architecture for our approach is shown in Figure 5.14. A process designer can create, modify, or delete process templates and rules using an editor. As described above, a process template, its relevant rules, and its variants are organized in a cluster, which is a process family that refers to an application domain (e.g., insurance claim handling, patient workflow, etc.). Generally, each cluster contains only one template, represented in BPEL, but may contain hundreds or thousands of variants. The process template is configurable for generating more variants through a number of rules. The editor checks the template for correctness and the rules for consistency. We implemented a prototype in Java and used the Drools Rule Language (DRL) to represent our rules. The open source rule engine Drools Expert (JBoss 2010a) is used to process rules and resolve conflicts. The process designer can easily configure a process variant using the variant designer. First, he needs to select a process template and then feed in case data as a variant scenario. The rule engine then produces a valid set of change operations according to the received case data. Specifically, the rule processing module is responsible for firing rules and coordinating with the other two components. It makes sure the resulting change operations are valid and correct before they are passed on to the variant configuration algorithm. In the case where two rules conflict, the conflict resolution component produces a warning and asks the designer either to modify the rules or to assign priorities to them. The validity checking component ensures the resulting change operations can be applied to the process template in terms of structural correctness, using the verification matrix in Table 5.3. For example, the editor will produce an error or a warning to the process designer if the action delete(task t1) is triggered but t1 does not exist in the process template.
Hence, valid process templates and rules are maintained in their own respective repositories. As business policy changes over time, the process designer can change the rules associated with a specific process while keeping the process template the same.
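To make the rule-processing step concrete, the sketch below mimics, in Python rather than the Drools Rule Language used by the prototype, how rules over case data could yield change operations that are then validity-checked against the template. All rule names, condition fields, and task names here are invented for illustration:

```python
# Hypothetical sketch of rule processing: each rule maps a condition over
# case data to a process change operation. A simple validity check rejects
# a delete whose target task is absent from the template, mirroring the
# delete(t1) example in the text.

TEMPLATE_TASKS = {"record_claim", "assess_damage", "pay_claim"}

RULES = [
    # (rule name, condition over case data, resulting change operation)
    ("R_high_loss", lambda c: c["loss"] > 5000,
     ("insert", "expert_assessment")),
    ("R_fraud",     lambda c: c["status"] == "suspicious",
     ("insert", "fraud_check")),
    ("R_small",     lambda c: c["loss"] <= 500,
     ("delete", "assess_damage")),
]

def derive_change_operations(case_data):
    ops = [op for name, cond, op in RULES if cond(case_data)]
    # Validity check: a delete must target a task present in the template.
    for action, task in ops:
        if action == "delete" and task not in TEMPLATE_TASKS:
            raise ValueError(f"invalid operation: delete({task})")
    return ops

print(derive_change_operations({"loss": 10000, "status": "normal"}))
# [('insert', 'expert_assessment')]
```

In the actual prototype, conflict resolution and priorities between competing rules are handled by the rule engine rather than, as here, by simple list order.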

The variant configuration algorithm uses these operations to modify the process template, create a variant schema, and store it in the repository. This is enabled by the transformation between a process template in BPEL and its corresponding process tree. For each process tree, we maintain a (task vector, level vector) pair. Further, interchangeable representations of the tree are maintained in post-order, in-order, and pre-order forms. We use post-order traversal for making changes to the process template. After a variant is generated, the rule editor can also check for data flow consistency to ensure that a task will receive all its input data from the outputs of previous tasks. Otherwise, an error is flagged. Besides configuring a variant from scratch, we also consider process refactoring (Weber and Reichert) from an existing set of process models. Process refactoring includes extracting process templates as well as rules from either a large single model or multiple redundant models. This can help industry manage the transition from an existing process management methodology to our proposed methodology. We plan to implement this part as a plug-in to our existing prototype. It should be noted that a domain expert should be assigned to validate the compliance (e.g., to certain norms) of the derived process variants before they can be deployed.

Figure 5.14. Architecture of process variant configuration
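The (task vector, level vector) pair mentioned above can be sketched as follows. This is an illustrative Python fragment with an invented tree, not the prototype's actual data structures:

```python
# Minimal sketch of serializing a process tree by post-order traversal,
# recording each node's depth (level) alongside its label. The resulting
# pair of vectors is a string-like representation that is easy to store
# and compare.

class Node:
    def __init__(self, label, children=()):
        self.label, self.children = label, list(children)

def post_order(node, level=0, tasks=None, levels=None):
    """Return (task vector, level vector) for the tree rooted at node."""
    if tasks is None:
        tasks, levels = [], []
    for child in node.children:
        post_order(child, level + 1, tasks, levels)
    tasks.append(node.label)   # node emitted after all its children
    levels.append(level)
    return tasks, levels

# Invented example tree: SEQ(t1, AND(t2, t3))
tree = Node("SEQ", [Node("t1"), Node("AND", [Node("t2"), Node("t3")])])
print(post_order(tree))
# (['t1', 't2', 't3', 'AND', 'SEQ'], [1, 2, 2, 1, 0])
```

Because each node appears after its subtree, change operations such as inserting or deleting a task can be applied as local splices in these vectors, which is one reason post-order is a convenient canonical form.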

After a variant is validated, it is ready to be enacted and executed by the workflow engine. However, the user needs to select the appropriate variant that fits the application scenario. For example, to handle a car damage claim with a loss of $10k and normal status, our system should automatically find the variant that fits the scenario and present it to the user. Figure 5.15 presents our architecture for variant search, discovery, and instantiation. We implement two types of query processing for variant discovery: Boolean queries and scenario-based queries. We run a query against the Variant_index table, which maintains the presence of tasks and control-flow structures in each variant. This is helpful when a user wants to find a number of variants that share the same tasks or a similar structure. In contrast, a scenario-based query has more semantics and is goal oriented. It compares the case data with the rule conditions and finds the matching rules that lead to their process variants. After a process instance is completed, the frequency of its use is automatically updated in a log kept in the database. These statistics can be helpful in various ways; e.g., variants with low usage frequency can be discarded.

Figure 5.15. Architecture for variant search, discovery, and instantiation
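The Boolean-query path can be sketched with a toy bit-array index. The task names, variant ids, and bitmask layout below are invented for illustration and are not the prototype's actual Variant_index schema:

```python
# Hedged sketch: model the Variant_index as one bitmask per task, where a
# set bit i means "variant Vi contains this task". A conjunctive Boolean
# query is then a bitwise AND of the masks.

NUM_VARIANTS = 4

variant_index = {          # task -> bitmask over variants V0..V3 (toy data)
    "assess_damage":     0b1011,
    "fraud_check":       0b0011,
    "expert_assessment": 0b0010,
}

def variants_with_all(tasks):
    """Return ids of variants that contain every task in the query."""
    mask = ~0
    for t in tasks:
        mask &= variant_index.get(t, 0)  # unknown task -> empty result
    return [i for i in range(NUM_VARIANTS) if (mask >> i) & 1]

print(variants_with_all(["assess_damage", "fraud_check"]))  # [0, 1]
```

A scenario-based query would instead evaluate rule conditions against the case data and follow the matching rules to their variants, which this index alone cannot express.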

A use case for handling insurance claims

In this section, we walk through the running example of insurance claim handling in our prototype. Figure 5.16 shows the user interface of the process variant manager. Three process clusters are deployed for an insurance company, say ABC Corp. Clusters represent sets of relevant templates and variants in different application domains. For example, C1 is used to start a new insurance quote, C2 to handle insurance claims, and C3 to process customer complaints. Say John is a process designer who wishes to implement the motivating scenarios. When he clicks on cluster C2, the right panel shows the similarity values between the variants and their corresponding template. The columns in this comparison table include task distance, structure distance, usage frequency, and application scenario. Depending on the designer's preferences, a template can be the most frequently used process model, in which case it can be instantiated directly. It is also possible that a template is a minimum set of required tasks across all variants, and is not applicable to any application scenario without some modification. Figure 5.16 shows that the process template V has been instantiated five times.

Figure 5.16. User interface of process variant manager

Figure 5.17 shows the user interface for creating a new variant (i.e., by clicking the New button in the variant configuration group in Figure 5.16). The process designer can examine the process template by browsing the BPEL structure in the left panel and the process tree in the right panel. John can also learn about the current rules by clicking the RuleSet tab. Moreover, he can directly input the change operations and the application scenario based on heuristics, or he can generate a variant by feeding in case data, as shown in Figure 5.18 (a). The resulting process variant is derived from the modified process tree shown in Figure 5.18 (b) and then added to the process repository. After a specific variant is generated, John should send it to a senior employee who is familiar with the insurance business logic in ABC Corp. for evaluation. Over time, the business policies in ABC Corp. may evolve. Upon receiving change requirements, John can revise the rules so that changes propagate to all the insurance processes, instead of manually revising them one at a time in the traditional way. After variables or rules are revised, John is responsible for running the variant configuration to ensure the current processes are compliant with the up-to-date rules. Existing running instances may be restarted, continued without change, or migrated to the new variants.

Figure 5.17. Interface for the variant configurator

Figure 5.18. Process variant configuration: (a) input case data; (b) revised process tree

Discussion and limitations

This use case shows how to use our prototype to design and manage variants based on the algorithms we proposed. We assume the template and rules are designed from scratch. For companies that already have a large repository of redundant or complex process models, we need an efficient and automated technique for refactoring process templates and rules from existing models. For a large and complex process model, we need to identify the repetitive elements and merge them into a concise structure, with rules for dealing with variations such as, for example, multiple branches. The methodology for handling multiple redundant models is similar, but it is applied across different process models; thus, semantic inconsistencies in task names or labels must be resolved. As business processes and policies evolve, they affect process models. One advantage of our approach is that it facilitates the management of process model evolution. Since process variants are derived from a template, any change made to the template (e.g., inserting or removing a task) will automatically propagate to its variants. Handling changes to business policy is also made simpler. As noted above, it is time and effort consuming if process variants are designed as different process models. If any policy changes, each model related to that policy must be changed as well. However, in our approach, we only need to modify the rules that embody that policy. It is still necessary to check the consistency of the

rules after any change is made. Nonetheless, it is still easier than checking every process variant, applying the new policy to it, and then verifying the correctness of each revised process. We recognize that our approach has several limitations. First, we restrict ourselves to structured processes. Some business processes are unstructured; in such situations they have to be mapped into equivalent structured processes, if possible. New techniques for managing unstructured processes are also required. Second, it is necessary for users to specify the semantics of the applied rules. In Section 3.3, we discussed possible semantics for conflict resolution when two or more rules may apply to a given set of case data. In Section 3.4, we developed a matrix to disallow invalid combinations of operations on a template. However, the checking of the rule semantics still has to be done manually by the user to ensure that a rule does not produce unintended consequences. Third, additional cost is involved in managing the rules and ensuring their consistency. If the rule semantics are incorrect, they can lead to incorrect variants. In this respect, it is important to realize that the power of the configuration operations is a double-edged sword. On the one hand, they give considerable flexibility, but on the other, they enable the creation of variants that may be incorrect. Consequently, it may be useful to develop additional controls in the form of meta-rules to somewhat restrain the flexibility of the approach. Finally, the goal of a template is to capture the essence of a process in such a way that other variants can be derived easily from it. However, it would be very helpful to have a precise definition of what is meant by a template.
One possible definition could be based on the notion of similarity as discussed in Section 5.1, but the difficulty lies in the fact that the possible variants that may arise from a template, along with their frequencies of occurrence, are hard to estimate.

Conclusions

This chapter described a novel approach for designing flexible business processes (or variants) by combining process templates with business rules. We showed how our template- and rule-based approach allows the separation of the basic process flow from business policy elements in the design of a process, while also tightly integrating the resource and data needs of a process. Thus, rules that reflect business policy are automatically triggered by changes in the business context and result in customized process variants.

We also developed a novel scheme for storing process variants as strings based on a post-order traversal of a process tree, attaching level numbers to each node of the tree. We showed that such a representation lends itself well to manipulation and to searching a repository of variants. Moreover, indexing structures were developed for faster access to process models based on user requirements expressed through SQL queries. Finally, we showed that it is also possible to query the rule base to find variants that match the criteria specified in a query. A proof-of-concept prototype has been partially implemented to test and evaluate this methodology based on the proposed architecture. However, more experiments are required to evaluate the querying and searching techniques. A further challenge lies in creating an interface that allows users to describe the rules in an easy way and makes the details of the language completely transparent to them. We intend to explore various solutions for this, including the use of an English-like rule language and a GUI, to increase ease of use and prevent typographical errors. More effort will also be devoted to enhancing the richness of the process description language with events, and to the semantics of rule conflict resolution. Lastly, adding ontologies to the architecture would increase the expressive power of the framework as well.

Chapter 6

Context-aware Process Instance Adaptation

Process model flexibility is important due to frequent changes in business policy or goals. However, flexibility at the process instance level is also critical when each case shows its own characteristics (e.g., patient clinical pathways). Thus, process instances should adapt in an intelligent way according to the actual context. The key idea in this chapter is the tight integration of process instances and rules to design adaptive business processes, while recognizing that at some stages in a workflow instance, the subsequent path is determined by the context. We use the healthcare domain as an application area to showcase our proposed approach. The context comes not only from the current case data, but also from the specific pathway that was taken to reach this point in the workflow. It may also involve other aspects such as resource availability, doctor status, and environment conditions, as discussed above. Here, we discuss the general principles and present a framework for integration. In this chapter, we propose CONFlexFlow (Clinical context based Flexible workflow) as a novel approach for integrating clinical pathways into Clinical Decision Support Systems (CDSS) (Yao and Kumar 2012a). It recognizes that flexible pathways are critical for the success of CDSS. Further, it is based on a better understanding of clinical context through ontologies, bringing them to bear in deciding the right rules for a given activity. We also describe an approach for dynamically realizing context-dependent subprocess fragments in a clinical pathway using the ad hoc subprocess element of BPMN 2.0. To illustrate the feasibility of our approach, we propose an implementation architecture and present a proof-of-concept prototype using multiple open source tools such as Protégé, Jess, and Drools. The role of semantic web technologies in integrating clinical pathways into CDSS is highlighted.
Preliminary results from our initial implementation are also discussed.

6.1. Introduction

Workflows play an important role in clinical environments by delineating the steps through which the treatment of a patient progresses. The temporal order and correct coordination of the various steps are

clearly important. The workflows in these environments are called clinical pathways. These pathways generally follow well-established standards or clinical guidelines. However, they differ from other workflows found in business and production environments because clinical processes involve frequent deviations, and hence there is a need for considerable flexibility. Typically, a medical facility develops clinical pathways from clinical guidelines on the basis of its local resources and settings. Moreover, a pathway is further customized into a treatment scheme to suit an individual patient's needs. Clinical workflows are highly dynamic, context sensitive, event driven, and knowledge intensive. In this respect, they are quite unique. In general, a patient interacts with a Primary Care Physician's (PCP) office, a pharmacy, labs, one or more specialists, etc. In such a setting, it is important to maintain coordination and flow of information among these various entities so as to ensure an optimal outcome. The need for new modeling techniques for designing such flexible workflows is motivated by several considerations. First, although many research efforts are geared towards establishing international healthcare standards (e.g., HL7) and a representation for sharable guidelines (e.g., GLIF) for clinical practice, formal models of executable and flexible clinical workflows are very few, e.g., (Dang et al. 2008; Ye et al. 2009). The execution of a clinical workflow is highly dependent on the existing body of medical knowledge, available resources, and specific case data. For example, doctors with different skill levels and fields of expertise may offer differing diagnoses for the same patient, thus leading to different treatment plans. A sudden rise in a patient's blood pressure may require an additional test and alter her treatment in the subsequent pathway.
Thus, different pathways can arise based on case specifics and the proclivities of attending doctors. It is important to formally model these scenarios. Second, medical staff are extremely busy handling many cases each day, and they may make mistakes in prescribing medications, performing procedures, and even making diagnoses (Berner 2009; Kohn et al. 2000; Teich et al. 2005). For example, a doctor might accidentally prescribe an X-ray test for a pregnant patient, or a drug that is not covered by insurance to another patient. Thus, there is a need for CDSS and Computer Interpretable Guidelines (CIGs) to assist care professionals with decision-making activities.

Our goal is to show how flexible clinical pathways can be designed taking into account medical knowledge in the form of rules, together with detailed contextual information, for a medical workflow involving multiple participants, so as to improve care quality. A clinical pathway is a workflow that charts a path for a patient through the various steps of interacting with a PCP office, lab, pharmacy, and other participants in the process of patient care. We propose a methodology for designing formal workflow models that capture medical knowledge and context in a common framework, yet allow flexibility through ad hoc subprocesses. Since clinical workflows should be naturally aligned with clinical guidelines, it is necessary to ensure that they are formal and correct so that integrating decision support into the workflow can be helpful. The methodology is based on a formal rule and context taxonomy using ontologies. The rule taxonomy organizes the rules into a hierarchy, while the context encompasses aspects of patients, providers, resources, and the environment. We focus on how context is captured, described, and summarized, and how rules are developed. A proof-of-concept prototype is developed to show the feasibility of our approach, with preliminary results. In this chapter we use the terms pathway and workflow interchangeably.

6.2. An Integrated Framework: CONFlexFlow

In this study, we propose a Clinical CONtext based Flexible WorkFlow (CONFlexFlow) approach for designing a medical knowledge base and providing decision support. Our study complements the earlier research described in the related work. We aim to bridge the gap among research from the medical informatics, business process management, and semantic web communities. In essence, the proposed approach makes the following contributions: We extend the CIHO model (Figure 4.3) to capture contextual knowledge that has an impact on clinical practice.
This ontology is integrated with a heart failure ontology and can easily be extended in the future by combining it with ontologies from other medical disciplines, e.g., an immunization ontology, disease ontology, hypertension ontology, etc.

Based on this model, we use semantic rules to encode medical procedural knowledge from clinical guidelines. These rules are used to provide medical decision support through reminders, alerts, and recommendations, presenting them to medical staff at the right time in an unobtrusive manner. We model clinical processes using a standard workflow language (BPMN 2.0) and make them adaptable based on the changing context. Thus, each process instance is customized to an individual patient based on case data, including patient information, resource availability, staff expertise, etc. This is done through rerouting rules. We present an architecture for implementing CONFlexFlow and develop a proof-of-concept prototype using multiple open source tools. A preliminary evaluation is conducted and the advantages of this proposal are discussed.

A running example

Figure 6.1 represents a simplified clinical workflow in BPMN 2.0 (OMG January 2011), which supports various types of tasks and workflow patterns. It is executable and supported by existing workflow engines. In this diagram, pools (represented by rectangular blocks) correspond to workflow participants, including the patient, CDSS, PCP office, pharmacy, and lab. A message flow (shown by dotted lines) is used to coordinate communication between two participants. Lanes are used to organize and categorize activities within a pool, such as between a doctor and a nurse. The flow of control, or the ordering, among activities within a pool is shown by solid lines. This pathway shows that a patient having any symptom visits the PCP office and is examined by the care providers. The doctor writes a prescription (Rx) and sends it to the pharmacy. The pharmacy prepares and dispenses the medicine, and the lab is responsible for various tests. Activities or tasks are shown in rounded rectangles, while events are shown in circles. Both have a variety of types (denoted by different BPMN symbols).
Complex gateways (shown as diamonds with an asterisk) allow one or more outgoing branches based on the results of conditions. Parallel gateways (shown as diamonds with a plus sign) create parallel paths without checking any conditions. In this way the workflow gives a formal description of the coordination among the activities involved in this process.

Figure 6.1. A simplified clinical workflow modeled in BPMN 2.0

In this diagram, Prepare Rx is a reusable subprocess whose execution semantics are strictly defined, while Do tests is an ad hoc subprocess that is realized from otherwise isolated activities. It is loosely defined at design time because its instantiation is dynamically determined by the outcome of previous activities. For example, if a doctor upon examination suspects that a patient may have suffered a heart attack, then EKG and ECG would be executed in the subsequent pathway. In contrast, a Blood test might be needed if a patient is suspected of hypertension. Thus, using a strictly predefined subprocess to design an uncertain and dynamic activity is not practical, since it lacks flexibility and adaptability. Context is needed to provide execution semantics for the "loosely defined" regions of such clinical processes at runtime. Our CONFlexFlow approach shows how to handle such situations using context and rules. Thus, a clinical workflow is aware of the current context, displays evidence-based recommendations for care providers, and dynamically reroutes subsequent pathways.
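The runtime realization of the Do tests ad hoc subprocess described above can be sketched as a simple context-driven selection. The rule logic below is a minimal Python illustration using the document's own example conditions; the actual approach expresses such rules semantically over ontologies rather than in code:

```python
# Illustrative sketch: the concrete task set of the "Do tests" ad hoc
# subprocess is chosen at runtime from the clinical context, instead of
# being fixed at design time.

def realize_do_tests(context):
    """Pick the test activities to instantiate for this patient."""
    tests = []
    if "heart_attack" in context["suspected"]:
        tests += ["EKG", "ECG"]        # suspected heart attack
    if "hypertension" in context["suspected"]:
        tests.append("blood_test")     # suspected hypertension
    return tests

print(realize_do_tests({"suspected": {"heart_attack"}}))  # ['EKG', 'ECG']
```

An empty suspicion set yields no tests, so the subprocess can also be skipped entirely for a given instance.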

When a generic process model like the one in Figure 6.1 is actually deployed for a specific patient, it is called a process instance. Thus, the process instance for patient 'Sue' with id 'P1001' is different from that for patient 'Jack' with id 'P1003', and so on.

A meta-model

Figure 6.2 represents a meta-model that describes the semantic relationships among the key concepts used in CONFlexFlow. Hence, it also introduces some of the terminology used in this chapter.

Figure 6.2. A meta-model for CONFlexFlow

Clinical workflows are composed of activities performed by roles (i.e., participants). Rules are derived from clinical guidelines and use medical ontologies for clinical reasoning. They are categorized into rule groups according to their semantic meanings. A rule group can be associated with one or more workflow activities, and it is automatically triggered when the activity node is reached. The data required to evaluate a rule condition is modeled as context, which is critical in clinical practice. The context captures all entities that have an impact on the clinical process, and has subclasses including low-level and

high-level context. Low-level context is explicit and is directly collected from user inputs or other data sources (e.g., medical devices, environment sensors). For example, patient name, physical location, and body temperature are examples of low-level context. They are used to deduce high-level, implicit context by rule-based reasoning, such as resource availability and patient situation. When the context is relevant to a clinical decision, the rules that meet the condition are fired to produce actions. In general, actions take the form of reminders, alerts, and recommendations. They are tightly integrated with medical activities as a means of intervening in the clinical workflow. They can provide evidence-based suggestions for medical decision making in an atomic activity or provide operational semantics for a composite activity. Further, workflow activities in turn produce context, which is not specified in this meta-model. We use a variety of BPMN 2.0 elements for modeling a flexible clinical workflow. Figure 6.3 presents a hierarchy of activity notations. An atomic activity is indivisible and is used to compose processes and composite activities. There are various types of atomic activity. For example, a human task (or user task) requires human interaction with the assistance of a software application, while a manual task is expected to be performed without the aid of any workflow engine or application. A service task is done automatically by a Web service or an automated application. A script task is executed by a workflow engine, and a rule task usually involves some rules to handle complicated business logic. These various tasks support the modeling of different types of medical tasks. Composite activities include reusable and embedded subprocesses. In this study, we mainly use ad hoc subprocesses, which are loosely defined at design time. Then, at runtime, a more specific (or "concrete") description is realized based on the actual context.
The Do tests activity in Figure 6.1 is an example. In contrast, the execution semantics of a structured process is strictly defined and is the same for all running process instances. More details about these notations can be found in the BPMN 2.0 specification.
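The derivation of high-level from low-level context by rule-based reasoning, as described earlier in this section, can be sketched as follows. The thresholds and context fields below are invented for illustration and do not come from the dissertation:

```python
# Minimal sketch, assuming invented thresholds: raw low-level readings
# (body temperature, heart rate) are abstracted by rules into high-level,
# implicit context such as a patient-situation flag.

def infer_high_level(low):
    """Derive high-level context facts from low-level sensor readings."""
    high = {}
    high["fever"] = low["body_temp_c"] >= 38.0
    high["tachycardia"] = low["heart_rate"] > 100
    # A composite high-level fact inferred from the two facts above.
    high["patient_critical"] = high["fever"] and high["tachycardia"]
    return high

print(infer_high_level({"body_temp_c": 39.2, "heart_rate": 112}))
# {'fever': True, 'tachycardia': True, 'patient_critical': True}
```

In CONFlexFlow itself this inference is performed by a rule engine (Jess) over the context ontology rather than by hard-coded thresholds.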

Figure 6.3. A hierarchy of BPMN 2.0 activity notations

System architecture

Figure 6.4 depicts the system architecture of CONFlexFlow based on our meta-model. Several roles are involved in this system. Knowledge engineers communicate with domain experts and collect medical knowledge. Other medical resources and documents can be consulted as well for this purpose. They then use ontology and rule editors to build medical ontologies and rules that formalize clinical guidelines. A context ontology is developed to formally model the relevant context that has an impact on clinical pathways. Clinical context can be acquired from a variety of distributed sources, such as laboratory information systems, surgical information systems, sensors, and the workflow engine. The rule engine then triggers rules, evaluates conditions, and takes actions accordingly. The produced actions are tightly integrated into running clinical processes. Meanwhile, business process designers use workflow modeling tools to create clinical processes, which are stored in the process repository. They can be initialized and executed by the workflow engine to automate patient flow. At runtime, a process instance is executed and fed with specific case data (e.g., patient, medical staff, and environment data). Workflow participants, i.e., various healthcare professionals, manage clinical workflows through a web-based management console to perform clinical activities and

monitor their current process instances. They are presented with reminders, alerts, and recommendations produced by the rule engine based on clinical reasoning. Furthermore, the routing of the clinical workflow is flexible, since the operational semantics of a composite activity can be specified for an individual patient according to the current context. Every activity may produce data (e.g., patient symptoms, lab test results, and device availability) and update the context base continuously. In this way, we ensure the information in our context database is up to date and thereby ensure correctness in clinical reasoning.

Figure 6.4. System architecture of CONFlexFlow

Thus, this architecture captures the nexus of workflow, rules, and context. It is important to note that a clinician will still be in control. The pathways and alerts generated by the CDSS can be overridden, and clinicians can adjust the alert/reminder settings as per their needs. This allows their preferences to be better integrated into the clinical workflow. The open source tools used in our implementation are also displayed. In the next sections, we describe the critical components in more detail.

6.3. Clinical Knowledge Representation and Semantic Rules

The knowledge used for medical decision making during a clinical workflow includes a shared understanding of medical domain concepts, contextual data that characterize the clinical workflow, and procedural knowledge represented as rules. In this section, we discuss the representation and reasoning of clinical knowledge in CONFlexFlow.

Ontological knowledge model

As discussed above, ontologies are explicit formal specifications of the terms in a domain and the relationships among them. They can facilitate knowledge sharing, logical inference, and knowledge reuse. Ontologies are widely accepted as a useful instrument for modeling context; thus we use them to model clinical context that can be obtained from heterogeneous sources and represented in different formats. Further, domain ontologies from the medical disciplines are included in our knowledge framework, since they are critical for decision support. We use the Web Ontology Language-Description Logic (OWL-DL) (Bechhofer et al. 2004) to encode our model using Protégé 3.4 (Stanford University 2009), which is a popular tool for ontology editing and representation. This model can be shared and improved across the healthcare network.

Clinical context ontology

Our clinical context ontology is extended from the CIHO model (Figure 4.3) discussed in Chapter 4. According to Dey (Dey 2001), context is any information that is used to characterize the situation of an entity. Moreover, a system is context-aware if it uses context to bring relevant information and/or services to bear, where relevance is naturally situation dependent. In a clinical workflow, we define context as any information that can be used to influence medical decision making and clinical workflow routing.
For example, a patient's medical history might affect a doctor's prescription (i.e., influence a medical decision); a patient's lab test result will affect her subsequent treatment (i.e., dynamic routing). A normal test result leads to a follow-up check, while an abnormal result may necessitate customized treatment.
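The routing decision just described can be sketched as a small helper. This is an illustrative sketch only; the method and key names (e.g., nextStep, labTestResult) are ours, not part of CONFlexFlow:

```java
import java.util.Map;

// Illustrative sketch: selecting the next pathway step from the context base.
// A normal lab result routes to a follow-up check, an abnormal one to a
// customized treatment, as described in the text. Names are hypothetical.
public class ContextRouting {
    public static String nextStep(Map<String, String> context) {
        String result = context.getOrDefault("labTestResult", "pending");
        switch (result) {
            case "normal":   return "FollowUpCheck";
            case "abnormal": return "CustomizedTreatment";
            default:         return "AwaitResults";  // no result asserted yet
        }
    }
}
```

In a real deployment this decision would of course be made by the rule and workflow engines rather than a hard-coded switch.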

Clearly, proper understanding and representation of context plays a critical role in a CDSS. In this study, we formally model the clinical context using an ontology-based approach, capturing the important concepts in the clinical process. At a preliminary level, we consider the following categories of context:

- Patient-related context includes medical history, age, gender, physiology data (e.g., blood pressure, body temperature, and heart rate), X-ray test results, mobility state, etc. Patient data can be obtained from human input, medical devices (e.g., physiological monitors), and the EMR central repository, and is continuously updated.
- Clinical staff-related context concerns information related to doctors, nurses, lab staff, pharmacists, and other providers. They are the major workflow participants in the clinical pathway, and their decisions can affect patient treatment. Their attributes include expertise area, level of expertise, gender, availability, workload, desired alert level, etc.
- Resource-related context concerns information about resources (e.g., devices) that are needed for workflow execution. This might include a defibrillator's availability and current location, the number of available wheelchairs, stock levels of medicines, etc. Additional information is required about the environment at a particular facility, such as a surgery room's humidity, temperature, and schedule.
- Location-related context concerns the current location of persons, devices, and other movable assets. This information can be captured by RFID devices, smart phones, or other pervasive computing technologies. It is helpful in dealing with emergent situations, such as the need of an OXRB bottle during a surgery.

Figure 6.5 shows a partial representation of our ontology-based clinical context model that describes the main entities of interest, their properties, and semantic relationships.
It shows the contextual information involved in a healthcare activity, such as hospital staff, patients, resources, facility, etc. Their relationships are described so as to capture the constraints between these entities, such as the relation locatedIn between Person and Location. The pair of relations treatedBy and treats between Patient and Staff help to manage the medical staff needed for patient treatment.

Figure 6.5. Partial representation of the clinical context ontology

Protégé 3.4 (Stanford University 2009) provides a graphical interface to model ontologies in OWL, which has become a standard ontology language in the semantic web community. Protégé is an open source platform based on Java, and it provides a plug-and-play environment that allows for flexible prototyping and application development. This model was built on an existing hospital ontology (i.e., the CIHO model in Figure 4.3) that was developed previously by (Yao et al. 2009). As shown in Figure 6.6, six individuals are created as instances of the class Patient. The asserted properties are low-level context directly entered by the user or fed by other services. For example, the ID, name, gender, and vital signs can be entered by the nurse, while the physical location can be detected by the RFID tracking service. The selected patient shows the symptoms of chest pain, dry cough, fatigue, and nausea, which are defined in the medical ontologies and available for nurses to select. Similarly, the other asserted properties show the current activity of this patient and the attending clinicians, which will change as the process proceeds. The inferred properties are high-level context deduced from ontology- or rule-based reasoning. For example, a hypothetical diagnosis is inferred from patient symptoms according to practice guidelines. Our CDSS makes recommendations about patient diagnosis and treatments based on medical guidelines as well as the clinical data. The details of rule-based reasoning are discussed in the next section.

Figure 6.6. Protégé user interface for editing OWL-based ontology

(a) Class
<owl:Class rdf:ID="Patient"/>
<owl:Class rdf:ID="Staff"/>
<owl:Class rdf:ID="Assert">
  <rdfs:subClassOf>
    <owl:Class rdf:ID="Resource"/>
  </rdfs:subClassOf>
</owl:Class>

(b) Individual
<Patient rdf:ID="Patient_PT_">
  <hasGender rdf:datatype="&xsd;string">Female</hasGender>
  <hasName rdf:datatype="&xsd;string">Mary Smith</hasName>
  <hasSigns rdf:resource="#Body_fluid_retention"/>
</Patient>

(c) Object property
<owl:ObjectProperty rdf:ID="suggestedTreatment">
  <rdfs:domain rdf:resource="#Patient"/>
  <rdfs:range rdf:resource="#Treatment"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="supportsDiagnosis">
  <rdfs:domain rdf:resource="#Tests"/>
  <rdfs:range rdf:resource="#Diagnosis"/>
</owl:ObjectProperty>

(d) Data property
<owl:DatatypeProperty rdf:ID="hasStartTime">
  <rdfs:domain rdf:resource="#Activity"/>
  <rdfs:range rdf:resource="&xsd;dateTime"/>
</owl:DatatypeProperty>
<owl:DatatypeProperty rdf:ID="hasEndTime">
  <rdfs:domain rdf:resource="#Activity"/>
  <rdfs:range rdf:resource="&xsd;dateTime"/>
</owl:DatatypeProperty>

Figure 6.7. Knowledge representation in OWL-DL

OWL-DL (Bechhofer et al. 2004) is a sublanguage of OWL. It strikes a good balance between computational and expressive requirements. It follows an object-oriented approach to describe the structure of a domain in terms of classes, their properties, and semantic relationships. It has inference capabilities through the OWL characteristics of properties, like inversion, symmetry, and transitivity.
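As a rough illustration of these property characteristics (not an OWL reasoner), the following sketch applies inversion to the treats/treatedBy pair and transitivity to locatedIn; the entity names are hypothetical:

```java
import java.util.*;

// Illustrative sketch of two OWL property characteristics mentioned above:
// treats/treatedBy as an inverse pair, and locatedIn as a transitive property.
// Entity names (Dr_Jones, Ward_3, ...) are ours, not from the ontology.
public class PropertyInference {
    // Inversion: from treats(staff, patient), derive treatedBy(patient, staff).
    public static String[] inverseOfTreats(String staff, String patient) {
        return new String[] { patient, "treatedBy", staff };
    }

    // Transitivity: if a locatedIn b and b locatedIn c, then a locatedIn c.
    // Walks the direct locatedIn assertions and collects the closure.
    public static Set<String> locatedInClosure(Map<String, String> locatedIn, String start) {
        Set<String> places = new LinkedHashSet<>();
        String current = locatedIn.get(start);
        while (current != null && places.add(current)) {  // add() false on a cycle
            current = locatedIn.get(current);
        }
        return places;
    }
}
```

A DL reasoner derives such facts automatically from the property declarations; this sketch only makes the mechanism concrete.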

Figure 6.7 shows the XML representation of a class, an individual/instance, an object property, and a data property in OWL-DL. It is automatically generated and stored by Protégé when the context ontology in Figure 6.5 is created.

Medical domain ontologies

Medical domain ontologies are established to provide a shared understanding of concepts in specialized medical disciplines. Thus, they provide important and formal vocabularies for a CDSS. In this study, it is impossible to include every specialty in the medical domain. Hence, we adopted the already published heart failure ontology named HF_ontology (HEARTFAID team 2008) in our prototype. This ontology was developed in accordance with the guidelines of the European Society of Cardiology and is useful for medical decision making related to heart disease, such as the diagnosis and treatment of systolic heart failure. HF_ontology is used to infer patient conditions, diagnoses, and treatments, as well as to detect emergent situations. Figure 6.8 shows a partial hierarchy of HF_ontology. Important concepts such as patient characteristics, tests, and treatments are formally modeled, and their relationships are represented as well. Treatments include medication, device therapy, surgical therapy, patient education, and recommendations. Further, medication can be categorized into ACE inhibitor, Angiotensin receptor blocker, Beta blocker, Digoxin, Diuretics, and Spironolactone. This ontology is relatively stable; it will only be updated when there is a change in the shared understanding of heart disease. Concepts from HF_ontology are connected to our context ontology. For example, patients have signs, symptoms, suggested tests, suggested treatments, etc. These constraints are captured by object properties, including hasSigns, hasSymptoms, suggestedTest, suggestedTreatment, etc., between the class Patient and the classes Signs, Symptoms, Tests, and Treatment in HF_ontology.
Similarly, this model is implemented using Protégé and stored as an OWL file in our knowledge base. Although we only consider the heart failure ontology here, other domain ontologies, such as an immunization ontology, diabetes ontology, or pediatrics ontology, can easily be incorporated in the future to handle other diseases.

Figure 6.8. Partial representation of the heart failure ontology

Semantic rules for clinical reasoning

Rules embody medical procedural knowledge and are used to help make complex clinical decisions through logical reasoning, often in real time and in response to critical events. They can also handle exceptional situations. This section discusses how semantic rules are represented and how rule-based reasoning is conducted. We also discuss their intervention with clinical workflows.

Medical rules

A clinical workflow may have thousands of rules serving a variety of purposes (e.g., patient diagnosis, treatment) and applicable in different medical disciplines (e.g., heart failure, diabetes, pediatrics), but only a small part of the rule set is triggered for an activity within a process instance. Managing and integrating medical knowledge in the form of rules, and applying the results of rule-based reasoning to a clinical pathway, is critical to achieving an optimal outcome. We organize rules into categories according to when they might be applied in the clinical process. We consulted the American Heart Association and the American Academy of Family Physicians (AAFP) websites to obtain these rules.
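The category-based organization of rules can be pictured as a simple lookup structure. This is our own sketch, with illustrative category and rule names, not the dissertation's implementation:

```java
import java.util.*;

// Illustrative sketch: a catalog that groups rules by category so that only
// the small subset relevant to the current activity is handed to the rule
// engine. Category and rule names are hypothetical.
public class RuleCatalog {
    private final Map<String, List<String>> byCategory = new HashMap<>();

    public void add(String category, String ruleName) {
        byCategory.computeIfAbsent(category, k -> new ArrayList<>()).add(ruleName);
    }

    // Return only the rules in the requested category, not the whole repository.
    public List<String> rulesFor(String category) {
        return byCategory.getOrDefault(category, List.of());
    }
}
```

The point is the selectivity: out of thousands of rules, an activity pulls in only its own category.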

Figure 6.9 shows a hierarchy of our rules categorized into patient evaluation, diagnosis, treatment, and prescription checking. Each category contains one or more rules that represent relevant knowledge, and it is associated with a medical activity during patient encounters. Each category is further divided into sub-categories. For instance, prescription checking is subdivided into allergy checking, drug interaction checking, dose checking, and insurance checking. In case a prescribed drug interacts with medication a patient has already taken, the system can propose an alternative that may be referred to the doctor for approval. New rules can also be developed and added to this hierarchy. Figure 6.9 serves as a guide for organizing rules systematically, i.e., deciding what rules to include in a category.

Figure 6.9. A semantic hierarchy of rules based on clinical knowledge

Semantic rules in SWRL

We use the Semantic Web Rule Language (SWRL) (Horrocks et al. 2004) to encode rules for user-defined reasoning. SWRL is the standard rule language for the semantic web based on OWL and RuleML. It utilizes the typical logic expression antecedent → consequent to represent semantic rules. Both the antecedent (rule body) and the consequent (rule head) can be conjunctions of one or more atoms, written as atom_1 ∧ atom_2 ∧ … ∧ atom_n. If all the atoms in the antecedent are true, then the consequent must also be true. In the examples discussed below, the symbol ∧ means conjunction, ?x stands for a variable (i.e., an instance or individual), and → means implication. The consequent part of a triggered rule can be used to update the context database, issue reminders and alerts, and, more importantly, customize an ad hoc subprocess.

Table 6.1. Example user-defined context reasoning rules

Category/Intervention: Patient Evaluation Rules (PER) - Physical Examination
  Scenario: Chronic heart failure detection
  PER1: Patient(?p) ∧ hasSymptoms(?p, Dyspnea) ∧ hasSymptoms(?p, Fatigue) ∧ hasSymptoms(?p, Peripheral_edema) ∧ hasSymptoms(?p, Palpitations) ∧ hasSymptoms(?p, Coughing) ∧ hasSymptoms(?p, Nausea) ∧ hasSymptoms(?p, Neurologic_deficit) → hypotheticalDiagnosis(?p, Chronic_heart_failure)
  Scenario: Suggested tests in presence of heart failure
  PER2: Patient(?p) ∧ hypotheticalDiagnosis(?p, Chronic_heart_failure) → suggestedTest(?p, Physical_examination) ∧ suggestedTest(?p, Blood_test) ∧ suggestedTest(?p, Echocardiography_tests) ∧ suggestedTest(?p, Chest_X-ray) ∧ suggestedTest(?p, Electrocardiography_tests)

Category/Intervention: Patient Diagnosis Rules (PDR) - Diagnosis
  Scenario: Systolic heart failure diagnosis
  PDR1: Patient(?p) ∧ hypotheticalDiagnosis(?p, Chronic_heart_failure) ∧ hasTestResults(?p, Chest_X-ray_abnormal) ∧ hasTestResults(?p, ECG_abnormal) ∧ hasTestResults(?p, BNP_value_higher_than_100_pg_per_ml) ∧ hasTestResults(?p, LVEF_lower_than_40_percent) → suggestedDiagnosis(?p, Systolic_heart_failure)
  Scenario: Hypertensive heart failure diagnosis
  PDR2: Patient(?p) ∧ hypotheticalDiagnosis(?p, Chronic_heart_failure) ∧ hasSigns(?p, High_blood_pressure_disorder) ∧ hasSigns(?p, Abnormal_heart_sounds) ∧ hasTestResults(?p, ECG_abnormal) ∧ hasTestResults(?p, Pulmonary_venous_congestion) → suggestedDiagnosis(?p, Hypertensive_heart_failure)

Category/Intervention: Patient Treatment Rules (PTR) - Treatment
  Scenario: Systolic heart failure treatment
  PTR1: Patient(?p) ∧ suggestedDiagnosis(?p, Systolic_heart_failure) → suggestedTreatment(?p, ACE_inhibitor) ∧ suggestedTreatment(?p, Beta_blocker)
  PTR2: Patient(?p) ∧ suggestedDiagnosis(?p, Systolic_heart_failure) ∧ hasSymptoms(?p, Edema) ∧ hasSymptoms(?p, Congestion) → suggestedTreatment(?p, ACE_inhibitor) ∧ suggestedTreatment(?p, Beta_blocker) ∧ suggestedTreatment(?p, Diuretic)

Category/Intervention: Prescription Checking Rules (PCR) - Prescription
  Scenario: Patient-drug interaction
  PCR1: Patient(?p) ∧ allergicTo(?p, Aspirin) ∧ Prescription(?Rx) ∧ hasPrescription(?p, ?Rx) ∧ swrlb:contains(?Rx, Aspirin) → hasAlert(?p, "This patient is allergic to Aspirin")
  Scenario: Drug-drug interaction
  PCR2: Patient(?p) ∧ hasPrescription(?p, ?Rx1) ∧ Prescription(?Rx1) ∧ Prescription(?Rx2) ∧ interactWith(?Rx1, ?Rx2) → forbidRx(?p, ?Rx2)

Table 6.1 shows several example rules from different categories and their triggering context within the workflow. This information is provided to care professionals as recommendations to avoid deviation

from clinical guidelines. Patient evaluation rules evaluate a patient's medical history, social background, habits, symptoms, etc., prior to physical examination, so that a hypothetical diagnosis can be presented to a clinician during the examination. Patient diagnosis rules evaluate a patient's signs, lab test results, and other relevant medications, and provide recommendations for diagnosis decisions, such as systolic heart failure or hypertensive heart failure. Although signs and symptoms describe the same conditions, they are essentially different: signs are what a doctor observes (e.g., high blood pressure and abnormal heart sounds), while symptoms are what a patient experiences (e.g., fatigue and dyspnea). Symptoms characterize a disease and are reported by a patient to a nurse; signs are usually detected and logged by a doctor during examination. Patient treatment rules provide suggestions for treatments such as medications, device therapy, or patient education, according to the confirmed diagnosis. Prescription checking rules deal with drug interactions (e.g., Carbidopa and Levodopa), dosage checking, and allergy-drug effects to avoid prescription errors. For example, rule PCR1 will generate an alert message if a patient who is allergic to Aspirin is given any medication that contains Aspirin. In addition, there are other rules that concern resources, such as their availability and schedules. An extensive enumeration of these rules is omitted here, since we mainly focus on guideline-based patient treatment.

Semantic reasoning using Jess rule engine

The execution of SWRL rules requires a rule engine, which performs reasoning using a set of rules and a set of facts as input. Any new facts that are inferred are used as input to potentially fire more rules (i.e., forward chaining).
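A stripped-down sketch of this fire-and-assert loop (ours, far simpler than a real engine) may clarify the idea: rules are applied to the fact set, inferred facts are added back, and the loop repeats until nothing new is produced:

```java
import java.util.*;
import java.util.function.Function;

// Illustrative sketch of naive forward chaining (not Jess and not the RETE
// algorithm): each rule maps the current fact set to the facts it can infer,
// and firing repeats until the fact set stops growing.
public class ForwardChaining {
    public static Set<String> run(Set<String> facts,
                                  List<Function<Set<String>, Set<String>>> rules) {
        Set<String> all = new LinkedHashSet<>(facts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Function<Set<String>, Set<String>> rule : rules) {
                if (all.addAll(rule.apply(all))) changed = true;  // a rule fired
            }
        }
        return all;
    }
}
```

Chaining two toy rules in the spirit of PER1 and PER2 (simplified to two symptoms) shows how one rule's conclusion feeds the next.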
In our implementation, we use the Jess rule engine (Friedman-Hill et al. 2005) to enable SWRL reasoning on the Protégé data set, since it is small, lightweight, and one of the fastest engines available. Jess is implemented in Java and supports both forward and backward chaining. It is stable, well supported, and has been successfully used by others. Figure 6.10 shows the translation of ontology and rule knowledge that enables this process. As can be seen, SWRL rules are stored as OWL individuals with their associated knowledge base. First, SWRL rules are translated into Jess rules using the SWRLJessBridge (O'Connor et al. 2005), which is a Protégé

plug-in, and added to the Jess rule engine; then, the ontologies and the knowledge base are translated into Jess facts and introduced into the rule engine as well; third, the Jess rule engine performs the reasoning and produces results in the Jess format; finally, these results are translated back into the OWL format. In this figure, we show the format of Jess facts as well as the format of the Jess rule for PTR1.

Figure 6.10. SWRL reasoning using Jess rule engine

SWRL rules can be easily modeled in SWRLJessTab (O'Connor et al. 2005), a Protégé plug-in that provides a graphical interface for interacting with the SWRLJessBridge. SWRLJessTab allows us to insert, remove, and edit SWRL rules. Figure 6.11 (a) shows the implementation of rule PER1 in this editor. With the symptoms asserted by a nurse, this rule infers that the patient may have chronic heart failure, as shown in Figure 6.11 (b). Meanwhile, another rule, PER2, is triggered; it uses the results generated by PER1 through forward chaining and produces suggested tests as output. These results are presented to a doctor during physical examination. As the clinical process proceeds, other rules can be triggered as more data is asserted into the knowledge base. For example, signs

are asserted by the doctor, and lab test results are asserted by lab staff. Accordingly, treatments are recommended (i.e., ACE inhibitor and Beta blocker medication). In this example, rule-based reasoning is conducted recursively, which means a rule can use the results of another rule as input data, until no more rules can be fired. This process is called forward chaining and is widely used in rule-based systems. Thus, the knowledge base is continuously updated with the facts inserted. The results of ontological reasoning are tightly integrated with clinical pathways and provide evidence-based decision support for clinicians through a user-friendly interface. In the next section, we discuss how the knowledge derived from SWRL rules is used to implement flexible clinical pathways.

Figure 6.11. Encoding and reasoning of chronic heart failure using SWRL rules: (a) rule in the SWRL editor; (b) results of SWRL rule chaining

6.4. Integration of Clinical Pathway and Rules for Flexibility

The key innovation in CONFlexFlow is the tight integration of pathways and rules to improve the operation of a CDSS, while recognizing that at some stages in a clinical workflow the subsequent path is determined by the context. Hence, flexibility is of the essence. The decision on the next path to take is usually determined by the medical plan to avoid deviation, but it is also influenced by the contextual information of individual patients. The context is derived not only from the current case data, but also from the specific pathway that was taken to reach this point in the workflow. Here, we first give a framework for integration, and then discuss our implementation in detail.

A framework for integration

There is an m-to-n relationship between (activity, context) pairs on the one hand and rules on the other. This means a certain contextual situation with regard to an activity may need the application of many rules, while a given rule may apply to many (activity, context) situations. Thus, a context where a patient is not allergic to any substance will not trigger allergy check rules, and similarly, if a patient does not carry insurance, an insurance check is omitted. These relationships can be expressed in the form (Activity, Context) → Rules:

(Medication, allergic) → allergy check rule(s)
(Admission, broken leg) → wheel chair check rule(s)
(Admission, emergency) → emergency procedure rule(s)
(Do tests, X-ray) → dose check rule(s)
(Revaluation, overweight) → diet recommendation rule(s)

Hence, during the medication activity, if a patient is allergic to some drugs, an allergy check is needed. At admission time, if there is an emergency, then a different admission procedure that deviates from the normal one is used. But clearly, at admission time we do not need to check whether the patient has an allergy. Thus, only the small set of rules related to a specific activity should be triggered before the activity is executed, according to the current context. Figure 6.12 presents our proposed methodology for integrating context and rules to design a customized subprocess at runtime. The main idea is that after the relevant rules are extracted based on the (context, activity) combination as discussed above, they are fired to determine the execution semantics for the isolated activities contained in the loosely coupled composite activity. Thus, a subprocess fragment is generated and executed as the subsequent pathway for the current running process instance.

Figure 6.12. Methodology for integration of context, activity, and rules
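The (Activity, Context) → Rules mapping described above can be sketched as a lookup table; the pairs mirror the examples in the text, but the code itself is an assumption of ours, not CONFlexFlow's API:

```java
import java.util.*;

// Illustrative sketch of the m-to-n (activity, context) -> rules mapping.
// The pairs come from the text; the keyed-lookup design is hypothetical.
public class RuleSelector {
    private static final Map<String, List<String>> TABLE = Map.of(
        "Medication|allergic",    List.of("allergy check"),
        "Admission|broken leg",   List.of("wheel chair check"),
        "Admission|emergency",    List.of("emergency procedure"),
        "Do tests|X-ray",         List.of("dose check"),
        "Revaluation|overweight", List.of("diet recommendation"));

    // Only rules matching the current (activity, context) pair are selected;
    // e.g., no allergy check is triggered at admission time.
    public static List<String> select(String activity, String context) {
        return TABLE.getOrDefault(activity + "|" + context, List.of());
    }
}
```

An empty result simply means no special rules apply and the activity proceeds normally.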

Designing a flexible clinical workflow using BPMN 2.0

BPMN 2.0 (OMG January 2011) not only defines a standard for graphically representing a business process, just as BPMN 1.1 did, but also includes execution semantics for the defined elements. It uses an XML format to store and share process definitions. This new standard provides a large variety of node types, including events, activities, gateways, etc. Specifically, we make frequent use of the human task, business rule task, reusable subprocess, and ad-hoc subprocess (introduced in Section 6.3) in our implementation. The human task is very useful for modeling clinical scenarios, since most medical tasks cannot be completely automated (e.g., using Web services), but rather involve a lot of user interaction. Recommendations and reminders can be presented to clinicians in a user-friendly interface using human tasks. A reusable subprocess refers to a strictly defined process that can be reused in many scenarios, such as the ECG process and the X-ray process. Furthermore, we use the ad-hoc subprocess to realize our notion of workflow flexibility and allow context-aware customization of a loosely defined region or activity.

Figure 6.13. A loosely coupled BPMN workflow implemented in Drools-flow

Figure 6.13 shows such a BPMN 2.0 workflow model for treating heart diseases; it is implemented using Drools-flow (JBoss 2010b). We make medical plans or protocols, which are originally presented in flow charts or plain text, executable through workflow modeling. An example of such guidelines for treating heart attack patients is presented in Appendix B. In this process, we have several human tasks, such as physical examination and diagnosis, which require interaction with clinicians.

Clinical data can be obtained from human tasks through user input and is then accessible to the workflow engine. In particular, we have four ad-hoc subprocesses (ASPs) for composite activities to be customized at runtime. Do tests ASP1, Medication ASP3, and Therapy ASP4 are three composite activities that contain a number of atomic activities, while Treatment ASP2 is nested with other ASPs. Whether the contained activities will be executed, or how they will be executed, is unknown at design time. For example, Do tests ASP1 refers to the set of scheduling and testing activities that are likely to follow the doctor's examination. It is not possible to predict what tests the doctor will prescribe until the patient is examined. Similarly, the therapy activity may include percutaneous coronary intervention (PCI) or coronary artery bypass grafting (CABG), depending on the diagnosis. Hence, the operational semantics for instantiating an ASP is only known when this node is reached (i.e., not activated yet). A strictly defined subprocess is produced dynamically based on the new information inserted into the context base. Similarly, the procedure for patient treatment depends on her test results, further diagnosis, and medical history. A more detailed use case study on handling heart failure pathways is discussed in (Yao and Kumar 2012b).

Workflow engine Drools-Flow 5.2

We use the open source tool Drools-Flow 5.2 (a.k.a. jbpm5) (JBoss 2010b) as the workflow engine. It implements all element types defined in BPMN 2.0 and allows us to execute the clinical process in Figure 6.13. In addition, we used Drools-Expert (JBoss 2010a), which is essentially a rule engine, as a complementary tool to support the rerouting of subsequent pathways. Figure 6.14 shows the system implementation of CONFlexFlow using the Drools framework. We created a stateful knowledge session that loads the required BPMN process models and the Drools rule file into the production memory at runtime.
In this study, we have the heart failure process (i.e., HeartFailureProcess.BPMN), the ECG process (i.e., ECGProcess.BPMN), and some other medical processes in BPMN files. In addition, the heart failure rerouting rules (i.e., HeartFailureReroutingRules.drl) are loaded to support workflow flexibility. (Drools rules are written in DRL, which stands for Drools Rule Language and is also the rule file extension.) The Drools workflow engine manages the status of

process instances along with their associated data. The Drools rule engine is made aware of the processes by inserting the current state of the processes as part of the working memory data. On the other hand, the clinical context data that is relevant to the current process instance is loaded from our OWL medical knowledge base (discussed in the previous section), transformed into Java classes using the adapter, and then inserted into the working memory as facts.

Figure 6.14. CONFlexFlow implementation using the Drools framework (components: a web-based human task client (Mina) and human task service; the Drools workflow engine with production memory holding processes and rules, including HeartFailureProcess.BPMN, ECGProcess.BPMN, X-rayProcess.BPMN, ACEInhibitorMedication.BPMN, and HeartFailureReroutingRules.drl; the Drools-Expert rule engine with pattern matcher, Agenda, and the AgendaEventListener, ProcessEventListener, and WorkingMemoryEventListener; the working memory of facts; and the knowledge manager and adapter connecting to the knowledge base through the OWL API, SWRL API, and Jess rule engine)

Drools-Expert provides a forward chaining inference engine that implements and extends the RETE algorithm (Forgy 1982); their version is called ReteOO, signifying the enhancement and optimization in their implementation. It is then able to derive the next steps, taking Drools rules and processes into account jointly. If a part of the process needs to be executed, the rule engine requests the workflow engine to execute that step. This can be done easily, since the process engine is equipped with a WorkingMemoryEventListener (WMEL) that "listens" for any events sent from the rule engine. Once the current step is completed, the process engine returns control to the rule engine to again derive the next

step. Thus, the Drools implementation gives control to the rule engine to decide, and to notify the workflow engine of, what to do next. Drools rules allow us to alter process behavior dynamically. The Drools workflow engine comes with several event listeners for responding to different types of events. For example, upon triggering a signalEvent, the WMEL listener can activate any event or activity in the running process instance. Event-related data can be passed from the parent process to a subprocess or activity as well. The ProcessEventListener "listens" for events from current instances, such as beforeNodeTriggered, beforeProcessStarted, and afterProcessStarted. These provide real-time updates on process status and are widely used to support medical tasks. For example, after the human task node PhysicalExamination is triggered, the system gets the results from the medical reasoning rules and prepares the data to be shown to clinicians. The AgendaEventListener is aware of rule creation, activation, firing, etc. Drools rules can be updated (i.e., inserted or deleted) in the production memory at runtime, and such events should be captured by the system to provide a response. The Agenda manages the execution order of conflicting rules using a conflict resolution strategy. The default strategies employed by Drools are salience and LIFO (last in, first out). We use salience to specify the priority of rules: the rule with the higher salience value is preferred. In addition, we use ruleflow-group to associate a set of Drools rules with a rule task or an ad-hoc subprocess. Any rule whose ruleflow-group attribute has the same name as the activity will be triggered before the activity node is activated. The workflow process will immediately continue with the next node if it encounters a ruleflow-group with no active rules at that point.
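Salience-based conflict resolution can be sketched as follows; SalienceAgenda is our illustrative class, not the Drools Agenda API:

```java
import java.util.*;

// Illustrative sketch of salience-based conflict resolution: among the rule
// activations competing at the same point, the one with the highest salience
// value fires first. This mimics the Agenda's ordering, not its implementation.
public class SalienceAgenda {
    public record Activation(String ruleName, int salience) {}

    public static List<String> firingOrder(List<Activation> activations) {
        return activations.stream()
                .sorted(Comparator.comparingInt(Activation::salience).reversed())
                .map(Activation::ruleName)
                .toList();
    }
}
```

With two competing activations of salience 30 and 20, the salience-30 rule is scheduled first, matching the stated preference for higher salience values.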
To allow clinicians to interact with clinical process instances, we use the web-based process management console provided by Drools. They can input clinical data that will be used in the subsequent pathways.

Customizing an ad-hoc subprocess at runtime

In this section, we show how to realize the operational semantics for an ad-hoc subprocess at runtime. Figure 6.15 shows the representation of the ad-hoc subprocess Treatment-ASP2 in BPMN 2.0 XML. It is nested with two other ASPs, Medication-ASP3 and Therapy-ASP4, which

should be carried out in sequential order, while the activities within these two ad-hoc subprocesses must be conducted in parallel. This is specified using the ordering attribute. A subprocess is completed when no active instance is running (i.e., specified by the completionCondition attribute). We can see that in an ad-hoc subprocess there can be no connections between nodes (i.e., activities); thus their execution order is not known at design time. There can be a number of execution semantics for connecting these nodes, and they are specified in a rule group with the same name (i.e., Treatment-ASP2). The activation of this rule group is maintained by the Agenda, as discussed in the above section. The process engine is made aware of rule activation through the AgendaEventListener (see Figure 6.14).

<!-- ad hoc subprocess for treatment activity; contained activities execute in sequence -->
<adHocSubProcess id="_8" name="Treatment-ASP2" ordering="Sequential">
  <!-- nodes -->
  <!-- ad hoc subprocess for therapy activity; contained activities execute in parallel -->
  <adHocSubProcess id="_8-2" name="Therapy-ASP4" ordering="Parallel">
    <!-- nodes -->
    <callActivity id="_8-2-1" name="Coronary therapy" calledElement="CoronaryTherapy"/>
    <callActivity id="_8-2-2" name="Bypass surgery" calledElement="BypassSurgery"/>
    <!-- connections -->
    <!-- There can be no connection within an ad-hoc subprocess -->
    <completionCondition xsi:type="tFormalExpression">
      getActivityInstanceAttribute("numberOfActiveInstances") == 0
    </completionCondition>
  </adHocSubProcess>
  <!-- ad hoc subprocess for medication activity; contained activities execute in parallel -->
  <adHocSubProcess id="_8-1" name="Medication-ASP3" ordering="Parallel">
    <!-- nodes -->
    <callActivity id="_8-1-1" name="ACE inhibitor" calledElement="ACEInhibitor"/>
    <callActivity id="_8-1-2" name="ARB" calledElement="ARB"/>
    <callActivity id="_8-1-3" name="Beta blocker" calledElement="BetaBlocker"/>
    <callActivity id="_8-1-4" name="Aspirin" calledElement="AspirinMedication"/>
    <callActivity id="_8-1-5" name="Statins" calledElement="StatinsMedication"/>
    <callActivity id="_8-1-6" name="Analgesic" calledElement="AnalgesicMedication"/>
    <!-- connections -->
    <completionCondition xsi:type="tFormalExpression">
      getActivityInstanceAttribute("numberOfActiveInstances") == 0
    </completionCondition>
  </adHocSubProcess>
  <scriptTask id="_8-3" name="Start Medication">
    <script>System.out.println("start medication");</script>
  </scriptTask>
  ...
  <!-- connections -->
  <sequenceFlow id="_8-3-_8-1" sourceRef="_8-3" targetRef="_8-1"/>
  <sequenceFlow id="_8-5-_8-2" sourceRef="_8-5" targetRef="_8-2"/>
  <completionCondition xsi:type="tFormalExpression">
    getActivityInstanceAttribute("numberOfActiveInstances") == 0
  </completionCondition>
</adHocSubProcess>

Figure 6.15. XML representation of the nested ad hoc subprocess Treatment-ASP2 in Figure 6.13

Figure 6.16 shows sample rules associated with their ad-hoc subprocesses (comment lines start with //). The first rule, R1, is associated with the Treatment-ASP2 subprocess, since it belongs to the ruleflow-

136 group Treatment-ASP2. This rule shows that therapy and medication will only be carried out when a patient needs thrombus breaking (or blood clot). Similarly, R2 and R3 belong to the Medication-ASP3 activity. Rule R2-General Medication will always be triggered but it will assign different medications (i.e., ACE inhibitor vs. ARB) depending on whether the patient is intolerant of ACE inhibitor. R3 will prescribe statins (or HMG-CoA reductase inhibitors) when a patients has high Low-density lipoprotein (LDL). R2 is given a higher priority than R3 by using the salience attribute since ACE inhibitor or ARB medication should be given first to heart failure patients to control their blood pressure, treat heart failure, and prevent strokes. In our implementation, more rules are present to tackle different kinds of scenarios. rule "R1-Treatment for patients require thrombus breaking" ruleflow-group "Treatment-ASP2" when processinstance: WorkflowProcessInstance() then if(processinstance.getvariable("requirethrombusbreaking").equals("yes")) //triggering both therapy and medication (in sequence) if thrombus breaking is required drools.getcontext(processcontext.class).getprocessinstance().signalevent("start Therapy", pdata); drools.getcontext(processcontext.class).getprocessinstance().signalevent("start Medication", pdata); else //triggering only medication activity if thrombus breaking is not required for this patient drools.getcontext(processcontext.class).getprocessinstance().signalevent("start Medication", pdata); end rule "R2-General medications" ruleflow-group "Medication-ASP3" salience 30 when processinstance: WorkflowProcessInstance() then if (processinstance.getvariable("intolerantoface_inhibitor").equals("no")) drools.getcontext(processcontext.class).getprocessinstance().signalevent("ace inhibitor", data); else //ARB medication is usually given to patients if they are intolerant of ACE inhibitor 
        drools.getContext(ProcessContext.class).getProcessInstance().signalEvent("ARB", pData);
end

rule "R3-Medication for patients with high LDL"
ruleflow-group "Medication-ASP3"
salience 20
when
    processInstance : WorkflowProcessInstance()
    eval(processInstance.getVariable("hasHighLDL").equals("yes"))
then
    drools.getContext(ProcessContext.class).getProcessInstance().signalEvent("statins");
end

Figure 6.16. Rules associated with the Treatment and Medication ad-hoc sub-processes

In this way, a variety of scenarios can be created for an ad-hoc subprocess according to different contexts. Figure 6.17 (a) shows the process log generated by Drools when the workflow model in Figure 6.13 is instantiated for a specific patient who needs thrombus breaking, is not intolerant of ACE inhibitors, shows signs of high LDL, etc. It reflects the operational semantics of the ad-hoc subprocesses Do-tests-ASP1 and Treatment-ASP2 (highlighted with a red rectangle) in a particular context. The visualization of the actual pathway for treating this patient (i.e., Treatment-ASP2) is presented in Figure 6.17 (b). More complicated workflow patterns can be modeled and instantiated with this approach as well.

(a) Drools screenshot of the process log for a specific patient
(b) Visualization of the runtime results for the ad-hoc subprocess Treatment-ASP2 in Figure 6.13
Figure 6.17. Results of the ad-hoc subprocess instantiation

6.5. Discussion

Above we have described a novel and practical framework, CONFlexFlow, for designing a CDSS that integrates clinical context and guidelines. We argue that flexible integration of a CDSS with the clinical workflow is a key to its success. Moreover, semantic web technologies like OWL can help to create

ontologies that are exchangeable across various healthcare departments and organizations. This promotes the understanding of medical knowledge across different providers and also enables sharing and extensibility. Thus, one provider can import an existing ontology and extend it further without having to reinvent the wheel. This approach is formal, yet also flexible. Slowly but steadily, more medical practices are shifting toward electronic records.

Results of ontology analysis

The clinical context model was developed following a formal methodology (Noy and McGuinness 2001). First, we enumerated healthcare use cases and checked existing ontologies. Then, we defined the five top-level classes and 43 sub-classes in the hierarchy. The third step involved creating all the properties for these classes, including 47 object properties and 106 datatype properties; their domains and ranges were also defined accordingly. After that, 126 restrictions were created for the classes to enrich the ontology semantically. When the ontology schema was complete, we added 30 patients, 25 hospital personnel, 40 assets, and 40 locations as test cases. Our implementation is available on the web (Yao 2009) for future extension and improvement.

To validate our model, we installed the Pellet reasoner plug-in (Parsia and Sirin 2004) for Protégé 3.4. This tool checks an OWL-encoded ontology for inconsistencies and infers new instances or classes. After each iteration of ontology development, we ran Pellet against the ontology to check its validity and revised it whenever an inconsistency or redundancy was found. The final result showed that our context ontology is logically valid in terms of logical consistency (i.e., no conflicts), concept satisfiability, and classification.

In our knowledge base, we also include the heart failure ontology developed by the HEARTFAID team (2008). This ontology involves more than 200 classes, 100 properties, and 1000 individuals.
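To make one such check concrete, here is a toy sketch (not Pellet, and the class names are illustrative, not our actual ontology): it flags cyclic subClassOf chains in a hierarchy. OWL interprets such cycles as class equivalence, which in a hand-built context model usually signals the kind of redundancy a reasoner reports.

```python
# Toy sketch (not Pellet; class names are illustrative): detect cyclic
# subClassOf chains by walking child -> parent links in a hierarchy.
hierarchy = {                 # child class -> parent class
    "Doctor": "Person", "Nurse": "Person", "Patient": "Person",
    "Person": "ContextEntity", "Location": "ContextEntity",
    "Asset": "ContextEntity", "Activity": "ContextEntity",
}

def has_cycle(hierarchy):
    """True if following child -> parent links ever revisits a class."""
    for start in hierarchy:
        cls, seen = start, set()
        while cls in hierarchy:
            if cls in seen:
                return True
            seen.add(cls)
            cls = hierarchy[cls]
    return False

print(has_cycle(hierarchy))                # False: the fragment is a proper tree
print(has_cycle({"A": "B", "B": "A"}))     # True: A and B subclass each other
```

A full reasoner of course does far more (satisfiability, classification, instance inference); this only illustrates one structural sanity check.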
Note that for now we only focus on the diagnosis and treatment of heart failure patients based on context information. Many other medical domain ontologies, such as immunization and cancer ontologies, can be integrated into our model. Reusing mature existing medical ontologies can make our model more complete for real environments.

Based on these ontologies, we developed 18 SWRL rules describing heart failure procedural knowledge. They cover the detection, diagnosis, and treatment of chronic heart failure, systolic heart failure, hypertensive heart failure, etc. The Jess rule engine is used to automate the reasoning process. We aim to add more semantic rules in the future to cover medical knowledge more completely.

Contributions, success factors and KPIs

Another contribution of this work is the tight integration of flexible clinical pathways with medical decision support. Our study differs from contemporary workflow modeling approaches in the following respects: (1) we use a standard process modeling language, BPMN 2.0, to model the care process; it provides rich semantics (e.g., human tasks, rule tasks, and subprocesses) for modeling clinical activities and coordinating interactions among various healthcare entities. (2) Clinical processes are strictly defined in their overall structure, yet some groups of activities are loosely defined using ad-hoc subprocesses. The actual execution semantics are triggered by clinical context that is derived from previous medical tasks and the current environment. This contextual information is only available at runtime and is unique to each individual patient. In addition, an ad-hoc subprocess can be nested to handle complicated and highly dynamic scenarios. (3) The decision support provided during patient encounters is based on formal context and medical ontologies that are aligned with clinical guidelines. Thus, we can ensure the correctness of medical recommendations.

Many studies have discussed the factors leading to successful CDSS implementation (Berner 2009; Peleg and Tu 2006; Sittig et al. 2008). The most critical factors identified are: (1) capture of evidence in a machine-interpretable knowledge base, (2) computerized rather than paper-based decision support, (3) timely advice, (4) workflow integration, (5) responsiveness to user needs, (6) maintenance and extension, and (7) clinical effects and costs. Our ConFlexFlow framework covers most of these aspects, but evaluation of our system in terms of user satisfaction is beyond the scope of this dissertation. A key objective in developing and implementing such a system is to improve quality as reflected in concrete measures. Although a detailed discussion of quality is beyond the scope of the current work, some key

measures of quality (KPIs) for our purposes are: the number of treatment errors due to drug interactions (or allergies), the number of diagnosis errors, the number of cases of treatment not covered by a patient's insurance, the number of treatment failures due to lack of available resources, the complication rate per patient, patient satisfaction, etc. These metrics will inform the evaluation of the impact of the proposed system. One way to use these metrics is by means of a post hoc audit. For example, we can compare the number of treatment errors from drug interactions after the CDSS is in place against the corresponding number before installing the system, by examining the patient log over two periods of equal length.

Conclusions

The goal of this research is to study new ways of designing clinical decision support systems. We proposed a framework called ConFlexFlow and showed how flexible and adaptable clinical pathways can be designed, taking into account medical knowledge in the form of rules and detailed contextual information, to achieve a high-quality outcome. A clinical workflow delineates the main pathways to be taken; these pathways are selected during workflow execution based on rules that encapsulate medical knowledge. Furthermore, we developed a proof-of-concept prototype using the Drools framework to model and execute BPMN 2.0 processes. A full implementation is in progress. Although we use SWRL rules to model medical knowledge, we plan to use a mapping mechanism to make the knowledge convertible between Arden Syntax and SWRL, making our approach more acceptable in the medical domain. Next, we plan to enhance the current prototype and test it in a practical environment using quality metrics. An audit of past errors without a CDSS and of errors under CDSS operation can reveal the improvement realized by the new system.
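Such a post hoc audit can be sketched in a few lines; the log format (date, error type) and the dates below are assumptions made purely for illustration, not data from any hospital.

```python
# Sketch of the post hoc audit: count drug-interaction treatment errors in the
# patient log over periods before and after CDSS go-live. The log format
# (date, error type) and all values are illustrative assumptions.
from datetime import date

GO_LIVE = date(2012, 1, 1)

patient_log = [
    (date(2011, 3, 10), "drug_interaction"),
    (date(2011, 5, 2), "drug_interaction"),
    (date(2011, 8, 21), "diagnosis_error"),
    (date(2012, 2, 14), "drug_interaction"),    # after the CDSS is in place
]

def count_errors(log, error_type, start, end):
    """Number of errors of the given type with start <= date < end."""
    return sum(1 for day, kind in log if kind == error_type and start <= day < end)

before = count_errors(patient_log, "drug_interaction", date(2011, 1, 1), GO_LIVE)
after = count_errors(patient_log, "drug_interaction", GO_LIVE, date(2013, 1, 1))
print(before, after)    # 2 1
```

Comparing the two counts over periods of equal length is exactly the before/after comparison described above.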
Other issues of interest are how to create a contextual summary automatically for a doctor based on a patient's record, and how to allow a doctor to customize her alert settings so that the alerts are useful but not distracting.

Chapter 7

Ensuring Semantic Correctness of Processes using Mixed Integer Programming

In knowledge-intensive environments, it is important that Adaptive Process Management Systems (APMS) ensure error-free process execution and compliance with semantic constraints. However, most process design tools handle only syntactic constraints, which restricts their value in real-world applications considerably. This chapter proposes a novel approach to check the compliance of process models against semantic constraints, and the validity of process change operations, using a Mixed-Integer Programming (MIP) approach (Kumar et al. 2010; Kumar et al. 2012). The MIP formulation allows us to describe existential, dependency, ordering and various other relationships among tasks in a process, along with business policies, in a standard way. In addition to incorporating the semantic constraint specifications into a MIP formulation, we introduce three novel ideas in this approach: (1) the notion of a degree of compliance of processes based on a penalty function; (2) the concepts of full and partial validity of change operations; and (3) the idea of compliance by compensation. Thus, compensation operations derived from the compliance degree can transform a non-compliant process into a compliant one, both at design and execution time. We illustrate our approach in the context of a healthcare workflow as a way to reduce medical errors, and argue that it is more elegant than, and superior to, a pure logic-based approach. Complex scenarios with multiple concurrent processes (and constraints across them) for a single patient are also considered.

Introduction

Today's organizations often face continuous and unprecedented change in their environment. Hence, there is a strong demand for APMS that allow flexible adaptation of processes in business, healthcare, and other domains. Process adaptation is a strategy to deal with exceptional situations during workflow execution.
Thus, if such exceptions can be captured and modeled, processes can adapt to them automatically without human intervention. In addition, APMS technology should ensure error-free process execution and compliance with external regulations and internal business policies. Thus, two

conflicting goals need to be balanced: the need for control versus the need to provide sufficient flexibility for workflows to adapt to constantly changing environments.

Process adaptation is defined as the capability to react to uncertainty, either in a process model through changes or in a running process instance through deviation. The uncertainty may arise from the case data, the context, and real-time events. There are several ways to incorporate flexibility into a process design. First, ECA (Event-Condition-Action) rules are a popular approach to catch anticipated events and adapt to workflow exceptions (Bae et al. 2004; Müller et al. 2004); in addition, ad hoc changes are necessary when unanticipated events occur. Both allow a process instance to deviate from normal execution. Second, process flexibility can be provided by under-specification (Sadiq et al. 2005). An under-specified model is described as a list of tasks to be executed and a set of constraints that apply to them; at runtime, any instance that satisfies the constraints is valid. A third notion of flexibility is based on separating business policy from process control flow by parameterizing aspects of the process description with business policy instead of hard-coding them (Kumar and Yao 2009). Of course, it is important that process flexibility not violate syntactic and semantic constraints. Syntactic constraints refer to the control of process execution at the structural level. For example, by verifying the absence of deadlocks and inconsistent data in a process model at design time, an APMS can determine that it is syntactically correct. This is necessary to guarantee error-free execution of process instances both before and after making changes. Semantic constraints stem from domain-specific requirements and express semantic relationships between activities, such as presence, dependencies, and incompatibilities (Ly et al. 2009).
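The ECA style of adaptation can be sketched in a few lines; the rule structure, event names, and case data below are our own illustration, not the API of any specific APMS.

```python
# Minimal ECA (Event-Condition-Action) sketch: an anticipated event is caught,
# a condition over the case data is evaluated, and an adaptation action is
# applied to the running instance. All rule content is illustrative.
from dataclasses import dataclass, field

@dataclass
class Instance:
    data: dict = field(default_factory=dict)   # case data
    log: list = field(default_factory=list)    # adaptation history

# Each rule: (event name, condition over case data, action on the instance).
RULES = [
    ("lab_result",
     lambda data: data.get("ldl", 0) > 160,
     lambda inst: inst.log.append("insert task: statin review")),
]

def on_event(inst, event, payload):
    """Dispatch an event: update case data, then fire matching ECA rules."""
    inst.data.update(payload)
    for ev, cond, action in RULES:
        if ev == event and cond(inst.data):
            action(inst)               # deviate from normal execution

inst = Instance()
on_event(inst, "lab_result", {"ldl": 180})
print(inst.log)    # ['insert task: statin review']
```

This captures only the anticipated-event case; unanticipated events still require the ad hoc changes discussed above.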
As an example, such a constraint may state that a possible drug interaction between amoxicillin and oral contraceptives prohibits a patient from taking both medications within, say, 7 days of each other. In addition, similar constraints are required to ensure that process models are compliant with policies and regulations, say from the Food and Drug Administration. In general, the real-world regulatory environment is rather complicated. A variety of semantic constraints are issued by various governmental bodies having jurisdiction across different geographical areas. Depending on their nature, different sets of

constraints are applicable in different settings. For instance, a hospital may adopt nationwide guidelines published by the Agency for Healthcare Research and Quality (AHRQ) but also add site-specific clinical constraints. Hence, an APMS must manage the aggregate set of semantic constraints in a systematic way and be able to verify, for audit purposes, that making process changes will not lead to violations. The pioneering work of Ly et al. (2009) provides useful inspiration for our work, but it focuses more on a constraint framework for compliance support and formal compliance criteria. It also does not present a formal constraint specification language, relying instead on natural-language examples. The primary focus of this chapter lies in developing techniques for detecting and handling semantic constraint violations due to process changes. We aim to improve the efficiency and effectiveness of process audits to ensure compliance with regulations and policies. We illustrate our approach in the context of a healthcare workflow as a way to reduce medical errors.

Preliminaries

A running example

A process consists of tasks that need to be performed by resources to complete the process. Figure 7.1 (a) presents a simplified clinical pathway for proximal femoral fracture, adapted from Blaser et al. (2007), in BPMN notation (OMG 2006). This process model is coordinated by a series of tasks. After a patient is admitted (T1), she undergoes anamnesis and examination (T2). Depending upon the result of the examination, if the patient is under suspicion of having a proximal femoral fracture, she has to take a CT scan test (T5); otherwise, she is diagnosed and prepared for therapy (T3), followed by customized therapy (T4). Further, depending on the results of her imaging diagnosis, she is either treated with therapy (T6) or by surgery (T9). If surgery is needed, two prerequisite tasks, surgical planning (T7) and administering pain medication (T8) (e.g., Aspirin), should be carried out. Finally, the case is documented (T10) and the patient is discharged (T11). The various tasks are stored in the task repository (see Figure 7.1 (b)), including those not present in the current process model, such as T12, T13, etc.

Figure 7.1. A simplified clinical process for proximal femoral fracture

(a) Process model. The main process is triggered by a proximal femur fracture event: T1: Patient admission (R1); T2: Anamnesis & Examination (R2); if there is no clinical suspicion of proximal femoral fracture, T3: Symptom & Diagnosis (R2) followed by T4: Therapy A (R3); otherwise T5: CT scan (R4). If there is an indication of proximal femoral fracture and operation, T7: Surgical planning (R5, R6) and T8: Pain Medication (R5) execute in parallel, followed by T9: Surgery (R6, R7); otherwise T6: Therapy B (R3). Finally, T10: Documentation (R1) and T11: Discharge (R1). A head injury event triggers: T30: Pain Medication (R5); T31: CT Scan (R4); T32: Evaluate patient after 6 hours (R2); if the patient is responding well, T33: Continue medication (R5), else T34: Perform surgery (R2). A contractions event triggers: T51: Evaluate contractions (R5); if there are other complications, T52: Perform C-section (R2), else T53: Evaluate after 12 hours (R2), looping back while progress is slow.

(b) Task repository: T1, T2, T3, ..., T10, T11; T12: Tolerance test; T13: MRI test; T14: X-ray test; T15: Endoscopy; T16: Marcumar Medication; T17: Amoxicillin Medication; T18: Penicillin shots; T19: Transfer patient; T20: Narcotics.

(c) Resources: R1: Administrative staff; R2: Doctor; R3: Therapist; R4: Lab staff; R5: Nurse; R6: Surgeon; R7: Anesthetist.

The ordering relationship of tasks can be easily determined from this process model. For example, T1 is followed by T2, implying a sequential control flow. T7 and T8 are executed in parallel, denoted by an (AND-split, AND-join) pair of nodes. At an AND-split node, all branches are followed, and they meet at an AND-join node. T3 and T5 are exclusive alternatives branching from an XOR-split node. The semantics of an XOR-split node are that only one branch can be pursued at this node; hence, it is an exclusive choice structure. All the paths initiated at the XOR-split node must finish and meet at the corresponding XOR-join node. Further, each task is assigned to one or more resources. Resources may be

people (or roles), locations, equipment, and any other objects needed for executing tasks. In this example, we only consider the roles required for performing medical tasks. For example, administrative tasks (T1, T10, and T11) are executed by the administrative staff (R1); diagnosis-related tasks (T2, T3) are performed by doctors (R2); lab tests (T5) are performed by the lab staff (R4). All the resources involved in this clinical process are specified in Figure 7.1 (c); other types of resources can be added similarly. Figure 7.1 (a) also shows two additional processes, triggered during the execution of the proximal femoral fracture process when the following two events occur: head injury and onset of contractions. As a result, there are three related, interacting, concurrent processes. The steps of each new process are shown as before. The dotted lines from the start of the main process to the start of these two processes indicate that they were initiated subsequently to the main process. In general, there may be multiple concurrent processes of this nature corresponding to the various ailments of a patient. Based on the process model, process instances are initialized and executed at runtime for handling specific cases. A process instance records an actual execution of a process model. As an example, consider: patient Mary was admitted to the hospital, and then underwent anamnesis and examination. The result indicated that there was no suspicion of proximal femoral fracture, so she did not need imaging diagnosis. However, her symptoms were further examined and diagnosed. Then, she was given therapy and discharged. This is a typical instance following the process model in Figure 7.1. The actual execution path of this instance can be recorded as {start, T1, T2, T3, T4, T10, T11, end}. For a running process instance, a task can have the following status: activated, executing, done, aborted, suspended, and not activated.
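As a small illustration (the helper below is our own, not part of any APMS), a recorded execution path like Mary's can be checked against a before/after ordering requirement:

```python
# Illustrative helper (our own, not an APMS API): check that every occurrence
# of task `after` in a recorded execution path is preceded by task `before`.
def satisfies_ordering(trace, before, after):
    seen_before = False
    for task in trace:
        if task == before:
            seen_before = True
        elif task == after and not seen_before:
            return False
    return True

# Mary's recorded execution path from the example above.
mary = ["start", "T1", "T2", "T3", "T4", "T10", "T11", "end"]

print(satisfies_ordering(mary, "T1", "T2"))    # True: admission before examination
print(satisfies_ordering(mary, "T10", "T11"))  # True: documentation before discharge
print(satisfies_ordering(mary, "T11", "T10"))  # False
```

This kind of pairwise check is exactly what the ordering constraints formalized later in this chapter express declaratively.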
In general, one process model may have hundreds or thousands of instances running at the same time. Before a new or modified process model is used in practice, it must be both syntactically and semantically correct. Syntactic correctness is usually checked automatically in an APMS by workflow verification techniques (e.g., Petri-net techniques, propositional logic, and graph theory). In contrast, semantic correctness is usually validated manually by domain experts, according to requirements analysis and domain knowledge. However, once a number of process instances are enacted, they can be changed

in an ad hoc way by users (Reichert and Dadam 1998) or by triggered events (Müller et al. 2004) to handle exceptions. In general, each patient may encounter different medical situations and require a customized clinical pathway to handle a variety of scenarios. Thus, users must check ad hoc changes and ensure they do not violate semantic constraints. This is time- and effort-consuming, and also error-prone.

Semantic constraints

The design of a process must strike the right balance between control (through constraints) and flexibility (through change operations). Traditional APMS only consider syntactic constraints that deal with the structural correctness of the modified process, such as the absence of deadlocks. Many studies have been devoted to this research effort and have handled the problem well; a comprehensive comparison of these approaches is given in the survey by Rinderle et al. (2004). However, this is not sufficient. In knowledge-intensive application areas, if processes undergo frequent changes initiated by a variety of users, mechanisms that ensure semantic correctness by integrating domain knowledge into the APMS become necessary. Thus, in this dissertation our main focus is on semantic constraints. We define a semantic constraint as an inviolable, domain-specific restriction on a process. A process model or instance that violates semantic constraints may still be syntactically correct, but it is not applicable to a real-world scenario because it is semantically wrong. Take Figure 7.1 as an example: if a user removes "surgical planning" (T7), the process is still syntactically correct but semantically meaningless, since "surgery" (T9) depends on "surgical planning". This is particularly important for knowledge-intensive environments such as clinical settings, where domain knowledge plays a critical role in process design. For instance, a patient with a bacterial infection is usually given amoxicillin or clindamycin. But administration of amoxicillin and penicillin shots should be prohibited for patients who are hypersensitive to penicillin. In practice, a doctor may handle a number of cases each day and may not recall all the allergies of a specific patient; a certain patient may be sensitive to amoxicillin and react badly to it. Thus, this kind of semantic constraint should be

formally modeled, and the APMS should provide compliance checking mechanisms to avoid potential violations. Other examples of semantic constraints might include:

- Symptom examination and diagnosis should be performed before therapy.
- A patient with a cardiac pacemaker is not allowed to have an MRI test.
- Administration of Aspirin and Marcumar within 5 days of each other is prohibited, to avoid possible drug interactions.

During process design, the semantic constraints for a particular process should be acquired from domain experts in an English-like language and then transformed into a formal language interpretable by process management systems. A compliance checking mechanism should be developed to reduce the constraint violations that arise due to process changes made at design or execution time. This mechanism should also be incorporated into any adaptive process management framework. In this dissertation, since our focus is on semantic constraints, we assume that syntactic constraints are not violated after an ad hoc change is made.

Our contributions

Our work stems from the need to develop a formal framework for managing semantic constraints, both domain-related and of a regulatory nature, and for checking the compliance of processes with these constraints throughout their lifecycle, from design through the stages of execution. Our proposed approach makes several contributions. First, we propose a formal specification language to define semantic constraints in a process. It has considerable expressive power and is capable of expressing their various properties (e.g., commutativity and transitivity). Our specification language is more expressive than that of Ly et al. (2008), which can only describe two types of semantic constraints: exclusion and dependency. In contrast, our approach covers a variety of constraints for the presence of tasks, dependencies among them, etc. Second, we differentiate model- and instance-level semantic constraints, since some constraints should only be applied to specific cases. Third, we propose three novel ideas: the notion of a degree of compliance of a process, the concepts of full and partial validity of change operations, and the idea of compliance by

compensation. Fourth, based on the above ideas, we use mixed-integer programming to check: (a) the semantic compliance of a process model and its evolution; and (b) whether the ad hoc changes made to running process instances are semantically valid. If they are not, we calculate the minimal set of compensation operations needed, based on the degree of non-compliance, to transform a non-compliant process into a compliant one. We illustrate our approach in the context of a healthcare workflow as a way to reduce medical errors, and argue that it is more elegant than, and superior to, a pure logic-based approach.

Formal Specification of Semantic Constraints

The starting point of our work is a formal language for representing semantic and regulatory constraints. The importance of doing so has also been noted by Ly et al. (2008), but only dependency and exclusion relationships are considered in their study. Inspired by the work of Lu et al. (2009), where constraints are employed to specify interdependencies among tasks and validate process variants, we use them to model semantic constraints. Further, our study extends this work by modeling resource assignments and obligations in a similar way. The constraints are later integrated into a mixed-integer programming (MIP) formulation with an objective function.

Specification of semantic constraints

Table 7.1 presents the formal specification of a comprehensive set of semantic constraints and their meanings. We consider four types of semantic constraints: presence and interdependencies, ordering, resource assignment, and obligations. Next, we introduce the decision variables needed to model them. For presence and interdependence constraints, we consider each task Ti as a propositional variable ranging over the domain Di = {0, 1} as follows: Ti = 1 if task Ti is executed, and Ti = 0 otherwise. With this notation, we can use equations or inequalities to represent other constraints for coexistence, exclusive choice, dependency, etc., which are useful in expressing semantic relationships. For example,

"administration of Marcumar" (T1) and "administration of Aspirin" (T2) are incompatible because these two drugs can have an adverse interaction. Then, we can use an exclusive choice constraint to specify this relationship, e.g., T1 + T2 = 1, which ensures that only one of the two tasks will be executed at execution time.

Table 7.1. Formal specification of semantic constraints (*M: commutative; T: transitive)

- mandatory(Ti): Ti must be executed. Specification: Ti = 1. Property: N/A
- forbidden(Ti): Ti is not allowed to be executed. Specification: Ti = 0. Property: N/A
- coexist(Ti, Tj): both or none of Ti and Tj are executed. Specification: Ti = Tj. Property: M, T
- choice(Ti, Tj): exactly one of Ti and Tj must be executed. Specification: Ti + Tj = 1. Property: M
- choice(T1, T2, ..., Tn, m) [n ≥ m]: exactly m of T1, T2, ..., Tn should be executed. Specification: T1 + T2 + ... + Tn = m. Property: N/A
- dependency(Ti, Tj): the presence of Ti imposes the restriction that Tj must also be included. Specification: Ti ≤ Tj. Property: T
- exclusion(Ti, Tj): at most one of Ti and Tj can be executed. Specification: Ti + Tj ≤ 1. Property: M
- cardinality(Ti, min, max) [max ≥ min]: Ti is executed between min and max times. Specification: min ≤ Ni ≤ max, where Ni counts the executions of Ti. Property: N/A
- sequence(Ti, Tj): Tj must be executed after Ti, if both Ti and Tj are present. Specification: Ti = 1 ∧ Tj = 1 ⇒ Si,j = 1. Property: T
- resourceAssignment(Ti, Rj): Ti must be performed by the resource Rj. Specification: RAi,j = 1. Property: N/A
- resourceCapability(Ti, Rj): Rj has the capability to perform Ti. Specification: RCi,j = 1. Property: N/A
- obligation(p1 ∧ ... ∧ pn ⇒ d1 ⊗ d2): if the premises p1, ..., pn all hold (e.g., two given tasks are both executed), then the primary obligation d1 must be fulfilled; if d1 is not fulfilled, then the reparation d2 must be. Property: N/A
- obligation (resource form): if Ti is executed, it must be performed by the resource Rj; if Rj is not available, then Ti should be performed by Rk. Property: N/A

For ordering constraints, we define the sequential relationship between tasks Ti and Tj using the decision variable Si,j with domain {0, 1} as follows: Si,j = 1 if Ti is executed before Tj, and Si,j = 0 otherwise. Obviously, we must ensure Si,j + Sj,i ≤ 1, because only one kind of ordering can exist between two tasks. Besides, we should enforce the transitivity property of the sequence constraint, i.e., Si,j = 1 ∧ Sj,k = 1 ⇒ Si,k = 1. The symbol ⇒ specifies an "if-then" relationship, which can be enforced

with logic constraints. The ordering constraints represent the most common scenarios, where some activity should be carried out before another activity. For resource-related constraints, we define a resource assignment constraint between task Ti and resource Rj using the decision variable RAi,j with domain {0, 1} as follows: RAi,j = 1 if task Ti is assigned to resource Rj, and RAi,j = 0 otherwise. This constraint requires that a task be assigned to a specific resource, but its execution can involve other resources as well. In general, a task can be assigned to several resources; for example, a surgery is performed by a team of a surgeon, a nurse, and an anesthetist. If a task can be assigned to only one resource, then we can specify RAi,1 + RAi,2 + ... + RAi,n = 1, where n is the total number of resources. Further, we define a resource capability constraint between task Ti and resource Rj using the decision variable RCi,j with domain {0, 1} as follows: RCi,j = 1 if resource Rj has the capability to perform task Ti, and RCi,j = 0 otherwise. This decision variable can be used in a constraint to limit a resource to performing only certain tasks that it is capable of. Thus, RAi,j ≤ RCi,j indicates that if resource Rj is assigned to task Ti, then Rj must have the capability to perform Ti. To express obligation constraints (Tan and Thoen 2002), we use the syntax p1 ∧ ... ∧ pn ⇒ d1 ⊗ d2 ⊗ ... ⊗ dm to define the violations and resultant reparation policies, where p1, ..., pn are the premises, or propositions in logic, and the conclusion captures obligations and normative positions in response to violations of an obligation. It means that if the premises are true, a party is obliged to perform d1; if d1 is not fulfilled, then a secondary obligation d2 should be fulfilled, and so on. An atomic proposition can be a task presence constraint (e.g., Ti = 1) or a resource assignment statement (e.g., RAi,j = 1). Multiple atomic propositions can be connected using logical operators, including AND, OR, and negation, to form the premises of the constraint. Similarly, a deontic proposition in the conclusion can be any type of constraint.
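The flavor of the formulation can be sketched in a few lines of Python. This toy version replaces a real MIP solver with exhaustive search over the binary task variables; the constraint names refer to Table 7.2, but the penalty weights are our own illustrative choices, not values from this dissertation.

```python
# Toy sketch of the compliance check: constraints over binary task variables
# plus a penalty objective, solved by exhaustive search instead of a real MIP
# solver. Constraint names refer to Table 7.2; penalty weights are illustrative.
from itertools import product

TASKS = [1, 3, 4, 7, 8, 9, 16]          # a subset of the task repository in Figure 7.1

CONSTRAINTS = [                          # (name, predicate over an assignment, penalty)
    ("SC1", lambda t: t[1] == 1, 10),            # patient admission is mandatory
    ("SC2", lambda t: t[4] <= t[3], 5),          # Therapy A requires prior diagnosis
    ("SC4", lambda t: t[9] <= t[7], 5),          # surgery requires surgical planning
    ("SC5", lambda t: t[16] + t[8] <= 1, 8),     # Marcumar/Aspirin drug interaction
]

def penalty(assign):
    """Degree of non-compliance: total penalty of all violated constraints."""
    return sum(w for _, ok, w in CONSTRAINTS if not ok(assign))

def best_completion(fixed):
    """Minimize the penalty over all completions of a partially fixed configuration."""
    free = [i for i in TASKS if i not in fixed]
    best = None
    for bits in product((0, 1), repeat=len(free)):
        assign = {**fixed, **dict(zip(free, bits))}
        p = penalty(assign)
        if best is None or p < best[0]:
            best = (p, assign)
    return best

# Ad hoc change: do surgery (T9) with Aspirin (T8) but delete surgical planning (T7).
# SC4 cannot be satisfied, so even the best completion is non-compliant.
p_bad, _ = best_completion({9: 1, 8: 1, 7: 0})
print(p_bad)    # 5

# Compensation: re-inserting T7 restores full compliance.
p_ok, _ = best_completion({9: 1, 8: 1, 7: 1})
print(p_ok)     # 0
```

A real MIP solver replaces the exhaustive search, but the ideas carried through this chapter are visible here: the degree of compliance is the penalty of the best completion, and a compensation operation (re-inserting T7) drives that penalty back to zero.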

Properties of semantic constraints

Table 7.1 also specifies properties of semantic constraints, including commutativity and transitivity. Our definitions provide a well-founded representation of these properties that lends itself well to reasoning. For example:

For commutativity:
coexist(T_i, T_j) ⇔ coexist(T_j, T_i), since T_i = T_j is equivalent to T_j = T_i
choice(T_i, T_j) ⇔ choice(T_j, T_i), since T_i + T_j = 1 is equivalent to T_j + T_i = 1
exclusion(T_i, T_j) ⇔ exclusion(T_j, T_i), since T_i + T_j <= 1 is equivalent to T_j + T_i <= 1

For transitivity:
coexist(T_i, T_j), coexist(T_j, T_k) ⇒ coexist(T_i, T_k), since T_i = T_j and T_j = T_k imply T_i = T_k
dependency(T_i, T_j), dependency(T_j, T_k) ⇒ dependency(T_i, T_k), since T_i <= T_j and T_j <= T_k imply T_i <= T_k
sequence(T_i, T_j) and sequence(T_j, T_k) ⇒ sequence(T_i, T_k), which is enforced by the logic constraint S(i,j) = 1 ∧ S(j,k) = 1 ⇒ S(i,k) = 1

Thus, the transitivity property can facilitate the representation of coexistence, dependency, and sequence relationships among multiple tasks.

Model-level and instance-level semantic constraints

Among all the semantic constraints obtained from domain experts, we distinguish between two types: model-level and instance-level constraints. The model-level constraints apply to all instances derived from the process model, while the instance-level ones apply only to certain instances, based on the case data. Thus, checking the compliance of a process model with model-level constraints should be performed at design time; compliance with instance-level constraints, however, can be checked only at initialization time, when the case data becomes available. For example, a hospital may require that the task "sign an agreement

form" be performed for all patients before surgery, while only elderly patients are required to take a "tolerance test" before surgery. The first constraint applies to all instances derived from the proximal fracture workflow model, while the second one applies only when the patient's age is more than, say, 70. With different case data, each process instance can carry a different set of semantic constraints even though the instances are initiated from the same model. Thus, separating these two types of constraints is important for compliance checking. Tables 7.2 and 7.3 provide examples of semantic constraints at the model and the instance level, respectively, using our specification language. They are derived from medical domain knowledge. Others can be represented similarly with the expressive power of our language.

Table 7.2. Model-level semantic constraints for the clinical process in Figure 7.1

# | Semantic meaning | Specification | Category
SC1 | Patient admission (T_1) is mandatory. | T_1 = 1 | Presence and dependency constraint
SC2 | Therapy A (T_4) requires the inclusion of the previous diagnosis (T_3). | T_4 <= T_3 | Presence and dependency constraint
SC3 | If a patient is suspected of proximal femoral fracture, at least one imaging diagnosis test should be carried out. | T_3 = 0 ∨ T_5 + T_13 + T_14 + T_15 >= 1 | Presence and dependency constraint
SC4 | Surgery (T_9) requires the inclusion of surgical planning (T_7). | T_9 <= T_7 | Presence and dependency constraint
SC5 | Marcumar medication (T_16) and Aspirin medication (T_8) are exclusive since they have a drug interaction. | T_16 + T_8 <= 1 | Presence and dependency constraint
SC6 | Documentation (T_10) is required before discharge (T_11) or transfer (T_19). | S(10,11) = 1 ∨ S(10,19) = 1 | Ordering constraint
SC7 | Patient admission (T_1) should be performed before the examination (T_2). | S(1,2) = 1 | Ordering constraint
SC8 | Surgical planning (T_7) must be executed before surgery (T_9). | S(7,9) = 1 | Ordering constraint
SC9 | An administrative staff (R_1) does not have the capability to do an examination (T_2) for patients. | RC(2,1) = 0 | Resource-related constraint
SC10 | Lab staff (R_4) must be assigned to the CT scan (T_5). | RA(5,4) = 1 | Resource-related constraint
SC11 | A surgeon (R_6) must be assigned for surgery (T_9). | RA(9,6) = 1 | Resource-related constraint
SC12 | If a patient is diagnosed (T_2), she should be treated with therapy (T_4 or T_6); otherwise, surgery (T_9). | T_2 ⇒ (T_4 ∨ T_6) ⊗ T_9 | Obligation
SC13 | If a patient is admitted (T_1), she must be discharged (T_11) or transferred (T_19). | T_1 ⇒ T_11 ⊗ T_19 | Obligation

Notice from Table 7.3 that constraints 2 and 5 are in conflict for a pregnant patient who has also suffered a head injury. In fact, this situation arises in the model of Figure 7.1 when the first two, or all three, processes are triggered and running in parallel for the pregnant patient. To resolve such conflicts we

associate a priority number with each constraint in the last column of Table 7.3. A higher priority reflects greater importance of the constraint, or equivalently a larger penalty for its violation. Thus, in a scenario where these two constraints apply, the one with the higher priority would prevail. We will show later how our proposed solution incorporates and resolves priorities in this way.

Table 7.3. Instance-level semantic constraints for the clinical process in Figure 7.1

# | Semantic meaning (condition is underlined) | Specification | Priority
1 | A patient hypersensitive to penicillin should not be given amoxicillin or a shot of penicillin. | T_17 = 0 && T_18 = 0 | 1
2 | A patient who is pregnant must not have a CT scan. | T_5 = 0 | 3
3 | A patient with a cardiac pacemaker is not allowed to have an MRI test. | T_13 = 0 | 2
4 | A patient older than 70 should take a tolerance test prior to any operative treatment. | T_12 = 1 && S(12,9) = 1 | 1
5 | A patient who has suffered a head injury must have a CT scan. | T_5 = 1 | 5
6 | A patient with a head injury must not be given aspirin for pain. | T_8 = 0 | 4

Constraint validation

To describe a process model, an end user will normally consult with domain experts and define a set of semantic constraints, among which implicit, redundant, and conflicting ones may exist. Thus, we propose an algorithm to infer implicit constraints and to resolve redundancy and conflict issues. This algorithm ensures that a complete and sound set of semantic constraints is used for compliance checking. The constraint definitions are treated as a system of equations, and existing constraints can be composed to derive implicit constraints. We use ⊕ to denote the composition of semantic constraints. For example:

SC1: T_1 and T_2 must coexist (i.e., T_1 = T_2),
SC2: T_2 is dependent on T_3 (i.e., T_2 <= T_3).

From SC1 and SC2, one can derive a new constraint: SC' = SC1 ⊕ SC2 = (T_1 <= T_3), i.e., T_1 is also dependent on T_3. This constraint SC' is implicit because it is inferred from existing ones. Further, we can generalize this example as: (T_i = T_j) && (T_j <= T_k) ⇒ (T_i <= T_k). Detecting implicit constraints can help remove redundant constraints and detect conflicting ones. If there exists another constraint in the system, say SC3: T_1 <= T_3, then it should be removed because it creates a redundancy (i.e., SC' is the same as SC3).

Conflicting constraints, on the other hand, will lead to inconsistencies and must be resolved before they can be applied. For the above example, if an explicit constraint SC4: T_1 > T_3 were specified, it would create a conflict, since T_1 > T_3 and the derived T_1 <= T_3 cannot both be true. We can write this as SC' ⊕ SC4 = fail, or SC1 ⊕ SC2 ⊕ SC4 = fail. We propose an algorithm for validating a set of semantic constraints based on constraint composition, as shown in Table 7.4. SC[i] denotes the i-th constraint in the semantic constraint set vector SC. If two constraints conflict (line 4), an alert message is generated by the algorithm to notify the user (line 5); if they are redundant (line 7), one of them is removed (line 10). We do not discuss the details of which of the redundant constraints to remove. If they are independent, their composition is null, which is denoted by ∅, and no action is needed (line 14). For example, (T_1 = T_2) ⊕ (T_3 + T_4 <= 1) = ∅. The time complexity of this algorithm is Θ(n³), where n is the number of semantic constraints. The output of this algorithm is a complete and sound set of semantic constraints. Next we discuss how adaptive processes can be modeled.

Table 7.4. Algorithm for semantic constraint validation

Algorithm Constraint_validation
Input: initial constraint set vector SC
Output: complete and sound constraint set vector SC
1:  Define n = SC.size
2:  FOR i = 1 to n                    // go through all the constraints in the set
3:    FOR j = i+1 to n
4:      IF SC[i] ⊕ SC[j] = fail       // composition failed from a constraint conflict
5:        Print "constraints SC[i] and SC[j] conflict."
6:        break;
7:      ELSE IF SC[i] ⊕ SC[j] ≠ ∅     // redundant constraints
8:        FOR k = j+1 to n
9:          IF SC[k] == SC[i] ⊕ SC[j]
10:           SC.remove(SC[k])        // remove the redundant constraint
11:         END IF
12:       END FOR
13:     ELSE
14:       do nothing;                 // independent constraints
15:     END IF
16:   END FOR
17: END FOR
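Because all task variables are binary, conflict and redundancy can also be checked semantically by enumeration, which is a useful cross-check of the composition rules. The sketch below (ours, for illustration; it enumerates assignments rather than composing constraints symbolically as Table 7.4 does) detects that SC3 is implied by SC1 and SC2, and that SC4 is inconsistent with them:

```python
from itertools import product

TASKS = (1, 2, 3)

def assignments():
    """All binary task assignments over TASKS."""
    for bits in product((0, 1), repeat=len(TASKS)):
        yield dict(zip(TASKS, bits))

def conflict(cs):
    """A constraint set conflicts if no assignment satisfies all of it."""
    return not any(all(c(T) for c in cs) for T in assignments())

def redundant(c, others):
    """c is implied by `others`: every model of `others` satisfies c."""
    return all(c(T) for T in assignments() if all(o(T) for o in others))

SC1 = lambda T: T[1] == T[2]   # coexist(T1, T2)
SC2 = lambda T: T[2] <= T[3]   # T2 is dependent on T3
SC3 = lambda T: T[1] <= T[3]   # candidate: implied by SC1 and SC2?
SC4 = lambda T: T[1] > T[3]    # conflicts with the derived constraint

print(redundant(SC3, [SC1, SC2]))   # True  -> SC3 can be removed
print(conflict([SC1, SC2, SC4]))    # True  -> SC4 is inconsistent
```

Enumeration is exponential in the number of tasks, so it only serves as a sanity check on small fragments; the Θ(n³) composition algorithm avoids that blow-up.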

7.4. Modeling Adaptive Processes

As noted earlier, changes made to a process include changes at the model level (e.g., through process model evolution) and at the instance level (e.g., ad hoc changes in response to exceptions or events). Further, complicated scenarios can arise when both types of changes occur concurrently. At design time, a set of semantic constraints is assigned to a process model, with which the model and all running process instances initiated from it have to comply. In this section, we describe how an adaptive process can be represented using constraints. With this formalism, we introduce auxiliary variables to relax these constraints so that changes can be accommodated in a process and its compliance with semantic constraints can be checked automatically. These constraints are also integrated into the MIP formulations later.

Formal representation of process models

A process model describes the coordination of tasks that need to be performed by resources to complete the process. It can be defined in a formal language such as BPMN or BPEL. We first define a process model formally.

Definition 1 (General process model). Let P = (T, N, E, Res, Dat, SC) denote a process model, where T = {T_1, T_2, ..., T_n} denotes the set of all tasks or activities, N = {N_1, N_2, ..., N_n} denotes the set of all control nodes, E ⊆ (T×N) ∪ (N×T) ∪ (N×N) ∪ (T×T) represents the set of all edges connecting tasks and control nodes, Res denotes the set of all resources, Dat denotes the related data for this process model, and SC denotes the set of semantic constraints associated with this process model. Nodes can be of six types; thus, Type(N) ∈ {start, end, sequence, parallel, choice, loop}. The start and end nodes are required for each process model. T = (idata, odata, State, Res) denotes a task with input

data, output data, state, and the resource executing it. The states of a task are: activated, executing, done, aborted, suspended, and not activated.

Figure 7.2. Process modeling structures — (a) sequence, (b) choice, (c) parallel — and their representation by constraints

Any process can be represented by a set of basic structures, shown in Figure 7.2, including sequence, choice, and parallel. When two tasks are in an immediate sequence, T_1 = T_2 denotes that both or neither must be executed, and S(1,2) = 1 gives their ordering, if both are present (Figure 7.2 (a)). Similarly, when tasks are in a choice structure, the flow constraints (e.g., T_1 = T_2 + T_3) require that only one path be taken at the branch after T_1 (Figure 7.2 (b)); when the tasks are in a parallel structure, both branches after T_1 are taken (Figure 7.2 (c)). We do not represent the loop structure because it is difficult to translate into MIP formulas, and few semantic constraints involve loop relationships. Figure 7.3 shows how the process model of Figure 7.1 is represented, including both process flow and resource assignment constraints. This approach is very expressive and flexible, since it can describe under-specified process models: we can specify the required structural relationships and leave the unknown parameters open.

Process flow constraints
Task relationship: Start=T1; T1=T2; T2=T3+T5; T3=T4; T5=T6+T7; T7=T8; T8=T9; T10=T11; T11=End; Start=1; End=1;
Sequence relationship: S(1,2)=1; S(2,3)=1; S(2,5)=1; S(3,4)=1; S(5,6)=1; S(5,7)=1; S(5,8)=1; S(7,9)=1; S(8,9)=1; S(4,10)=1; S(6,10)=1; S(9,10)=1; S(10,11)=1;
Resource constraints: RA(1,1)=1; RA(1,2)=1; RA(3,3)=1; RA(4,3)=1; RA(5,4)=1; RA(6,3)=1; RA(7,5)=1; RA(7,6)=1; RA(8,5)=1; RA(9,6)=1; RA(9,7)=1; RA(10,1)=1; RA(11,1)=1

Figure 7.3. Formal representation of process structure constraints
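The flow-equation style of Figure 7.3 can be exercised on a toy model. The sketch below (illustrative only; the four-task model T1 → choice(T2 | T3) → T4 is ours, not Figure 7.1) enumerates the binary task assignments that satisfy the flow constraints, confirming that a choice split admits exactly one execution path per branch:

```python
from itertools import product

# Flow constraints for a 4-task model: T1 -> choice(T2 | T3) -> T4,
# written in the equation style of Figure 7.3.
flow = [
    lambda T: T[1] == 1,            # start node: T1 always runs
    lambda T: T[1] == T[2] + T[3],  # choice split: exactly one branch
    lambda T: T[4] == T[2] + T[3],  # choice join
    lambda T: T[4] == 1,            # end node
]

feasible = []
for bits in product((0, 1), repeat=4):
    T = dict(zip((1, 2, 3, 4), bits))
    if all(c(T) for c in flow):
        feasible.append(T)

print(len(feasible))   # 2 executions: one via T2, one via T3
```

Replacing the choice equation with the parallel form (T[2] == T[1] and T[3] == T[1]) would instead leave a single feasible execution in which both branches run.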

Change operations

Process change operations (or adaptation patterns) are actions issued by users in an ad hoc manner or triggered by ECA rules in reaction to exceptional events (Müller et al. 2004). For example, when a patient is found to have high blood pressure during the fracture examination, an additional treatment is needed to lower her blood pressure. The ECA rule can be written as:

Event: blood pressure status
Condition: blood_pressure >= 140/85 mmHg
Action: Insert ("Administer Diuretics Medication") after Examination

For maximum flexibility, a variety of adaptation patterns should be supported in an APMS. A comprehensive survey of available change patterns and change support features, based on empirical evidence from large case studies, was conducted in (Weber et al. 2008). In addition, Kumar and Yao (2009) presented a variety of operations that can be performed to change a process model or instance. We define primitive change operations as atomic operations at the task level, since they are indivisible units for making changes to a process and serve as a foundation for composing more advanced adaptation patterns. These primitive change operations are summarized in Table 7.5. This is a simplified version of FlexVar as described in Table 5.2 (Chapter 5), since our focus here is semantic compliance checking.

Table 7.5. Formal specification of primitive change operations

Expression | Meaning | Formal specification
insert (T, N_1, N_2, [cond*]) | Insert T between N_1 and N_2, i.e., after N_1 and before N_2 | T = 1, S(N_1,T) = 1, S(T,N_2) = 1
delete (T) | Remove T from the process | T = 0, S(N_1,T) = 0, S(T,N_2) = 0 (note: [N_1, N_2] denotes the original position of T)
move (T, N_1, N_2, [cond*]) | Move T to a new position between N_1 and N_2 | S(N_1,T) = 1, S(T,N_2) = 1; remove any sequence constraints for the former position of T
swap (T_1, T_2) | Swap the positions of T_1 and T_2 in the process | S(N_1,T_2) = 1, S(T_2,N_2) = 1, S(N_3,T_1) = 1, S(T_1,N_4) = 1 (note: [N_1, N_2] and [N_3, N_4] denote the original positions of T_1 and T_2, respectively)
setResource (T_1, R_1) | Set the resource for task T_1 to resource R_1 | RA(1,1) = 1

*cond is an optional parameter that should be specified as the condition if T is inserted in a choice or loop structure

These operations provide process flexibility at both the model and the instance level. A change can be associated with a specific process model, identified by a process_id, or with a process instance, identified by an instance_id. Change at the model level is usually permanent and permits long-term flexibility, while change at the instance level permits short-term flexibility, so that a process instance can deviate temporarily from the standard process model. For example, applying delete(T) to a process instance only allows that instance to skip task T, while other process instances still execute T. If we apply this change at the model level, it removes task T permanently from the model, and all instances derived from the model are affected accordingly. Detailed steps for performing change operations, and validation of change conflicts at the syntactic level, are discussed in Chapter 5. For example, if both delete(T) and move(T, T_1, T_2) were to be carried out on the same process, this would cause a conflict, because task T can either be deleted or moved, but not both. The incompatibility table (Figure 5.3, discussed in Chapter 5) is used to check whether any two operations are incompatible. However, checking the compliance of change operations against semantic constraints is not possible with this mechanism.

In this study, we allow a set of change operations to be issued together, instead of just one operation at a time. In general, a single primitive change operation may violate the constraint set while a set of change operations may actually be compliant. For example, if a user wishes to delete "surgical planning", the dependency semantic constraint SC_4 in Table 7.2 is violated. However, if he deletes both "surgical planning" and "surgery", no constraint is violated, since the process instance becomes compliant after this change. Thus, it is more reasonable to handle a set of change operations together.

Modeling relaxation of process representation constraints

A process represented by constraints, as shown in Figure 7.3, may be relaxed to accommodate change operations as long as the semantic constraints are not violated. We allow relaxation of process structure constraints using auxiliary variables, as described next.
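The benefit of checking a change set as a whole can be made concrete with a small Python sketch (ours, for illustration; a primitive "operation" here simply fixes a task's 0/1 status, and the task numbers follow Table 7.2):

```python
from itertools import product

SC4 = lambda T: T[9] <= T[7]   # surgery (T9) requires surgical planning (T7)

def apply_ops(T, ops):
    """Apply change operations, each of which fixes a task's 0/1 status."""
    new = dict(T)
    new.update(ops)
    return new

instance = {7: 1, 9: 1}                          # planning and surgery both present
after_one  = apply_ops(instance, {7: 0})         # delete T7 only
after_both = apply_ops(instance, {7: 0, 9: 0})   # delete T7 and T9 together

print(SC4(after_one))    # False: deleting planning alone violates SC4
print(SC4(after_both))   # True: deleting both tasks together is compliant

# A compensating operation for the lone deletion can be found by search
# over the remaining free task (T9):
for bits in product((0, 1), repeat=1):
    fixed = apply_ops(after_one, {9: bits[0]})
    if SC4(fixed):
        print({9: bits[0]})   # {9: 0}: also delete surgery
        break
```

The compensation search foreshadows the penalty-and-compensation machinery formalized later in this chapter; in the actual system the MIP solver performs this search over all relaxation variables at once.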

For each task T_i, the relaxation variables TX_i and TY_i denote allowed changes to its status, and SX(i,j) and SY(i,j) denote allowed changes to its associated execution sequence. Their meanings are defined as follows: a TX_i or SX(i,j) value of 1 represents removal of a task or a sequence relationship, while a TY_i or SY(i,j) value of 1 means that a task or a sequence relationship is added, respectively. Both types of changes are possible. Then, we can replace the occurrences of T_i in the process flow constraints with (T_i + TX_i − TY_i) to allow relaxation of task status. Likewise, we can replace S(i,j) with (S(i,j) + SX(i,j) − SY(i,j)) to allow relaxation of the execution sequence. Similarly, we use RX(i,j) and RY(i,j) to define the change of a resource assignment relationship between task T_i and resource R_j, where RX(i,j) = 1 represents removal of a resource assignment relationship, and RY(i,j) = 1 its addition. Then we can replace RA(i,j) with (RA(i,j) + RX(i,j) − RY(i,j)) to allow relaxation of a resource assignment. Consequently, we can relax the process model in Figure 7.3 with the above variables, as shown in Figure 7.4. In this way, the status of tasks and their relationships is allowed to change so as to be compliant with the assigned semantic constraints. Next, we show how an adaptive process represented by relaxed constraints can be combined with semantic constraints to form a MIP with an objective function. This further lets us check whether certain changes are allowable without constraint violations, and if not, minimize the number of violations.
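A tiny brute-force stand-in for the MIP illustrates the relaxation mechanics (illustrative only; the single flow constraint and the unit penalty are hypothetical). Suppose a model omits patient admission, so its flow equations force T1 + TX1 − TY1 = 0, while a semantic constraint demands T1 = 1; minimizing TX1 + TY1 finds the cheapest repair:

```python
from itertools import product

# Relaxed flow constraint vs. a semantic constraint, solved by enumeration
# over the three binary variables (a toy stand-in for the MIP solver).
best = None
for T1, TX1, TY1 in product((0, 1), repeat=3):
    if T1 + TX1 - TY1 != 0:   # relaxed process-flow constraint
        continue
    if T1 != 1:               # semantic constraint: admission is mandatory
        continue
    penalty = TX1 + TY1
    if best is None or penalty < best[0]:
        best = (penalty, TX1, TY1)

print(best)   # (1, 0, 1): TY1 = 1, i.e., add task T1, at penalty 1
```

The solver's suggested action is read off the nonzero relaxation variables, exactly as in the design-time example later in this chapter.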

Process flow constraints
Task relationship: Start=T1+TX1-TY1; T1+TX1-TY1=T2+TX2-TY2; T2+TX2-TY2=T3+TX3-TY3+T5+TX5-TY5; T3+TX3-TY3=T4+TX4-TY4; Start=1; End=1;
Sequence relationship: S(1,2)+SX(1,2)-SY(1,2)=1; S(2,3)+SX(2,3)-SY(2,3)=1; S(2,5)+SX(2,5)-SY(2,5)=1; S(3,4)+SX(3,4)-SY(3,4)=1; S(5,6)+SX(5,6)-SY(5,6)=1; S(5,7)+SX(5,7)-SY(5,7)=1; S(5,8)+SX(5,8)-SY(5,8)=1;
Resource constraints: RA(1,1)+RX(1,1)-RY(1,1)=1; RA(1,2)+RX(1,2)-RY(1,2)=1; RA(3,3)+RX(3,3)-RY(3,3)=1; RA(4,3)+RX(4,3)-RY(4,3)=1; RA(5,4)+RX(5,4)-RY(5,4)=1; RA(6,3)+RX(6,3)-RY(6,3)=1; RA(7,5)+RX(7,5)-RY(7,5)=1;

Figure 7.4. Representation of relaxed process representation constraints

7.5. Process Compliance Checking using MIP

In this section, we introduce a Mixed Integer Programming (MIP) approach to check the compliance of process models and instances against a sound set of semantic constraints. First, we give formal definitions of important concepts in process compliance checking. Then we describe how the MIP approach is used to model semantic compliance verification, using a penalty score as an objective function to be minimized. In this study, we use CPLEX (IBM 2010) as the tool to solve the MIP formulations. A series of related examples is used to illustrate our approach.

Preliminaries

Definition 2 (Semantic compliance of a process model or instance). Let P be a process model, SC = {SC_1, SC_2, ..., SC_k} be a set of semantic constraints assigned to P, and I be an instance derived from P. Then, P(I) is compliant with SC_i, formally P(I) ⊨ SC_i, if SC_i is not violated by P(I) as represented in MIP formulas; otherwise, P(I) is not compliant with SC_i, formally P(I) ⊭ SC_i. Thus, P(I) ⊨ SC if P(I) ⊨ SC_i for all i = 1..k, i.e., P(I) is compliant with every semantic constraint in the set SC; and P(I) ⊭ SC if P(I) ⊭ SC_i for at least one i, i.e., P(I) is not compliant with at least one semantic constraint in the set SC.

Definition 3 (Compliance degree of a process model or instance). Let Φ[P] or Φ[P(I)] denote the compliance degree of a process model P or its instance P(I), respectively. It is defined as the number of compliant constraints divided by the total number of semantic constraints for P(I), formally, Φ[P(I)] = (Σ_{i=1..k} f(P(I), SC_i)) / k

, where SC = {SC_1, SC_2, ..., SC_k} and f is a binary function that gives the true (1) or false (0) result of the semantic compliance check of Definition 2. We will use Φ to compare the degree of compliance among several process models or instances. The non-compliance degree is thus 1 − Φ. Naturally, it follows that:

If P(I) ⊨ SC, then Φ[P(I)] = 1, i.e., the process is fully compliant.
If P(I) ⊭ SC, then Φ[P(I)] ∈ [0, 1), i.e., the process is partially or non-compliant.

Thus, when Φ[P(I)] = 0, P(I) is fully non-compliant; alternatively, when Φ[P(I)] = 1, P(I) is fully compliant. In the next section we give an algorithm to automatically determine whether a process model or instance is compliant, and to calculate its compliance degree. Further, if a process instance I is initiated from process model P, and if we let I_0 denote the initial state of the instance (i.e., no task has been executed yet), then Φ[P(I_0)] = Φ[P].

Definition 4 (Semantic validity of change operations). Let I denote a process instance derived from P. Let CP = {OP_1, OP_2, ..., OP_m} denote a set of change operations applied to I', the current status of I. The assumption is I' ⊨ SC. Let I'' denote the status of the process instance after CP is applied to I'. Then CP is valid on I' iff I'' ⊨ SC.

Each process instance I derived from a process model P can have a variety of incarnations, i.e., I = {I_0, I_1, ..., I_m, I_e}, and correspondingly a variety of change sets CP = {CP_1, CP_2, ..., CP_m} during its execution, where I_0 denotes the initial state before any task has been executed, I_1 denotes the instance status after CP_1 is applied, I_m denotes the instance status after CP_m is applied, and I_e denotes the final state when the instance is about to end. Assuming that I_0 ⊨ SC, we can ensure the semantic compliance of a process instance throughout its lifecycle by checking the validity of every change set to be applied: if each CP_i is valid on the status it is applied to, then I_e ⊨ SC.
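The compliance degree of Definition 3 is straightforward to compute once constraints are predicates; the sketch below (ours, with a hypothetical instance drawn from the Table 7.2 constraints) illustrates it:

```python
def compliance_degree(instance, constraints):
    """Number of satisfied constraints divided by the total (Definition 3)."""
    satisfied = sum(1 for c in constraints if c(instance))
    return satisfied / len(constraints)

SC = [lambda T: T[1] == 1,           # admission mandatory (SC1)
      lambda T: T[9] <= T[7],        # surgery needs planning (SC4)
      lambda T: T[16] + T[8] <= 1]   # Marcumar/Aspirin exclusion (SC5)

I = {1: 1, 7: 0, 9: 1, 16: 1, 8: 1}  # violates the last two constraints
print(compliance_degree(I, SC))      # 1/3: only the first constraint holds
```

Weighting each term by constraint priority would yield the penalty-style score that the MIP objective minimizes.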

Definition 5 (Penalty and compensation of change operations). Let CP denote a set of change operations to be applied to a process instance I with status I', and let I'' denote the status after CP is applied. For this general situation, we introduce a penalty function to punish non-compliance caused by the change operations in CP. We define the penalty of CP, formally pen(CP), as the number of semantic constraints with which I'' does not comply. Thus, the larger the degree of non-compliance caused by the change operations, the higher the penalty cost. Further, if there exists a minimal set of additional change operations CP⁺ such that applying CP⁺ to I'' yields a status I''' with I''' ⊨ SC, we say that CP is partially valid on I', with CP⁺ as the compensation. If no such CP⁺ exists, we say CP is invalid. Formally, there are three possible cases for each set of change operations to be applied to I':

(1) Valid. CP is valid on I', i.e., after CP is applied the instance is compliant with SC without compensation.
(2) Partially valid (or valid with compensation). CP is partially valid on I', i.e., after CP is applied the instance can be made compliant with SC through compensation.
(3) Invalid. CP is invalid on I', i.e., no compensating operations can be found to make I'' compliant with SC.

Next, we show how we use MIP to minimize the penalty as an objective function and determine the corresponding compensation operations.

Formal algorithms for compliance checking

In this section, we introduce the algorithms for checking the semantic compliance of a process model or instance. Minimal compensations are provided to transform a non-compliant process into a

compliant one, based on the compliance degree. Further, the approach allows us to verify the validity of any change to be applied to a process instance during its execution. If the set of change operations is only partially valid, the algorithm provides an optimal solution to heal it. Our algorithms handle the three scenarios in the process lifecycle in which compliance must be considered:

Design time: compliance of a process model
Initialization time: compliance based on case data
Execution time: compliance of a running instance

Process model compliance checking at design time

Figure 7.5 presents our algorithm to automatically check the compliance of a process model. A process model is P = (T, N, E, Res, Dat, SC), with tasks T = {T_1, T_2, ..., T_n}; the domain of each task is D_i ∈ {0, 1}, i ∈ [1, n]; and SC = {SC_1, SC_2, ..., SC_k} denotes the set of model-level semantic constraints defined for P. To verify that a process model is compliant with a set of semantic constraints, we first follow steps (1)-(3) to relax the process model. Above, we defined the penalty as the number of non-compliant semantic constraints caused by change operations (see Definition 5). Based on this definition, step (4) sets the objective function obj to minimize the total penalty needed to make the model compliant.

Algorithm Compliance_design
Input: Process model P = (T, N, E, Res, Dat, SC)
Output: Objective function obj, compliance degree Φ, suggested actions based on obj
(1) Add task status relaxation: in each formula pertaining to task T_i, replace T_i with (T_i + TX_i − TY_i).
(2) Add sequence status relaxation: in each formula pertaining to sequence variable S(i,j), replace S(i,j) with (S(i,j) + SX(i,j) − SY(i,j)).
(3) Add resource status relaxation: in each formula pertaining to resource assignment variable RA(i,j), replace RA(i,j) with (RA(i,j) + RX(i,j) − RY(i,j)).
(4) Minimize the objective function obj: the priority-weighted sum of all relaxation variables, obj = Σ_i P·(TX_i + TY_i) + Σ_(i,j) P·(SX(i,j) + SY(i,j)) + Σ_(i,j) P·(RX(i,j) + RY(i,j)).

Figure 7.5. Algorithm for process model compliance checking at design time

The penalty function is weighted by the priority (with prefix 'P') of the corresponding constraint associated with the auxiliary variable for a task or sequence (see Table 7.3). Thus, violation of a higher-priority constraint incurs a greater penalty than violation of a lower-priority constraint. In the case of

conflicting constraints with different priorities, such as constraints 2 and 5 in Table 7.3, which are both associated with T_5, the corresponding penalty weights would be assigned different values, say 1 and 2. This problem is modeled as a MIP formulation. Intuitively, the goal of this formulation is to minimize the total penalty incurred in satisfying the relaxed set of constraints. If obj = 0, then P ⊨ SC; if obj > 0, then P ⊭ SC, with compliance degree Φ < 1.

Process model compliance checking at initialization time

At initialization time, a process instance is instantiated with the available case-specific data. Thus, instance-level semantic constraints are triggered according to the case data of this instance, and further compliance checking with respect to such constraints is required if additional constraints are added. We use the algorithm in Figure 7.6. In the most complicated situation, if a change is required to ensure the compliance of the instance with the additional constraint set SC_Δ, we also need to include the original constraint set SC, since this action may cause violation of certain constraints in SC.

Algorithm Compliance_init
Input: Process model P = (T, N, E, Res, Dat, SC), additional instance-level constraints SC_Δ
Output: Objective function obj, compliance degree Φ, suggested actions based on obj
1: If SC_Δ = ∅
2:   Exit();  // no further compliance checking is needed at initialization
3: Else       // SC_Δ ≠ ∅, i.e., checking is needed
4: {
5:   Update P = (T, N, E, Res, Dat, SC + SC_Δ);  // extend the constraint set
6:   Run algorithm Compliance_design (P);        // invoke the design-time compliance algorithm
7:   Output(obj, Φ, suggested actions);
8: }

Figure 7.6. Algorithm for process compliance checking at initialization time

Compliance checking for instance adaptation at execution time

At execution time, a process instance is subject to changes in response to events or exceptions. Our objective is to ensure that potential change operations will not violate semantic constraints. Figure 7.7 presents the algorithm that checks the compliance of a running process instance against the proposed changes at execution time. First, the instance status is updated by applying the change operations and adding

new constraints to reflect the completed tasks. For example, if a task T_1 has been executed, we update the task status: T_1 = 1, with no relaxation, because a completed task is not reversible. Finally, the algorithm Compliance_design is used for compliance checking. Three results are possible, as described in Definition 5. If obj = 0, then the instance after the changes is compliant, and thus the change set is valid. If obj > 0, then it is compliant with compensation, and thus the change set is partially valid; by applying the suggested actions, the process instance can be made compliant. If obj is infeasible, then no solution is found, and thus the change set is invalid.

Algorithm Compliance_runtime
Input: Process model P = (T, N, E, Res, Dat, SC), CP = {CP_1, CP_2, ..., CP_m}
Output: Objective function obj, compliance degree Φ, suggested actions based on obj
For each operation CP_i in CP, we make the following changes to the model:
(1) Task status: update the process instance to assign the new value of the task status after the proposed change; e.g., if T_i was 1 but T_i is to be deleted, then set T_i = 0.
(2) Sequence status: if a deleted task T_i appears in a sequence constraint as S(i,j) or S(j,i), then delete this constraint, since the task has been deleted. If task T_i is replaced by task T_k, then replace all occurrences of i by k in S(i,j) and S(j,i), for all j.
(3) Resource assignment status: if any change operation involves a resource assignment, then revise the related RA formulas with the new value.
Run algorithm Compliance_design (P);
Output(obj, Φ, suggested actions);

Figure 7.7. Algorithm for process compliance checking at execution time

Examples

In this section, we use several healthcare scenarios, along with the running example in Figure 7.1, to illustrate the applicability of our approach. As shown in Figure 7.8, our examples cover the lifecycle of a process from the time it is designed to the time it is completed for handling a specific case. We use the CPLEX tool to solve each MIP formulation and check whether a process is semantically compliant. The CPLEX model has the following parts: the defined variables; the decision variables for which the model finds a solution; the objective function to be maximized or minimized; the process representation constraints; the consistency and transitivity constraints; and the model- and instance-level semantic constraints. Each part of the model has been described above. The CPLEX model for Example 1 is given in Appendix C.
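For a feel of what such a model looks like as solver input, the fragment below sketches the admission-repair example in CPLEX's LP file format (a toy sketch, not the actual Appendix C model; the variable names TX1/TY1 and the priority weight 2 are illustrative):

```
\ Toy LP-format sketch: the flow equations omit admission (T1 = 0 once
\ relaxed), while semantic constraint sc1 demands T1 = 1. Minimizing the
\ weighted relaxation variables yields TY1 = 1 (add T1) at objective 2.
Minimize
 obj: 2 TX1 + 2 TY1
Subject To
 flow: T1 + TX1 - TY1 = 0
 sc1:  T1 = 1
Binaries
 T1 TX1 TY1
End
```

The full models additionally carry the sequence variables S(i,j), their consistency and transitivity constraints, and the resource assignment variables, in exactly the pattern of Figures 7.4 and 7.5.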

Figure 7.8. Overview of the illustrated examples

Example 1 (Design time): Suppose the semantic constraints in Table 7.2 are enforced for any workflow model of a proximal femoral fracture process. Further, assume that Hospitals A and B have designed their own clinical workflow models, P1 and P2 (see Figure 7.9). Using our algorithm, we can check their compliance against the semantic constraints that should be enforced. The semantic constraints SC_1-SC_13 have been validated beforehand to ensure that the constraint set is sound and complete (i.e., there are no redundancies or conflicts). The result shows that P1 is compliant, since the objective function value is 0; thus the compliance degree is Φ = 1. In contrast, P2 is not compliant, since the objective function value is 3; hence Φ = 1 − 3/13 = 10/13. Although the raw objective value reported by the CPLEX model is 4, the adjusted obj is 3: SX(9,7) = 1 and SY(7,9) = 1 describe the same ordering change and would otherwise be double counted. Suggestions are also provided to make P2 compliant with SC: TY_1 = 1 (for SC_1), SX(9,7) = 1 and SY(7,9) = 1 (for SC_8), and TY_11 = 1 (for SC_13). The interpretation of this output from the CPLEX model is that the process modeler should: (1) add task T_1 (i.e., patient admission), (2) execute T_7 before T_9 (i.e., surgical planning should be carried out before surgery), and (3) add T_11 (i.e., the patient should be discharged). Specific feedback is given to the user in this manner. This ability to check process compliance automatically is useful for any non-trivial model. With this approach, we can verify whether a process

model is compliant or not. If it is not compliant, we can find an optimal solution (if one exists) and suggest compensation operations to heal it. Our algorithm calculates the compliance degrees of different process models, and offers a way to compare them.

Figure 7.9. Compliance checking at design time: (a) process model P1, obj = 0, Φ = 1; (b) process model P2, obj = 3, Φ = 10/13

Example 2 (Initialization time): Suppose four process instances, I1 through I4, are initialized from P1 with different case data: I1 (a normal patient), I2 (a pregnant patient who is hypersensitive to penicillin), I3 (a patient aged 35 with a cardiac pacemaker), and I4 (a patient aged 74 with a cardiac pacemaker). We use SC_Δ to denote the additional semantic constraints triggered by the applicable ECA rules at initialization time, as follows:

I1 (a normal patient): SC_Δ = ∅
I2 (a pregnant patient hypersensitive to penicillin): SC_Δ = {T_5 = 0, T_17 = 0, T_18 = 0}
I3 (a patient aged 35 with a cardiac pacemaker): SC_Δ = {T_13 = 0}
I4 (a patient aged 74 with a cardiac pacemaker): SC_Δ = {T_13 = 0, T_12 = 1, S(12,9) = 1}

Thus, no compliance checking is needed when I1 is initialized, because no additional constraint is triggered (the triggered set is empty). However, compliance checking is needed for I2, I3, and I4. We apply the compliance_init algorithm and obtain the solutions given in Figure 7.10:

Extra semantic constraints on I2 derived from P1: SC1: T17 = 0; SC2: T18 = 0; SC3: T5 = 0
Extra semantic constraints on I3 derived from P1: SC1: T13 = 0
Extra semantic constraints on I4 derived from P1: SC1: T13 = 0; SC2: T12 = 1; SC3: S12,9 = 1
(a) I2: obj = 0, compliance degree = 1. (b) I3: obj = 0, compliance degree = 1. (c) I4: obj = 2, compliance degree = 14/16; suggested actions: TY12 = 1, SY12,9 = 1.
Figure 7.10. Compliance checking at initialization time

From the results, I2 and I3 are compliant despite the new semantic constraints, so no further action is required. The objective function value for I4 is 2, and suggested actions to make it compliant are provided. However, before being applied, these actions should be checked against the combination of the original and new constraints; otherwise, the original constraints might be violated (refer to algorithm Compliance_init). The solution for I4 requires that a new task, "tolerance test" (T12), be inserted and executed before "surgery" (T9) (see Figure 7.10 (c)). After these modifications are made to I4, we obtain the revised process instance shown in Figure 7.11.

Figure 7.11. Solution and modification to I4

This example shows the difference between model- and instance-level semantic constraints. The former are complied with by all instances derived from the process model; hence, they are checked at design time. In contrast, the latter are only applicable to cases that satisfy specific conditions; thus

they are only verified at initialization time, when case data are available. In general, different cases can have different constraints and are treated accordingly.

Example 3 (Execution Time): An instance may go through several changes during its execution lifecycle. After each change set is applied or a task execution occurs, the current context of the process instance changes. Thus, each change should be verified before it is issued. Consider the execution of instance I2 derived from process model P1. Note that the initial status of I2 is compliant, as shown in Example 2. We apply four sets of changes, CP1 through CP4, to I2 as shown in Table 7.6. This table describes the meaning of each change set and the resulting actions generated by our algorithm. The final resulting process is shown in Figure 7.12.

Table 7.6. Change sets occurring during the execution lifecycle of instance I2

CP1 = {OP1}, OP1 = remove(T5)
  Explanation: This change set is automatically triggered when I2 takes the path to T5 (i.e., the patient is suspected of a proximal femoral fracture). This could happen because a pregnant patient must NOT have a CT scan.
  Result: obj = 1; CP1 is valid with compensation, which suggests that T13 be inserted (TY13 = 1). Semantically, at least one imaging diagnosis test should be carried out if the patient is suspected of a proximal femoral fracture.

CP2 = {OP1, OP2}, OP1 = insert(T16 before T8), OP2 = remove(T8)
  Explanation: Issued before T8 is triggered; this could happen due to a traumatic head injury event. OP1 inserts the task "Administration of Marcumar" to decrease the clotting ability of the blood so that thrombosis is prevented. OP2 forbids "Administration of Aspirin".
  Result: obj = 0; CP2 is valid. Semantically, Marcumar can interact with Aspirin, so the two tasks are exclusive to avoid a drug interaction (i.e., T8 + T16 ≤ 1). However, administration of Aspirin is removed (i.e., T8 = 0) due to the head injury event, so "Administration of Marcumar" is allowed.

CP3 = {OP1, OP2}, OP1 = removeResource(T9, R6), OP2 = setResource(T9, R3)
  Explanation: Issued when T7 is completed. It tries to assign a therapist to the surgery instead of a surgeon, since no surgeon is available.
  Result: no obj; CP3 is invalid. Semantically, performing a surgery requires a surgeon.

CP4 = {OP1, OP2}, OP1 = delete(T11), OP2 = insert(T19)
  Explanation: Issued while task T11 is running. It replaces the discharge task with a transfer task because the patient will be transported to another hospital due to an emergency.
  Result: obj = 2; CP4 is allowed with compensation, which suggests that T11 be removed and T19 be inserted as a secondary obligation. See Figure 7.12.
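The three runtime outcomes in Table 7.6 (accept a change outright, accept it with compensation, or reject it) can be sketched with a brute-force search standing in for the MIP solver; the constraint and task names below are illustrative:

```python
from itertools import combinations

def validate_change(tasks, change, constraints, candidates):
    """Apply a change to an instance's task set, re-check the constraints,
    and search for a smallest compensation (set of extra insertions) that
    restores compliance. Returns (new_task_set, compensation), or
    (None, None) if no compensation works, i.e. the change is invalid."""
    op, task = change
    new = set(tasks) | {task} if op == "insert" else set(tasks) - {task}
    violated = lambda ts: sum(0 if c(ts) else 1 for c in constraints)
    if violated(new) == 0:
        return new, set()                      # fully valid change
    pool = set(candidates) - new
    for k in range(1, len(pool) + 1):          # smallest compensation first
        for comp in combinations(sorted(pool), k):
            if violated(new | set(comp)) == 0:
                return new | set(comp), set(comp)
    return None, None                          # invalid change

# CP1-like scenario: removing the CT scan (T5) violates "at least one
# imaging test (T5 or T13)", and inserting the MRI (T13) compensates.
constraints = [lambda ts: "T5" in ts or "T13" in ts]
print(validate_change({"T5", "T7"}, ("remove", "T5"), constraints, {"T5", "T13"}))
```

The real system additionally distinguishes operation types (resource changes, ordering changes) and uses the weighted MIP objective rather than enumeration, but the accept/compensate/reject logic is the same.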

For change operations that are invalid or only partially valid, the user has the option of whether or not to adopt the suggested actions. In other words, she can apply the invalid change operations, but the system will log the violation for future analysis. For example, for CP1, the doctor is advised to perform an MRI (no radiation) instead of a CT scan due to the pregnancy. However, since an MRI test takes much longer, the doctor may want to go ahead with treatment based on symptoms and then make a decision depending upon how the patient responds. The doctor can order an MRI test or a CT scan after medication (i.e., insert an imaging test after T8). This example shows that a process instance may encounter a number of changes during its execution lifecycle. After each change, the structure of the instance may change and thus affect subsequent changes. Again, it is difficult, if not impossible, for human beings to keep track of such frequent changes. Hence, medical errors can easily occur in the absence of automatic checking.

Figure 7.12. Compliance checking during the lifecycle, after CP4 is applied

Discussion

The occurrence of multiple events can lead to complicated situations in which various rules are triggered. As a result, a number of semantic constraints must be applied simultaneously. In the context of the example of Figure 7.1, we noted earlier that constraints 2 and 5 (see Table 7.3) are in conflict for a pregnant patient

with a head injury. However, since constraint 5 has a higher priority than constraint 2, violating constraint 2 incurs a smaller penalty. Thus, in the solution the conflict is resolved by satisfying constraint 5 (i.e., performing the CT scan) and violating constraint 2. In practice, however, the MIP model may turn out to be infeasible, and process violations will then occur. In some cases, medical staff may not accept the actions suggested by our algorithm to make a non-compliant process compliant, because those actions are not applicable in certain contexts. Thus, it seems more reasonable to let the final decision rest with the end user in health care settings. The system should, however, have the capability to log such process violations for further analysis and diagnosis. If process violations occur frequently and follow the same pattern, the process designer should consider updating the related process models to reflect these complex situations as they arise.

A possible extension of our approach is to develop a process from scratch at run time, without any prior model, by simply inserting tasks on an ad hoc basis as the instance proceeds. This is feasible with our approach, but the drawback is that in such an ad hoc process one would not know which semantic constraints to apply. Of course, one could apply all the semantic constraints in the repository, but this would make the analysis very slow. Therefore, it is more appropriate to work within the confines of a process model with associated semantic constraints, and then make changes as the instance proceeds. This restrains the degree of freedom available to an end user, but we do not view it as a disadvantage: it creates a framework, in the form of a process model, within which the designer is given considerable flexibility. It is also a less error-prone method than creating a process instance entirely on the fly.
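The priority-based conflict resolution described earlier in this discussion amounts to weighting each constraint's violation penalty, so that the solver violates the cheaper constraint. A minimal sketch (the weights and constraint encodings are illustrative):

```python
def min_penalty(candidates, weighted_constraints):
    """Among candidate task sets, pick the one minimizing the total
    penalty of violated constraints (the MIP objective in miniature)."""
    def penalty(ts):
        return sum(w for w, holds in weighted_constraints if not holds(ts))
    return min(candidates, key=penalty)

# Conflict for a pregnant patient with a head injury: "no CT scan"
# (lower priority, weight 2) vs. "CT scan required" (higher priority, weight 5).
weighted = [
    (2, lambda ts: "CT" not in ts),
    (5, lambda ts: "CT" in ts),
]
print(min_penalty([set(), {"CT"}], weighted))  # {'CT'}: the higher-priority rule wins
```

With weights proportional to priority, the optimal assignment always sacrifices the lower-priority constraint, which is exactly the behavior described for constraints 2 and 5.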
Another possible extension is to offer the user all the solutions that restore compliance when multiple solutions exist. Currently, we provide only the one solution suggested by CPLEX, and users have to request additional solutions from the system if the recommended one is not appropriate. In clinical settings, we could instead present all solutions to a clinician, who may then choose the best one based on human judgment.

7.6. Discussion and Implications

In this section, we first discuss the advantages of the MIP-based approach over the logic-based approach for checking process compliance. Then we present an architecture for implementation, discuss system realization, and consider some limitations and possible extensions of our approach.

We recognize that developing a complex system like the one described above and building the repository of semantic constraints will naturally take a long time, especially in the healthcare industry, which is a slow adopter of IT. In addition, domain experts such as clinicians will need to maintain and update these semantic constraints. In the healthcare domain, semantic constraints are defined and disseminated by various organizations for different purposes. For example, more than 10,000 guidelines covering various disease types and topics have been published by the Agency for Healthcare Research and Quality (AHRQ) and the U.S. Department of Health & Human Services. Other health organizations, such as the American Heart Association (AHA), focus on guidelines pertaining to heart disease. We anticipate that over time the constraints corresponding to these guidelines can be described in standard formats and shared across organizations. On the other hand, semantic constraints can also be specific to geographic areas such as a region or a country. Further, each hospital may customize the constraints to its local settings to augment the standard ones. Thus, we believe our systematic approach, though somewhat complex, can certainly be beneficial in such environments, where multiple sets of overlapping, and sometimes conflicting, constraints must co-exist.

Comparison of MIP- vs. logic-based approach

The MIP-based approach for expressing constraints for process compliance has considerable appeal in comparison with a logic-based one, such as first-order predicate logic (Huth and Ryan 2004).
We compare the two approaches according to a variety of criteria in Table 7.7. First, logic does not allow us to formulate the problem in terms of a penalty function as an objective and find an optimal solution to ensure weak compliance, as one can do with the MIP method. Second, it is very difficult to express m-of-n constraints in logic, whereas it is straightforward in our approach, as we showed in Chapter 3. Third, commutative and

transitive relationships can be expressed more simply and elegantly in the proposed formalism than in logic. Moreover, MIP formulations have standard representation formats and can be solved very efficiently using a variety of proprietary and open-source tools, whereas logic rules can be represented in various formats that are not easily interchangeable. Of course, MIP formulations are limited in expressive power compared with higher-order logics. This comparison simply shows that for the problem at hand, the MIP-based approach is particularly appealing; it is not our intention to argue that one approach is better than the other in general, because such a generalization is not possible. Further comparison and discussion of these approaches, and of hybrid approaches that combine logic constraints with IP-style ones, can be found in (Hähnle 1994; Hooker and Osorio 1999).

Table 7.7. Comparison of MIP-based and logic-based compliance checking

Criteria | MIP-based Approach | Logic-based Approach
Penalty function as an objective | Easy to formulate | Does not lend itself easily
Optimal solution | Determines minimum changes needed to restore compliance | Cannot optimize easily
Expressing m-of-n relationships | Simple | Messy
Expressing constraint properties (e.g., commutativity, transitivity) | As an algebraic expression | As logic predicates
Support of mature tools | Efficient tools like CPLEX, based on standards | Large number of non-standard tools
Language issues | MIP formulations are quite standard | Many languages for expressing rules
Higher-order logics | Not supported | More expressive, but harder to implement

Specifically, in contrast to an LTL-based approach (Awad et al. 2009; Awad et al. 2011), we do not have the notion of states. From a complexity point of view, the number of states is exponential in the number of tasks, and the complexity of the solution is exponential in the number of states.
Thus, roughly speaking, such an approach has double-exponential complexity in the number of tasks, versus the single-exponential complexity of our approach. Hence there is a tradeoff between expressive power and complexity. We use an existing, well-known solution technique to provide reasonable expressive power, while the other approach proposes a new framework for which a solution engine is not readily available.
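As an illustration of the m-of-n row in Table 7.7, such a requirement is a single linear constraint over 0-1 task variables. The tiny brute-force "solver" below stands in for CPLEX (the 2-of-3 toy model is illustrative, not a real clinical constraint):

```python
from itertools import product

def m_of_n(x, indices, m):
    """m-of-n as a linear constraint over 0/1 variables: their sum >= m."""
    return sum(x[i] for i in indices) >= m

# Minimize the number of executed tasks subject to a 2-of-3 constraint
# over task variables x0..x2.
feasible = [x for x in product((0, 1), repeat=3) if m_of_n(x, (0, 1, 2), 2)]
best = min(feasible, key=sum)
print(sum(best))  # 2: an optimal solution executes exactly two of the three tasks
```

Expressing the same requirement in predicate logic needs a disjunction over all size-m subsets, which is the "messy" entry in the table.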

Implementation architecture

Figure 7.13 presents an implementation architecture that integrates the specification of semantic constraints and the compliance checking of processes into an APMS. Pre-designed process models are stored in the process repository, and tasks are stored in the task repository (omitted in the figure). The MIP adapter analyzes the process models and converts them into MIP models in CPLEX. As shown in Figure 7.2, the control-flow structure can easily be converted into MIP formulas; resource assignments pertaining to each task are converted into MIP formulas as well. The MIP adapter is responsible for synchronizing any changes made to process models. Each process model represented as a MIP formulation also includes a complete and sound set of semantic constraints. The semantic constraint designer is a GUI-based tool that allows knowledge engineers to design semantic constraints using our specification language in Table 7.1. We propose that a visual representation of semantic constraints be available to knowledge engineers to assist their design. The validated semantic constraints are stored in a repository that can be shared by different users, departments, or even organizations, and is updated as domain knowledge changes.

[Figure 7.13. An overall architecture for implementation. The architecture connects process designers, workflow participants, and knowledge engineers to the Process Repository, Process Execution Engine, Semantic Constraint Designer, MIP Adapter, MIP Solver (CPLEX), Rule Engine, and Semantic Constraints Repository.]

The MIP solver is responsible for checking the compliance of process models and instances by solving MIP formulations. Once a process model is inserted into the repository or updated, its MIP model is added or updated as well. The MIP solver then looks for the associated semantic constraints in the repository. If such semantic constraints are found, they are added to the MIP model, and the algorithm Compliance_design is triggered. Depending on the resulting objective function value, the user is notified whether the process model is compliant or not (with suggested compensation operations). If the semantic constraint set pertaining to a process model is updated, the design-time compliance checking is repeated. The CPLEX tool is the core module used by the MIP solver to solve MIP models.

When a process model is initialized, case data are entered by workflow participants and passed from the process execution engine to the rule engine. If their conditions are met, ECA rules are triggered and produce instance-level semantic constraints. The MIP solver checks the compliance of the initialized instance by running the Compliance_init algorithm, which takes into consideration both model-level and instance-level constraints. Compensation operations are suggested to users if the instance is not compliant. During execution, the process execution engine maintains the runtime status of all running instances and sends their task status back to the MIP solver. This information is used for checking the validity of ad hoc changes during instance execution, according to the Compliance_runtime algorithm. This is helpful for workflow models that involve a variety of participants and are prone to exceptional events. The system should give users the right to decide whether to apply non-compliant changes; if they do, such exceptions should be logged in the system and analyzed for future use.
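The three checking paths the solver follows (design time, initialization time, run time) can be sketched as one component that keeps model-level constraints shared and instance-level constraints attached per instance. This is a hypothetical interface for illustration, not the actual MIP solver API:

```python
class ComplianceChecker:
    """Sketch of the solver-facing logic: model-level constraints are
    shared, instance-level ones are attached when an instance is
    initialized, and both are re-checked on every runtime change."""

    def __init__(self, model_constraints):
        self.model_constraints = list(model_constraints)
        self.instance_constraints = {}

    def check_design(self, tasks):
        # Compliance_design: model-level constraints only.
        return self._violations(tasks, self.model_constraints)

    def init_instance(self, instance_id, tasks, triggered):
        # Compliance_init: model-level plus ECA-triggered constraints.
        self.instance_constraints[instance_id] = list(triggered)
        return self.check_runtime(instance_id, tasks)

    def check_runtime(self, instance_id, tasks):
        # Compliance_runtime: the full constraint set for this instance.
        cs = self.model_constraints + self.instance_constraints.get(instance_id, [])
        return self._violations(tasks, cs)

    @staticmethod
    def _violations(tasks, constraints):
        return sum(0 if holds(tasks) else 1 for holds in constraints)

checker = ComplianceChecker([lambda ts: "T1" in ts])
print(checker.check_design({"T1"}))                                            # 0
print(checker.init_instance("I2", {"T1", "T5"}, [lambda ts: "T5" not in ts]))  # 1
```

In the actual architecture each check is a CPLEX solve rather than a violation count, but the separation of constraint scopes is the key design point.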
For example, if a certain process violation occurs frequently, the process designer should consider updating the corresponding process model to incorporate the repeated violation.

System realization

We expect that the actual realization of our proposal will proceed through stages. In the first stage, Hospital Information Systems (HIS) should be designed and deployed to manage Electronic Medical Records (EMR). Other hospital systems should be integrated as well, such as Laboratory Information

Systems (LIS), Radiology Information Systems (RIS), and Picture Archiving and Communication Systems (PACS). With such an integrated system, patient records can be managed electronically, and patients' medical conditions or events can be captured in real time. Meanwhile, medical knowledge originally in free-text form is increasingly available in standard languages like GLIF and Arden Syntax. In the second stage, hospitals will have to formally model their clinical processes and build a repository of process knowledge in a standard format, as discussed above. In addition, they will need to implement a process execution engine that can execute the processes stored in the repository. This allows tasks to be routed among various medical staff and decision support to be delivered to them. Finally, in the third stage, they will need to integrate the architecture proposed here into their systems. The conversion of nation-wide clinical protocols (e.g., from AHRQ or AHA) from GLIF or Arden Syntax into our semantic constraint specification language can be automated to a large extent. While this may require a fairly substantial one-time effort, as in any large systems development project, it will quicken the pace of injecting evidence-based medicine. Of course, the process repository must be continuously maintained and kept up to date with the latest clinical findings. It will also be useful to eventually migrate this architecture to the cloud. This will facilitate collaborative maintenance of the semantic constraints across organizations and reduce duplication of effort by creating shared libraries. Each healthcare provider may then adapt this knowledge to its own needs and policies. At present, many hospitals and medical practices are still in stage one, while some are at stage two.
Therefore, the rollout of stage three will happen only gradually.

Limitations and extensions

A limitation of our approach is that it does not handle (implicit) OR structures and loops in general. However, it can be extended to accommodate them. In contrast to explicit ORs, at implicit OR-split nodes more than one path may be activated. Thus, for example, in the process model P2 of Figure 7.9, one might wish to express that either one, or both, of CT scan (T5) and MRI (T13) may be conducted at the discretion of the doctor. This can be expressed by two constraints: T5 + T13 ≥ 1 and T5 + T13 ≤ 2. Workflows

involving structured loops, with exactly one entry and one exit node, can be handled by our approach by simply omitting the return branch and performing the analysis. Unstructured loops occur less frequently and are not covered. Another limitation is that we do not consider verification of data flows in this dissertation. Every task in a workflow process needs input data and produces output data. As a separate exercise, it is necessary to verify that, in general, if a task i produces data required by another task j, then task j does not appear before task i in the control flow of the process. Moreover, encoding semantic constraints from medical guidelines and rules can be complicated and time-consuming for care professionals. Thus, the actual implementation of such a system in hospitals has to move through stages, as noted above. It is necessary to have a process management system in place before an adaptive and flexible system is introduced.

Our approach can also be extended to handle concurrent changes at both the model and instance levels. For example, say a new task is to be inserted into a process model, and this change is to propagate to the running instances already initiated from that model. Meanwhile, some changes may also be made to individual running instances, so the changes at the model and instance levels can be concurrent. To handle such scenarios, their relationship should be formally analyzed to check whether it is identical, overlapping, disjoint, inclusive, etc. Our compliance checking algorithm can be modified to analyze this relationship and determine whether a consistent change set is possible; otherwise, the change must be rejected.

7.7. Conclusions

Increasingly, processes that arise in healthcare and other applications need flexibility and ease of adaptation to changes and events in the environment. They cannot be treated as rigid production processes.
Rule-based approaches, such as ECA, lend themselves well to the design of such processes. However, it is also important to ensure that changes conform to semantic constraints. We proposed a novel approach for detecting and correcting violations of such constraints. It is based on a formal

language for describing constraints that captures a large set of existential and coordination relationships among tasks, a MIP formulation, and solution and analysis techniques. We argue that our approach is superior to pure logic-based approaches. Manually verifying the semantic correctness of process models and instances is infeasible when processes are large and complicated; we provide an effective solution to this issue. First, we can verify whether a process model is compliant. If it is non-compliant, our approach provides an optimal solution containing the minimal compensation operations needed for the process model to become compliant. Second, our algorithm can calculate the compliance degrees of process models, and thus serves as a useful mechanism for comparing the degree of compliance among different models. Third, we can check the validity of dynamic change operations issued by workflow participants or users. Finally, semantic constraints can be shared and customized by organizations, since they reflect common domain knowledge and organizational policies. We described our approach in the context of a healthcare process to illustrate how it can be used to reduce medical errors. So far we have only a basic proof of concept and are working on building a larger prototype. Our goal is to integrate it within an APMS environment with suitable plug-ins. We also plan to do more exhaustive testing with larger examples and to run experiments with other data sets.

Chapter 8
Conclusions and Future Work

In this dissertation, we have explored the research challenges raised by the flexible adaptation of business processes in response to various dynamic contexts, as well as the associated compliance issues.

8.1. Conclusions

This dissertation follows a formal design science methodology to contextualize business processes and model contexts, as described in Chapter 3. We provide adaptive processes in an automatic and intelligent (i.e., context-aware) way and also take into account compliance checking of the possible adaptation patterns against semantic constraints. This approach is especially useful in dynamic and knowledge-intensive environments.

First, in Chapter 4, we investigated event detection in a complex and dynamic environment (e.g., an RFID-enabled smart hospital) where multiple event streams are produced at high speed. We presented a novel approach to process surgical events and provide sense-and-respond capability for smart hospitals. This approach can help accelerate the adoption of sensor technologies in healthcare and provides a feasible way to address the interoperability problem. The performance evaluation, in terms of processing delay and detection accuracy, shows that our approach is reasonable and acceptable. With the aim of improving patient safety and reducing operational costs, our study suggests a possible solution for capturing critical context in a surgery.

Second, this dissertation focuses on process adaptation, triggered by different contexts, at both the model level and the instance level. Chapter 4 described a novel proposal for configuring flexible business processes by combining process templates with business rules. The configuration process is driven by business policy modeled as context. This approach allows the basic process flow to be separated from business policy elements in the design of a process, and it also tightly integrates the resource and data needs of a process.
We also developed a novel scheme for storing process variants as strings based on a postorder traversal of a process tree. We showed that such a representation lends itself well to manipulation

and also to searching a repository of variants. A proof-of-concept prototype has been partially implemented to test and evaluate this methodology based on the proposed architecture.

Another type of process adaptation occurs at the process instance level, where context is highly dependent on the application domain. Chapter 5 investigated this problem in the context of clinical settings. We proposed a framework called ConFlexFlow and showed how adaptable clinical pathways can be designed, taking into account medical knowledge in the form of rules and detailed contextual information, to achieve high-quality outcomes. A clinical workflow delineates the main pathways to be taken. These pathways are selected during workflow execution based on rules that encapsulate medical knowledge and on a context model that captures various aspects of the clinical setting. We developed a proof-of-concept prototype to demonstrate the applicability of the proposed approach. It can lead to better, smarter routing of clinical workflows by applying rules with a deeper understanding of context. Moreover, various kinds of alerts can improve patient safety and reduce treatment errors. Finally, better and quicker recommendations can be generated at various decision points.

Finally, Chapter 6 proposed a novel approach that uses Mixed-Integer Programming (MIP) to check the compliance of process models against semantic constraints and the validity of process change operations, since manually verifying the semantic correctness of process models and instances is infeasible when processes are large and complicated. The MIP formulation allows us to describe existential, dependency, ordering, and various other relationships among tasks, along with business policies, in a standard way.
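The variant-as-string scheme mentioned above can be sketched as a postorder serialization of a process tree, where a node's children are emitted before its own operator or task label (the node labels below are illustrative):

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def postorder_string(node):
    """Serialize a process tree by postorder traversal: each subtree's
    string is the concatenation of its children's strings followed by
    the node's own label, so structurally similar variants share long
    common substrings."""
    return "".join(postorder_string(c) for c in node.children) + node.label

# SEQ(T1, AND(T2, T3)), with '.' for a sequence node and '&' for a
# parallel split (an assumed encoding for illustration).
tree = Node(".", [Node("T1"), Node("&", [Node("T2"), Node("T3")])])
print(postorder_string(tree))  # T1T2T3&.
```

Because common subtrees map to common substrings, a repository of such strings can be searched and compared with ordinary substring operations.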
In addition to incorporating the semantic constraint specifications into a MIP formulation, we introduced three novel ideas in this dissertation: (1) the notion of a degree of compliance of processes based on a penalty function; (2) the concepts of full and partial validity of change operations; and (3) the idea of compliance by compensation. Thus, compensation operations derived from the compliance degree can transform a non-compliant process into a compliant one, both at design time and at execution time. We illustrated our approach in the context of a healthcare workflow as a way to reduce medical errors, and argued that it is more elegant than, and superior to, a pure logic-based approach. Complex scenarios with multiple concurrent processes (and constraints across them) for a single patient were also considered.

8.2. Future Work

The long-term vision of this research is to integrate context-aware capability into process design to enable better flexibility and adaptability, so that adaptive process management systems can be applied in dynamic and knowledge-intensive environments to improve work efficiency, e.g., in mobile applications, clinical settings, and IT service outsourcing. This dissertation opens many research directions, both in the research methodology of business process management and in the many domains where BPM techniques can be applied. I will focus on the following directions for future work:

Process-driven patient-centric health care: Patient-centered communication is essential in health care delivery to support an aging population. I plan to apply workflow technologies to streamline patient-provider communication, and knowledge management techniques (e.g., ontologies and rules) to support medical knowledge representation and sharing. This research can help develop technology-based interventions to improve care quality and patient satisfaction. For example, process automation has the potential to integrate the care team, push information when and where it is needed, manage communication points, and support decision making. Another example is telemedicine, which shifts much care to the home or to local telemedicine centers where patients can be connected to diagnostic devices, while physicians concentrate on high-level tasks.

Process mining and data analytics: Process mining aims at extracting process knowledge from event logs, which may originate from all kinds of systems. By revealing the underlying processes, it can suggest key performance indicators such as the number of operations and the length of waiting lists. I plan to apply theories and approaches from process mining to provide insights for improving clinical processes and operational efficiency.
Further, I am interested in combining process mining techniques with real-time data analytics, such as complex event processing, to inform decision making. Other techniques, such as online analytical processing (OLAP), can be integrated as well.

Ontologies and rules for Business Process Management (BPM): Ontologies and business rules are key components in enterprise computing to support business processes. While many have recognized their importance, many open research challenges remain to be addressed. I will investigate fundamental

research that explores ontological foundations, languages, and methods for enterprise and business modeling, and I am also drawn to applied research that looks into enhancing business rule engines and BPM systems with ontologies. This program of research will enhance the semantics of BPM approaches and improve interoperability in enterprise integration.

Adoption of information technologies in healthcare: Information technology is often seen as a panacea that can solve the problems of health care and improve its quality, yet progress has been limited. Thus, it is important to explore the factors that affect the adoption of IT in healthcare and to provide insights for current practice in both industry and academia. I have investigated the current status of RFID applications in healthcare and presented the results in my literature review. Similar studies can be carried out for other technologies, such as workflow systems, EHRs, etc. Social, ethical, legal, and organizational issues may have a major impact, and they should be carefully examined through case studies and interviews.

References

Adams, M., Hofstede, A.H.M.t., Aalst, W.M.P.v.d., and Edmond, D. "Extensible and context-aware exception handling for workflows," in: Proceedings of CoopIS'07, pp.
Adams, M., ter Hofstede, A., Edmond, D., and van der Aalst, W. "Worklets: A service-oriented implementation of dynamic flexibility in workflows," in: OTM'06, pp.
Agency for Healthcare Research and Quality (AHRQ). "National Guideline Clearinghouse (NGC): a public resource for evidence-based clinical practice guidelines." Retrieved on Jan 8th, 2012, from
Alexandrou, D., Xenikoudakis, F., and Mentzas, G. "SEMPATH: semantic adaptive and personalized clinical pathways," in: 2009 International Conference on eHealth, Telemedicine, and Social Medicine (eTELEMED '09), pp.
Ardissono, L., Furnari, R., Goy, A., Petrone, G., and Segnan, M. "Context-aware workflow management," in: Proceedings of the 7th International Conference on Web Engineering, pp.
Awad, A., Weidlich, M., and Weske, M. "Specification, verification and explanation of violation for data aware compliance rules," ICSOC-ServiceWave, pp.
Awad, A., Weidlich, M., and Weske, M. "Visually specifying compliance rules and explaining their violations for business processes," Journal of Visual Languages & Computing (22:1), pp.
Bae, J., Bae, H., Kang, S.H., and Kim, Y. "Automatic control of workflow processes using ECA rules," IEEE Transactions on Knowledge and Data Engineering (16:8), pp.
Baldauf, M., Dustdar, S., and Rosenberg, F. "A survey on context-aware systems," International Journal of Ad Hoc and Ubiquitous Computing (2:4), pp.
Bazire, M., and Brézillon, P. "Understanding context before using it," in: 5th International and Interdisciplinary Conference on Modeling and Using Context, A. Dey, B. Kokinov, D. Leake, et al. (eds.). Paris, France, pp.
Bechhofer, S., Harmelen, F.v., Hendler, J.A., Horrocks, I., McGuinness, D.L., Patel-Schneider, P.F., and Stein, L.A. "OWL Web Ontology Language Reference," in: W3C Recommendation, M. Dean and G. Schreiber (eds.).
Berner, E.S "Clinical decision support systems: State of the Art," in: AHRQ Publication No EF. Rockville, Maryland: Agency for Healthcare Research and Quality. Blaser, R., Schnabel, M., Biber, C., Bäumlein, M., Heger, O., Beyer, M., Opitz, E., Lenz, R., and Kuhn, K.A "Improving pathway compliance and clinician performance by using information technology," International Journal of Medical Informatics (76:2-3), pp Burton-Jones, A., Storey, V.C., Sugumaran, V., and Ahluwalia, P "A Semiotic Metrics Suite for Assessing the Quality of Ontologies," Data & knowledge engineering (55:1), pp Cardoso, J., Mendling, J., Neumann, G., and Reijers, H "A discourse on complexity of process models," in: Business Process Management Workshops. pp Ceccarellia, M., Stasioa, A.D., Donatiellob, A., and Vitaleb, D "A Guideline Engine For Knowledge Management in Clinical Decision Support Systems (CDSSs)," in: Proceedings of SEKE. pp Charfi, A., and Mezini, M "Aspect-oriented workflow languages," On the Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE (Part I), pp Charfi, A., and Mezini, M "Ao4bpel: An aspect-oriented extension to bpel," World Wide Web (10:3), pp Charfi, A., Müller, H., and Mezini, M "Aspect-oriented business process modeling with AO4BPMN," Modelling Foundations and Applications, pp Chen, B., Avrunin, G.S., Henneman, E.A., Clarke, L.A., Osterweil, L.J., and Henneman, P.L "Analyzing medical processes," Proceedings of the 30th international conference on Software engineering, Leipzig, Germany, pp

184 Chiu, D.K.W., Li, Q., and Karlapalem, K "A meta modeling approach to workflow management systems supporting exception handling," Information Systems (24:2), pp Christov, S., Chen, B., Avrunin, G., Clarke, L., Osterweil, L., Brown, D., Cassells, L., and Mertens, W "Rigorously defining and analyzing medical processes: An experience report," MoDELS Wokshops, H. Giese (ed.), Nashville, TN, USA, pp Clarke, L.A., Avrunin, G.A., and Osterweil, L.J "Using software engineering technology to improve the quality of medical processes," in: Companion of the 30th international conference on Software engineering. Leipzig, Germany: pp Dadam, P., Reichert, M., and Kuhn, K "Clinical Workflows: The Killer Application for Process Oriented Information Systems?," in: Proceedings of 4th International Conference on Business Information Systems (BIS'00). Poznan, Poland. Dang, J., Hedayati, A., Hampel, K., and Toklu, C "An ontological knowledge framework for adaptive medical workflow," Journal of Biomedical Informatics (41:5), pp Dang, J., Hedayati, A., Hampel, K., and Toklu, C "Personalized medical workflow through semantic Business Process Management," in: 11th International Conference on Enterprise Information Systems (ICEIS). De Leoni, M., Mecella, M., and De Giacomo, G "Highly dynamic adaptation in process management systems through execution monitoring," Proceedings of the 5th international conference on Business Process Management (BPM'07), pp Dey, A.K "Understanding and using context," Personal Ubiquitous Comput. (5:1), pp 4-7. Dumas, M., Van Der Aalst, W., and Ter Hofstede, A Process-aware information systems. Wiley Online Library. 
Fieschi, M., Dufour, J.C., Staccini, P., Gouvernet, J., and Bouhaddou, O "Medical decision support systems: old dilemmas and new paradigms," Methods Inf Med (42:3), pp Forgy, C.L "Rete: A fast algorithm for the many pattern/many object pattern match problem," Artificial intelligence (19:1), pp Friedman-Hill, E., and others "Jess, the rule engine for the java platform," Sandia National Laboratories. Fuhrer, P., and Guinard, D "Building a smart hospital using RFID technologies," in: 1st European Conference on ehealth (ECEH'06). Fribourg, Switzerland: pp Goedertier, S., and Vanthienen, J "Compliant and flexible business processes with business rules," in: 7th Workshop on Business Process Modeling, Development and Support (BPMDS 06) at CAiSE. pp Goedertier, S., and Vanthienen, J "Declarative process modeling with business vocabulary and business rules," in: OTM 2007 Workshops, Z. Tari and P. Herrero (eds.). Springer Berlin / Heidelberg, pp Governatori, G., Hoffmann, J., Sadiq, S., and Weber, I "Detecting Regulatory Compliance for Business Process Models through Semantic Annotations," in: BPM 2008 Workshops. pp Governatori, G., and Sadiq, S "The Journey to Business Process Compliance," in: Handbook of Research on Business Process Modeling, J. Cardoso and W.v.d. Aalst (eds.). pp Hagen, C., and Alonso, G "Exception handling in workflow management systems," IEEE Transactions on software engineering (26:10), pp Hähnle, R "Many-valued logic and mixed integer programming," Annals of Mathematics and Artificial Intelligence (12:3), pp Hallerbach, A., Bauer, T., and Reichert, M "Capturing variability in business process models: the Provop approach," Journal of Software Maintenance and Evolution: Research and Practice (22:6-7), pp HEARTFAID team "HF Ontology." Italy: University of Calabria, Department of Electronics, informatics, Systems (DEIS). 172

185 Heravizadeh, M., and Edmond, D "Making workflows context-aware: a way to support knowledge-intensive tasks," in: Proceedings of the fifth on Asia-Pacific conference on conceptual modelling. Wollongong, NSW, Australia: pp Hong, J., Suh, E., and Kim, S.J "Context-aware systems: A literature review and classification," Expert Systems with Applications (36:4), pp Hooker, J.N., and Osorio, M.A "Mixed logical-linear programming," Discrete Applied Mathematics (96-97:1), pp Horrocks, I., Patel-Schneider, P.F., Boley, H., Tabet, S., Grosof, B., and Dean, M "SWRL: A semantic web rule language combining OWL and RuleML," W3C Member submission (21). Huth, M., and Ryan, M Logic in computer science. New York: Cambridge University Press Cambridge. IBM "ILOG CPLEX Optimization Studio, High-performance software for mathematical programming and optimization." Retrieved on April 9 th, 2012, from 01.ibm.com/software/integration/optimization/cplex-optimizer/ Isern, D., and Moreno, A "Computer-based execution of clinical guidelines: A review," International Journal of Medical Informatics (77:12), pp JBoss. 2010a. "Drools Expert." Retrieved on May 16 th, 2012, from JBoss. 2010b. "Drools Flow." Retrieved on May 16 th, 2012, from JBoss. 2010c. "Drools Fusion." Retrieved on May 16 th, 2012, from Kataria, P., Juric, R., Paurobally, S., and Madani, K "Implementation of ontology for intelligent hospital wards," in: the 41st Hawaii International Conference on System Sciences (HICSS'08). Kloppmann, M., König, D., Leymann, F., Pfau, G., Rickayzen, A., von Riegen, C., Schmidt, P., and Trickovic, I "WS-BPEL Extension for Sub-processes BPEL-SPE," Joint white paper, IBM and SAP. Ko, R.K.L "A computer scientist's introductory guide to business process management (BPM)," in: ACM Crossroads. ACM press: pp Kohn, L.T., Corrigan, J.M., Donaldson, M.S., and others "To err is human: building a safer health system," in: National Academy Press. Washington DC. 
Kumar, A., and Yao, W "Process Materialization Using Templates and Rules to Design Flexible Process Models," in: Proceedings of the 2009 International Symposium on Rule Interchange and Applications. Las Vegas, Nevada: Springer-Verlag, pp Kumar, A., and Yao, W "Design and management of flexible process variants using templates and rules," Computer In Industry (63:2), pp Kumar, A., Yao, W., Chu, C.-H., and Li, Z "Ensuring compliance with semantic constraints in process adaptation with rule-based event processing," in Proceedings of the 2010 international conference on Semantic web rules, Berlin, Heidelberg: Springer-Verlag, pp Kumar, A., Yao, W., and Chu, C.H "Ensuring Process Compliance with Semantic Constraints using Mixed-Integer Programming (MIP)," submitted to INFORMS Journal on Computing (Condtionally accepted). Kumar, K., and Narasipuram, M.M "Defining requirements for business process flexibility," in: Workshop on Business Process Modeling, Design and Support (BPMDS06), Proceedings of CAiSE06 Workshops. pp La Rosa, M., and Dumas, M "Configurable Process Models: How To Adopt Standard Practices In Your How Way?," in: BPTrends Newsletter. Lenz, R., and Reichert, M "IT support for healthcare processes-premises, challenges, perspectives," Data & Knowledge Engineering (61:1), pp Liu, Y., Müller, S., and Xu, K "A static compliance-checking framework for business process models," IBM Systems Journal (46:2), pp

186 Lu, R., Sadiq, S., and Governatori, G "On managing business processes variants," Data & Knowledge Engineering (68:7), pp Luckham, D The power of events: an introduction to complex event processing in distributed enterprise systems. Springer. Lum, W.Y., and Lau, F.C.M "A context-aware decision engine for content adaptation," IEEE pervasive computing (1:3), pp Luo, Z., Sheth, A., Kochut, K., and Miller, J "Exception handling in workflow systems," Applied Intelligence (13:2), pp Ly, L., Rinderle-Ma, S., Göser, K., and Dadam, P "On enabling integrated process compliance with semantic constraints in process management systems," Information Systems Frontiers (Online first). Ly, L.T., Rinderle, S., and Dadam, P "Integration and verification of semantic constraints in adaptive process management systems," Data & Knowledge Engineering (64:1), pp Mathe, J., Werner, J., Lee, Y., Malin, B., and Ledeczi, A "Model-based design of clinical information systems," Methods of Information in Medicine (47:5), pp Mathe, J.L., Martin, J.B., Miller, P., Lédeczi, Á., Weavind, L.M., Nadas, A., Miller, A., Maron, D.J., and Sztipanovits, J "A Model-Integrated, Guideline-Driven, Clinical Decision-Support System," in: IEEE Software. pp Mendling, J., and Neumann, G "Error metrics for business process models," in: 19th International Conference on Advanced Information Systems Engineering (CAISE 2007), Trondheim, Norway. Modafferi, S., Benatallah, B., Casati, F., and Pernici, B "A Methodology for Designing and Managing Context-Aware Workflows " in: Mobile Information Systems II, J. Krogstie, K. Kautz and D. Allen (eds.). New York: Springer-Verlag, pp Müller, R., Greiner, U., and Rahm, E "Agentwork: a workflow system supporting rule-based workflow adaptation," Data & Knowledge Engineering (51:2), pp Nakamura, M., Kushida, T., Bhamidipaty, A., and Chetlur, M "A Multi-layered Architecture for Process Variation Management," in: World Conference on Services - II (SERVICES-2 '09). 
pp Namiri, K., and Stojanovic, N "Pattern-based design and validation of business process compliance," in: OTM 2007, Part I, R. Meersman and Z. Tari (eds.). pp Noy, N.F., and McGuinness, D.L "Ontology Development 101: A Guide to Creating Your First Ontology." Nunes, V.T., Santoro, F.M., and Borges, M.R.S "A context-based model for Knowledge Management embodied in work processes," Information Sciences (179:15), pp O'Connor, M.J., Knublauch, H., Tu, S.W., Grossof, B., Dean, M., Grosso, W.E., and Musen, M.A "Supporting Rule System Interoperability on the Semantic Web using SWRL and Jess," Fourth International Semantic Web Conference (ISWC2005), Galway, Ireland, pp Ohno-Machado, L.a.G., J.H. and Murphy, S.N. and Jain, N.L. and Tu, S.W. and Oliver, D.E. and Pattison-Gordon, E. and Greenes, R.A. and Shortliffe, E.H., Barnett, G "The Guideline Interchange Format (GLIF)," Journal of the American Medical Informatics Association (5:4), p 357. OMG "Business Process Modeling Notation (BPMN) Version 1.0. OMG Final Adopted Specification." Retrieved on May 16 th, 2012, from V1.0.pdf OMG "Business Process Model And Notation (BPMN) Version 2.0." Retrieved on May 16 th, 2012, from Ongenae, F., Backere, F.D., Steurbaut, K., Colpaert, K., Kerckhove, W., Decruyenaere, J., and Turck, F.D "Towards computerizing intensive care sedation guidelines: design of a rule-based architecture for automated execution of clinical guidelines," BMC Medical Informatics and Decision Making (10:3). 174

187 OpenClinical "Arden Syntax." Retrieved on April 9 th, 2012, from Osterweil, L.J., Avrunin, G.S., Chen, B., Clarke, L.A., Cobleigh, R., Henneman, E.A., and Henneman, P.L "Engineering Medical Processes to Improve Their Safety," IFIP International Federation for Information Processing, Geneva, Switzerland: Boston Springer, pp Parsia, B., and Sirin, E "Pellet: An OWL DL reasoner," in: Third International Semantic Web Conference-Poster. Peleg, M., and Tu, S "Decision support, knowledge representation and management in medicine," in: Yearbook of medical informatics. pp Peleg, M., Tu, S., Bury, J., Ciccarese, P., Fox, J., Greenes, R.A., Hall, R., Johnson, P.D., Jones, N., Kumar, A., Miksch, S., Quaglini, S., Seyfang, A., Shortliffe, E.H., and Stefanelli, M "Comparing computer-interpretable guideline models: A case-study approach," Journal of the American Medical Informatics Association (10:1), pp Peterson, J.L Petri net theory and the modeling of systems. Prentice Hall PTR Upper Saddle River, NJ, USA. Ploesser, K., Peleg, M., Soffer, P., Rosemann, M., and Recker, J.C "Learning from context to improve business processes," BPtrends (6:1), pp 1-7. Regev, G., Soffer, P., and Schmidt, R "Taxonomy of flexibility in business processes," in: Business Process Modeling, Development, and Support (BPMDS'06). pp Reichert, M., and Dadam, P "ADEPT flex supporting dynamic changes of workflows without losing control," Journal of Intelligent Information Systems (10:2), pp Rinderle, S., Reichert, M., and Dadam, P "Correctness criteria for dynamic changes in workflow systems a survey," Data & knowledge engineering (50:1), pp Rosemann, M., and Recker, J "Context-aware process design: Exploring the extrinsic drivers for process flexibility," Proceedings of the 18th International Conference on Advanced Information Systems Engineering (CAISE). 
Rosemann, M., Recker, J., and Flender, C "Contextualisation of business processes," International Journal of Business Process Integration and Management (3:1), pp Rosemann, M., and van der Aalst, W.M.P "A configurable reference modelling language," Information Systems (32:1), pp Sadiq, S., Governatori, G., and Namiri, K "Modeling Control Objectives for Business Process Compliance," BPM 2007, G. Alonso, P. Dadam and M. Rosemann (eds.), pp Sadiq, S.W., Orlowska, M.E., and Sadiq, W "Specification and validation of process constraints for flexible workflows," Information Systems (30:5), pp Schilit, B., Adams, N., and Want, R "Context-aware computing applications," in: First Workshop on Mobile Computing Systems and Applications, WMCSA pp Schnieders, A., and Puhlmann, F "Variability mechanisms in e-business process families," in: 9th International Conference on Business Information Systems (BIS 2006). pp Schonenberg, M.H., Mans, R.S., Russell, N.C., Mulyar, N.A., and van der Aalst, W.M.P "Towards a taxonomy of process flexibility (extended version)," BPM Center Report BPM Sell, C., and Springer, T "Context-sensitive adaptation of workflows," in: Proceedings of the doctoral symposium for ESEC/FSE on Doctoral symposium. Amsterdam, The Netherlands: ACM, pp Sirin, E., Parsia, B., Grau, B.C., Kalyanpur, A., and Katz, Y "Pellet: A Practical OWL-DL Reasoner," Web Semantics: Science, Services and Agents on the World Wide Web (5:2), pp Sittig, D.F., Wright, A., Osheroff, J.A., Middleton, B., Teich, J.M., Ash, J.S., Campbell, E., and Bates, D.W "Grand challenges in clinical decision support," Journal of Biomedical Informatics (41:2), pp Standford University "Protégé: An open source ontology editor and knowledge-based framework (version 3.4)." Retrieved on April 9 th, 2012, from 175

188 Tan, Y.H., and Thoen, W "Formal aspects of a generic model of trust for electronic commerce," Decision Support Systems (33:3), pp Teich, J.M., Osheroff, J.A., Pifer, E.A., Sittig, D.F., and Jenders, R.A "Clinical decision support in electronic prescribing: recommendations and an action plan," British Medical Journal (12:4), p 365. van der Aalst, W., Pesic, M., and Schonenberg, H. 2009a. "Declarative workflows: Balancing between flexibility and support," Computer Science - Research and Development (23:2), pp van der Aalst, W.M.P "The Application of Petri Nets to Workflow Management," The Journal of Circuits, Systems and Computers (8:1), pp van der Aalst, W.M.P., Dumas, M., Gottschalk, F., Hofstede, A.H.M., Rosa, M.L., and Mendling, J. 2009b. "Preserving correctness during business process model configuration," Formal Aspects of Computing (22:3), pp van der Aalst, W.M.P., Ter Hofstede, A.H.M., Kiepuszewski, B., and Barros, A.P "Workflow Patterns," Distributed and Parallel Databases (14:3), pp van der Aalst, W.M.P., Weske, M., and Grünbauer, D "Case handling: a new paradigm for business process support," Data & knowledge engineering (53:2), pp van Eijndhoven, T., Iacob, M.E., and Ponisio, M.L "Achieving business process flexibility with business rules," in: 12th International IEEE Enterprise Distributed Object Computing Conference. Munich, Germany: pp Van Hee, K., Lomazova, I., Oanea, O., Serebrenik, A., Sidorova, N., and Voorhoeve, M "Nested nets for adaptive systems," in: Proceedings of the 27th International Conference on Application and Theory of Petri Nets. 
pp van Hee, K., Oanea, O., Serebrenik, A., Sidorova, N., Voorhoeve, M., and Lomazova, I.A "Checking properties of adaptive workflow nets," Fundamenta Informaticae (79:3), pp Vanderfeesten, I., Cardoso, J., Mendling, J., Reijers, H.A., and van der Aalst, W "Quality metrics for business process models," BPM and Workflow handbook, pp Vieira, V., Tedesco, P., and Salgado, A.C "Designing context-sensitive systems: An integrated approach," Expert Systems with Applications. Walzer, K., Breddin, T., and Groch, M "Relative temporal constraints in the Rete algorithm for complex event detection," in: Proceedings of the second international conference on Distributed event-based systems. Rome, Italy: ACM, pp Wang, F., Liu, S., and Liu, P "Complex RFID event processing," The VLDB Journal (18), pp Wang, F., Liu, S., Liu, P., and Bai, Y "Bridging physical and virtual worlds: Complex event processing for RFID data streams," in: International Conference on Extending Database Technology (EDBT). Munich, Germany: pp Weber, B., and Reichert, M. "Refactoring Process Models in Large Process Repositories," in: Advanced Information Systems Engineering, Z. Bellahsène and M. Léonard (eds.). Berlin, Heidelberg: Springer Berlin Heidelberg, pp Weber, B., Reichert, M., and Rinderle-Ma, S "Change patterns and change support features enhancing flexibility in process-aware information systems," Data & Knowledge Engineering (66:3), pp Weber, B., Reichert, M., Rinderle-Ma, S., and Wild, W "Providing integrated life cycle support in proces-aware information systems," International Journal of Cooperative Information Systems (18:1), pp Wise, A "Little-JIL 1.0 Language Report. Technical report (UM-CS )," Department of Computer Science, University of Massachusetts, Amherst, MA. Yao, W "Hospital ontology." Retrieved on May 16 th, 2012, from Yao, W., Chu, C.-H., Kumar, A., and Li, Z "Using ontology to support context awareness in healthcare," in: Proceedings of the 19th WITS. Phoenix, Arizona: pp

189 Yao, W., Chu, C.H., and Li, Z "Leveraging Complex Event Processing for Smart Hospitals using RFID," Journal of Network and Computer Applications (34:3), pp Yao, W., and Kumar, A. 2012a. "CONFlexFlow: Integrating Flexible Clinical Pathways into Clinical Decision Support Systems using Context and Rules," Decision Support Systems (Special Issue on Healthcare modeling), forthcoming. Yao, W., and Kumar, A. 2012b. "Integrating clinical pathways into CDSS using context and rules: a case study in heart disease," Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium, New York, NY, USA: ACM, pp Ye, Y., Jiang, Z., Diao, X., Yang, D., and Du, G "An ontology-based hierarchical semantic modeling approach to clinical pathway workflows," Computers in Biology and Medicine (39:8), pp Zang, C., Fan, Y., and Liu, R "Architecture, implementation and application of complex event processing in enterprise information systems based on RFID," Information Systems Frontiers (10:5), pp Zhao, X., and Liu, C "Version Management in the Business Process Change Context " in: 5th International Conference on Business Process Management (BPM'07), G. Alonso, P. Dadam and M. Rosemann (eds.). Brisbane, Australia: pp Zhu, L., Osterweil, L.J., Staples, M., and Kannengiesser, U "Challenges Observed in the Definition of Reference Business Processes," in: BPM 2007 Workshops. pp Zimmermann, A., Lorenz, A., and Oppermann, R "An operational definition of context," in: Proceedings of the 6th international and interdisciplinary conference on Modeling and using context. Roskilde, Denmark: Springer-Verlag, pp

APPENDIX A

A SINGLE MODEL THAT INTEGRATES RULES R1-R6

Note: some tasks have to be repeated because there is no easy way to model all scenarios using control nodes.

Notation:
T1: Receive claim
T2: Validate claim
T3: Review damage 1
T3-2: Review damage 2
T4: Receive report
T5: Determine settlement
T6: Approval 1 (by manager)
T7: Approval 2 (by senior manager)
T7-2: Approval 4 (by VP)
T8: Make payment
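The duplication noted above is what rule-based materialization avoids: instead of encoding every branch with control nodes, each process instance's task sequence can be derived from rules over its context. The sketch below illustrates this idea for the claim tasks listed in the notation. The thresholds and branching conditions are purely illustrative assumptions, not the dissertation's actual rules R1-R6, which are defined in the main text.

```python
# Illustrative sketch (not the actual rules R1-R6): materialize a
# claim-handling task sequence from context-dependent rules rather
# than a single control-flow model with duplicated tasks.

def materialize_claim_process(claim):
    """Return the ordered task list for one claim instance."""
    tasks = ["T1: Receive claim", "T2: Validate claim", "T3: Review damage 1"]
    # Hypothetical rule: larger claims need a second, independent review.
    if claim["amount"] > 10_000:
        tasks.append("T3-2: Review damage 2")
    tasks += ["T4: Receive report", "T5: Determine settlement"]
    # Hypothetical rules: approval level escalates with claim amount.
    tasks.append("T6: Approval 1 (by manager)")
    if claim["amount"] > 10_000:
        tasks.append("T7: Approval 2 (by senior manager)")
    if claim["amount"] > 100_000:
        tasks.append("T7-2: Approval 4 (by VP)")
    tasks.append("T8: Make payment")
    return tasks

print(materialize_claim_process({"amount": 150_000}))
```

Because each condition is stated once as a rule, a high-value claim simply triggers additional tasks; a pure control-node model would instead have to enumerate every combination of branches, which is why tasks such as the reviews and approvals appear more than once in the integrated model.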

APPENDIX B

THE MEDICAL PLAN FOR HEART ATTACK IN FLOW CHART

Source: Refer to for the legend


More information

1.1 Jadex - Engineering Goal-Oriented Agents

1.1 Jadex - Engineering Goal-Oriented Agents 1.1 Jadex - Engineering Goal-Oriented Agents In previous sections of the book agents have been considered as software artifacts that differ from objects mainly in their capability to autonomously execute

More information

Demonstrating Context-aware Process Injection with the CaPI Tool

Demonstrating Context-aware Process Injection with the CaPI Tool Demonstrating Context-aware Process Injection with the CaPI Tool Klaus Kammerer, Nicolas Mundbrod, and Manfred Reichert Institute of Databases and ation Systems Ulm University, Germany {klaus.kammerer,

More information

Certification for Meaningful Use Experiences and Observations from the Field June 2011

Certification for Meaningful Use Experiences and Observations from the Field June 2011 Certification for Meaningful Use Experiences and Observations from the Field June 2011 Principles for Certification to Support Meaningful Use Certification should promote EHR adoption by giving providers

More information

Chapter 1 Introduction

Chapter 1 Introduction Chapter 1 Introduction We hardly need to point out the importance of business process modelling and of respective automation in this place (see, e.g. [39, 45, 58, 110, 141]). Also the advantages and shortcomings

More information

An Ontological Framework for Contextualising Information in Hypermedia Systems.

An Ontological Framework for Contextualising Information in Hypermedia Systems. An Ontological Framework for Contextualising Information in Hypermedia Systems. by Andrew James Bucknell Thesis submitted for the degree of Doctor of Philosophy University of Technology, Sydney 2008 CERTIFICATE

More information

Component-Based Software Engineering TIP

Component-Based Software Engineering TIP Component-Based Software Engineering TIP X LIU, School of Computing, Napier University This chapter will present a complete picture of how to develop software systems with components and system integration.

More information

Proposed Revisions to ebxml Technical Architecture Specification v ebxml Business Process Project Team

Proposed Revisions to ebxml Technical Architecture Specification v ebxml Business Process Project Team 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 Proposed Revisions to ebxml Technical Architecture Specification v1.0.4 ebxml Business Process Project Team 11

More information

Configuration Management for Component-based Systems

Configuration Management for Component-based Systems Configuration Management for Component-based Systems Magnus Larsson Ivica Crnkovic Development and Research Department of Computer Science ABB Automation Products AB Mälardalen University 721 59 Västerås,

More information

The Automatic Design of Batch Processing Systems

The Automatic Design of Batch Processing Systems The Automatic Design of Batch Processing Systems by Barry Dwyer, M.A., D.A.E., Grad.Dip. A thesis submitted for the degree of Doctor of Philosophy in the Department of Computer Science University of Adelaide

More information

The Impact of SOA Policy-Based Computing on C2 Interoperation and Computing. R. Paul, W. T. Tsai, Jay Bayne

The Impact of SOA Policy-Based Computing on C2 Interoperation and Computing. R. Paul, W. T. Tsai, Jay Bayne The Impact of SOA Policy-Based Computing on C2 Interoperation and Computing R. Paul, W. T. Tsai, Jay Bayne 1 Table of Content Introduction Service-Oriented Computing Acceptance of SOA within DOD Policy-based

More information

A declarative meta modeling approach to define process migration constraints

A declarative meta modeling approach to define process migration constraints A declarative meta modeling approach to define process migration constraints Bram Leemburg, s1398334 Master thesis Software Engineering & Distributed Systems University of Groningen Supervisor: prof. dr.

More information

Automated Planning for Open Network Architectures

Automated Planning for Open Network Architectures UNIVERSITY OF CALIFORNIA Los Angeles Automated Planning for Open Network Architectures A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer

More information

Business Process Management Seminar 2007/ Oktober 2007

Business Process Management Seminar 2007/ Oktober 2007 Business Process Management Seminar 2007/2008 22. Oktober 2007 Process 2 Today Presentation of topics Deadline 29.10.2007 9:00 Rank up to 3 topics - send to hagen.overdick@hpi.uni-potsdam.de 3.12.2007

More information

Software Architectures. Lecture 6 (part 1)

Software Architectures. Lecture 6 (part 1) Software Architectures Lecture 6 (part 1) 2 Roadmap of the course What is software architecture? Designing Software Architecture Requirements: quality attributes or qualities How to achieve requirements

More information

Business Process Management for Complex Document Management

Business Process Management for Complex Document Management Business Process Management for Complex Document Management An Integration Approach Filipe Mendes Correia Instituto Superior Técnico Portugal filipe.m.correia@tecnico.ulisboa.pt ABSTRACT Documents are

More information

Towards Automated Process Modeling based on BPMN Diagram Composition

Towards Automated Process Modeling based on BPMN Diagram Composition Towards Automated Process Modeling based on BPMN Diagram Composition Piotr Wiśniewski, Krzysztof Kluza and Antoni Ligęza AGH University of Science and Technology al. A. Mickiewicza 30, 30-059 Krakow, Poland

More information

The Model-Driven Semantic Web Emerging Standards & Technologies

The Model-Driven Semantic Web Emerging Standards & Technologies The Model-Driven Semantic Web Emerging Standards & Technologies Elisa Kendall Sandpiper Software March 24, 2005 1 Model Driven Architecture (MDA ) Insulates business applications from technology evolution,

More information

Context-Awareness and Adaptation in Distributed Event-Based Systems

Context-Awareness and Adaptation in Distributed Event-Based Systems Context-Awareness and Adaptation in Distributed Event-Based Systems Eduardo S. Barrenechea, Paulo S. C. Alencar, Rolando Blanco, Don Cowan David R. Cheriton School of Computer Science University of Waterloo

More information

Declarative Workflows

Declarative Workflows Informatik - Forschung und Entwicklung manuscript No. (will be inserted by the editor) W.M.P. van der Aalst M. Pesic H. Schonenberg Declarative Workflows Balancing Between Flexibility and Support Eingegangen:

More information

Request for Information Technical Response White Paper for Joint Operational Medicine Information Systems (JOMIS)

Request for Information Technical Response White Paper for Joint Operational Medicine Information Systems (JOMIS) Request for Information Technical Response White Paper for Joint Operational Medicine Information Systems (JOMIS) Ensuring Continuity of Data 5 May 2017 Prepared by: Northrop Grumman Systems Corporation

More information

Recommended Practice for Software Requirements Specifications (IEEE)

Recommended Practice for Software Requirements Specifications (IEEE) Recommended Practice for Software Requirements Specifications (IEEE) Author: John Doe Revision: 29/Dec/11 Abstract: The content and qualities of a good software requirements specification (SRS) are described

More information

WHY WE NEED AN XML STANDARD FOR REPRESENTING BUSINESS RULES. Introduction. Production rules. Christian de Sainte Marie ILOG

WHY WE NEED AN XML STANDARD FOR REPRESENTING BUSINESS RULES. Introduction. Production rules. Christian de Sainte Marie ILOG WHY WE NEED AN XML STANDARD FOR REPRESENTING BUSINESS RULES Christian de Sainte Marie ILOG Introduction We are interested in the topic of communicating policy decisions to other parties, and, more generally,

More information

An Approach to Evaluate and Enhance the Retrieval of Web Services Based on Semantic Information

An Approach to Evaluate and Enhance the Retrieval of Web Services Based on Semantic Information An Approach to Evaluate and Enhance the Retrieval of Web Services Based on Semantic Information Stefan Schulte Multimedia Communications Lab (KOM) Technische Universität Darmstadt, Germany schulte@kom.tu-darmstadt.de

More information

20. Business Process Analysis (2)

20. Business Process Analysis (2) 20. Business Process Analysis (2) DE + IA (INFO 243) - 31 March 2008 Bob Glushko 1 of 38 3/31/2008 8:00 AM Plan for Today's Class Process Patterns at Different Levels in the "Abstraction Hierarchy" Control

More information

Distributed Hybrid MDM, aka Virtual MDM Optional Add-on, for WhamTech SmartData Fabric

Distributed Hybrid MDM, aka Virtual MDM Optional Add-on, for WhamTech SmartData Fabric Distributed Hybrid MDM, aka Virtual MDM Optional Add-on, for WhamTech SmartData Fabric Revision 2.1 Page 1 of 17 www.whamtech.com (972) 991-5700 info@whamtech.com August 2018 Contents Introduction... 3

More information

Increasing Interoperability, what is the Impact on Reliability? Illustrated with Health care examples

Increasing Interoperability, what is the Impact on Reliability? Illustrated with Health care examples Illustrated with Health care examples by Gerrit Muller University of South-Eastern Norway-NISE e-mail: gaudisite@gmail.com www.gaudisite.nl Abstract In all domains the amount of interoperability between

More information

Digital Imaging and Communications in Medicine (DICOM) Part 1: Introduction and Overview

Digital Imaging and Communications in Medicine (DICOM) Part 1: Introduction and Overview Digital Imaging and Communications in Medicine (DICOM) Part 1: Introduction and Overview Published by National Electrical Manufacturers Association 1300 N. 17th Street Rosslyn, Virginia 22209 USA Copyright

More information

Collaboration and Interoperability Support for Agile Enterprises in a Networked World

Collaboration and Interoperability Support for Agile Enterprises in a Networked World Collaboration and Interoperability Support for Agile Enterprises in a Networked World Emerging Scenarios, Research Challenges, Enabling Technologies Manfred Reichert Manfred Reichert IWEI 13 Keynote 28

More information

ASSURING DATA INTEROPERABILITY THROUGH THE USE OF FORMAL MODELS OF VISA PAYMENT MESSAGES (Category: Practice-Oriented Paper)

ASSURING DATA INTEROPERABILITY THROUGH THE USE OF FORMAL MODELS OF VISA PAYMENT MESSAGES (Category: Practice-Oriented Paper) ASSURING DATA INTEROPERABILITY THROUGH THE USE OF FORMAL MODELS OF VISA PAYMENT MESSAGES (Category: Practice-Oriented Paper) Joseph Bugajski Visa International JBugajsk@visa.com Philippe De Smedt Visa

More information

Event Metamodel and Profile (EMP) Proposed RFP Updated Sept, 2007

Event Metamodel and Profile (EMP) Proposed RFP Updated Sept, 2007 Event Metamodel and Profile (EMP) Proposed RFP Updated Sept, 2007 Robert Covington, CTO 8425 woodfield crossing boulevard suite 345 indianapolis in 46240 317.252.2636 Motivation for this proposed RFP 1.

More information

VARIABILITY MODELING FOR CUSTOMIZABLE SAAS APPLICATIONS

VARIABILITY MODELING FOR CUSTOMIZABLE SAAS APPLICATIONS VARIABILITY MODELING FOR CUSTOMIZABLE SAAS APPLICATIONS ABSTRACT Ashraf A. Shahin 1, 2 1 College of Computer and Information Sciences, Al Imam Mohammad Ibn Saud Islamic University (IMSIU) Riyadh, Kingdom

More information

Response to the. ESMA Consultation Paper:

Response to the. ESMA Consultation Paper: Response to the ESMA Consultation Paper: Draft technical standards on access to data and aggregation and comparison of data across TR under Article 81 of EMIR Delivered to ESMA by Tahoe Blue Ltd January

More information

Overview of ABET Kent Hamlin Director Institute of Nuclear Power Operations Commissioner TAC of ABET

Overview of ABET Kent Hamlin Director Institute of Nuclear Power Operations Commissioner TAC of ABET Overview of ABET Kent Hamlin Director Institute of Nuclear Power Operations Commissioner TAC of ABET 1 st National Meeting on Improving Education and Training For Chinese Nuclear Power Industry Personnel

More information

The Open Group SOA Ontology Technical Standard. Clive Hatton

The Open Group SOA Ontology Technical Standard. Clive Hatton The Open Group SOA Ontology Technical Standard Clive Hatton The Open Group Releases SOA Ontology Standard To Increase SOA Adoption and Success Rates Ontology Fosters Common Understanding of SOA Concepts

More information

ISO/IEC/ IEEE INTERNATIONAL STANDARD. Systems and software engineering Architecture description

ISO/IEC/ IEEE INTERNATIONAL STANDARD. Systems and software engineering Architecture description INTERNATIONAL STANDARD ISO/IEC/ IEEE 42010 First edition 2011-12-01 Systems and software engineering Architecture description Ingénierie des systèmes et des logiciels Description de l'architecture Reference

More information

USAGE PROFILES FOR SYSTEM REQUIREMENTS

USAGE PROFILES FOR SYSTEM REQUIREMENTS USAGE PROFILES FOR SYSTEM REQUIREMENTS Understanding how the customer uses the system, and how its behavior deviates from the expected (and designed) behavior, is the main question that Philips MR wanted

More information

10 Steps to Building an Architecture for Space Surveillance Projects. Eric A. Barnhart, M.S.

10 Steps to Building an Architecture for Space Surveillance Projects. Eric A. Barnhart, M.S. 10 Steps to Building an Architecture for Space Surveillance Projects Eric A. Barnhart, M.S. Eric.Barnhart@harris.com Howard D. Gans, Ph.D. Howard.Gans@harris.com Harris Corporation, Space and Intelligence

More information

IJESMR International Journal OF Engineering Sciences & Management Research

IJESMR International Journal OF Engineering Sciences & Management Research COMPARISON OF BUSINESS PROCESS MODELING STANDARDS Katalina Grigorova * 1, Kaloyan Mironov 2 *1 Department of Informatics and Information Technologies, University of Ruse, Bulgaria 2 Department of Informatics

More information

Domain-specific Concept-based Information Retrieval System

Domain-specific Concept-based Information Retrieval System Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical

More information

This Document is licensed to

This Document is licensed to GAMP 5 Page 291 End User Applications Including Spreadsheets 1 Introduction This appendix gives guidance on the use of end user applications such as spreadsheets or small databases in a GxP environment.

More information

IHE Radiation Oncology Technical Framework Supplement. Treatment Delivery Workflow (TDW) Draft for Public Comment

IHE Radiation Oncology Technical Framework Supplement. Treatment Delivery Workflow (TDW) Draft for Public Comment Integrating the Healthcare Enterprise IHE Radiation Oncology Technical Framework Supplement Treatment Delivery Workflow (TDW) Draft for Public Comment Date: January 29, 2010 Author: David Murray Email:

More information

The HUMANE roadmaps towards future human-machine networks Oxford, UK 21 March 2017

The HUMANE roadmaps towards future human-machine networks Oxford, UK 21 March 2017 The HUMANE roadmaps towards future human-machine networks Oxford, UK 21 March 2017 Eva Jaho, ATC e.jaho@atc.gr 1 Outline HMNs Trends: How are HMNs evolving? The need for future-thinking and roadmaps of

More information

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 9 Database Design

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 9 Database Design Database Systems: Design, Implementation, and Management Tenth Edition Chapter 9 Database Design Objectives In this chapter, you will learn: That successful database design must reflect the information

More information

From IHE Audit Trails to XES Event Logs Facilitating Process Mining

From IHE Audit Trails to XES Event Logs Facilitating Process Mining 40 Digital Healthcare Empowering Europeans R. Cornet et al. (Eds.) 2015 European Federation for Medical Informatics (EFMI). This article is published online with Open Access by IOS Press and distributed

More information

Test Cases Generation from UML Activity Diagrams

Test Cases Generation from UML Activity Diagrams Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing Test Cases Generation from UML Activity Diagrams Hyungchoul Kim, Sungwon

More information

Incompatibility Dimensions and Integration of Atomic Commit Protocols

Incompatibility Dimensions and Integration of Atomic Commit Protocols The International Arab Journal of Information Technology, Vol. 5, No. 4, October 2008 381 Incompatibility Dimensions and Integration of Atomic Commit Protocols Yousef Al-Houmaily Department of Computer

More information

IBM Research Report. Model-Driven Business Transformation and Semantic Web

IBM Research Report. Model-Driven Business Transformation and Semantic Web RC23731 (W0509-110) September 30, 2005 Computer Science IBM Research Report Model-Driven Business Transformation and Semantic Web Juhnyoung Lee IBM Research Division Thomas J. Watson Research Center P.O.

More information

Preservation Planning in the OAIS Model

Preservation Planning in the OAIS Model Preservation Planning in the OAIS Model Stephan Strodl and Andreas Rauber Institute of Software Technology and Interactive Systems Vienna University of Technology {strodl, rauber}@ifs.tuwien.ac.at Abstract

More information

Vendor: The Open Group. Exam Code: OG Exam Name: TOGAF 9 Part 1. Version: Demo

Vendor: The Open Group. Exam Code: OG Exam Name: TOGAF 9 Part 1. Version: Demo Vendor: The Open Group Exam Code: OG0-091 Exam Name: TOGAF 9 Part 1 Version: Demo QUESTION 1 According to TOGAF, Which of the following are the architecture domains that are commonly accepted subsets of

More information

Taming Rave: How to control data collection standards?

Taming Rave: How to control data collection standards? Paper DH08 Taming Rave: How to control data collection standards? Dimitri Kutsenko, Entimo AG, Berlin, Germany Table of Contents Introduction... 1 How to organize metadata... 2 How to structure metadata...

More information

SNOMED Clinical Terms

SNOMED Clinical Terms Representing clinical information using SNOMED Clinical Terms with different structural information models KR-MED 2008 - Phoenix David Markwell Laura Sato The Clinical Information Consultancy Ltd NHS Connecting

More information

Working with Health IT Systems is available under a Creative Commons Attribution-NonCommercial- ShareAlike 3.0 Unported license.

Working with Health IT Systems is available under a Creative Commons Attribution-NonCommercial- ShareAlike 3.0 Unported license. Working with Health IT Systems is available under a Creative Commons Attribution-NonCommercial- ShareAlike 3.0 Unported license. Johns Hopkins University. Welcome to Quality Improvement: Data Quality Improvement.

More information

Ch 4: Requirements Engineering. What are requirements?

Ch 4: Requirements Engineering. What are requirements? Ch 4: Engineering What are? Functional and non-functional The software document specification engineering processes elicitation and analysis validation management The descriptions of what the system should

More information

DRS Policy Guide. Management of DRS operations is the responsibility of staff in Library Technology Services (LTS).

DRS Policy Guide. Management of DRS operations is the responsibility of staff in Library Technology Services (LTS). Harvard University Library Office for Information Systems DRS Policy Guide This Guide defines the policies associated with the Harvard Library Digital Repository Service (DRS) and is intended for Harvard

More information

Browsing the World in the Sensors Continuum. Franco Zambonelli. Motivations. all our everyday objects all our everyday environments

Browsing the World in the Sensors Continuum. Franco Zambonelli. Motivations. all our everyday objects all our everyday environments Browsing the World in the Sensors Continuum Agents and Franco Zambonelli Agents and Motivations Agents and n Computer-based systems and sensors will be soon embedded in everywhere all our everyday objects

More information

Executive Summary for deliverable D6.1: Definition of the PFS services (requirements, initial design)

Executive Summary for deliverable D6.1: Definition of the PFS services (requirements, initial design) Electronic Health Records for Clinical Research Executive Summary for deliverable D6.1: Definition of the PFS services (requirements, initial design) Project acronym: EHR4CR Project full title: Electronic

More information

3.4 Data-Centric workflow

3.4 Data-Centric workflow 3.4 Data-Centric workflow One of the most important activities in a S-DWH environment is represented by data integration of different and heterogeneous sources. The process of extract, transform, and load

More information

Health Information Exchange Content Model Architecture Building Block HISO

Health Information Exchange Content Model Architecture Building Block HISO Health Information Exchange Content Model Architecture Building Block HISO 10040.2 To be used in conjunction with HISO 10040.0 Health Information Exchange Overview and Glossary HISO 10040.1 Health Information

More information

BPMN Working Draft. 1. Introduction

BPMN Working Draft. 1. Introduction 1. Introduction The Business Process Management Initiative (BPMI) has developed a standard Business Process Modeling Notation (BPMN). The primary goal of BPMN is to provide a notation that is readily understandable

More information

Designing Secure Medical Devices

Designing Secure Medical Devices Rick Brooks Director of Systems, Software, and Electrical Engineering Designing Secure Medical Devices 1 Copyright 2018 Battelle Memorial Institute. Permission granted to INCOSE to publish and use. About

More information

Information Security and Service Management. Security and Risk Management ISSM and ITIL/ITSM Interrelationship

Information Security and Service Management. Security and Risk Management ISSM and ITIL/ITSM Interrelationship Information Security and Service Management for Management better business for State outcomes & Local Governments Security and Risk Management ISSM and ITIL/ITSM Interrelationship Introduction Over the

More information

Developing an integrated e-health system in Estonia

Developing an integrated e-health system in Estonia Developing an integrated e-health system in Estonia Box 1 What problems did the initiative seek to address? Fragmented flow of information between health providers. Poor management of the growing number

More information

Medical Device Innovations: Welcome to the Future

Medical Device Innovations: Welcome to the Future Medical Device Innovations: Welcome to the Future Sonali Gunawardhana, Of Counsel, Shook, Hardy & Bacon LLP Bakul Patel, Associate Director for Digital Health, CDRH, FDA Zachary Rothstein, Associate Vice

More information

Vocabulary-Driven Enterprise Architecture Development Guidelines for DoDAF AV-2: Design and Development of the Integrated Dictionary

Vocabulary-Driven Enterprise Architecture Development Guidelines for DoDAF AV-2: Design and Development of the Integrated Dictionary Vocabulary-Driven Enterprise Architecture Development Guidelines for DoDAF AV-2: Design and Development of the Integrated Dictionary December 17, 2009 Version History Version Publication Date Author Description

More information