BPM optimization
Part 1: Introduction to BPM code optimization
XPDL source code tuning

4/12/2012
IBM
Lukasz Osuszek

Abstract: This article describes how techniques characteristic of source code optimization and tuning can be applied to BPM. An example of applying such tuning methods to BPM is presented, resulting in better performance and cost savings.

About the author: Lukasz Osuszek works at IBM in ECM technology pre-sales, as an architect and in technical support for IBM FileNet P8. He has eight years of experience in the ECM area and four years of experience in the IBM Polish Software Group. Reach out to him at lukasz.osuszek@pl.ibm.com
BPM code

BPM systems are a type of high-level programming tool, and workflow map optimization (optimization of the algorithms executed by process engines) may be treated as code optimization. It is subject to the same laws and dependencies as the optimization of algorithms written in Java or C++. Process engines (such as the P8 BPM Process Engine) interpret the algorithm of a workflow map recorded in a graphic form (XPDL). This three-part article presents ways in which traditional algorithm optimization techniques may be applied to business process optimization in a BPM environment.

Performance optimization is usually considered the final phase of application code streamlining. It is more accurate, however, to treat code optimization as a complex process comprising several phases, and better results are achieved when optimization is used early in the project. Yet we should not get paranoid and try to optimize each line of the code. A holistic approach is preferred here, together with identifying weaknesses and so-called bottlenecks with reference to the project as a whole. Once a problem is identified, it should not be addressed right away. Instead, the subsequent steps in the algorithm should be analyzed and a comprehensive view of the problem should be adopted. Sometimes the right solution lies one step ahead. The optimization methods described here follow a top-down approach, from a higher level of abstraction towards the lower, more detailed one.

This article should help business consultants and P8 engineers better analyze and optimize workflow maps. After reading this material it should be easier to spot the patterns and workflow map fragments that can be optimized according to the guidelines from this article. Readers should consider reviewing and analyzing their existing workflow maps for BPM code optimization.

Optimization process

1. Problem identification and understanding.
We should start by identifying the process which is most inefficient. The analysis phase can be supported by data from IBM Business Activity Monitor (BAM). After finding the candidate for optimization, we need to find the culprit, i.e. determine the place and the reason for the business process slowdown. In order to locate a bottleneck, it is best to use one of the BPM system simulation and analysis programs available on the market. P8 BPM Process Analyzer (now Case Analyzer) and Process Simulator are among the most popular.

The "divide and conquer" method is well suited to business process analysis and profiling. It consists of dividing a given task (map/algorithm) into several smaller parts that are further subdivided until the problem is expressed as a set of basic sub-problems. Once this is done, we can solve the sub-problems and then put them back together to get the final result. After locating the problem we can start the process of optimization.

2. Algorithm
The first step consists of a thorough analysis and monitoring of the performance of the algorithm's blocks. In the case of BPM this is equivalent to understanding the logic of a given business process. Without a "feel" for how the algorithm functions, further optimization work does not make much sense. What seems most effective is a holistic approach that treats the algorithm as a whole and not just as a sum of its individual parts.

3. Code

The next phase consists of checking the implementation of the algorithm (i.e. the process). This requires taking a deeper look into the logic of subsequent process steps: the routes and their related logical conditions, as well as the loops, comparisons, text data operations and other components assigned to subsequent steps of the workflow map.

Figure 1. Example of a conditional loop.

In our first approach we should create code that is as simple as possible. This will constitute the "initial" version of the code we will be working on. For more complicated algorithms, the "divide and conquer" method is used here again to divide them into less complicated parts. This makes the algorithm's functioning and implementation easier to understand. It also facilitates the work of the compiler.

Figure 2. Top-level process divided into sub-processes.

After providing a theoretical background, it is now time to get to the practical aspects of optimization. Below you will find some practical guidelines concerning code optimization:

- Do not change too much at one go
- Do not expect dramatic improvement in application performance after a single fix
- Test and compare the existing techniques/algorithms, or those proposed by others, with your own
- Remember that sometimes weaker performance in one area may result in increased performance of the whole application
- Always try to be open-minded: the more solutions you consider, the better your results may be
- Make use of every piece of information or technical expertise concerning the problem you are working on
- Look at the problem as a whole from time to time so that you can focus on the actual issue to be solved. Sometimes solving a problem in one area may cause trouble somewhere else, so you need to monitor the whole application from time to time.

Optimization methods

Application optimization is an iterative process, so it is important to decide when it should be concluded. This is an easy task if the application's required performance level is determined up front. For example, if according to the project guidelines image transfer from the server to the customer must take less than 3 seconds, the process of optimization is complete after reaching that threshold. However, most tasks have no clear requirements of this type, and sometimes "the faster, the better" approach leaves the business analyst lost in a never-ending optimization process. Generally speaking, the problem should be approached in a reasonable and individual manner.

There are two types of optimization: code optimization in the narrow sense, and optimization techniques.

1. Code optimization in the narrow sense. When writing code, we can always solve a given task in several ways. Some of them will significantly improve performance. This passive type of optimization is the so-called "code style" optimization.
2. Active optimization means using optimization techniques to eliminate bottlenecks in the application.

A set of optimization techniques which are also applicable when working with workflow paths defined for the needs of BPM is presented below.
Using variables

In traditional programming environments, local variables should be chosen over variables of wider scope. Local variables are those passed as parameters or declared internally within a function/procedure. Only local variables may be turned into register variables, and a register variable means greater speed! Sometimes it is better to copy global data into local variables before using them. This technique is best used for loop variables. Such copies make access to the variables faster, which drives better performance. In the case of P8 BPM and maps created with Process Designer, however, the only available option is to use global variables.

Yet the rule saying that local variables should be chosen over global ones has one exception. This applies to arrays of simple types. If the size and components of the array
are constant, declaring the array global will save register work. These savings become significant if we are dealing with defined, constant structures. In P8 BPM, the whole set of process variables is stored in database structures. Variables declared as an array are stored in a separate array dedicated to complex formats. Thanks to external tools that trace the index database load, we can check which approach (several simple variables or an array of such types) is better in a given case.

Loops

In the case of small loops in the algorithm, it is worth using the "loop shortening" method. Generally speaking, it consists of doing within a single pass through the loop what originally required several iterations. This reduces the costs related to the loop overhead. I recommend this technique for loops whose overhead is always costly. Examples are provided below:

Figure 3. A for loop consisting of three steps

The first step checks the condition for continuing the loop: it reads the value of the condition parameter and follows a given route, depending on the truth value of the logical condition.

Figure 4. for_begin step assignment

The "create folders" step is the body of the loop. It is aimed at creating a particular folder structure.
Figure 5. Loop body

The last step increases the loop counter by one.

Figure 6. Loop counter increasing

After the loop shortening operation is completed, the code will look as follows. The "create folder" procedure is invoked a second time within the create folders step, with a modified value of the proparray parameters:

    {"FolderName", "STRING", "Folder"+convert(counter+1, string)}

The for_end step has been modified so that it increases the loop counter by two:

    Counter = Counter + 2

According to the conducted analyses, the "loop shortening" method stops being cost-effective at a ratio greater than 4.

Since we have already mentioned loops, another good practice is to avoid conditional expressions and the testing of logical conditions within them. Most logical tests within the loop can be eliminated by SPP or by dividing the loop into two or more loops.

One of the most important optimization techniques related to loops is reducing the number of loop conditions. Using loops which are based on several logical conditions is a frequent programming strategy. For instance, the loop continues as long as a given condition is true and the loop index is lower than a certain preset value. In the case of small loops (often consisting solely of the loop index increment), the total cost of the loop is comparable to the cost of verifying the loop conditions. Reducing the number of loop conditions almost
always results in increased algorithm performance. The algorithm that searches for a given character in a string is a fundamental example of this regularity:

    i := 1;
    l := Length(s);
    while (i <= l) and (s[i] <> c) do
      inc(i);

Note: Placing the wanted character at the end of the string (which is an example of using a so-called sentinel) combines the two conditions into one:

Figure 7. Adding a sentinel to the while loop

As a result, the loop tests one condition per iteration instead of two, which can nearly double the algorithm's speed. This technique is very often used together with the SPP.

Case design

A lot can be achieved by optimizing algorithms containing the Case expression. This expression is quite complicated for the compiler and requires much work. First, the list of values/ranges is sorted (which means that the order in which the conditions are written is irrelevant). Then the compiler uses a binary comparison tree and the constructed jump address table to check the conditions of the case expression. The algorithm is repeated until all the cases are processed. As you can see, the structure of the tree is crucial for optimization. Therefore, if we are able to group conditions, it is good to create separate case designs for each of the ranges.
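The sorting and binary comparison described above can be illustrated with a minimal Python sketch. It is not how any particular compiler or the P8 Process Engine implements case handling. The names build_case_table and run_case are hypothetical, and a sorted list searched with the standard bisect module stands in for the binary comparison tree.

```python
from bisect import bisect_left

def build_case_table(cases):
    """Sort (label, handler) pairs once, so the written order of cases does not matter."""
    ordered = sorted(cases, key=lambda pair: pair[0])
    labels = [label for label, _ in ordered]
    handlers = [handler for _, handler in ordered]
    return labels, handlers

def run_case(table, x):
    """Find the handler for x with a binary search over the sorted labels."""
    labels, handlers = table
    pos = bisect_left(labels, x)  # roughly log2(n) comparisons
    if pos < len(labels) and labels[pos] == x:
        return handlers[pos]()
    return None  # no label matches x

# Hypothetical handlers, deliberately listed out of order.
table = build_case_table([
    (102, lambda: "proc3"),
    (200, lambda: "proc9"),
    (100, lambda: "proc1"),
])
```

Because the labels are sorted before the search, the number of comparisons depends on how many labels share one table, which is why splitting widely separated ranges into their own tables can pay off.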
Figure 8. Case structure representation in XPDL

Example:

    case x of
      100 : proc1;
      101 : proc2;
      102 : proc3;
      103 : proc4;
      104 : proc5;
      105 : proc6;
      200 : proc9;
      201 : proc10;
      202 : proc11;
      203 : proc12;
      204 : proc13;
    end;

Should be replaced with:

    case x of
      100..105 :
        case x of
          100 : proc1;
          101 : proc2;
          102 : proc3;
          103 : proc4;
          104 : proc5;
          105 : proc6;
        end;
      200..204 :
        case x of
          200 : proc9;
          201 : proc10;
          202 : proc11;
          203 : proc12;
          204 : proc13;
        end;
    end;
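The same grouping idea can be sketched in Python, which has no ranged case statement. This is an illustration under assumptions, not a translation of the XPDL map: the handlers returned by make_proc are hypothetical stand-ins for proc1 to proc6 and proc9 to proc13, and a dictionary per range plays the role of the inner case.

```python
def make_proc(name):
    """Build a hypothetical handler that just reports its own name."""
    def proc():
        return name
    return proc

# Inner "case" tables, one per contiguous range of values.
LOW = {100 + i: make_proc(f"proc{i + 1}") for i in range(6)}   # 100..105 -> proc1..proc6
HIGH = {200 + i: make_proc(f"proc{i + 9}") for i in range(5)}  # 200..204 -> proc9..proc13

def dispatch(x):
    """Two-level dispatch: choose the range first, then the exact value."""
    if 100 <= x <= 105:
        return LOW[x]()
    if 200 <= x <= 204:
        return HIGH[x]()
    return None  # no matching case, like a case expression without an else branch
```

Calling dispatch(103) runs the handler for proc4 after one range check and one lookup in the small table for 100..105. Values outside both ranges are rejected after two range checks, without touching either table.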
Summary

Optimization is one of the most important branches of software engineering. The process of optimization often stands in opposition to other processes and goals of software engineering. As creators, we must choose between system stability, compactness, portability and speed. Nevertheless, at the level of the code, represented here by the XPDL diagram, optimization is desirable and beneficial. By combining the advice presented in this material with tools for process simulation, it is easy to demonstrate the value of workflow map optimization. I encourage readers to experiment with BPM code optimization and to analyze the results with process simulators. Such an approach makes it possible to verify the effectiveness of the optimization.

More optimization techniques are described in the rest of this article series. The further parts introduce innovative methods of business process optimization:

- Conversion of XPDL workflows into Petri net modeling notation for optimization of time consumption, which results in tangible savings and economic benefits.
- Workflow map optimization using multi-objective algorithms. Applying multi-objective optimization algorithms enables fully automated optimization of the workflow map. The mathematical model of the business process may be subject to specific multi-criteria optimization algorithms.

Resources

Heiko Falk, Peter Marwedel, Source code optimization techniques for data flow dominated embedded software, 2004
www.4programmers.net
www.dyszla.aplus.pl