Design Level Performance Modeling of Component-based Applications. Yan Liu, Alan Fekete School of Information Technologies University of Sydney


Design Level Performance Modeling of Component-based Applications
Technical Report Number 543, November 2003
Yan Liu, Alan Fekete, School of Information Technologies, University of Sydney
Ian Gorton, Pacific Northwest National Laboratory, Richland WA, USA
ISBN 1 86487 615 8
School of Information Technologies, University of Sydney NSW 2006

Design Level Performance Modeling of Component-based Applications

Yan Liu 1, Alan Fekete 1 and Ian Gorton 2
[jennyliu,fekete]@it.usyd.edu.au; ian.gorton@pnl.gov
1 School of Information Technologies, University of Sydney, Australia
2 Pacific Northwest National Laboratory, Richland, WA 99352, USA

Abstract. In this paper, we present an approach to predict the performance of component-based applications in the design phase of development. We can build a quantitative performance model for a proposed system design. The inputs needed to produce this performance prediction are a state diagram showing the main waiting and resource usage aspects of the proposed system architecture, and measurements taken on the middleware infrastructure using a simple benchmark application which is much cheaper to implement than the full system. The performance model allows the system designer to make decisions between alternative architectures and implementation approaches, in terms of their scalability and ability to achieve required service levels. We show our method in action using a J2EE application, Stock-Online, and validate these predictions by implementing the design and measuring its performance. The modeling approach is applicable to applications built on common middleware technologies such as CORBA, J2EE and COM+/.NET.

1. Introduction

The component-based approach has proved successful in the construction of enterprise-scale systems. A range of technologies such as J2EE, CORBA and COM+/.NET each allow designers to deploy separate components in an environment where middleware infrastructure provides essential support for aspects such as naming/binding, messaging, persistence, transactions and security [4]. While the middleware reduces much of the complexity of coding a distributed application, it has been a challenge to design a system with confidence that it will perform well enough to meet Service Level Agreements (SLAs). The overall performance of a deployed system depends on architectural decisions in the design of components, on the particular implementation of the middleware, and on the many tunable parameters of the middleware, as well as on the client load [5].
Furthermore, these aspects interact in subtle ways, so (for example) the relative scalability of three different software architectures may be reversed if the hosting middleware product is changed to another which implements the same technology [6]. Some of the decisions taken in the development of a component-based system can be adjusted easily, late in the development lifecycle. For example, in order to improve performance, the size of a thread pool can be configured at deployment. However some decisions must be taken early, and change is thereafter very expensive: for example the choice of the technology (J2EE vs .NET) and the choice of system architecture need to be made irrevocably, long before substantial

coding takes place. An unwise decision at design-time could make it impossible to achieve the required performance level once the system has been delivered. Consequently, the designer needs to be able to predict the performance of the finished system, working from a design but without access to a complete implementation of the application. An architect could use a performance model to answer questions such as the following:
- What are the maximum load levels that the system will be able to handle?
- If client load increases by a given amount, what level of extra hardware is needed to maintain the required performance?
- What are the average response time, throughput and resource utilization under the expected workload?
- Which components have the largest effect on the overall system performance and are they potential bottlenecks?
- What performance benefit would be obtained by various design changes?
- Can tuning achieve the required performance?

The common practice to address these problems is to build a prototype. For a complex application, this is expensive and time-consuming. Our approach avoids the need for a prototype, since we can determine the overall form of the performance equation from the design description. We can then estimate the numeric parameters of the equation by measuring the middleware product performance with a simple benchmark, which is much simpler in both code and architecture than the proposed design.

This research fits within the goals of prediction-enabled component technology [8]. Our work is more narrowly focused than [8] as we deal only with performance rather than general quality attributes. However, models for the class of middleware-hosted applications our work addresses must deal with the subtle interactions between the infrastructure and domain components. A particular contribution of this paper is our ability to disentangle the influences of performance characteristics of the infrastructure from those of the application's architecture. This is achieved without the need to inspect or instrument the middleware infrastructure code itself.
Our work is also related to the tradition of analytical performance modeling of computer systems, based particularly on queuing theory [9][1][15][16]. These techniques have been applied to component-based middleware-hosted applications [14]. One approach has been to model the complete system at a very detailed, physical level []. In contrast, other researchers have worked at the abstract level of the software architecture [3][10][13]. These efforts generally lead to a performance model giving the overall form of the equation. However, a quantitative prediction requires application-specific parameters, which can be obtained only from a substantial prototype implementation. Another thread of research on the design of component-based applications to meet performance goals has been aimed at more qualitative prediction [1]. This research is often based on detailed measurements of a benchmark application, but the outcome is insight into the aspects of a design with performance consequences [1][5].

In this paper, section 2 describes our new approach to provide a quantitative performance model. This is followed by a case study where we predict the performance of an application used in expository literature. We describe the Stock-Online application in section 3, and in section 4

demonstrate our performance model for three different architectures in a J2EE implementation of Stock-Online. We also validate the predictions by coding these designs and measuring actual performance.

2. Framework of Our Approach

An overview of our modeling approach is shown in Figure 1. We follow five phases:

1. Modeling. We establish a general model for the chosen technology, by identifying the main components of the system, and noting where queuing delays can be expected. We abstract details of the infrastructure components and their communication.
2. Calibrating. An architectural choice can be mapped to a set of infrastructure components and their communication pattern [3]. The operations of service components can be further aggregated into computing modules. Calibrating the performance model means deriving mathematical models with parameters characterizing those computing modules.
3. Characterizing. For a given application, we can determine how often each module is executed. This will depend on the business logic, which tells us how often methods are called, and what operations are performed by what computing modules.
4. Benchmarking. The above produces a performance prediction for the designed system in the form of an equation with parameters. Some of the parameters represent observable or tunable features of the configuration, but other parameters reflect internal details of the middleware platform. We therefore implement a simple application, with minimal business logic and a simple architecture, on the target middleware platform, and measure its performance. Solving the performance model corresponding to the simple application allows us to determine the parameter values, which we describe as the performance profile of the platform.
5. Populating. The parameters of the middleware performance profile can be substituted into the performance model of the designed system, giving the required quantitative prediction of performance of that system.
[Figure 1. The performance prediction framework. The middleware domain (abstraction of infrastructure behavior, general performance model, benchmark suite, performance profile) and the application domain (target application, scenario-based behavior, architecture pattern, architecture-specific model, application-specific model) are linked by the Modeling, Calibrating, Characterizing, Benchmarking and Populating phases. Output: performance metrics of the target application with a specific architecture.]

3. Predicting Stock-Online Performance

3.1 Motivation

Stock-Online [4] is a simulation of an on-line stock-broking system. It models typical e-commerce application functions and aims to exercise the core features of a component container. Stock-Online supports six business transactions: it enables users to buy and sell stock, inquire about the up-to-date prices of particular stocks, and get a holding statement detailing the stocks they currently own. The supporting database has features to track customers, their transactions, payments, and so on. There are four database tables to store details for accounts, stock items, holdings and transaction history.

The transaction mix can be configured to model a lightweight system with read-mostly operations, or a heavyweight one with intensive update operations. The process of determining the transaction mix of the workload requires an understanding of the business domain, a data model, and some expectations as to the patterns of usage of the data. Two business models of Stock-Online are listed in Table 1, one for the read-only intensive usage pattern of data and the other one for increasing demand of data update, i.e. the system has doubled updates compared to the first model.

Table 1. Two Stock-Online business models

Usage pattern                                        Transaction        Transaction mix          Transaction mix
                                                                        (read-only intensive)    (doubled updates)
Read-only point (read one record)                    Query stock        70%                      52%
Read-only multiple (read a collection of records)    Get stockholding   12%                      12%
Read-write (inserting records, and updating          Create account     2%                       4%
records; all requiring transactional support)        Update account     2%                       4%
                                                     Buy stock          7%                       14%
                                                     Sell stock         7%                       14%

The deployment environment consists of a J2EE application server as the container for Stock-Online and a database for persistence. The container and database execute on separate machines. In the test environment, simulated clients mimic the behavior of web server hosted components under a full, sustained request load. Clients also execute on their own machine. Suppose that Stock-Online is to be implemented using EJB components and assembled into a single J2EE application.
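The transaction mixes in Table 1 can be checked mechanically. The sketch below is illustrative only: the dictionary keys and the function name `read_only_ratio` are hypothetical, and the percentages assume that each column of Table 1 sums to 100%. It computes the share of read-only transactions in each business model, which is the quantity later used as the read-only ratio p.

```python
# Transaction mixes as fractions of all requests (values taken from Table 1,
# assuming each column sums to 100%). Names are hypothetical, for illustration.
MIXES = {
    "read_only_intensive": {
        "query_stock": 0.70,
        "get_stockholding": 0.12,
        "create_account": 0.02,
        "update_account": 0.02,
        "buy_stock": 0.07,
        "sell_stock": 0.07,
    },
    "doubled_updates": {
        "query_stock": 0.52,
        "get_stockholding": 0.12,
        "create_account": 0.04,
        "update_account": 0.04,
        "buy_stock": 0.14,
        "sell_stock": 0.14,
    },
}

# Transactions that only read data (the two read-only usage patterns).
READ_ONLY = {"query_stock", "get_stockholding"}


def read_only_ratio(mix):
    """Return the fraction of requests that are read-only transactions."""
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"transaction mix must sum to 1, got {total}")
    return sum(share for name, share in mix.items() if name in READ_ONLY)


if __name__ == "__main__":
    for model, mix in MIXES.items():
        print(model, round(read_only_ratio(mix), 2))
```

Validating that a mix sums to 1 before using it guards against transcription slips in the workload definition, which would otherwise propagate silently into every later service demand.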
A common solution architecture could use Container Managed Persistence (CMP) entity beans, using the standard EJB design pattern of a session bean as a façade to the entity beans. A single session bean implements all transaction methods. Four entity beans, one each for the database tables, manage the database access. Transactions are container managed.

While this session-façade architecture is certainly a valid approach, there are others that could be considered. For example, recall that the read-only intensive business model for Stock-Online has 82% read-only transactions. This makes a solution based on the read-mostly pattern [1] a strong alternative to a pure CMP-based design. In this pattern, read-only and read-write operations are separated into two different entity beans. By exploiting the J2EE container's entity bean caching ability, reads from read-only beans are not blocked, reducing the transactional overhead in synchronizing the cache with the persistent data store.

Another option is optimistic concurrency control [11], which is implemented by most J2EE application servers for persistence service. The idea is that no lock is held during a transaction in the

container and the database. The container issues a predicated update clause to detect the confliction of transactions at the commit time. If the data in a transaction has been updated by other transactions, the transaction is rolled back.

For an architect, knowledge about which design is most suitable for the application without building all the three solutions is desirable, as is determination of the level of performance that the system provides under load.

3.2 Performance modeling

3.2.1 A closed QNM for an application server

A queuing network model (QNM) is used for the performance model. The model captures an application server's behavior in processing a request from a client. The QNM of an application server with a fixed size server thread pool¹ is shown in Figure 2. A closed QNM is appropriate for application servers with finite thread pool capacity as this effectively limits the maximum requests active in the server.

[Figure 2. A closed QNM for an application server: Clients, LAN, Request queue, Container queue (m), DataSource queue (k).]

The clients in the model represent the proxy clients of an application server, such as servlets in a web server. Consequently, a client is considered as a delay resource and its service time equals the thinking time between two successive requests. The request handler and container are modeled as single server queues respectively with no load-dependency. Application servers provide a database pooling mechanism to facilitate connection reuse across requests. The only database clients are in fact the EJBs within the application server, and hence the connection pool can be set to the size of the server thread pool. Database access is then modeled as a delay server with load dependency. The operation time at the database tier contributes to the service demand of the DataSource queue.

Simultaneous resource possession (SRP) occurs at the Container and DataSource queues. A server thread is blocked while awaiting replies from the database, and hence is not available to other requests. The key point is that the service demand of Container queue can overlap with the service demand of DataSource queue.
¹ For an application server which dynamically spawns a thread to serve each request, the maximum active requests in the server vary depending on the workload in the system. It is important for the application to place limits on the number of concurrent callers into the application server.

A model with SRP can be solved using the Method of Surrogates [9]. In this paper however we use an approximation technique to solve the model using benchmarking. We wish to represent exactly once every service demand and every source of queuing delay exhibiting SRP. When the

workload in the overall network is stable, the utilization of the Container queue can be determined. Consequently, the active number of server threads invoking requests at the database can be determined. At this point, we can consider the DataSource queue as a load-independent server with a constraint on its client population. The delay at the Container queue mainly results from the delay awaiting server thread availability, while the DataSource queue delay is mainly due to contention. The approximation made here is that a logical queuing delay is substituted for the contention delay at the DataSource queue. The original model can now be revised into a logically separable QNM, as in Figure 3. We add the constraint value m to the Container queue to represent the active number of server threads and k to the DataSource queue to represent its actual client population. The network service demand is ignored because the effects of network traffic are uniform between architectural choices.

[Figure 3. A separable QNM of the application server: Clients, Request queue, Container queue (m), DataSource queue (k).]

This model focuses on the software infrastructure of an application, as the aim is to predict the performance of different architectures before the system is built. To carry out capacity planning at the hardware level, the model can be decomposed to map to hardware resources (e.g. CPU, disk) using Hierarchical Decomposition [1] or Layered Queuing Models [15][16].

3.2.2 Model analysis

The performance metric of interest is the average server side response time. We focus on the average response time under different usage patterns, as this simplifies the tasks of modeling and measurement. Our presentation is based on the following notation:

$N$: Average requests/client population in the QNM.
$K$: Number of queues.
$X_0$: Average throughput of the queuing network.
$D_i$: Service demand of a single queue $i$, i.e. the amount of time required for a request to be served at queue $i$.
$R$: Average response time, equal to the total residence time over all queues.
$u_i$: Average utilization of queue $i$, defined as the fraction of time the resource of queue $i$ is busy.
$s$: Server thread pool size.
$m$: Average number of active server threads.
$k$: Average number of active database connections.

Hereafter, we denote subscript 1, 2, 3 for Request, Container and DataSource queue respectively.

We make approximations when the Container queue has multiple threads concurrently serving the requests. Let $D'_1$, $D'_2$ and $D'_3$ be the effective service demand of each queue. The Request queue is still a single server queue in this case. Therefore:

\[ D'_1 = D_1 \tag{1} \]

We assume the service demand of the Container queue is effectively divided by $m$ because the single queue structure guarantees that the requests will be served in a first-come-first-served fashion, and no request will wait if any thread is idle.

\[ D'_2 = \frac{D_2}{m}, \quad m \le s \tag{2} \]

We assume the effective service demand of the DataSource queue is a linear function of $k$. The database service time is actually affected by various factors (e.g. transactional attributes, lock management). Those settings are static and hence not explicitly modeled.

\[ D'_3 = k D_3, \quad k \le m \tag{3} \]

Using the Utilization Law and Little's Law [1], we have

\[ \sum_{i=1}^{K} D'_i = \sum_{i=1}^{K} \frac{u'_i}{X_0} = \frac{R' \sum_{i=1}^{K} u'_i}{N} \tag{4} \]

This model can be solved using Approximate Mean Value Analysis (MVA) for Closed QNM [14]. $D'_i$ is the input parameter and $R'$ and $u'_i$ are the output. Eq. (1)-(3) show that $D'_i$ depends on $D_i$, $m$ and $k$.

To solve the model, the challenge is to obtain $D_i$, the separated service demand of each queue with single resource. We introduce a notation for the usage of a queue, denoted as $U_i$, $i \le K$. $U_i$ is the percentage of the total service time the application server spent on queue $i$ during the measurement. It does not include the fraction of waiting time on the processing of services at lower levels. $U_i$ can be measured by benchmarking. Let $T_i$ be the service time the application server spent on queue $i$ during the measurement and $T$ be the total time of the measurement. In a multithreaded application environment, $T_i$ is the average time of all the running server threads. The following trivial relationship holds:

\[ U_i = \frac{T_i}{\sum_{j=1}^{K} T_j} \tag{5} \]

\[ U_i = \frac{T_i / T}{\sum_{j=1}^{K} T_j / T} = \frac{u'_i}{\sum_{j=1}^{K} u'_j} \tag{6} \]

Therefore, using the Utilization Law, Little's Law and Eq. (6) we have

\[ D'_i = \frac{u'_i}{X_0} = \frac{U_i \, R' \sum_{j=1}^{K} u'_j}{N} \tag{7} \]
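The solution step can be illustrated with a small solver. The sketch below is not the authors' implementation: it uses the Schweitzer fixed-point approximation, one common form of approximate MVA for closed networks, and treats the clients as a delay resource with a given think time. The function names, parameter names and numeric values are assumptions made for illustration; only the effective-demand rules of Eq. (1)-(3) come from the text.

```python
def effective_demands(d1, d2, d3, m, k):
    """Effective service demands per Eq. (1)-(3): D1' = D1, D2' = D2/m, D3' = k*D3."""
    return [d1, d2 / m, k * d3]


def approximate_mva(demands, n_clients, think_time=0.0, tol=1e-9, max_iter=100_000):
    """Schweitzer approximate MVA for a closed network of queueing stations.

    demands     -- effective service demand of each queue (same time unit as think_time)
    n_clients   -- client population N (at least 1)
    think_time  -- delay spent at the clients between successive requests
    Returns (response_time, throughput, utilizations).
    """
    if n_clients < 1:
        raise ValueError("n_clients must be at least 1")
    num_queues = len(demands)
    # Initial guess: clients spread evenly over the queues.
    queue_len = [n_clients / num_queues] * num_queues
    # Schweitzer assumption: an arriving request sees (N-1)/N of the mean queue length.
    seen = (n_clients - 1) / n_clients
    for _ in range(max_iter):
        residence = [d * (1.0 + seen * q) for d, q in zip(demands, queue_len)]
        response = sum(residence)
        throughput = n_clients / (think_time + response)  # Little's Law
        new_len = [throughput * r for r in residence]
        converged = max(abs(a - b) for a, b in zip(new_len, queue_len)) < tol
        queue_len = new_len
        if converged:
            break
    utilization = [throughput * d for d in demands]  # Utilization Law
    return response, throughput, utilization


if __name__ == "__main__":
    # Hypothetical demands in milliseconds, 4 active threads, 2 active connections.
    d = effective_demands(1.0, 8.0, 0.5, m=4, k=2)
    print(approximate_mva(d, n_clients=50, think_time=100.0))
```

With a single client there is no queuing, so the response time reduces to the sum of the demands, which gives a quick sanity check of any solver of this kind.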

3.3 Modeling EJB architectures

We now need to calibrate the application server technology that will host the three alternative designs. Thus we add a superscript to quantities to indicate the relevant architecture. For example, $D_1^{cmp}$ is the service demand of queue 1 (Request queue) in the CMP architecture.

It proves convenient to abstract the application server's state in order to analyze its service time. We designate that a request resides in one of the following states:

- Request dispatching (RDISP): A request is accepted by the server and dispatched to the Container queue.
- Server initialization processing (SINIT): When a thread is available, the request steps through an invocation chain in the server.
- Bean object skeleton processing (BSP): Before the actual business method is invoked, the skeleton code of a session bean object is first invoked.
- Pre-method-invocation processing (PREP): A session bean instance from the pool is associated with the request. The container registers the request with the transaction manager.
- Method processing (MP): The business logic is executed.
- Post-method-invocation processing (POSTP): The container finalizes the request processing. Data updates are committed.

Clearly, the service demand of the Request queue equals the service time at state RDISP, which can be regarded as a constant. Therefore the three architectures have the same Request queue service demand, i.e. $D_1^{cmp} = D_1^{rm} = D_1^{occ}$.

SINIT, BSP, PREP and POSTP are modules that a request must pass through. If the time for database operations is deducted, their total service time is constant for all the three architectures, denoted as $T_0$. For convenience, these states as a whole are referred to as a composite, CS.

The business logic is processed in the MP state, and it comprises the majority of the service demands on the Container queue and DataSource queue. The service times of operations in the MP state are modeled as $f_c$ and $f_d$ for the Container and DataSource queues respectively.
From the above analysis, we know that $f_c$ and $f_d$ are determined by the component architecture, denoted as superscript $A$, thus

\[ D_2^A = T_0 + f_c^A \]
\[ D_3^A = f_d^A \]

Now the problem is to find the expressions of $f_c^A$ and $f_d^A$. We first consider only one component involved in a transaction, and model two scenarios:

- findByPrimaryKey is invoked on the component's home interface. Only one component is identified by the primary key.
- findByNonPrimaryKey is called on the component's home interface. A collection of references is returned.

3.3.1 Modeling CMP

A session bean gets a reference to an entity bean by invoking its findByPrimaryKey method in state FindByPK in Figure 4(a). If the primary key is present in the database, a reference to the entity bean is returned to the session bean. The container checks its cache to see whether a corresponding bean instance (identified by its primary key) exists. If so it transitions to state LC, in which a cached instance is returned. We define the cache hit ratio as $h$, so this transition occurs with probability $h$. Otherwise, the container transitions to state A/P. ejbActivate, which incurs expensive serialization operations, is called to associate a pooled instance with a primary key. In the case that the pool is full, ejbPassivate is called to serialize a victim entity bean instance to secondary storage. Both the LC and A/P states are followed by state LD, in which a call to ejbLoad is made to synchronize the state of the entity bean cache with the underlying database. At the end of the transaction, ejbStore is called to store any updates made in the transaction. This is represented by state SD.

As the absolute order in the state machine is not crucial for the service demand calculation, we compact FindByPK, LD and SD into a compound state DataSource and separate it from states involving the Container queue (see Figure 4(b), (c)). Consequently, we can represent the CMP architecture effect on the Container queue as:

\[ f_c^{cmp} = h T_1 + (1 - h) T_2 \tag{8} \]

where $T_1$ and $T_2$ are the service time of state LD and A/P respectively. Similarly we model the CMP architecture effect on the DataSource queue as in Eq. (9), where $T_{find}$, $T_{load}$ and $T_{store}$ are the service times to find, load data into and store updates to an entity bean, and $p$ is the ratio of read-only requests.

\[ f_d^{cmp} = T_{find} + T_{load} + (1 - p) T_{store} \tag{9} \]

[Figure 4. CMP state machine: (a) Overall CMP states, (b) Container queue states, (c) DataSource queue states.]

3.3.2 Modeling the read-mostly pattern

For RM, the container is aware of the request type and accesses the read-only or read-write entity bean cache. For a read-only bean, the container goes to state LD only when a cache miss

occurs. We denote the cache hit ratio of read-only beans by $h_r$. The container deals with read-write requests in the same manner as the CMP architecture. The state machine for this behavior is shown in Figure 5. We therefore model the RM architecture as follows:

\[ f_c^{rm} = [p h_r + (1 - p) h] T_1 + [p (1 - h_r) + (1 - p)(1 - h)] T_2 \]
\[ f_d^{rm} = T_{find} + [p (1 - h_r) + (1 - p)] T_{load} + (1 - p) T_{store} \]

[Figure 5. Read-mostly pattern state machine: (a) Overall RM states, (b) Container queue states, (c) DataSource queue states.]

3.3.3 Modeling Optimistic Concurrency Control

The state machine of OCC is shown in Figure 6. The container invalidates the entity bean instance whose operation results in the conflicting transaction. The ejbLoad method is called the next time this invalidated entity is invoked, which is modeled as the Invalidation state followed by the LD state. The context information of an invalidated bean is kept valid, which can save the overhead of activating/passivating a bean instance. If there is no conflict detected, ejbStore is called at the end of the read-write transaction. OCC is desirable if there is a low probability of write contention. We assume that the overhead of invalidation and rolling back a transaction can be ignored when the data size of uniform access is large enough and the ratio of read-write transactions is low. The cache hit ratio of the OCC entity cache is $h_o$. We model the OCC architecture as follows:

\[ f_c^{occ} = h_o T_1 + (1 - h_o) T_2 \]
\[ f_d^{occ} = T_{find} + (1 - h_o) T_{load} + (1 - p) T_{store} \]
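The three architecture models can be compared side by side once the cache hit ratios and state service times are known. The following sketch encodes the CMP, read-mostly and OCC formulas as read from this section, together with the relations D2 = T0 + f_c and D3 = f_d. The function names and all numeric inputs are hypothetical and chosen only to illustrate the calculation; real values would come from the benchmarking phase.

```python
def cmp_model(h, p, t1, t2, t_find, t_load, t_store):
    """CMP: f_c = h*T1 + (1-h)*T2 and f_d = T_find + T_load + (1-p)*T_store."""
    f_c = h * t1 + (1 - h) * t2
    f_d = t_find + t_load + (1 - p) * t_store
    return f_c, f_d


def rm_model(h, h_r, p, t1, t2, t_find, t_load, t_store):
    """Read-mostly: read-only requests (ratio p) use the read-only cache (hit ratio h_r)."""
    f_c = (p * h_r + (1 - p) * h) * t1 + (p * (1 - h_r) + (1 - p) * (1 - h)) * t2
    f_d = t_find + (p * (1 - h_r) + (1 - p)) * t_load + (1 - p) * t_store
    return f_c, f_d


def occ_model(h_o, p, t1, t2, t_find, t_load, t_store):
    """Optimistic concurrency control: data is loaded only on a cache miss (ratio 1-h_o)."""
    f_c = h_o * t1 + (1 - h_o) * t2
    f_d = t_find + (1 - h_o) * t_load + (1 - p) * t_store
    return f_c, f_d


def service_demands(t0, f_c, f_d):
    """Container and DataSource demands: D2 = T0 + f_c, D3 = f_d."""
    return t0 + f_c, f_d


if __name__ == "__main__":
    # Hypothetical service times (ms) and ratios, for illustration only.
    times = dict(t1=1.0, t2=3.0, t_find=2.0, t_load=4.0, t_store=5.0)
    p = 0.82  # read-only ratio of the read-only intensive mix
    print("CMP", service_demands(1.5, *cmp_model(0.5, p, **times)))
    print("RM ", service_demands(1.5, *rm_model(0.5, 1.0, p, **times)))
    print("OCC", service_demands(1.5, *occ_model(0.5, p, **times)))
```

Setting h_r equal to h in the read-mostly model reproduces the CMP Container-queue term, which is a convenient consistency check on the formulas.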

[Figure 6. Optimistic concurrency control state machine: (a) Overall OCC states, (b) Container queue states, (c) DataSource queue states.]

3.3.4 Modeling find-by-non-primary-key

The findByNonPrimaryKey method of the EntityHome interface can return a collection of entity bean references. The container loads all the matched entity beans into the cache when findByNonPrimaryKey is called. This is modeled by state Find&Load in Figure 7. When getValue is called the container uses the cached beans. As bean instances are accessed iteratively, we denote the ratio of state transition as $n p_c$, where $p_c$ is the ratio of findByNonPrimaryKey transactions. Other states are encapsulated into a compound state, findByPKSM. In the decomposition of the overall state machine, the sub-state machines of the Container and DataSource queues with findByPrimaryKey transactions are abstracted as states findByPKCSM and findByPKDSM respectively. The formulae for $f_c^{col}$ and $f_d^{col}$ in different architecture models with findByNonPrimaryKey transactions are listed in Table 2.

[Figure 7. State machine for findByNonPrimaryKey transactions: (a) Overall state, (b) Container queue states, (c) DataSource queue states.]

Table 2. Architecture models with findByNonPrimaryKey transactions

CMP_COL:
\[ f_c^{col\_cmp} = (1 - p_c + n p_c) h T_1 + [(1 - p_c + n p_c)(1 - h)] T_2 \]
\[ f_d^{col\_cmp} = [(1 - p_c) + n p_c (1 - h)] T_{load} + (1 - p - p_c) T_{store} + T_{find} \]

RM_COL:
\[ f_c^{col\_rm} = [(p + n p_c) h_r + (1 - p - p_c) h] T_1 + [(p + n p_c)(1 - h_r) + (1 - p - p_c)(1 - h)] T_2 \]
\[ f_d^{col\_rm} = [1 - p_c - p h_r + n p_c (1 - h_r)] T_{load} + (1 - p - p_c) T_{store} + T_{find} \]

OCC_COL:
\[ f_c^{col\_occ} = (1 - p_c + n p_c) h_o T_1 + [(1 - p_c + n p_c)(1 - h_o)] T_2 \]
\[ f_d^{col\_occ} = [1 - p_c - p h_o + n p_c (1 - h_o)] T_{load} + (1 - p - p_c) T_{store} + T_{find} \]

3.3.5 Modeling multiple EJBs

Now we generalize the analysis for an application with multiple session and entity beans. In the Session-Façade architecture, we assume the instance pool size is large enough, reducing the waiting time to obtain a session bean instance to zero. Let $p_s$ be the probability of an operation of session bean $s$ being accessed and $T_s$ be the average service time of its operation, which doesn't include the time waiting for replies from nested beans' operations. The service time of all session beans is $\sum_s p_s T_s$.

For an entity bean, operations can be categorized into four groups: creating a bean, removing a bean, getting value(s) and setting value(s). The time for a container to create and remove an entity bean is denoted as $T_{create}$ and $T_{remove}$ respectively. Let $p_{e,o}$ be the probability of operation $o$ in entity bean $e$ being invoked and $T_{e,o}$ be the average service time. If $o$ is a getter or setter method, $T_{e,o}$ is determined by the entity architecture model $f_c^A$ derived in the above sections. $T_{e,o}$ also does not include the time spent in nested beans' operations.

\[ T_{e,o} = \begin{cases} f_c^A(p_{e,o}, T_1, T_2) & o \in \{get, set\} \\ p_{e,o} T_{create} & o \in \{create\} \\ p_{e,o} T_{remove} & o \in \{remove\} \end{cases} \]

$T_s$ can be regarded as the total service time of all but state SINIT in CS, whose service time is $T_{SINIT}$. Hence, the overall service demand of the Container queue is

\[ D_2 = T_{SINIT} + \sum_s p_s T_s + \sum_e \sum_o T_{e,o} \tag{10} \]

Similarly let $t_{e,o}$ be the service time of operation $o$ in entity bean $e$ to access the database through the DataSource queue.
This gives:

$$t_{e,o} = \begin{cases} p_{e,o}\, f_d^A(p, T_{find}, T_{load}, T_{store}) & o \in \{get, set\} \\ p_{e,o}\, T_{insert} & o \in \{create\} \\ p_{e,o}\, T_{delete} & o \in \{remove\} \end{cases}$$

Here $T_{insert}$ and $T_{delete}$ are the time for the database to insert and delete an entity record respectively. The overall service demand of the DataSource queue is

$$D_3 = \sum_e \sum_o t_{e,o} \qquad (11)$$

3.4 Characterizing Stock-Online

The above analysis has identified the infrastructure components involved in a specific architecture and established the linkage between architecture characteristics and the overall performance. Now the task is to characterize the application and determine how much it utilizes the operations of infrastructure components.

We use the concept of a scenario, which proves useful in understanding system and software behavior [7], to define the performance expectations for an application. A scenario traces through the application and can be derived from use cases or class diagrams. It has attributes to specify its name, visit count, type (e.g. read, write, create, remove) and think time (e.g. idle time between two requests in milliseconds). A scenario comprises multiple calls. A call equals a message in sequence diagrams. It has attributes caller, callee, caller scenario and callee scenario to specify the origin and destination of a call. A call also has other attributes such as the number of calls (or invocation count) in a transaction, the type (e.g. synchronous or asynchronous) and iteration count. If a call is remote, bytes sent and received can be specified to estimate the overhead of the network communication.

For example, the BuyStock transaction in Figure 8 queries an instance of the Account entity by its primary key. A getter method of the Account entity, getCredit, is invoked inside the BuyStock transaction. The access expectation of the getCredit method equals the number of calls in the transaction multiplied by the transaction's frequency in the transaction mix (7%). Similarly we can derive the access expectation of other entities involved in the BuyStock transaction. Since an entity bean such as the Account entity is used by more than one transaction, e.g. BuyStock, SellStock, UpdateAccount etc., we aggregate the access expectation for all transactions that use one entity, giving the metric for each entity bean shown in Table 3 for the two business models we introduced earlier. Note that the transaction frequencies sum to 1, but the access expectations sum to more than 1 (since each transaction accesses one or more methods).

Figure 8 Sequence diagram of BuyStock (transaction mix = 7%). Participants: client, broker, account, stockitem, stockholding, stocktx and Transaction Manager, spanning the Client and the Application server / Transaction context. Messages: start transaction; get stock price (inv. count=1); check account credit (inv. count=1); transaction rollback if there is not enough credit (inv. count=1); update (inv. count=1); insert transaction record (inv. count=1); commit transaction.

As the service demand of the Request queue is constant, $D_1^{StockOnline} = D_1^{cmp} = D_1^{rm}$. $D_2^{StockOnline}$ and $D_3^{StockOnline}$ can be calculated using Eq. (10) and Eq. (11). Since there is only one session bean involved in Stock-Online, it is easy to conclude that $T_{SIIT} + \sum_s p_s T_s = T_{SIIT} + T_s = T_0$.
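The aggregation of access expectations described above can be sketched as follows. This is a minimal illustration, not the tooling used in the paper: the function name `access_expectations`, the tuple layout and the two-transaction mix are hypothetical, and the frequencies and invocation counts are made up for the example.

```python
from collections import defaultdict


def access_expectations(transactions):
    """Aggregate access expectations per (entity, operation).

    transactions: iterable of (mix_frequency, calls), where calls is a list
    of (entity, operation, invocation_count) tuples. The access expectation
    of one call is invocation_count * mix_frequency; values for the same
    (entity, operation) are summed over all transactions.
    """
    totals = defaultdict(float)
    for frequency, calls in transactions:
        for entity, operation, count in calls:
            totals[(entity, operation)] += frequency * count
    return dict(totals)


# Hypothetical mix: two transactions that both read Account.
example_mix = [
    (0.07, [("Account", "get", 1), ("StockItem", "get", 1), ("StockTx", "create", 1)]),
    (0.07, [("Account", "get", 1), ("StockTx", "create", 1)]),
]

expectations = access_expectations(example_mix)
for key in sorted(expectations):
    print(key, f"{expectations[key]:.2%}")
```

Because an entity operation can appear in several transactions, the aggregated values can sum to more than 1 even though the mix frequencies themselves sum to at most 1.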

Three of the Stock-Online entity beans, Account, StockItem and StockTx, can be modeled using findByPrimaryKey, as their instances are obtained by primary key. StockHolding has a scenario that accesses a collection of entity beans; the collection size is 20, and it is consequently modeled by findByNonPrimaryKey. It is now clear that to predict the performance of Stock-Online with the CMP and RM architectures, the following parameters have to be populated: $T_0$, $T_1$, $T_{create}$, $T_{find}$, $T_{insert}$, $T_{load}$ and $T_{store}$. The cache hit ratio of each architecture must also be known.

Table 3 Metrics of Stock-Online entity beans

Entity        Operation            Access Expectation      Access Expectation
                                   (Read-only intensive)   (Doubled updates)
Account       Get                  13.956%                 27.912%
              Set                  16.282%                 32.564%
              Create               2.326%                  4.652%
StockHolding  Get                  6.978%                  13.956%
              Set                  13.956%                 27.912%
              Get (iteration=20)   11.630%                 11.630%
StockItem     Get                  83.736%                 79.084%
StockTx       Create               13.956%                 27.912%

3.5 Benchmarking

Benchmarking is used to populate the performance parameters needed to solve the model.

3.5.1 The benchmark design

The benchmark suite consists of four modules, namely a workload generator, benchmark application, monitoring utility and profiling toolkit. The benchmark clients simulate active requests from proxy applications, such as a servlet engine in a web server. This kind of proxy client has an ignorable interval between two successive requests. Its population in a steady state is bounded (see footnote 3). Hence the client spawns a fixed number of threads in each test. Each thread submits a new service request immediately after the results are returned from the previous request to the application server. The thinking time of the client is effectively zero.

The implementation of the simple application involves a session bean object Agent and an entity bean object Record. CMP persistence is used and transactions are container-managed. The example collaboration diagram in Figure 9 shows the sequence of calls in the benchmark application for a read/write request. In Figure 10 the getRecords event simulates the scenario of an entity which has findByNonPrimaryKey transactions.

The monitoring utility is implemented using the Java Management Extensions (JMX) remote API.
It collects performance metrics at runtime, including the number of active server threads and active database connections. A profiling toolkit, OptimizeIt, is also employed. OptimizeIt obtains profiling data from the Java virtual machine, and helps in tracing the execution path and collecting statistics such as the method invocation count and duration of each invocation. Profiling tools are necessary for COTS component-based systems, when instrumentation of the source code is not possible.

3 A web server has configuration parameters to limit the active workload. For example, Apache uses MaxClients to control the maximum number of workers, thus the concurrent requests to the application server are bounded.

The same fundamental design for the benchmark suite should apply to different technologies. It will be the precise implementation details that differ. However, it should be possible to use the same suite for applications that adhere to the same middleware, e.g. J2EE in this case.

Figure 9 Benchmark application: read/write event. Participants: Client, Agent : SessionBean, Record : EntityBean. Calls: 1: read (write); 2: find Record by PK; 3: get (set) value; 4: return value; 5: return value.

Figure 10 Benchmark application: getRecords event. Participants: Client, Agent : SessionBean, Record : EntityBean. Calls: 1: get records; 2: find Records by non-PK; 3: iterate to get value; 4: get value in every iteration; 5: return value in every iteration; 6: return records.

3.5.2 Performance measurement

The benchmarking environment comprises three 4-CPU machines connected by a 100 Mbps isolated network. The client, application server and the database server are deployed on separate machines. The application server used is WebLogic Server 7.0 and the server thread pool size s is 20. The JVM is Sun's JDK 1.3.1 with settings -hotspot, -Xms512m and -Xmx1024m. The database server is Oracle 8.1.7 and the Oracle thin JDBC driver is used. Initially, there are 1000 records in the database table. All three machines are running Windows 2000 Server.

Each experiment has three phases: rampup, steadystate and rampdown. The system is started and initialized in the rampup stage for 1 minute. The system then transfers to steadystate for 10 minutes. The statistical results are measured during this time. Each experiment is run several times to guarantee that the difference between runs is minor.

Table 4 Parameters measured by benchmarking

Parameter                                      Access Expectation      Access Expectation
                                               (Read-only intensive)   (Doubled updates)
Workload (N)                                   100                     100
Throughput (tps) (X)                           513.2                   457.8
Response time (ms) (R)                         194.3                   218.4
Active # of server threads (m)                 19.97                   19.69
Active # of db connections (k)                 9.79                    10.88
CMP cache hit ratio (h)                        50%                     50%
RM cache hit ratio (h_r)                       99.6%                   99%
OCC cache hit ratio (h_o)                      60%                     63%
findByNonPrimaryKey cache hit ratio (h_c)      96%                     97%
Request queue utilization (U_1)                15.93%                  14.16%
Container queue utilization (U_2)              57.21%                  51.10%
DataSource queue utilization (U_3)             26.86%                  34.74%
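As a quick consistency check on Table 4, the measured throughput and response time should satisfy Little's law for a closed system with zero think time, N = X * R. The sketch below assumes throughput and response-time values of 513.2 tps / 194.3 ms and 457.8 tps / 218.4 ms; the helper name `implied_population` is hypothetical.

```python
def implied_population(throughput_tps, response_ms, think_ms=0.0):
    """Little's law for a closed system: N = X * (R + Z), with times in ms."""
    return throughput_tps * (response_ms + think_ms) / 1000.0


# Clients submit the next request immediately, so think time is zero.
read_only = implied_population(513.2, 194.3)
doubled_updates = implied_population(457.8, 218.4)

print(f"read-only intensive: N = {read_only:.1f}")    # about 99.7
print(f"doubled updates:     N = {doubled_updates:.1f}")  # about 100.0
```

Both values come out close to the configured workload of 100 clients, which is what one would expect when the client think time is effectively zero.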

In order to measure parameters to solve the performance model, we need to select the workload so that it is large enough to efficiently utilize the server's physical resources but small enough to prevent overload and performance degradation. We scaled the workload from 1 to 1000 clients and measured the throughput. We observed the throughput was 513 transactions per second (tps) when N=100. The application server CPU utilization was 89% and the database server utilization was 3%. The peak throughput was 527 tps for all clients and the degradation of throughput was 2.66%. We therefore decided to use N=100 as the base workload.

In order to measure the cache hit ratio, the benchmark application is implemented using the CMP, RM and OCC architectures. getRecords events are injected to simulate the StockHolding entity's usage pattern with findByNonPrimaryKey transactions. Table 4 lists the parameters obtained by benchmarking.

3.6 Populating parameters

Recall from Eq. (7) that $D'_i$ is expressed in terms of $U_i$, $u_i$ and $R$ for each of the $K$ queues ($i = 1, \dots, K$). Thus $D'_i$ depends on $u_i$; however, $u_i$ is unknown. We further introduce intermediate parameters $D''_i$, which are expressed in terms of the measured quantities $U_i$ and $R$. It can be proved that $u''_i = u_i$, where $u''_i$ is the utilization of queue $i$ under the service demand $D''_i$. $D''_i$ can be calculated using data in Table 4, then $u''_i$ can be solved by the MVA algorithm. Consequently $u_i$ is available and $D'_i$ can be solved using Eq. (7).

The service demand of the benchmark application can be solved in Table 5 using Eq. (1)-(3), from which we can derive other performance parameters.

Table 5 Benchmark service demands (calculated)

Service demand       Transaction mix          Transaction mix
                     (Read-only intensive)    (Doubled updates)
D_1^{cmp_bench}      0.5409                   0.5956
D_2^{cmp_bench}      38.796                   42.993
D_3^{cmp_bench}      0.09317                  0.1343

Our analysis treated $D_1^{cmp\_bench}$ and $D_2^{cmp\_bench}$ as independent of the workload mix p. The measurement shows that the error is 9.18% for $D_1^{cmp\_bench}$ and 9.76% for $D_2^{cmp\_bench}$ when p=50%.
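The two error figures quoted for p=50% can be reproduced from the Table 5 columns. The sketch below assumes the error is taken relative to the doubled-updates value and uses 42.993 as the doubled-updates value of the Container demand; the helper name `relative_error` is hypothetical.

```python
def relative_error(reference, estimate):
    """Absolute difference between estimate and reference, as a fraction of reference."""
    return abs(reference - estimate) / reference


# Reference: doubled-updates column (p=50%); estimate: read-only intensive column.
d1_error = relative_error(0.5956, 0.5409)
d2_error = relative_error(42.993, 38.796)

print(f"D1 error: {d1_error:.2%}")  # 9.18%
print(f"D2 error: {d2_error:.2%}")  # 9.76%
```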
Approximating $T_{find} = T_{load}$ in Eq. (8), we get

$$\begin{aligned} T_{find} + T_{load} + (1 - 0.9)\, T_{store} &= 0.09317 \\ T_{find} + T_{load} + (1 - 0.5)\, T_{store} &= 0.1343 \\ T_{find} &= T_{load} \end{aligned} \quad\Rightarrow\quad \begin{aligned} T_{find} &= 0.04135 \\ T_{load} &= 0.04135 \\ T_{store} &= 0.103 \end{aligned}$$

As reading data from cache is faster than reading data from the database, we use $T_{load}$ as a substitute for $T_1$. Thus $T_2$ can be solved by Eq. (8). We further assume that $T_{insert} = T_{store}$. We also found from the JVM profiling that the average service time of state CS is about 11.816% of the Container queue's service demand, so we estimate $T_0 = 11.816\% \times D_2^{cmp\_bench} = 4.584$. Similarly we can have $T_{create} = 0.008$. Service demands of Stock-Online are calculated in Table 6 using Eq. (10) and Eq. (11).
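The MVA step mentioned in Section 3.6 can be illustrated with the textbook exact single-class Mean Value Analysis recursion. This is a simplified sketch: it assumes plain single-server queueing stations and does not model the thread and connection constraints (m, k) of the paper's QNM. The function name `mva` and the demand values are hypothetical.

```python
def mva(demands, n_clients, think_time=0.0):
    """Exact single-class MVA for a closed network of single-server queues.

    demands: service demand of each station (same time unit as think_time).
    Returns (throughput, response_time, utilizations) at n_clients.
    """
    queue = [0.0] * len(demands)   # mean queue lengths with 0 customers
    throughput = 0.0
    response = 0.0
    for n in range(1, n_clients + 1):
        # Arrival theorem: an arriving customer sees the queue of a
        # network with n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]
    utilizations = [throughput * d for d in demands]
    return throughput, response, utilizations


# Hypothetical demands for three stations (Request, Container, DataSource).
x, r, u = mva([1.0, 4.0, 2.0], n_clients=50)
print(f"X = {x:.4f}, R = {r:.2f}, u = {[round(v, 3) for v in u]}")
```

With these made-up demands the second station is the bottleneck, so the throughput approaches 1/4 and that station's utilization approaches 1 as the number of clients grows.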

Table 6 Stock-Online service demands (calculated)

                     Business model (Read-only intensive)    Business model (Doubled updates)
Service demand       CMP        RM         OCC               CMP        RM         OCC
D_1^{StockOnline}    0.5409     0.5409     0.5409            0.5409     0.5409     0.5409
D_2^{StockOnline}    50.6077    15.070     42.8199           58.1064    27.3369    48.5704
D_3^{StockOnline}    0.1481     0.0976     0.1149            0.653      0.153      0.139

The utilization $u_i$ of a software task is a useful performance metric, from which we can estimate the average number of threads that are busy. We only measure m and k when N=100 clients. The values for other workloads are estimated as follows. The maximum number of requests being served simultaneously by the server threads is the thread pool size s. When $N \ge s$, the capacity of the server threads approaches saturation, and $m_{100}$ and $k_{100}$ can be used for m and k as approximations. When $N < s$, we can regard $u_2$ as the ratio of the number of active threads m (those not idle waiting for requests) to the thread pool size, i.e. $u_2 \approx m / s$, and $u_3$ as the ratio of the number of active database connections to the number of active threads, i.e. $u_3 \approx k / m$. Therefore

$$\begin{cases} m = m_{100},\quad k = k_{100} & N \ge s \\ m = \min(m_{100},\, s\, u_2(N)),\quad k = \min(k_{100},\, m\, u_3(N)) & N < s \end{cases}$$

3.7 Predicting Stock-Online performance

To verify our approach, three versions of Stock-Online were implemented, one using the CMP architecture, one using the RM architecture and another using the OCC architecture. Each was deployed and run on the experimental environment used for benchmarking. The predicted server-side response time was then compared to the empirical results.

3.7.1 Single application server performance

Figure 11 shows the performance results of the three implementations of Stock-Online. The error of prediction is around 5-9% and the worst case except for N=1 is about 13% for the three implementations. The prediction is that the RM architecture significantly improves performance by about 60-65% over CMP, while the OCC architecture offers only a limited improvement over CMP of about 5-15%. Measurement confirms this prediction.

Figure 11 Stock-Online Performance (Read-only intensive business model)

When the business model has twice as many updates, we predict the performance in Figure 12. The error of prediction is around 5-17% except for N=1. The error of predicting OCC performance is up to 17% when N=500. We observed that more transactions rolled back as the workload increases, which results in more overhead for the container. It can be seen from the prediction that the advantage of the RM architecture diminishes as the ratio of read-only transactions decreases. The RM and OCC architectures still produce better performance than the CMP architecture.

The model can also help to size the capacity of the system. Suppose that the requirement for the application response time is sub-1 second. We can estimate that the maximum workload of CMP to meet this requirement is N=400, and this degrades to N=380 when the system has twice as many updates.

Figure 12 Stock-Online Performance (Doubled updates business model)

The utilization of each queue in the QNM is depicted in Figure 14. The prediction is that in the CMP and OCC architectures, the Container queue is the most demanding subsystem. It is the bottleneck software component as its utilization is approaching 100%. On the contrary, the RM architecture speeds up the container by optimizing the cache usage of read-only data. Thus the bottleneck is transferred to the DataSource queue, whose utilization is saturated.

3.7.2 Tuning the thread pool size

The configuration of the thread pool size represents an important tuning option. Too few threads will limit performance in the application server by serializing much of the application processing. Too many threads will consume resources and increase contention, again reducing application performance. Fortunately, the model can help us find the optimal configuration for the thread pool size. The model accurately predicts in Figure 13 that the optimal value is 20 for both CMP and RM and 25 for OCC, instead of the default setting of 15.

Figure 13 Stock-Online thread pool size tuning (measured and predicted)
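Once the model can be evaluated for a given configuration, both the capacity sizing and the thread pool tuning described above reduce to a search over candidate values. The sketch below shows that search only; the two `toy_*` predictors are purely illustrative stand-ins (not the paper's QNM), and all function names are hypothetical.

```python
def best_pool_size(candidates, predict_response_ms):
    """Return the candidate thread pool size with the lowest predicted response time."""
    return min(candidates, key=predict_response_ms)


def max_workload(workloads, predict_response_ms, limit_ms):
    """Return the largest workload whose predicted response time is below limit_ms."""
    feasible = [n for n in workloads if predict_response_ms(n) < limit_ms]
    return max(feasible) if feasible else None


# Illustrative stand-ins for the solved model (assumptions, not measured data).
def toy_response_by_pool(pool_size):
    # Too few threads serialize work; too many add contention.
    return 4000.0 / pool_size + 10.0 * pool_size


def toy_response_by_clients(n_clients):
    # Response time grows linearly once the bottleneck is saturated.
    return 2.5 * n_clients


print(best_pool_size([10, 15, 20, 25, 30], toy_response_by_pool))
print(max_workload(range(50, 1001, 50), toy_response_by_clients, limit_ms=1000.0))
```

In practice the predictor would be the solved performance model for one architecture, so the same search can be repeated for CMP, RM and OCC.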

Figure 14 Predicted utilization of each queue in a single server: (a) CMP; (b) RM; (c) OCC. Each panel plots utilization (u, from 0 to 1) against the number of clients (N = 0, 50, 100, 200, 300, 400, 500) for the Request, Container and DataSource queues. For each architecture, the figure shows the utilization of each queue under two business models: Read-only intensive, with the column named in the form Queue-R, and Doubled updates, with the column named in the form Queue-D.

3.7.3 Two-node server clustering

Most middleware products support server clustering to provide scalability and availability. The application components are replicated on the servers within the cluster. We applied our approach to explore the performance of the CMP architecture in a two-node server cluster. The RM architecture in a cluster environment requires more complicated infrastructure services to manage the consistency of bean replicas. The container is responsible for the invalidation of cached entity beans involved in conflicting transactions. The RM architecture model for a single server consequently has to be extended to represent this change in a clustered environment, and this is beyond the scope of this paper.

Figure 15 QNM of a two-node server cluster (Clients; two-node cluster with a Request queue and a Container queue per server, with $m_1$, $k_1$ and $m_2$, $k_2$; shared DataSource queue)

We assume the two server machines are identical and they access a shared database server. The extended QNM is depicted in Figure 15. The workload is balanced between these two servers. The constraint on the client population of the DataSource queue is the total number of active database connections used by the two servers. We further assume that $m_1 = m_2$ and $k_1 = k_2$. This model can be solved using Hierarchy Decomposition [12]. The server thread pool size is 20 and the parameters derived from a single server can be used to solve the model. The results are shown in Figure 16. The prediction error is 2-6% for the various tested workloads. Using the model we can predict that, after adding an identical server, the maximum workload is N=660 while meeting the requirement for a sub-second response time.

Figure 16 Stock-Online performance (two-node cluster)

The model can also produce the utilization of each queue in the QNM of a two-node cluster. Figure 17 clearly shows that the utilization of the two Request queues is similar in the two application servers, which is also true for the utilization of the two Container queues. This is because the workload is balanced between these two servers and the active numbers of server threads and database connections are almost identical.
We can see that now the DataSource queue is about 100% utilized, which indicates it is the bottleneck software component. Thus, without needing more experiments, we can reason that adding extra nodes beyond two to the application server cluster would not be useful. To further improve the performance, either the database could be transferred to a more powerful machine or more database servers should be clustered to provide service.

Figure 17 Predicted utilization of each queue in a cluster. The plot shows utilization (u, from 0 to 1) against the number of clients (N = 100, 300, 500, 1000) for the DataSource queue, Container (server 1), Container (server 2), Request (server 1) and Request (server 2).

4. Conclusion

In this paper we report a significant contribution to software developers who need to make sensible architectural choices during the design phase for a component-based, middleware-hosted application, in order to achieve desired performance goals. We have shown that one can predict eventual performance quite closely (within 10% on most measures for our example system). Our approach derives a quantitative performance model for the design, with parameters that reflect properties of the infrastructure platform. These parameters can be measured by running a simple benchmark application on the platform. We have demonstrated our approach by predicting the performance of three competing designs for an application from the literature. The predictions were good enough to choose between the designs. The example design used EJB and J2EE technologies, and dealt with a CMP design, a read-mostly optimization design and an optimistic concurrency control design.

So far, we only model synchronous processes and we only derive the average response time of all classes of requests. The next step is to model asynchronous processes and to derive performance metrics of each class of requests. We also plan to investigate models that deal more precisely with the database server, which we have approximated as a simple queue. Case studies will be carried out on other middleware-based applications. We are confident that the approach is general, and could be applied to other component technologies.

5. Acknowledgements

We would like to thank the Software Architectures and Component Technologies (SACT) group in CSIRO Australia for providing the experiment environment for this work, especially Dr. Shiping Chen for his technical support.

6. References

[1] Bachmann, F.; Bass, L.; Klein, M. Deriving architectural tactics, CMU/SEI-2003-TR-004, 2003.
[2] Dilley, J.A.; Friedrich, R.J.; Jin, T.Y.; Rolia, J. Measurement tools and modeling techniques for evaluating web server performance, HPL-96-161, 1996.
[3] Gomaa, H.; Menascé, D.A. Design and performance modeling of component interconnection patterns for distributed software architectures, Proc. Workshop on Software and Performance (WOSP), 2000, pp. 117-126.
[4] Gorton, I. Enterprise Transaction Processing Systems, Addison-Wesley, 2000.
[5] Gorton, I.; Liu, A. Performance evaluation of alternative component architectures for EJB applications, IEEE Internet Computing, vol. 7, no. 3, 2003, pp. 18-23.
[6] Gorton, I.; Liu, A.; Brebner, P. Rigorous evaluation of COTS middleware technology, IEEE Computer, vol. 36, no. 3, 2003, pp. 50-55.
[7] Harel, D.; Kugler, H.; Marelly, R.; Pnueli, A. Smart play-out, 18th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 03), 2003, pp. 68-69.
[8] Hissam, S.; Moreno, G.; Stafford, J.; Wallman, K. Packaging predictable assembly, Component Deployment: IFIP/ACM Working Conference, LNCS 2370, 2002, pp. 108-24.
[9] Jacobson, P.A.; Lazowska, E.D. Analyzing queueing networks with simultaneous resource possession, Comm. of the ACM, vol. 25, no. 2, 1982, pp. 142-151.
[10] Kounev, S.; Buchmann, A. Performance modeling of distributed E-Business applications using queuing Petri nets, Proc. of IEEE Int'l Symp. on Performance Analysis of Systems and Software, 2003.
[11] Kung, H.T.; Robinson, J.T. On optimistic methods for concurrency control, ACM Transactions on Database Systems, vol. 6, no. 2, 1981, pp. 213-226.
[12] Lazowska, E.; Zahorjan, J.; Graham, S.; Sevcik, K. Quantitative System Performance, Prentice Hall, 1984.
[13] Liu, T.K.; Behroozi, A.; Kumaran, S. A performance model for a business process integration middleware, IEEE Int'l Conf. on E-Commerce, 2003, pp. 191-198.
[14] Menascé, D.; Almeida, V.A.F. Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning, Prentice-Hall, 2000.
[15] Rolia, J.A.; Sevcik, K.C. The method of layers, IEEE Transactions on Software Engineering, vol. 21, no. 8, 1995, pp. 689-700.
[16] Woodside, C.M.; Neilson, J.E.; Petriu, D.C.; Majumdar, S. The Stochastic Rendezvous Network Model for Performance of Synchronous Client-Server-Like Distributed Software, IEEE Transactions on Computers, vol. 44, no. 1, January 1995, pp. 20-34.
[17] Xu, J.; Woodside, C.M.; Petriu, D. Performance analysis of a software design using the UML profile for schedulability, performance and time, Proc. Computer Performance Evaluation, Modelling Techniques and Tools (TOOLS), LNCS 2794, 2003, pp. 291-310.