Freshness-driven adaptive caching for dynamic content Web sites


Data & Knowledge Engineering 47 (2003)

Freshness-driven adaptive caching for dynamic content Web sites

Wen-Syan Li *, Oliver Po, Wang-Pin Hsiung, K. Selçuk Candan, Divyakant Agrawal

C&C Research Laboratories Silicon Valley, NEC USA, Inc., North Wolfe Road, Suite SW3-350, Cupertino, CA 95014, USA

Received 30 October 2002; received in revised form 9 April 2003; accepted 14 May 2003

Abstract

Both response time and content freshness are essential to e-commerce applications on the Web. One option to achieve good response time is to build a high-performance Web site by deploying state-of-the-art IT infrastructure with large network and server capacities. Even with such a system architecture, the freshness of the delivered content is limited by network latency: by the time users receive the content, it may already have changed at the server. With the wide availability of content delivery networks, many e-commerce Web applications utilize edge cache servers to cache and deliver dynamic content at locations much closer to users, avoiding network latency. By caching a large number of dynamic content pages in the edge cache servers, response time can be reduced, benefiting from higher cache hit rates. However, this is achieved at the expense of higher invalidation cost, and a higher invalidation cost leads to a longer invalidation cycle (the time to perform invalidation checks on the pages in caches) at the expense of the freshness of cached dynamic content. In this paper, we propose a freshness-driven adaptive dynamic content caching technique, which monitors response time and invalidation cycle length and dynamically adjusts caching policies. We have implemented the proposed technique within NEC's CachePortal Web acceleration solution, and we have conducted experiments to evaluate the effectiveness of the proposed freshness-driven adaptive dynamic content caching technique.
The experimental results show that the proposed technique consistently maintains the best content freshness for users. The experimental results also show that even a Web site with dynamic content caching already enabled can further benefit from deployment of our solution, with improvement of

* Corresponding author. E-mail addresses: wen@ccrl.sj.nec.com (W.-S. Li), oliver@ccrl.sj.nec.com (O. Po), whsiung@ccrl.sj.nec.com (W.-P. Hsiung), candan@ccrl.sj.nec.com (K. Selçuk Candan), agrawal@ccrl.sj.nec.com (D. Agrawal).

© 2003 Elsevier B.V. All rights reserved.

its content freshness by up to 10 times, especially under heavy user request traffic and long network latency. © 2003 Elsevier B.V. All rights reserved.

Keywords: Dynamic content; Web acceleration; Freshness; Response time; Network latency

1. Introduction

Response time and content freshness are essential to e-commerce Web sites. In business terms, the brand name of an e-commerce site is correlated to the type of experience users receive. The need to account for users' quality perception in designing Web servers for e-commerce systems has been highlighted by [1]. Snafus and slow-downs at major Web sites during special events or peak times demonstrate the difficulty of scaling up e-commerce sites. Slow response times and down times can be devastating for e-commerce sites, as reported in a study by Zona Research [2] on the relationship between Web page download time and user abandonment rate. The study shows that only 2% of users will leave a Web site (i.e. the abandonment rate) if the download time is less than 7 s. However, the abandonment rate jumps to 30% if the download time is around 8 s, and reaches 70% as download times exceed 12 s. This study clearly establishes the importance of fast response times for an e-commerce Web site to retain its customers. In technical terms, ensuring the timely delivery of fresh dynamic content to end-users and engineering highly scalable e-commerce Web sites for special peak access times put heavy demands on IT staff. This load is compounded by the ever-changing complexity of e-commerce applications. For many e-commerce applications, Web pages are created dynamically based on the current state of a business, such as product prices and inventory, stored in database systems. This characteristic requires e-commerce Web sites to deploy and integrate Web servers, application servers, and database systems at the backend.
A basic system architecture of database-driven Web sites consists of the following components:

1. A Web server (WS), which receives user requests and delivers the dynamically generated Web pages.

2. An application server (AS), which incorporates all the necessary rules and business logic to interpret the data and information stored in the database. The AS receives user requests for HTML pages and, depending upon the nature of a request, may need to access the DBMS to generate the dynamic components of the HTML page.

3. A database management system (DBMS), which stores, maintains, and retrieves all the necessary data and information to model a business.

When the Web server receives a request for dynamic content, it forwards the request to the application server along with its request parameters (typically included in the URL string). The Web server communicates with the application server using URL strings and cookie information, which is used for customization, and the application server communicates with the database using

queries. When the application server receives such a request from the Web server, it processes it and may access the underlying databases to extract the relevant information needed to dynamically generate the requested page. To improve the response time, one option is to build a high-performance Web site with greater network and server capacity by deploying state-of-the-art IT infrastructure. However, without the deployment of dynamic content caching solutions and content delivery network (CDN) services, dynamic content is generated on demand. In this case, all delivered Web pages are generated based on the current business state in the database; however, by the time users receive the content, the business state could already have changed due to network latency. The alternative solution is to deploy network-wide caches so that a large fraction of requests can be served remotely rather than all of them being served from the origin Web site. This solution has two advantages: serving users from a nearby cache and reducing the traffic to the Web sites. Many content delivery network services [3,4] provide Web acceleration services. A study in [5] shows that CDNs indeed have significant performance impact. However, for many e-commerce applications, HTML pages are created dynamically based on the current state of a business, such as product prices and inventory, rather than static information. As a result, the time to live (TTL) for these dynamic pages cannot be estimated in advance. Therefore, content delivery by most CDNs is limited to handling static portions of pages and streaming media, rather than the full spectrum of dynamic content that constitutes the bulk of e-commerce Web sites. Applying caching solutions for Web applications and content distribution has received a lot of attention in the Web and database communities [6–17].
These provide various solutions to accelerate content delivery as well as techniques to assure the freshness of the cached pages. Note that since Web content is delivered through the Internet, content freshness can only be assured rather than guaranteed. For Web sites with dynamic content caching enabled, caching more dynamic content in edge cache servers improves response time through higher cache hit rates and low network latency. However, in a database-driven Web site, the databases must be monitored so that cached pages impacted by database content changes can be invalidated or refreshed in a timely manner. As a result, caching a large number of Web pages at edge cache servers to yield fast response times comes at the expense of longer invalidation cycles, since more computational resources and time are required to complete invalidation checking. On the other hand, when the invalidation cycle is long, the freshness of cached pages cannot be maintained at a desirable level. Furthermore, many parameters, such as database update rates, user request rates, and network latency, also affect user response time and the length of invalidation cycles. How to architect Web sites and tune caching policy dynamically to serve fresh content with short response times is a complex issue. In this paper, we focus on the issue of how to maintain the best content freshness that can be assured for a given user request rate and database update rate. We propose an adaptive dynamic content caching technique that monitors response time and invalidation cycle as feedback for dynamic adjustment of the caching policy. By considering the trade-off between invalidation cycle and response time, it maintains the best content freshness for the users. The rest of this paper is organized as follows. In Section 2, we discuss issues in caching policy tuning specifically for dynamic content Web sites. In Section 3, we give an

overview of our proposed solution. In Section 4, we describe the dynamic content invalidation schemes used in NEC's CachePortal. In Section 5, we describe the dependency between user request response time and invalidation cycles. In Section 6, we identify how to fine-tune the caching policy to balance response time and invalidation cycles for maintaining freshness. We also present experimental results that evaluate the effectiveness of our proposed technique for ensuring content freshness. Section 7 summarizes related work and Section 8 concludes the paper.

2. Issues in caching policy tuning for dynamic content Web sites

In this section, we present the problem formulation and give an overview of our proposed solution. We start with the description of an e-commerce Web site system architecture that is dynamic content caching enabled. There are multiple tiers in the content delivery infrastructure where cache servers can be deployed. They include the following:

Database caches, which are placed between application servers and database systems to accelerate database access via JDBC or ODBC interfaces. The work addressed in [18] focuses on the caching issues in this tier.

DBCaches, which are usually placed in data centers. DBCaches are considered mirror databases and are used by applications hosted in the data centers. By generating requested content from locations close to users, content delivery can be accelerated. The work described in [15,19] focuses on caching issues in this tier.

Content page caches, which are reverse caches and can be placed close to users, functioning as edge caches, or in front of Web servers, functioning as front-end caches. The study of the impact of cache positions on user response time in [17] validates that edge caches have many advantages over front-end caches, such as fast and more predictable user response time.
The work described in [6,7,9,13,14,16,20] focuses on caching issues in this tier. In the scope of this paper, we focus on content delivery network architectures where cache servers are placed at the network edges close to users. Fig. 1 shows the architecture and data flows of a database-driven e-commerce Web site where NEC's CachePortal technology [7,9,17] is deployed. The architecture is very similar to that of typical e-commerce Web sites, except for the two CachePortal components, Sniffer and Invalidator. The CachePortal technology enables dynamic content caching by utilizing the Sniffer to derive the URL/DB query mapping, i.e. the relationships between cached pages and the database contents that are used to generate these pages. The Invalidator is responsible for monitoring database changes by scanning the database update log. The database changes are monitored periodically by the Invalidator, which then performs an invalidation checking process to identify the cached pages impacted by the database changes during the current period. Note that the knowledge about dynamic content is distributed across three different servers: the Web server, the application server, and the database management server. Consequently, it is not straightforward to create a mapping between the data and the corresponding Web pages automatically, in contrast to the other approaches [21–24], which assume such mappings are provided by the system designers. The detailed descriptions of the automated construction of such URL and database query mappings are given in [7]. In Section 4, we give a detailed description of the invalidation process.

Fig. 1. A database-driven e-commerce site with dynamic content caching enabled.

In this configuration, user requests are directed to the edge cache that is close to the requesting users. If there is a cache hit, the requested content is delivered to the users from the cache. Otherwise, requests are forwarded to the origin Web server. Note that in our implementation, edge caches behave as both reverse proxies (if there is a hit) and proxies (if there is a miss, so that the requests are forwarded). Since edge cache servers are close to the users, the network latency between the cache servers and the users can be disregarded compared with the network latency between the cache servers and the origin Web sites. Let us define the three main terms that have impact on the freshness of delivered contents as follows:

Response time at edge cache servers: the round-trip time for requests that are served at the edge cache servers (as a result of a cache hit). The response time from edge caches is expected to be extremely fast.

Response time at origin Web sites: the round-trip time for requests that are served at the origin Web servers (as a result of a cache miss).

Invalidation cycle: the time required to process invalidation checks for all the pages in the cache servers. The invalidation cycle is impacted by the following factors: the update rates in the database systems; the number of pages in the cache servers; and the correlation between the pages in the cache servers (in terms of the query statements used to generate these pages). More detailed information is given in Section 4 in the discussion of query type concepts and consolidation of invalidation.

For Web sites that do not deploy any dynamic content caching solution, all pages need to be dynamically generated at the origin Web site. The content freshness that such a system can assure is the response time at the Web site. For example, if the average response time for a request at a Web site is 10 s, the delivered page may be fresh, but since the database content can change after the page is generated, the assured freshness of the page (i.e. its maximum age) is 10 s. For Web sites that deploy dynamic content caching solutions, some pages are dynamically generated at the origin Web site, but the majority of the pages are delivered from the cache server. Since database content changes need to be reflected in the pages in the cache server, invalidation checks need to be performed periodically or on demand to ensure content freshness. In these Web sites, the content freshness that can be assured is the maximum of (1) the response time from the origin Web sites, (2) the response time from the edge caches, and (3) the length of the invalidation cycle.
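The assured freshness just defined can be written directly as a maximum over the three terms. A minimal sketch (the function and variable names are ours, not from the paper):

```python
def assured_freshness(origin_rt, edge_rt, invalidation_cycle):
    """Worst-case age (in seconds) of any content the site serves.

    A caching-enabled site can only assure freshness up to the slowest
    of: serving a miss from the origin, serving a hit from the edge,
    and completing one invalidation cycle over all cached pages.
    """
    return max(origin_rt, edge_rt, invalidation_cycle)

# A site answering misses in 10 s and hits in 0.05 s, with a 4 s
# invalidation cycle, can assure freshness of only 10 s.
worst_age = assured_freshness(10.0, 0.05, 4.0)
```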
Note that since the response time from edge caches is much shorter than both the response time from the origin Web sites and the length of the invalidation cycle, the content freshness that can be assured is the larger of the response time from the origin Web sites and the length of the invalidation cycle. Depending on the caching policy, traffic patterns, cache hit rates, server load, and database update rates, sometimes the response time at the origin Web sites can be much longer than the invalidation cycle time. In Fig. 2(b), we show a system configuration that caches few pages. As a consequence, it is expected to have a short invalidation cycle, at the expense of slow response times when there is a cache miss, since the Web site is carrying a heavy load. On the other hand, the system configuration in Fig. 2(c) has a different caching policy and caches a large number of Web pages at its edge caches. This configuration provides fast response times for all requests. When there is a cache hit, the response time is naturally very fast. When there is a cache miss, the response is still fast since the Web site has a lighter load, as most of the requests are served directly from the edge cache servers. However, this configuration and its caching policy have potentially low content freshness, since it will take much longer to complete the necessary invalidation checks for all the pages in the cache servers.
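A single tuning step between these two extremes — growing the cached page set when origin response time dominates, shrinking it when the invalidation cycle dominates — might be sketched as follows. The function name, step size, and tolerance are our illustrative choices, not part of the paper's implementation:

```python
def adjust_cached_pages(n_cached, origin_rt, inv_cycle, step=100, tol=0.1):
    # If cache misses are the bottleneck (origin response time exceeds
    # the invalidation cycle), cache more pages to offload the origin
    # site; if invalidation checking is the bottleneck, cache fewer
    # pages so the cycle finishes sooner. Near equality, the system is
    # at the equilibrium and the policy is left alone.
    if abs(origin_rt - inv_cycle) <= tol * max(origin_rt, inv_cycle):
        return n_cached
    if origin_rt > inv_cycle:
        return n_cached + step
    return max(0, n_cached - step)

# Few pages cached, slow origin (the Fig. 2(b) regime): cache more.
few_cached = adjust_cached_pages(1_000, origin_rt=8.0, inv_cycle=2.0)
# Many pages cached, long invalidation cycle (Fig. 2(c)): cache fewer.
many_cached = adjust_cached_pages(50_000, origin_rt=1.0, inv_cycle=9.0)
```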

Fig. 2. Response time and freshness of delivered content under various system architectures: (a) without dynamic content caching, (b) with dynamic content caching enabled, (c) with dynamic content caching enabled, (d) deploying freshness-driven adaptive dynamic content caching, (e) higher response time at the Web site, (f) freshness assured at the new equilibrium.

3. Solution overview

From Section 2, we learn that the freshness of requested pages can be ensured at the highest level when the hit rate is tuned to a point where the response time from the origin site and the invalidation cycle for the cached pages are equivalent, as shown in Fig. 2(d). In this paper, we propose a freshness-driven adaptive dynamic content caching technique. The proposed technique aims at maintaining the best content freshness that a system configuration can support and assure. Our technique does not blindly attempt to maintain the lowest average response time or average content freshness. Instead, it assures good content freshness: the delivered content is either fresh or not older than a threshold (i.e. the assured content freshness) given by the system. For the configuration in Fig. 2(b), the proposed freshness-driven adaptive caching would tune its caching policy by increasing the number of cached pages so that the traffic at the origin Web site is reduced until the response time at the origin Web site is close to the length of the invalidation cycle, where the assured freshness of delivered content is optimal. On the other hand, for the configuration in Fig. 2(c), the proposed freshness-driven adaptive caching would decrease the number of pages cached at the edge cache servers so that invalidation cycles are shortened until the response time at the origin Web site and the length of the invalidation cycle reach a new equilibrium, as shown in Fig. 2(d). We define two terms as follows: at the equilibrium, the length of the invalidation cycle is the assured freshness, and the response time at the origin Web site is the assured response time. The equilibrium in Fig. 2(d), however, may change when the influential parameters change. For example, if network latency, database update rates, and user request rates increase, the response time at the origin Web site will increase accordingly (Fig.
2(e)). To reduce the response time at the origin Web site, the freshness-driven adaptive caching will increase the number of cached pages at the edge cache. Consequently, the request rate at the origin Web site is reduced (and so is the response time) at the expense of the invalidation cycle. Note that the equilibrium in Fig. 2(f) is higher than the equilibrium in Fig. 2(d). The proposed freshness-driven adaptive caching has various advantages. First, it allows the best content freshness among these system architectures, with or without dynamic content caching solutions. Second, it also yields fast response times. Third, it provides assurance for both freshness and response time as follows: the system does not serve content that is older than the assured freshness, and the system does not serve content slower than the assured response time.

4. Invalidation checking process

In this section, we describe the invalidation process in the CachePortal technology: Consolidated Invalidation Checking, a method to enable an efficient invalidation checking process. In Section 5,

we will show the dependency between cache hit rates and invalidation cycles, and how the invalidation checking process can be tuned to achieve a target cache hit rate and, consequently, a desired response time. We start by introducing an example database and the terminology used later in this section.

4.1. Concept of polling queries for invalidation checking

Assume that the database has the following two tables:

Car(maker, model, price)
Mileage(model, EPA)

Say that the following query, Query1, has been issued to produce a Web page, URL1:

select maker, model, price
from Car
where maker = 'Toyota';

Say, now, we observe, based on the database log, that a new tuple (Toyota, Avalon, $25,000) is inserted into the table Car. Query1 only accesses the table Car, hence we can check if the new tuple value satisfies the condition stated in Query1. Since it satisfies the query, Query1 has been impacted and consequently URL1 needs to be invalidated or refreshed. Note that an update operation can be viewed as a delete operation followed by an insert operation, so the procedure just described handles update operations as well. Now assume the following query, Query2, has been issued to produce a Web page, URL2:

select Car.maker, Car.model, Car.price, Mileage.EPA
from Car, Mileage
where Car.maker = 'Toyota' and Car.model = Mileage.model;

Note that this query involves more than one table and there is a join operation. Now we observe that a new tuple (Toyota, Avalon, $25,000) is inserted into the table Car. Since Query2 needs to access two tables, we first check if the new tuple value satisfies the condition associated with only the table Car stated in Query2. If it does not, we do not need to test the other condition, and we know the new insert operation does not impact the query result of Query2; consequently, URL2 does not need to be invalidated or refreshed.
If the newly inserted tuple does satisfy the condition associated with the table Car, we cannot determine whether or not the query result of Query2 has been impacted unless we check the rest of the condition, which is associated with the table Mileage. To check whether or not the condition Car.model = Mileage.model can be satisfied, we need to access the table Mileage by issuing the following query, Query3, to the database:

select Mileage.model, Mileage.EPA
from Mileage
where 'Avalon' = Mileage.model;
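The two-step check for Query2 — first test the Car-only condition against the inserted tuple, then issue a polling query such as Query3 against Mileage — can be sketched with an in-memory SQLite database. The table contents here are our own illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Mileage(model TEXT, EPA INTEGER);
    INSERT INTO Mileage VALUES ('Avalon', 28);
""")

inserted = ("Toyota", "Avalon", 25000)  # new tuple observed in the log

invalidate_url2 = False
# Step 1: does the new tuple satisfy the Car-only condition of Query2
# (Car.maker = 'Toyota')?
if inserted[0] == "Toyota":
    # Step 2: a polling query (Query3) checks the remaining join
    # condition Car.model = Mileage.model against the table Mileage.
    rows = conn.execute(
        "SELECT model, EPA FROM Mileage WHERE ? = model", (inserted[1],)
    ).fetchall()
    invalidate_url2 = len(rows) > 0  # non-empty result => invalidate
```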

If the result of Query3 is non-empty, the query result for Query2 needs to be invalidated. Queries such as Query3, which are issued to determine whether certain query results need to be invalidated, are referred to as polling queries.

4.2. Query types

We now introduce the terminology that is relevant to the proposed consolidated invalidation checking:

Query type: a query type is the definition of a query. It is a valid SQL statement which may or may not contain variables. We denote a query type as Q(V1, ..., Vn), where each Vi is a variable that has to be instantiated by the application server before the query is passed to the DBMS.

Bound query type: a bound query type is a valid SQL statement which does not contain variables. We denote a bound query type as Q(a1, ..., an), where each ai is a value instantiated for variable Vi. Queries that are passed by the application server to the DBMS are bound queries.

Query instance: a query instance is a bound query type with an associated request time stamp. We denote a query instance as Qt(a1, ..., an), where t is the time at which the application server passed the request to the DBMS.

Therefore, multiple query instances can have the same bound query type, and multiple bound query types may have the same query type. If the Invalidator observes the following three query instances, Query4, Query5, and Query6, in the log to generate user requested pages:

select maker, model, price from Car where maker = 'Toyota';
select maker, model, price from Car where maker = 'Honda';
select maker, model, price from Car where maker = 'Ford';

we can derive a query type, Query_Type1, as:

select maker, model, price from Car where maker = $var;
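Deriving the query type and representing its instances in a table can be sketched as follows. The regex-based parameterization and the table names are our simplifications, and the final join anticipates the consolidated checking scheme described below in this section:

```python
import re
import sqlite3

def query_type_of(bound_query):
    # Simplification: derive the query type by replacing each quoted
    # string constant with a $var placeholder (a real implementation
    # would parse the SQL rather than pattern-match it).
    return re.sub(r"'[^']*'", "$var", bound_query)

q4 = "select maker, model, price from Car where maker = 'Toyota'"
q5 = "select maker, model, price from Car where maker = 'Honda'"
q6 = "select maker, model, price from Car where maker = 'Ford'"
assert query_type_of(q4) == query_type_of(q5) == query_type_of(q6)

# Represent the three instances of Query_Type1 in a table, and stage
# newly inserted Car tuples in a Delta table; a single polling query
# over the join then finds every impacted query instance at once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Query_Type1(QUERY_ID TEXT, QUERY_INSTANCE TEXT);
    INSERT INTO Query_Type1 VALUES
        ('Query4', 'Toyota'), ('Query5', 'Honda'), ('Query6', 'Ford');
    CREATE TABLE Delta(Maker TEXT, Model TEXT, Price INTEGER);
    INSERT INTO Delta VALUES
        ('Acura', 'TL', 30000), ('Toyota', 'Avalon', 25000),
        ('Honda', 'Accord', 20000), ('Lexus', 'LS430', 54000);
""")
invalidated = {qid for (qid,) in conn.execute("""
    SELECT Query_Type1.QUERY_ID FROM Query_Type1, Delta
    WHERE Delta.Maker = Query_Type1.QUERY_INSTANCE
""")}
```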
We can create a temporary table, Query_Type1, to represent the above three query instances, Query4, Query5, and Query6, as follows:

QUERY_ID      QUERY_INSTANCE
Query4        Toyota
Query5        Honda
Query6        Ford

There are usually a limited number of query types designed for a Web-based e-commerce application. For example, the above query type could be associated with a query interface that allows users to specify car maker names to retrieve model and price information of the cars by the specified maker.

4.3. Consolidated invalidation checking

We now explain how to identify which query results have been impacted as a result of updates to the database. Assume that there are three queries to be checked, Query4, Query5, and Query6. Let us also assume that the following four tuples are inserted into the database:

(Acura, TL, $30,000)
(Toyota, Avalon, $25,000)
(Honda, Accord, $20,000)
(Lexus, LS430, $54,000)

Assume that we create a temporary table, Delta, for the content changes in the table Car. One simple way to perform invalidation checking is to test each of the three queries against each of the four inserted maker values (Acura, Toyota, Honda, and Lexus) in turn. As a result, 12 (i.e. 3 × 4) polling queries need to be issued. Obviously this is not very efficient. The other way is to consolidate a large number of required invalidation checks into a more compact form through transformation. For example, a single polling query, Query10, can be issued as follows:

select Query_Type1.QUERY_ID
from Query_Type1, Delta
where Delta.Maker = Query_Type1.QUERY_INSTANCE;

Query_Type1.QUERY_ID gives the list of query results that need to be invalidated. In this example, Query4 and Query5 will be invalidated. Now let us look at another query type. For example, the following two queries, Query11 and Query12, are also in the log to generate user requested pages:

select Car.maker, Car.model, Car.price, Mileage.EPA
from Car, Mileage
where Car.maker = 'Toyota' and Car.model = Mileage.model;

select Car.maker, Car.model, Car.price, Mileage.EPA
from Car, Mileage
where Car.maker = 'Ford' and Car.model = Mileage.model;

We can derive a query type, Query_Type2, as:

select Car.maker, Car.model, Car.price, Mileage.EPA
from Car, Mileage
where Car.maker = $var and Car.model = Mileage.model;

And we may create a temporary table, Query_Type2, to represent the above two query instances as follows:

QUERY_ID      QUERY_INSTANCE
Query11       Toyota
Query12       Ford

A single polling query, Query13, can be issued as follows to perform fully consolidated checking:

select Query_Type2.QUERY_ID
from Car, Mileage, Query_Type2, Delta
where Car.maker = Delta.Maker and Car.model = Mileage.model
and Delta.Maker = Query_Type2.QUERY_INSTANCE;

Query_Type2.QUERY_ID gives the list of query results that need to be invalidated. In this example, both Query11 and Query12 will be invalidated.

4.4. Summary

In this section, we described the invalidation checking scheme in CachePortal. In our current implementation, the full consolidation scheme is deployed by default to speed up the invalidation process. We summarize some important characteristics of the consolidated invalidation checking schemes as follows: The invalidation cycle length is mainly impacted by the number of query types, since the number of query types determines the number of polling queries that are executed for invalidation checking. The number of query instances per query type is impacted by the user request rate and request distribution. The database update rates and the number of query instances per query type have relatively lower impact on the invalidation cycle length, since they only increase the query processing complexity for each query type rather than the number of polling queries to be processed.

Network latency has no impact on the invalidation cycle, since the Invalidator is usually deployed on the same machine as the DBMS. Invalidation messages do need to be sent out across the Internet; however, since they are sent through a persistent connection between the Invalidator and the edge caches, the impact of network latency and connection delay on invalidating cached content is low. In the next section, we conduct experiments to validate the above characteristics based on the analysis of the invalidation process in this section.

5. Dependency between response time and invalidation cycle

In this section, we examine the dependency between the response time and the invalidation cycle length. We have conducted experiments to verify this dependency. We first describe the general experimental setup, which consists of the databases, application servers, and network infrastructure used in the experiments.

5.1. General experimental setting

We used the following experimental setup in our evaluations. We used two heterogeneous networks available in NEC's facility in Cupertino, California: one is used by the C&C Research Laboratories (referred to as CCRL) and the other is used by cacheportal.com (referred to as CP). Users and cache servers (edge caches in this setting) are located in CCRL, while the Web server, application server, and DBMS are located in CP. The average number of hops between CCRL and CP is 15. The average throughput measured for CCRL–CP, CP–CP, and CCRL–CCRL connections is 84.7, and KB/s respectively. The average round-trip times on CCRL–CP, CP–CP, and CCRL–CCRL connections are 321, 1, and 2 ms respectively. To summarize, the experiment environment represents real-world networking characteristics: (1) connectivity within the same network is substantially better than that across the Internet; and (2) there is network latency.
Users and the edge cache server are located in the CCRL network. The Web server, application server, and DBMS are located in the CP network. Thus, the network latency for cache-missed requests is 321 ms. Oracle 9i is used as the DBMS and runs on a dedicated machine. The database contains seven tables with 1,000,000 rows each. The database update rate can be set to 5000, 10,000, or 15,000 rows per table per minute. The maximum number of pages that can be cached is 2,000,000. These pages are generated by queries that can be categorized into 200 query types; thus, there are 10,000 pages (query instances) per query type. BEA WebLogic 6.1 is used as the WAS, and Apache is used as the edge cache server; they are located on dedicated machines. All of the machines are single-CPU Pentium III 700 MHz PCs running Red Hat Linux 7.2 with 1 GB of memory.

The invalidation mode is set to continuous: the Invalidator fetches a new block of the database update log and starts processing it immediately after it completes an invalidation check cycle.

5.2. Correlation between number of query types and cache hit rates at the edge cache

The problem of cache replacement has been studied extensively. Many algorithms have been proposed for general-purpose caching, such as LRU and LFU, and some variations of these are designed specifically for replacement of cached Web pages. In the scope of dynamic caching for a Web site, however, the cache invalidation rate is an important additional factor, since a high invalidation rate leads to a potentially high cache miss rate in the future. As a result of the high miss rate, the WAS and DBMS have to handle a higher load to regenerate Web pages. A high invalidation rate also forces the Invalidator component to handle more invalidation checking, issue more polling queries to the DBMS, and send out more invalidation messages. We have developed a self-tuning cache replacement algorithm (ST) that takes into consideration (1) user access patterns, (2) page invalidation patterns, and (3) temporal locality of the requests. The caching priority of each page is re-calculated periodically; in the current implementation, the priority is re-calculated every minute. Note that the frequency of re-calculation does have an impact on the cache hit rate: the more often the caching priorities are re-calculated, the higher the cache hit rates can be. The frequency of re-calculation should be adjusted dynamically by weighing the benefit of higher hit rates against the additional cost incurred by frequent re-calculations. The access rate and the invalidation rate are the access count and invalidation count within a time period.
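A minimal sketch of this periodic recomputation follows (the exact weighting formula is given next in the text). The function names and the guard for pages that were never invalidated in the period are our assumptions; the decay factor of 0.8 follows the paper's experimental setting.

```python
def caching_priority(access_rate, invalidation_rate, prev_priority, alpha=0.8):
    """Smoothed estimate of accesses per invalidation for one page.

    alpha is the temporal decay factor (0.8 in the paper's experiments);
    treating a zero invalidation count as 1 is our own assumption to
    avoid division by zero for never-invalidated pages.
    """
    current = access_rate / max(invalidation_rate, 1)
    return (1 - alpha) * current + alpha * prev_priority

def query_type_priority(page_stats, prev_priority, alpha=0.8):
    """Rank a whole query type by aggregating its pages' rates."""
    total_access = sum(access for access, _ in page_stats)
    total_inval = sum(inval for _, inval in page_stats)
    return caching_priority(total_access, total_inval, prev_priority, alpha)
```

Recomputing these priorities each period and caching only the top-ranked query types is what produces the hit-rate behavior reported in Fig. 3.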
The caching priority of a page during a time period t, Caching_priority(t), is calculated as

Caching_priority(t) = (1 - α) × (access rate / invalidation rate) + α × Caching_priority(t - 1)

where α is the temporal decay factor, whose value is between 0 and 1. A value of 1 for α makes the system treat all access patterns equally, while a value of 0 makes the system consider only the access patterns during the current time period. In the experiments the value of α is set to 0.8, which yields the best results. The intuition behind this formula is that it estimates the average number of accesses to a page between two successive invalidations: the higher this number, the larger the benefit of keeping the page in the cache. After calculating the caching priority for each page, we calculate the caching priority of each query type in a similar way, by aggregating the access rates and invalidation rates of all pages generated by the same query type. Consequently, we are able to select only a small number of query types, cache only the pages generated by these query types, and still maintain a high hit rate at the caches. Fig. 3 shows the correlation between the number of query types and the cache hit rate. As the figure shows, when we select 20 query types (10% of all query types), the cache hit rate is close to 30%; when we select 100 query types (50% of all query types), the cache hit rate is close to 80%. An additional study of the correlation between the cache hit rate and the invalidation rate can be found in [25], as well as

how to select appropriate query types (i.e. a small number of query types) to monitor while still serving most user-requested pages. For example, in the experiments conducted in this paper, we set 200 as the maximum number of query types. Note that a Web site may have more than 200 query types; however, we can identify and monitor the 200 query types that contribute most to a high cache hit rate.

Fig. 3. Correlation between number of query types and cache hit rates.

5.3. Effects of request rates and numbers of query types on invalidation cycles

As the summary in Section 4 states, the number of query types has the major impact on the invalidation cycle, since it directly determines the number of polling queries that must be executed. In the experiments, we vary the number of query types and measure the invalidation cycle time for 20, 60, 100, 140, and 180 query types. Fig. 4 shows the experimental results illustrating the effect of the number of query types on the invalidation cycle: as the number of query types increases, so does the length of the invalidation cycle. Note that the polling query for each query type joins two temporary tables: the query instance table and the database change table. When the invalidation cycle is shorter, the database change table is correspondingly smaller while the query instance table remains the same; as a result, the polling queries for a shorter invalidation cycle have a lower query processing cost. Fig. 4 also shows that the request rates have limited impact on the invalidation cycle, although a higher request rate does put a heavier load on the DBMS, where the invalidation checks are performed.

5.4. Effects of request rates and cache hit rates on request response time at the origin Web site

The next experiment we conducted measures the effects of cache hit rates and request rates on the response time at the origin Web site.
Note that we do not present the effects of cache hit

rates and request rates on the response time at the edge cache server, since the edge cache server (based on the Squid implementation) is extremely scalable and fast at delivering small objects, such as cached dynamic content, within the same network.

Fig. 4. Effects of request rates and numbers of query types on invalidation cycles (database update rate = 5000 tuples per table per minute).

Fig. 5. Effects of request rates and cache hit rates on request response time at the origin Web site (database update rate = 5000 tuples per table per minute).

In Fig. 5, we plot the response time at the origin Web site with respect to a set of different request rates (i.e. from 40 requests per second to 680 requests per second) and different cache hit

rates (i.e. 30%, 65%, 80%, 84%, and 88%). Note that since the cache hit rates are directly determined by the numbers of query types, the hit rates of 30%, 65%, 80%, 84%, and 88% correspond to setups with 20, 60, 100, 140, and 180 query types respectively. The setups for the experiments in Figs. 4 and 5 are the same, but we measure invalidation cycles in Fig. 4 and request response time in Fig. 5. The experimental results in Fig. 5 indicate that request response time is much more sensitive to the request rate than the length of the invalidation cycle is: when the request rate reaches a certain threshold, the response time increases sharply.

5.5. Effects of network latency on request response time at the origin Web site

The network latency between users and the origin Web site is around 320 ms round trip. Under this latency, we measured the request response time at the origin Web site as shown in Fig. 5; in that experiment, we varied the edge cache hit rate so that the request rate at the origin Web site changed accordingly. In the experiment in this section, we instead fix the cache hit rate at 80% and vary both the request rate and the network latency between users and the origin Web site (round-trip times of 160, 320, 480, 640, and 800 ms). In Fig. 6, we plot the response time at the origin Web site with respect to a set of different overall request rates and network latency settings. The experimental results in the figure indicate that request response time at the origin Web site is very sensitive to both request rate and network latency: when the request rate reaches a certain threshold, the response time increases sharply, and the response time increases at a faster rate with longer network latency.
As we can see from the figure, the slope of the plot for the network latency of 800 ms is much steeper than that for the network latency of 160 ms.

Fig. 6. Effects of request rates and network latency on request response time at the origin Web site (database update rate = 5000 tuples per table per minute, hit rate = 80%).

5.6. Effects of database update rates on invalidation cycles and request response time

The next experiment we conducted measures the effects of database update rates on cache invalidation cycles and request response time. We repeated the experiments in Sections 5.3 and 5.4, varying the database update rate from 5000 tuples per table per minute to 10,000 and 15,000 tuples per table per minute. Our observations based on the experimental results are as follows. The increase in the invalidation cycle length is almost proportional to the increase in the database update rate. This is expected, as the polling query processing costs are proportional to the database update rate when the number of query instances is fixed. This is illustrated in Figs. 4, 7, and 8.

Fig. 7. Effects of request rates and numbers of query types on invalidation cycles (database update rate = 10,000 tuples per table per minute).

Fig. 8. Effects of request rates and numbers of query types on invalidation cycles (database update rate = 15,000 tuples per table per minute).

For example, when the number of query types is fixed at 40, increasing the database

update rate from 5000 to 10,000 per table per minute results in only a 70% increase in the invalidation cycle (i.e. an increase from 1.7 to 2.9 s). The database update rates have limited impact on the request response time. For example, when the database update rate increases from 5000 to 10,000 tuples per table per minute, the request response time increases by about 10%. Although a higher database update rate puts a heavier load on the DBMS, the major portion of request response time is network latency rather than server processing latency; as a result, the effect of database update rates on the request response time is very limited. This is illustrated in Figs. 4, 9, and 10.

Fig. 9. Effects of request rates and hit rates on invalidation cycles (database update rate = 10,000 tuples per table per minute).

Fig. 10. Effects of request rates and hit rates on invalidation cycles (database update rate = 15,000 tuples per table per minute).

6. Freshness-driven adaptive dynamic content caching

In this section, we describe the proposed freshness-driven adaptive dynamic content caching, followed by a discussion and analysis of the experimental results.

6.1. Adaptive caching policy

To achieve the best assured dynamic content freshness, we need to tune the caching policy so that the response time and the invalidation cycle stay close to an equilibrium point; the two can only be improved at each other's expense. In Section 5, we found that response time is affected by (1) network latency, (2) request rates, (3) database update rates, (4) invalidation cycle (frequency), and (5) cache hit rates. Network latency, request rates, and database update rates depend on the network infrastructure and application characteristics and hence cannot be controlled. However, we can control (i.e. tune) the frequency of invalidation cycles, and the cache hit rate can be controlled through the number of cached query types (Figs. 9 and 10). We can therefore derive the following adaptive caching policy for keeping the request response time and the invalidation cycle close to an equilibrium point:

If the response time is larger than the length of the invalidation cycle, we lower the request response time by increasing the number of query types until the request response time and the invalidation cycle reach an equilibrium point.

If the invalidation cycle is longer than the request response time, we shorten the invalidation cycle by decreasing the number of query types until the request response time and the invalidation cycle reach an equilibrium point.

Note that when the invalidation cycle is longer than the response time and we shorten the invalidation cycle by decreasing the number of query types, the increase in request response time and the decrease in invalidation cycle occur at the same time.
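The two adjustment rules above amount to a simple feedback controller. The sketch below is ours; the step size and tolerance are hypothetical tuning knobs that the paper does not specify.

```python
def adjust_query_types(n_types, response_time, invalidation_cycle,
                       step=10, tolerance=0.1):
    """One iteration of the adaptive policy.

    Moves the number of cached query types toward the equilibrium point
    where response time and invalidation cycle length meet. step and
    tolerance are illustrative, not values from the paper.
    """
    # close enough to equilibrium: leave the caching policy unchanged
    if abs(response_time - invalidation_cycle) <= tolerance * invalidation_cycle:
        return n_types
    if response_time > invalidation_cycle:
        # cache more query types -> higher hit rate -> lower response time
        return n_types + step
    # fewer query types -> fewer polling queries -> shorter invalidation cycle
    return max(step, n_types - step)
```

Because each move changes both quantities in opposite directions, the controller approaches the equilibrium point from either side.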
As a result, an equilibrium point can be reached quickly. Similarly, when we increase the number of query types, the decrease in request response time and the increase in invalidation cycle occur at the same time. In the current implementation, the adaptive caching policy is deployed at the edge server. The response time is measured at the cache server, assuming that the round trip between users and the cache server (which also functions as a user-side proxy) is negligible.

6.2. Experiments

We conducted a series of experiments to evaluate the proposed freshness-driven adaptive dynamic content caching technique. For the first 60 min of the experiments, we created setups by varying the request rates and database update rates every 10 min while the network condition remained the same (network latency of 320 ms round trip). After the first 60 min, we changed the network latency every 10 min while the user request rate and database update rate were fixed. We then observed the effects of our freshness-driven adaptive caching technique on maintaining the best freshness that

can be assured for a given system configuration and the prevailing traffic and network conditions.

Fig. 11. Impact of adaptive caching on content freshness.

In Fig. 11, we plot the response time as a dotted line and the invalidation cycle as a solid line; the numbers next to the dotted line are the numbers of query types being cached. We observe that sudden changes in the request rate cause a temporary imbalance between response time and invalidation cycle; the response time, in particular, is very sensitive to changes in the request rate. As time moves on, the freshness-driven adaptive caching technique makes the necessary adjustments to the number of query types, which impacts the cache hit rate and the response time. As time elapses, the response time and invalidation cycle shift back to a new equilibrium point, which supports the best content freshness that can be assured. From Fig. 11, we can see that the request response time at the origin Web site can be adjusted quickly, since it is very sensitive to the cache hit rate; we also observe that the invalidation cycle cannot be adjusted as quickly. These observations are consistent with our experimental results in Section 5. Next we compare the content freshness that can be assured by three different system configurations:

1. a system configuration that does not deploy any dynamic content caching solution;
2. a system configuration that is dynamic content caching enabled but does not employ the proposed freshness-driven adaptive caching technique; and
3. a system configuration that is dynamic content caching enabled and employs the proposed freshness-driven adaptive caching technique.

In Fig. 12 we plot the larger of the response time and the invalidation cycle length for these three system configurations. This value gives the content freshness that a given system configuration can

support for given request and database update rates.

Fig. 12. Comparisons of assured content freshness for three system configurations.

The figure shows the substantial benefit provided by the proposed freshness-driven adaptive caching technique: the third configuration consistently provides much fresher content than the other two. In particular, during the heavy traffic conditions (i.e. during the periods of 10-20, 30-40, and min) and the long network latency condition (i.e. during the period of min), the system configuration with freshness-driven adaptive caching supports content freshness up to 20 times better than the configuration without dynamic content caching. It also supports content freshness up to 10 times better than even those systems that already deploy dynamic content caching solutions. The experiments strongly demonstrate the effectiveness and benefits of the proposed freshness-driven adaptive caching technique in providing fresh dynamic content.

6.3. Summary

In Fig. 16 we summarize the assured content freshness for the experimental results in Figs. 13-15. The X-axis is the number of query types and the Y-axis is the assured content freshness (i.e. the maximum of the response time and the invalidation cycle). We plot the assured content freshness at different user request rates and database update rates with respect to different numbers of query types at the cache servers. The curves of the assured content freshness are shaped like a bowl; the best assured content freshness is achieved at the bottom of the bowl, which is circled. The proposed freshness-driven adaptive caching technique continuously monitors the response time and the invalidation cycle and increases or decreases the number of query types in the cache server so that the assured content freshness is maintained at the bottom of the bowl.
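The "bottom of the bowl" can be read directly off such measured curves. The numbers below are made up to mimic the bowl shape of Fig. 16; they are not measured values from the paper.

```python
# Hypothetical measurements: number of query types -> (response time,
# invalidation cycle) in seconds. More query types means a higher hit
# rate (lower response time) but more polling queries (longer cycle).
curve = {
    20: (3.0, 0.4),
    60: (2.2, 0.6),
    100: (1.3, 1.2),
    140: (0.9, 2.5),
    180: (0.7, 4.0),
}

# Assured freshness is max(response time, invalidation cycle); the best
# operating point is the number of query types that minimizes it.
best = min(curve, key=lambda n: max(curve[n]))
print(best)  # 100
```

Here 100 query types would be the circled bowl bottom, where neither quantity dominates.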

Fig. 13. Response time and invalidation cycle for the period of min (15,000 updates per second, 550 requests per second).

Fig. 14. Response time and invalidation cycle for the period of min (10,000 updates per second, 360 requests per second).

Fig. 15. Response time and invalidation cycle for the period of min (5000 updates per second, 45 requests per second).

Fig. 16. Effects of adaptive caching on assured content freshness.

7. Related work

Issues related to caching of dynamic data have received significant attention recently [26,27]. Applying caching solutions to Web applications and content distribution has also received a lot of attention in the Web and database communities [7-12]. Katsaros and Manolopoulos give a summary of the refresh, re-fetch, and consistency maintenance technologies associated with Web-powered databases. Although a lot of research has been done on view maintenance [28,29] and on the construction of composite events [30,31], we believe such an approach is limited to building and customizing a Web site with a limited number of distinct pages, such as the IBM Web site for the Olympic Games [22-24]. In our system [7,9], we intend to support the automated construction of the mapping between Web pages and the queries that retrieve the database contents used to generate those pages. Dynamai [21] from Persistence Software is one of the first dynamic caching solutions available as a product. However, Dynamai relies on proprietary software for both the database and application server components, and thus cannot easily be incorporated into an existing e-commerce framework. Other related works include [32,33], where the authors propose a diffusion-based caching protocol that achieves load balancing; [34], which uses meta-information in the cache hierarchy to improve the hit ratio of the caches; [35], which evaluates the performance of traditional cache hierarchies and provides design principles for scalable cache systems; and [36], which highlights the fact that static client-to-server assignment may not perform well compared with dynamic server assignment or selection.
SPREAD [37], a system for automated content distribution, is an architecture that uses a hybrid of client validation, server invalidation, and replication to maintain consistency across servers. Note that the work in [37] focuses on static content and describes techniques to synchronize static content, which is updated periodically, across Web servers. Therefore, in a sense,

the invalidation messages travel horizontally across Web servers. Other works that study the effects of invalidation on caching performance are [38,39], and various cache consistency protocols have been proposed that rely heavily on invalidation [37,40,41]. In our work, however, we concentrate on updates to data in databases, which are by design not visible to the Web servers. We therefore introduce a vertical invalidation concept, in which invalidation messages travel from database servers and Web servers to the front-end and edge cache servers, as well as to JDBC database access layer caches.

8. Concluding remarks

Both response time and content freshness are essential to e-commerce applications on the Web. Many e-commerce Web applications deploy dynamic content caching solutions to serve user requests at locations much closer to users, avoiding network latency. For such Web sites, the highest level of content freshness that can be assured is the larger of the response time and the invalidation cycle. In this paper, we propose a freshness-driven adaptive dynamic content caching technique. It monitors the response time and the invalidation cycle as feedback and dynamically adjusts the caching policy, aiming to maintain the best content freshness that a system configuration can support. By considering the trade-off between invalidation cycle and response time, our technique maintains the best content freshness that can be assured, which occurs at the equilibrium point of response time and invalidation cycle. The experiments show the effectiveness and benefits of the proposed freshness-driven adaptive caching: under heavy traffic, it supports content freshness up to 20 times better than configurations without dynamic content caching.
It also supports content freshness up to 10 times better than systems that deploy dynamic content caching solutions.

References

[1] N. Bhatti, A. Bouch, A. Kuchinsky, Integrating user-perceived quality into Web server design, in: Proceedings of the 9th World-Wide Web Conference, Amsterdam, The Netherlands, June 2000.
[2] Zona Research.
[3] Akamai Technology.
[4] Digital Island, Ltd.
[5] B. Krishnamurthy, C.E. Wills, Analyzing factors that influence end-to-end Web performance, in: Proceedings of the 9th World-Wide Web Conference, Amsterdam, The Netherlands, June 2000.
[6] P. Deolasee, A. Katkar, A. Panchbudhe, K. Ramamritham, P. Shenoy, Adaptive push-pull: dissemination of dynamic Web data, in: Proceedings of the 10th WWW Conference, Hong Kong, China, May 2001.
[7] K. Selçuk Candan, W.-S. Li, Q. Luo, W.-P. Hsiung, D. Agrawal, Enabling dynamic content caching for database-driven Web sites, in: Proceedings of the 2001 ACM SIGMOD Conference, Santa Barbara, CA, USA, May 2001.
[8] C. Mohan, Application servers: born-again TP monitors for the Web? (panel abstract), in: Proceedings of the 2001 ACM SIGMOD Conference, Santa Barbara, CA, USA.
[9] W.-S. Li, K. Selçuk Candan, W.-P. Hsiung, O. Po, D. Agrawal, Q. Luo, W.-K.W. Huang, Y. Akca, C. Yilmaz, CachePortal: technology for accelerating database-driven e-commerce Web sites, in: Proceedings of the 2001 VLDB Conference, Roma, Italy, September 2001.

[10] C. Mohan, Caching technologies for Web applications, in: Proceedings of the 2001 VLDB Conference, Roma, Italy, September 2001.
[11] M. Cherniack, M.J. Franklin, S.B. Zdonik, Data management for pervasive computing, in: Proceedings of the 2001 VLDB Conference, Roma, Italy, September 2001.
[12] Q. Luo, J.F. Naughton, Form-based proxy caching for database-backed Web sites, in: Proceedings of the 2001 VLDB Conference, Roma, Italy, September 2001.
[13] A. Ninan, P. Kulkarni, P. Shenoy, K. Ramamritham, R. Tewari, Cooperative leases: scalable consistency maintenance in content distribution networks, in: Proceedings of the 11th WWW Conference, Honolulu, HI, USA, May 2002.
[14] A. Datta, K. Dutta, H.M. Thomas, D.E. VanderMeer, Suresha, K. Ramamritham, Proxy-based acceleration of dynamically generated content on the World Wide Web: an approach and implementation, in: Proceedings of the 2002 ACM SIGMOD Conference, Madison, WI, USA, June 2002.
[15] Q. Luo, S. Krishnamurthy, C. Mohan, H. Pirahesh, H. Woo, B.G. Lindsay, J.F. Naughton, Middle-tier database caching for e-business, in: Proceedings of the 2002 ACM SIGMOD Conference, Madison, WI, USA, June 2002.
[16] K. Selçuk Candan, D. Agrawal, W.-S. Li, O. Po, W.-P. Hsiung, View invalidation for dynamic content caching in multitiered architectures, in: Proceedings of the 28th Very Large Data Bases Conference, Hong Kong, China, August 2002.
[17] W.-S. Li, W.-P. Hsiung, D.V. Kalashnikov, R. Sion, O. Po, D. Agrawal, K. Selçuk Candan, Issues and evaluations of caching solutions for Web application acceleration, in: Proceedings of the 28th Very Large Data Bases Conference, Hong Kong, China, August 2002.
[18] A. Datta, K. Dutta, K. Ramamritham, H. Thomas, D. VanderMeer, Dynamic content acceleration: a caching solution to enable scalable dynamic Web page generation, in: Proceedings of the 2001 ACM SIGMOD Conference, Santa Barbara, CA, USA, May 2001.
[19] Oracle9i data cache.
[20] Oracle9i Web cache.
[21] Persistent Software Systems Inc.
[22] J. Challenger, P. Dantzig, A. Iyengar, A scalable and highly available system for serving dynamic data at frequently accessed Web sites, in: Proceedings of ACM/IEEE Supercomputing '98, Orlando, FL, November 1998.
[23] J. Challenger, A. Iyengar, P. Dantzig, A scalable system for consistently caching dynamic Web data, in: Proceedings of IEEE INFOCOM '99, New York, NY, March 1999.
[24] E. Levy, A. Iyengar, J. Song, D. Dias, Design and performance of a Web server accelerator, in: Proceedings of IEEE INFOCOM '99, New York, NY, March 1999.
[25] W.-P. Hsiung, W.-S. Li, K. Selçuk Candan, D. Agrawal, Multi-tiered cache management for e-commerce Web sites, in: Proceedings of the Second International Workshop on Cooperative Internet Computing (CIC 2002), Hong Kong, China, August 2002.
[26] F. Douglis, A. Haro, M. Rabinovich, HPP: HTML macro-preprocessing to support dynamic document caching, in: Proceedings of the USENIX Symposium on Internet Technologies and Systems, 1997.
[27] B. Smith, A. Acharya, T. Yang, H. Zhu, Exploiting result equivalence in caching dynamic Web content, in: Proceedings of the USENIX Symposium on Internet Technologies and Systems.
[28] A. Labrinidis, N. Roussopoulos, On the materialization of WebViews, in: Proceedings of the ACM SIGMOD Workshop on the Web and Databases (WebDB '99), Philadelphia, PA, USA, June 1999.
[29] A. Labrinidis, N. Roussopoulos, WebView materialization, in: Proceedings of the ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, May 2000.
[30] G. Kappel, W. Retschitzegger, The TriGS active object-oriented database system: an overview, SIGMOD Record 27 (3) (1998).
[31] A. Labrinidis, N. Roussopoulos, Adaptive WebView materialization, in: Proceedings of the Fourth International Workshop on the Web and Databases (WebDB 2001), held in conjunction with ACM SIGMOD 2001, Santa Barbara, CA, USA, May 2001.
[32] A. Heddaya, S. Mirdad, WebWave: globally load balanced fully distributed caching of hot published documents, in: Proceedings of the 1997 IEEE International Conference on Distributed Computing Systems, 1997.

[33] A. Heddaya, S. Mirdad, D. Yates, Diffusion-based caching: WebWave, in: Proceedings of the 1997 NLANR Web Caching Workshop, 1997.
[34] M.R. Korupolu, M. Dahlin, Coordinated placement and replacement for large-scale distributed caches, in: Proceedings of the 1999 IEEE Workshop on Internet Applications, 1999.
[35] R. Tewari, M. Dahlin, H.M. Vin, J.S. Kay, Design considerations for distributed caching on the Internet, in: Proceedings of the 19th International Conference on Distributed Computing Systems, 1999.
[36] R.L. Carter, M.E. Crovella, On the network impact of dynamic server selection, Computer Networks 31 (23-24) (1999).
[37] P. Rodriguez, S. Sibal, SPREAD: scalable platform for reliable and efficient automated distribution, in: Proceedings of the 9th World-Wide Web Conference, Amsterdam, The Netherlands, June 2000.
[38] P. Cao, C. Liu, Maintaining strong cache consistency in the World Wide Web, IEEE Transactions on Computers 47 (4) (1998).
[39] J. Gwertzman, M. Seltzer, World-Wide Web cache consistency, in: Proceedings of the 1996 USENIX Technical Conference, San Diego, CA, USA, January 1996.
[40] H. Yu, L. Breslau, S. Shenker, A scalable Web cache consistency architecture, in: Proceedings of the ACM SIGCOMM '99 Conference, Boston, MA, USA, September 1999.
[41] D. Li, P. Cao, M. Dahlin, WCIP: Web cache invalidation protocol, 2000.

Wen-Syan Li is a Senior Research Staff Member at Computers & Communications Research Laboratories (CCRL), NEC USA, Inc. He received his Ph.D. in Computer Science from Northwestern University. He also holds an MBA degree. His main research interests include content delivery networks, multimedia/hypermedia/document databases, the WWW, e-commerce, and information retrieval. He is leading the CachePortal project at the NEC USA Venture Development Center and the Content Awareness Network project at NEC CCRL in San Jose.
Wen-Syan is the recipient of the first NEC USA Achievement Award for his contributions in technology innovation.

Oliver Po is currently a Research Engineer at Computers & Communications Research Laboratories (CCRL), NEC USA. From 1985 to 1993 he worked for IBM Taiwan as a system engineer. From 1994 to 1999 he worked at the IBM Toronto Lab as a database developer, focusing on the design and enhancement of database functions and tools. His current work mainly involves the CCRL CachePortal project, which studies and provides dynamic content caching solutions.

Wang-Pin Hsiung is a Research Associate at Computers & Communications Research Laboratories (CCRL), NEC USA, Inc. He received his master's degree in Computer Science from Stanford University in 1999 and his BSE from Arizona State University. His research interests are in the areas of computer network architectures and protocols, multimedia systems, performance evaluation, and databases. He works on the CachePortal project, developing static/dynamic Web content caching, and is also involved in the development of the Content Awareness Network.

Kasım Selçuk Candan is a tenure-track assistant professor in the Department of Computer Science and Engineering at Arizona State University. He joined the department in August 1997, after receiving his Ph.D. from the Computer Science Department at the University of Maryland at College Park. His dissertation research concentrated on multimedia document authoring, presentation, and retrieval in distributed collaborative environments. He received the 1997 ACM DC Chapter Samuel N. Alexander Fellowship award for his Ph.D. work. His research interests include the development of formal models, indexing schemes, and retrieval algorithms for multimedia and Web information, and the development of novel query optimization and processing algorithms. He has published various articles in respected journals and conferences in related areas. He received his B.S. degree in computer science, first-ranked in the department, from Bilkent University in Turkey.

Divyakant Agrawal is a Visiting Senior Research Scientist at CCRL, NEC USA, Inc. in San Jose. He received his Ph.D. from SUNY Stony Brook. His current research interests are in the areas of distributed hypermedia databases, multimedia information storage and retrieval, and Web-based technologies.


More information

Price Performance Analysis of NxtGen Vs. Amazon EC2 and Rackspace Cloud.

Price Performance Analysis of NxtGen Vs. Amazon EC2 and Rackspace Cloud. Price Performance Analysis of Vs. EC2 and Cloud. Performance Report: ECS Performance Analysis of Virtual Machines on ECS and Competitive IaaS Offerings An Examination of Web Server and Database Workloads

More information

Trace Driven Simulation of GDSF# and Existing Caching Algorithms for Web Proxy Servers

Trace Driven Simulation of GDSF# and Existing Caching Algorithms for Web Proxy Servers Proceeding of the 9th WSEAS Int. Conference on Data Networks, Communications, Computers, Trinidad and Tobago, November 5-7, 2007 378 Trace Driven Simulation of GDSF# and Existing Caching Algorithms for

More information

Performance Monitoring

Performance Monitoring Performance Monitoring Performance Monitoring Goals Monitoring should check that the performanceinfluencing database parameters are correctly set and if they are not, it should point to where the problems

More information

Managing Performance Variance of Applications Using Storage I/O Control

Managing Performance Variance of Applications Using Storage I/O Control Performance Study Managing Performance Variance of Applications Using Storage I/O Control VMware vsphere 4.1 Application performance can be impacted when servers contend for I/O resources in a shared storage

More information

Accelerating Database Processing at e-commerce Sites

Accelerating Database Processing at e-commerce Sites Accelerating Database Processing at e-commerce Sites Seunglak Choi, Jinwon Lee, Su Myeon Kim, Junehwa Song, and Yoon-Joon Lee Korea Advanced Institute of Science and Technology, 373-1 Kusong-dong Yusong-gu

More information