Cluster-Based Scalable Network Services

1 Cluster-Based Scalable Network Services
Suhas Uppalapati, INFT 803, Oct
(Source: Fox, Gribble, Chawathe, and Brewer, SOSP, 1997)

2 Requirements for SNS
- Incremental scalability and overflow growth provisioning
- 24x7 availability through fault masking
- Cost-effectiveness

3 Multics
- Multiplexed Information and Computing Service
- Comprehensive
- Able to adapt to unknown future requirements
- Capable of evolving over time
- Obstacle: absence of the network infrastructure

4 Challenges of Deploying Network Services
- Scalability: per-user level of service
- Availability: 24x7 availability despite transient partial HW/SW failures
- Cost effectiveness: services must be economical to administer and expand

5 Advantages of Clusters over SMPs
- Scalability: clusters grow incrementally, eliminating forklift upgrades
- High availability: natural redundancy, hot upgrades
- Commodity building blocks: better cost/performance, low lead time

6 Challenges of Cluster Computing
- Administration: a serious concern for systems of many nodes
- Component vs. system replication: each commodity node can support some components of the service rather than the entire service
- Partial failures: the system must survive failures of subsets of its nodes
- Shared state: unlike SMPs, clusters have no shared state

7 ACID & BASE Semantics
- ACID: Atomicity, Consistency, Isolation, Durability. Provides the strongest semantics at the highest cost and complexity.
- BASE: Basically Available, Soft State, Eventual Consistency. Handles partial failure in clusters with less complexity and cost; trades consistency for simplicity and availability.

8 Cluster-Based Scalable Service Architecture
The architecture attempts to address both the challenges of cluster computing and the challenges of deploying network services, while exploiting clusters' strengths.

9 Architecture of a Generic SNS
- FE: front ends
- W: pool of workers, some of which may be caches ($)
- MS: manager stubs
- WS: worker stubs
- The fault-tolerant load manager's functionality logically extends into the MS and WS

10 Functional Organization of SNS
- Front Ends: provide the interface to the SNS as seen by the outside world
- Worker Pool: consists of caches and service-specific modules that implement the actual service
- Customization Database: stores user profiles that allow mass customization of request processing

11 Functional Organization of SNS
- Manager: balances load across workers and spawns additional workers as offered load fluctuates or faults occur
- Graphical Monitor: supports tracking and visualization of the system's behavior for system management
- System Area Network: provides a low-latency, high-bandwidth interconnect

12 Scalable Network Service Layered Model
SNS: Scalable Network Service support
- Incremental and absolute scalability
- Worker load balancing and overflow management
- Front end availability, fault tolerance mechanisms
- System monitoring and logging

13 Scalable Network Service Layered Model
TACC: Transformation, Aggregation, Caching, Customization
- API for composition of stateless data transformation and content aggregation modules
- Uniform caching of original, post-aggregation, and post-transformation data
- Transparent access to the Customization database

14 Scalable Network Service Layered Model
Service: service-specific code
- Workers that present a human interface to what TACC modules do, including device-specific presentation
- User interface to control the service

15 Scalability
- Components in SNS are replicated for fault tolerance, high availability, and scalability
- Load balancing is controlled by a centralized policy implemented in the manager
- The manager spawns workers on the overflow machines on demand when unexpected load bursts arrive
- Soft state is used for fault tolerance and availability

16 TACC: A Programming Model for Internet Services
- Transformation: an operation on a single data object that changes its content
- Aggregation: collecting data from several objects and collating it in a pre-specified way
- Caching: important because re-computing or storing data has become cheaper than moving it across the Internet
- Customization: represents a fundamental advantage of the Internet over traditional wide area media
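The four TACC elements can be sketched as composable functions. This is only an illustration of the idea, not the TACC API from the paper: the names `lowercase`, `truncate`, `aggregate`, `pipeline`, `CachingService`, and the `max_chars` profile key are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical types: a worker maps (data, user profile) to new data.
Profile = Dict[str, str]
Worker = Callable[[str, Profile], str]

def lowercase(data: str, profile: Profile) -> str:
    """Transformation: changes the content of a single object."""
    return data.lower()

def truncate(data: str, profile: Profile) -> str:
    """Transformation customized per user through the profile."""
    limit = int(profile.get("max_chars", "20"))
    return data[:limit]

def aggregate(objects: List[str], profile: Profile) -> str:
    """Aggregation: collate several objects in a pre-specified way."""
    return " | ".join(objects)

def pipeline(stages: List[Worker]) -> Worker:
    """Compose stateless workers into one worker."""
    def run(data: str, profile: Profile) -> str:
        for stage in stages:
            data = stage(data, profile)
        return data
    return run

class CachingService:
    """Caching: keep post-transformation data keyed by object and profile."""
    def __init__(self, transform: Worker):
        self.transform = transform
        self.cache = {}
        self.hits = 0

    def get(self, key: str, fetch: Callable[[str], str], profile: Profile) -> str:
        cache_key = (key, tuple(sorted(profile.items())))
        if cache_key in self.cache:
            self.hits += 1
            return self.cache[cache_key]
        result = self.transform(fetch(key), profile)
        self.cache[cache_key] = result
        return result

service = CachingService(pipeline([lowercase, truncate]))
profile = {"max_chars": "5"}           # assumed customization setting
print(service.get("page", lambda k: "HELLO WORLD", profile))
```

Because the workers are stateless, any of them can be restarted or replicated without coordination; only the profile lookup carries per-user state.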

17 Service Implementation
- TranSend: scalable Web distillation proxy
- HotBot: commercial implementation of the Inktomi search engine

18 TranSend
- Front Ends: SPARCstation 10 and 20 machines on switched 10 Mb/s Ethernet. A thread is assigned to each arriving TCP connection.
- Load Balancing Manager: the manager spawns a new distiller on an unused node when excessive load is detected on distillers. The overflow mechanism is used for adjusting to bursts in load.
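A minimal sketch of the spawning policy described on this slide, assuming distillers report queue lengths and the manager compares their average against a threshold. The class name, the threshold rule, and the node names are assumptions for illustration; the slide only says that a new distiller is spawned on an unused node when excessive load is detected.

```python
class Manager:
    """Hypothetical centralized load manager with an overflow pool."""

    def __init__(self, dedicated_nodes, overflow_nodes, spawn_threshold):
        self.free_nodes = list(dedicated_nodes)      # unused dedicated nodes
        self.overflow_nodes = list(overflow_nodes)   # used only under bursts
        self.spawn_threshold = spawn_threshold       # assumed: avg queue length
        self.queue_len = {}                          # node -> last reported load

    def report_load(self, node, queue_length):
        self.queue_len[node] = queue_length

    def spawn_worker(self):
        """Start a distiller on a dedicated node, else on an overflow node."""
        pool = self.free_nodes or self.overflow_nodes
        if not pool:
            return None
        node = pool.pop(0)
        self.queue_len[node] = 0
        return node

    def rebalance(self):
        """Spawn a worker if none exist or the average queue is too long."""
        if not self.queue_len:
            return self.spawn_worker()
        average = sum(self.queue_len.values()) / len(self.queue_len)
        if average > self.spawn_threshold:
            return self.spawn_worker()
        return None

    def pick_worker(self):
        """Route the next request to the least-loaded known worker."""
        if not self.queue_len:
            return None
        return min(self.queue_len, key=self.queue_len.get)

manager = Manager(["n1", "n2"], ["overflow1"], spawn_threshold=5)
manager.rebalance()              # starts the first distiller on n1
manager.report_load("n1", 10)    # n1 reports a long queue
print(manager.rebalance(), manager.pick_worker())
```

Dedicated nodes are consumed before the overflow pool, so overflow machines are only recruited once the regular nodes are exhausted.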

19 TranSend
- Fault Tolerance and Crash Recovery: the manager's state is maintained as soft state (BASE) rather than hard state (ACID)
- Cache Nodes: caching in TranSend is only an optimization. All cached data can be thrown away at the cost of performance; cache nodes are workers whose only job is the management of BASE data.
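One way to picture soft state in the manager: the table of live workers is never written to disk, and is instead rebuilt from periodic worker announcements that expire if not refreshed. The sketch below assumes a beacon message and a time-to-live; both are illustrative names, not details taken from the slides.

```python
class SoftStateRegistry:
    """Hypothetical in-memory worker table kept alive by periodic beacons."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_seen = {}   # worker -> (timestamp, reported load)

    def on_beacon(self, worker, load, now):
        """Refresh (or create) the entry for a worker."""
        self.last_seen[worker] = (now, load)

    def live_workers(self, now):
        """Drop entries that were not refreshed within the TTL."""
        self.last_seen = {
            worker: (seen, load)
            for worker, (seen, load) in self.last_seen.items()
            if now - seen <= self.ttl
        }
        return sorted(self.last_seen)

registry = SoftStateRegistry(ttl_seconds=3.0)
registry.on_beacon("w1", load=2, now=0.0)
registry.on_beacon("w2", load=1, now=1.0)
print(registry.live_workers(now=3.5))   # w1 stopped beaconing and has expired

# Crash recovery: a restarted manager starts empty and simply waits for beacons.
restarted = SoftStateRegistry(ttl_seconds=3.0)
restarted.on_beacon("w2", load=1, now=4.0)
print(restarted.live_workers(now=4.5))
```

Recovery needs no log replay or commit protocol: a crashed worker disappears when its entry times out, and a crashed manager relearns the cluster from the next round of beacons.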

20 TranSend Exploits BASE
- BASE semantics greatly simplify TranSend's fault tolerance and improve availability. Only the user profile database is ACID.
- Stale load balancing data: load data is slightly stale between updates from the manager. Timeouts are used to recover where stale data causes an incorrect load balancing choice.
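The timeout-based recovery from stale load data can be sketched as follows. The function `dispatch`, the `send` callback, and the worker names are hypothetical; the sketch only assumes that a front end holds a cached, possibly outdated load table and that a request to a dead or overloaded worker surfaces as a timeout.

```python
def dispatch(request, cached_load, send, timeout_s):
    """Try workers in order of (possibly stale) reported load.

    cached_load: dict worker -> last load value received from the manager.
    send: callable(worker, request, timeout_s) that raises TimeoutError
          when the worker does not answer in time (assumed interface).
    """
    for worker in sorted(cached_load, key=cached_load.get):
        try:
            return worker, send(worker, request, timeout_s)
        except TimeoutError:
            continue   # the stale hint was wrong; fall back to the next worker
    raise RuntimeError("no worker responded")

def fake_send(worker, request, timeout_s):
    """Stand-in transport: w1 has crashed since the last load update."""
    if worker == "w1":
        raise TimeoutError(worker)
    return "ok:" + request

stale_table = {"w1": 0, "w2": 3}   # w1 looks idle only because it is dead
print(dispatch("GET /img", stale_table, fake_send, timeout_s=0.5))
```

The load table never needs to be exact: a wrong choice costs one timeout, which is the price BASE pays for avoiding synchronized state.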

21 TranSend Exploits BASE
- Soft state: improved performance from avoiding commits; trivial recovery
- Approximate answers: an approximate answer delivered quickly is more useful than the exact answer delivered slowly

22 HotBot Implementation
- Inktomi predates the layered model and scalable server architecture
- Front ends and service interface: run on a mixture of single- and multiple-CPU SPARCstation server nodes. The HTTP front ends run threads per node

23 HotBot Implementation
- Load balancing: workers statically partition the search engine database. Every query goes to all workers in parallel.
- Failure management: worker nodes are not interchangeable. When a node is down, other nodes take over responsibility for that data, maintaining 100% data availability with degradation in performance.
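Static partitioning with every query sent to all workers in parallel is a scatter-gather pattern. The sketch below assumes hash-based placement of documents and a simple term-count score; neither detail comes from the slides, and the function names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def partition_of(doc_id, num_partitions):
    """Assumed placement rule: a stable hash of the document id."""
    return zlib.crc32(doc_id.encode()) % num_partitions

def build_partitions(docs, num_partitions):
    """Statically split the read-only database across workers."""
    partitions = [dict() for _ in range(num_partitions)]
    for doc_id, text in docs.items():
        partitions[partition_of(doc_id, num_partitions)][doc_id] = text
    return partitions

def search_partition(partition, term):
    """Work done by one worker: score only the documents it owns."""
    hits = []
    for doc_id, text in partition.items():
        count = text.lower().split().count(term)
        if count:
            hits.append((count, doc_id))
    return hits

def search(partitions, term, top_k=10):
    """Scatter the query to all partitions in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        per_partition = list(pool.map(lambda p: search_partition(p, term), partitions))
    merged = [hit for hits in per_partition for hit in hits]
    merged.sort(key=lambda hit: (-hit[0], hit[1]))   # best score first
    return [doc_id for _, doc_id in merged[:top_k]]

docs = {"a": "cluster cluster services", "b": "network services", "c": "cluster"}
print(search(build_partitions(docs, 3), "cluster"))
```

Because each worker owns a fixed slice of the data, workers are not interchangeable, which is why failure handling differs from TranSend's.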

24 Main differences between TranSend and HotBot
Component | TranSend | HotBot
Load balancing | Dynamic, by queue lengths at worker nodes | Static partitioning of read-only data
Application layer | Composable TACC workers | Fixed search service application
Service layer | Worker dispatch logic, HTML / JavaScript UI | Dynamic HTML generation, HTML UI
Failure management | Centralized but fault tolerant using process peers | Distributed to each node
Worker placement | FEs and caches bound to their nodes | All workers bound to their nodes
User profile (ACID) database | Berkeley DB with read caches | Parallel Informix server
Caching | Harvest caches store pre- and post-transformation Web data | Integrated cache of recent searches, for incremental delivery

25 Measurements of TranSend Implementation HTTP Traces and the Playback Engine

26 Measurements of TranSend Implementation Burstiness

27 Measurements of TranSend Implementation Distiller Performance

28 Measurements of TranSend Implementation
Cache Partition Performance
- The average cache hit takes 27 ms to service, including network and OS overhead. TCP connection setup and teardown account for 15 ms of this service time.
- 95% of all cache hits take less than 100 ms to service.
- Cache hit rate has low variation.
- The miss penalty varies widely, from 100 ms to 100 seconds.
- End-to-end latency dominates, and hence the cache miss rate should be minimized.
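The reason the miss rate dominates can be seen from the usual expected-latency formula, hit_rate * hit_time + (1 - hit_rate) * miss_penalty. The sketch below uses the 27 ms hit time from this slide; the 1000 ms miss penalty is an assumed point inside the reported 100 ms to 100 s range, not a measured average.

```python
def expected_latency_ms(hit_rate, hit_ms, miss_ms):
    """Mean service time when a fraction hit_rate of requests hit the cache."""
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms

HIT_MS = 27.0              # average cache hit time from the slide
ASSUMED_MISS_MS = 1000.0   # assumption: one value within 100 ms to 100 s

for hit_rate in (0.25, 0.5, 0.75):
    latency = expected_latency_ms(hit_rate, HIT_MS, ASSUMED_MISS_MS)
    print(f"hit rate {hit_rate:.2f}: {latency:.1f} ms")
```

Even at a 50% hit rate, almost all of the mean latency comes from the miss term, so shaving milliseconds off the hit path matters far less than reducing misses.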

29 Measurements of TranSend Implementation Self Tuning and Load Balancing

30 Measurements of TranSend Implementation Scalability

31 Extensibility
- Keyword Filtering
- Bay Area Culture Page
- TranSend Metasearch
- Anonymous Rewebber
- Real Web Access for PDAs and Smart Phones

32 Economic Feasibility
- Low hardware costs
- ISP savings due to a cache hit rate of 50%
- Savings due to elimination of T1 lines
- Cost of administration is nontrivial, but would be minimal

33 Related Work
- Content transformation by proxy: Kanji transcoding, Kanji-to-GIF conversion
- Fault tolerance and high availability: Tandem, CORDS
- BASE: Grapevine, Bayou
- Load balancing and scaling: WebOS, SWEB++

34 Future Work
- Investigation of the proposed architecture outside the Internet-server domain, for write-intensive services where hard state and strong consistency are desired
- Provide an adaptive solution for Web access from wireless clients
- Inadequacies of busy Internet servers related to cluster-based middleware services

35 Conclusions
- The architecture is reusable
- A large class of network services can get by with BASE, a weaker-than-ACID data semantics, trading consistency for availability and using soft state for performance and failure management
- Increases the value of Internet access to end users while remaining cost-efficient to deploy and administer
- Cluster-based value-added network services will become an important Internet-service paradigm
