S. Narravula, P. Balaji, K. Vaidyanathan, H.-W. Jin and D. K. Panda. The Ohio State University

1 Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data-Centers over InfiniBand. S. Narravula, P. Balaji, K. Vaidyanathan, H.-W. Jin and D. K. Panda, The Ohio State University

2 Presentation Outline: Introduction/Motivation; Design and Implementation; Experimental Results; Conclusions

3 Introduction: Fast Internet growth (number of users, amount of data, types of services). Several uses: e-commerce, online banking, online auctions, etc. Web server scalability: multi-tier data-centers; caching, an important technique.

4 Presentation Outline: Introduction/Motivation (Multi-Tier Data-Centers, Active Caches); Design and Implementation; Experimental Results; Conclusions

5 A Typical Multi-Tier Data-Center [Figure labels: Clients, WAN, Tier 0 (Proxy Nodes, Web Servers: Apache), Tier 1 (Application Servers: PHP), Tier 2 (Database Servers), SAN]

6 InfiniBand: High performance (low latency, high bandwidth). Open industry standard. Provides rich features: RDMA, remote atomic operations, etc. Targeted for data-centers. Transport layers: VAPI, IPoIB.

7 Presentation Outline: Introduction/Motivation (Multi-Tier Data-Centers, Active Caches); Design and Implementation; Experimental Results; Conclusions

8 Caching: Can avoid re-fetching of content. Beneficial if requests repeat. Important for scalability. Static content caching: well studied in the past, widely used. [Figure labels: Number of Requests Decrease, Front-End Tiers, Back-End Tiers]

9 Active Caching: Dynamic data: stock quotes, scores, personalized content, etc. Complexity of content: simple caching methods not suited. Issues: consistency, coherency. [Figure labels: User Request, Proxy Node, Cache, Back-End Data, Update]

10 Cache Coherency: Refers to the average staleness of the document served from cache. Strong or immediate (Strong Coherency): required for certain kinds of data; cache disabling; client polling.

11 Basic Client Polling * [Figure labels: Front-End, Back-End, Request, Cache Hit, Version Read, Response, Cache Miss] * SAN04: Supporting Strong Cache Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand. Narravula et al.

12 Multiple Object Dependencies: Cache documents contain multiple objects. A many-to-many mapping: a single cache document can contain multiple objects; a single object can be a part of multiple documents. Complexity!! [Figure labels: Cache Documents, Objects]
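
The many-to-many relation on this slide can be pictured as a pair of inverted indexes. The following is a minimal Python sketch for illustration only, not the authors' implementation; the class `DependencyIndex`, its method names, and the sample document and object names are all hypothetical.

```python
from collections import defaultdict


class DependencyIndex:
    """Hypothetical two-way index between cache documents and objects."""

    def __init__(self):
        self._objects_of = defaultdict(set)    # document -> objects it contains
        self._documents_of = defaultdict(set)  # object -> documents containing it

    def add(self, document, obj):
        """Record that `document` contains `obj`."""
        self._objects_of[document].add(obj)
        self._documents_of[obj].add(document)

    def objects_in(self, document):
        return set(self._objects_of.get(document, ()))

    def documents_with(self, obj):
        return set(self._documents_of.get(obj, ()))


index = DependencyIndex()
index.add("portfolio.php", "stock:IBM")
index.add("portfolio.php", "stock:SUN")
index.add("quotes.php", "stock:IBM")

# One document holds several objects, and one object appears in several documents.
print(sorted(index.objects_in("portfolio.php")))
print(sorted(index.documents_with("stock:IBM")))
```

Keeping both directions makes either question cheap to answer: which objects a document depends on, and which documents an update to one object affects.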

13 Client Polling [Figure labels: Front-End, Back-End, Request, Cache Hit, Version Read, Single Check Possible, Response, Cache Miss] A single lookup counter is essential for a correct and efficient design.
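
One way to read the "single lookup counter" point: if each cached document has one counter that is bumped whenever any of its objects changes, the front end needs only one version read per cache hit. The sketch below illustrates that reading in Python under stated assumptions. `BackEndCounters`, `is_fresh`, and the sample names are hypothetical, and a plain method call stands in for the version read over the network.

```python
class BackEndCounters:
    """Hypothetical back-end table: one lookup counter per cache document."""

    def __init__(self, dependencies):
        # dependencies: document -> iterable of objects it depends on
        self.counter = {doc: 0 for doc in dependencies}
        self._docs_of = {}
        for doc, objects in dependencies.items():
            for obj in objects:
                self._docs_of.setdefault(obj, set()).add(doc)

    def update_object(self, obj):
        """An update to one object bumps the counter of every dependent document."""
        for doc in self._docs_of.get(obj, ()):
            self.counter[doc] += 1

    def read_version(self, doc):
        """Stands in for the version read issued by the front end."""
        return self.counter[doc]


def is_fresh(back_end, doc, cached_version):
    """A single check per hit, however many objects the document contains."""
    return back_end.read_version(doc) == cached_version


back_end = BackEndCounters({"a.php": ["x", "y"], "b.php": ["y"]})
v_a = back_end.read_version("a.php")
v_b = back_end.read_version("b.php")
back_end.update_object("x")  # object x is used only by a.php
print(is_fresh(back_end, "a.php", v_a), is_fresh(back_end, "b.php", v_b))
```

The cost of fanning an update out to several counters is paid on the update path, which keeps the per-request check constant.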

14 Objective: To design an architecture that very efficiently supports strong cache coherency with multiple dynamic dependencies on InfiniBand.

15 Presentation Outline: Introduction/Motivation; Design and Implementation; Experimental Results; Conclusions

16 Basic System Architecture [Figure labels: Proxy Servers (two Server Nodes, each with a Mod), Cooperation, Application Servers (two Server Nodes, each with a Mod)] Cache lookup counter maintained on the application servers.

17 Basic Design: Home node based client polling: cache documents assigned home nodes. Proxy server modules: client polling functionality. Application server modules: support version reads for client polling; handle updates.
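
The slide says cache documents are assigned home nodes but does not say how. One plausible assignment, shown here purely as an assumption, is to hash the document key onto the list of application servers so that every proxy computes the same home node without coordination. The function `home_node`, the server names, and the sample keys are hypothetical.

```python
import hashlib


def home_node(document_key, nodes):
    """Pick one node for a document key in a stable, deterministic way."""
    digest = hashlib.sha1(document_key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


app_servers = ["app0", "app1", "app2", "app3"]
for key in ("/index.php?id=1", "/index.php?id=2"):
    print(key, "->", home_node(key, app_servers))
```

A cryptographic hash is used only because it gives the same answer on every machine; Python's built-in `hash` on strings is randomized per process and would not.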

18 Many-to-Many Mappings: Mapping of updates to dynamic objects. Mapping of dynamic objects with lookup counters. Efficiency: factor of dependency. [Figure labels: Lookup counters, Objects, Updates]

19 Mapping of Updates: Non-trivial solution. Three possibilities: (1) database schema, constraints and dependencies are known; (2) per-query dependencies are known; (3) no dependency information is known.

20 Mapping Schemes: Dependency Lists: home node based; complete dependency lists. Invalidate All: single lookup counter for a given class of queries; low application server overheads.
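
The difference between the two schemes can be illustrated by the set of documents each one invalidates for the same update. This is a simplified Python sketch, not the authors' code. The function names, the `stocks` query class, and the sample data are hypothetical, and real invalidation would go through lookup counters rather than returning sets.

```python
def invalidated_by_dependency_lists(dependency_lists, updated_object):
    """Documents whose dependency list names the updated object."""
    return {doc for doc, objects in dependency_lists.items()
            if updated_object in objects}


def invalidated_by_invalidate_all(query_class_of, updated_class):
    """Every document in the query class shares one counter, so all go stale."""
    return {doc for doc, cls in query_class_of.items() if cls == updated_class}


dependency_lists = {
    "doc1": {"row:1", "row:2"},
    "doc2": {"row:2"},
    "doc3": {"row:3"},
}
query_class_of = {"doc1": "stocks", "doc2": "stocks", "doc3": "stocks"}

precise = invalidated_by_dependency_lists(dependency_lists, "row:3")
coarse = invalidated_by_invalidate_all(query_class_of, "stocks")
print(len(precise), "vs", len(coarse), "documents invalidated")
```

Invalidate All keeps less state on the application servers, at the price of discarding documents that the update did not actually touch.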

21 Handling Updates [Figure: message sequence among three Application Servers and a Database Server. Labels: HTTP Request, Update Notification, VAPI Send, Ack (Atomic), Local Search and Coherent Invalidate, DB Query (TCP), DB Response, HTTP Response]

22 Presentation Outline: Introduction/Motivation; Design and Implementation; Experimental Results; Conclusions

23 Experimental Test-bed: Cluster 1: eight dual 3.0 GHz Xeon processor nodes with 64-bit 133 MHz PCI-X interface, 512 KB L2 cache and 533 MHz front side bus. Cluster 2: eight dual 2.4 GHz Xeon processor nodes with 64-bit 133 MHz PCI-X interface, 512 KB L2 cache and 400 MHz front side bus. Mellanox InfiniHost MT23108 dual-port 4x HCAs. MT43132 eight 4x port switch. Mellanox Golden CD 0.5.0.

24 Experimental Outline: Basic data-center performance; cache misses in active caching; impact of cache size; impact of varying dependencies; impact of load in back-end servers. Traces used: Traces 1-5 with increasing update rate; Trace 6: Zipf-like trace.
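
The slides do not describe how the Zipf-like trace was produced. As a rough illustration of what such a trace looks like, the Python sketch below draws document ranks with probability proportional to 1 / rank^alpha. The function `zipf_like_trace`, the exponent `alpha=1.0`, the seed, and the sizes are all assumptions chosen for the example.

```python
import random
from collections import Counter


def zipf_like_trace(num_documents, num_requests, alpha=1.0, seed=0):
    """Return a list of document ranks (0 = most popular) drawn with
    probability proportional to 1 / (rank + 1) ** alpha."""
    rng = random.Random(seed)
    weights = [1.0 / (rank + 1) ** alpha for rank in range(num_documents)]
    return rng.choices(range(num_documents), weights=weights, k=num_requests)


trace = zipf_like_trace(num_documents=100, num_requests=10_000)
counts = Counter(trace)
print("requests for the most popular document:", counts[0])
print("requests for the least popular document:", counts[99])
```

Because a few documents receive most of the requests in such a trace, caching only a small popular subset can still capture a large share of hits.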

25 Basic Data-Center Performance [Figures: Data-Center Throughput (TPS) and Data-Center Response Time (ms), each plotted for Traces 2-5 (traces with increasing update rate); series: No Cache, Invalidate All, Dependency Lists] Maintaining Dependency Lists performs well for all traces.

26 Cache Misses in Active Caching [Figure: Cache Misses; cache miss % for Traces 2-5 (traces with increasing update rate); series: No Cache, Invalidate All, Dependency Lists] Cache misses for Invalidate All increase drastically with increasing update rates.

27 Impact of Cache Size [Figure: Throughput vs Cache Size; TPS against relative cache size; surviving tick labels: 50%, 10%, 5%, 1%, 0.10%, 0%] Maintaining Dependency Lists performs well for all traces. It is possible to cache a select few and still extract performance.

28 Impact of Varying Dependencies [Figure: Effect of Varying Dependencies; TPS against factor of dependencies; surviving tick labels: 2x, 4x, 8x, 16x, 32x, 64x] Throughput drops significantly as the average number of dependencies per cache file increases.

29 Impact of Load in Backend Servers [Figure: Effect of Load; TPS against number of compute threads; series: No Cache, Dependency List] Our design can sustain high performance even under heavily loaded conditions, with a factor of improvement close to 22.

30 Conclusions: An architecture for supporting strong cache coherence with multiple dynamic dependencies. Efficiently handles multiple dynamic dependencies. Supports RDMA-based client polling. Resilient to load on back-end servers.

31 Web Pointers NBC home page {narravul, balaji, vaidyana, jinhy,

32 Back-up Slides

33 Cache Consistency: Non-decreasing views of system state. Updates seen by all or none. [Figure labels: Proxy Nodes, Back-End Nodes, User Requests, Update]

34 Performance [Figure: Throughput (RDMA Read); throughput (Mbps) against message size (bytes); surviving tick labels: 4K, 16K, 64K, 256K; series: Send CPU, Recv CPU, Throughput (Poll), Throughput (Event)] Receiver-side CPU utilization is very low. Leveraging the benefits of one-sided communication.

35 RDMA based Client Polling * [Figure: DataCenter: Throughput; transactions per second (TPS) against number of compute threads; series: No Cache, IPoIB, VAPI] The VAPI module can sustain performance even with heavy load on the back-end servers. * SAN04: Supporting Strong Cache Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand. Narravula et al.

36 Mechanism: Cache hit: back-end version check; if version current, use cache; invalidate data for failed version check; use of RDMA Read. Cache miss: get data to cache; initialize local versions.
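
The hit and miss paths listed on this slide can be sketched as a small proxy-side cache. This is an illustrative Python model, not the authors' module. `ProxyCache` and its callbacks are hypothetical, and an ordinary function call stands in for the RDMA Read of the back-end version.

```python
class ProxyCache:
    """Hypothetical proxy-side cache that polls the back end on every hit."""

    def __init__(self, read_version, fetch):
        self._read_version = read_version  # stands in for an RDMA Read of the version
        self._fetch = fetch                # regenerates the document at the back end
        self._entries = {}                 # key -> (local_version, data)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None:
            local_version, data = entry
            if self._read_version(key) == local_version:
                return data, "hit"         # version current: use the cache
            del self._entries[key]         # failed version check: invalidate
        # Read the version before fetching, so an update that lands in between
        # makes the next check fail instead of leaving stale data marked current.
        version = self._read_version(key)
        data = self._fetch(key)
        self._entries[key] = (version, data)  # initialize the local version
        return data, "miss"


versions = {"page": 0}
content = {"page": "v0 body"}
cache = ProxyCache(lambda k: versions[k], lambda k: content[k])

first = cache.get("page")    # nothing cached yet
second = cache.get("page")   # cached and version unchanged
versions["page"] += 1        # a back-end update bumps the version
content["page"] = "v1 body"
third = cache.get("page")    # version check fails, document is refetched
print(first, second, third)
```

Every request still contacts the back end, but a hit costs only a version comparison rather than regenerating the document.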

37 Other Implementation Details: Requests to read and update are mutually excluded at the back-end module to avoid simultaneous readers and writers accessing the same data. Minimal changes to existing application software.
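
The slide does not say which mechanism provides the mutual exclusion. The simplest assumption is a single lock around both version reads and updates, sketched below in Python with a hypothetical `VersionTable` class. A plain mutex also makes readers exclude each other, which is stricter than the slide requires; a readers-writer lock would relax that.

```python
import threading


class VersionTable:
    """Hypothetical back-end version table guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._versions = {}

    def read(self, key):
        with self._lock:  # readers wait while an update is in progress
            return self._versions.get(key, 0)

    def update(self, key):
        with self._lock:  # updates wait for readers and for other updates
            self._versions[key] = self._versions.get(key, 0) + 1


table = VersionTable()


def writer():
    for _ in range(1000):
        table.update("doc")


threads = [threading.Thread(target=writer) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(table.read("doc"))  # 4 writers x 1000 updates each
```

With the lock in place no increment is lost, so the final version equals the total number of updates issued.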

Supporting Strong Cache Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand S. Narravula, P. Balaji, K. Vaidyanathan, S. Krishnamoorthy, J. Wu and D. K. Panda The Ohio State University