INF5071 Performance in distributed systems Distribution Part II 5 November 2010

Type IV Distribution Systems
- Combine Types I, II or III
- Network of servers:
  - Server hierarchy
  - Autonomous servers
  - Cooperative servers
  - Coordinated servers
- "Proxy caches" (the term is not accurate):
  - Cache servers keep copies on behalf of a remote server
  - Proxy servers perform actions on behalf of their clients


Type IV Distribution Systems: Variations
- Gleaning
  - Autonomous, coordinated possible
  - Implemented in komssys
- Proxy prefix caching
  - Coordinated, autonomous possible
  - Implemented in Blue Coat (formerly CacheFlow, formerly Entera)
- Periodic multicasting with pre-storage
  - Coordinated
  - The theoretical optimum

Gleaning
Webster's Dictionary: from Late Latin glennare, of Celtic origin
1. to gather grain or other produce left by reapers
2. to gather information or material bit by bit
- Combines patching with caching ideas: the benefits of caching and patching do not conflict
- Caching:
  - reduces the number of end-to-end transmissions
  - distributes service access points
  - no single point of failure
  - true on-demand capabilities
- Patching:
  - shortens the average streaming time per client
  - true on-demand capabilities

Gleaning
- Combines patching & caching ideas
- Wide-area scalable
- Reduced server load
- Reduced network load
- Can support standard clients
(Figure: the central server multicasts to proxy caches; a joining client fills a cyclic buffer at its proxy cache from the ongoing multicast and receives the missed beginning as a unicast patch stream; the 1st and 2nd client are served by unicast from their proxy caches.)

Proxy Prefix Caching
- Split the movie into a prefix and a suffix
- Operation:
  - Store the prefix in the prefix cache (coordination necessary!)
  - On demand: deliver the prefix immediately, prefetch the suffix from the central server
- Goals:
  - Reduce startup latency
  - Hide bandwidth limitations, delay and/or jitter in the backbone
  - Reduce load in the backbone
(Figure: the central server sends a unicast stream to the prefix cache, which sends a unicast stream to the client.)
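A minimal sketch of this prefix/suffix operation, assuming a hypothetical `origin.fetch(movie, start, end)` interface and a `client.send()` method; the 10-second prefix length is also an assumption, not from the slides:

```python
PREFIX_SECONDS = 10  # prefix length stored at the proxy (assumed)

class PrefixProxy:
    def __init__(self, origin):
        self.origin = origin      # handle to the central server (hypothetical API)
        self.prefix_store = {}    # movie id -> pinned prefix data

    def preload(self, movie_id):
        # Coordination step: fetch and pin the prefix before any client asks.
        self.prefix_store[movie_id] = self.origin.fetch(movie_id, 0, PREFIX_SECONDS)

    def serve(self, movie_id, client):
        # Startup latency is hidden: the prefix streams from local storage
        # while the suffix is prefetched from the central server.
        client.send(self.prefix_store[movie_id])
        client.send(self.origin.fetch(movie_id, PREFIX_SECONDS, None))
```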

MCache
- One of several prefix caching variations
- Combines batching and prefix caching
- Can be optimized per movie for server bandwidth, network bandwidth, and cache space
- Uses multicast
- Needs non-standard clients
(Figure: the central server serves a batch of clients via one multicast stream; prefix caches deliver the prefix by unicast to the 1st and 2nd client.)

Periodic Multicasting with Pre-Storage
- Optimizes storage and network
- Wide-area scalable
- Minimal server load achievable
- Reduced network load
- Can support standard clients
- Special: can optimize network load per subtree
- Negative: bad error behaviour
(Figure: the central server multicasts movie segments periodically; proxies pre-store the beginning of the movie, so the 1st and 2nd client can assume the show has already started when they join.)


Type IV Distribution Systems
- Autonomous servers:
  - require decision making on each proxy
  - some content must be discarded: caching strategies
- Coordinated servers:
  - require central decision making
  - global optimization of the system
- Cooperative servers:
  - no quantitative research yet

Autonomous servers

Simulation
- A binary tree model allows analytical comparison of caching, patching, and gleaning
- Components: central server, optional cache servers, network links
- Considers optimal cache placement per movie:
  - basic server cost
  - per-stream costs of storage, interface card, and network link
  - movie popularity according to a Zipf distribution
(Figure: Zipf popularity curve, relative probability ~1/x, falling from about 0.16 for the most popular title towards 0 by rank 100.)
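The popularity model can be made concrete in a few lines of Python; the skew parameter `alpha = 1` is an assumption (the slides only say "Zipf distribution"):

```python
import numpy as np

N, alpha = 500, 1.0                    # 500 movies, classic Zipf skew (assumed)
ranks = np.arange(1, N + 1)
weights = 1.0 / ranks**alpha           # relative probability ~ 1/x
popularity = weights / weights.sum()   # normalize to a probability distribution

print(popularity[0], popularity[99])   # most popular title vs. rank-100 title
```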

Simulation Example
- 500 different movies, 220 concurrent active users
- Basic server: $25,000
- Interface cost: $100/stream
- Network link cost: $350/stream
- Storage cost: $1000/stream
Analytical comparison (very simplified, but it demonstrates the potential of the approach):
- Caching: caching, unicast transmission: $4664 million
- λ-patching: no caching, client-side buffer, multicast: $375 million
- Gleaning: caching, proxy client buffer, multicast: $276 million
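The unit costs above can be combined into a per-node cost, sketched below; the slides give only the unit prices, so the exact cost function here is an assumption for illustration:

```python
BASIC_SERVER = 25_000   # $ per server
INTERFACE    = 100      # $ per concurrent stream
NETWORK_LINK = 350      # $ per concurrent stream
STORAGE      = 1_000    # $ per stored stream

def node_cost(concurrent_streams, stored_streams, has_server=True):
    """Cost of one node in the distribution tree (assumed model)."""
    cost = BASIC_SERVER if has_server else 0
    cost += concurrent_streams * (INTERFACE + NETWORK_LINK)
    cost += stored_streams * STORAGE
    return cost

# e.g. a cache server carrying all 220 concurrent users and storing 50 streams:
print(node_cost(220, 50))   # -> 174000
```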

Caching Strategies
- FIFO (first in, first out): remove the oldest object in the cache in favor of new objects
- LRU (least recently used): maintain a list of objects; move an object to the head of the list whenever it is accessed; remove the tail of the list in favor of new objects
- LFU (least frequently used): maintain, per object, the distance between its last two requests (the distance can be time or the number of requests to other objects); sort the list with the shortest distance first; remove the tail of the list in favor of new objects
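For reference, a minimal LRU sketch (capacity counted in whole objects is a simplification; video caches are sized in bytes):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.objects = OrderedDict()   # order encodes recency, oldest first

    def access(self, key, load):
        if key in self.objects:
            self.objects.move_to_end(key)       # hit: mark as most recently used
            return self.objects[key]
        if len(self.objects) >= self.capacity:
            self.objects.popitem(last=False)    # evict the least recently used
        self.objects[key] = load(key)           # miss: fetch and store
        return self.objects[key]

cache = LRUCache(2)
cache.access("movie-1", lambda k: f"<data for {k}>")
```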

Caching Strategies: Considerations
- Limited uplink bandwidth is quickly exhausted
- Performance degrades immediately once the working set is too large for the storage space
- Conditional overwrite strategies can be highly efficient
ECT (Eternal, Conditional, Temporal) vs. LFU:
- LFU: forgets object statistics when an object is removed; caches all requested objects; logs the number of requests or the time between hits
- ECT: remembers object statistics forever; compares the requested object with the replacement candidate; logs the times between hits
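A hedged sketch of the conditional overwrite decision in the spirit of ECT; the exact bookkeeping is not spelled out on these slides, so the gap-based statistics below are an assumption:

```python
import time

stats = {}   # object id -> (time of last request, gap between last two requests)
             # kept forever, even after an object leaves the cache (ECT idea)

def record_request(obj):
    now = time.time()
    last, _ = stats.get(obj, (None, None))
    gap = now - last if last is not None else float("inf")
    stats[obj] = (now, gap)

def should_overwrite(requested, candidate):
    # Conditional overwrite: admit the requested object only if its hits are
    # closer together than those of the replacement candidate.
    return stats[requested][1] < stats[candidate][1]
```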

Simulation
- Movies: 500 movies with Zipf-distributed popularity
- 1.5 Mbit/s, 5400 s per movie, i.e. a file size of ~7.9 Gbit (about 1 GB)
(Figure: a central server with a limited uplink feeds the cache, which has unlimited downlinks to the clients.)

Effects of caching strategies on user hit rates
(Figures: cache hit ratio vs. number of users (0-50,000) for a single server with a 64 GB and a 96 GB cache, comparing ECT, FIFO, and LRU; hit ratio ranges from 0.5 to 1, higher is better.)
- Dumb strategies profit (almost) not at all from cache size increases
- Intelligent strategies profit hugely from cache size increases
- Strategies that use conditional overwrite outperform the other strategies massively; it doesn't have to be ECT

Effects of caching strategies on throughput
- Uplink usage profits greatly from a small cache increase ... if there is a strategy
- Conditional overwrite reduces uplink usage

Effects of the number of movies on uplink usage
- In spite of 99% hit rates, increasing the number of users will congest the uplink
- Note: scheduling techniques provide no savings on low-popularity movies; identical to the unicast scenario with minimally larger caches

Effects of the number of movies on hit ratio
- Limited uplink bandwidth prevents the exchange of titles with medium popularity
- Disproportionate drop of efficiency for more users: the strategy cannot recognize medium-popularity titles

Hint-based Caching
(Figures: hit ratio development with an increasing number of hints, for ECT history sizes 8 and 64; higher is better.)
- Idea: caches consider requests to neighbour caches in their removal decisions
- Conclusions:
  - Instability due to uplink congestion cannot be prevented
  - The advantage exists and is logarithmic, as expected
  - Larger hint numbers maintain the advantage up to the point of instability
  - The intensity of the instability is an ECT problem: ECT inherits the IRG drawback of fixed-size histograms

Simulation: Summary
- High relevance of population sizes: complex strategies require large customer bases
- Efficiency of small caches: the 90:10 rule of thumb is reasonable, unlike in web caching
- Efficiency of distribution mechanisms: considerable bandwidth savings for uncached titles
- Effects of removal strategies: relevance of conditional overwrite, unlike in web caching, paging, swapping, ...
- Irrelevance of popularity changes on short timescales: few cache updates compared to many direct deliveries

Coordinated servers

Distribution Architectures
- Combined optimization of:
  - the scheduling algorithm
  - proxy placement and dimensioning
(Figure: distribution tree from the origin server through the (d-1)st-level, (d-2)nd-level, ..., 2nd-level, and 1st-level caches down to the client.)

Distribution Architectures
- Combined optimization of scheduling algorithm and proxy placement/dimensioning
- No problems with simple scheduling mechanisms
- Examples:
  - Caching with unicast communication
  - Caching with greedy patching (the patching window in greedy patching is the movie length)

Distribution Architectures
(Figures: optimal placement under "Caching" and under "Caching and Greedy Patching"; each plot shows the cache level (client, 1-5, origin) for movies 0-300 of 500, ordered by decreasing popularity from the top movie, against link cost 0.5-1.)
- Movies move away from the clients as network costs increase and popularity decreases
- With the network for free (not shown), all movies sit at the origin

Distribution Architectures
- Combined optimization of scheduling algorithm and proxy placement/dimensioning
- Problems with complex scheduling mechanisms
- Examples:
  - Caching with λ-patching: the patching window is optimized for minimal server load
  - Caching with gleaning: a 1st-level proxy cache maintains the client buffer for several clients
  - Caching with MPatch: the initial portion of the movie is cached in a 1st-level proxy cache

Distribution Architectures
(Figure: example dimensioning of the distribution tree: the origin server sits at level 6, each level-(x+1) cache serves 10 level-x caches, and each 1st-level cache serves 5000 clients.)

Distribution Architectures: λ-Patching
(Figure: the central server multicasts the movie; the 2nd client joins the ongoing multicast into a cyclic buffer and receives the missed beginning as a unicast patch stream; the plot shows position in the movie (offset) over time and the number of concurrent streams.)
The multicast interval that minimizes server load is ΔM = √(2 · F · ΔU), where F is the movie length and ΔU the mean interarrival time of requests: per unit time the server carries F/ΔM for the periodic full multicast plus an average patch load of ΔM/(2 · ΔU), and the sum is minimal at this ΔM.
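A worked example with assumed numbers (the 60 s mean interarrival time is illustrative, not from the slides):

```python
import math

F  = 5400.0   # movie length in seconds (as in the simulation)
dU = 60.0     # mean interarrival time between requests (assumed)

# Server load per unit time is F/dM for the periodic full multicast plus
# dM/(2*dU) for the unicast patches; the sum is minimal at dM = sqrt(2*F*dU).
dM = math.sqrt(2 * F * dU)
print(f"optimal multicast interval: {dM:.0f} s")   # ~805 s
```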

Distribution Architectures: λ-Patching
- Placement for λ-patching: popular movies may end up more distant from the client

Distribution Architectures: λ-Patching
- Failure of the optimization:
  - implicitly assumes perfect delivery
  - has no notion of quality; user satisfaction is ignored
- Disadvantages:
  - popular movies sit further away from the clients: longer distance, higher startup latency, higher loss rate, more jitter
  - popular movies are requested more frequently, so the average delivery quality is lower

Distribution Architectures: Gleaning
- Placement for gleaning combines caching of the full movie with optimized patching and a mandatory proxy cache
- 2 degrees of freedom: gleaning server level, patch length
(Figure: the central server multicasts into the proxy caches' cyclic buffers; proxy caches deliver unicast streams and unicast patch streams to the 1st and 2nd client.)

Distribution Architectures: Gleaning
(Figure: placement results for gleaning.)

Distribution Architectures: MPatch
- Placement for MPatch combines:
  - caching of the full movie
  - partial caching in proxy servers
  - multicast in access networks
  - patching from the full copy
- 3 degrees of freedom: server level, patch length, prefix length
(Figure: the central server serves a batch via multicast; prefix caches deliver unicast streams to the 1st and 2nd client.)

Distribution Architectures: MPatch
(Figure: placement results for MPatch.)

Approaches
- The current approach does not consider quality
- Penalty approach: penalize distance in the optimality calculation
  - Low penalties don't achieve the desired order because the actual cost is higher
  - High penalties don't achieve the desired order because the optimizer gets confused
- Sorting: trivial, and wastes very few resources

Distribution Architectures
- Combined optimization of scheduling algorithm and proxy placement/dimensioning
- Impossible to achieve the optimum with autonomous caching
- A simple solution exists for complex scheduling mechanisms: enforce order according to priorities (simple sorting, see the sketch below); the increase in resource use is marginal
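A minimal sketch of such priority sorting, assuming hypothetical popularity scores and per-level slot counts (the slides only say "enforce order according to priorities"):

```python
def assign_levels(popularity, slots_per_level):
    """popularity: movie -> score; slots_per_level: index 0 = level closest
    to the clients. A more popular movie never lands further out than a less
    popular one; movies that do not fit stay at the origin."""
    assignment, level = {}, 0
    for movie in sorted(popularity, key=popularity.get, reverse=True):
        while level < len(slots_per_level) and slots_per_level[level] == 0:
            level += 1                  # current level full, move outwards
        if level == len(slots_per_level):
            break                       # no cache space left: origin serves the rest
        slots_per_level[level] -= 1
        assignment[movie] = level + 1   # 1 = 1st-level cache
    return assignment

print(assign_levels({"A": .5, "B": .3, "C": .2}, [1, 2]))  # {'A': 1, 'B': 2, 'C': 2}
```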

Summary
- A lot of performance can be gained by using appropriate distribution mechanisms:
  - type 1: delayed delivery
  - type 2: prescheduled delivery
  - type 3: client-side caching
  - type 4: networks of servers combining types 1-3
- Caching is (almost) always beneficial:
  - better with conditional replacements/overwrites
  - better with hints from neighboring caches