CDN Tuning for OTT - Why Doesn't It Already Do That?


When you initially onboarded your OTT traffic to a CDN, you probably went with the default settings. And to be honest, why wouldn't you? A standard media configuration is designed for short, http-based segment delivery at scale. It removes the bottleneck of your origin connectivity, taking you from a couple of hundred Mb/s to several Gb/s (and Tb/s) of OTT traffic to your customers as your service grows. But do you understand what this standard configuration is doing, and how to fine-tune it for your current use case?

What are you getting out of the box? A media configuration is designed specifically for chunk-based delivery over http or https.

Custom TCP settings - TCP traffic directly between a customer origin and client is usually quite sensitive to congestion; that's what TCP was designed for: reliable delivery over an ever-congested public Internet. The number of TCP packets in flight for a given request slowly ramps up after the initial handshake. Whenever packets time out due to mid-route congestion, the number of packets in flight drops by half immediately and then slowly ramps up again. This improves reliability, but at the expense of throughput. The standard media configuration uses custom TCP settings tuned for adaptive streaming over persistent http connections, reducing TCP connect times and providing faster ramp-up and more suitable TCP timeouts.

Improved routing back to origin - when finding a route for a request from client to origin, Internet traffic paths generally default to the lowest number of hops (subject to network peering agreements). However, this isn't always the fastest route, as hop congestion isn't taken into account. Akamai media configurations will look at alternate routes to the next cache layer or origin and route requests according to load and traffic, bypassing slow hops usually caused by congested peering between networks over the Internet. Sometimes the quickest way from A to B is via C!

Cache settings specific to streaming format - the CDN media configuration isn't just a scaling extension to your origin. It defaults to best-practice caching settings for the streaming format you are using, in order to populate edge caches efficiently and reduce traffic back to your origin, even if you don't configure these settings or content types on your own origin.

Tiered distribution - all media configurations come with tiered distribution.
Regional edge caches with no cached (or expired) objects check with a common parent tier or edge peer before going back to origin, increasing offload. This tiered distribution will be regionally specific to your customer footprint.

Media map - edge cache machines and parent tiers can be grouped into maps. These maps determine how traffic is routed, using traffic behaviour patterns. Media maps can be specific to geography, such as local (in-country) or worldwide traffic distribution, or to content type, such as short-tail or long-tail VoD, live streams, or even high-demand event traffic. Traffic over your media configuration will be delivered using a media map rather than a generic http(s) web traffic map, with throughput and reliability tuned specifically for short, chunk-based streaming. The machines allocated to some maps may favour memory read/write access while others may favour disk read speed, for example. Other maps may offer a balance between the two, or favour improved edge peer connectivity over improved parent connectivity. Ensuring you are on the right map for your content is always worth revisiting if your needs have changed since you launched the service.
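The congestion behaviour described above (the window halving on loss, then slowly ramping up) can be sketched as a toy model. This is purely illustrative: the constants and rules below are a textbook-style simplification, not Akamai's actual TCP stack.

```python
def aimd_window(rounds, loss_rounds, initial=10, ssthresh=64):
    """Toy model of TCP congestion control: the window doubles during
    slow start, grows by one segment per round in congestion avoidance,
    and halves whenever a round experiences packet loss."""
    cwnd = initial
    history = []
    for r in range(rounds):
        if r in loss_rounds:
            cwnd = max(1, cwnd // 2)        # multiplicative decrease on loss
            ssthresh = cwnd
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential ramp-up
        else:
            cwnd += 1                       # congestion avoidance: linear growth
        history.append(cwnd)
    return history

# A single loss mid-transfer halves the window; throughput recovers only slowly.
print(aimd_window(8, loss_rounds={4}))  # [20, 40, 64, 65, 32, 33, 34, 35]
```

The key point for short segment delivery: each congestion event costs several round trips of ramp-up, which is exactly why tuned initial windows and persistent connections matter for chunked media.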

So, all good so far. Why revisit it? If you haven't revisited your media configuration for some time, you may see some changes to the self-service options. These are mainly about fine-tuning to be more specific to your requirements. By specifying where your origin is and what your audience distribution looks like, even if you've moved to smaller fragments or higher resolutions since launch, the underlying settings can improve your performance.

As an example, if you've attempted to reduce your latency by reducing your fragment length, then the TCP connect timeouts, retry settings, and even peer request timeouts would work better if they were aligned with your fragment length. If you've tweaked your bitrate ladder, adding higher bitrates and larger resolutions may mean that your TCP settings need to be adjusted to provide the right performance. If you originally set your audience location to unknown or global, and are actually serving a regional audience, configuring this correctly will improve cacheability and performance. Even adjusting the popularity characteristics can improve performance by moving long-tail content to a lower cache-turnover footprint, so your content is less likely to be evicted from cache by other short-lived Internet traffic.

Now that you've fine-tuned your configuration for a more appropriate audience footprint, selected the correct origin location, understood your content popularity, and even adjusted to match your latest encoding ladder and fragment size, what more needs to be done? For most OTT customers, this out-of-the-box configuration with use-case-appropriate self-tuning will serve them well, even when they are serving hundreds of Gb/s of linear simulcast traffic. The CDN will handle scaling, improve throughput to origin compared with direct connections, mitigate DDoS attacks, and even provide additional features such as authentication or edge-based logic rules.
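To see why timeouts should track fragment length, consider a hypothetical rule of thumb (the formula and constants here are illustrative assumptions, not documented CDN settings): keep connect time plus retries well inside the player's buffer, so a stalled forward request fails fast enough to be retried before the client rebuffers.

```python
def timeout_budget(segment_seconds, player_buffer_segments=3):
    """Hypothetical sizing rule: the total time spent on connects and
    retries must fit inside the player's buffer (assumed here to hold
    a few segments), otherwise a stalled request causes a rebuffer."""
    buffer_seconds = segment_seconds * player_buffer_segments
    connect_timeout = min(2.0, segment_seconds / 2)  # fail fast on short segments
    retries = 2
    # Whatever buffer remains after the connect attempts is split across reads.
    read_timeout = (buffer_seconds - retries * connect_timeout) / (retries + 1)
    return {"connect_timeout": round(connect_timeout, 2),
            "read_timeout": round(read_timeout, 2),
            "retries": retries}

# Six-second segments leave a roomy budget; two-second segments do not.
print(timeout_budget(6.0))  # {'connect_timeout': 2.0, 'read_timeout': 4.67, 'retries': 2}
print(timeout_budget(2.0))  # {'connect_timeout': 1.0, 'read_timeout': 1.33, 'retries': 2}
```

The exact arithmetic will differ per service; the point is that halving your fragment length roughly halves the time a request can be allowed to dawdle.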
The usual CDN functionality, such as logic for origin health detection, origin failover and retry options, and dynamic construction of response objects such as manifests, can be added on top of the defaults through self-service.
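The failover-and-retry behaviour mentioned above can be sketched as follows. The hostnames, retry count, and health-check callback are hypothetical, and a real CDN expresses this declaratively in configuration rather than in code:

```python
def fetch_with_failover(obj, origins, is_healthy, fetch, retries=2):
    """Illustrative origin failover: skip origins that fail their health
    check, retry transient errors a few times, then move on to the next
    origin in the list before giving up."""
    last_error = None
    for origin in origins:
        if not is_healthy(origin):
            continue                      # health detection: skip dead origins
        for attempt in range(retries + 1):
            try:
                return fetch(origin, obj)
            except ConnectionError as e:
                last_error = e            # transient failure: retry, then fail over
    raise last_error or ConnectionError("no healthy origin")

calls = []
def flaky_fetch(origin, obj):
    calls.append(origin)
    if origin == "primary.example.net":   # hypothetical hostname
        raise ConnectionError("timeout")
    return f"{obj} from {origin}"

body = fetch_with_failover(
    "master.m3u8",
    ["primary.example.net", "backup.example.net"],
    is_healthy=lambda o: True,
    fetch=flaky_fetch)
print(body)  # master.m3u8 from backup.example.net
```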

Now we can start to look at the KPIs you've established as your service has matured, and tune your delivery configuration around those KPIs. You don't need to be serving Tb/s of traffic to know that a percentage of traffic will still go back to your origin. Is your origin particularly sensitive to load spikes? Do you want to reduce the load on your origin even further? Do you have a KPI around client buffering instances?

One of the questions we usually get when we propose fine-tuning a configuration for a specific solution is: why doesn't it already do that? The answer can often be complex, but it boils down to balancing customer-specific business objectives that may not always be mutually compatible. Out of the box, a media configuration offers a balance of reliability, performance, and offload. First and foremost, it is designed to scale. When you start to tweak in favour of one of those objectives, it is often at the expense of the others, and it is ultimately understanding the business priorities of a service that allows us to provide the right solution.

One of the tuning tweaks we can introduce, for example, is mid-tier retry. The configuration constantly measures the response throughput for a forward object request to a parent. If it falls below certain throughput thresholds, it opens a second, concurrent connection. The object is delivered from whichever connection returns it first. It can even construct a large object response from partial byte-range requests via separate routes. This is good for mitigating mid-tier slowness, especially where network providers peer over the public Internet, and improves KPIs based around buffering, but at the expense of additional traffic requests going to the origin.
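Mid-tier retry can be sketched as a race between forward routes. In the real configuration the second connection opens only when measured throughput drops below a threshold; this simplified sketch (with simulated latencies and made-up route names) races both routes from the start and serves whichever answers first:

```python
import concurrent.futures
import time

def fetch_via(route, delay, obj):
    """Stand-in for a forward request over one route; the delay simulates
    path latency or mid-tier congestion."""
    time.sleep(delay)
    return (route, f"body-of-{obj}")

def race(obj, routes):
    """Issue the same request over several routes at once and return the
    result from whichever connection completes first."""
    with concurrent.futures.ThreadPoolExecutor(len(routes)) as pool:
        futures = [pool.submit(fetch_via, route, delay, obj)
                   for route, delay in routes]
        winner = next(concurrent.futures.as_completed(futures))
        return winner.result()

# The congested primary route loses the race to the alternate parent.
route, body = race("seg42.ts",
                   [("primary-parent", 0.5), ("alternate-parent", 0.05)])
print(route)  # alternate-parent
```

The trade-off described in the text is visible here too: both requests still run to completion, so the forward traffic (and potentially origin load) doubles for raced objects.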
If offload is your key objective (for example, you can't dynamically scale your origin farm, or you are sensitive to egress costs or demand spikes), we can look at whether the traffic distribution would benefit from the introduction of an additional cache tier or additional peer requests. Additional parent or grandparent tiers can provide more offload, especially where the audience is widely distributed. However, additional tiers introduce additional hops, something you want to avoid if zero-buffer performance or ultra-low latency is your main KPI. The CDN can also smooth spikes at the edge by queuing requests whilst a forward object request is in flight.
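Queuing requests while a forward request is in flight is essentially request coalescing. A minimal sketch (a simplified, hypothetical implementation, not CDN code): the first requester for an object becomes the leader and fetches forward, while concurrent requesters for the same key wait on that result instead of opening their own forward connections.

```python
import threading
import time

class Coalescer:
    """Illustrative request coalescing: at most one forward fetch per key
    is in flight; followers block until the leader's result arrives."""
    def __init__(self, fetch):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._inflight = {}

    def get(self, key):
        with self._lock:
            event = self._inflight.get(key)
            leader = event is None
            if leader:
                event = self._inflight[key] = threading.Event()
        if leader:
            result = self._fetch(key)       # only the leader goes forward
            event.result = result
            with self._lock:
                del self._inflight[key]
            event.set()                     # wake all queued followers
            return result
        event.wait()
        return event.result

origin_calls = []
def slow_fetch(key):
    origin_calls.append(key)
    time.sleep(0.2)                         # simulated forward-request latency
    return f"body-of-{key}"

c = Coalescer(slow_fetch)
threads = [threading.Thread(target=c.get, args=("seg1.ts",)) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(len(origin_calls))  # 1: ten concurrent requests, a single forward fetch
```

This is how a spike of simultaneous requests for a new live segment can be absorbed at the edge without multiplying load on the parent tier or origin.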

During a lull in demand for your content, more popular items can push your objects out of edge cache storage. Sometimes cache eviction is unavoidable, even when an object is within its TTL, due to finite edge capacity. CDNs continue to add edge capacity, but it is nonetheless shared with other streaming traffic. Large customers with high traffic needs can even request custom maps, specifically targeting their audience footprint and origin location. You can even scale up to a managed or licensed CDN deployment that offers dedicated machines for your traffic. Even without a dedicated managed CDN (MCDN), understanding your content's popularity (long tail, etc.) can assist in selecting a delivery map where you have a better chance of a cache hit and objects aren't competing with short-lived content for cache footprint.

Other adjustments (checking with non-obvious edge peers for objects before going forward to a cache parent, tweaking retry counts before taking any fail action, queuing requests for the same object or response code on the edge, manipulating cache keys, consolidating or spreading cache footprint, failing over to alternate maps, and so on) all involve some degree of compromise. They will favour performance, reliability, or offload, usually at the expense of the others. However, if your business objectives need weighting in one specific area, we can help give the configuration a nudge in the right direction, by engaging Professional Services Consultants and ensuring we understand your KPIs. Similar under-the-hood adjustments can be made if you are pushing the boundaries with ultra-low latency or extreme VR/UHD bitrates and need specialist tuning, such as object pre-fetching, to maximise performance or to give you a buffer zone on reliability.
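The eviction pressure described here can be illustrated with a minimal LRU cache. This is a deliberately tiny model; real edge caches use far more sophisticated policies and TTL handling, but the competition for footprint is the same.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: capacity is finite, so a burst of short-lived
    objects can push long-tail content out even though its TTL has not
    expired."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)          # mark as recently used
        return self.store[key]

    def put(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict least recently used

cache = LRUCache(capacity=4)
cache.put("longtail-movie.seg", "body")      # long-tail object, rarely requested
for i in range(4):                           # burst of popular live segments
    cache.put(f"live-{i}.ts", "body")
print(cache.get("longtail-movie.seg"))       # None: evicted by the burst
```

Placing long-tail content on a map with lower cache turnover is, in effect, giving it a cache where it isn't racing live-event segments for the same slots.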
When your OTT traffic levels start to hit Tb/s, even the smallest tweak, such as adding or removing a second in TCP retry or timeout settings, can have a noticeable impact on overall performance and the crucial operational KPIs that you report back to the business.

About the Author: Del Fowler is an Enterprise Architect with 20+ years of media streaming experience, working with broadcast and blue-chip companies. His specialties are transcode engine design and OTT workflow optimization. Del is a certified ScrumMaster and a streaming engineer at heart. During his 3-year tenure at Akamai he has participated in a number of notable projects, such as major sporting events and customer consultancy on VR and low-latency requirements.

Akamai secures and delivers digital experiences for the world's largest companies. Akamai's intelligent edge platform surrounds everything, from the enterprise to the cloud, so customers and their businesses can be fast, smart, and secure. Top brands globally rely on Akamai to help them realize competitive advantage through agile solutions that extend the power of their multi-cloud architectures. Akamai keeps decisions, apps, and experiences closer to users than anyone, and attacks and threats far away. Akamai's portfolio of edge security, web and mobile performance, enterprise access, and video delivery solutions is supported by unmatched customer service, analytics, and 24/7/365 monitoring. To learn why the world's top brands trust Akamai, visit www.akamai.com, blogs.akamai.com, or @Akamai on Twitter. You can find our global contact information at www.akamai.com/locations. Published 11/18.