From the Outside Looking In: Probing Web APIs to Build Detailed Workload Profile

Similar documents
From the Outside Looking In: Probing Web APIs to Build Detailed Workload Profiles

Predictive Elastic Database Systems. Rebecca Taft HPTS 2017

Review of Morphus Abstract 1. Introduction 2. System design

Developing SQL Databases (762)

Large-Scale Web Applications

CS 425 / ECE 428 Distributed Systems Fall 2015

BigDataBench-MT: Multi-tenancy version of BigDataBench

CIB Session 12th NoSQL Databases Structures

Developing Microsoft Azure Solutions (70-532) Syllabus

Caching Memcached vs. Redis

A Serializability Violation Detector for Shared-Memory Server Programs

SCALING A DISTRIBUTED SPATIAL CACHE OVERLAY. Alexander Gessler Simon Hanna Ashley Marie Smith

Auto Management for Apache Kafka and Distributed Stateful System in General

Multi-tenancy version of BigDataBench

Building High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL

PYTHIA: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads

70-532: Developing Microsoft Azure Solutions

Tools for Social Networking Infrastructures

Scaling a Web Site. Lecture 14: Scale-out Parallelism, Elasticity, and Caching

Model-Driven Geo-Elasticity In Database Clouds

Facebook Tao Distributed Data Store for the Social Graph

A Key Value Store that Supports Strict SLAs and the Applications that Need it. Christopher Stewart The Ohio State University

Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic

MySQL for Developers. Duration: 5 Days

Systems Support for Carbon-Aware Cloud Applications

WHITEPAPER. MemSQL Enterprise Feature List

MySQL for Developers. Duration: 5 Days

Developing Microsoft Azure Solutions (70-532) Syllabus

Jinho Hwang and Timothy Wood George Washington University

SEER: LEVERAGING BIG DATA TO NAVIGATE THE COMPLEXITY OF PERFORMANCE DEBUGGING IN CLOUD MICROSERVICES

Power Provisioning for Diverse Datacenter Workloads

Help! I need more servers! What do I do?

Amazon ElastiCache 8/1/17. Why Amazon ElastiCache is important? Introduction:

CIT 668: System Architecture. Amazon Web Services

70-532: Developing Microsoft Azure Solutions

Automated Control for Elastic Storage

Discovering Dependencies between Virtual Machines Using CPU Utilization. Renuka Apte, Liting Hu, Karsten Schwan, Arpan Ghosh

Architekturen für die Cloud

Key Value Store. Yiding Wang, Zhaoxiong Yang

Developing Microsoft Azure Solutions (70-532) Syllabus

Scaling a Web Site. Lecture 14: Scale-out Parallelism, Elasticity, and Caching

Concepts Introduced in Chapter 6. Warehouse-Scale Computers. Programming Models for WSCs. Important Design Factors for WSCs

DIVING IN: INSIDE THE DATA CENTER

Accelerate MySQL for Demanding OLAP and OLTP Use Case with Apache Ignite December 7, 2016

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage

A Study of Skew in MapReduce Applications

5 Fundamental Strategies for Building a Data-centered Data Center

Jinho Hwang (IBM Research) Wei Zhang, Timothy Wood, H. Howie Huang (George Washington Univ.) K.K. Ramakrishnan (Rutgers University)

SugarCRM on IBM i Performance and Scalability TECHNICAL WHITE PAPER

Accelerate Big Data Insights

HPC in Cloud. Presenter: Naresh K. Sehgal Contributors: Billy Cox, John M. Acken, Sohum Sohoni

Using MySQL for Distributed Database Architectures

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe

Middle East Technical University. Jeren AKHOUNDI ( ) Ipek Deniz Demirtel ( ) Derya Nur Ulus ( ) CENG553 Database Management Systems

Gables: A Roofline Model for Mobile SoCs

RobinHood: Tail Latency-Aware Caching Dynamically Reallocating from Cache-Rich to Cache-Poor

Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems. Dong Dai, Yong Chen, Dries Kimpe, and Robert Ross

Architecting Microsoft Azure Solutions (proposed exam 535)

PrepAwayExam. High-efficient Exam Materials are the best high pass-rate Exam Dumps

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching

MicroFuge: A Middleware Approach to Providing Performance Isolation in Cloud Storage Systems

The Road to a Complete Tweet Index

Toward Energy-efficient and Fault-tolerant Consistent Hashing based Data Store. Wei Xie TTU CS Department Seminar, 3/7/2017

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache

SEO and UAEX.EDU GETTING YOUR WEB PAGES FOUND IN GOOGLE

systems & research project

Qlik Sense Enterprise architecture and scalability

CSE 124: Networked Services Lecture-17

Real Life Web Development. Joseph Paul Cohen

SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Enterprise Intranet Collaboration Environment

Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite. Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017

Multimedia Streaming. Mike Zink

Memory-Based Cloud Architectures

Getafix: Workload-aware Distributed Interactive Analytics

DESIGN AS RISK MINIMIZATION

Ubiquitous and Mobile Computing CS 525M: Virtually Unifying Personal Storage for Fast and Pervasive Data Accesses

Part 12 殷亚凤. Homepage: Room 301, Building of Computer Science and Technology

Practical MySQL Performance Optimization. Peter Zaitsev, CEO, Percona July 02, 2015 Percona Technical Webinars

MidoNet Scalability Report

Next-Generation Cloud Platform

Storage Systems for Serverless Analytics

MySQL Performance Optimization and Troubleshooting with PMM. Peter Zaitsev, CEO, Percona

DISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing WHAT IS CLOUD COMPUTING? 2. Slide 3. Slide 1. Why is it called Cloud?

Data Center Performance

Aaron Sun, in collaboration with Taehoon Kang, William Greene, Ben Speakmon and Chris Mills

An Empirical Study of High Availability in Stream Processing Systems

How do we build TiDB. a Distributed, Consistent, Scalable, SQL Database

ARCHITECTING WEB APPLICATIONS FOR THE CLOUD: DESIGN PRINCIPLES AND PRACTICAL GUIDANCE FOR AWS

The new Ehcache and Hibernate Caching SPI Provider

Efficient On-Demand Operations in Distributed Infrastructures

CS 655 Advanced Topics in Distributed Systems

COMPUTATIONAL INTELLIGENCE SEW (INTRODUCTION TO MACHINE LEARNING) SS18. Lecture 6: k-nn Cross-validation Regularization

To Shard or Not to Shard That is the question! Peter Zaitsev April 21, 2016

ZHT A Fast, Reliable and Scalable Zero- hop Distributed Hash Table

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

Aerospike Scales with Google Cloud Platform

Benchmarking Cloud Serving Systems with YCSB 詹剑锋 2012 年 6 月 27 日

CISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL

RAMCloud. Scalable High-Performance Storage Entirely in DRAM. by John Ousterhout et al. Stanford University. presented by Slavik Derevyanko

Energy Management with AWS

Transcription:

From the Outside Looking In: Probing Web APIs to Build Detailed Workload Profile Nan Deng, Zichen Xu, Christopher Stewart and Xiaorui Wang The Ohio State University

From the Outside Looking In Internet Service ProgrammableWeb, 2014 1. Motivation 2. Problem 3. Our Approach Web APIs Google Maps Facebook Amazon S3 The typical web page loads data from 7-25 third party providers [Everts, 2013] In 2013, the number of indexed APIs grew By 32% year over year [PW, 2013]

From the Outside Looking In 1. Motivation 2. Problem 3. Our Approach Using Web APIs Improve content without programming Benefits Salaries are 20% of expenses [tripadvisor] Published interfaces provide well defined, often RESTful, output Failures, dynamic workloads, corner cases covered Data is centralized, managed by experts Efficient to move compute to big data

From the Outside Looking In 1. Motivation 2. Problem 3. Our Approach Using web APIs risks availability & performance Everyone has bad days, and third-party content providers are no exception. Tammy Everts...a bug [affected] people from third party sites integrated with Facebook on Feb 7, 2013 Took down CNN & WaPost Soms web APIs perform poorly because the were implemented poorly CDN Planet Homepage reported that Facebook took 796 ms to load, 2X longer than any other critical content Slow responses cost 3.5B/y [Everts, 2013]

From the Outside Looking In 1. Motivation 2. Problem 3. Our Approach Web API Probes Web APIs Google Maps Models of Cloud Design Captured Response Times CDF Profile Extraction Workload serv. per request CDF energy Is the web API well implemented? How will it respond under extreme operating conditions? Challenge: Create useful profiles faster than trial-and-error approach Hypothesis: Given cloud design, response s carry a lot of valuable information.

From the Outside Looking In 1. Motivation 2. Problem 3. Our Approach Use Case: Planning web API Usage Google Maps versus Bing Maps Same avg. resp. & availability Which has heavier tail? Should we use both (Replication for predictability [Fast 10]) Use case: Model Resource Needs of Third Parties DataGreening and Ecosia are green hosts [ICAC 2012] DataGreening offsets the carbon footprint of email users that route through its servers Must model carbon footprint of IMAP web APIs

From the Outside Looking In 1. Motivation 2. Problem 3. Our Approach Related work and alternative approaches Controlled offline tests yield workload profiles [ugaonkar,2005][stewart,2005] Tracing online execution of requests [isaacs,2004][shen,2008][powertracer,icac] Use logs from online execution to infer profiles [stewart,2008] More inside access than web API permit web API encouraged to hide details and provide false data

Widely used cloud computing practices Structure of cloud-based web API Hierarchical tiered design Elastic scaling Make the common case fast Independent component Analysis Application to workload profiling Early results

Elastic scaling: When resource demands increase, provision more resources. When demands decrease, release resources. High response Images from GoGrid blog, 2013 Stable response Active research: React to metrics other than response to better stabilize performance without using too many resources [Ghandi,TOC 2012] [Nguyen,ICAC 2013] Response is less sensitive to workload changes

Make the common case fast [P & H] In software design: Data processing in background In platform design: Garbage collection not on critical path In hardware design: Guard band prevents 99.99% of timing errors that would otherwise trigger ECC in processor cache Response s follow skewed multi-modal distributions 99th % > 99X mean Access (ms) 3-node Zookeeper on 4 core 2.4Ghz, data size = 1 GB, 100K writes issued serially Graph and data from Stewart et al. ICAC 13

Blind source separation techniques infer source signals from output signals Given F(X,Y,Z) infer X, Y and/or Z Independent Component Analysis (ICA) is an established approach [Herault & Jutton, 1986] provided sources are Independent & Non-Gaussian

ICA is famously used to recover audio signals Example: 3 people sing their favorite song at the same. They stand still in the same room. 3 microphones record output. 3 Noisy Recordings

ICA is famously used to recover audio signals Let It Go Menzel Normalized Happy Williams Roar Perry Time Mic #1 Mic #2 Mic #3 1 3 4 10 2 4 5 11 3 7 2 6 What is the least Gaussian signal that could have produced this data?

Key insight: Use ICA to infer web API components - Transform spatial dimension into concurrency Resp. Time Concurrent Query #1 Response is the sum of service and queuing delay of comp. on the critical path Resp. Time Concurrent Query #2 Resp. Time Concurrent Query # 3

Key insight: Use ICA to infer web API components - Transform spatial dimension into concurrency - Audio sources to component service s Resp. Time Concurrent Query #1 Apache Httpd Memcached MySQL Resp. Time Concurrent Query #2 Resp. Time Concurrent Query # 3

ICA captures service distribution of web API components provided Service s are non-gaussian (Common case fast) Service s are independent (Elastic scaling) Final output is normalized CDF for each component

Benchmark comprised of widely used components Serving non-stationary request arrivals [stewart, 2007] 3 components on the critical path User Zookeeper Cluster Load Balance Cluster Auto Scale Controller Partitioned Database Redis Cluster Cache hit Cache miss Auto scale control

Hand matched normalized CDF to source components Less than 5% prediction error Setup the same benchmark using Zookeeper as a cache instead of Redis Zookeeper is a poor choice because each insert involves Paxos and capacity per cluster is lower than Redis Zookeeper setup had the fatter tail. This could be a warning sign

Conclusion Web API hide workload details that could help Internet services plan for performance Programming practices in cloud computing allow new inferences about workloads Blind source separation techniques yield useful workload profiles within the web API model From the outside looking in, we can infer a lot